Auditability
Last reviewed April 2026
The regulator asks why a specific customer was declined for a mortgage. The decision was made by an AI model seven months ago. Can the firm reproduce the exact inputs, the exact model version, and the exact output that produced that decision? If the answer is no, the firm has an auditability problem, and in financial services, an unauditable decision is an indefensible one.
What is auditability?
Auditability is the capability to trace, reproduce, and verify the decisions made by an AI system after the fact. It requires that every material decision is accompanied by a record of the input data, the model version, the model's output, any human actions taken on that output, and the timestamp of each step. For financial services, where decisions affect individuals' access to credit, insurance, and financial products, auditability is the mechanism through which accountability is made real.
The technical requirements are specific. The input data must be captured at the point of decision, not reconstructed later, because data can change between the decision and the audit. The model version must be recorded, because models are updated and the current version may produce a different output from the version that made the original decision. The model's raw output must be stored, not just the final decision, because post-processing rules may transform that output before it reaches the customer. And human actions (overrides, approvals, modifications) must be logged with the identity of the person and the reason for the action.
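A minimal sketch of what such a record might look like, expressed as a Python dataclass. The field names and the `DecisionRecord` type are illustrative assumptions, not a standard schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class DecisionRecord:
    """One audit record per material decision (illustrative schema)."""
    decision_id: str
    input_features: dict          # captured at the point of decision
    model_version: str            # exact version that produced the output
    raw_output: float             # model output before post-processing
    final_decision: str           # what the customer actually received
    human_actions: list = field(default_factory=list)  # overrides, approvals
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

record = DecisionRecord(
    decision_id="d-001",
    input_features={"income": 54000, "ltv": 0.82},
    model_version="mortgage-scorer:3.4.1",
    raw_output=0.41,
    final_decision="declined",
    human_actions=[{"actor": "underwriter-17",
                    "action": "confirm-decline",
                    "reason": "score below policy threshold"}],
)
```

Freezing the dataclass is a small design choice that mirrors the later point about immutability: a record, once written, should not be silently mutated.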
Auditability serves multiple purposes: responding to customer complaints, supporting model validation and monitoring, meeting regulatory enquiry requirements, and enabling internal audit reviews. Each purpose has different requirements for speed of access, depth of detail, and retention period. The audit trail must be designed to serve all of them.
The landscape
The EU AI Act requires that high-risk AI systems maintain logs sufficient to ensure traceability of the system's functioning throughout its lifecycle. Article 12 specifies that the logging must enable monitoring of the system's operation and post-market monitoring. For financial institutions, this means the audit trail must capture not just what the model decided but how the model was performing at the time of the decision.
The PRA's SS1/23 requires that models be subject to effective challenge, which requires access to the model's inputs, outputs, and assumptions. For AI models, effective challenge requires the ability to replay decisions, understand why a specific output was produced, and test whether the model's behaviour is consistent with its documented purpose and limitations.
GDPR's subject access rights (Article 15) and the safeguards around solely automated decisions (Article 22) create individual-level auditability requirements. A customer who has been subject to an automated decision can request meaningful information about the logic involved. The firm must be able to retrieve the specific decision, its basis, and that meaningful information about the logic used. This is an individual record-level requirement, not a system-level one.
How AI changes this
Decision logging infrastructure captures every model invocation with its full context: input features, model version identifier, raw output, post-processing transformations, and final decision. Modern ML serving platforms can capture this automatically, but the logging must be configured deliberately. Default logging configurations often omit input data or post-processing steps that are essential for auditability.
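One way to make that capture deliberate is to wrap every model invocation so the full context is logged in one place. This is a sketch under stated assumptions: `model`, `postprocess`, and `log_store` are placeholder interfaces, not the API of any particular serving platform:

```python
import json
import uuid
from datetime import datetime, timezone

def score_and_log(model, features, postprocess, log_store, model_version):
    """Invoke a model and capture the full decision context in one record.

    `model` is any callable returning a raw score; `postprocess` maps the
    raw score to a final decision; `log_store` is anything with append().
    """
    raw = model(features)
    final = postprocess(raw)
    record = {
        "decision_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "input_features": features,   # captured now, not reconstructed later
        "raw_output": raw,            # before any post-processing
        "final_decision": final,      # after business rules are applied
    }
    log_store.append(json.dumps(record, sort_keys=True))
    return final, record
```

The point of the wrapper is that logging cannot be skipped: there is no code path that produces a decision without producing its record.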
Immutable audit trails, stored in append-only databases or distributed ledger systems, prevent after-the-fact modification of decision records. In financial services, where decisions are subject to legal and regulatory challenge, the integrity of the audit trail is as important as its completeness. Immutability provides evidence that the record accurately reflects what happened at the time of the decision.
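A common lightweight technique for tamper evidence, short of a full append-only database, is a hash chain: each entry's hash covers the previous entry, so modifying any record breaks every hash after it. A minimal sketch (the function names are illustrative):

```python
import hashlib
import json

def append_chained(log, record):
    """Append a record whose hash covers the previous entry's hash,
    so any later modification breaks the chain from that point on."""
    prev_hash = log[-1]["entry_hash"] if log else "0" * 64
    payload = json.dumps(record, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    log.append({"record": record,
                "prev_hash": prev_hash,
                "entry_hash": entry_hash})

def verify_chain(log):
    """Recompute every hash and confirm the chain is intact."""
    prev = "0" * 64
    for entry in log:
        payload = json.dumps(entry["record"], sort_keys=True)
        if entry["prev_hash"] != prev:
            return False
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["entry_hash"] != expected:
            return False
        prev = entry["entry_hash"]
    return True
```

A hash chain proves integrity, not availability: the log itself still needs durable, access-controlled storage.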
Decision replay capability allows auditors to reproduce a historical decision using the same inputs and model version. This requires versioned model storage (every model version is preserved and can be re-instantiated) and point-in-time data snapshots (the input data as it existed at the time of the decision). Replay is the strongest form of auditability because it demonstrates not just what happened but that the recorded decision is consistent with the model's actual behaviour.
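The replay check itself can be very small once the two prerequisites exist. In this sketch, `audit_store` and `model_registry` are assumed interfaces: the store returns the original record with its point-in-time input snapshot, and the registry re-instantiates the exact model version recorded:

```python
def replay_decision(decision_id, audit_store, model_registry):
    """Re-run a historical decision and check it matches the recorded output.

    `audit_store` maps decision_id -> original record (with its input
    snapshot); `model_registry` maps model_version -> a callable model.
    """
    record = audit_store[decision_id]
    model = model_registry[record["model_version"]]
    replayed = model(record["input_features"])
    return {
        "decision_id": decision_id,
        "recorded_output": record["raw_output"],
        "replayed_output": replayed,
        "consistent": abs(replayed - record["raw_output"]) < 1e-9,
    }
```

A replay that disagrees with the recorded output is itself an audit finding: either the record, the snapshot, or the model versioning is wrong.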
Explanation generation at decision time, stored alongside the decision record, provides the narrative that supports the audit trail. A record that shows the decision and the explanation together enables faster response to customer complaints and regulatory enquiries, because the explanation does not need to be regenerated from the model after the fact (which may not be possible if the model has since been updated).
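As a simple stand-in for whatever explanation method the model supports, per-feature contributions for a linear scorer can be computed at decision time and stored with the record. The function and its weights are hypothetical:

```python
def explain_linear_score(features, weights):
    """Per-feature contributions for a linear scorer: a simple stand-in
    for whatever explanation method the deployed model supports."""
    contributions = {name: weights.get(name, 0.0) * value
                     for name, value in features.items()}
    ranked = sorted(contributions,
                    key=lambda k: abs(contributions[k]), reverse=True)
    return {"contributions": contributions, "top_factors": ranked[:3]}
```

Whatever the method, the key property is that the explanation is generated against the model version that actually made the decision, then stored, so it survives later model updates.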
What to know before you start
Design auditability into the architecture from the start. Retrofitting audit logging into an existing AI system is expensive, disruptive, and often incomplete. Every AI system should include decision logging, model versioning, and input data capture as architectural requirements from the design stage. Treat these as non-functional requirements with the same priority as performance and availability.
Storage costs are real but manageable. Logging every model invocation with full input data produces significant data volumes for high-throughput systems. Design a retention policy that balances auditability requirements with storage costs: full detail for the most recent period (twelve months is common), summarised records for longer retention, and specific decision records preserved indefinitely when they are subject to a complaint or regulatory enquiry.
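The tiering logic above can be sketched as a single decision function. The field names and the twelve-month threshold are assumptions drawn from the policy described, not a fixed rule:

```python
from datetime import datetime, timedelta, timezone

FULL_DETAIL_DAYS = 365  # assumed policy: twelve months of full detail

def retention_action(record, now=None):
    """Decide what to do with a record under a tiered retention policy."""
    now = now or datetime.now(timezone.utc)
    if record.get("legal_hold"):          # complaint or regulatory enquiry
        return "preserve"                 # keep full detail indefinitely
    age = now - datetime.fromisoformat(record["timestamp"])
    if age <= timedelta(days=FULL_DETAIL_DAYS):
        return "keep_full"
    return "summarise"                    # drop bulky inputs, keep key fields
```

Note the ordering: the legal-hold check comes first, so a record under enquiry is never summarised regardless of age.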
Test the audit trail before you need it. Run a simulated regulatory enquiry: select a random decision from six months ago and attempt to retrieve the full audit record, including inputs, model version, output, and human actions. The time to retrieve and the completeness of the record will reveal gaps in your auditability infrastructure that are better discovered in a test than during an actual enquiry.
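The simulated enquiry can be automated as a periodic check. This sketch assumes the audit store is any mapping from decision ID to record; the required-field list mirrors the elements named earlier:

```python
import random
import time

REQUIRED_FIELDS = ("input_features", "model_version", "raw_output",
                   "final_decision", "human_actions", "timestamp")

def simulated_enquiry(audit_store, decision_ids):
    """Pick a random historical decision and check its audit record is
    complete and quickly retrievable. `audit_store` is any mapping."""
    decision_id = random.choice(decision_ids)
    start = time.perf_counter()
    record = audit_store.get(decision_id)
    elapsed = time.perf_counter() - start
    missing = [f for f in REQUIRED_FIELDS if not record or f not in record]
    return {"decision_id": decision_id,
            "retrieval_seconds": elapsed,
            "missing_fields": missing,
            "complete": not missing}
```

Running this against decisions sampled from different ages (one month, six months, a year) exercises the retention tiers as well as the retrieval path.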
Start with the highest-risk decisions: credit, fraud detection, and insurance pricing. Implement full decision logging for these systems first. Extend to lower-risk systems as the infrastructure matures and the controls framework clarifies the auditability requirements for each risk tier.