AI Agent

Last reviewed April 2026

An AI model that answers questions is useful. One that takes action is different. An AI agent can research a customer's complaint, check three internal systems, draft a response, and flag it for human approval, all without a person orchestrating each step. In financial services, where regulatory accountability requires a clear chain of responsibility, the line between assistance and autonomy must be drawn carefully.

What is an AI agent?

An AI agent is a system that uses a large language model to plan and execute multi-step tasks, making decisions about what actions to take, in what order, and how to handle intermediate results. Unlike a simple chatbot that responds to a single query, an agent reasons about a goal, breaks it into sub-tasks, uses tools (databases, APIs, search engines) to complete each sub-task, and iterates until the goal is met.

The defining characteristic is autonomy within bounds. A chatbot waits for instructions. An agent pursues an objective. Ask an agent to "investigate this customer complaint and prepare a response" and it decides which systems to query, what information to extract, and how to structure the response. This decision-making capacity is what makes agents powerful and what makes governance essential.

The technology is built on function calling: the model decides which tool to invoke (search a database, call an API, read a document) and interprets the result to decide the next action. Each tool call is an action with consequences. Reading a database is low risk. Sending an email to a customer or updating a policy record is not. The risk profile depends entirely on which tools the agent can access.
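The loop described above can be sketched in a few lines. This is an illustrative stub, not any vendor's API: `model_decide` stands in for a real LLM call, and the tool name `lookup_policy` is hypothetical.

```python
# Minimal sketch of an agent's function-calling loop (illustrative only).
# model_decide() stands in for a real LLM call; tool names are hypothetical.

def lookup_policy(policy_id: str) -> dict:
    """Read-only tool: fetch a policy record (stubbed)."""
    return {"policy_id": policy_id, "status": "active"}

TOOLS = {"lookup_policy": lookup_policy}

def model_decide(goal: str, history: list) -> dict:
    """Stand-in for the model: returns either a tool call or a final answer."""
    if not history:
        return {"tool": "lookup_policy", "args": {"policy_id": "P-123"}}
    return {"final": f"Policy P-123 is {history[-1]['result']['status']}."}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = []
    for _ in range(max_steps):
        decision = model_decide(goal, history)
        if "final" in decision:
            return decision["final"]
        # Each iteration is one action with consequences: the tool call
        # and its result feed the next decision.
        result = TOOLS[decision["tool"]](**decision["args"])
        history.append({"tool": decision["tool"], "result": result})
    return "Step limit reached without an answer."
```

Note the `max_steps` bound: even in a sketch, the loop terminates rather than letting the agent iterate indefinitely.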

The landscape

The agent landscape is maturing rapidly but remains early. OpenAI, Anthropic, Google, and open-source frameworks (LangGraph, CrewAI, AutoGen) all offer agent capabilities. The tooling is functional but the patterns for production deployment in regulated environments are still emerging. Most financial services deployments today are internal, assisting analysts and operators rather than acting on behalf of customers.

The EU AI Act's requirements for human oversight in high-risk applications are directly relevant to agents. An agent that makes or influences decisions about credit, insurance, or fraud must operate within a framework that ensures meaningful human control. "Human in the loop" is easy to state but complex to implement when an agent can take twenty actions in the time it takes a human to review one.

The FCA's Senior Managers and Certification Regime means someone in the firm is accountable for the agent's actions. If an agent sends an incorrect communication to a customer or makes an error in a regulatory filing, the responsible senior manager cannot delegate accountability to the software. This does not prohibit agents. It demands that their scope, permissions, and oversight mechanisms are clearly defined.

How AI changes this

Operational investigation is the strongest near-term use case. A compliance analyst investigating a suspicious transaction currently queries three or four systems manually, cross-references the results, documents the findings, and drafts a recommendation. An agent performs the same workflow in minutes. It queries the transaction monitoring system, retrieves the customer's KYC profile, searches for adverse media, checks sanctions lists, and produces a structured investigation summary. The analyst reviews and decides.
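The investigation workflow above reduces to a fixed sequence of queries rolled into one structured summary. A minimal sketch, with every system client stubbed and all names hypothetical:

```python
# Illustrative sketch of the investigation workflow; all clients are stubs.
# A real deployment would call internal APIs behind each function.

def query_transaction_monitoring(txn_id): return {"amount": 9500, "flag": "structuring"}
def get_kyc_profile(customer_id): return {"risk_rating": "medium"}
def search_adverse_media(name): return []          # list of matching articles
def check_sanctions(name): return {"match": False}

def investigate(txn_id: str, customer_id: str, name: str) -> dict:
    """Gather evidence from each source into a summary for analyst review."""
    summary = {
        "transaction": query_transaction_monitoring(txn_id),
        "kyc": get_kyc_profile(customer_id),
        "adverse_media": search_adverse_media(name),
        "sanctions": check_sanctions(name),
    }
    # The agent recommends; the analyst decides.
    summary["recommendation"] = "escalate" if summary["transaction"]["flag"] else "close"
    return summary
```

The output is a recommendation, not an action: the human review step from the text stays in the loop.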

Customer service triage becomes more capable. An agent handling an insurance claim enquiry can check the policy status, retrieve the claims history, identify the relevant policy terms, and draft a response that addresses the customer's specific question. This goes beyond the scripted responses of traditional chatbots while staying within defined boundaries. The agent acts on information from authorised sources rather than generating from its training data.

Report generation for regulatory reporting and internal management benefits from agents that can gather data from multiple systems, apply business rules, and produce a structured output. A quarterly risk report that previously required an analyst to query five systems and spend two days compiling data can be drafted by an agent in hours, with every figure traceable to its source.

What to know before you start

Define the agent's action space before building it. An agent with read-only access to internal systems is fundamentally different from one that can update records, send communications, or execute transactions. Start with read-only agents that assist human decision-makers. Expand the action space incrementally, with governance approval for each new capability. An agentic workflow with write access to a customer-facing system needs the same change management rigour as a new software release.
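One way to make the action space explicit is to declare every tool with its access level and approval status, and expose only approved tools to the agent. A sketch with hypothetical tool names:

```python
# Sketch of an explicit action-space definition (hypothetical tool names).
# Write-capable tools are disabled until individually approved by governance.
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolPermission:
    name: str
    access: str          # "read" or "write"
    approved: bool = False

ACTION_SPACE = [
    ToolPermission("lookup_policy", "read", approved=True),
    ToolPermission("search_claims_history", "read", approved=True),
    ToolPermission("update_policy_record", "write"),   # awaiting governance sign-off
    ToolPermission("send_customer_email", "write"),    # awaiting governance sign-off
]

def allowed_tools() -> list[str]:
    """Only tools with explicit approval are ever exposed to the agent."""
    return [t.name for t in ACTION_SPACE if t.approved]
```

Expanding the action space then becomes a reviewable change to `ACTION_SPACE`, which maps naturally onto the change-management process the text calls for.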

Logging is non-negotiable. Every tool call, every intermediate reasoning step, and every decision the agent makes must be logged in an immutable audit trail. When a regulator asks why a particular action was taken, you need to show the complete chain of reasoning and data that led to it. Build observability from the start, not after the first incident.
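One pattern for this is to wrap every tool so that no call can execute without leaving a record. A minimal sketch, using an in-memory list where production would use write-once storage:

```python
# Sketch of an append-only audit trail around tool calls (illustrative).
import functools
import json
import time

AUDIT_LOG = []  # in production: immutable, write-once storage, not a list

def audited(tool_fn):
    """Wrap a tool so every invocation, its arguments, and its result are recorded."""
    @functools.wraps(tool_fn)
    def wrapper(**kwargs):
        result = tool_fn(**kwargs)
        AUDIT_LOG.append(json.dumps({
            "ts": time.time(),
            "tool": tool_fn.__name__,
            "args": kwargs,
            "result": result,
        }))
        return result
    return wrapper

@audited
def check_sanctions(name=None):
    """Stubbed read-only tool; the decorator logs each call."""
    return {"match": False}
```

Because logging lives in the wrapper rather than in the agent's prompt, the trail is complete by construction: there is no code path where a tool runs unrecorded.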

Guardrails must operate at the tool level, not just the prompt level. A prompt that says "never send customer data externally" can be circumvented. A tool configuration that physically cannot call external APIs cannot. Defence in depth applies to agents as much as it applies to network security.
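The difference between the two levels can be made concrete: a guardrail enforced in the tool's own code cannot be talked around, whatever the prompt says. A sketch with hypothetical internal hostnames:

```python
# Sketch of a tool-level guardrail: the HTTP tool refuses non-allowlisted
# hosts in code, below the model. Hostnames here are hypothetical.
from urllib.parse import urlparse

INTERNAL_HOSTS = {"kyc.internal.example", "claims.internal.example"}

class GuardrailViolation(Exception):
    pass

def http_get(url: str) -> str:
    host = urlparse(url).hostname
    if host not in INTERNAL_HOSTS:
        # Enforcement happens here, not in the prompt: a jailbroken
        # instruction cannot reach past this check.
        raise GuardrailViolation(f"blocked external host: {host}")
    return f"OK: fetched {url}"  # stub; a real tool would perform the request
```

A prompt-level instruction is one layer; this check is a second, independent one, which is the defence-in-depth point.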

Start with a well-defined, repetitive investigation workflow where the steps are known but time-consuming. Fraud investigation triage, complaint handling research, or vendor due diligence are good candidates. Build the agent, run it alongside the human process for a month, compare outputs, and measure accuracy before granting any autonomy.
