Prompt Engineering
Last reviewed April 2026
The same language model can produce a usable first draft of a regulatory report or a vague summary that wastes everyone's time. The difference is not the model. It is the instruction. Prompt engineering is the practice of designing inputs to large language models that produce reliable, useful outputs, and in regulated financial services, that reliability is not optional.
What is prompt engineering?
Prompt engineering is the design and optimisation of the text instructions given to a language model to control its output. It encompasses the system prompt (persistent instructions that define the model's role and constraints), the user prompt (the specific question or task), and the context window (the documents or data provided alongside the prompt).
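The three components can be pictured as a single request assembled from separately managed parts. A minimal sketch, using the chat-message format common to major provider APIs; the prompt wording and document text are illustrative placeholders, not production values:

```python
# Sketch: assembling system prompt, context, and user task into one request.
# Keeping the three parts separate means each can be versioned and logged
# independently of the others.

def build_request(system_prompt: str, user_prompt: str, context: str) -> list[dict]:
    """Combine persistent instructions, supplied documents, and the task."""
    return [
        {"role": "system", "content": system_prompt},
        # Context documents are delimited explicitly rather than blended
        # into the task text, so the model can tell data from instructions.
        {"role": "user", "content": f"<context>\n{context}\n</context>\n\n{user_prompt}"},
    ]

messages = build_request(
    system_prompt="You are a UK compliance analyst. Cite paragraph references.",
    user_prompt="Extract the five key regulatory obligations from the document above.",
    context="(document text here)",
)
```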
The practice matters because language models are sensitive to phrasing. A prompt that says "summarise this document" produces different output from one that says "extract the five key regulatory obligations from this document, list each with the specific paragraph reference, and flag any that conflict with our existing policies." The second prompt produces output a compliance officer can use. The first produces output they must rewrite.
In financial services, prompt engineering is not a creative exercise. It is a control mechanism. The prompt determines whether the model produces structured, verifiable output or free-form text that requires extensive human review. In a regulated environment, the usefulness of the model is bounded by the quality of the prompt.
The landscape
Prompt engineering has moved from an informal skill to a structured discipline. Major AI providers publish prompt engineering guides. Dedicated tooling for prompt management, versioning, and testing has emerged. The analogy to software engineering is apt: prompts are instructions that should be version-controlled, tested, peer-reviewed, and monitored in production.
The PRA's SS1/23 on model risk management does not mention prompts explicitly, but its requirements for model documentation, validation, and change management apply to the prompt layer of an AI system. Changing a system prompt can change the model's behaviour as significantly as retraining it. A prompt change that is not tested, documented, and approved is an uncontrolled model change.
The EU AI Act's requirements for transparency and human oversight in high-risk applications extend to the prompt layer. If a model's behaviour is determined by its prompt, and the prompt is opaque to the oversight function, the firm cannot demonstrate effective human oversight. Prompt documentation is part of the compliance evidence.
How AI changes this
Structured prompts transform LLMs from general-purpose tools into domain-specific assistants. A prompt that instructs the model to "act as a UK compliance analyst, cite specific FCA handbook references, and flag any uncertainty in your assessment" produces qualitatively different output from an unconstrained model. This is how institutions deploy one foundation model across multiple use cases: different prompts for compliance, for claims processing, for customer service.
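One way to organise this is a registry of approved system prompts keyed by use case, so the same foundation model is constrained differently per deployment. A sketch, with invented prompt wording and use-case names:

```python
# Sketch: one foundation model, several use cases, differentiated only by
# the system prompt. Prompt texts here are illustrative, not approved wording.

SYSTEM_PROMPTS = {
    "compliance": (
        "Act as a UK compliance analyst. Cite specific FCA Handbook references "
        "and flag any uncertainty in your assessment."
    ),
    "claims": (
        "Act as a claims processing assistant. Summarise the claim, list any "
        "missing documents, and never make a coverage decision."
    ),
    "customer_service": (
        "Act as a customer service assistant. Use plain English and never give "
        "regulated financial advice."
    ),
}

def system_prompt_for(use_case: str) -> str:
    """Return the approved system prompt; refuse unknown use cases."""
    try:
        return SYSTEM_PROMPTS[use_case]
    except KeyError:
        raise ValueError(f"No approved system prompt for use case: {use_case}")
```

Refusing unknown use cases, rather than falling back to an unconstrained prompt, keeps every deployment inside an approved configuration.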
Few-shot prompting (providing examples of desired input-output pairs within the prompt) is particularly effective for financial services tasks. Show the model three examples of correctly formatted suspicious activity report narratives, and its output for the fourth will follow the same structure. This technique reduces the need for fine-tuning, often achieving 80 to 90 per cent of the quality improvement that fine-tuning would provide.
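Mechanically, few-shot prompting is just prepending worked pairs to the new input. A minimal sketch; the example texts stand in for real SAR narratives:

```python
# Sketch of few-shot prompt assembly. The examples are placeholders standing
# in for correctly formatted narratives from your own records.

def build_few_shot_prompt(examples: list[tuple[str, str]], new_input: str) -> str:
    """Prepend worked input/output pairs so the model copies their structure."""
    parts = []
    for source, narrative in examples:
        parts.append(f"Input:\n{source}\nOutput:\n{narrative}\n")
    # The trailing "Output:" cue prompts the model to complete the pattern.
    parts.append(f"Input:\n{new_input}\nOutput:\n")
    return "\n".join(parts)

examples = [
    ("Transaction summary A", "Narrative A in the required format."),
    ("Transaction summary B", "Narrative B in the required format."),
    ("Transaction summary C", "Narrative C in the required format."),
]
prompt = build_few_shot_prompt(examples, "Transaction summary D")
```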
Chain-of-thought prompting forces the model to show its reasoning, which is essential for auditability. Rather than asking "is this transaction suspicious?" and getting a yes or no, ask "assess this transaction step by step: first evaluate the transaction amount relative to the customer's profile, then assess the counterparty risk, then evaluate the geographic risk, and finally provide your assessment with confidence level." The detailed reasoning is the audit trail.
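The step-by-step instruction above can live in a reusable template so the reasoning structure is fixed and only the transaction details vary. A sketch, with the template wording taken from the steps described:

```python
# Sketch: a chain-of-thought template. Fixing the reasoning steps in the
# template means every assessment produces the same auditable structure.

COT_TEMPLATE = """Assess this transaction step by step:
1. Evaluate the transaction amount relative to the customer's profile.
2. Assess the counterparty risk.
3. Evaluate the geographic risk.
4. Provide your assessment with a confidence level.

Transaction details:
{details}
"""

def build_assessment_prompt(details: str) -> str:
    return COT_TEMPLATE.format(details=details)

prompt = build_assessment_prompt("(transaction details here)")
```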
What to know before you start
Treat prompts as code, not as conversation. Store them in version control. Write tests that validate output quality against expected results. Require peer review for changes to production prompts. A casual edit to a system prompt can silently change the behaviour of a system that processes thousands of documents per day.
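A test for a prompt does not need to call the model in CI; it can validate recorded outputs against the acceptance criteria the prompt promises. A sketch under assumed criteria (exactly five numbered obligations, each with a paragraph reference); the version label and format rules are illustrative:

```python
import re

# Sketch of an output validator that lives alongside its prompt in version
# control. The acceptance criteria here are assumptions for illustration:
# exactly five numbered items, each carrying a "(para N.N)" reference.

PROMPT_VERSION = "obligations-extract-v3"  # illustrative version label

def validate_obligations_output(text: str) -> list[str]:
    """Return a list of problems; an empty list means the output passes."""
    problems = []
    items = re.findall(r"^\d+\.\s+.+$", text, flags=re.MULTILINE)
    if len(items) != 5:
        problems.append(f"expected 5 numbered items, found {len(items)}")
    for item in items:
        if not re.search(r"\(para \d+(\.\d+)*\)", item):
            problems.append(f"missing paragraph reference: {item!r}")
    return problems
```

Run the validator in CI against sample outputs whenever the prompt changes, so a silent edit cannot ship without the checks passing.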
Prompt injection is a security risk. A malicious user (or a malicious document processed by the system) can include text that overrides the system prompt's instructions. In a financial services context, this could mean a document that instructs the model to ignore compliance checks or reveal system information. Guardrails and input validation are necessary defences. Never trust that the prompt alone will constrain the model's behaviour.
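As one layer of that defence, inputs can be screened for known override phrasing before they reach the model. A deliberately naive sketch; the pattern list is illustrative, and simple matching like this is easily evaded, so it supplements rather than replaces proper guardrails:

```python
# Sketch: naive injection screening as one input-validation layer.
# The phrase list is an illustrative assumption; real attacks are more
# varied, so this check must sit alongside other guardrails.

SUSPICIOUS_PATTERNS = [
    "ignore previous instructions",
    "ignore the above",
    "disregard your instructions",
    "reveal your system prompt",
]

def screen_input(text: str) -> bool:
    """Return True if the input passes screening, False if it is rejected."""
    lowered = text.lower()
    return not any(pattern in lowered for pattern in SUSPICIOUS_PATTERNS)
```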
The best prompt for today's model may not work for tomorrow's model version. Model providers update their models regularly, and behaviour can change. Build regression tests that run against your prompt library with each model version. Automate the comparison so that behaviour changes are detected before they reach production.
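The comparison can be a simple drift check against a recorded baseline output for each prompt. A sketch using standard-library string similarity; the threshold value is an illustrative choice you would tune per prompt:

```python
import difflib

# Sketch: flag a prompt whose output under a new model version has drifted
# from the recorded baseline. The 0.9 threshold is an illustrative default.

def detect_behaviour_change(baseline: str, candidate: str,
                            threshold: float = 0.9) -> tuple[bool, float]:
    """Return (drifted, similarity) for a baseline/candidate output pair."""
    similarity = difflib.SequenceMatcher(None, baseline, candidate).ratio()
    return similarity < threshold, similarity
```

Exact-match comparison is too brittle for model output, which is why a similarity score with a tolerance is used here; flagged prompts then go to a human for review before the new model version is promoted.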
Start by auditing how your organisation currently uses language models. Collect the prompts people are writing ad hoc in chat interfaces. Identify the five most common use cases, design optimised prompts for each, and make them available as templates. The productivity difference between a well-designed prompt and an ad hoc one is typically 40 to 60 per cent, measured in time to a usable output.