Embeddings
Last reviewed April 2026
Financial institutions sit on vast stores of unstructured data: policy wordings, customer complaints, regulatory guidance, internal memos. Traditional search fails because the same concept appears under different words. Embeddings turn text into numerical vectors that capture meaning, not just keywords, making it possible to find what you need even when you don't know the exact terminology.
What are embeddings?
An embedding is a numerical representation of a piece of content (a word, sentence, paragraph, or document) in a high-dimensional vector space. Points that are close together in this space are similar in meaning. "Anti-money laundering" and "financial crime prevention" end up near each other even though they share no words. "Bank" as a financial institution and "bank" as a riverbank end up far apart because the model has learned their meanings from context.
In practice, an embedding model takes a chunk of text and returns a list of numbers, typically 768 to 3,072 dimensions. These numbers encode semantic meaning in a way that machines can compare. The cosine similarity between two embedding vectors tells you how conceptually related the underlying texts are. This is the foundation of semantic search, recommendation systems, and clustering.
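As a minimal sketch of the idea, assuming the open-source sentence-transformers library (the model name here is illustrative, not a recommendation):

```python
# A minimal sketch: embed two phrases and compare them.
# Assumes the sentence-transformers library; the model name is
# illustrative, not a recommendation.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dimensional embeddings

a, b = model.encode(["anti-money laundering", "financial crime prevention"])

# Cosine similarity: dot product divided by the vectors' magnitudes.
similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"cosine similarity: {similarity:.3f}")  # closer to 1.0 = more related
```

The two phrases share no words, but a reasonable model places their vectors close together, so the similarity score is high.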
Embeddings are not new. Word2Vec was published in 2013. What has changed is the quality and accessibility. Modern embedding models, trained on billions of documents, produce representations that capture subtle distinctions: the difference between a regulatory breach and a regulatory consultation, or between a standard motor claim and a subrogation opportunity.
The landscape
The tooling has matured rapidly. Major cloud providers (AWS, Azure, GCP) offer embedding APIs. Open-source models, distributed through hubs such as Hugging Face, can run on private infrastructure. Vector databases, purpose-built to store and query embeddings at scale, have moved from experimental to enterprise-grade in under three years.
For financial services, the data residency question is the first consideration. Sending customer complaints to a third-party embedding API means sending the text of those complaints to a third party. The FCA's outsourcing expectations and the EU AI Act's data governance requirements both apply. Self-hosted embedding models avoid this issue entirely, and the computational cost of running them is modest compared to running a large language model.
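A self-hosted setup can be as simple as the sketch below, again assuming sentence-transformers; the model name and cache path are illustrative:

```python
# A sketch of self-hosted embedding: the model runs locally, so complaint
# text never leaves your infrastructure. Model name and path are illustrative.
from sentence_transformers import SentenceTransformer

# Download the model once to a local path, then load it offline thereafter.
model = SentenceTransformer("all-MiniLM-L6-v2", cache_folder="/opt/models")

complaints = [
    "I was charged a fee I was never told about.",
    "Unexpected charges appeared after I switched accounts.",
]
vectors = model.encode(complaints)  # runs on local CPU/GPU; no API call
```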
Quality varies significantly between models and domains. A general-purpose embedding model may not distinguish effectively between financial concepts that matter to your business. "Capital adequacy ratio" and "leverage ratio" are both banking metrics, but they measure different things. Domain-adapted embedding models, fine-tuned on financial text, consistently outperform general models for financial services search and classification tasks. Benchmark accuracy improvements of 15 to 25 per cent are typical when moving from a general to a domain-adapted model.
How AI changes this
Semantic search across compliance archives is the highest-value entry point. When a regulator issues new guidance, a compliance team needs to find all internal policies, past decisions, and customer communications that relate to the new requirement. Keyword search misses documents that discuss the same concept using different language. Embedding-based search finds them. One UK bank reduced the time to complete a regulatory mapping exercise from three weeks to two days after deploying semantic search over its policy library.
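A hypothetical in-memory version of such a search might look like this; a production deployment would use a vector database rather than a numpy matrix, and the corpus, model, and query are all illustrative:

```python
# A sketch of embedding-based search over a small policy library.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

policies = [
    "Client money must be segregated in designated accounts.",
    "Staff must complete anti-money laundering training annually.",
    "Capital buffers are reviewed quarterly by the risk committee.",
]
index = model.encode(policies, normalize_embeddings=True)

query = model.encode(["financial crime prevention obligations"],
                     normalize_embeddings=True)

# With normalised vectors, cosine similarity reduces to a dot product.
scores = index @ query[0]
for rank in np.argsort(scores)[::-1]:
    print(f"{scores[rank]:.3f}  {policies[rank]}")
```

The query shares no keywords with the AML training policy, yet that policy should score highest, which is the behaviour keyword search cannot replicate.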
Customer complaint analysis becomes tractable at scale. Rather than reading thousands of complaints to identify themes, embeddings allow you to cluster complaints by meaning, revealing patterns that keyword categorisation misses. Complaints about "unexpected charges after switching" and "fees I wasn't told about when I moved" describe the same issue, expressed differently, and land in the same cluster.
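A hedged sketch of that clustering step, assuming scikit-learn's k-means over the same kind of embeddings; the cluster count and example complaints are illustrative:

```python
# A sketch of clustering complaints by meaning. The cluster count and
# model are assumptions; in practice you would tune both.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

complaints = [
    "Unexpected charges after switching my account.",
    "Fees I wasn't told about when I moved banks.",
    "The mobile app logs me out every few minutes.",
    "Your app keeps crashing when I try to log in.",
]
model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(complaints, normalize_embeddings=True)

labels = KMeans(n_clusters=2, n_init="auto").fit_predict(vectors)
for label, text in zip(labels, complaints):
    print(label, text)  # fee complaints and app complaints group separately
```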
Embeddings are the enabling layer for retrieval-augmented generation. When an LLM needs to answer a question about your organisation's policies, embeddings determine which documents are retrieved as context. The quality of the embedding directly determines the quality of the LLM's response. Poor embeddings feed irrelevant context, producing confident but wrong answers.
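To make the retrieval step concrete, here is a hypothetical sketch reusing a normalised embedding index like the one in the search example above; the retrieve and build_prompt names and the prompt template are assumptions, not a fixed API:

```python
# A sketch of the retrieval step in retrieval-augmented generation:
# the top-scoring chunks become the context passed to the LLM.
import numpy as np

def retrieve(query_vec, index, chunks, k=3):
    """Return the k chunks whose embeddings score highest against the query."""
    scores = index @ query_vec
    top = np.argsort(scores)[::-1][:k]
    return [chunks[i] for i in top]

def build_prompt(question, context_chunks):
    """Assemble retrieved chunks into the context block of an LLM prompt."""
    context = "\n\n".join(context_chunks)
    return (f"Answer using only the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")
```

Everything downstream depends on retrieve returning the right chunks: if the embeddings are poor, the LLM is handed irrelevant context before it generates a single word.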
What to know before you start
Chunking strategy matters more than model choice. Before text can be embedded, it must be split into chunks. Too large and the embedding loses specificity. Too small and it loses context. A 500-word policy section about capital requirements and client money protections produces a blurred embedding that is weakly relevant to both topics. Splitting by logical section (headings, clauses, paragraphs) outperforms splitting by character count.
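One hypothetical way to split by logical section, assuming documents with markdown-style "#" headings; real policy documents may need a format-specific parser:

```python
# A sketch of splitting a document by logical section rather than
# character count. Assumes markdown-style "#" headings.
def split_by_heading(text: str) -> list[str]:
    chunks, current = [], []
    for line in text.splitlines():
        if line.startswith("#") and current:  # a new section starts here
            chunks.append("\n".join(current).strip())
            current = []
        current.append(line)
    if current:  # flush the final section
        chunks.append("\n".join(current).strip())
    return chunks
```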
Embedding models need updating. Your document corpus changes as regulations evolve and new policies are written. If you embedded your policy library six months ago and haven't re-embedded new documents, your search is blind to recent changes. Build a pipeline that embeds new and updated documents automatically. Treat your embedding index the way you treat a search index: continuously refreshed.
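A minimal sketch of such a pipeline, using content hashes to detect new or changed documents; embed and store are placeholders for your embedding model and vector index:

```python
# A sketch of an incremental embedding pipeline: hash each document's
# content and re-embed only what has changed since the last run.
import hashlib

def refresh_index(documents: dict[str, str], seen_hashes: dict[str, str],
                  embed, store):
    """Re-embed only documents whose content hash has changed."""
    for doc_id, text in documents.items():
        digest = hashlib.sha256(text.encode()).hexdigest()
        if seen_hashes.get(doc_id) != digest:  # new or updated document
            store[doc_id] = embed(text)
            seen_hashes[doc_id] = digest
```

Run on a schedule (or triggered by document updates), this keeps the index current without re-embedding the entire corpus each time.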
Test retrieval quality with domain experts, not just technical metrics. Cosine similarity scores look precise but they don't tell you whether the right document was returned. Have compliance officers, underwriters, or claims handlers evaluate the results against real queries they would actually run. A system that scores well on benchmarks but returns the wrong regulatory guidance is worse than useless.
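A hypothetical harness for that kind of expert evaluation: domain experts supply real queries paired with the document each should return, and the harness reports the hit rate at k. The search function is a placeholder for your retrieval system:

```python
# A sketch of a retrieval test harness over an expert-curated gold set.
# search(query, k) is assumed to return a list of (doc_id, text) pairs.
def hit_rate_at_k(gold: list[tuple[str, str]], search, k: int = 5) -> float:
    """Fraction of gold queries whose expected document appears in the top k."""
    hits = sum(
        1 for query, expected_id in gold
        if expected_id in [doc_id for doc_id, _ in search(query, k)]
    )
    return hits / len(gold)
```

The gold set itself is the valuable artefact: a few dozen real queries with expert-confirmed answers will expose chunking and model problems that aggregate similarity scores hide.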
Start with a well-defined, high-value corpus: your data governance policy library, your regulatory correspondence archive, or your product documentation. Embed it, build a search interface, and test it with the people who use those documents daily. The feedback will tell you whether your chunking, model choice, and retrieval logic are working before you scale to the entire organisation.