Machine Learning (ML)
Last reviewed April 2026
A credit scoring model that learned from ten years of lending data just declined a profitable customer segment that didn't exist five years ago. Machine learning (ML) is only as good as the data it learns from, and in financial services, yesterday's patterns are an unreliable guide to tomorrow's risks.
What is machine learning?
Machine learning is the subset of artificial intelligence where systems learn patterns from data rather than following explicitly programmed rules. Instead of a developer writing "if transaction amount exceeds 10,000 pounds, flag for review," an ML model examines thousands of historical transactions, both legitimate and fraudulent, and learns which combinations of features predict fraud. The model discovers the rules from the data.
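The contrast can be sketched in a few lines of Python. Everything here is illustrative: the data is invented, and the "learned rule" is a deliberately crude threshold search rather than a real classifier, just to show the shift from a developer-chosen rule to a data-chosen one.

```python
# Hand-coded rule: the developer picks the threshold up front.
def rule_based_flag(amount):
    return amount > 10_000

# "Learned" rule: pick the threshold that best separates
# historical fraud from legitimate transactions.
def learn_threshold(transactions):
    # transactions: list of (amount, is_fraud) pairs
    best_t, best_correct = 0, -1
    for t, _ in transactions:
        correct = sum((amt > t) == fraud for amt, fraud in transactions)
        if correct > best_correct:
            best_t, best_correct = t, correct
    return best_t

history = [(120, False), (9_500, False), (14_000, True),
           (300, False), (22_000, True), (11_500, True)]
threshold = learn_threshold(history)  # the data, not the developer, sets it
```

A real model would weigh many features at once rather than a single amount, but the principle is the same: the decision boundary comes from the data.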
Three categories cover most ML in financial services. Supervised learning trains on labelled examples: past loans marked as defaulted or repaid, past claims marked as fraudulent or legitimate. The model learns to predict the label for new cases. Unsupervised learning finds structure in unlabelled data: clustering customers by behaviour, detecting anomalies in transaction patterns. Reinforcement learning, where a model learns by trial and error in an environment, has limited but growing applications in trading and portfolio optimisation.
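As a minimal illustration of the unsupervised case, the sketch below flags transactions far from the mean amount. This is a crude stand-in for production anomaly detection (which would use many features and more robust statistics); the data and cutoff are invented.

```python
import statistics

def flag_anomalies(amounts, z_cutoff=2.0):
    """Flag amounts more than z_cutoff standard deviations from the mean.

    A lenient cutoff is used here because a single outlier in a tiny
    sample cannot reach a large z-score.
    """
    mean = statistics.mean(amounts)
    sd = statistics.stdev(amounts)
    return [a for a in amounts if abs(a - mean) / sd > z_cutoff]

amounts = [42, 55, 38, 61, 47, 52, 44, 9_800]  # one obvious outlier
anomalies = flag_anomalies(amounts)
```

No labels were needed: the structure of the data alone identifies the unusual transaction.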
The practical challenge is not building the model. Open-source libraries and cloud platforms make model construction accessible to any team with basic data science skills. The challenge is everything around the model: sourcing clean training data, validating that the model performs fairly across customer segments, deploying it into production systems, and monitoring it as the underlying data distribution shifts. The model is 10 per cent of the work. The infrastructure is the other 90 per cent.
The landscape
The PRA's SS1/23 on model risk management sets clear expectations for how banks govern ML models. Every model must have a defined owner, documented assumptions, validated performance, and ongoing monitoring. The days of a data scientist building a model on their laptop and pushing it to production without governance are over, at least at regulated institutions.
The EU AI Act adds a second regulatory layer for ML models used in high-risk applications. Credit scoring, insurance pricing, and fraud detection models must demonstrate data quality, bias testing, and explainability. For gradient-boosted trees and logistic regression, explainability is achievable. For deep neural networks, it remains technically challenging and is pushing institutions toward inherently interpretable model architectures.
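For a logistic regression, explainability can be as direct as reading off per-feature contributions to the log-odds. The coefficients, feature names, and applicant values below are entirely hypothetical, chosen only to show the mechanics.

```python
import math

# Hypothetical fitted coefficients for a default-risk model (illustrative only).
coefs = {"debt_to_income": 1.8, "missed_payments": 0.9, "years_at_address": -0.3}
intercept = -2.0

def explain(applicant):
    """Per-feature contribution to the log-odds, plus the final probability."""
    contribs = {f: coefs[f] * v for f, v in applicant.items()}
    log_odds = intercept + sum(contribs.values())
    prob = 1 / (1 + math.exp(-log_odds))
    return contribs, prob

applicant = {"debt_to_income": 0.6, "missed_payments": 2, "years_at_address": 4}
contribs, prob = explain(applicant)
# Each entry in contribs says exactly how much each feature moved the decision.
```

A deep neural network offers no equivalent decomposition, which is precisely why the regulation favours simpler architectures where they suffice.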
AutoML platforms are compressing the model development cycle. Automated feature engineering, hyperparameter tuning, and model selection reduce the time from data to prototype from months to days. This democratisation is double-edged: faster development is valuable, but faster deployment of poorly governed models is dangerous. The bottleneck has shifted from model building to model governance.
How AI changes this
Ensemble methods that combine multiple ML models typically outperform any single constituent model on financial services tasks. A fraud detection system might combine a gradient-boosted tree for transaction-level scoring, a graph neural network for network analysis, and a recurrent neural network for sequence modelling. Each captures different patterns. The ensemble catches what any individual model misses.
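The simplest form of ensembling is score averaging. The sketch below uses three stand-in scoring functions in place of trained models; all features, thresholds, and scores are invented for illustration.

```python
def tree_score(tx):      # stand-in for a gradient-boosted tree
    return 0.9 if tx["amount"] > 10_000 else 0.1

def graph_score(tx):     # stand-in for graph-based network analysis
    return 0.8 if tx["linked_accounts"] > 5 else 0.2

def sequence_score(tx):  # stand-in for a sequence model
    return 0.7 if tx["txns_last_hour"] > 10 else 0.2

def ensemble_score(tx, scorers=(tree_score, graph_score, sequence_score)):
    """Average the views of several models into one fraud score."""
    return sum(s(tx) for s in scorers) / len(scorers)

# A low amount looks benign to the tree, but the other views disagree.
tx = {"amount": 4_000, "linked_accounts": 9, "txns_last_hour": 14}
score = ensemble_score(tx)
```

The transaction that a single amount-based model would wave through still receives a high combined score, because the network and sequence views catch what the first one misses.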
Transfer learning and foundation models are reducing the data requirements for new use cases. A model pre-trained on broad financial text can be fine-tuned for contract extraction, regulatory classification, or customer complaint categorisation with a fraction of the labelled data previously required. This changes the economics: use cases that were too niche to justify the data collection and labelling cost become viable.
Federated learning allows institutions to train models collaboratively without sharing raw data. Each institution trains on its own data and shares only the model updates, not the underlying records. This is particularly relevant for anti-money laundering, where cross-institutional patterns are valuable but data sharing is legally and commercially constrained. The technique is emerging but not yet mainstream in production financial services systems.
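The core of federated averaging can be sketched in a few lines: each institution trains locally and shares only its parameter vector, and a coordinator averages them. The weight vectors below are hypothetical placeholders, not a real AML model.

```python
def federated_average(local_weights):
    """Average per-institution parameter vectors; raw data never leaves home."""
    n = len(local_weights)
    dim = len(local_weights[0])
    return [sum(w[i] for w in local_weights) / n for i in range(dim)]

# Each bank trains on its own records and shares only the resulting weights.
bank_a = [0.2, -1.1, 0.5]
bank_b = [0.4, -0.9, 0.7]
bank_c = [0.3, -1.0, 0.6]
global_model = federated_average([bank_a, bank_b, bank_c])
```

Production federated learning adds secure aggregation and differential privacy on top, since even model updates can leak information about the underlying records.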
What to know before you start
Choose the simplest model that meets the performance requirement. Logistic regression and gradient-boosted trees are interpretable, well understood by regulators, and sufficient for most classification tasks in financial services. Deep learning adds value for unstructured data (text, images, sequences) but introduces explainability and validation complexity. Complexity must earn its place.
Your model monitoring capability must be in place before your first model reaches production. ML models degrade as the data distribution shifts. A model trained on pre-pandemic lending data will misjudge post-pandemic borrower behaviour. Without monitoring, you won't know the model is failing until the losses appear in your financial results, months after the degradation began.
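One widely used drift check in credit scoring is the Population Stability Index (PSI), which compares the score distribution at training time with the distribution in production. The bucket shares below are invented; 0.25 is a commonly quoted "significant shift" threshold, though cutoffs vary by institution.

```python
import math

def psi(expected_pct, actual_pct):
    """Population Stability Index over matching score buckets."""
    return sum((a - e) * math.log(a / e)
               for e, a in zip(expected_pct, actual_pct))

# Share of applicants in each score bucket: training data vs this month.
training = [0.10, 0.20, 0.40, 0.20, 0.10]
current  = [0.05, 0.10, 0.30, 0.30, 0.25]
drift = psi(training, current)
# Rule of thumb: < 0.10 stable, 0.10-0.25 monitor, > 0.25 investigate.
```

Here the population has shifted toward the higher buckets and PSI exceeds 0.25, which is exactly the signal that should trigger investigation long before losses surface in the financial results.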
Bias testing is not a one-time exercise. A model that is fair at launch can become discriminatory as the population changes. Test across protected characteristics at deployment and continuously thereafter. The FCA's expectations on fair treatment apply to ML-driven decisions as much as to human ones.
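One simple screening check is the adverse impact ratio: each group's approval rate divided by the highest group's rate, with 0.8 (the "four-fifths rule") a common screening threshold. This is a hedged sketch with invented counts; real bias testing uses multiple metrics, statistical significance tests, and runs continuously.

```python
def approval_rates(outcomes):
    """outcomes: {group: (approved, total)} -> {group: approval rate}"""
    return {g: approved / total for g, (approved, total) in outcomes.items()}

def adverse_impact_ratios(outcomes):
    """Each group's approval rate relative to the best-treated group."""
    rates = approval_rates(outcomes)
    best = max(rates.values())
    return {g: r / best for g, r in rates.items()}

# Invented counts for two groups of applicants.
outcomes = {"group_a": (720, 1_000), "group_b": (540, 1_000)}
ratios = adverse_impact_ratios(outcomes)
# group_b's ratio of 0.75 falls below the 0.8 screening threshold,
# which would trigger a closer look at the model's decisions.
```

Running this kind of check only at launch misses the point of the section above: the same calculation has to run continuously as the applicant population shifts.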
Start with a use case where you have abundant, well-labelled historical data and a clear business metric to optimise. Fraud detection and credit scoring are common starting points because the labels (fraud/not fraud, default/repaid) are definitive and the business value of improved accuracy is measurable in pounds.