Revolut Unveils PRAGMA, Encoder Foundation Model Pre-Trained on Large-Scale Banking Data
Digital banking platform Revolut has introduced PRAGMA, an encoder-style foundation model trained on multi-source banking user histories. Pre-trained using masked modeling on large-scale financial records, the model may support user behavior understanding and predictive tasks in financial services.
Sources and disclosure
All key factual claims in the article are directly supported by the provided arXiv paper snippets. The article accurately describes PRAGMA as an encoder-style foundation model from Revolut, trained on multi-source banking user histories using masked modeling on a large-scale corpus. It correctly identifies the model's purpose for user behavior understanding and predictive tasks, listing specific downstream applications like credit scoring, fraud detection, communication engagement, recommendation, and lifetime value tasks, which are explicitly mentioned in the source. The article maintains a neutral tone and adheres to reputation safety guidelines, discussing potential applications and limitations without overclaiming or speculation.
Market lens
Separate infrastructure signal from investable outcome
Treat market-linked stories as context: identify the mechanism, then wait for evidence before treating it as an outcome.
Impact path
Signal first, outcome later
Signals to watch
- Primary-source guidance and filings
- Price, volume, margin, and renewal evidence
- Follow-up reporting that confirms or rejects the mechanism
Verification schedule
D+1 · Jun 15
Is the mechanism visible in primary data?
D+3 · Jun 17
Do follow-up sources confirm direction and magnitude?
D+7 · Jun 21
Did the initial read overstate the market effect?
Informational context only — not investment, legal, tax, or financial advice.
Digital banking platform Revolut has introduced PRAGMA, a foundation model pre-trained on financial transaction data. The model processes multi-source banking user histories using an encoder architecture and is trained with masked modeling techniques applied to large-scale financial records.
Emergence of Finance-Specific Foundation Models
Unlike general-purpose language or vision models, PRAGMA is a domain-specific foundation model tailored to financial transaction records. Revolut is reported to have used transaction histories, account activities, and payment patterns generated on its platform as training data. The encoder-style architecture focuses on learning representations of input data, a structure well-suited for downstream tasks such as classification, anomaly detection, and user segmentation analysis.
Masked modeling has been widely used in natural language processing. The technique involves masking portions of an input sequence and training the model to predict the masked elements, helping capture patterns and contextual information in the data. Applied to financial transaction data, this approach can help the model learn spending habits, transaction timing, and relationships across categories.
Operational Context in Financial Services AI
Financial institutions have long used machine learning for tasks including anomaly detection, credit scoring, personalized recommendations, and customer churn prediction. Traditional approaches often rely on individual models designed for specific tasks. The foundation model approach allows a single large-scale pre-trained model to be reused across multiple tasks, which can improve development efficiency and consistency.
Revolut, founded in 2015, has grown across Europe and beyond, serving tens of millions of users globally as of 2024. A user base of this scale can help provide the large-scale datasets needed for foundation model training. Because financial data is difficult to share externally due to sensitivity and regulatory requirements, institutions with proprietary data may be well positioned to develop domain-specific models.
The reference to multi-source user histories suggests PRAGMA can integrate and process multiple data streams beyond simple transaction records, including account types, card usage patterns, transfer histories, and currency exchange records. This capability may support a more comprehensive understanding of user financial behavior and could offer better generalization than models relying on a single data source.
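As an illustration of what integrating multiple data streams could look like, the following minimal sketch merges several per-user event streams into one chronological timeline. The `Event` fields and the source labels are hypothetical and are not taken from any published description of PRAGMA.

```python
from dataclasses import dataclass
from heapq import merge


@dataclass(frozen=True)
class Event:
    timestamp: int  # seconds since epoch (hypothetical field)
    source: str     # e.g. "card", "transfer", "fx" (hypothetical labels)
    amount: float


def merge_histories(*streams):
    """Merge per-source streams, each already sorted by time, into one timeline."""
    return list(merge(*streams, key=lambda e: e.timestamp))


card = [Event(100, "card", 12.5), Event(400, "card", 3.0)]
transfer = [Event(250, "transfer", 200.0)]
fx = [Event(300, "fx", 50.0)]

timeline = merge_histories(card, transfer, fx)
print([e.source for e in timeline])  # ['card', 'transfer', 'fx', 'card']
```

A single ordered timeline like this is one plausible input format for a sequence encoder; how Revolut actually combines its sources is not stated in the article.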
Technical Architecture and Training Methodology
Encoder models transform an input sequence into contextual representations, one per position, which are typically pooled into a single fixed-length vector. These vectors serve as the basis for tasks including user profiling, risk assessment, and behavior prediction. Unlike decoder-centric generative models, which are trained to produce output sequences, encoder models focus on capturing the structure of the input in compressed form.
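One common way to get a fixed-length vector from per-position encoder outputs is mean pooling over the real (non-padding) positions. The sketch below shows that step on made-up numbers; whether PRAGMA uses mean pooling, a dedicated summary token, or something else is an assumption left open here.

```python
def mean_pool(position_vectors, is_real):
    """Average the vectors at real (non-padding) positions into one vector."""
    kept = [v for v, keep in zip(position_vectors, is_real) if keep]
    if not kept:
        raise ValueError("sequence has no real positions")
    dim = len(kept[0])
    return [sum(v[i] for v in kept) / len(kept) for i in range(dim)]


# Two users with different history lengths, both padded to length 3.
user_a = [[1.0, 2.0], [3.0, 4.0], [0.0, 0.0]]  # last position is padding
user_b = [[2.0, 0.0], [4.0, 2.0], [6.0, 4.0]]

emb_a = mean_pool(user_a, [1, 1, 0])
emb_b = mean_pool(user_b, [1, 1, 1])
print(emb_a, emb_b)  # [2.0, 3.0] [4.0, 2.0]
```

Both users end up with a vector of the same length regardless of how many events they have, which is what lets one downstream classifier consume any user's history.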
Masked modeling is a form of self-supervised learning that enables representation learning from large-scale unlabeled data. By masking specific transactions in a financial sequence and training the model to reconstruct them from surrounding context, the model can learn temporal dependencies between transactions, amount patterns, and category transition rules. This is a way to capture structural characteristics of data without explicit labels.
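The masking step described above can be sketched in a few lines. This is a simplified illustration on a toy sequence of transaction categories: the `[MASK]` token, the category names, and the masking probability are all hypothetical, and PRAGMA's actual masking scheme is not specified in the article.

```python
import random

MASK = "[MASK]"  # hypothetical placeholder token


def mask_sequence(tokens, mask_prob=0.15, rng=None):
    """Return (masked_input, targets).

    targets[i] holds the original token at masked positions and None
    elsewhere, so a training loss would be computed only where masked.
    """
    rng = rng or random.Random()
    masked, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(tok)
        else:
            masked.append(tok)
            targets.append(None)
    return masked, targets


history = ["groceries", "transport", "groceries", "fx_exchange", "rent", "groceries"]
masked, targets = mask_sequence(history, mask_prob=0.3, rng=random.Random(0))
print(masked)
print(targets)
```

A model trained on such pairs sees `masked` as input and is scored on recovering the non-`None` entries of `targets`. Real systems usually mask richer event records (amount, time, merchant) rather than a single category string.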
Pre-training on large-scale financial records requires substantial compute and infrastructure, but the learned representations can be applied to multiple downstream tasks via transfer learning. This approach can offer advantages in data efficiency and performance compared with training models from scratch for each task. The value of pre-trained representations is especially notable in tasks where labeled data is limited.
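To make the transfer-learning step concrete, the sketch below fits a small logistic-regression classifier (often called a linear probe) on top of frozen embeddings. The embeddings and labels are made-up numbers standing in for encoder outputs and a downstream label such as a fraud flag; nothing here reflects PRAGMA's real outputs or Revolut's fine-tuning procedure.

```python
import math


def train_linear_probe(embeddings, labels, lr=0.5, epochs=200):
    """Fit logistic regression on frozen embeddings; the encoder is not updated."""
    dim = len(embeddings[0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        for x, y in zip(embeddings, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log-loss with respect to z
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """Return 1 if the linear score is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Toy "frozen" embeddings with binary labels (illustrative values only).
X = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
y = [1, 1, 0, 0]

w, b = train_linear_probe(X, y)
print([predict(w, b, x) for x in X])
```

Because only the small classifier is trained, each new task needs far fewer labeled examples and far less compute than training a sequence model from scratch, which is the efficiency argument made above.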
Competitive Landscape and Strategic Implications
Major financial institutions and financial data providers are also investing in proprietary AI capabilities. JPMorgan has pursued development of finance-specific language models, and Bloomberg released BloombergGPT in 2023. Fintech companies are similarly developing domain-specific models that draw on their own data. PRAGMA reflects Revolut's technology strategy within this broader landscape.
While the foundation model approach demands high development costs and infrastructure requirements, successful implementation can yield a reusable representation learning asset applicable across multiple tasks. If Revolut uses PRAGMA to support improved anomaly detection, personalization, or operational efficiency, the model could become part of its internal AI stack.
Because financial data is hard to share externally, companies that generate data on their own platforms may have an advantage in model development. This can reinforce data network effects: a larger user base produces more training data, which can improve model quality. PRAGMA can be viewed as an example of user scale being applied to model development.
Uncertainty and Constraints
Publicly available information does not reveal PRAGMA's specific model size, training data scale, performance benchmarks, or deployment plans. The work has been described in an arXiv paper, but commercial service integration and external release plans remain unspecified. Given the sensitivity of financial data, open-source release of the model or training data appears unlikely.
Financial regulatory environments impose strict requirements on AI model use. In the EU, the GDPR and the AI Act influence model development and deployment. In the United States, financial consumer protection rules and fair lending requirements can affect explainability and bias management in AI models. For PRAGMA to be integrated into financial services, it would need to satisfy these regulatory requirements.
Given the characteristics of encoder models, PRAGMA is better suited for analysis and prediction tasks than generative tasks. Use cases involving customer interaction or content generation may require separate decoder or generative models. PRAGMA is therefore best understood as a component serving a specific role within Revolut's AI infrastructure.
Model performance depends heavily on the quality and diversity of training data. If Revolut's user base is concentrated in specific regions or demographic groups, the model's generalization capability may be limited. Additionally, financial behavior patterns change with economic conditions, regulatory shifts, and technological advances, which can require ongoing updates and retraining.
Builder Implications
- Companies with proprietary financial data may consider domain-specific foundation model development. Data scale and quality are important determinants of model performance, and expanding the user base can contribute to improved model quality.
- The combination of encoder architecture and masked modeling is effective for learning from sequence data such as transaction records and user behavior logs, enabling representation learning suitable for classification and prediction tasks. Self-supervised learning is particularly useful where labeled data is limited.
- Deploying financial AI models requires consideration of regulations including GDPR, the AI Act, and financial consumer protection rules. Explainability and bias management frameworks should be considered from the initial design phase. Building infrastructure to track and audit model decision-making processes is important.
Visual Briefing
The model learns from multiple banking data streams, then its representations can be reused for analysis tasks under regulatory constraints.
Corrections and safety
See a factual, privacy, rights, or safety issue? Review the corrections process or contact Guidances before relying on this article for important decisions.