Synthesized today unveils details of a successful collaboration with the Financial Conduct Authority (FCA) and a leading fraud prevention vendor, aimed at supporting the development of new solutions to detect and prevent fraud and scams exacerbated by the Covid-19 pandemic.
Synthesized took part in the FCA’s DataSprint, which aimed to develop synthetic datasets for use by participants in the Digital Sandbox Pilot, jointly launched by the FCA and the City of London Corporation. The objective of the collaboration was to solve the challenge of building a better synthesised transactional bank fraud model.
Transactional bank fraud is a notoriously difficult and complex problem to address. Collaboration is hard because records are subject to highly controlled review, even within the same company or government body. Given the urgency of this growing problem, it is imperative to bring faster, safer, and more collaborative data projects to market quickly.
The collaboration produced a safe-to-share synthetic dataset that will allow participants in the Digital Sandbox Pilot to analyse fraudulent digital banking transactions and apply this knowledge to better detect fraudulent activity in their own environments.
Synthesized, a pioneer in automating all stages of data provisioning and data preparation, applied its cutting-edge AI data synthetisation technology. The technology was uniquely able to create a synthetic version of the original fraud dataset provided by the fraud prevention vendor.
The London-based AI company successfully applied the platform’s generative adversarial modelling, differential privacy, and other techniques to a dataset from the fraud prevention vendor. The original dataset contained five million rows and 724 columns representing real bank digital payments. Synthesized automatically derived the deep statistical properties of the original dataset in order to create a highly representative version for predictive purposes.
To help ensure the quality of the output data, Derek Snow, a Research Associate at The Alan Turing Institute, independently assisted the FCA in assessing the resulting data. Among other things, Mr. Snow evaluated and tested the feature correlations, joint distributions, and predictability of the data.
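As a rough illustration of one such check, the sketch below compares feature correlations between a real table and a synthetic one. It is a minimal example under stated assumptions, not the evaluation Mr. Snow performed: the function name `correlation_gap`, the toy Gaussian data, and the choice of NumPy are all hypothetical and chosen only for illustration.

```python
import numpy as np


def correlation_gap(real: np.ndarray, synthetic: np.ndarray) -> float:
    """Return the largest absolute difference between the Pearson
    correlation matrices of two tables (rows = records, columns = features)."""
    real_corr = np.corrcoef(real, rowvar=False)
    synth_corr = np.corrcoef(synthetic, rowvar=False)
    return float(np.max(np.abs(real_corr - synth_corr)))


# Toy stand-ins for the real and synthetic tables (assumed data, not the
# dataset described in this release).
rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
real = rng.multivariate_normal([0.0, 0.0], cov, size=5000)

# A synthetic table that preserves the correlation structure...
good_synth = rng.multivariate_normal([0.0, 0.0], cov, size=5000)
# ...and one that loses it by generating each column independently.
bad_synth = rng.normal(size=(5000, 2))

gap_good = correlation_gap(real, good_synth)
gap_bad = correlation_gap(real, bad_synth)
print(f"faithful synthetic data gap:    {gap_good:.3f}")
print(f"independent-columns data gap:   {gap_bad:.3f}")
```

A small gap suggests the synthetic table reproduces pairwise relationships between features; a large gap flags structure that was lost. Joint distributions and predictability would need separate checks, such as training the same model on each table and comparing its scores.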
The enhanced data project produced by Synthesized revealed two key advantages for the financial industry:
- Cyber Safety by Default: The synthetisation engine creates an entirely new dataset that, while representative of the original, cannot be linked back to it. Hence, it is not vulnerable to linkage attacks and is safe for use by participants in the Digital Sandbox Pilot.
- High-Quality Data: Vetted by The Alan Turing Institute, the new synthetic data that the project produced was highly accurate and preserved the same properties as the original. This means all fraud solutions and tactics tested will be relevant to the original dataset, and users can apply the intelligence gathered to real-world transactional fraud.
Simon Swan, the lead machine learning (ML) engineer at Synthesized for this project, noted, “Datasets of this size and complexity offer interesting and insightful challenges for Synthesized’s core engine. It is always exciting to see just how powerful our platform is when we are able to successfully recreate new, highly detailed data at such scale, as was the case with this collaboration.”
Dr Nicolai Baldin, CEO and Founder of Synthesized, added, “Synthesized was delighted to be involved as development partner on this game-changing fraud initiative. It is further validation of both our product leadership and unique data capabilities in the data privacy and secure collaboration space. This provides further proof of our depth of in-house expertise employing the latest ML techniques within our platform.”