Artificial intelligence is all the rage. From large language models like ChatGPT to image generators like Stable Diffusion, DALL-E, and Midjourney, generative AI products have become household brands, breaking through the divide between the tech-savvy and the average consumer even before making their way into everyday products like Bing or Google Search. Writers, painters, and creators from all walks of life are now grappling with the existential threat posed by these powerful algorithms, ready to challenge human craftsmanship. The truth is AI will not replace skilled humans. Instead, it will radically transform how we create, pushing us all to become expert prompters of powerful AI systems. We’ll also be the subject of AI-based decision-making systems that might be less than fair, creating a new underclass of society that some call the excoded. People are already the target of AI-powered privacy attacks, scams, and other misfortunes. Bankers will not be spared.
Although generative AI and machine learning are nothing new in banking and finance, applications of these kinds have so far been limited to being the experimental toys of innovation departments and the subject of fancy presentations about banks of the far-away future. The stark reality is that – according to research by KDNuggets – 80% of machine learning projects die a silent death, never even making it to production. Banks are especially cautious, and that’s not a bug but a feature. Unless a piece of technology is 100% reliable, it should not see the light of day, let alone make automated decisions about thousands of customers.
At the same time, the pressure to innovate and to make use of the immense power artificial intelligence and ML models can bring to front, middle, and back office operations is huge. From AI-supported customer service to credit risk scoring algorithms and mortgage analytics, the list of potential areas of application is extensive. What stops most banks from scaling AI and machine learning products and transforming traditional functions along the way is limited data access.
The missing piece of the AI puzzle – meaningful customer data
Banks have been profiting from gathering data and making decisions based on the available information for hundreds of years. The books have always been the single source of truth, and there is nothing new in that sense. What is new is the scale of things. The established dominance of digital services makes transaction data an ever-growing asset banks and financial service providers are keen to derive their intelligence from.
Privacy blind spots hinder this effort. The Mobey Forum, a knowledge platform for the banking industry, found that banks and financial institutions rely too heavily on legacy data anonymization tools, not even being aware of the privacy and security risks these old tools – think aggregation, hashing, generalization or randomization – pose for them. Not only have adversaries advanced technologically and started using AI tools to reidentify individuals in leaked datasets, but the sheer amount and quality of the data collected has also made data anonymization an increasingly difficult job.
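To make the risk of one of these legacy tools concrete, here is a minimal sketch of why plain hashing offers weak protection when the hashed value comes from a small, guessable space. Everything in it is an assumption made for illustration: the six-digit account ID format, the unsalted SHA-256 scheme, and the helper names `pseudonymize` and `reverse_by_enumeration` are hypothetical and do not describe any particular bank's setup.

```python
import hashlib


def pseudonymize(account_id: str) -> str:
    """Hash an identifier the way a simple legacy masking job might (assumed scheme)."""
    return hashlib.sha256(account_id.encode("utf-8")).hexdigest()


def reverse_by_enumeration(target_hash, candidates):
    """Return the first candidate whose hash equals target_hash, or None."""
    for candidate in candidates:
        if pseudonymize(candidate) == target_hash:
            return candidate
    return None


# Hypothetical leaked value: the hash of a six-digit account ID.
leaked_hash = pseudonymize("004217")

# An attacker who knows the ID format can simply try every possible value.
all_six_digit_ids = (f"{n:06d}" for n in range(1_000_000))
recovered = reverse_by_enumeration(leaked_hash, all_six_digit_ids)
print(recovered)
```

The point of the sketch is that the hash hides nothing once the input space is small enough to enumerate, which is often the case for account numbers, postcodes, or dates of birth.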
The pandemic only accelerated this trend, making digital services the default way to do banking worldwide. The more data you have, the more difficult it is to anonymize. Behavioral data, like transaction or mobility data, is especially tough to anonymize, even though this kind of data is the most valuable fuel for training machine learning and AI applications. The trouble with behavioral data is that it is sequential. This sequential or time-series quality makes behavioral data amazingly insightful for algorithms and notoriously difficult to anonymize. Sequential data, like transaction data, acts almost like a fingerprint, easily singling out individuals from huge groups by their unique spending patterns.
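The fingerprint effect can be illustrated with a toy example. The sketch below assumes a tiny, made-up dataset of pseudonymous customers and their (merchant, day) purchases, and counts how many customers can be singled out by an observer who knows just one or two of their purchases. The data and the function name `unique_fraction` are hypothetical and only serve the illustration.

```python
from itertools import combinations

# Hypothetical toy data: pseudonymous customer ID -> set of (merchant, day) purchases.
transactions = {
    "cust_a": {("grocer", 1), ("cafe", 2), ("gym", 3)},
    "cust_b": {("grocer", 1), ("cafe", 2), ("cinema", 4)},
    "cust_c": {("grocer", 1), ("bakery", 2), ("gym", 3)},
    "cust_d": {("grocer", 1), ("cafe", 5), ("gym", 3)},
}


def unique_fraction(data, k):
    """Share of customers singled out by at least one set of k of their own purchases."""
    singled_out = 0
    for customer, points in data.items():
        for known in combinations(sorted(points), k):
            # Everyone whose purchase history contains all k known points.
            matches = [other for other, p in data.items() if set(known) <= p]
            if matches == [customer]:
                singled_out += 1
                break
    return singled_out / len(data)


print(unique_fraction(transactions, 1))  # 0.75
print(unique_fraction(transactions, 2))  # 1.0
```

Even in this four-person example, knowing two purchases is enough to identify everyone, although no names or account numbers appear in the data.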
In order to protect customers today, it is no longer sufficient to use data anonymization tools and security protocols a decade old. A new generation of privacy-enhancing technologies is available to enable safe, machine learning-ready data anonymization. Some of these technologies, like synthetic data, use AI themselves, while others, like federated learning, have been created with the specific requirements of machine learning development in mind, allowing banks and financial institutions to innovate and scale their innovations safely.
AI-generated synthetic data and the proverbial kitchen knife
Just like any tool – kitchen knives and algorithms included – synthetic data can serve good causes, while some use it to commit crimes. AI-generated data can be so realistic that even those who develop synthetic data products can be fooled by it. JPMorgan Chase acquired a financial planning company for $175 million, not realizing that the accounts they were looking at were synthetic. The power of generative AI is formidable, and we’ll all have to brace ourselves for fakes that look real, be it a picture, a video, or a CSV file. What makes AI-powered synthetic data generators especially valuable is their ability to augment, diversify and simulate data on top of anonymizing it. In that sense, AI-generated data is better than real data since reality is constrained, while synthetic universes are malleable. For better or for worse.
The low-hanging fruit of generative AI in banking
There are plenty of use cases for generative AI in banking, but let’s narrow down to the core opportunities all banks should be implementing already or, at the very least, exploring.
1. Product development and personalization
Product development needs meaningful behavioral data. It’s impossible to build customer-centric applications without fast and easy access to meaningful customer data and a deep understanding of customer behavior. Legacy data anonymization tools stand in the way of great banking products, destroying intelligence and, in most cases, still endangering privacy. If you have to build a banking app on 1 cent transactions because your IT department decided to mask transaction data before handing it over to the development team – off-shore or in-house – you won’t see a great final product. Synthetic test data generation can provide the necessary level of granularity and satisfy the strictest privacy requirements at the same time. Datasets can be blown up in size for stress testing applications, but they can also be subsetted for easier access while keeping the data patterns of the original intact. Customer behavior can be analyzed in great detail without ever exposing a single individual’s transactions.
2. Risk prediction and management
Transaction intelligence is already used in anomaly detection by companies like SWIFT, which act as the connecting tissue between banks and financial services providers globally. The accuracy of risk prediction can be improved with synthetically augmented transaction data, providing machine learning algorithms with more intelligence and less bias. Synthetic data generators can also simulate economic scenarios and predict losses based on historical data. The imaginative power of AI algorithms combined with the ability to derive insights hidden from the naked eye makes generative approaches for market predictions a must-have tool many are already busy building and using.
3. Data sharing
By providing curated synthetic versions of the most important datasets downstream, everyone across the organization can become a data consumer without prohibitions, risks, or lengthy access protocols. Overriding the siloed way of operating so typical of banks comes with a host of benefits: data access becomes less costly and time-consuming, while data literacy and interdepartmental intelligence sharing increase – even across national borders separating subsidiaries. Making the leap to the cloud and scaling ML and AI is simply impossible without meaningful and agile data sharing.
Synthetic data use cases are plenty, but these – product development, risk prediction, and data sharing – should be the first tackled, unlocking new paths to better services and increasing productivity, as well as privacy. As long as the kitchen knife is used by a responsible chef and not by a serial killer, no one should be afraid.
This article was originally published on February 12, 2023.