Fraud Prevention and Online Authentication Report 2019/2020
Machine learning is a technology that has been with us for some time now. Although it is sometimes understated or used merely as a buzzword, we cannot deny its impact on, and benefits for, human life.
From personal assistants and social media advertising services to medical diagnosis, image processing, and financial prediction, this innovative technology impacts our everyday life and supports business decisions for some of the world's leading companies. For instance, machine learning (ML) solutions could help financial services institutions predict fraud in financial transactions or the outcomes of investments. Furthermore, banks can apply machine learning models to create targeted upselling and cross-selling marketing campaigns.
Usually, the common ML techniques applied involve large amounts of data that need to be shared and prepared before the actual learning phase. However, compliance with privacy laws (e.g. GDPR in Europe, the Personal Data Protection Bill in India, etc.) requires that most of the data and the computation be kept in a secure environment, usually in-house, and not outsourced to cloud or multi-tenant shared environments.
At the beginning of October 2019, IBM scientists published a paper demonstrating how homomorphic encryption (HE) enabled a bank to run machine learning algorithms on its sensitive client data while keeping it encrypted.
Towards a homomorphic machine learning Big Data pipeline in finance
As data management and data protection are top concerns for financial institutions, The Paypers has been closely watching this space and has spoken with Flavio Bergamaschi, IBM Senior Research Scientist and one of the scientists behind IBM's pilot, to find out more about the research.
"Imagine what you could do if you could compute on encrypted data without ever decrypting it." This was the message that dominated Flavio's presentation, and it opened a whole spectrum of possibilities: new scenarios covering what we can do today and what we are not even considering because we cannot share information.
Broadly speaking, homomorphic encryption (HE) enables us to process data without giving access to it, which is technically done by computing on encrypted data. The technology promises to transform and disrupt how business is currently done in many industries such as, but not limited to, healthcare, medical sciences, and finance.
His explanation recalled an interview that we had in May 2019 with Michael Osborne, a Security Researcher at IBM's Zurich Research Laboratory, one year after GDPR took effect in Europe. Back then, Michael agreed that banks are left with a dilemma: on the one hand, if they do not have sufficient technologies for fraud detection, they can be fined; on the other hand, if they detect fraud in a way that leads to a breach and puts data subjects at risk, they can again be fined under GDPR. So, at the end of the day, it's all about how you do it. IBM researchers solved this puzzle, as homomorphic encryption (HE) allows us to resolve the paradox of "need to know" vs. "need to share".
The beginnings of homomorphic encryption
The first fully homomorphic encryption scheme was invented in 2009 by Craig Gentry. Going through the chronology of HE, Flavio explained that Gentry's invention described an encryption scheme that supports both multiplication and addition, which together can be used to perform arbitrary computation. Before this technology was developed, one could do either one or the other, but not both. But how long did it take to do one multiplication of one bit back in 2009? Flavio's reply was sobering: the performance predictions were so disappointing that the technology was branded as "not in my lifetime". However, 10 years later and after many algorithmic improvements, performance today is adequate for many use cases where keeping the privacy and confidentiality of the data is paramount...
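To make the "either one or the other, but not both" point concrete, here is a minimal sketch using textbook (unpadded) RSA, a classic example of a scheme that is homomorphic for multiplication only. It is not the scheme Gentry or IBM used; the tiny primes and the function names are assumptions chosen purely for illustration, and unpadded RSA is not secure in practice.

```python
# Toy textbook RSA without padding: multiplying two ciphertexts multiplies
# the hidden plaintexts, but adding ciphertexts does NOT add the plaintexts.
# The primes are illustrative only and far too small for real use.
p, q = 61, 53
n = p * q                    # public modulus
phi = (p - 1) * (q - 1)
e = 17                       # public exponent, coprime with phi
d = pow(e, -1, phi)          # private exponent (modular inverse, Python 3.8+)


def encrypt(m):
    """Encrypt an integer 0 <= m < n with the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c):
    """Decrypt a ciphertext with the private exponent d."""
    return pow(c, d, n)


a, b = 7, 12

# Multiplication works on ciphertexts: E(a) * E(b) mod n decrypts to a * b.
product_ct = (encrypt(a) * encrypt(b)) % n
print(decrypt(product_ct) == a * b)

# Addition does not carry over: E(a) + E(b) is not an encryption of a + b.
sum_attempt = (encrypt(a) + encrypt(b)) % n
adds_correctly = decrypt(sum_attempt) == a + b
print(adds_correctly)
```

A fully homomorphic scheme supports both operations on ciphertexts at once, which is what allows arbitrary computation.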
When it comes to real-life applications, the engineering team started developing use cases for genomics (finding similarities between two genomic sequences, predicting a genetic predisposition to a specific condition or disease), oblivious queries (performing queries without revealing the query data), private set intersections (finding intersections of data without revealing anything more than the intersection), and prediction models for finance (investments, risk score determination).
In 2019, IBM managed to reduce the running time of homomorphic computation, making it orders of magnitude faster than was previously believed possible.
How computing is done today
To stress the breakthrough of the research, Flavio demonstrated how computing is done today using a diagram that involves data exchange between two entities, Alice and Bob, plus Eve trying to eavesdrop on the communication.
When Alice needs some service from an entity which we call Bob, she encrypts the data while it is in storage or in transit, to prevent Eve from grabbing unprotected data. Even if Eve steals that data, she obtains it only in encrypted form. But Bob needs to decrypt the data in order to do anything with it.
I guess I seemed a bit puzzled by his diagram, so Flavio came up with a real-life example: when you buy something from an online shopping site, you send your credit card details and, most of the time, the details travel to the site through an encrypted channel. But when they reach the site, the service needs to decrypt that information to process your order. This is the "honest but curious" threat model: honest because the service does what it proposes to do for you, i.e. process a payment or transaction, but curious because it wants to learn or extract information from your data.
With homomorphic encryption, this model changes: the entity that provides the service, Bob, not only cannot see the data, but also cannot decrypt it, because he doesn't have the key. Nevertheless, he can still compute on that data and provide the service that he proposed to provide.
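The Alice and Bob model described above can be sketched with a toy version of the Paillier cryptosystem, which is additively homomorphic only (so it is a simplification, not the fully homomorphic scheme IBM works with). The function names, the toy primes, and the sample amounts are all assumptions made for illustration; real deployments use vetted libraries and keys thousands of bits long.

```python
import math
import secrets


def keygen(p=1789, q=1867):
    """Return (public_key, private_key). The default primes are toy-sized."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # valid shortcut because the generator is g = n + 1
    return n, (n, lam, mu)


def encrypt(n, m):
    """Alice's side: encrypt an integer 0 <= m < n under public key n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in [1, n - 1]
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(private_key, c):
    """Alice's side: only the holder of the private key can decrypt."""
    n, lam, mu = private_key
    u = pow(c, lam, n * n)
    return ((u - 1) // n) * mu % n


def bob_sum(n, ciphertexts):
    """Bob's side: needs only the public key n. Multiplying ciphertexts
    adds the hidden plaintexts, so Bob never sees any amount."""
    total = 1  # 1 is a trivial encryption of 0
    for c in ciphertexts:
        total = (total * c) % (n * n)
    return total


# Alice encrypts her data and sends only ciphertexts to Bob.
public_key, private_key = keygen()
amounts = [120, 75, 310]  # hypothetical values
encrypted = [encrypt(public_key, amount) for amount in amounts]

# Bob computes on the encrypted data without the private key.
encrypted_total = bob_sum(public_key, encrypted)

# Only Alice can read the result.
total = decrypt(private_key, encrypted_total)
print(total)  # 505
```

Bob's function receives the public key and ciphertexts only, which mirrors the point that he can provide the service without being able to decrypt anything.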
Shift in the security paradigm
Both Flavio and I agreed that security is crucial and protecting data privacy has become a major concern for users, and companies need to be careful when handling data.
Before homomorphic encryption was discovered, you would first implement the business logic of the application, and then the security team would build walls around it to protect it. Data would be encrypted for storage on disk or when it was transmitted, but would have to be decrypted whenever you needed to do something with it, he added.
Homomorphic encryption changed the picture because now the cryptography is entangled with the business logic, and we can keep the data encrypted at all times: at rest in storage, in transmission, and even while we are computing on it.
The finance opportunity
Financial organisations have many different departments. For instance, a bank could have a retail banking part, loans, investments, insurance, health insurance, etc. This translates into a lot of information which, due to privacy legislation such as GDPR and to antitrust or anti-competitive business legislation, may not be combined by analysts in the clear, as the risk of data exfiltration and leaks is too high. If all this data is encrypted and computation can still be performed without accessing it, a leak poses a lower risk because the data remains encrypted. Only the machine, without accessing the data in the clear, can perform computations such as running models to analyse and predict data for marketing, fraud detection, loans, and the financial health of the account holder, and so be able to offer services.
By using HE-encrypted models and data, the IBM team demonstrated the feasibility of predicting whether a customer might need a personal loan soon, enabling targeted marketing. Typically, this is done behind a firewall in a segregated environment, Flavio explained, which limits a bank to using only machine learning tools and resources built or installed in-house. Homomorphic encryption can successfully be used to protect the privacy and confidentiality of data used both in creating predictive models and in running predictions, theoretically freeing the bank to safely outsource sensitive data to a hybrid and/or public cloud for analysis with peace of mind.
Finally, I was fully convinced. Let's say you are looking to make an investment with a bank, but you don't want to reveal to the bank what sort of volumes you might want to invest. In this case, the bank could deploy machine learning models on your encrypted data that predict the investment risk or returns, and make you an offer, which you might accept or not.
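The investment scenario can be sketched as a linear scoring model evaluated on encrypted inputs. This is a simplified illustration, not IBM's pipeline: it assumes a toy additively homomorphic Paillier scheme, keeps the bank's model weights in plaintext (the pilot described above also encrypted the model), and uses hypothetical feature names, weights, and values.

```python
import math
import secrets


def keygen(p=1789, q=1867):
    """Customer's key pair. The default primes are toy-sized."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    return n, (n, lam, pow(lam, -1, n))


def encrypt(n, m):
    """Customer's side: encrypt an integer (taken modulo n)."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m % n, n2) * pow(r, n, n2)) % n2


def decrypt_signed(private_key, c):
    """Customer's side: decrypt and map large residues to negative numbers."""
    n, lam, mu = private_key
    m = ((pow(c, lam, n * n) - 1) // n) * mu % n
    return m - n if m > n // 2 else m


def bank_score(n, encrypted_features, weights, bias):
    """Bank's side: compute Enc(bias + sum(w_i * x_i)) from Enc(x_i).
    Raising a ciphertext to w multiplies the hidden value by w, and
    multiplying ciphertexts adds the hidden values."""
    n2 = n * n
    score = pow(n + 1, bias % n, n2)  # the bias is the bank's own value
    for c, w in zip(encrypted_features, weights):
        score = (score * pow(c, w % n, n2)) % n2
    return score


# Customer: hypothetical features (investment volume in thousands,
# years as a client, number of existing loans), sent encrypted.
public_key, private_key = keygen()
features = [250, 6, 2]
encrypted_features = [encrypt(public_key, x) for x in features]

# Bank: hypothetical integer model weights, applied without decrypting.
weights, bias = [1, -4, 15], -20
encrypted_score = bank_score(public_key, encrypted_features, weights, bias)

# Customer: decrypts the score and decides whether to accept the offer.
score = decrypt_signed(private_key, encrypted_score)
print(score)  # 236
```

The bank never learns the investment volume, and the result is meaningful only to the customer who holds the private key. Real models with non-linear steps need a scheme that supports both addition and multiplication on ciphertexts.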
I would like to thank Flavio and the whole IBM team for an insightful presentation on homomorphic encryption, and what better way to conclude than to quote him: "Imagine what you could do if you could compute on encrypted data without ever decrypting it." Feel free to share your thoughts with us at mirelac@thepaypers.com.
About Flavio Bergamaschi
Flavio Bergamaschi is a Senior Research Scientist and currently the leader of the group developing IBM's Fully Homomorphic Encryption (FHE) technology for robustness, serviceability and usability, and designing and developing real world FHE applications. He also represents IBM in the industry-wide homomorphic encryption standards.
His areas of expertise include cryptography, distributed systems (MIMD & SIMD), signal processing and machine learning.
About Mirela Ciobanu
Mirela Ciobanu is a Senior Editor at The Paypers and has been actively involved in covering digital payments and related topics, especially in the cryptocurrency, online security and fraud prevention space. She is passionate about finding the latest news on data breaches, machine learning, digital identity, and blockchain, and she is an active advocate of the need to keep our online data and presence protected. Mirela has a bachelor's degree in English language and holds a Master's degree in Marketing.
Excerpt from:
Taking UX and finance security to the next level with IBM's machine learning - The Paypers