Basics, history and use-cases
Sam Rososhek (Ph.D., Professor), Ilya Rososhek (Senior Developer)
Introduction
- What is Homomorphic Encryption
- Where is its role in cryptography
- What are use-cases?
Actually, the phrase “homomorphic encryption” itself is so niche, that even experts in cybersecurity barely heard about it, or at least will have trouble explaining what it is and how it works. You may do a short experiment, go to your security officer or anyone responsible for IT-security / cybersecurity in your company and ask these questions.
This article is the first in the series of articles about homomorphic encryption and aiming to do 3 things:
- To answer the questions above.
- To make homomorphic encryption more popular, spreading the knowledge about it.
- Going over, clarifying every definition in the simple words. So after reading this article, everyone could say: “yeah, I understood this topic about homomorphic encryption and even can explain it to others”.
Although it’s not a scientific article, you may find some simple formulas, specific mathematical and cryptographic definitions and statements. At the same time, the goal is - to keep this text as simple and short as possible, to not become a long life story.
Basics
Historically, Cryptography has to solve 3 main tasks:
- Confidentiality
- Authentication
- Data integrity
For each task there were developed specific methods. To support confidentiality symmetric and asymmetric cryptography is usually used, where the most popular scheme is AES + RSA.
Where RSA is typically used for key generation and AES is responsible for the encryption itself. Both definitions of symmetric and asymmetric cryptography are quite wide, so let’s say just a few words here. Symmetric is such a cipher which uses the same key (private/secret) to transform plaintext (our target of encryption) to the ciphertext (meaning encrypted data) and vice versa; while asymmetric cipher uses a pair of keys, private and public. Where public key is literally public and is used to encrypt a plaintext, then private is used to decrypt ciphertext, it shall be held securely and not shared to anyone.
For authentication might be used various authentication protocols/methods, like well known 2-factor authentication, where a service provider uses password authentication + one-time code sent via cell phone (just one of such examples).
To check data integrity might be used various hash functions, such as SHA-2 and SHA-256. Hash function is such a function which is used to map arbitrary(big) size values to specific, which allows later usage to compare with initial state. For instance, we have 1 png file, with a picture of something. Hash sum allows us to control its size bit-wise, so if we want to send it via the internet and make sure that recipient will get an identical picture - is sufficient to generate its hash sum by a sender and then by addressee and compare then at the end..
In fact, currently developed and existing cryptographic instruments can ensure data protection “in transit” (meaning, encrypt data before any operation of moving/sending them) and “at rest” (meaning, it could be stored in a Cloud, on a hard drive, USB drive and assumed as not actively used; usually, hackers consider such information as the most valuable). But how can we make sure to protect the most important type - “in use”? (meaning, that a user would like to actively work with data securely, keeping them in a protected/encrypted state). None of existing symmetric or asymmetric ciphers can do that, and here on our scene is going to appear Homomorphic Encryption.
For many years homomorphic encryption has been called “The Holy Grail” of cryptography and many scientists/cryptographers were dreaming of having its practical implementation. So what is the definition and purpose of homomorphic encryption? Homomorphic encryption is such a type of encryption, which allows computations on encrypted data without access, and usage of private (secure) key achieving the goal of protecting the privacy of data during communication, operations and storage process. The results of such computations are also encrypted data, which can continue to be used for further secure operations. In other words, homomorphic encryption allows to encrypt data and not decrypt it, but being able to operate with for absolutely everyone, because a security key isn’t required. And the last - homomorphic encryption is intended to be used for arithmetic operations, and eventually all computations on encrypted data are basic mathematical operations, such as multiplication and addition (also subtraction and division, but we will talk about it and related complexity in the next article).
There are 3 types of Homomorphic encryption: 1) Partially Homomorphic Encryption (PHE), 2) Somewhat Homomorphic Encryption (SHE) and 3) Fully Homomorphic Encryption (FHE). From all types the highest priority belongs to Fully Homomorphic Encryption (FHE). Thus, let’s consider only this type, since others have limited functionality and, generally speaking, not so interesting and important for a real use.
Let’s emphasize several use-cases which can be considered in general for the FHE scheme. We won’t go into details, just listing them to increase general understanding and somehow accomplish our descriptive picture. These use-cases may generally indicate what niches for homomorphic encryption are:
- Secure data in the Cloud.
- Enabling various data analytics scenarios.
- Secure various sections in Automotive or Robotics which may require mathematical operations.
- Medical niche and, for instance, genomics calculations or general securing of patient’s data.
- Banking - respectively any numeric related computations.
- General usage for Machine learning and Deep learning which can be performed on fully encrypted data etc.
History
Thuswise, in 1978, Rivest et al.[1], suggested the definition - “privacy homomorphism”, which was the first step to Homomorphic Encryption. But their approach was hacked by Brickell & Jacobi in 1987[2]. Anyway, this topic was raised again in 1991, when Feigenbaum & Merritt[3] raised another question: Is there such an encryption function E(), such that both ciphertext E(x+y) and E(x*y) can be easily computed from ciphertext E(x) and E(y)?
Feigenbaum & Merritt[3] wanted to research - is there an algebraically homomorphic scheme of encryption, which can be designed? Let’s note, that algebraically homomorphic scheme of encryption E, means that for every open data x and y, from their ciphertexts E(x), E(y), we have E(x+y) = E(x) + E(y) and E(x*y) = E(x) * E(y). In other words, computing the sum of ciphertexts E(x) and E(y) we get ciphertext of sum x+y and ciphertext of product x*y - it allows privately summate and multiply numbers x and y. Unfortunately, there was no answer on this question, raised by Feigenbaum & Merritt[3], whether we may have an effective and secure, algebraically homomorphic scheme of encryption or not.
At the same time, in 2009, Gentry[4] gave a theoretical answer on that question and built Fully Homomorphic Encryption scheme (FHE). His construction consists of several steps. At the first step he built Somewhat Homomorphic Encryption(SHE), which supports computation of low-degree polynomials on encrypted data. At the second step, he squashes the procedure of decryption in such a way that it might be introduced as a low-degree polynomials, supported by the encryption scheme. And finally, at the last step he used the procedure of what is called “bootstrapping mechanism” to get FHE scheme. In other words, to make non-homomorphic things to be “like-homomorphic”. This last procedure allows to compute high-degree polynomials on encrypted data. Entire construction looks very complicated for every non expert in this field, and even for experts from standard Encryption fields, like symmetric and asymmetric.
The reason for such complexity is that to make successful encryption of data and its later usage, shall be done specific configurations and preparations with taken into account “noise” - every existing FHE scheme produce noisy ciphertext. And performing arithmetic operation on this ciphertext leads to both noise and ciphertext growth. Because of that, all existing schemes are limited by multiplicative depth which is around 20 consequent operations, otherwise, noise will be so big that decryption will be not possible. To be able to achieve this multiplicative depth and decrease noise and size of the ciphertext, as mentioned before, Gentry proposed a “bootstrapping” mechanism, which is a generic procedure of “refreshing” a ciphertext after every arithmetic operation. As already mentioned, it enables FHE scheme from Partial and Somewhat homomorphic encryption (PHE, SHE). Bootstrapping is another reason for slowness, because it may take minutes depending on the multiplicative depth. All this complexity and nuances will be handled in our next article, when we will go over existing schemes and programmatic implementations.
By community was done a huge amount of work to develop different schemes of FHE. For the last decade, there were developed many of them, and many different big companies, such as IBM and Microsoft, invest money to the R&D in this direction. While the most popular schemes currently are: DGHW, BGV, BFV and CKKS. Beside that, there were developed several open-source libraries, such as HElib, PALISADE, Lattigo, SEAL etc which allow to introduce several more or less successful implementations of FHE with various pros and cons. At the same time, the last status of FHE, despite all efforts, is marked as “non practical” due to huge computational overhead, so that its performance is still so slow, to be called impractical even up till now.
Finally, let us provide few examples of challenges and real use-cases (just 3 out of tens):
- Use-case N1 An owner of some company has sensitive data/information, which he would like to use for computations, but he can’t share them even for own employees. The challenge is, that using standard techniques such as AES/RSA etc, will make impossible computations on encrypted data, while providing of plain-text to 3-rd parties is not acceptable as well
- Use-case N2 A Hospital got a very good business proposition from some Research Center to make statistical computations on patients data. But, of course, this Hospital can’t provide patients data because it violates the law. Doing standard Encryption by the standard instruments, same as in use-case N1 makes impossible any computations on encrypted data too.
- And the last. Some abstract Bank, would like to build a model of investments for the upcoming year and be able to predict some risks and potential income/revenue, using it. Usually, such things might be done using Machine Learning / Deep Learning with some big, real financial data-sets. At the same time, this Bank can’t provide such statistics and data-sets to anyone else outside. So how to solve this and above challenges?
Generally speaking, these use-cases introduce one huge issue for all modern Cloud solutions, would it be MS, Google, Amazon or any other - many-many companies still don’t want to migrate to any cloud-based solutions and are afraid of it, due its not enough security and privacy. Despite many improvements over the last years, still there is not enough trust. Those who will be able to implement practical FHE scheme - will open the door and huge market to this and many other directions.
Answers for use-cases and discussion about all of the above-mentioned FHE schemes, their nuances and status is going to be the topic for our next article.
Stay tuned! June, 2022
References
[1] R. Rivest, L. Adleman and M. Dertouzos. On Data Banks and Privacy homomorphism. Found. of Secure Communic., 1978, pp. 169-177.
[2] E.Brickell and J.Jacobi. On privacy homomorphisms. Lecture Notes in Computer Science, v.304(1987), pp.117-125.
[3] J. Feigenbaum and M.Merritt. Open Questions, Talk Abstract and. Summary of Discussions. DJMACS Series in Discrete Math and Theoretical Computer Science, v2(1991), pp.1-45
[4] C.Gentry. Fully Homomorphic Encryption using ideal lattices. STOC-09,2009,pp.169-178.