Existing schemes and programmatic implementations
Sam Rososhek (Ph.D., Professor), Ilya Rososhek (Senior Developer)
Introduction
- What are the most popular HE schemes?
- What are open-source libraries based on these schemes?
- How can they respond to challenging use-cases raised in the 1-st article
“Asymmetry”, “serialization”, “negative computations”, “encryption parameters” and “ciphertext size”, “noise budget”, “recryption”, “bootstrapping”, “relinearization” etc…. This is the list of definitions which fully describe in simple words the general complexity of all existing homomorphic encryption schemes. If we understand these definitions and the background, we may say that we approximately understand Homomorphic Encryption(HE). In our first article[1] were described the basics of traditional encryption and some general understanding about homomorphic encryption, giving several interesting and challenging use-cases for HE. The second article is going to explain all general definitions, complexity and issues of existing HE schemes, also to shed some light on programmatic implementations and their problems, explaining use-cases given in the first article.
Homomorphic encryption schemes and open-source libraries.
Existing HE programmatic implementations / libraries (PALISADE[2], SEAL[3], HElib[4], Lattigo[5], TFHE[6] and others) are primarily intended for researchers and scientists. Regular users can not use them having not enough solid mathematical basis and because none of these libraries have default presets for possible use-cases to be able to match with sufficient security level and performance. They don't have it, because it is impossible to implement such. The required tuning of parameters is very fancy, complex and specific, any mistake will lead to very noisy ciphertext and not operable, corrupted data.
Since the first FHE scheme was developed (by Gentry, 2009[7]), for now we can see a variety of FHE schemes implementations. The most popular are: BGV[8], BFV[9], CKKS[10] and DGHV[11] First three schemes use for data encryption ring of integer polynomials by modulo some fixed polynomial. Such an approach immediately excludes from potential users the most of people, which aren’t familiar with rings, groups, polynomials and specificness of HE (meaning even regular software developers are excluded). Only homomorphic encryption experts can work with it, because it requires special tuning of data for its encryption and later homomorphic operations.
Scheme DGHV is more user-friendly and more accessible for regular users.
Note that BGV, BFV and DGHV allow computations on integers, CKKS - on float-point numbers. These HE schemes were implemented in various open-source libraries, the most popular and interesting are such as HElib, SEAL, PALISADE, Lattigo, TFHE and others. HE libraries have different features and we are going to briefly describe their basics, following: [12]
- Asymmetry.
All of the above-mentioned libraries have been implemented as public key encryption schemes, where public key is used for encryption and secret key for decryption. - Serialization.
Some of the HE libraries, such as HElib, SEAL, TFHE provide custom APIs to serialize/deserialize keys and ciphertexts for local storage and retrieval. Libraries that do not provide these features, require developers to create their own implementation of serialization (which might be challenging with complex data types) unless ciphertext can be represented in primitive types, such as string or big integers using the same library. - Negative computations.
It corresponds to subtracting operand 1 with operand 2, in case operand 2 > operand 1 this means that the resulting number should be a negative one. SEAL uses BFG and CKKS for encryption, which allows converting integers or reals in polynomial space, and negative computations are supported. But if ciphertext is encrypted using PolyCRT Builder, then the resulting homomorphic subtraction will not be negative (this is due to the nature of Chinese Remainder Theorem - CRT, which deals only with absolute values). - Encryption parameters, ciphertext size.
Implementing HE through any library requires selection of certain encryption parameters, such as polynomial modulus, coefficient modulus, plain-text modulus, noise standard deviation, random number generator etc. to be initialized. Choice of these parameters can significantly affect ciphertext size, RAM required, noise budget, speed, performance and security of encryption. Ciphertext size is usually large with complex ciphertext data types(see below small table which demonstrates how big the size of ciphertexts are [13]) in libraries like SEAL and HElib all operations such as matrix rotation that use Galois keys (in SEAL) may result in huge RAM requirements.
Scheme Information Platform Parameters Running Times Implemented Scheme Base Scheme Software Security Parameter, λ Dimension, n PK Size KeyGen Enc Dec Recrypt GH11 (Gentry and Halevi 2011) Gen09 (Gentry 2009) C/C++ 72 33,768 2.25GB 2.2 h 3 min (SWHE) 0.66 s (SWHE) 31 min CMNT11 (Coron et al. 2011) DGHV10 (Van Dijk et al. 2010) Sage 4.5.3 72 7,897 802MB 43 min 2 min 57 s 0.05 s 14 min 33 s CNT12 (with compressed PK) (Coron et al. 2012) DGHV10 (Van Dijk et al. 2010) Sage 4.7.2 72 7,897 10.3MB 10 min 7 min 15 s 0.05 s 11 min 34 s CNT12 (leveled) (Coron et al. 2012) DGHV10 (Van Dijk et al. 2010) Sage 4.7.2 72 5,700 18MB 6 min 18 s 3.4 s 0.00 s 2 h 27 min - Noise budget.
A noise is usually added to ciphertext to guarantee security of HE scheme. This term could be integer or polynomial with coefficients in {0,1}. Size of noise term depends on security level and correctness properties of each HE scheme. Because homomorphic operations increase noise, resulting ciphertext would become too corrupt to be decrypted. Noise budget is the total amount of noise that can be added until decryption fails. Such operations as addition and subtraction have a small impact on noise compared to multiplication. - Recryption.
Re-encryption is a technique to re-generate the noise budget of ciphertext that was depleted by arbitrary computations. Recryption boosts bounded depth to unbounded depth. This implies that noisy ciphertext can be converted into noise-free ciphertext (of the same plaintext) without secret key. Libraries that do not have recryption functionality implemented, provide no means of converting noisy ciphertext to noise-free ciphertext. Thus, they limit the number of homomorphic operations on a ciphertext. - Bootstrapping.
Bootstrapping is a technique to remove noise by passing ciphertext and encrypted private key into a circuit that represents the decryption algorithm of the FHE scheme. This results in a new ciphertext that corresponds to the original ciphertext but with no noise. In TFHE library after every gate-by-gate operation, bootstrapping is applied on resulting ciphertext and any number of homomorphic operations can be performed, but only on theory. In fact this limit is not more than 20-30 operations which already have too high cost and extremely slow performance, which might be from several minutes to 1 hour. - Relinearization.
Two input ciphertexts of size m and n, result in ciphertext of size m+n-1 after multiplication. Consumption of noise budget is much higher during multiplication, especially when input ciphertexts sizes are huge. Relinearization reduces the size of resulting ciphertext after multiplication. A ciphertext of size k+1 when relinearized, produces a ciphertext of size k. After repeated steps, this can result in ciphertext of size 2 that can be decrypted using a smaller degree decryption function to yield the same result. Thus, relinearization of resulting ciphertext after multiplication can significantly improve performance on subsequent operations, although relinearization by itself has both computational and noise budget costs. - Multiplicative depth.
It’s one of the most important challenges for HE schemes and libraries, because currently existing ones can't respond to this challenge, combining several above problems. It relates to “relinealizarion”, “multiplicative depth”, “noise budget”. Depth is a maximum possible amount of multiplications which a cipher can perform, where 3 main tasks have to be solved: performance per second, maximum possible amount of multiplication until decryption is possible and after each multiplication noise and ciphertext significantly grow, so it shall be reduced. But anyway, once maximum is reached, noise’s impact will make ciphertext undecryptable. Basically, most of the ciphers get stuck after 20-30 consequent multiplications. - Homomorphic Division.
In general - none existing in the World HE schemes aren’t able to perform homomorphic division. For instance, BGV and BFV schemes do not allow division due to the randomness and complexity of ciphertext. It’s possible to approximate division(only by CKKS) but using all kinds of expansions. FHE division of ciphertext A at ciphertext B is performed by computing the inverse of plaintext corresponding to B (decrypt, inverse and encrypt) and multiplying on plaintext of A and its result again encrypt. Because homomorphic division operations are very expensive (even much more expensive than homomorphic multiplication), none of existing libraries did not implement homomorphic division. It’s clear that such an approach is absolutely insecure and simply breaks all the idea of homomorphic encryption.
Aside note
In our third article we will describe the new approach, based on breakthrough math. fundamentals and very fast programmatic implementation (compared to any existing library, even our python prototype many orders of magnitude faster, for instance python implementation based on integers gives 500,000 homomorphic multiplications per second and also allows no limit of multiplicative depth). This implementation has no “noise”, “noise budget”, “recryption”, “bootstrapping”, “relinearization” etc. things. Our integer based implementation has literally endless multiplicative depth with no ciphertext grows. Float-point based also has none of the above “features”, but has limited multiplicative depth around 200+ multiplications, with acceptable ciphertext growth(in terms of size after 200 multiplication its size around 6784 bytes) Also homomorphic division is available in addition to all the advantages.
Challenges and use-cases
As we already mentioned at the very beginning, existing HE programmatic implementations are primarily intended for researchers, scientists and developers. Thus we shall stay in the position of common users and experienced users and declare some basic requirements to HE software to make it useful and user-friendly. Let’s make a short intro to such requirements, how they could look like:
- Let’s build a homomorphic analogue of Excel to operate in a similar way.
- Let’s make it possible to perform all 4 basic arithmetic operations: “-”, “+”, “*”, “/”.
- Multiplicative depth of such operations shall be big enough, at least more than 100-150 and above consequent operations.
Now, let’s consider a very simple procedure of calculating average yearly salary for some abstract group of employees in a private implementation, i.e. doing mathematical operations homomorphically. For that we need simply find a sum of salaries for all employees and then divide for employees amount. To make it possible via HE schemes we shall first encrypt data homomorphically to be able to operate with. It would seem easy… but no one of existing HE schemes can do that. Because none of them can perform division operations. The only solution varies researchers suggest is not private and brakes all the principles of homomorphic encryption, they suggest to deliver ciphertext to the owner of operation so it could decrypt data with own private key, then perform division, again encrypt data back and send it to original location (would it be cloud or whatever else)
Unfortunately, this one but super important “cons” makes a huge impact on the usage of any existing HE scheme, limiting it even more, in addition to “noise”, “bootstrapping” and other “features”. In the official procedure suggested as a replacement of homomorphic division - “decrypt-divide-encrypt” we have a procedure of private key usage. Imagine we have data stored in the Cloud, “chosen ciphertext attack” (CCA) becomes possible here. In the article “Danger of usage FHE”[14] described a scenario when a cybercriminal executes the attack “ Chosen-ciphertext attack: recover the BFV scheme private key with a single decryption query”, so generally speaking it’s such an attack which theoretically can be performed on any existing HE scheme to extract secret key ciphertext - this is specificness of existing schemes, where secret key included into ciphertext.
Division operation is also required for such basic computations as standard deviation, x2-test and many other statistical calculations. Such replacement makes a safety threat using theoretically secure existing HE libraries.
Let’s also take a look at another approach which is intended to increase the effectiveness and performance of existing libraries - hardware implementation. On the one hand, hardware implementation significantly increases the performance of HE scheme, but on the other hand creates new security threats, such as “side channel” attack. Recently such an attack was performed and described[15]. In general, as stated in the article “New insights into FHE libraries”([16], appendix B): “However, FHE is still in its instancy and continues to expand and become more efficient in terms of memory and computational overhead each year”.
Now, we are going to see how existing libraries could be used in various use-cases, which we set in the article N1 and also partially mentioned in the current article[17]
Use-case N1.
Previously described in our article N1, let us remind all use-cases:
“The owner of some company has sensitive data/information, which he would like to use for
computations, but he can’t share them even for own employees. The challenge is, that using standard
techniques such as AES/RSA etc, will make impossible computations on encrypted data, while
providing plaintext to 3-rd parties is not acceptable as well”.
But trying to use existing HE libraries also won't be possible due to its issues and being impractical still. To be able to work with FHE libraries required subject matter expert (SME), data holder has to hire him. Next - for every computation is required to set special parameters and, respectively, again encrypt data. It’s possible to make an analogue with the car, which shall be repaired before every ride. Having any other issues compared to this one, are much less critical. It’s barely possible that anyone would like to buy such a car.
Use-case N2.
Previously described in our article N1.
“A Hospital got a very good business proposition from some Research Center to make statistical
computations on patients data. But, of course, this Hospital can’t provide patients data because
it violates the law. Doing standard Encryption by the standard instruments, same as in use-case N1
makes impossible any computations on encrypted data too”.
Because the supply can be very valuably proposed, a Hospital may hire an expert to work with HE libraries in the mode of one-time-car as we stated above. But out of homomorphic encryption in any HE library makes the expense/cost of computations very high, and even more - it will make an issue with privacy, because proposed scheme by HE community and existing libraries required - “decrypt-inverse-encrypt” steps, instead of pure homomorphic division, so that plain-text (customer’s real data) is under the risk and makes them vulnerable for potential attacks.
Use-case N3.
Previously described in our article N1.
“Some abstract Bank, would like to build a model of investments for the upcoming year and be able
to predict some risks and potential income/revenue, using it. Usually, such things might be done
using Machine Learning / Deep Learning with some big, real financial data-sets. At the same time,
this Bank can’t provide such statistics and data-sets to anyone else outside. So how to solve this
and the above challenges?”
And again, the mode of one-time-car and out of homomorphic division doesn’t allow again to easily adopt described in this article HE libraries. Thus, for the wide adoption of HE computations required new schemes which may solve not only the issue of noise and its growth, as well as computational overhead but implementation of pure homomorphic division operation.
Conclusion.
In fact, still none of existing implementations can give good enough performance, where on all providing benchmarks researchers give generally good results, but doing so, it’s an artificial results because they do not take into account all required costs: which as we stated above are “budget noise”, “key switching”, “modulo switching” and many other operations which in sum neutralizes all the achievements in the performance. Having in addition bootstrapping and other mechanisms brings even higher cost for the performance, still considering it as impractical.
“Performance of any encryption scheme is evaluated with three different criteria: security, speed, and simplicity. First, an encryption scheme must be secure so that an attacker cannot obtain any type of information by using a reasonable amount of resources. Second, its efficiency must not disturb the user’s comfort; i.e., it must be transparent to the users because users prefer usability against security. Lastly, if and only if an encryption scheme is understandable by the other area practitioners, they will implement the scheme for their applications and productions. If the existing FHE schemes are evaluated in terms of the three criteria, there is, though getting closer, still substantial room for improvement in terms of all these criteria, especially for the speed performance”[18]
Hype? There are even startups, which try to build around existing implementations, some marketing deliveries, products to sell… but we think that it’s a hype now, which is just starting and they try to be first there. It can’t find any real customers due its expense, complexity and any performance improvements, so far shall face with decreasing security level to be able to sacrifice it to the performance.
Our third article is coming soon - “Practical and fast implementation of FHE scheme”, where we will describe our new, cutting edge math. fundamentals, crypto-primitives and programmatic implementation. Which is, as we see it, at the moment the only one that isn't based on the same mathematical fundamentals (lattice based) used by all HE schemes and the only one which can be considered as fully practical and really fast from FHE point of view, also being very user-friendly.
Stay tuned! July, 2022
References
[1] Homomorphic encryption. Part1. “Basics, history and use-cases.”
[2] PALISADE. Lattice crypto library. Originally funded and developed, led by DARPA - USA security agency. Developed in c++, python, java and mainly supports most of the popular HE schemes, such as CKKS, BGV, BFV.
[3] SEAL - Microsoft library. Implemented in c++ and supports various HE schemes, such as BFV and CKKS
[4] HElib - one of the first open-source libraries, developed by IBM in 2013, shortly after Gentry’s suggested FHE improvements. Developed in c++
[5] Lattigo is a lattice based, Go module that implements Ring-Learning-With-Errors-based homomorphic-encryption primitives and Multiparty-Homomorphic-Encryption-based secure protocols
[6] FHE library over torus. J. Chilotti,, J. of Cryptography, v.33, No.1(2020), pp-34-91
[7] C. Gentry. Fully Homomorphic Encryption scheme. Ph.D. thesis, Stanford Univ., 2009
[8] BGV - Z. Brakerski, C. Gentry. Leveled Fully Homomorphic Encryption scheme without bootstrapping. Theor. Comp. science conf. (2012) pp 309-325
[9] BFV - J. Fan & F. Vercaunteren. Somewhat practical Fully Homomorphic Encryption scheme.
[10] CKKS - H. Cheon, A. Kim, M. Kim, Y. Song. Homomorphic encryption for arithmetic of approximate numbers. Intern. conf. On theory and apps of cryptography (2017), pp 400-437
[11] DGHV - M. van Dijk, C. Gentry, S. Haleli and V. Vaikuntanalhan. Fully Homomorphic Encryption scheme over integers. Advances in cryptogr., EUROCRYPT-2010, 2010, pp 24-43
[12] A Review of Homomorphic Encryption Libraries for Secure Computation dec 2018
[13] A Survey on Homomorphic Encryption Schemes: Theory and Implementation, p.22
[15] F. Audin et al. RevEAL single-trace side channel leakage of SEAL homomorphic encryption library.
[16] Charles Gouert, Dimitris Mouris, and Nektarios Georgios Tsoutsos.
[17] A Survey on Homomorphic Encryption Schemes: Theory and Implementation,p.27