research-article

Privacy-preserving spam filtering using homomorphic and functional encryption

Authors:

Naveen Karunanayake,

Suranga Seneviratne,

Peizhao Hu

Volume 197, Issue C

Pages 230 - 241

https://doi.org/10.1016/j.comcom.2022.11.002

Published: 01 January 2023

Abstract

Conventional spam classification requires end-users to reveal the content of incoming emails to a classifier so that text analysis can be performed. New cryptographic primitives, however, allow this classification task to be performed on encrypted emails without revealing their contents, thereby preserving user data privacy. In this paper, we construct a spam classification framework that enables the classification of encrypted emails. Our model is based on a neural network with a quadratic network component and a multi-layer perceptron network component. The quadratic network architecture is compatible with the operation of an existing quadratic functional encryption scheme. To protect email content privacy, we propose two spam classification solutions, based on homomorphic encryption (HE) and functional encryption (FE), that enable our classifiers to predict the label of encrypted emails. The evaluation results on real-world spam datasets indicate that our proposed solutions achieve accuracies over 95%. Our performance study and security analysis identify the pros and cons of each proposed solution. For instance, the FE solution predicts the label of an encrypted email in less than 31 s, whereas the HE solution takes up to 265 s to do so. Nonetheless, unlike the FE solution, the HE solution is not prone to potential information leakage.
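The two-component model described in the abstract can be illustrated with a small plaintext sketch. It assumes one common form of quadratic network, a linear projection followed by element-wise squaring and a second linear map, so that every output is a degree-2 polynomial of the input features; the abstract does not confirm that the paper uses this exact form. The function names, shapes, and weights (`quadratic_layer`, `mlp`, `classify`, `DEMO_PARAMS`) are hypothetical, and no encryption is performed. In an FE-based deployment, the degree-2 part is the piece that a quadratic functional encryption scheme could in principle evaluate on a ciphertext.

```python
# Plaintext sketch only: no encryption is performed here.
# All names, shapes, and weights are hypothetical and not taken from the paper.

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def quadratic_layer(x, P, D):
    """Assumed quadratic component: D * (P x)^2, a degree-2 polynomial in x."""
    projected = matvec(P, x)              # linear projection of the features
    squared = [p * p for p in projected]  # element-wise square (degree 2)
    return matvec(D, squared)             # linear combination of the squares

def mlp(h, W1, b1, W2, b2):
    """Small multi-layer perceptron with one ReLU hidden layer."""
    hidden = [max(0.0, z + b) for z, b in zip(matvec(W1, h), b1)]
    return [z + b for z, b in zip(matvec(W2, hidden), b2)]

def classify(x, params):
    """Run both components and return 'spam' or 'ham' from two logits."""
    q = quadratic_layer(x, params["P"], params["D"])
    logits = mlp(q, params["W1"], params["b1"], params["W2"], params["b2"])
    return "spam" if logits[1] > logits[0] else "ham"

# Toy weights chosen by hand for illustration; a real model would learn them.
DEMO_PARAMS = {
    "P": [[1.0, 0.0], [0.0, 1.0]],
    "D": [[1.0, 1.0]],
    "W1": [[1.0]],
    "b1": [0.0],
    "W2": [[-1.0], [1.0]],
    "b2": [3.0, 0.0],
}

print(classify([1.0, 2.0], DEMO_PARAMS))
```

Keeping the first component strictly quadratic is what makes the architecture compatible with a quadratic FE scheme; the multi-layer perceptron can then use ordinary non-linearities on the values the first component produces.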

Published In

Computer Communications Volume 197, Issue C

Jan 2023

307 pages

ISSN:0140-3664

Elsevier B.V.

Publisher

Elsevier Science Publishers B. V.

Netherlands
