417 results sorted by ID
Possible spell-corrected query: parallel implementation
An efficient collision attack on Castryck-Decru-Smith’s hash function
Ryo Ohashi, Hiroshi Onuki
Attacks and cryptanalysis
In 2020, Castryck-Decru-Smith constructed a hash function, using the (2,2)-isogeny graph of superspecial principally polarized abelian surfaces. In their construction, the initial surface was chosen from vertices very "close" to the square of a supersingular elliptic curve with a known endomorphism ring.
In this paper, we introduce an algorithm for detecting a collision on their hash function. Under some heuristic assumptions, the time complexity and space complexity of our algorithm are...
Secure Computation with Parallel Calls to 2-ary Functions
Varun Narayanan, Shubham Vivek Pawar, Akshayaram Srinivasan
Cryptographic protocols
Reductions are the workhorses of cryptography. They allow constructions of complex cryptographic primitives from simple building blocks. A prominent example is the non-interactive reduction from securely computing a ``complex" function $f$ to securely computing a ``simple" function $g$ via randomized encodings.
Prior work equated simplicity with functions of small degree. In this work, we consider a different notion of simplicity where we require $g$ to only take inputs from a small...
Optimized One-Dimensional SQIsign Verification on Intel and Cortex-M4
Marius A. Aardal, Gora Adj, Arwa Alblooshi, Diego F. Aranha, Isaac A. Canales-Martínez, Jorge Chavez-Saab, Décio Luiz Gazzoni Filho, Krijn Reijnders, Francisco Rodríguez-Henríquez
Public-key cryptography
SQIsign is a well-known post-quantum signature scheme due to its small combined signature and public-key size. However, SQIsign suffers from notably long signing times, and verification times are not short either. To improve this, recent research has explored both one-dimensional and two-dimensional variants of SQIsign, each with distinct characteristics. In particular, SQIsign2D's efficient signing and verification times have made it a focal point of recent research. However, the absence of...
HEonGPU: a GPU-based Fully Homomorphic Encryption Library 1.0
Ali Şah Özcan, Erkay Savaş
Implementation
HEonGPU is a high-performance library designed to optimize Fully Homomorphic Encryption (FHE) operations on Graphics Processing Unit (GPU). By leveraging the parallel processing capac- ity of GPUs, HEonGPU significantly reduces the computational overhead typically associated with FHE by executing complex operation concurrently. This allows for faster execution of homomorphic computations on encrypted data, enabling real-time applications in privacy-preserving machine learn- ing and secure...
Marian: An Open Source RISC-V Processor with Zvk Vector Cryptography Extensions
Thomas Szymkowiak, Endrit Isufi, Markku-Juhani Saarinen
Implementation
The RISC-V Vector Cryptography Extensions (Zvk) were ratified in 2023 and integrated into the main ISA manuals in 2024. These extensions support high-speed symmetric cryptography (AES, SHA2, SM3, SM4) operating on the vector register file and offer significant performance improvements over scalar cryptography extensions (Zk) due to data parallelism. As a ratified extension, Zvk is supported by compiler toolchains and is already being integrated into popular cryptographic middleware such as...
Scabbard: An Exploratory Study on Hardware Aware Design Choices of Learning with Rounding-based Key Encapsulation Mechanisms
Suparna Kundu, Quinten Norga, Angshuman Karmakar, Shreya Gangopadhyay, Jose Maria Bermudo Mera, Ingrid Verbauwhede
Implementation
Recently, the construction of cryptographic schemes based on hard lattice problems has gained immense popularity. Apart from being quantum resistant, lattice-based cryptography allows a wide range of variations in the underlying hard problem. As cryptographic schemes can work in different environments under different operational constraints such as memory footprint, silicon area, efficiency, power requirement, etc., such variations in the underlying hard problem are very useful for designers...
High-Throughput GPU Implementation of Dilithium Post-Quantum Digital Signature
Shiyu Shen, Hao Yang, Wangchen Dai, Hong Zhang, Zhe Liu, Yunlei Zhao
Implementation
Digital signatures are fundamental building blocks in various protocols to provide integrity and authenticity. The development of the quantum computing has raised concerns about the security guarantees afforded by classical signature schemes. CRYSTALS-Dilithium is an efficient post-quantum digital signature scheme based on lattice cryptography and has been selected as the primary algorithm for standardization by the National Institute of Standards and Technology. In this work, we present a...
A Compact and Parallel Swap-Based Shuffler based on butterfly Network and its complexity against Side Channel Analysis
Jong-Yeon Park, Wonil Lee, Bo Gyeong Kang, Il-jong Song, Jaekeun Oh, Kouichi Sakurai
Foundations
A prominent countermeasure against side channel attacks, the hiding countermeasure, typically involves shuffling operations using a permutation algorithm. Especially in the era of Post-Quantum Cryptography, the importance of the hiding coun- termeasure is emphasized due to computational characteristics like those of lattice and code-based cryptography. In this context, swiftly and securely generating permutations has a critical impact on an algorithm’s security and efficiency. The widely...
Hekaton: Horizontally-Scalable zkSNARKs via Proof Aggregation
Michael Rosenberg, Tushar Mopuri, Hossein Hafezi, Ian Miers, Pratyush Mishra
Cryptographic protocols
Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zkSNARKs) allow a prover to convince a verifier of the correct execution of a large computation in private and easily-verifiable manner. These properties make zkSNARKs a powerful tool for adding accountability, scalability, and privacy to numerous systems such as blockchains and verifiable key directories. Unfortunately, existing zkSNARKs are unable to scale to large computations due to time and space complexity requirements...
Optimized Privacy-Preserving Clustering with Fully Homomorphic Encryption
Chen Yang, Jingwei Chen, Wenyuan Wu, Yong Feng
Public-key cryptography
Clustering is a crucial unsupervised learning method extensively used in the field of data analysis. For analyzing big data, outsourced computation is an effective solution but privacy concerns arise when involving sensitive information. Fully homomorphic encryption (FHE) enables computations on encrypted data, making it ideal for such scenarios. However, existing privacy-preserving clustering based on FHE are often constrained by the high computational overhead incurred from FHE, typically...
Oryx: Private detection of cycles in federated graphs
Ke Zhong, Sebastian Angel
Cryptographic protocols
This paper proposes Oryx, a system for efficiently detecting cycles in federated graphs where parts of the graph are held by different parties and are private. Cycle detection is an important building block in designing fraud detection algorithms that operate on confidential transaction data held by different financial institutions. Oryx allows detecting cycles of various length while keeping the topology of the graphs secret, and it does so efficiently; Oryx achieves quasilinear...
Quantum Implementation of LSH
Yujin Oh, Kyungbae Jang, Hwajeong Seo
Implementation
As quantum computing progresses, the assessment of cryptographic algorithm resilience against quantum attack gains significance interests in the field of cryptanalysis. Consequently, this paper implements the depth-optimized quantum circuit of Korean hash function (i.e., LSH) and estimates its quantum attack cost in quantum circuits. By utilizing an optimized quantum adder and employing parallelization techniques, the proposed quantum circuit achieves a 78.8\% improvement in full depth and a...
GRASP: Accelerating Hash-based PQC Performance on GPU Parallel Architecture
Yijing Ning, Jiankuo Dong, Jingqiang Lin, Fangyu Zheng, Yu Fu, Zhenjiang Dong, Fu Xiao
Implementation
$SPHINCS^+$, one of the Post-Quantum Cryptography Digital Signature Algorithms (PQC-DSA) selected by NIST in the third round, features very short public and private key lengths but faces significant performance challenges compared to other post-quantum cryptographic schemes, limiting its suitability for real-world applications. To address these challenges, we propose the GPU-based paRallel Accelerated $SPHINCS^+$ (GRASP), which leverages GPU technology to enhance the efficiency of...
ICICLE v2: Polynomial API for Coding ZK Provers to Run on Specialized Hardware
Karthik Inbasekar, Yuval Shekel, Michael Asa
Applications
Polynomials play a central role in cryptography. In the context of Zero Knowledge Proofs (ZKPs), protocols can be exclusively expressed using polynomials, making them a powerful abstraction tool, as demonstrated in most ZK research papers. Our first contribution is a high-level framework that enables practitioners to implement ZKPs in a more natural way, based solely on polynomial primitives.
ZK provers are considered computationally intensive algorithms with a high degree of...
Approximate CRT-Based Gadget Decomposition and Application to TFHE Blind Rotation
Olivier Bernard, Marc Joye
Implementation
One of the main issues to deal with for fully homomorphic encryption is the noise growth when operating on ciphertexts. To some extent, this can be controlled thanks to a so-called gadget decomposition. A gadget decomposition typically relies on radix- or CRT-based representations to split elements as vectors of smaller chunks whose inner products with the corresponding gadget vector rebuilds (an approximation of) the original elements. Radix-based gadget decompositions present the advantage...
Low-Latency Linear Transformations with Small Key Transmission for Private Neural Network on Homomorphic Encryption
Byeong-Seo Min, Joon-Woo Lee
Applications
In the field of Artificial Intelligence (AI), convolution operations have primarily been used in Convolutional Neural Networks (CNNs). However, its utility is increasing with the appearance of convolution integrated transformers or state space models where convolution is a constituent element. In the field of private AI, generalized algorithm, multiplexed parallel convolution was recently proposed to implement CNNs based on the Homomorphic Encryption scheme, residue number system variant...
On the parallelization of square-root Vélu's formulas
Jorge Chávez-Saab, Odalis Ortega, Amalia Pizarro-Madariaga
Implementation
A primary challenge in isogeny-based cryptography lies in the substantial computational cost associated to computing and evaluating prime-degree isogenies. This computation traditionally relied on Vélu's formulas, an approach with time complexity linear in the degree but which was further enhanced by Bernstein, De Feo, Leroux, and Smith to a square-root complexity. The improved square-root Vélu's formulas exhibit a degree of parallelizability that has not been exploited in major...
Faster verifications and smaller signatures: Trade-offs for ALTEQ using rejections
Arnaud Sipasseuth
Public-key cryptography
In this paper, we introduce a new probability function parameter in the instantiations of the Goldreich-Micali-Wigderson with Fiat-Shamir and unbalanced challenges used in ALTEQ, a recent NIST PQC candidate in the call for additional signatures. This probability set at 100% does not bring any changes in the scheme, but modifies the public challenge generation process when below 100%, by injecting potential rejections in otherwise completely valid inputs.
From a theoretical point of view,...
White-box filtering attacks breaking SEL masking: from exponential to polynomial time
Alex Charlès, Aleksei Udovenko
Attacks and cryptanalysis
This work proposes a new white-box attack technique called filtering, which can be combined with any other trace-based attack method. The idea is to filter the traces based on the value of an intermediate variable in the implementation, aiming to fix a share of a sensitive value and degrade the security of an involved masking scheme.
Coupled with LDA (filtered LDA, FLDA), it leads to an attack defeating the state-of-the-art SEL masking scheme (CHES 2021) of arbitrary degree and number of...
Fast Parallelizable Misuse-Resistant Authenticated Encryption: Low Latency (Decryption-Fast) SIV
Mustafa Khairallah
Secret-key cryptography
MRAE security is an important goal for many AEAD applications where the nonce uniqueness cannot be maintained and security risks are significant. However, MRAE schemes can be quite expensive. Two of the SoTA MRAE-secure schemes; Deoxys-II and AES-GCM-SIV rely on internal parallelism and special instructions to achieve competitive performance. However, they both suffer from the same bottleneck, they have at least one call to the underlying primitive that cannot be parallelized to any other...
Number-Theoretic Transform Architecture for Fully Homomorphic Encryption from Hypercube Topology
Jingwei Hu, Yuhong Fang, Wangchen Dai
Implementation
This paper introduces a high-performance and scalable hardware architecture designed for the Number-Theoretic Transform (NTT), a fundamental component extensively utilized in lattice-based encryption and fully homomorphic encryption schemes.
The underlying rationale behind this research is to harness the advantages of the hypercube topology. This topology serves to significantly diminish the volume of data exchanges required during each iteration of the NTT, reducing it to a complexity of...
Reckle Trees: Updatable Merkle Batch Proofs with Applications
Charalampos Papamanthou, Shravan Srinivasan, Nicolas Gailly, Ismael Hishon-Rezaizadeh, Andrus Salumets, Stjepan Golemac
Cryptographic protocols
We propose Reckle trees, a new vector commitment based on succinct RECursive arguments and MerKLE trees. Reckle trees' distinguishing feature is their support for succinct batch proofs that are updatable - enabling new applications in the blockchain setting where a proof needs to be computed and efficiently maintained over a moving stream of blocks. Our technical approach is based on embedding the computation of the batch hash inside the recursive Merkle verification via a hash-based...
Ceno: Non-uniform, Segment and Parallel Zero-knowledge Virtual Machine
Tianyi Liu, Zhenfei Zhang, Yuncong Zhang, Wenqing Hu, Ye Zhang
Cryptographic protocols
In this paper, we explore a novel Zero-knowledge Virtual Machine (zkVM) framework leveraging succinct, non-interactive zero-knowledge proofs for verifiable computation over any code. Our approach divides the proof of program execution into two stages. In the first stage, the process breaks down program execution into segments, identifying and grouping identical sections. These segments are then proved through data-parallel circuits that allow for varying amounts of duplication. In the...
Simple constructions of linear-depth t-designs and pseudorandom unitaries
Tony Metger, Alexander Poremba, Makrand Sinha, Henry Yuen
Foundations
Uniformly random unitaries, i.e., unitaries drawn from the Haar measure, have many useful properties, but cannot be implemented efficiently. This has motivated a long line of research into random unitaries that ``look'' sufficiently Haar random while also being efficient to implement.
Two different notions of derandomisation have emerged:
$t$-designs are random unitaries that information-theoretically reproduce the first $t$ moments of the Haar measure, and pseudorandom unitaries (PRUs)...
Multiplex: TBC-based Authenticated Encryption with Sponge-Like Rate
Thomas Peters, Yaobin Shen, François-Xavier Standaert
Secret-key cryptography
Authenticated Encryption (AE) modes of operation based on Tweakable Block Ciphers (TBC) usually measure efficiency in the number of calls to the underlying primitive per message block. On the one hand, many existing solutions reach a primitive-rate of 1, meaning that each n-bit block of message asymptotically needs a single call to the TBC with output length n. On the other hand, while these modes look optimal in a blackbox setting, they become less attractive when leakage comes into play,...
Hardware Acceleration of the Prime-Factor and Rader NTT for BGV Fully Homomorphic Encryption
David Du Pont, Jonas Bertels, Furkan Turan, Michiel Van Beirendonck, Ingrid Verbauwhede
Implementation
Fully Homomorphic Encryption (FHE) enables computation on encrypted data, holding immense potential for enhancing data privacy and security in various applications. Presently, FHE adoption is hindered by slow computation times, caused by data being encrypted into large polynomials. Optimized FHE libraries and hardware acceleration are emerging to tackle this performance bottleneck. Often, these libraries implement the Number Theoretic Transform (NTT) algorithm for efficient polynomial...
LightDAG: A Low-latency DAG-based BFT Consensus through Lightweight Broadcast
Xiaohai Dai, Guanxiong Wang, Jiang Xiao, Zhengxuan Guo, Rui Hao, Xia Xie, Hai Jin
Applications
To improve the throughput of Byzantine Fault Tolerance (BFT) consensus protocols, the Directed Acyclic Graph (DAG) topology has been introduced to parallel data processing, leading to the development of DAG-based BFT consensus. However, existing DAG-based works heavily rely on Reliable Broadcast (RBC) protocols for block broadcasting, which introduces significant latency due to the three communication steps involved in each RBC. For instance, DAGRider, a representative DAG-based protocol,...
Logstar: Efficient Linear* Time Secure Merge
Suvradip Chakraborty, Stanislav Peceny, Srinivasan Raghuraman, Peter Rindal
Cryptographic protocols
Secure merge considers the problem of combining two sorted lists into a single sorted secret-shared list. Merge is a fundamental building block for many real-world applications. For example, secure merge can implement a large number of SQL-like database joins, which are essential for almost any data processing task such as privacy-preserving fraud detection, ad conversion rates, data deduplication, and many more.
We present two constructions with communication bandwidth and rounds...
Scalable Collaborative zk-SNARK: Fully Distributed Proof Generation and Malicious Security
Xuanming Liu, Zhelei Zhou, Yinghao Wang, Bingsheng Zhang, Xiaohu Yang
Cryptographic protocols
The notion of collaborative zk-SNARK is introduced by Ozdemir and Boneh (USENIX 2022), which allows multiple parties to jointly create a zk-SNARK proof over distributed secrets (also known as the witness).
This approach ensures the privacy of the witness, as no corrupted servers involved in the proof generation can learn anything about the honest servers' witness.
Later, Garg et al. continued the study, focusing on how to achieve faster proof generation (USENIX 2023).
However, their...
Practical Post-Quantum Signatures for Privacy
Sven Argo, Tim Güneysu, Corentin Jeudy, Georg Land, Adeline Roux-Langlois, Olivier Sanders
Public-key cryptography
The transition to post-quantum cryptography has been an enormous challenge and effort for cryptographers over the last decade, with impressive results such as the future NIST standards. However, the latter has so far only considered central cryptographic mechanisms (signatures or KEM) and not more advanced ones, e.g., targeting privacy-preserving applications. Of particular interest is the family of solutions called blind signatures, group signatures and anonymous credentials, for which...
Elastic MSM: A Fast, Elastic and Modular Preprocessing Technique for Multi-Scalar Multiplication Algorithm on GPUs
Xudong Zhu, Haoqi He, Zhengbang Yang, Yi Deng, Lutan Zhao, Rui Hou
Implementation
Zero-knowledge proof (ZKP) is a cryptographic primitive that enables a prover to convince a verifier that a statement is true, without revealing any other information beyond the correctness of the statement itself. Due to its powerful capabilities, its most practical type, called zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK), has been widely deployed in various privacy preserving applications such as cryptocurrencies and verifiable computation. Although...
Blink: Breaking Lattice-Based Schemes Implemented in Parallel with Chosen-Ciphertext Attack
Jian Wang, Weiqiong Cao, Hua Chen, Haoyuan Li
Attacks and cryptanalysis
As the message recovery-based attack poses a serious threat to lattice-based schemes, we conducted a study on the side-channel secu- rity of parallel implementations of lattice-based key encapsulation mech- anisms. Initially, we developed a power model to describe the power leakage during message encoding. Utilizing this power model, we pro- pose a multi-ciphertext message recovery attack, which can retrieve the required messages for a chosen ciphertext attack through a suitable mes- sage...
In-depth Correlation Power Analysis Attacks on a Hardware Implementation of CRYSTALS-Dilithium
Huaxin Wang, Yiwen Gao, Yuejun Liu, Qian Zhang, Yongbin Zhou
Attacks and cryptanalysis
During the standardisation process of post-quantum cryptography, NIST encourages research on side-channel analysis for candidate schemes. As the recommended lattice signature scheme, CRYSTALS-Dilithium, when implemented on hardware, has only been subjected to the side-channel attack presented by Steffen et al. in IACR ePrint 2022. This attack is not complete and requires excessive traces. Therefore, we investigate the leakage of an FPGA (Kintex7) implementation of CRYSTALS-Dilithium using...
2023/1889
Last updated: 2024-10-09
Fully Parallel, One-Cycle Random Shuffling for Efficient Countermeasure against Side Channel Attack and its Complexity Verification.
Jong-Yeon Park, Dongsoo Lee, Seonggyeom Kim, Wonil lee, Bo Gyeong Kang, Kouichi Sakurai
Foundations
Hiding countermeasures are the most widely utilized techniques for thwarting side-channel attacks, and their significance has been further emphasized with the advent of Post Quantum Cryptography (PQC) algorithms, owing to the extensive use of vector operations. Commonly, the Fisher-Yates algorithm is adopted in hiding countermeasures with permuted operation for its security and efficiency in implementation, yet the inherently sequential nature of the algorithm imposes limitations on hardware...
XorSHAP: Privacy-Preserving Explainable AI for Decision Tree Models
Dimitar Jetchev, Marius Vuille
Applications
Explainable AI (XAI) refers to the development of AI systems and machine learning models in a way that humans can understand, interpret and trust the predictions, decisions and outputs of these models. A common approach to explainability is feature importance, that is, determining which input features of the model have the most significant impact on the model prediction. Two major techniques for computing feature importance are LIME (Local Interpretable Model-agnostic Explanations) and...
Optimizing AES Threshold Implementation under the Glitch-Extended Probing Model
Fu Yao, Hua Chen, Yongzhuang Wei, Enes Pasalic, Feng Zhou, Limin Fan
Implementation
Threshold Implementation (TI) is a well-known Boolean masking technique that provides provable security against side-channel attacks. In the presence of glitches, the probing model was replaced by the so-called glitch-extended probing model which specifies a broader security framework. In CHES 2021, Shahmirzadi et al. introduced a general search method for finding first-order 2-share TI schemes without fresh randomness (under the presence of glitches) for a given encryption algorithm....
Design of a Linear Layer Optimised for Bitsliced 32-bit Implementation
Gaëtan Leurent, Clara Pernot
Secret-key cryptography
The linear layer of block ciphers plays an important role in their security.
In particular, ciphers designed following the wide-trail strategy use the branch number of the linear layer to derive bounds on the probability of linear and differential trails.
At FSE 2014, the LS-design construction was introduced as a simple and regular structure to design bitsliced block ciphers. It considers the internal state as a bit matrix, and applies alternatively an identical S-Box on all the columns,...
Sloth: Key Stretching and Deniable Encryption using Secure Elements on Smartphones
Daniel Hugenroth, Alberto Sonnino, Sam Cutler, Alastair R. Beresford
Cryptographic protocols
Privacy enhancing technologies must not only protect sensitive data in-transit, but also locally at-rest. For example, anonymity networks hide the sender and/or recipient of a message from network adversaries. However, if a participating device is physically captured, its owner can be pressured to give access to the stored conversations. Therefore, client software should allow the user to plausibly deny the existence of meaningful data. Since biometrics can be collected without consent and...
A Lattice Attack on CRYSTALS-Kyber with Correlation Power Analysis
Yen-Ting Kuo, Atsushi Takayasu
Attacks and cryptanalysis
CRYSTALS-Kyber is a key-encapsulation mechanism, whose security is based on the hardness of solving the learning-with-errors (LWE) problem over module lattices. As in its specification, Kyber prescribes the usage of the Number Theoretic Transform (NTT) for efficient polynomial multiplication. Side-channel assisted attacks against Post-Quantum Cryptography (PQC) algorithms like Kyber remain a concern in the ongoing standardization process of quantum-computer-resistant cryptosystems. Among the...
Concrete Analysis of Quantum Lattice Enumeration
Shi Bai, Maya-Iggy van Hoof, Floyd B. Johnson, Tanja Lange, Tran Ngo
Attacks and cryptanalysis
Lattice reduction algorithms such as BKZ (Block-Korkine-Zolotarev) play a central role in estimating the security of lattice-based cryptography. The subroutine in BKZ which finds the shortest vector in a projected sublattice can be instantiated with enumeration algorithms. The enumeration procedure can be seen as a depth-first search on some ``enumeration tree'' whose nodes denote a partial assignment of the coefficients, corresponding to lattice points as a linear combination of the lattice...
TMVP-based Polynomial Convolution for Saber and Sable on GPU using CUDA-cores and Tensor-cores
Muhammad Asfand Hafeez, Wai-Kong Lee, Angshuman Karmakar, Seong Oun Hwang
Implementation
Recently proposed lattice-based cryptography algorithms can be used to protect the IoT communication against the threat from quantum computers, but they are computationally heavy. In particular, polynomial multiplication is one of the most time-consuming operations in lattice-based cryptography. To achieve efficient implementation, the Number Theoretic Transform (NTT) algorithm is an ideal choice, but it has certain limitations on the parameters, which not all lattice-based schemes can...
Leveraging GPU in Homomorphic Encryption: Framework Design and Analysis of BFV Variants
Shiyu Shen, Hao Yang, Wangchen Dai, Lu Zhou, Zhe Liu, Yunlei Zhao
Implementation
Homomorphic Encryption (HE) enhances data security by facilitating computations on encrypted data, opening new paths for privacy-focused computations. The Brakerski-Fan-Vercauteren (BFV) scheme, a promising HE scheme, raises considerable performance challenges. Graphics Processing Units (GPUs), with considerable parallel processing abilities, have emerged as an effective solution.
In this work, we present an in-depth study focusing on accelerating and comparing BFV variants on GPUs,...
XNET: A Real-Time Unified Secure Inference Framework Using Homomorphic Encryption
Hao Yang, Shiyu Shen, Siyang Jiang, Lu Zhou, Wangchen Dai, Yunlei Zhao
Applications
Homomorphic Encryption (HE) presents a promising solution to securing neural networks for Machine Learning as a Service (MLaaS). Despite its potential, the real-time applicability of current HE-based solutions remains a challenge, and the diversity in network structures often results in inefficient implementations and maintenance. To address these issues, we introduce a unified and compact network structure for real-time inference in convolutional neural networks based on HE. We further...
Parallel Hardware for Isogeny-based VDF: Attacker's Perspective
David Jacquemin, Anisha Mukherjee, Ahmet Can Mert, Sujoy Sinha Roy
Implementation
The long running time of isogeny-based cryptographic constructions has proved to be a boon in disguise for one particular type of primitive called Verifiable Delay Functions (VDFs). VDFs are characterised by sequential function evaluation but an immediate output verification. In order to ensure secure use of VDFs in real-world applications, it is important to determine the fastest implementation. Considering the point of view of an attacker (say with unbounded resources), this paper aims to...
Multimixer-128: Universal Keyed Hashing Based on Integer Multiplication
Koustabh Ghosh, Parisa Amiri Eliasi, Joan Daemen
Secret-key cryptography
In this paper we introduce a new keyed hash function based on 32-bit integer multiplication that we call Multimixer-128. In our approach, we follow the key-then-hash parallel paradigm. So, we first add a variable length input message to a secret key and split the result into blocks. A fixed length public function based on integer multiplication is then applied on each block and their results are added to form the digest. We prove an upper bound of $2^{-127}$ for the universality of...
ACE-HoT: Accelerating an extreme amount of symmetric Cipher Evaluations for High-Order avalanche Tests
Emanuele Bellini, Juan Grados, Mohamed Rachidi, Nitin Satpute, Joan Daemen, Solane Elhirch
Implementation
In this work, we tackle the problem of estimating the security of iterated symmetric ciphers in an efficient manner, with tests that do not require a deep analysis of the internal structure of the cipher. This is particularly useful during the design phase of these ciphers, especially for quickly testing several combinations of possible parameters defining several cipher design variants.
We consider a popular statistical test that allows us to determine the probability of flipping each...
Antrag: Annular NTRU Trapdoor Generation
Thomas Espitau, Thi Thu Quyen Nguyen, Chao Sun, Mehdi Tibouchi, Alexandre Wallet
Public-key cryptography
In this paper, we introduce a novel trapdoor generation technique for
Prest's hybrid sampler over NTRU lattices. Prest's sampler is used in
particular in the recently proposed Mitaka signature scheme
(Eurocrypt 2022), a variant of the Falcon signature scheme, one of the
candidates selected by NIST for standardization. Mitaka was introduced
to address Falcon's main drawback, namely the fact that the lattice
Gaussian sampler used in its signature generation is highly...
Boosting the Performance of High-Assurance Cryptography: Parallel Execution and Optimizing Memory Access in Formally-Verified Line-Point Zero-Knowledge
Samuel Dittmer, Karim Eldefrawy, Stéphane Graham-Lengrand, Steve Lu, Rafail Ostrovsky, Vitor Pereira
Cryptographic protocols
Despite the notable advances in the development of high-assurance, verified implementations of cryptographic protocols, such implementations typically face significant performance overheads, particularly due to the penalties induced by formal verification and automated extraction of executable code. In this paper, we address some core performance challenges facing computer-aided cryptography by presenting a formal treatment for accelerating such verified implementations based on multiple...
Pianist: Scalable zkRollups via Fully Distributed Zero-Knowledge Proofs
Tianyi Liu, Tiancheng Xie, Jiaheng Zhang, Dawn Song, Yupeng Zhang
Cryptographic protocols
In the past decade, blockchains have seen various financial and technological innovations, with cryptocurrencies reaching a market cap of over 1 trillion dollars. However, scalability is one of the key issues hindering the deployment of blockchains in many applications. To improve the throughput of the transactions, zkRollups and zkEVM techniques using the cryptographic primitive of zero-knowledge proofs (ZKPs) have been proposed and many companies are adopting these technologies in the...
Whipping the MAYO Signature Scheme using Hardware Platforms
Florian Hirner, Michael Streibl, Florian Krieger, Ahmet Can Mert, Sujoy Sinha Roy
Implementation
NIST issued a new call in 2023 to diversify the portfolio of quantum-resistant digital signature schemes since the current portfolio relies on lattice problems. The MAYO scheme, which builds on the Unbalanced Oil and Vinegar (UOV) problem, is a promising candidate for this new call. MAYO introduces emulsifier maps and a novel 'whipping' technique to significantly reduce the key sizes compared to previous UOV schemes.
This paper provides a comprehensive analysis of the implementation...
LOL: A Highly Flexible Framework for Designing Stream Ciphers
Dengguo Feng, Lin Jiao, Yonglin Hao, Qunxiong Zheng, Wenling Wu, Wenfeng Qi, Lei Zhang, Liting Zhang, Siwei Sun, Tian Tian
Secret-key cryptography
In this paper, we propose LOL, a general framework for designing blockwise stream ciphers, to achieve ultrafast software implementations for the ubiquitous virtual networks in 5G/6G environments and high-security level for post-quantum cryptography. The LOL framework is structurally strong, and all its components as well as the LOL framework itself enjoy high flexibility with various extensions. Following the LOL framework, we propose new stream cipher designs named LOL-MINI and LOL-DOUBLE...
HI-Kyber: A novel high-performance implementation scheme of Kyber based on GPU
Xinyi Ji, Jiankuo Dong, Pinchang Zhang, Deng Tonggui, Hua Jiafeng, Fu Xiao
Implementation
CRYSTALS-Kyber, as the only public key encryption (PKE) algorithm selected by the National Institute of Standards and Technology (NIST) in the third round, is considered one of the most promising post-quantum cryptography (PQC) schemes. Lattice-based cryptography uses complex discrete alogarithm problems on lattices to build secure encryption and decryption systems to resist attacks from quantum computing. Performance is an important bottleneck affecting the promotion of post quantum...
Faster cellular automata cryptosystems with neighbor sequences
Kittiphop Phalakarn, Athasit Surarerks
The encryption processes and cryptosystems are very important. We use them to protect our private information over the Internet. Cellular automata are ones of the computational models that can also be used in cryptosystems. The advantage of the cellular automata is their abilities to work in parallel, and thus can reduce the encryption time. Some applications require the encryption time to be small, so this paper aims to reduce the encryption time of the cellular automata cryptosystems. We...
Long Paper: Provable Secure Parallel Gadgets
Francesco Berti, Sebastian Faust, Maximilian Orlt
Side-channel attacks are a fundamental threat to the security of cryptographic implementations. One of the most prominent countermeasures against side-channel attacks is masking, where each intermediate value of the computation is secret shared, thereby concealing the computation's sensitive information. An important security model to study the security of masking schemes is the random probing model, in which the adversary obtains each intermediate value of the computation with some...
Composable Oblivious Pseudo-Random Functions via Garbled Circuits
Sebastian Faller, Astrid Ottenhues, Johannes Ottenhues
Cryptographic protocols
Oblivious Pseudo-Random Functions (OPRFs) are a central
tool for building modern protocols for authentication and distributed
computation. For example, OPRFs enable simple login protocols that do
not reveal the password to the provider, which helps to mitigate known
shortcomings of password-based authentication such as password reuse
or mix-up. Reliable treatment of passwords becomes more and more
important as we login to a multitude of services with different passwords
in our daily...
Analysis of Parallel Implementation of Pilsung Block Cipher On Graphics Processing Unit
Siwoo Eum, Hyunjun Kim, Minho Song, Hwajeong Seo
Implementation
This paper focuses on the GPU implementation of the Pilsung block cipher used in the Red Star 3.0 operating system developed in North Korea. The Pilsung block cipher is designed based on AES. One notable feature of the Pilsung block cipher is that the table calculations required for encryption take longer than the encryption process itself. This paper emphasizes the parallel implementation of the Pilsung block cipher by leveraging the parallel processing capabilities of GPUs and evaluates...
Randomness Generation for Secure Hardware Masking - Unrolled Trivium to the Rescue
Gaëtan Cassiers, Loïc Masure, Charles Momin, Thorben Moos, Amir Moradi, François-Xavier Standaert
Implementation
Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are...
A Side-Channel Attack on a Bitsliced Higher-Order Masked CRYSTALS-Kyber Implementation
Ruize Wang, Martin Brisfors, Elena Dubrova
Attacks and cryptanalysis
In response to side-channel attacks on masked implementations of post-quantum cryptographic algorithms, a new bitsliced higher-order masked implementation of CRYSTALS-Kyber has been presented at CHES'2022. The bitsliced implementations are typically more difficult to break by side-channel analysis because they execute a single instruction across multiple bits in parallel. However, in this paper, we reveal new vulnerabilities in the masked Boolean to arithmetic conversion procedure of this...
PQC Cloudization: Rapid Prototyping of Scalable NTT/INTT Architecture to Accelerate Kyber
Mojtaba Bisheh-Niasar, Daniel Lo, Anjana Parthasarathy, Blake Pelton, Bharat Pillilli, Bryan Kelly
Public-key cryptography
The advent of quantum computers poses a serious challenge to the security of cloud infrastructures and services, as they can potentially break the existing public-key cryptosystems, such as Rivest–Shamir–Adleman (RSA) and Elliptic Curve Cryptography (ECC). Even though the gap between today’s quantum computers and the threats they pose to current public-key cryptography is large, the cloud landscape should act proactively and initiate the transition to the post-quantum era as early as...
2023/1028
Last updated: 2024-01-12
Revocable IBE with En-DKER from Lattices: A Novel Approach for Lattice Basis Delegation
Qi Wang, Haodong Huang, Juyan Li, Qi Yuan
Public-key cryptography
In public key encryption (PKE), anonymity is essential to ensure privacy by preventing the ciphertext from revealing the recipient’s identity. However, the literature has addressed the anonymity of PKE under different attack scenarios to a limited extent. Benhamouda et al. (TCC 2020) introduced the first formal definition of anonymity for PKE under corruption, and Huang et al. (ASIACRYPT 2022) made further extensions and provided a generic framework.
In this paper, we introduce a new...
At Last! A Homomorphic AES Evaluation in Less than 30 Seconds by Means of TFHE
Daphné Trama, Pierre-Emmanuel Clet, Aymen Boudguiga, Renaud Sirdey
Implementation
Since the pioneering work of Gentry, Halevi, and Smart in 2012, the state of the art on transciphering has moved away from work on AES to focus on new symmetric algorithms that are better suited for a homomorphic execution. Yet, with recent advances in homomorphic cryptosystems, the question arises as to where we stand today. Especially since AES execution is the application that may be chosen by NIST in the FHE part of its future call for threshold encryption.
In this paper, we propose an...
Concurrent Asynchronous Byzantine Agreement in Expected-Constant Rounds, Revisited
Ran Cohen, Pouyan Forghani, Juan Garay, Rutvik Patel, Vassilis Zikas
Cryptographic protocols
It is well known that without randomization, Byzantine agreement (BA) requires a linear number of rounds in the synchronous setting, while it is flat out impossible in the asynchronous setting. The primitive which allows to bypass the above limitation is known as oblivious common coin (OCC). It allows parties to agree with constant probability on a random coin, where agreement is oblivious, i.e., players are not aware whether or not agreement has been achieved.
The starting point of our...
Amortized Functional Bootstrapping in less than 7ms, with $\tilde{O}(1)$ polynomial multiplications
Zeyu Liu, Yunhao Wang
Cryptographic protocols
Amortized bootstrapping offers a way to refresh multiple ciphertexts of a fully homomorphic encryption scheme in parallel more efficiently than refreshing a single ciphertext at a time. Micciancio and Sorrell (ICALP 2018) first proposed the technique to bootstrap $n$ LWE ciphertexts simultaneously, reducing the total cost from $\tilde{O}(n^2)$ to $\tilde{O}(3^\epsilon n^{1+\frac{1}{\epsilon}})$ for arbitrary $\epsilon > 0$. Several recent works have further improved the asymptotic cost....
Vectorized and Parallel Computation of Large Smooth-Degree Isogenies using Precedence-Constrained Scheduling
Kittiphon Phalakarn, Vorapong Suppakitpaisarn, Francisco Rodríguez-Henríquez, M. Anwar Hasan
Implementation
Strategies and their evaluations play important roles in speeding up the computation of large smooth-degree isogenies. The concept of optimal strategies for such computation was introduced by De Feo et al., and virtually all implementations of isogeny-based protocols have adopted this approach, which is provably optimal for single-core platforms. In spite of its inherent sequential nature, several recent works have studied ways of speeding up this isogeny computation by exploiting the rich...
New SIDH Countermeasures for a More Efficient Key Exchange
Andrea Basso, Tako Boris Fouotsa
Public-key cryptography
The Supersingular Isogeny Diffie-Hellman (SIDH) protocol has been the main and most efficient isogeny-based encryption protocol, until a series of breakthroughs led to a polynomial-time key-recovery attack. While some countermeasures have been proposed, the resulting schemes are significantly slower and larger than the original SIDH.
In this work, we propose a new countermeasure technique that leads to significantly more efficient and compact protocols. To do so, we introduce the...
$\mathsf{Skye}$: An Expanding PRF based Fast KDF and its Applications
Amit Singh Bhati, Antonin Dufka, Elena Andreeva, Arnab Roy, Bart Preneel
Secret-key cryptography
A Key Derivation Function (KDF) generates a uniform and highly random key-stream from weakly random key material. KDFs are broadly used in various security protocols such as digital signatures and key exchange protocols. HKDF, the most deployed KDF in practice, is based on the extract-then-expand paradigm. It is presently used, among others, in the Signal Protocol for end-to-end encrypted messaging.
HKDF is a generic KDF for general input sources and thus is not optimized for...
Fast Exhaustive Search for Polynomial Systems over F3
Bo-Yin Yang, Wei-Jeng Wang, Shang-Yi Yang, Char-Shin Miou, Chen-Mou Cheng
Attacks and cryptanalysis
Solving multivariate polynomial systems over finite fields is an important
problem in cryptography. For random F2 low-degree systems with equally many
variables and equations, enumeration is more efficient than advanced solvers for all
practical problem sizes. Whether there are others remained an open problem.
We here study and propose an exhaustive-search algorithm for low degrees systems
over F3 which is suitable for parallelization. We implemented it on Graphic Processing
Units...
A Fast RLWE-Based IPFE Library and its Application to Privacy-Preserving Biometric Authentication
Supriya Adhikary, Angshuman Karmakar
Public-key cryptography
With the increased use of data and communication through the internet and the abundant misuse of personal data by many organizations, people are more sensitive about their privacy. Privacy-preserving computation is becoming increasingly important in this era. Functional encryption allows a user to evaluate a function on encrypted data without revealing sensitive information. Most implementations of functional encryption schemes are too time-consuming for practical use. Mera et al. first...
Ou: Automating the Parallelization of Zero-Knowledge Protocols
Yuyang Sang, Ning Luo, Samuel Judson, Ben Chaimberg, Timos Antonopoulos, Xiao Wang, Ruzica Piskac, Zhong Shao
Implementation
A zero-knowledge proof (ZKP) is a powerful cryptographic primitive used in many decentralized or privacy-focused applications. However, the high overhead of ZKPs can restrict their practical applicability. We design a programming language, Ou, aimed at easing the programmer's burden when writing efficient ZKPs, and a compiler framework, Lian, that automates the analysis and distribution of statements to a computing cluster. Lian uses programming language semantics, formal methods, and...
Enhancing the Privacy of Machine Learning via faster arithmetic over Torus FHE
Marc Titus Trifan, Alexandru Nicolau, Alexander Veidenbaum
Implementation
The increased popularity of Machine Learning as a Service (MLaaS) makes the privacy of user data and network weights a critical concern. Using Torus FHE (TFHE) offers a solution for privacy-preserving computation in a cloud environment by allowing computation directly over encrypted data. However, software TFHE implementations of cyphertext-cyphertext multiplication needed when both input data and weights are encrypted are either lacking or are too slow. This paper proposes a new way to...
PARMESAN: Parallel ARithMEticS over ENcrypted data
Jakub Klemsa, Melek Önen
Implementation
Fully Homomorphic Encryption enables the evaluation of an arbitrary computable function over encrypted data. Among all such functions, particular interest goes for integer arithmetics. In this paper, we present a bundle of methods for fast arithmetic operations over encrypted data: addition/subtraction, multiplication, and some of their special cases. On top of that, we propose techniques for signum, maximum, and rounding. All methods are specifically tailored for computations with data...
Adding more parallelism to the AEGIS authenticated encryption algorithms
Frank Denis
Secret-key cryptography
While the round function of the AEGIS authenticated encryption algorithms is highly parallelizable, their mode of operation is not.
We introduce two new modes to overcome that limitation: AEGIS-128X and AEGIS-256X, that require minimal changes to existing implementations and retain the security properties of AEGIS-128L and AEGIS-256.
Side-Channel Analysis of Integrate-and-Fire Neurons within Spiking Neural Networks
Matthias Probst, Manuel Brosch, Georg Sigl
Attacks and cryptanalysis
Spiking neural networks gain attention due to low power properties and event-based operation, making them suitable for usage in resource constrained embedded devices. Such edge devices allow physical access opening the door for side-channel analysis. In this work, we reverse engineer the parameters of a feed-forward spiking neural network implementation with correlation power analysis. Localized measurements of electro-magnetic emanations enable our attack, despite inherent parallelism and...
RPU: The Ring Processing Unit
Deepraj Soni, Negar Neda, Naifeng Zhang, Benedict Reynwar, Homer Gamil, Benjamin Heyman, Mohammed Nabeel Thari Moopan, Ahmad Al Badawi, Yuriy Polyakov, Kellie Canida, Massoud Pedram, Michail Maniatakos, David Bruce Cousins, Franz Franchetti, Matthew French, Andrew Schmidt, Brandon Reagen
Applications
Ring-Learning-with-Errors (RLWE) has emerged as the foundation of many important techniques for improving security and privacy, including homomorphic encryption and post-quantum cryptography. While promising, these techniques have received limited use due to their extreme overheads of running on general-purpose machines. In this paper, we present a novel vector Instruction Set Architecture (ISA) and microarchitecture for accelerating the ring-based computations of RLWE. The ISA, named B512,...
Discretization Error Reduction for Torus Fully Homomorphic Encryption
Kang Hoon Lee, Ji Won Yoon
Public-key cryptography
In recent history of fully homomorphic encryption, bootstrapping has been actively studied throughout many HE schemes. As bootstrapping is an essential process to transform somewhat homomorphic encryption schemes into fully homomorphic, enhancing its performance is one of the key factors of improving the utility of homomorphic encryption.
In this paper, we propose an extended bootstrapping for TFHE, which we name it by EBS. One of the main drawback of TFHE bootstrapping was that the...
High Throughput Lattice-based Signatures on GPUs: Comparing Falcon and Mitaka
Wai-Kong Lee, Raymond K. Zhao, Ron Steinfeld, Amin Sakzad, Seong Oun Hwang
Implementation
The US National Institute of Standards and Technology initiated a standardization process for post-quantum cryptography in 2017, with the aim of selecting key encapsulation mechanisms and signature schemes that can withstand the threat from emerging quantum computers. In 2022, Falcon was selected as one of the standard signature schemes, eventually attracting effort to optimize the implementation of Falcon on various hardware architectures for practical applications. Recently, Mitaka was...
Quantum Implementation of AIM: Aiming for Low-Depth
Kyungbae Jang, Dukyoung Kim, Yujin Oh, Sejin Lim, Yujin Yang, Hyunji Kim, Hwajeong Seo
Implementation
Security vulnerabilities in the symmetric-key primitives of a cipher can undermine the overall security claims of the cipher. With the rapid advancement of quantum computing in recent years, there is an increasing effort to evaluate the security of symmetric-key cryptography against potential quantum attacks.
This paper focuses on analyzing the quantum attack resistance of AIM, a symmetric-key primitive used in the AIMer digital signature scheme.
We presents the first quantum circuit...
FIN: Practical Signature-Free Asynchronous Common Subset in Constant Time
Sisi Duan, Xin Wang, Haibin Zhang
Cryptographic protocols
Asynchronous common subset (ACS) is a powerful paradigm enabling applications such as Byzantine fault-tolerance (BFT) and multi-party computation (MPC). The most efficient ACS framework in the information-theoretic setting is due to Ben-Or, Kelmer, and Rabin (BKR, 1994). The BKR ACS protocol has been both theoretically and practically impactful. However, the BKR protocol has an $O(\log n)$ running time (where $n$ is the number of replicas) due to the usage of $n$ parallel asynchronous...
Time is money, friend! Timing Side-channel Attack against Garbled Circuit Constructions
Mohammad Hashemi, Domenic Forte, Fatemeh Ganji
Attacks and cryptanalysis
With the advent of secure function evaluation (SFE), distrustful parties can jointly compute on their private inputs without disclosing anything besides the results. Yao’s garbled circuit protocol has become an integral part of secure computation thanks to considerable efforts made to make it feasible, practical, and more efficient. These efforts have resulted in multiple optimizations on this primitive to enhance its performance by orders of magnitude over the last years. The advancement in...
Area-time Efficient Implementation of NIST Lightweight Hash Functions Targeting IoT Applications
Safiullah Khan, Wai-Kong Lee, Angshuman Karmakar, Jose Maria Bermudo Mera, Abdul Majeed, Seong Oun Hwang
Implementation
To mitigate cybersecurity breaches, secure communication is crucial for the Internet of Things (IoT) environment. Data integrity is one of the most significant characteristics of security, which can be achieved by employing cryptographic hash functions. In view of the demand from IoT applications, the National Institute of Standards and Technology (NIST) initiated a standardization process for lightweight hash functions. This work presents field-programmable gate array (FPGA) implementations...
Optimized Implementation of Encapsulation and Decapsulation of Classic McEliece on ARMv8
Minjoo Sim, Siwoo Eum, Hyeokdong Kwon, Hyunjun Kim, Hwajeong Seo
Implementation
Recently, the results of the NIST PQC contest were announced.
Classic McEliece, one of the 3rd round candidates, was selected
as the fourth round candidate. Classic McEliece is the only code-based cipher in the NIST PQC finalists in third round and the algorithm is regarded as secure. However, it has low efficiency. In this paper, we propose an efficient software implementation of Classic McEliece, a code-based cipher, on 64-bit ARMv8 processors. Classic McEliece can be divided into Key...
Funshade: Function Secret Sharing for Two-Party Secure Thresholded Distance Evaluation
Alberto Ibarrondo, Hervé Chabanne, Melek Önen
Cryptographic protocols
We propose a novel privacy-preserving, two-party computation of various distance metrics (e.g., Hamming distance, Scalar Product) followed by a comparison with a fixed threshold, which is known as one of the most useful and popular building blocks for many different applications including machine learning, biometric matching, etc. Our solution builds upon recent advances in function secret sharing and makes use of an optimized version of arithmetic secret sharing. Thanks to this combination,...
Parallel Isogeny Path Finding with Limited Memory
Emanuele Bellini, Jorge Chavez-Saab, Jesús-Javier Chi-Domínguez, Andre Esser, Sorina Ionica, Luis Rivera-Zamarripa, Francisco Rodríguez-Henríquez, Monika Trimoska, Floyd Zweydinger
Attacks and cryptanalysis
The security guarantees of most isogeny-based protocols rely on the computational hardness of finding an isogeny between two supersingular isogenous curves defined over a prime field $\mathbb{F}_q$ with $q$ a power of a large prime $p$. In most scenarios, the isogeny is known to be of degree $\ell^e$ for some small prime $\ell$. We call this problem the Supersingular Fixed-Degree Isogeny Path (SIPFD) problem.
It is believed that the most general version of SIPFD is not solvable faster than...
A Side-Channel Attack on a Hardware Implementation of CRYSTALS-Kyber
Yanning Ji, Ruize Wang, Kalle Ngo, Elena Dubrova, Linus Backlund
Attacks and cryptanalysis
CRYSTALS-Kyber has been recently selected by the NIST as a new public-key encryption and key-establishment algorithm to be standardized. This makes it important to assess how well CRYSTALS-Kyber implementations withstand side-channel attacks. Software implementations of CRYSTALS-Kyber have been already analyzed and the discovered vulnerabilities were patched in the subsequently released versions. In this paper, we present a profiling side-channel attack on a hardware implementation of...
Towards Automating Cryptographic Hardware Implementations: a Case Study of HQC
Carlos Aguilar-Melchor, Jean-Christophe Deneuville, Arnaud Dion, James Howe, Romain Malmain, Vincent Migliore, Mamuri Nawan, Kashif Nawaz
Implementation
While hardware implementations allow the production of highly efficient and performance oriented designs, exploiting features such as parallelization, their longer time to code and implement often bottlenecks rapid prototyping. On the other hand, high-level synthesis (HLS) tools allow for faster experimentation of software code to a hardware platform while demonstrating a reasonable extrapolation of the expected hardware behavior. In this work, we attempt to show a rapid, fast prototyping of...
Sublinear-Round Broadcast without Trusted Setup against Dishonest Majority
Andreea B. Alexandru, Julian Loss, Charalampos Papamanthou, Giorgos Tsimos
Cryptographic protocols
Byzantine broadcast is one of the fundamental problems in distributed computing. Many practical applications from secure multiparty computation to consensus mechanisms for blockchains require increasingly weaker trust assumptions, as well as scalability for an ever-growing number of users, which rules out existing solutions with linear number of rounds or trusted setup requirements. In this paper, we propose the first sublinear-round and trustless Byzantine broadcast protocol. Unlike...
Security and Quantum Computing: An Overview
Prasannna Ravi, Anupam Chattopadhyay, Shivam Bhasin
Applications
The promise of scalable quantum computing is causing major upheaval in the domain of cryptography and security. In this perspective paper, we review the progress towards the realization of large-scale quantum computing. We further summarize the imminent threats towards existing cryptographic primitives. To address this challenges, there is a consolidated effort towards the standardization of new cryptographic primitives, namely post-quantum cryptography (PQC). We discuss the underlying...
Fast Fully Oblivious Compaction and Shuffling
Sajin Sasy, Aaron Johnson, Ian Goldberg
Implementation
Several privacy-preserving analytics frameworks have been proposed that use trusted execution environments (TEEs) like Intel SGX. Such frameworks often use compaction and shuffling as core primitives. However, due to advances in TEE side-channel attacks, these primitives, and the applications that use them, should be _fully oblivious_; that is, perform instruction sequences and memory accesses that do not depend on the secret inputs. Such obliviousness would eliminate the threat of leaking...
cuZK: Accelerating Zero-Knowledge Proof with A Faster Parallel Multi-Scalar Multiplication Algorithm on GPUs
Tao Lu, Chengkun Wei, Ruijing Yu, Chaochao Chen, Wenjing Fang, Lei Wang, Zeke Wang, Wenzhi Chen
Implementation
Zero-knowledge proof is a critical cryptographic primitive. Its most practical type, called zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK), has been deployed in various privacy-preserving applications such as cryptocurrencies and verifiable machine learning. Unfortunately, zkSNARK like Groth16 has a high overhead on its proof generation step, which consists of several time-consuming operations, including large-scale matrix-vector multiplication (MUL),...
Homomorphic Encryption on GPU
Ali Şah Özcan, Can Ayduman, Enes Recep Türkoğlu, Erkay Savaş
Implementation
Homomorphic encryption (HE) is a cryptosystem that allows secure processing of encrypted data. One of the most popular HE schemes is the Brakerski-Fan-Vercauteren (BFV), which supports somewhat (SWHE) and fully homomorphic encryption (FHE). Since overly involved arithmetic operations of HE schemes are amenable to concurrent computation, GPU devices can be instrumental in facilitating the practical use of HE in real world applications thanks to their superior parallel processing capacity....
Long Live The Honey Badger: Robust Asynchronous DPSS and its Applications
Thomas Yurek, Zhuolun Xiang, Yu Xia, Andrew Miller
Cryptographic protocols
Secret sharing is an essential tool for many distributed applications, including distributed key generation and multiparty computation.
For many practical applications, we would like to tolerate network churn, meaning participants can dynamically enter and leave the pool of protocol participants as they please. Such protocols, called Dynamic-committee Proactive Secret Sharing (DPSS), have recently been studied; however, existing DPSS protocols do not gracefully handle faults: the presence...
Pushing the Limits of Generic Side-Channel Attacks on LWE-based KEMs - Parallel PC Oracle Attacks on Kyber KEM and Beyond
Gokulnath Rajendran, Prasanna Ravi, Jan-Pieter D'Anvers, Shivam Bhasin, Anupam Chattopadhyay
Applications
In this work, we propose generic and novel adaptations to the binary Plaintext-Checking (PC) oracle based side-channel attacks for Kyber KEM. Binary PC oracle-based side-channel attacks are fairly generic and easy to mount on a given target, as the attacker requires very minimal information about the target device. However, these attacks have an inherent disadvantage of requiring a few thousand traces to perform full key recovery, as they only recover a single bit of information per trace....
Maximizing the Potential of Custom RISC-V Vector Extensions for Speeding up SHA-3 Hash Functions
Huimin Li, Nele Mentens, Stjepan Picek
Applications
SHA-3 is considered to be one of the most secure standardized hash functions. It relies on the Keccak-f[1 600] permutation, which operates on an internal state of 1 600 bits, mostly represented as a 5×5×64-bit matrix. While software implementations process the state sequentially in chunks of typically 32 or 64 bits, the Keccak-f[1 600] permutation can benefit a lot from speedup through parallelization. This paper is the first to explore the full potential of parallelization of Keccak-f[1...
EZEE: Epoch Parallel Zero Knowledge for ANSI C
Yibin Yang, David Heath, Vladimir Kolesnikov, David Devecsery
Cryptographic protocols
Recent work has produced interactive Zero Knowledge (ZK) proof systems that can express proofs as arbitrary C programs (Heath et al., 2021, henceforth referred to as ZEE); these programs can be executed by a simulated ZK processor that runs in the 10KHz range.
In this work, we demonstrate that such proof systems are amenable to high degrees of parallelism. Our epoch parallelism-based approach allows the prover and verifier to divide the ZK proof into pieces such that each piece can be...
Low-latency Hardware Architecture for VDF Evaluation in Class Groups
Danyang Zhu, Jing Tian, Minghao Li, Zhongfeng Wang
The verifiable delay function (VDF), as a kind of cryptographic primitives, has recently been adopted quite often in decentralized systems. Highly correlated to the security of VDFs, the fastest implementation for VDF evaluation is generally desired to be publicly known. In this paper, for the first time, we propose a low-latency hardware implementation for the complete VDF evaluation in the class group by joint exploiting optimizations. On one side, we reduce the required computational...
Quantum Analysis of AES
Kyungbae Jang, Anubhab Baksi, Hyunji Kim, Gyeongju Song, Hwajeong Seo, Anupam Chattopadhyay
Secret-key cryptography
Quantum computing is considered among the next big leaps in computer science. While a fully functional quantum computer is still in the future, there is an ever-growing need to evaluate the security of the symmetric key ciphers against a potent quantum adversary. Keeping this in mind, our work explores the key recovery attack using the Grover's search on the three variants of AES (-128, -192, -256). In total, we develop a pool of 20 implementations per AES variant (thus totaling in 60), by...
Dashing and Star: Byzantine Fault Tolerance with Weak Certificates
Sisi Duan, Haibin Zhang, Xiao Sui, Baohan Huang, Changchun Mu, Gang Di, Xiaoyun Wang
Cryptographic protocols
State-of-the-art Byzantine fault-tolerant (BFT) protocols assuming partial synchrony such as SBFT and HotStuff use \textit{regular certificates} obtained from $2f+1$ (partial) signatures. We show that one can use \textit{weak certificates} obtained from only $f+1$ signatures to \textit{assist} in designing more robust and more efficient BFT protocols. We design and implement two BFT systems: Dashing (a family of two HotStuff-style BFT protocols) and Star (a parallel BFT framework).
We...
Efficient and Accurate homomorphic comparisons
Olive Chakraborty, Martin Zuber
Foundations
We design and implement a new efficient and accurate Fully homomorphic argmin/min or argmax/max comparison operator, which finds its application in numerous real-world use cases as a classifier. In particular we propose two versions of our algorithms using different tools from TFHE's functional bootstrapping toolkit. Our algorithm scales to any number of input data points with linear time complexity and logarithmic noise-propagation. Our algorithm is the fastest on the market for...
Efficient Lifting for Shorter Zero-Knowledge Proofs and Post-Quantum Signatures
Daniel Kales, Greg Zaverucha
Public-key cryptography
MPC-in-the-head based zero-knowledge proofs allow one to prove
knowledge of a preimage for a circuit defined over a finite field F. In recent
proofs the soundness depends on the size F, and small fields require more
parallel repetitions, and therefore produce larger proofs. In this paper we
develop and systematically apply lifting strategies to such proof protocols
in order to increase soundness and reduce proof size. The strategies are
(i) lifting parts of the protocol to extension fields...
Resumable Zero-Knowledge for Circuits from Symmetric Key Primitives
Handong Zhang, Puwen Wei, Haiyang Xue, Yi Deng, Jinsong Li, Wei Wang, Guoxiao Liu
Public-key cryptography
Consider the scenario that the prover and the verifier perform the zero-knowledge (ZK) proof protocol for the same statement multiple times sequentially, where each proof is modeled as a session. We focus on the problem of how to resume a ZK proof efficiently in such scenario. We introduce a new primitive called resumable honest verifier zero-knowledge proof of knowledge (resumable HVZKPoK) and propose a general construction of the resumable HVZKPoK for circuits based on the...
In 2020, Castryck-Decru-Smith constructed a hash function, using the (2,2)-isogeny graph of superspecial principally polarized abelian surfaces. In their construction, the initial surface was chosen from vertices very "close" to the square of a supersingular elliptic curve with a known endomorphism ring. In this paper, we introduce an algorithm for detecting a collision on their hash function. Under some heuristic assumptions, the time complexity and space complexity of our algorithm are...
Reductions are the workhorses of cryptography. They allow constructions of complex cryptographic primitives from simple building blocks. A prominent example is the non-interactive reduction from securely computing a ``complex" function $f$ to securely computing a ``simple" function $g$ via randomized encodings. Prior work equated simplicity with functions of small degree. In this work, we consider a different notion of simplicity where we require $g$ to only take inputs from a small...
SQIsign is a well-known post-quantum signature scheme due to its small combined signature and public-key size. However, SQIsign suffers from notably long signing times, and verification times are not short either. To improve this, recent research has explored both one-dimensional and two-dimensional variants of SQIsign, each with distinct characteristics. In particular, SQIsign2D's efficient signing and verification times have made it a focal point of recent research. However, the absence of...
HEonGPU is a high-performance library designed to optimize Fully Homomorphic Encryption (FHE) operations on Graphics Processing Unit (GPU). By leveraging the parallel processing capac- ity of GPUs, HEonGPU significantly reduces the computational overhead typically associated with FHE by executing complex operation concurrently. This allows for faster execution of homomorphic computations on encrypted data, enabling real-time applications in privacy-preserving machine learn- ing and secure...
The RISC-V Vector Cryptography Extensions (Zvk) were ratified in 2023 and integrated into the main ISA manuals in 2024. These extensions support high-speed symmetric cryptography (AES, SHA2, SM3, SM4) operating on the vector register file and offer significant performance improvements over scalar cryptography extensions (Zk) due to data parallelism. As a ratified extension, Zvk is supported by compiler toolchains and is already being integrated into popular cryptographic middleware such as...
Recently, the construction of cryptographic schemes based on hard lattice problems has gained immense popularity. Apart from being quantum resistant, lattice-based cryptography allows a wide range of variations in the underlying hard problem. As cryptographic schemes can work in different environments under different operational constraints such as memory footprint, silicon area, efficiency, power requirement, etc., such variations in the underlying hard problem are very useful for designers...
Digital signatures are fundamental building blocks in various protocols to provide integrity and authenticity. The development of the quantum computing has raised concerns about the security guarantees afforded by classical signature schemes. CRYSTALS-Dilithium is an efficient post-quantum digital signature scheme based on lattice cryptography and has been selected as the primary algorithm for standardization by the National Institute of Standards and Technology. In this work, we present a...
A prominent countermeasure against side channel attacks, the hiding countermeasure, typically involves shuffling operations using a permutation algorithm. Especially in the era of Post-Quantum Cryptography, the importance of the hiding coun- termeasure is emphasized due to computational characteristics like those of lattice and code-based cryptography. In this context, swiftly and securely generating permutations has a critical impact on an algorithm’s security and efficiency. The widely...
Zero-knowledge Succinct Non-interactive ARguments of Knowledge (zkSNARKs) allow a prover to convince a verifier of the correct execution of a large computation in private and easily-verifiable manner. These properties make zkSNARKs a powerful tool for adding accountability, scalability, and privacy to numerous systems such as blockchains and verifiable key directories. Unfortunately, existing zkSNARKs are unable to scale to large computations due to time and space complexity requirements...
Clustering is a crucial unsupervised learning method extensively used in the field of data analysis. For analyzing big data, outsourced computation is an effective solution but privacy concerns arise when involving sensitive information. Fully homomorphic encryption (FHE) enables computations on encrypted data, making it ideal for such scenarios. However, existing privacy-preserving clustering based on FHE are often constrained by the high computational overhead incurred from FHE, typically...
This paper proposes Oryx, a system for efficiently detecting cycles in federated graphs where parts of the graph are held by different parties and are private. Cycle detection is an important building block in designing fraud detection algorithms that operate on confidential transaction data held by different financial institutions. Oryx allows detecting cycles of various length while keeping the topology of the graphs secret, and it does so efficiently; Oryx achieves quasilinear...
As quantum computing progresses, the assessment of cryptographic algorithm resilience against quantum attack gains significance interests in the field of cryptanalysis. Consequently, this paper implements the depth-optimized quantum circuit of Korean hash function (i.e., LSH) and estimates its quantum attack cost in quantum circuits. By utilizing an optimized quantum adder and employing parallelization techniques, the proposed quantum circuit achieves a 78.8\% improvement in full depth and a...
$SPHINCS^+$, one of the Post-Quantum Cryptography Digital Signature Algorithms (PQC-DSA) selected by NIST in the third round, features very short public and private key lengths but faces significant performance challenges compared to other post-quantum cryptographic schemes, limiting its suitability for real-world applications. To address these challenges, we propose the GPU-based paRallel Accelerated $SPHINCS^+$ (GRASP), which leverages GPU technology to enhance the efficiency of...
Polynomials play a central role in cryptography. In the context of Zero Knowledge Proofs (ZKPs), protocols can be exclusively expressed using polynomials, making them a powerful abstraction tool, as demonstrated in most ZK research papers. Our first contribution is a high-level framework that enables practitioners to implement ZKPs in a more natural way, based solely on polynomial primitives. ZK provers are considered computationally intensive algorithms with a high degree of...
One of the main issues to deal with for fully homomorphic encryption is the noise growth when operating on ciphertexts. To some extent, this can be controlled thanks to a so-called gadget decomposition. A gadget decomposition typically relies on radix- or CRT-based representations to split elements as vectors of smaller chunks whose inner products with the corresponding gadget vector rebuilds (an approximation of) the original elements. Radix-based gadget decompositions present the advantage...
In the field of Artificial Intelligence (AI), convolution operations have primarily been used in Convolutional Neural Networks (CNNs). However, its utility is increasing with the appearance of convolution integrated transformers or state space models where convolution is a constituent element. In the field of private AI, generalized algorithm, multiplexed parallel convolution was recently proposed to implement CNNs based on the Homomorphic Encryption scheme, residue number system variant...
A primary challenge in isogeny-based cryptography lies in the substantial computational cost associated to computing and evaluating prime-degree isogenies. This computation traditionally relied on Vélu's formulas, an approach with time complexity linear in the degree but which was further enhanced by Bernstein, De Feo, Leroux, and Smith to a square-root complexity. The improved square-root Vélu's formulas exhibit a degree of parallelizability that has not been exploited in major...
In this paper, we introduce a new probability function parameter in the instantiations of the Goldreich-Micali-Wigderson with Fiat-Shamir and unbalanced challenges used in ALTEQ, a recent NIST PQC candidate in the call for additional signatures. This probability set at 100% does not bring any changes in the scheme, but modifies the public challenge generation process when below 100%, by injecting potential rejections in otherwise completely valid inputs. From a theoretical point of view,...
This work proposes a new white-box attack technique called filtering, which can be combined with any other trace-based attack method. The idea is to filter the traces based on the value of an intermediate variable in the implementation, aiming to fix a share of a sensitive value and degrade the security of an involved masking scheme. Coupled with LDA (filtered LDA, FLDA), it leads to an attack defeating the state-of-the-art SEL masking scheme (CHES 2021) of arbitrary degree and number of...
MRAE security is an important goal for many AEAD applications where the nonce uniqueness cannot be maintained and security risks are significant. However, MRAE schemes can be quite expensive. Two of the SoTA MRAE-secure schemes; Deoxys-II and AES-GCM-SIV rely on internal parallelism and special instructions to achieve competitive performance. However, they both suffer from the same bottleneck, they have at least one call to the underlying primitive that cannot be parallelized to any other...
This paper introduces a high-performance and scalable hardware architecture designed for the Number-Theoretic Transform (NTT), a fundamental component extensively utilized in lattice-based encryption and fully homomorphic encryption schemes. The underlying rationale behind this research is to harness the advantages of the hypercube topology. This topology serves to significantly diminish the volume of data exchanges required during each iteration of the NTT, reducing it to a complexity of...
We propose Reckle trees, a new vector commitment based on succinct RECursive arguments and MerKLE trees. Reckle trees' distinguishing feature is their support for succinct batch proofs that are updatable - enabling new applications in the blockchain setting where a proof needs to be computed and efficiently maintained over a moving stream of blocks. Our technical approach is based on embedding the computation of the batch hash inside the recursive Merkle verification via a hash-based...
In this paper, we explore a novel Zero-knowledge Virtual Machine (zkVM) framework leveraging succinct, non-interactive zero-knowledge proofs for verifiable computation over any code. Our approach divides the proof of program execution into two stages. In the first stage, the process breaks down program execution into segments, identifying and grouping identical sections. These segments are then proved through data-parallel circuits that allow for varying amounts of duplication. In the...
Uniformly random unitaries, i.e., unitaries drawn from the Haar measure, have many useful properties, but cannot be implemented efficiently. This has motivated a long line of research into random unitaries that ``look'' sufficiently Haar random while also being efficient to implement. Two different notions of derandomisation have emerged: $t$-designs are random unitaries that information-theoretically reproduce the first $t$ moments of the Haar measure, and pseudorandom unitaries (PRUs)...
Authenticated Encryption (AE) modes of operation based on Tweakable Block Ciphers (TBC) usually measure efficiency in the number of calls to the underlying primitive per message block. On the one hand, many existing solutions reach a primitive-rate of 1, meaning that each n-bit block of message asymptotically needs a single call to the TBC with output length n. On the other hand, while these modes look optimal in a blackbox setting, they become less attractive when leakage comes into play,...
Fully Homomorphic Encryption (FHE) enables computation on encrypted data, holding immense potential for enhancing data privacy and security in various applications. Presently, FHE adoption is hindered by slow computation times, caused by data being encrypted into large polynomials. Optimized FHE libraries and hardware acceleration are emerging to tackle this performance bottleneck. Often, these libraries implement the Number Theoretic Transform (NTT) algorithm for efficient polynomial...
To improve the throughput of Byzantine Fault Tolerance (BFT) consensus protocols, the Directed Acyclic Graph (DAG) topology has been introduced to parallel data processing, leading to the development of DAG-based BFT consensus. However, existing DAG-based works heavily rely on Reliable Broadcast (RBC) protocols for block broadcasting, which introduces significant latency due to the three communication steps involved in each RBC. For instance, DAGRider, a representative DAG-based protocol,...
Secure merge considers the problem of combining two sorted lists into a single sorted secret-shared list. Merge is a fundamental building block for many real-world applications. For example, secure merge can implement a large number of SQL-like database joins, which are essential for almost any data processing task such as privacy-preserving fraud detection, ad conversion rates, data deduplication, and many more. We present two constructions with communication bandwidth and rounds...
The notion of collaborative zk-SNARK is introduced by Ozdemir and Boneh (USENIX 2022), which allows multiple parties to jointly create a zk-SNARK proof over distributed secrets (also known as the witness). This approach ensures the privacy of the witness, as no corrupted servers involved in the proof generation can learn anything about the honest servers' witness. Later, Garg et al. continued the study, focusing on how to achieve faster proof generation (USENIX 2023). However, their...
The transition to post-quantum cryptography has been an enormous challenge and effort for cryptographers over the last decade, with impressive results such as the future NIST standards. However, the latter has so far only considered central cryptographic mechanisms (signatures or KEM) and not more advanced ones, e.g., targeting privacy-preserving applications. Of particular interest is the family of solutions called blind signatures, group signatures and anonymous credentials, for which...
Zero-knowledge proof (ZKP) is a cryptographic primitive that enables a prover to convince a verifier that a statement is true, without revealing any other information beyond the correctness of the statement itself. Due to its powerful capabilities, its most practical type, called zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK), has been widely deployed in various privacy preserving applications such as cryptocurrencies and verifiable computation. Although...
As the message recovery-based attack poses a serious threat to lattice-based schemes, we conducted a study on the side-channel secu- rity of parallel implementations of lattice-based key encapsulation mech- anisms. Initially, we developed a power model to describe the power leakage during message encoding. Utilizing this power model, we pro- pose a multi-ciphertext message recovery attack, which can retrieve the required messages for a chosen ciphertext attack through a suitable mes- sage...
During the standardisation process of post-quantum cryptography, NIST encourages research on side-channel analysis for candidate schemes. As the recommended lattice signature scheme, CRYSTALS-Dilithium, when implemented on hardware, has only been subjected to the side-channel attack presented by Steffen et al. in IACR ePrint 2022. This attack is not complete and requires excessive traces. Therefore, we investigate the leakage of an FPGA (Kintex7) implementation of CRYSTALS-Dilithium using...
Hiding countermeasures are the most widely utilized techniques for thwarting side-channel attacks, and their significance has been further emphasized with the advent of Post Quantum Cryptography (PQC) algorithms, owing to the extensive use of vector operations. Commonly, the Fisher-Yates algorithm is adopted in hiding countermeasures with permuted operation for its security and efficiency in implementation, yet the inherently sequential nature of the algorithm imposes limitations on hardware...
Explainable AI (XAI) refers to the development of AI systems and machine learning models in a way that humans can understand, interpret and trust the predictions, decisions and outputs of these models. A common approach to explainability is feature importance, that is, determining which input features of the model have the most significant impact on the model prediction. Two major techniques for computing feature importance are LIME (Local Interpretable Model-agnostic Explanations) and...
Threshold Implementation (TI) is a well-known Boolean masking technique that provides provable security against side-channel attacks. In the presence of glitches, the probing model was replaced by the so-called glitch-extended probing model which specifies a broader security framework. In CHES 2021, Shahmirzadi et al. introduced a general search method for finding first-order 2-share TI schemes without fresh randomness (under the presence of glitches) for a given encryption algorithm....
The linear layer of block ciphers plays an important role in their security. In particular, ciphers designed following the wide-trail strategy use the branch number of the linear layer to derive bounds on the probability of linear and differential trails. At FSE 2014, the LS-design construction was introduced as a simple and regular structure to design bitsliced block ciphers. It considers the internal state as a bit matrix, and applies alternatively an identical S-Box on all the columns,...
Privacy enhancing technologies must not only protect sensitive data in-transit, but also locally at-rest. For example, anonymity networks hide the sender and/or recipient of a message from network adversaries. However, if a participating device is physically captured, its owner can be pressured to give access to the stored conversations. Therefore, client software should allow the user to plausibly deny the existence of meaningful data. Since biometrics can be collected without consent and...
CRYSTALS-Kyber is a key-encapsulation mechanism, whose security is based on the hardness of solving the learning-with-errors (LWE) problem over module lattices. As in its specification, Kyber prescribes the usage of the Number Theoretic Transform (NTT) for efficient polynomial multiplication. Side-channel assisted attacks against Post-Quantum Cryptography (PQC) algorithms like Kyber remain a concern in the ongoing standardization process of quantum-computer-resistant cryptosystems. Among the...
Lattice reduction algorithms such as BKZ (Block-Korkine-Zolotarev) play a central role in estimating the security of lattice-based cryptography. The subroutine in BKZ which finds the shortest vector in a projected sublattice can be instantiated with enumeration algorithms. The enumeration procedure can be seen as a depth-first search on some ``enumeration tree'' whose nodes denote a partial assignment of the coefficients, corresponding to lattice points as a linear combination of the lattice...
Recently proposed lattice-based cryptography algorithms can be used to protect the IoT communication against the threat from quantum computers, but they are computationally heavy. In particular, polynomial multiplication is one of the most time-consuming operations in lattice-based cryptography. To achieve efficient implementation, the Number Theoretic Transform (NTT) algorithm is an ideal choice, but it has certain limitations on the parameters, which not all lattice-based schemes can...
Homomorphic Encryption (HE) enhances data security by facilitating computations on encrypted data, opening new paths for privacy-focused computations. The Brakerski-Fan-Vercauteren (BFV) scheme, a promising HE scheme, raises considerable performance challenges. Graphics Processing Units (GPUs), with considerable parallel processing abilities, have emerged as an effective solution. In this work, we present an in-depth study focusing on accelerating and comparing BFV variants on GPUs,...
Homomorphic Encryption (HE) presents a promising solution to securing neural networks for Machine Learning as a Service (MLaaS). Despite its potential, the real-time applicability of current HE-based solutions remains a challenge, and the diversity in network structures often results in inefficient implementations and maintenance. To address these issues, we introduce a unified and compact network structure for real-time inference in convolutional neural networks based on HE. We further...
The long running time of isogeny-based cryptographic constructions has proved to be a boon in disguise for one particular type of primitive called Verifiable Delay Functions (VDFs). VDFs are characterised by sequential function evaluation but an immediate output verification. In order to ensure secure use of VDFs in real-world applications, it is important to determine the fastest implementation. Considering the point of view of an attacker (say with unbounded resources), this paper aims to...
In this paper we introduce a new keyed hash function based on 32-bit integer multiplication that we call Multimixer-128. In our approach, we follow the key-then-hash parallel paradigm. So, we first add a variable length input message to a secret key and split the result into blocks. A fixed length public function based on integer multiplication is then applied on each block and their results are added to form the digest. We prove an upper bound of $2^{-127}$ for the universality of...
In this work, we tackle the problem of estimating the security of iterated symmetric ciphers in an efficient manner, with tests that do not require a deep analysis of the internal structure of the cipher. This is particularly useful during the design phase of these ciphers, especially for quickly testing several combinations of possible parameters defining several cipher design variants. We consider a popular statistical test that allows us to determine the probability of flipping each...
In this paper, we introduce a novel trapdoor generation technique for Prest's hybrid sampler over NTRU lattices. Prest's sampler is used in particular in the recently proposed Mitaka signature scheme (Eurocrypt 2022), a variant of the Falcon signature scheme, one of the candidates selected by NIST for standardization. Mitaka was introduced to address Falcon's main drawback, namely the fact that the lattice Gaussian sampler used in its signature generation is highly...
Despite the notable advances in the development of high-assurance, verified implementations of cryptographic protocols, such implementations typically face significant performance overheads, particularly due to the penalties induced by formal verification and automated extraction of executable code. In this paper, we address some core performance challenges facing computer-aided cryptography by presenting a formal treatment for accelerating such verified implementations based on multiple...
In the past decade, blockchains have seen various financial and technological innovations, with cryptocurrencies reaching a market cap of over 1 trillion dollars. However, scalability is one of the key issues hindering the deployment of blockchains in many applications. To improve the throughput of the transactions, zkRollups and zkEVM techniques using the cryptographic primitive of zero-knowledge proofs (ZKPs) have been proposed and many companies are adopting these technologies in the...
NIST issued a new call in 2023 to diversify the portfolio of quantum-resistant digital signature schemes since the current portfolio relies on lattice problems. The MAYO scheme, which builds on the Unbalanced Oil and Vinegar (UOV) problem, is a promising candidate for this new call. MAYO introduces emulsifier maps and a novel 'whipping' technique to significantly reduce the key sizes compared to previous UOV schemes. This paper provides a comprehensive analysis of the implementation...
In this paper, we propose LOL, a general framework for designing blockwise stream ciphers, to achieve ultrafast software implementations for the ubiquitous virtual networks in 5G/6G environments and high-security level for post-quantum cryptography. The LOL framework is structurally strong, and all its components as well as the LOL framework itself enjoy high flexibility with various extensions. Following the LOL framework, we propose new stream cipher designs named LOL-MINI and LOL-DOUBLE...
CRYSTALS-Kyber, as the only public key encryption (PKE) algorithm selected by the National Institute of Standards and Technology (NIST) in the third round, is considered one of the most promising post-quantum cryptography (PQC) schemes. Lattice-based cryptography uses complex discrete alogarithm problems on lattices to build secure encryption and decryption systems to resist attacks from quantum computing. Performance is an important bottleneck affecting the promotion of post quantum...
The encryption processes and cryptosystems are very important. We use them to protect our private information over the Internet. Cellular automata are ones of the computational models that can also be used in cryptosystems. The advantage of the cellular automata is their abilities to work in parallel, and thus can reduce the encryption time. Some applications require the encryption time to be small, so this paper aims to reduce the encryption time of the cellular automata cryptosystems. We...
Side-channel attacks are a fundamental threat to the security of cryptographic implementations. One of the most prominent countermeasures against side-channel attacks is masking, where each intermediate value of the computation is secret shared, thereby concealing the computation's sensitive information. An important security model to study the security of masking schemes is the random probing model, in which the adversary obtains each intermediate value of the computation with some...
Oblivious Pseudo-Random Functions (OPRFs) are a central tool for building modern protocols for authentication and distributed computation. For example, OPRFs enable simple login protocols that do not reveal the password to the provider, which helps to mitigate known shortcomings of password-based authentication such as password reuse or mix-up. Reliable treatment of passwords becomes more and more important as we login to a multitude of services with different passwords in our daily...
This paper focuses on the GPU implementation of the Pilsung block cipher used in the Red Star 3.0 operating system developed in North Korea. The Pilsung block cipher is designed based on AES. One notable feature of the Pilsung block cipher is that the table calculations required for encryption take longer than the encryption process itself. This paper emphasizes the parallel implementation of the Pilsung block cipher by leveraging the parallel processing capabilities of GPUs and evaluates...
Masking is a prominent strategy to protect cryptographic implementations against side-channel analysis. Its popularity arises from the exponential security gains that can be achieved for (approximately) quadratic resource utilization. Many variants of the countermeasure tailored for different optimization goals have been proposed. The common denominator among all of them is the implicit demand for robust and high entropy randomness. Simply assuming that uniformly distributed random bits are...
In response to side-channel attacks on masked implementations of post-quantum cryptographic algorithms, a new bitsliced higher-order masked implementation of CRYSTALS-Kyber has been presented at CHES'2022. The bitsliced implementations are typically more difficult to break by side-channel analysis because they execute a single instruction across multiple bits in parallel. However, in this paper, we reveal new vulnerabilities in the masked Boolean to arithmetic conversion procedure of this...
The advent of quantum computers poses a serious challenge to the security of cloud infrastructures and services, as they can potentially break the existing public-key cryptosystems, such as Rivest–Shamir–Adleman (RSA) and Elliptic Curve Cryptography (ECC). Even though the gap between today’s quantum computers and the threats they pose to current public-key cryptography is large, the cloud landscape should act proactively and initiate the transition to the post-quantum era as early as...
In public key encryption (PKE), anonymity is essential to ensure privacy by preventing the ciphertext from revealing the recipient’s identity. However, the literature has addressed the anonymity of PKE under different attack scenarios to a limited extent. Benhamouda et al. (TCC 2020) introduced the first formal definition of anonymity for PKE under corruption, and Huang et al. (ASIACRYPT 2022) made further extensions and provided a generic framework. In this paper, we introduce a new...
Since the pioneering work of Gentry, Halevi, and Smart in 2012, the state of the art on transciphering has moved away from work on AES to focus on new symmetric algorithms that are better suited for a homomorphic execution. Yet, with recent advances in homomorphic cryptosystems, the question arises as to where we stand today. Especially since AES execution is the application that may be chosen by NIST in the FHE part of its future call for threshold encryption. In this paper, we propose an...
It is well known that without randomization, Byzantine agreement (BA) requires a linear number of rounds in the synchronous setting, while it is flat out impossible in the asynchronous setting. The primitive which allows to bypass the above limitation is known as oblivious common coin (OCC). It allows parties to agree with constant probability on a random coin, where agreement is oblivious, i.e., players are not aware whether or not agreement has been achieved. The starting point of our...
Amortized bootstrapping offers a way to refresh multiple ciphertexts of a fully homomorphic encryption scheme in parallel more efficiently than refreshing a single ciphertext at a time. Micciancio and Sorrell (ICALP 2018) first proposed the technique to bootstrap $n$ LWE ciphertexts simultaneously, reducing the total cost from $\tilde{O}(n^2)$ to $\tilde{O}(3^\epsilon n^{1+\frac{1}{\epsilon}})$ for arbitrary $\epsilon > 0$. Several recent works have further improved the asymptotic cost....
Strategies and their evaluations play important roles in speeding up the computation of large smooth-degree isogenies. The concept of optimal strategies for such computation was introduced by De Feo et al., and virtually all implementations of isogeny-based protocols have adopted this approach, which is provably optimal for single-core platforms. In spite of its inherent sequential nature, several recent works have studied ways of speeding up this isogeny computation by exploiting the rich...
The Supersingular Isogeny Diffie-Hellman (SIDH) protocol has been the main and most efficient isogeny-based encryption protocol, until a series of breakthroughs led to a polynomial-time key-recovery attack. While some countermeasures have been proposed, the resulting schemes are significantly slower and larger than the original SIDH. In this work, we propose a new countermeasure technique that leads to significantly more efficient and compact protocols. To do so, we introduce the...
A Key Derivation Function (KDF) generates a uniform and highly random key-stream from weakly random key material. KDFs are broadly used in various security protocols such as digital signatures and key exchange protocols. HKDF, the most deployed KDF in practice, is based on the extract-then-expand paradigm. It is presently used, among others, in the Signal Protocol for end-to-end encrypted messaging. HKDF is a generic KDF for general input sources and thus is not optimized for...
Solving multivariate polynomial systems over finite fields is an important problem in cryptography. For random F2 low-degree systems with equally many variables and equations, enumeration is more efficient than advanced solvers for all practical problem sizes. Whether there are others remained an open problem. We here study and propose an exhaustive-search algorithm for low degrees systems over F3 which is suitable for parallelization. We implemented it on Graphic Processing Units...
With the increased use of data and communication through the internet and the abundant misuse of personal data by many organizations, people are more sensitive about their privacy. Privacy-preserving computation is becoming increasingly important in this era. Functional encryption allows a user to evaluate a function on encrypted data without revealing sensitive information. Most implementations of functional encryption schemes are too time-consuming for practical use. Mera et al. first...
A zero-knowledge proof (ZKP) is a powerful cryptographic primitive used in many decentralized or privacy-focused applications. However, the high overhead of ZKPs can restrict their practical applicability. We design a programming language, Ou, aimed at easing the programmer's burden when writing efficient ZKPs, and a compiler framework, Lian, that automates the analysis and distribution of statements to a computing cluster. Lian uses programming language semantics, formal methods, and...
The increased popularity of Machine Learning as a Service (MLaaS) makes the privacy of user data and network weights a critical concern. Using Torus FHE (TFHE) offers a solution for privacy-preserving computation in a cloud environment by allowing computation directly over encrypted data. However, software TFHE implementations of cyphertext-cyphertext multiplication needed when both input data and weights are encrypted are either lacking or are too slow. This paper proposes a new way to...
Fully Homomorphic Encryption enables the evaluation of an arbitrary computable function over encrypted data. Among all such functions, particular interest goes for integer arithmetics. In this paper, we present a bundle of methods for fast arithmetic operations over encrypted data: addition/subtraction, multiplication, and some of their special cases. On top of that, we propose techniques for signum, maximum, and rounding. All methods are specifically tailored for computations with data...
While the round function of the AEGIS authenticated encryption algorithms is highly parallelizable, their mode of operation is not. We introduce two new modes to overcome that limitation: AEGIS-128X and AEGIS-256X, that require minimal changes to existing implementations and retain the security properties of AEGIS-128L and AEGIS-256.
Spiking neural networks gain attention due to low power properties and event-based operation, making them suitable for usage in resource constrained embedded devices. Such edge devices allow physical access opening the door for side-channel analysis. In this work, we reverse engineer the parameters of a feed-forward spiking neural network implementation with correlation power analysis. Localized measurements of electro-magnetic emanations enable our attack, despite inherent parallelism and...
Ring-Learning-with-Errors (RLWE) has emerged as the foundation of many important techniques for improving security and privacy, including homomorphic encryption and post-quantum cryptography. While promising, these techniques have received limited use due to their extreme overheads of running on general-purpose machines. In this paper, we present a novel vector Instruction Set Architecture (ISA) and microarchitecture for accelerating the ring-based computations of RLWE. The ISA, named B512,...
In recent history of fully homomorphic encryption, bootstrapping has been actively studied throughout many HE schemes. As bootstrapping is an essential process to transform somewhat homomorphic encryption schemes into fully homomorphic, enhancing its performance is one of the key factors of improving the utility of homomorphic encryption. In this paper, we propose an extended bootstrapping for TFHE, which we name it by EBS. One of the main drawback of TFHE bootstrapping was that the...
The US National Institute of Standards and Technology initiated a standardization process for post-quantum cryptography in 2017, with the aim of selecting key encapsulation mechanisms and signature schemes that can withstand the threat from emerging quantum computers. In 2022, Falcon was selected as one of the standard signature schemes, eventually attracting effort to optimize the implementation of Falcon on various hardware architectures for practical applications. Recently, Mitaka was...
Security vulnerabilities in the symmetric-key primitives of a cipher can undermine the overall security claims of the cipher. With the rapid advancement of quantum computing in recent years, there is an increasing effort to evaluate the security of symmetric-key cryptography against potential quantum attacks. This paper focuses on analyzing the quantum attack resistance of AIM, a symmetric-key primitive used in the AIMer digital signature scheme. We presents the first quantum circuit...
Asynchronous common subset (ACS) is a powerful paradigm enabling applications such as Byzantine fault-tolerance (BFT) and multi-party computation (MPC). The most efficient ACS framework in the information-theoretic setting is due to Ben-Or, Kelmer, and Rabin (BKR, 1994). The BKR ACS protocol has been both theoretically and practically impactful. However, the BKR protocol has an $O(\log n)$ running time (where $n$ is the number of replicas) due to the usage of $n$ parallel asynchronous...
With the advent of secure function evaluation (SFE), distrustful parties can jointly compute on their private inputs without disclosing anything besides the results. Yao’s garbled circuit protocol has become an integral part of secure computation thanks to considerable efforts made to make it feasible, practical, and more efficient. These efforts have resulted in multiple optimizations on this primitive to enhance its performance by orders of magnitude over the last years. The advancement in...
To mitigate cybersecurity breaches, secure communication is crucial for the Internet of Things (IoT) environment. Data integrity is one of the most significant characteristics of security, which can be achieved by employing cryptographic hash functions. In view of the demand from IoT applications, the National Institute of Standards and Technology (NIST) initiated a standardization process for lightweight hash functions. This work presents field-programmable gate array (FPGA) implementations...
Recently, the results of the NIST PQC contest were announced. Classic McEliece, one of the 3rd round candidates, was selected as the fourth round candidate. Classic McEliece is the only code-based cipher in the NIST PQC finalists in third round and the algorithm is regarded as secure. However, it has low efficiency. In this paper, we propose an efficient software implementation of Classic McEliece, a code-based cipher, on 64-bit ARMv8 processors. Classic McEliece can be divided into Key...
We propose a novel privacy-preserving, two-party computation of various distance metrics (e.g., Hamming distance, Scalar Product) followed by a comparison with a fixed threshold, which is known as one of the most useful and popular building blocks for many different applications including machine learning, biometric matching, etc. Our solution builds upon recent advances in function secret sharing and makes use of an optimized version of arithmetic secret sharing. Thanks to this combination,...
The security guarantees of most isogeny-based protocols rely on the computational hardness of finding an isogeny between two supersingular isogenous curves defined over a prime field $\mathbb{F}_q$ with $q$ a power of a large prime $p$. In most scenarios, the isogeny is known to be of degree $\ell^e$ for some small prime $\ell$. We call this problem the Supersingular Fixed-Degree Isogeny Path (SIPFD) problem. It is believed that the most general version of SIPFD is not solvable faster than...
CRYSTALS-Kyber has been recently selected by the NIST as a new public-key encryption and key-establishment algorithm to be standardized. This makes it important to assess how well CRYSTALS-Kyber implementations withstand side-channel attacks. Software implementations of CRYSTALS-Kyber have been already analyzed and the discovered vulnerabilities were patched in the subsequently released versions. In this paper, we present a profiling side-channel attack on a hardware implementation of...
While hardware implementations allow the production of highly efficient and performance oriented designs, exploiting features such as parallelization, their longer time to code and implement often bottlenecks rapid prototyping. On the other hand, high-level synthesis (HLS) tools allow for faster experimentation of software code to a hardware platform while demonstrating a reasonable extrapolation of the expected hardware behavior. In this work, we attempt to show a rapid, fast prototyping of...
Byzantine broadcast is one of the fundamental problems in distributed computing. Many practical applications from secure multiparty computation to consensus mechanisms for blockchains require increasingly weaker trust assumptions, as well as scalability for an ever-growing number of users, which rules out existing solutions with linear number of rounds or trusted setup requirements. In this paper, we propose the first sublinear-round and trustless Byzantine broadcast protocol. Unlike...
The promise of scalable quantum computing is causing major upheaval in the domain of cryptography and security. In this perspective paper, we review the progress towards the realization of large-scale quantum computing. We further summarize the imminent threats towards existing cryptographic primitives. To address this challenges, there is a consolidated effort towards the standardization of new cryptographic primitives, namely post-quantum cryptography (PQC). We discuss the underlying...
Several privacy-preserving analytics frameworks have been proposed that use trusted execution environments (TEEs) like Intel SGX. Such frameworks often use compaction and shuffling as core primitives. However, due to advances in TEE side-channel attacks, these primitives, and the applications that use them, should be _fully oblivious_; that is, perform instruction sequences and memory accesses that do not depend on the secret inputs. Such obliviousness would eliminate the threat of leaking...
Zero-knowledge proof is a critical cryptographic primitive. Its most practical type, called zero-knowledge Succinct Non-interactive ARgument of Knowledge (zkSNARK), has been deployed in various privacy-preserving applications such as cryptocurrencies and verifiable machine learning. Unfortunately, zkSNARK like Groth16 has a high overhead on its proof generation step, which consists of several time-consuming operations, including large-scale matrix-vector multiplication (MUL),...
Homomorphic encryption (HE) is a cryptosystem that allows secure processing of encrypted data. One of the most popular HE schemes is the Brakerski-Fan-Vercauteren (BFV), which supports somewhat (SWHE) and fully homomorphic encryption (FHE). Since overly involved arithmetic operations of HE schemes are amenable to concurrent computation, GPU devices can be instrumental in facilitating the practical use of HE in real world applications thanks to their superior parallel processing capacity....
Secret sharing is an essential tool for many distributed applications, including distributed key generation and multiparty computation. For many practical applications, we would like to tolerate network churn, meaning participants can dynamically enter and leave the pool of protocol participants as they please. Such protocols, called Dynamic-committee Proactive Secret Sharing (DPSS), have recently been studied; however, existing DPSS protocols do not gracefully handle faults: the presence...
In this work, we propose generic and novel adaptations to the binary Plaintext-Checking (PC) oracle based side-channel attacks for Kyber KEM. Binary PC oracle-based side-channel attacks are fairly generic and easy to mount on a given target, as the attacker requires very minimal information about the target device. However, these attacks have an inherent disadvantage of requiring a few thousand traces to perform full key recovery, as they only recover a single bit of information per trace....
SHA-3 is considered to be one of the most secure standardized hash functions. It relies on the Keccak-f[1 600] permutation, which operates on an internal state of 1 600 bits, mostly represented as a 5×5×64-bit matrix. While software implementations process the state sequentially in chunks of typically 32 or 64 bits, the Keccak-f[1 600] permutation can benefit a lot from speedup through parallelization. This paper is the first to explore the full potential of parallelization of Keccak-f[1...
Recent work has produced interactive Zero Knowledge (ZK) proof systems that can express proofs as arbitrary C programs (Heath et al., 2021, henceforth referred to as ZEE); these programs can be executed by a simulated ZK processor that runs in the 10KHz range. In this work, we demonstrate that such proof systems are amenable to high degrees of parallelism. Our epoch parallelism-based approach allows the prover and verifier to divide the ZK proof into pieces such that each piece can be...
The verifiable delay function (VDF), as a kind of cryptographic primitives, has recently been adopted quite often in decentralized systems. Highly correlated to the security of VDFs, the fastest implementation for VDF evaluation is generally desired to be publicly known. In this paper, for the first time, we propose a low-latency hardware implementation for the complete VDF evaluation in the class group by joint exploiting optimizations. On one side, we reduce the required computational...
Quantum computing is considered among the next big leaps in computer science. While a fully functional quantum computer is still in the future, there is an ever-growing need to evaluate the security of the symmetric key ciphers against a potent quantum adversary. Keeping this in mind, our work explores the key recovery attack using the Grover's search on the three variants of AES (-128, -192, -256). In total, we develop a pool of 20 implementations per AES variant (thus totaling in 60), by...
State-of-the-art Byzantine fault-tolerant (BFT) protocols assuming partial synchrony such as SBFT and HotStuff use \textit{regular certificates} obtained from $2f+1$ (partial) signatures. We show that one can use \textit{weak certificates} obtained from only $f+1$ signatures to \textit{assist} in designing more robust and more efficient BFT protocols. We design and implement two BFT systems: Dashing (a family of two HotStuff-style BFT protocols) and Star (a parallel BFT framework). We...
We design and implement a new efficient and accurate Fully homomorphic argmin/min or argmax/max comparison operator, which finds its application in numerous real-world use cases as a classifier. In particular we propose two versions of our algorithms using different tools from TFHE's functional bootstrapping toolkit. Our algorithm scales to any number of input data points with linear time complexity and logarithmic noise-propagation. Our algorithm is the fastest on the market for...
MPC-in-the-head based zero-knowledge proofs allow one to prove knowledge of a preimage for a circuit defined over a finite field F. In recent proofs the soundness depends on the size F, and small fields require more parallel repetitions, and therefore produce larger proofs. In this paper we develop and systematically apply lifting strategies to such proof protocols in order to increase soundness and reduce proof size. The strategies are (i) lifting parts of the protocol to extension fields...
Consider the scenario that the prover and the verifier perform the zero-knowledge (ZK) proof protocol for the same statement multiple times sequentially, where each proof is modeled as a session. We focus on the problem of how to resume a ZK proof efficiently in such scenario. We introduce a new primitive called resumable honest verifier zero-knowledge proof of knowledge (resumable HVZKPoK) and propose a general construction of the resumable HVZKPoK for circuits based on the...