Nothing Special   »   [go: up one dir, main page]

Ejsr Asic Sea

Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

European Journal of Scientific Research

ISSN 1450-216X Vol.73 No.4 (2012), pp. 539-546


© EuroJournals Publishing, Inc. 2012
http://www.europeanjournalofscientificresearch.com

ASIC Implementation of Scalable Encryption Algorithm using


Efficient Modular Adders

Jegadish Kumar K. J.
Department of ECE, SSN College of Engineering, Kalavakkam-603110, India
E-mail: jegadishkj@ssn.edu.in
Tel: +91-44-27469700 Extn: 330; Fax: +91-44-27469772

Chenna Kesava Reddy K.


Principal, T.K.R. College of Engineering and Technology
Hyderabad-500079, India
E-mail: kesavary@yahoo.com
Tel: +91-40- 65587536; Fax: +91-40- 24094567

Salivahanan S.
Principal, SSN College of Engineering, Kalavakkam-603110, India
E-mail: salivahanans@ssn.edu.in
Tel: +91-44-27469752; Fax: +91-44-27469772

Abstract

Portable handheld devices in wireless communication systems are rising as strong


competitors to large traditional electronic devices and requires advanced safety and security
mechanisms. Scalable Encryption Algorithm(SEA) is one such security mechanism
specially designed for wireless applications with portable devices like sensor nodes, RFIDs
and other low power operated electronic gadgets. This paper describes SEA, a block cipher
that provides lightweight security solution that conquers the limitations of resource
constrained environments. In this paper, we revise SEA implementation using efficient
architectures of 2b and 2b – 1 modular adders in 90 nm standard CMOS technology using
Synopsys tools and investigate its performance. The efficient architecture is implemented
for data size n = 126 bits and word size b= 7 bits, taking the advantages of generic VHSIC
Hardware Description Language(VHDL) coding. This revised SEA implementation
achieves 8.48%, 15.53%, and 43.67% reduction in area @ 500kHz and @250 MHz work
frequency.

Keywords: Block ciphers, constrained applications, Modular adders, VHDL, 90nm


CMOS Technology.

1. Introduction
Resource constrained encryption does not have an extensive history in symmetric cryptography.
Examples of recent lightweight block ciphers are HIGHT and PRESENT[1,2]. However, both of them
do not afford favourable security against attacks such as linear and differential cryptanalysis. They are
as well non parametric to data and key size. These block ciphers lack in finding a good trade-off
ASIC Implementation of Scalable Encryption Algorithm using Efficient Modular Adders 540

between, cost, security and performance. Thus, there arises a requirement for a new cryptosystem that
endows with perfect solution for resource constrained systems. Embedded applications are basic
building infrastructures for portable devices and they represent significant opportunity and challenge
for new cryptosystem like Scalable Encryption Algorithm (SEA)[3,4]. SEA is designed with
uncomplicated operations like Substitute Box(S-Box), word/bit shift, XOR, AND, OR and Mod 2b
addition. The S-Box and Mod 2b addition operation decides the hardware complexity of the algorithm.
Among these two architectures, modular adder is more complex in the design, as S-Box can be easily
implemented using Look Up Tables(LUTs)/Read Only Memory (ROMs).
Modular adder is a fundamental arithmetic component generally used as Residue Number
System (RNS) and in Digital Signal Processing (DSP) systems[5-8]. They are the primary components
to build modular multipliers, residue-to-binary converters and other residue arithmetic operations in
cryptographic algorithms. Various modular adders proposed in [5,6] involve different building blocks
like Carry propagate adder (CPA), Multiplexers, basic gates (OR,AND) and Memory based tables like
ROMs.
SEA’s [4] hardware performance is investigated in a 0.13µm CMOS technology with
implementation of a single clock cycle per round loop architecture. The design being parametric in
text, key and processor size are provably secure against differential/linear cryptanalysis. Thus SEA
agrees to be an efficient combination of encryption and decryption. The design is more realistic with a
reduced datapath combined with a serial communication interface and achieves optimized
implementation of SEA. Advanced Encryption Standard(AES) is an encryption hardware core
presented in [9] and is suited for devices in which low area and low power consumption are required.
This hardcore constitutes a novel 8 bit architecture and supports encryption with 128 bit keys. This
design was synthesized on 0.13 µm CMOS technology in which their area optimization consumed
3.1kgates. The throughput estimated at the maximum clock frequency of 153 MHz is 121 Mbps.
A compact architecture of ICEBERG [10] 64 bit block cipher with 128 bit key parity based
concurrent error detection (CED) is evaluated for hardware performance in ASIC using a 0.18 µm
CMOS standard cell library. It is seen that the area of this compact architecture consumes 5800 gates
with throughput of 522 Mbps. Humming bird[11] a light – weight symmetric algorithm has 256 bit key
size in which 16- bit block size has been proposed by reverse security research team targeted for low
cost RFID tags. It is synthesized on 0.13µm CMOS technology and has a consumption area of about
2225 gate equivalents. The design consumes 1.08 µW of power at 100 kHz.
An overview of SEA proposed by Standert et al, is basically a block cipher routine to
implement in limited processing resources (e.g. a small processor) is described. This encryption routine
architecture is parametric with respect to plaintext, cipher text, key and the processor size. Since the
architecture is parametric in nature, it has flexibility and allows implementing in all platforms with
minimum code change. Most algorithms perform differently on different platforms but SEA is an
exception as it allows obtaining a small encryption routine targeted to any given processor, the security
of the cipher being adapted in function of its key size[3,4].
z2nbb × z2nbb → z2nbb : x, y → z = x y ⇔ zi = xi yi , 0 ≤ i ≤ nb − 1
The cipher described in[3,4] iterates an odd number nr of rounds. Figure 1 illustrates the
necessary steps to encrypt a plaintext P under a key K and produce a cipher text C. P, C and K have a
parametric bit size n. The operations within the cipher are performed considering parametric b-bit
words. The KR nr/2 is taken before the switch and C(i) is a nb -word vector of which all the words have
value 0 except the LSB that equals i. Decryption is exactly the same as encrypt round. The number of
rounds nr is odd, for key scheduling and encryption and the value of nr must be rounded up or down.
The minimum number of rounds required to provide security against known attacks would be nr =
[3(n/4) + 2((n/2b )+ b/2)][3]. This roughly corresponds to the number of rounds to resist
linear/differential attacks. Twice the number of rounds attain complete diffusion to prevent both
structural attacks and outer round improvements of statistical attacks. The value of nr must be always
odd, if not 1 must be added to make it odd. Due to its simplicity constraints, SEAn,b is based on some
541 Jegadish Kumar K. J., Chenna Kesava Reddy K. and Salivahanan S.

degree of number of elementary operations (selected for their availability in any processing device)
denoted as bitwise XOR, substitution box S, word (left) rotation R and inverse word rotation R-1, bit
rotation r, addition mod 2b . The detailed description of algorithm has been presented in [3,4]. The
hardware synthesis is a straight way implementation by generic VHDL coding and doesn’t focus on
optimization of operators that consume hardware complexities. In our work, studies are done on each
operator to evidently conclude that the Substitution Box and modulo 2b adders’ cost for more hardware
complexity. The fundamental addition mod 2b is denoted [3,4] as : The mod 2b addition is defined on
nb-word vectors as:

Figure 1: Encrypt/Decrypt round and key round[3,4]

2. Description of Modular Adder


The modulo m addition[6] of two numbers x and y belonging to {0, . . .,m} is defined by:
(x + y) mod m = x + y if x +y < m, and
= x + y − m if x + y ≥ m (2.1)
b
There are basically three methodologies to carry out a modulo 2 addition [6,12]:
1. Table-based operator method.
2. Hybrid-based operator method.
3. Adder-based operator method.
Adder-based Operators. Algorithm 2.1 explains the pseudo code of modulo addition
described in Equation (2.1). The resultant architecture is shown in Figure 2. Reference [12] provides
for instance proof of correctness of this method. This architecture requires only two carry-propagate
adders, a Multiplexer and an OR gate is therefore well suited for hardware implementation.
Algorithm 2.1 Modulo m addition.
‘b’ is choosen such that 2b-1 < m < 2b
Compute s0 ← x + y
Compute s1 ← (s0 mod 2b) + 2b − m
if the carry-out bit of s0 or s1 is one then
(x + y) mod m ← s1 mod 2b
else
(x + y) mod m ← s0 mod 2b
end if
ASIC Implementation of Scalable Encryption Algorithm using Efficient Modular Adders 542
Figure 2: Architecture of Modular adder 1 proposed by Bayoumi M.A. et al.

An improvisation of adder-based operator is illustrated in Figure 3 for specific values of m i.e.,


2b. For example, one’s complement addition or modulo (2b − 1) addition [12] is defined by
(x + y) mod (2b − 1) = (x + y + 1) mod 2b if x + y + 1 ≥ 2b
=x+y, if x + y + 1 < 2b (2.2)

Figure 3: Architecture of Modular adder 2 proposed by Beuchat J.L et al.

Figure 3 shows the architecture of the equivalent hardware operator. Because of the condition x
+ y + 1 ≥ 2b, we execute two additions in parallel and select the better result with a multiplexer by
considering that zero has a double representation in one’s complement, namely “0 . . . 0” and “1 . . . 1”
(i.e. 0 is congruent to 2b − 1 (modulo 2b − 1)). In case of second encoding of zero accommodated by
the path of computation, Equation (2.2) can be rewritten as in equation (2.3) [6].
(x + y) mod (2b − 1) = (x + y +1) mod 2b if x + y ≥ 2b
=x+y if x + y < 2b (2.3)
The carry-out (cout) of the addition x + y specifies whether the increment is to be done. Figure 3
depicts the advantage of evaluating the addition of x+y and x + y + 1 in parallel based on the result of
Cout. Figure 4 illustrates another architecture in which the sum x + y is added with Cout. The operators
depicted in Figure 3 do not appreciably progress in hardware complexity as in adder-based operator in
Figure 2. The modulo (2b− 1) adder described in [6] and shown in Figure 4 is compact as it does not
use any multiplexer for its operation. Finally, area and power consumption of SEA using these three
modular adder architectures are evaluated.

Figure 4: Architecture of Modular adder 3proposed by Beuchat J.L et al.


543 Jegadish Kumar K. J., Chenna Kesava Reddy K. and Salivahanan S.

3. Hardware Implementation
3.1. Efficient Modular Adder Realization
The modular adder constructed using CPAs is specially chosen since it is purely combinatorial adder
and has less design complexity. The structure of the adders presented by Bayoumi et al. consists of a
series connection of two CPA, OR gate and a 2 x 1 Multiplexer. The modular adder in equation (2.1)
can be straightforwardly implemented by an adder, a comparator, and a subtractor. The comparison is
however expensive, both in terms of area and delay. The algorithms studied in this section allow to get
rid of it and lead to more efficient hardware operators. In this paper, b = [log2 2b]+1 denotes the
number of bits which are required to encode both the inputs and the outputs of a modulo 2b arithmetic
operator. The improvised design with modulo addition algorithm 2.1 is found to result in reducing
overall circuit complexity of SEA.

3.2. Implementation Description of SEA


Initially, SEA implementation in hardware is done straightforwardly as in [3,4] on an 90nm CMOS cell
technology. It is an iterative loop based algorithm as shown in Figure 5. Each iteration is a round per
clock cycle and denoted as the loop implementation. It is known that the S-boxes and the mod 2b adder
are the operators that cost more in hardware implementation; but the operators like Word Rotate and
Bit Rotate blocks in the cipher are realized by swapping wires. The key generations consist of two
multiplexers to switch the right and left part of the key when the algorithm reaches half of its
execution. The execution is done by a suitable control signal called Swap. The switch controlled
multiplexer supplies looping function with the right part of the key during the execution of first half of
round and passes the left part of the key after the switch. The Generic Loop Architecture is simple and
change done is in the region of modular operator. Then improvised implementation of SEA using
lightweight Modular adders [5,6] to modify modulo 2b addition operator is done to achieve
considerable area and power consumption at the synthesizable VHDL design level.

Figure 5: Loop Architecture of Single round


ASIC Implementation of Scalable Encryption Algorithm using Efficient Modular Adders 544

4. Performance Analysis and Results


Figure 6: (a) Symbol of CPA , (b) Structure of CPA.

Carry Propagate Adder (CPA) is a series connected array of Full-Adder (FA) cells. The
topology of CPA is given in Figure 6. For b=8 the total delay of the CPA is b times the delay of a
single FA cell. This is because the ith cell requires the correct value of the carry-in bit Ci in order to
compute its correct outputs. Accordingly a total of ‘b’ full adder delays are needed to compute the sum
and the final carry out Cout. The total area of the b-bit CPA is equal to b times a single full-adder cell
area. The CPA scales up very easily, by adding addition cells starting from the most significant. The
performance has merits of modular arithmetic operations using CPAs as, the sign detection is easy. The
Most Significant Bit (MSB) indicates the sign bit and it is found that the significant area consumption
of cell is less. Also the power consumption is minimum compared to other delay efficient basic adders
like Carry Look Ahead adder(CLA) and Carry Save Adder(CSA). The total delay of CPA is ‘b’ times
the delay of single FA cell. Bayoumi et al.’s adders require two ‘b’ – bit CPA adders, one OR gate and
2 x 1 multiplexer. The hardware requirement proposed in this work, using adders is Area A = 13b + 3b
log2b-3. The delay of the circuit having two n bit adders and a multiplexer is computed as D= 4 log2b
+ 8. The unit gate area of 2 x 1 multiplexer is calculated as 6b – b. The delay through the Multiplexer
is Dmux = 2 as described in [5]. Abstracting, the overall area of the modular adder for ‘b’ bit sequence
of adder is calculated to be Area A= 2(‘b’ times FA area) + area of (2 x 1) MUX + Area of OR gate[7].
The modular adder architecture presented in Figure 3 is built using two CPAs working parallel and a 2
x 1 Multiplexer. The total area of the ‘b’ bit sequence adder is computed as Area A = 2(‘b’ times FA
area) + area of (2 x 1) MUX. And the total area computed for the modular adder in Figure 4 is simply
Area A = 2(‘b’ times FA area). Precisely it is understood that the later architecture leads to efficient
SEA implementation in terms of area and power consumption.
Implementation results are extracted with the SYNOPSYS tool on Standard Cell library of
90nm technology. The Table 1 shows the comparative studies of result in terms of area, gate counts
and power consumption of improvised SEA and conventional SEA. The realization is done for
parametric bit data (n =126) and a processor word size (b=7). A 8.48 %, 15.53%, 43.67% reduction in
area @ 500kHz and @ 250MHz has been achieved by this method when compared to the straight SEA
hardware synthesis. This efficient hardware realization and synthesis of SEA exhibited a very small
area utilization that comes at the cost of reduced power consumption. Consequently, it can be
considered as an interesting alternative for constrained devices.

Table 1: Comparitive Study

Comparison of Results of SEA Using Modular Adder Architecture in Figure(2)


Gate Total Power
Encryption Technology Total Cell Area (µm2)
Equivalent consumption (µW)
SEA, 126,7 @ 500kHz 90 nm 25846 4749 144
SEA, 126,7 @ 250MHz 90 nm 25846 4749 7203
545 Jegadish Kumar K. J., Chenna Kesava Reddy K. and Salivahanan S.
Table 1: Comparitive Study - continued

[4] SEA, 126,7 @ 250MHz 130nm 28241 4770 7216


Comparison of Results of SEA Using Modular Adder Architecture in Figure(3)
Total Power
Encryption Technology Total Cell Area (µm2) Gate Equivalent consumption
(µW)
SEA, 126,7 @ 500kHz 90 nm 23855 4383 133
SEA, 126,7 @ 250MHz 90 nm 23855 4383 6648
[4] SEA, 126,7 @ 250MHz 130nm 28241 4770 7216
Comparison of Results of SEA Using Modular Adder Architecture in Figure(4)
Total Power
Encryption Technology Total Cell Area (µm2) Gate Equivalent consumption
(µW)
SEA, 126,7 @ 500kHz 90 nm 15908 2923 89
SEA, 126,7 @ 250MHz 90 nm 15908 2923 4434
[4] SEA, 126,7 @ 250MHz 130nm 28241 4770 7216

5. Conclusion
SEA is initially proposed for low cost software implementations. While these design criteria turned out
to allow low cost hardware implementations as well, it is likely that targeting a cipher specifically for
low cost hardware would lead to even better solutions.Through the hardware investigation of the SEA,
it is shown that this modular symmetric algorithm, targeted for low-resources software solutions, can
interestingly respond to constrained hardware needs. Results demonstrated that the scalability of this
algorithm can be kept in the VHDL. The simple iterative loop design achieves interesting performance
in area and power reduction, improvise throughputs in cell based implementation. In addition, the
power consumed by the SEA module for different variants in data block and word size has been
analyzed. It is also important to emphasize a number of advantages in SEA that cannot be found in
other recent block ciphers, namely its simplicity, scalability(re-implementing SEA for a new block size
does not require to re-write code), good combination of encryption and decryption.

References
[1] D.Hong et al., “HIGHT: A new block cipher suitable for low-resource device”, in Proceedings
of Cryptographic Hardware and Embedded Systems-CHES 2006, ser. LCNS, Vol. 4249,
Springer, pp. 46-59
[2] A.Bogdanov et al., “PRESENT: An ultra-lightweight block cipher”, in Proceedings of
Cryptographic Hardware and Embedded Systems-CHES 2007, ser. LNCS, Vol. 4727, Springer
2007, pp 450-466
[3] Francois-Xavier Standaert, Gilles Piret, Neil Gershenfeld, Jean-Jacques Quisquater “SEA a
Scalable Encryption Algorithm for Small Embedded Applications” in Proceedings of
International Conference on Smart Card Research and Advanced Applications (CARDIS )
2006, pp 222-236.
[4] Francois Mace, Francois -Xavier Standaert, Jean-Jacques Quisquater, “ASIC Implementation of
the Block Cipher SEA for Constrained Applications”, Conference on RFID security, 2007.
[5] Riyaz A.Patel, Mohammed Benaissa, “Novel Power-Delay-Area-Efficient Approach to Generic
Modular Addition”, IEEE Transaction on Circuits and Systems – I , Vol. 54, No.6, June 2007,
pp. 1279 – 1292.
[6] Beuchat,J.-L.; Lab. De l'Infonnatique du Parallelisme, “ Some Modular adders and multipliers
for Field programmable Gate arrays”, in Proceedings of Parallel and Distributed processing
symposium 2003.
ASIC Implementation of Scalable Encryption Algorithm using Efficient Modular Adders 546

[7] Francisco Rodriguez-Henriquez, N.A.Saqib, A.Diaz-Perez, Cetin Kaya Koc, “Cryptographic


Algorithms on Reconfigurable Hardware”, Springer series on Signals and Communication
technology, 2006.
[8] Bayoumi M, Jullien G, Miller W, “ A VLSI implementation of residue adders”, IEEE
Transactions on Circuits system, Vol. 34, No. 3, Mar 1987, pp. 284-288
[9] Hamalainen P Alho, T, Hannikainen M, Hamalainen, T.D, “Design and Implementation of
Low-Area and Low-power AES Encryption Hardware core”, in Proceedings of 9th
EUROMICRO conference on Digital System Design 2006, pp. 577 - 583
[10] Huiju Cheng, Howard M.Heys, “Compact ASIC Implementation of the ICEBERG Block cipher
with concurrent Error Detection”, IEEE International Symposium ISCAS 2008, pp. 2921-2924
[11] Meng – Qin Xiao et al., “Low Power Implementation of Humming bird Cryptographic
Algorithm for RFID tag”, IEEE International Conference on Solid-State and Integrated Circuit
Technology (ICSICT) 2010, pp. 581-583.
[12] A.V Curiger, “VLSI Architectures for Computations in Finite Rings and Fields”, Vol. 26 of
series in Microelectronics, Hartung-Gorre Ver’lag,1993.

You might also like