
Design of Novel Address Decoders and Sense Amplifier for SRAM Based Memory


DESIGN OF NOVEL ADDRESS DECODERS
AND SENSE AMPLIFIER FOR SRAM BASED
MEMORY

A Thesis submitted in partial fulfillment of the Requirements for the degree of

Master of Technology
In
Electronics and Communication Engineering
Specialization: VLSI Design & Embedded System

By
ARVIND KUMAR MISHRA
Roll No. : 212EC2140

Department of Electronics and Communication Engineering


National Institute of Technology Rourkela
Rourkela, Odisha, 769 008, India
May 2014
Under the Guidance of


Prof. Debiprasad Priyabrata Acharya

Dedicated to…
My Dearest one
My parents and my friends
DEPT. OF ELECTRONICS AND COMMUNICATION
ENGINEERING

NATIONAL INSTITUTE OF TECHNOLOGY, ROURKELA


ROURKELA – 769008, ODISHA, INDIA

Certificate
This is to certify that the work in the thesis entitled "Design of Address Decoder and Sense
Amplifier for SRAM" by Arvind Kumar Mishra is a record of original research work
carried out by him during 2013-2014 under my supervision and guidance, in partial
fulfillment of the requirements for the award of the degree of Master of Technology in
Electronics and Communication Engineering (VLSI Design & Embedded System),
National Institute of Technology, Rourkela. Neither this thesis nor any part of it, to the best
of my knowledge, has been submitted for any degree or diploma elsewhere.

Place:
Date:

Prof. D. P. Acharya
Dept. of Electronics and Communication Engg.
National Institute of Technology
Rourkela-769008

Declaration
I declare that

a) The thesis work presented here is original and was done by me under the guidance
of my guide.

b) This work has not been submitted for any degree or diploma to any other institute or
university.

c) I have followed the rules and guidelines provided by the Institute for writing the thesis.

d) Whenever I have taken materials (data, figures, methods) from other work, I have
given credit to the authors by listing their particulars in the references.

e) Whenever I have quoted written materials from other sources, I have put them
under quotation marks and given due credit to the sources by citing them and giving
the required details in the references.

Arvind Kumar Mishra

28th May 2014


ACKNOWLEDGEMENTS
I want to express my sincere gratitude to my supervisor, mentor and guide, Dr. D. P.
Acharya, for his invaluable guidance and advice during every stage of this project. I am
greatly indebted to him for his direction in times of success and for his encouragement
and support during hard times. I shall forever remain determined to walk in his footsteps. I
extend my hearty appreciation to Prof. K. K. Mohapatra, Prof. P. K. Tiwari, Prof. A. K.
Swain, and Prof. N. M. Islam for their insightful comments and valuable suggestions
during the course of my M.Tech career. I shall remain indebted to Dr. S. K. Sarangi,
Director, for all his support during the course of my M.Tech career.

I would also like to thank my co-guide Pradip Patra and my company, Sankalp
Semiconductor, BBSR. I am grateful to them for providing all the necessary facilities
during my stay at the company. I thank my colleagues, with whom I shared countless hours of
discussion and work.

I appreciate my stay at NIT Rourkela with all my classmates, Ph.D. scholars and
everyone who made me feel at home away from home. Last but not least, I thank
my parents, who always trusted and supported my research endeavour with their love and
blessings.

ARVIND KUMAR MISHRA


Arvindengg10@gmail.com

ABSTRACT
The address decoder and the sense amplifier are important components of an SRAM.
Selection of a storage cell depends on the decoder, and the read operation depends on the
sense amplifier. Hence the performance of the SRAM depends on these components. This work
surveys address decoders and sense amplifiers for SRAM, concentrating on delay
optimization and power-efficient circuit techniques. We have concentrated on an optimal
decoder structure with the least number of transistors in order to reduce the area of the SRAM.

Among static decoders, we start with the simple AND gate decoder and examine its
results. This simple decoder is neither area efficient nor fast, because AND/OR
gates are not natural CMOS gates: they are built from a NAND/NOR gate followed by a NOT
gate. A decoder using only NOR/NAND gates is both area efficient and fast. Therefore
a universal decoding scheme with alternating NAND and NOR stages is taken up and examined.
The universal decoding scheme has some serious issues, such as unequal path delays, which
may result in false decoding as well as extra power dissipation. To overcome these
issues, a novel address decoding scheme is implemented and its results are compared with
those of the simple AND decoder and the universal decoder. The novel address decoder
circuit, which uses alternating NAND-NOR stages with a pre-decoder and a replica inverter
chain, is presented, analyzed and implemented successfully.

A current mirror sense amplifier and a latch type sense amplifier are also implemented
for the SRAM. These two amplifiers are the basic ones and have a great advantage due
to their small size. They are fast enough and can fit below the SRAM cell. We have
implemented and tested a 1 Kb, 8-bit, 1.25 GHz SRAM in Cadence using UMC
90 nm technology, in which the decoder and sense amplifier are deployed.

CONTENTS
ACKNOWLEDGEMENTS
ABSTRACT
CONTENTS
LIST OF TABLES

1 INTRODUCTION
1.1 Motivation of work
1.2 Literature Survey
1.3 Overview of Thesis

2 BASIC OF SRAM MEMORY
2.1 Block Diagram of SRAM
2.2 Basic SRAM operations
2.3 Speed of Memories

3 STATIC DECODERS
3.1 Introduction
3.2 Conventional AND decoder
3.3 Universal Block Decoding Scheme
3.4 Proposed Decoding Scheme
Results and comparison

4 DYNAMIC DECODERS
4.1 NOR array decoders
4.2 Divide word line architecture
4.3 Decoder by using Divide word line architecture
4.4 Sense Amplifier Decoder

5 SENSE AMPLIFIER
5.1 Introduction
5.2 Current mirror Sense amplifier
5.3 Latch Type Sense Amplifier

6 CONCLUSION

DISSEMINATION
REFERENCES
List of Figures
Figure 1 Block Diagram of SRAM memory
Figure 2 Speed comparison of Memories [8]
Figure 3 Conventional 3-to-8 decoder by using CMOS AND gate
Figure 4 Schematic of 4 to 16 decoder, divided into blocks
Figure 5 Simulation Result of Universal decoder
Figure 6 Layout of Universal Decoder
Figure 7 Proposed 5:32 decoder using Predecoder and replica circuit
Figure 8 Layout of proposed decoder
Figure 9 Delay comparison of proposed architecture with traditional and Block architecture
Figure 10 Simulation result of proposed 5:32 decoder
Figure 11 2:4 NOR array decoder
Figure 12 Simulation of 2:4 dynamic decoder
Figure 13 Simulation result of pulsed based 2:4 decoder
Figure 14 3:8 NOR array decoder
Figure 15 Simulation Result of 3:8 NOR array decoder
Figure 16 Block diagram of DWL structure
Figure 17 5:32 NOR array Decoder by using divided word line architecture
Figure 18 Simulation Result of 5:32 Decoder
Figure 19 Layout of 5:32 Decoder
Figure 20 Sense Amplifier Decoder
Figure 21 Simulation of sense-amp decoder
Figure 22 Current Mirror Type Sense Amplifier
Figure 23 Simulation result of current mirror SA
Figure 24 Layout of current mirror SA
Figure 25 Latched Type Sense Amplifier
Figure 26 Simulation Result of Latched Type SA
Figure 27 Layout of latched type SA
LIST OF TABLES
Table 1 Truth table for Logic Gates
Table 2 Comparison Between Traditional, Block and Proposed Decoder
Table 3 Result of corner analysis
Table 4 Simulation Result of Latched Type SA
1 INTRODUCTION
In computers, data (information) and programs (sequences of commands) are stored in
physical devices on a permanent or temporary basis. The stored content is used either in
later computation or at run time, depending on the application. Large data that must be
kept permanently for future access is held in magnetic storage, while run-time data
is stored in semiconductor memories. Computer memories are mainly divided into two
parts: primary memory and secondary memory. Semiconductor memories come under
primary storage. SRAM is a random access memory, which means that its content can be
accessed from any location of the memory. Data are stored in cells and, to access data
randomly, a fixed address is assigned to every storage location. SRAMs are volatile, which
means that their stored data are lost after power shutdown. They are therefore used to store
run-time data in computer systems. SRAM is used as register and cache memory to speed up
program execution. SRAMs are fabricated with the same process as the processor, so they are
fully compatible with the processor and their speed matches current processor speeds. An
SRAM cell is made of a cross-coupled inverter latch with a positive feedback loop, so it
stores data rapidly during a write operation. For reading it uses a sense amplifier circuit,
which amplifies the small voltage difference between the bit lines. Semiconductor memories
are faster than other types of memory, and SRAM has the highest read and write speed. The
power dissipation of SRAM can be reduced by using efficient circuit techniques.

1.1 Motivation of work


As VLSI technology continues to shrink, it delivers high-speed processors and
demands low power consumption. Increasing processor speed requires a large amount of
cache memory for temporary storage of arithmetic and logical data. Bulk storage cannot be
used as cache memory because of its speed limitations. Semiconductor memory is the only
option for integration with the processor because it has a compatible CMOS structure and
speed. Therefore a large amount of high-density, low-power SRAM is needed to
accomplish this. SRAM has high speed, but its cell structure needs at least 6 transistors,
which is more than a DRAM cell requires. Peripheral circuitry must therefore be optimized
to reduce memory size. The decoder and the sense amplifier are important and large blocks
in an SRAM, so their design poses big challenges and they must be optimized in the design
phase. This work mainly covers the decoder and the sense amplifier. Different types of
decoding schemes are considered and designed. Some designs are also tested in a
1 Kb SRAM.

1.2 Literature Survey


• Michael A. Turi and José G. Delgado-Frias [1]: Dynamic decoders have
advantages over static decoders in speed and power consumption. This paper
explores dynamic decoding schemes. Address decoders using selective
pre-charging schemes are presented and analysed. These schemes have an
advantage over the simple decoder and the AND-NOR decoder. The results are
compared with conventional designs and show satisfactory performance.
• Shivkaran Jain, Arun Kr. Chatterjee [2]: This paper presents NAND gate
design styles which, when used in a decoder, reduce energy consumption and delay.
The conventional, NOR-style and source-coupled NAND gates are discussed. The
three designs differ in area, speed and power. In the NOR-style NAND,
transistors are added in parallel, so a high fan-in is obtained and logical effort is
reduced. In the source-coupled NAND gate the number of transistors is reduced,
giving a speed of operation comparable to that of an inverter.
• Ireneusz B., Łukasz Z. [3]: This paper proposes a universal decoding scheme.
Universal decoders are built from alternating stages of NAND and NOR gates, which
avoids unnecessary inverters. The scheme overcomes a problem of simple
decoders: AND gates are not available naturally in CMOS, and they are built from
NAND or NOR gates combined with NOT gates.
• B. S. Amrutur and M. A. Horowitz [4][5]: These papers explain low-power SRAM
techniques. Decoders in different logic styles and the modelling of decoders are
explained. The logical effort of the circuits is calculated and the transistors are
sized accordingly.
• Kevin Zhang [6]: This book explains the basic structure of an SRAM and its
components. Basic design techniques for the components are given and sizing
issues are discussed. All the basic parts of an SRAM are covered along with their
role in memory optimization.

1.3 Overview of Thesis


This work covers the design and analysis of the address decoder and sense amplifier for
SRAM. The different components of an SRAM are introduced in Chapter 2.
Chapter 3 covers static decoder design: the available schemes are discussed and a
novel decoder design is proposed. Dynamic decoders are discussed in Chapter 4.
Chapter 5 describes the current mirror sense amplifier and the latch type sense amplifier
circuits for SRAM. Conclusions and future work are discussed in Chapter 6.
References are included at the end.
2 BASIC OF SRAM MEMORY
2.1 Block Diagram of SRAM
Fig. 1 shows the logical block diagram of a Static Random Access Memory (SRAM), which
reflects how the SRAM is organized for operation.

Figure 1 Block Diagram of SRAM memory

The row decoder is used to select one of the $2^m$ word lines connected to
the memory cells. Other peripheral circuits such as timing control, sense amplifiers,
pre-charge circuits, drivers and latches are also part of the SRAM block. Pre-charge
circuits charge the large bit line capacitances to the desired level to make operation fast
and smooth. Sometimes both local and global pre-charge circuits are deployed. Sense
amplifiers amplify the differential voltage on the bit lines during a read operation
without flipping the cell data. The SRAM cell should have a large noise margin for
successful operation.
2.2 Basic SRAM operations
A cell has the following three operations and should perform all of them properly:

• Data Write

• Data Read

• Data Retention

A data write starts with pre-charging the bit lines to VDD. The write driver then
discharges one bit line to GND, and the address decoder selects one word line and
switches on the access transistors of the cell. The cross-coupled inverters are now
connected to the bit lines, and charge transfers from the high level to the low level,
writing the data into the cell. The cross-coupled latch performs the write quickly.
Before a data read the bit lines are also pre-charged to VDD. When the address decoder
selects the cell, one bit line starts discharging while the other remains at VDD. Once a
sufficient difference has developed, the sense amplifier is switched on and the data is
captured in the output latch. When the bit lines are charged but the word lines are
unselected, the data must be retained: the cell data should not flip in this case. Strong
cross-coupled inverters retain the data reliably.

2.3 Speed of Memories


In the memory hierarchy SRAM is used in the upper cache level L1 because of its higher
speed of operation, while L2 and L3 are made with DRAM. The main reasons for using SRAM
in the high-level cache are its ease of integration in VLSI chips and its highest random
access speed. In every instruction cycle the processor needs registers to store temporary
data, so they must have a high speed of data access. Therefore the registers of the
processor are also made with SRAM. Figure 2 shows the speed of the different memories
available. SRAM has the highest speed.

Figure 2 Speed comparison of Memories [8]
3 STATIC DECODERS
3.1 Introduction
The address decoder is an important digital block in an SRAM: it takes up to 50%
of the total chip access time and a notable share of the total SRAM power in a normal
read/write cycle. Designing an address decoder involves two objectives: first,
selecting a good circuit technique and, second, sizing the transistors in the circuit. A
novel address decoder circuit is presented and analyzed in this chapter. An address decoder
using alternating NAND-NOR stages with a predecoder and a replica inverter chain circuit is
proposed and compared with the traditional and universal block architectures, using UMC
90 nm CMOS technology.

SRAM IP cores are frequently used as program registers, buffers and cache
memory in most digital applications and computing systems, because their speed is
compatible with the processor and they are accessed at least once in every clock cycle.
In the memory hierarchy that connects bulk storage to the processor, SRAM is used as
cache.

In recently developed VLSI systems, processor speed is much higher than the speed of the
available bulk storage. SRAM is the only available CMOS-compatible memory whose
speed is compatible with the processor, but it takes a large area to implement. Therefore
a memory hierarchy is used. SRAM is used as register memory inside the processor and
as the upper-level cache of the microprocessor's memory hierarchy to speed up the system
and increase system performance.

Because memories contain a large number of storage cells, various address decoder
designs have been proposed that reduce power consumption and improve
performance. Usually, different kinds of precharged dynamic decoders are used. The design
of a dynamic decoder is complex and has a higher probability of wrong sensing.
A traditional static decoder gives a more accurate result but has more
transistors and a larger delay. Some solutions use hierarchical decoders with predecoding,
and binary tree decoders built from demultiplexers have also been implemented.

SRAM operation starts with decoding of the address, so the row and column decoders are
among the most important components in all random-access memories. SRAM performance is
determined by the time taken to access data and by power consumption. A row decoder
takes an n-bit address as input and produces $2^n$ outputs, exactly one of which is
asserted to activate the selected cells of the SRAM. Small decoders are realised as a
single block using 2-input and 3-input logic gates, but for large decoders a hierarchy is used.
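The n-to-$2^n$ behaviour described above can be sketched at the logic level as follows. This is a minimal behavioural model only, not one of the circuits studied in this thesis. The function name `decode` is an assumption introduced for illustration.

```python
def decode(address: int, n_bits: int) -> list[int]:
    """Return the 2**n_bits word-line values for an n-bit address (one-hot)."""
    if not 0 <= address < 2 ** n_bits:
        raise ValueError("address out of range")
    # Exactly one word line, the one whose index equals the address, is asserted.
    return [1 if line == address else 0 for line in range(2 ** n_bits)]


# A 3-bit address selects one of 8 word lines.
print(decode(5, 3))
```

Any decoder circuit discussed in this chapter should reproduce this truth table once its outputs have settled.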

3.2 Conventional AND decoder

Figure 3 Conventional 3-to-8 decoder by using CMOS AND gate [11]

A conventional decoder using CMOS AND gates is shown in Fig. 3. Two-input AND
gates are used because the delay of the decoder increases drastically as the number of gate
inputs increases. This is the basic static decoder circuit. Its implementation in CMOS
technology has a problem: AND gates are not directly available in CMOS, and
their realization needs two gates, a NAND and a NOT, connected in series. This increases
the number of transistors, the power consumption and the delay. The structure of the
decoder should therefore be realized directly with NOT, NAND and NOR gates only.
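The transistor overhead of an AND gate can be made concrete with a small count. The sketch assumes standard static CMOS gates, in which a k-input NAND or NOR uses k NMOS and k PMOS devices and an inverter uses two. The helper name `static_cmos_transistors` is hypothetical.

```python
def static_cmos_transistors(gate: str, fan_in: int = 1) -> int:
    """Transistor count of a static CMOS gate (assumed standard topologies)."""
    if gate == "not":
        return 2
    if gate in ("nand", "nor"):
        return 2 * fan_in          # k NMOS + k PMOS
    if gate in ("and", "or"):
        return 2 * fan_in + 2      # NAND/NOR followed by an inverter
    raise ValueError(f"unknown gate: {gate}")


print(static_cmos_transistors("nand", 2))  # 4
print(static_cmos_transistors("and", 2))   # 6
```

Under these assumptions every 2-input AND costs two transistors and one gate delay more than the NAND it is built from.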

3.3 Universal Block Decoding Scheme

As shown in Table 1, a NAND gate gives a unique logic low output when both of its
inputs are high, and a high output for every other combination. Therefore we cannot make
a decoder using only NAND gates. A NOR gate gives a unique logic high output when both
of its inputs are low, and a low output for every other combination. Used alone, each of
these gates needs an inverter at the output to make a decoder, which increases the number
of transistors as well as the delay in the circuit. But their unique and complementary
properties can be combined to give an excellent result, because NAND gives a low output
but demands all inputs high, while NOR gives a high output but demands all inputs low.

Table 1
Truth Table For Logic Gates

Input Combination    Output For Different Gates
A    B               AND    OR    NAND    NOR
0    0               0      0     1       1
0    1               0      1     1       0
1    0               0      1     1       0
1    1               1      1     0       0

To design a decoder, a gate with a unique output is required. As shown in Table 1, the NOR
gate gives a unique high output when both inputs are low and the NAND gate gives a unique
low output when both inputs are high. Based on this principle, the universal design scheme
builds the decoder from a combination of NAND and NOR gates. For an active-high output,
the last stage of the decoder consists of NOR gates and the stage before it of NAND gates.
The stages alternate in this way back to the input stage. The number of decoder inputs
decides the number of stages and hence the type of the first stage, NAND or NOR. In the
block architecture, the first stage is NOR for an even number of inputs and NAND for an
odd number of inputs. Fig. 4 shows the architecture of this decoder, taking a 4:16
decoder as an example.
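The NAND-then-NOR principle can be checked with a small logic model. The sketch below is a simplified two-level 4:16 decoder written for illustration and is not the block topology of Fig. 4. The function names and the pairing of address bits are assumptions, and the complemented literals are assumed to be available from input inverters.

```python
def nand(*inputs: int) -> int:
    return 0 if all(inputs) else 1


def nor(*inputs: int) -> int:
    return 0 if any(inputs) else 1


def decode_4_to_16(address: int) -> list[int]:
    bits = [(address >> i) & 1 for i in range(4)]  # bits[0] is the LSB
    outputs = []
    for line in range(16):
        # Literal i is the true bit where the line number has a 1, else the complement.
        lits = [bits[i] if (line >> i) & 1 else 1 - bits[i] for i in range(4)]
        low_pair = nand(lits[0], lits[1])    # unique low when both literals are high
        high_pair = nand(lits[2], lits[3])
        # NOR gives the unique high output only when both NAND outputs are low.
        outputs.append(nor(low_pair, high_pair))
    return outputs


print(decode_4_to_16(9))
```

Because NOR(NAND(a, b), NAND(c, d)) equals a AND b AND c AND d, no inverter is needed between the two stages.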

Figure 4 Schematic of 4 to 16 decoder, divided into blocks [3]

The problem with the block architecture decoder is that it is not fully optimized in
terms of transistor count, delay and power dissipation. In addition, the path lengths
differ for different inputs: the LSB must travel through every stage from input to output,
while the MSB travels only through the last stage. Because of these path delay
differences, some address combinations give multiple high outputs.

As shown in Fig. 5, when the address is 00000, line 15 goes high for some duration
before line 0 at the decoder output becomes high. This is because of the different path
delays at the output stage. The layout is shown in Fig. 6.

This results in false selection of cells and extra power dissipation. Only a single
inverter drives a stage of many gates, so the delay of the decoder increases for a large
number of inputs. The delay also increases as the number of stages increases. To
eliminate these problems a new decoding scheme is proposed.
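The false selection caused by unequal path delays can be illustrated with a toy timing model. The sketch assumes a 2-bit address, an ideal output stage and hypothetical integer delays. It is not derived from the simulated circuit of Fig. 5, and the function names are introduced for illustration only.

```python
def seen_address(old: int, new: int, t: int, delays: list[int]) -> int:
    """Address seen by the output stage t time units after switching old -> new.

    delays[i] is the assumed path delay of address bit i. Until that delay
    has elapsed, the output stage still sees the old value of the bit.
    """
    seen = 0
    for i, d in enumerate(delays):
        source = new if t >= d else old
        seen |= ((source >> i) & 1) << i
    return seen


def trace(old: int, new: int, delays: list[int], t_end: int) -> list[int]:
    """Selected word line at t = 0, 1, ..., t_end - 1."""
    return [seen_address(old, new, t, delays) for t in range(t_end)]


skewed = trace(0b11, 0b00, delays=[3, 1], t_end=5)   # LSB path slower than MSB path
matched = trace(0b11, 0b00, delays=[3, 3], t_end=5)  # both paths equally long
print(skewed)   # [3, 1, 1, 0, 0]
print(matched)  # [3, 3, 3, 0, 0]
```

With skewed delays line 1 is selected for two time units although it is neither the old nor the new address. With matched delays the selection moves directly from line 3 to line 0.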

Figure 5 Simulation Result of Universal decoder

Figure 6 Layout of Universal Decoder

3.4 Proposed Decoding Scheme

We propose a 5:32 decoder for SRAM (Fig. 7) that uses a pre-decoder and an inverter
replica circuit in addition to alternating NAND and NOR stages. In this architecture
the pre-decoder circuit reduces the gate count and the number of stages from input to
output, which reduces delay and power consumption. Predecoding can be applied to decoder
structures with 4, 8, 16, … inputs. Here it removes one stage.

In the proposed 5:32 decoder of Fig. 7, the NAND and NOR stages work together to
produce a unique output. The predecoder circuitry reduces the number of stages
compared with the universal architecture and also reduces the transistor count, which
makes the proposed decoder faster and lowers its power dissipation. Replica circuitry is
used to overcome the problem of multiple selections caused by unequal path delays. It
gives the MSB the same delay as the LSB, so the circuit has a fixed delay for
every change of input combination. The first stage of this decoder is always the
predecoder, which can be built from either NAND or NOR gates depending on the number of
input lines. In this case the first stage is NOR based. A NOR gate provides a unique high
output when all its inputs are low. The next stage is a NAND stage, because a NAND gate
gives a unique low output when all its inputs are high. The NAND output can in turn be
decoded by a NOR stage, and when the number of inputs increases a predecoder can be
employed.
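The stage ordering described above (NOR predecoder, then NAND, then NOR) can be sketched at the logic level. This is one reading of the text, not the exact netlist of Fig. 7. The split into two 2:4 predecoders plus the MSB, and all function names, are assumptions made for illustration, and the replica inverter chain is represented only by the literals it delivers.

```python
def nand(*inputs: int) -> int:
    return 0 if all(inputs) else 1


def nor(*inputs: int) -> int:
    return 0 if any(inputs) else 1


def inv(x: int) -> int:
    return 1 - x


def predecode_2_to_4(b1: int, b0: int) -> list[int]:
    """NOR-based 2:4 predecoder: output k is high only when both NOR inputs are low."""
    out = []
    for k in range(4):
        in0 = inv(b0) if k & 1 else b0   # complement the bits that must be 1 for line k
        in1 = inv(b1) if k & 2 else b1
        out.append(nor(in1, in0))
    return out


def decode_5_to_32(address: int) -> list[int]:
    a = [(address >> i) & 1 for i in range(5)]
    low = predecode_2_to_4(a[1], a[0])    # stage 1: one-hot over A1 A0
    high = predecode_2_to_4(a[3], a[2])   # stage 1: one-hot over A3 A2
    # Stage 2: NAND gives a unique low for the selected pair (index 4*j + i).
    mid = [nand(high[j], low[i]) for j in range(4) for i in range(4)]
    # Stage 3: NOR with the MSB literal that is low for the selected half.
    msb_literal = [a[4], inv(a[4])]
    return [nor(mid[k % 16], msb_literal[k // 16]) for k in range(32)]


print(decode_5_to_32(21).index(1))  # 21
```

In this reading the final NOR stage has 16 x 2 gates, and the MSB literals are the signals that a delay-matched inverter chain would supply.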

Figure 7 Proposed 5:32 decoder using Predecoder and replica circuit

This simple approach is the basic principle of the design technique, and decoders of
this type can be designed with it. The third stage of this decoder needs an inverter for
decoding, but a simple inverter gives false decoding because of the different path delays
through the different gate stages. The replica circuit overcomes this problem of multiple
selections. A CMOS inverter has an optimal fan-out of 4, so driving the 16x2 stage
requires 8 inverters, 4 carrying the high logic and 4 the low logic. The replica chain is
built on this basis and the decoder is designed.

The layout of the proposed decoder is shown in Fig. 8.
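The inverter count quoted above follows from the fan-out-of-4 rule. The short calculation below restates that arithmetic. The function name `driver_inverters` is hypothetical, and the even split between true and complemented drivers follows the statement in the text.

```python
import math


def driver_inverters(loads: int, fan_out: int = 4) -> int:
    """Number of inverters needed so that none drives more than fan_out gate inputs."""
    return math.ceil(loads / fan_out)


loads = 16 * 2                       # gate inputs of the 16 x 2 final stage
drivers = driver_inverters(loads)    # 32 / 4 = 8
true_drivers = drivers // 2          # 4 inverters carry the high logic
complement_drivers = drivers // 2    # 4 inverters carry the low logic
print(drivers, true_drivers, complement_drivers)
```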

Simulation results show that the delay and power dissipation of the proposed decoder are
the lowest among the traditional, block and proposed architectures, and that its
transistor count is no higher than that of either. Fig. 9 shows that, as the size of the
decoder increases, the advantage of the proposed decoder over the block and traditional
decoder architectures grows.
Figure 8 Layout of proposed decoder

[Chart: delay (ps) versus decoder size (2:4, 3:8, 4:16, 5:32) for the Traditional, Block and Proposed decoders]

Figure 9 Delay comparison of proposed architecture with traditional and Block architecture

Results and comparison


For the 5:32 decoder, the comparison between the traditional, universal block and proposed
architectures is shown in Table 2. The delay and power dissipation of the proposed
architecture are lower than those of the traditional and universal block architectures,
and its transistor count is lower than that of the traditional architecture and equal to
that of the block architecture. Fig. 9 shows that the proposed decoder performs better
than the traditional and block decoders, and that its advantage grows as the size of the
decoder increases. Fig. 10 shows the simulation of the proposed decoder. Table 3 presents
the results of the corner analysis of the proposed decoder: the largest delay, 129.5 ps,
is found for the SS corner and the smallest, 81.6 ps, for the FF corner.

Figure 10 Simulation result of proposed 5:32 decoder

Table 2
Comparison Between Traditional, Block and Proposed

                      Traditional    Block    Proposed
Delay (ps)            162            119      98
No. of Transistors    370            250      250
Power (uW)            295            210      155
Table 3
Result of Corner Analysis

Conditions    Propagation Delay (ps)
TT            98
SS            129.5
FF            81.6
FS            97
SF            101

A decoder with NAND and NOR stages, a predecoder and a replica circuit has been
designed. It has lower power consumption and delay than the traditional and
block architectures. The delay and power dissipation of the proposed decoder are 60.49%
and 52.54% of those of the traditional architecture and 82.35% and 73.81% of those of the
universal block architecture, respectively. A high-speed decoder is an important block
for a fast SRAM. The proposed decoder is used to implement a 1-Kb, 8-bit, 1.25-GHz SRAM.
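The percentages above can be reproduced directly from the delay and power values of Table 2. The snippet below only restates that arithmetic. The dictionary layout and the function name `percent_of` are choices made for this illustration.

```python
# Delay and power values copied from Table 2.
table2 = {
    "Traditional": {"delay_ps": 162, "power_uW": 295},
    "Block":       {"delay_ps": 119, "power_uW": 210},
    "Proposed":    {"delay_ps": 98,  "power_uW": 155},
}


def percent_of(reference: str, metric: str) -> float:
    """Proposed-decoder value as a percentage of the reference architecture."""
    return 100 * table2["Proposed"][metric] / table2[reference][metric]


for ref in ("Traditional", "Block"):
    delay = percent_of(ref, "delay_ps")
    power = percent_of(ref, "power_uW")
    print(f"{ref}: delay {delay:.2f}%, power {power:.2f}%")
```

Relative to the traditional decoder this corresponds to a delay reduction of roughly 40% and a power reduction of roughly 47%.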

4 DYNAMIC DECODERS
The decoder is a basic building block of any random access memory (RAM). Decoders are basically of two types, static and dynamic. Dynamic decoders can be further classified in many ways, but the basic NOR decoder, in its various derived forms, is the most widely used. Dynamic decoder circuits notably outperform static CMOS decoders because they use fewer transistors.

SRAM designs normally use word-line address decoders implemented in static CMOS logic. Static CMOS decoders have a high fan-out, and logical effort studies show that, for minimum delay, a decoder should be built as a tree of two- and three-input NAND and NOR gates along with inverters. Faster decoders have been implemented by balancing the logical effort of the gates. One easy and straightforward way to accomplish this is to use dynamic decoder circuits.

4.1 NOR array decoders


A basic 2:4 NOR array decoder is shown in Fig. 11. Every line has one pull-up PMOS and two pull-down NMOS transistors. All pull-up PMOS transistors are driven by a single control signal (ctl), which acts as the on-off switch of the decoder. When ctl is at logic high a line cannot be pulled high; ctl must be low for a line to be pulled to logic high. Power dissipation can be controlled by varying the pulse width of the control signal: as the pulse width decreases, power dissipation decreases, but the performance of the decoder degrades. The pulse width can therefore be reduced to the time needed to decode the address. In every line the pull-up and pull-down transistors form a voltage divider that splits VDD across the pull-up and pull-down resistances. The pull-down transistor must be small compared with the pull-up transistor to maintain a high swing on the output line; as the size of the pull-down transistor increases, its resistance decreases and the swing goes down. The inverters used in this circuit are symmetrical, with the following specifications:

$(W/L)_n = 120\,\text{nm}/80\,\text{nm}$, $(W/L)_p = 360\,\text{nm}/80\,\text{nm}$, $NM_L = 0.4\,\text{V}$, $NM_H = 0.4\,\text{V}$

Propagation delay = 7 ps
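At the logic level, the NOR array described above can be sketched as follows. This is a simplified model under stated assumptions, not the simulated circuit: the wiring of true and complemented address bits to the pull-down gates is assumed to be the usual one (each line stays high only for its own address), a line is reported as 0 whenever the pull-up is off (that is, it is assumed to have been low beforehand), and voltage-divider effects are ignored. The function name `nor_array_decode` is hypothetical.

```python
from itertools import product


def nor_array_decode(address_bits, ctl=0):
    """Logic-level model of an n-input NOR array decoder.

    address_bits: tuple of 0/1 values, MSB first.
    ctl: control signal on the pull-up PMOS gates; 0 turns the pull-ups on.
    Returns a list of 2**n line values (1 = line high).
    """
    n = len(address_bits)
    pullup_on = (ctl == 0)
    lines = []
    for code in product((0, 1), repeat=n):
        # Assumed wiring: on the line for `code`, the pull-down for bit i is
        # driven by A_i when code bit i is 0 and by NOT A_i when it is 1, so
        # every pull-down is off only when the address equals `code`.
        pulldown_on = any(a != c for a, c in zip(address_bits, code))
        lines.append(1 if pullup_on and not pulldown_on else 0)
    return lines


if __name__ == "__main__":
    for address in product((0, 1), repeat=2):
        print(address, nor_array_decode(address))
```

For a 2:4 decoder each line has two pull-down devices, one per address bit, which matches the transistor count per line given in the text; the same function with three address bits models the larger 3:8 array.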

Fig. 12 shows the simulation result of the 2:4 dynamic NOR array decoder when the control signal is permanently grounded. All transistors, both NMOS and PMOS, have $W/L = 120\,\text{nm}/80\,\text{nm}$. In this case the propagation delay is 17 ps and the power dissipation is 133 µW.

Figure 11 2:4 NOR array decoder [8]

Variation of propagation delay with respect to the aspect ratio of the transistors is given below:

1) When $(W/L)_P = 120\,\text{nm}/80\,\text{nm}$ and $(W/L)_N = 120\,\text{nm}/80\,\text{nm}$, the propagation delay is 17 ps.

2) When $(W/L)_P = 120\,\text{nm}/80\,\text{nm}$ and $(W/L)_N = 200\,\text{nm}/80\,\text{nm}$, the propagation delay is 21 ps.

3) When $(W/L)_P = 120\,\text{nm}/80\,\text{nm}$ and $(W/L)_N = 360\,\text{nm}/80\,\text{nm}$, the propagation delay is 24 ps.

Figure 12 Simulation of 2:4 dynamic decoder

Figure 13 Simulation result of pulsed based 2:4 decoder

As the results above show, increasing the transistor dimensions increases the MOSFET capacitance and therefore the delay, so the smallest dimensions are best suited. Power dissipation is also large in this decoder; it can be reduced by reducing the pulse width of the control signal. With a pulse width of 30 ps (period 100 ps, with a 30 ps low time during which the PMOS is on), power dissipation is reduced to 80 µW. The simulation result with pulsed control is shown in Fig. 13.
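The effect of pulsing the control signal can be summarised with two small calculations, using the 30 ps on-time, the 100 ps period, and the 133 µW and 80 µW power figures reported for the 2:4 decoder. The helper names are illustrative only.

```python
def pullup_duty_cycle(on_time_ps: float, period_ps: float) -> float:
    """Fraction of each period during which the pull-up PMOS conducts."""
    return on_time_ps / period_ps


def power_saving_percent(p_before_uW: float, p_after_uW: float) -> float:
    """Relative reduction in average power, in percent."""
    return 100.0 * (p_before_uW - p_after_uW) / p_before_uW


duty = pullup_duty_cycle(30, 100)        # PMOS on for 30 ps out of every 100 ps
saving = power_saving_percent(133, 80)   # ctl grounded vs. pulsed ctl

if __name__ == "__main__":
    print(f"duty cycle: {duty:.0%}, power saving: {saving:.1f}%")
```

The pull-up conducts for 30% of the period and the reported power falls by about 40%. The saving is smaller than the reduction in on-time, which suggests that part of the dissipation does not scale with the pulse width; the thesis does not break the power down further.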

A larger 3:8 version of the NOR array decoder is shown in Fig. 14. It has the same construction and operation as the 2:4 decoder. The simulation result of the 3:8 NOR array decoder is shown in Fig. 15. The propagation delay is 27 ps and the power dissipation is 240 µW when all transistors, both NMOS and PMOS, have $W/L = 120\,\text{nm}/80\,\text{nm}$.

Figure 14 3:8 NOR array decoder

Figure 15 Simulation Result of 3:8 NOR array decoder

Power dissipation can be reduced here too by using the pulsed scheme, but not by much. Power dissipation is the major problem of the NOR array decoder: as the number of inputs increases to 4 and above, delay and power dissipation increase drastically.

Split decoder structures can be applied to reduce the delay when implementing larger decoders. One possible scheme is to divide the structure into local and global address lines, i.e. the divided word line structure.

4.2 Divided word line architecture


When the decoder structure becomes large, a structure with at least two stages is used as an alternative decoder. The divided word line (DWL) structure divides a single SRAM into small blocks. A local word line is switched on only when both the global word line and the block select are activated together by the address lines.

Since only one block is selected and activated at a time, the word line delay and power dissipation of the SRAM are reduced. Fig. 16 shows the block diagram of the DWL structure, in which the decoder is divided into two parts: the MSBs of the address are decoded into global word lines and the remaining bits into local word lines.

Figure 16 Block diagram of DWL structure

4.3 Decoder by using Divided word line architecture

A 5:32 decoder has been implemented using the divided word line architecture (Fig. 17). A 2:4 NOR decoder is used as the global decoder and 3:8 NOR decoders as the local decoders. All NOR decoders are implemented with a control switch and are controlled by an external signal. The three LSBs of the address are applied to the 3:8 decoders, and their control signals are supplied by the 2:4 decoder. When an address is applied, the three LSBs try to activate the local decoders, but because their control signals are low the outputs remain low. After some time the global decoder gives a high signal to one local decoder, and one output rises. The simulation result is shown in Fig. 18 and the layout in Fig. 19. The circuit is implemented in 90 nm CMOS technology with $V_{DD} = 1\,\text{V}$; $W/L = 120\,\text{nm}/80\,\text{nm}$ (both NMOS and PMOS) gives the best result. Worst-case delay = 41 ps (at line 28). Average power dissipation = 819mW.

This decoder has been successfully implemented in a 1 Kb, 1.25 GHz SRAM memory.
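The composition of the 5:32 decoder from one 2:4 global decoder and 3:8 local decoders can be sketched behaviourally as below. This is a logic-only model under assumptions: four local decoders are assumed (one per global output), word lines are assumed to be numbered so that the two MSBs pick the block and the three LSBs pick the row inside it, and timing between global and local stages is ignored. The names `one_hot` and `dwl_decode_5to32` are hypothetical.

```python
def one_hot(value: int, width: int) -> list:
    """Return a list of `width` bits with only position `value` set."""
    return [1 if i == value else 0 for i in range(width)]


def dwl_decode_5to32(address: int) -> list:
    """Behavioural model of a 5:32 divided word line decoder."""
    if not 0 <= address < 32:
        raise ValueError("address must fit in 5 bits")
    block = address >> 3       # two MSBs go to the 2:4 global decoder
    row = address & 0b111      # three LSBs go to every 3:8 local decoder
    block_enable = one_hot(block, 4)   # control signal for each local decoder
    local = one_hot(row, 8)            # row selected inside a block
    # A word line rises only in the block whose control signal is active.
    return [block_enable[b] & local[r] for b in range(4) for r in range(8)]


if __name__ == "__main__":
    lines = dwl_decode_5to32(28)
    print("active word line:", lines.index(1))
```

Under this numbering, line 28 (the worst-case line quoted in the text) is row 4 of the last block.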

Figure 17 5:32 NOR array Decoder by using divided word line architecture

Figure 18 Simulation Result of 5:32 Decoder

Figure 19 Layout of 5:32 Decoder

4.4 Sense Amplifier Decoder


The Sense-Amp decoder is proposed by applying selective precharging. Figure 20 shows the schematic of this decoder. Except for the sense amplifier circuitry, this decoder is the same as the NOR decoder. In the 4:16 decoder the first two address bits drive the NOR array connections that activate the transistors. The remaining two lines are combined by AND logic; here a source-coupled NAND gate is used to generate the AND function. Two address bits activate four output lines and one of them is selected by the AND logic. In the sense amplifier, cross-coupled inverters are used to charge the line fully. To keep power dissipation low, the precharge stage is of short duration. Additionally, strong inverters are not needed for the precharge and discharge signals since the signals are not inverted. The Sense-Amp decoder, designed in 90-nm CMOS technology, uses a 180-ps discharge-precharge-evaluate cycle with stage lengths of 60 ps each. The cycle is based on the precharge and discharge signals. In the discharge stage all select lines are connected to ground through the discharge transistors, and only the last selected select line actually has to be pulled down to ground.

Figure 20 Sense Amplifier Decoder

Figure 21 simulation of sense-amp decoder

Unlike the AND–NOR scheme, this eliminates the need for additional peripheral circuitry to pull down the last selected select line when only the address MSBs change.

The simulation result of the decoder is given in Fig. 21. The circuit is implemented in 90 nm CMOS technology with $V_{DD} = 1\,\text{V}$. Delay = 120 ps (period of the discharge-precharge-evaluate cycle). Average power dissipation = 564mW (120 ps cycle). This decoder can supply a large output current, so the need for a separate driver circuit is eliminated.

5 SENSE AMPLIFIER
5.1 Introduction
The sense amplifier is part of the read circuit. It is used to access data from the selected SRAM cell, and the robustness of bit sensing depends on it. It amplifies the voltage difference between the bit lines during a read operation, converting a small difference (around 100–200 mV for a 1 V supply) into full logic levels (here 0 V and 1 V). Bit lines have large capacitances, so discharging them from VDD all the way to ground would take a long time; the sense amplifier therefore increases the speed of the read operation. The operation and performance of the sense amplifier have a direct impact on SRAM performance, so an area-efficient and power-efficient design is essential to increase memory performance. Sense amplifier operation must not change the content of the cell because, unlike DRAM, an SRAM read is not followed by a refresh operation. During the initial design phase, the choice of circuit topology, transistor sizing, transistor operating point, power dissipation, gain and transient response must be considered together with the timing control and layout constraints of the SRAM system. The bit line voltage difference plays an important role because developing it takes a long time owing to the large bit line capacitance. A smaller voltage difference speeds up the memory but may cause read errors, so an optimal value should be chosen.
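The trade-off between bit line voltage difference and read time can be made explicit with a first-order estimate. It assumes that the selected cell discharges the bit line with a roughly constant current; the symbols $C_{BL}$ (bit line capacitance), $I_{cell}$ (cell read current) and $\Delta V_{BL}$ (required bit line difference) are introduced here for illustration and are not quantities reported in the thesis.

```latex
\[
  t_{\text{develop}} \;\approx\; \frac{C_{BL}\,\Delta V_{BL}}{I_{cell}}
\]
```

Under this approximation the time to develop the input signal grows linearly with the required difference, so halving $\Delta V_{BL}$ from 200 mV to 100 mV would roughly halve the bit line development time, at the cost of a smaller margin for the sense amplifier.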

5.2 Current mirror Sense amplifier


Basically any differential amplifier can work as a sense amplifier. Many types of structures have been proposed [13][14]. A current-mirror type sense amplifier is shown in Fig. 22.

With a 6T cell this amplifier has a significant advantage because it fits below the cell, so area can be optimized during layout.

The gain of this amplifier is given by

$A = g_m \,(r_{o1} \parallel r_{o2})$
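To make the gain expression concrete, the sketch below evaluates it for one set of device parameters. The numerical values of the transconductance and output resistances are hypothetical and chosen only for easy arithmetic; they are not extracted from the thesis or from any simulation.

```python
def parallel(r1: float, r2: float) -> float:
    """Equivalent resistance of two resistances in parallel."""
    return r1 * r2 / (r1 + r2)


def small_signal_gain(gm: float, ro1: float, ro2: float) -> float:
    """Gain magnitude A = gm * (ro1 || ro2)."""
    return gm * parallel(ro1, ro2)


# Hypothetical values for illustration only (not from the thesis):
# gm = 1 mS, ro1 = ro2 = 20 kOhm, so ro1 || ro2 = 10 kOhm.
gain = small_signal_gain(gm=1e-3, ro1=20e3, ro2=20e3)

if __name__ == "__main__":
    print(f"A = {gain:.1f}")
```

Because the two output resistances appear in parallel, the smaller of the two limits the gain; raising only one of them gives diminishing returns.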

Figure 22 Current Mirror Type Sense Amplifier

As shown in Fig. 22, the circuit has two transistors M1 and M2 as the current-source load, M3 and M4 as driver transistors, M5 as the current source, and M6 and M7 forming an inverter (amplifier). Increasing the aspect ratio of the current source increases the current and thus lowers the output level. The driver input also plays an important role because $g_m$ depends on the gate voltage. Increasing the aspect ratio of the load transistors pushes more current into the output node, so the swing increases. The amplifier output is not a digital logic level (either VDD or GND), so an inverter is placed at the output to obtain proper logic levels. The aspect ratios of all transistors are given below.

$(W/L)_{1,2} = 1400\,\text{nm}/200\,\text{nm}$, $(W/L)_{3,4} = 400\,\text{nm}/80\,\text{nm}$, $(W/L)_{5} = 450\,\text{nm}/80\,\text{nm}$, $(W/L)_{6} = 370\,\text{nm}/80\,\text{nm}$, $(W/L)_{7} = 120\,\text{nm}/80\,\text{nm}$

Figure 23 simulation result of current mirror SA

Figure 24 Layout of current mirror SA

The simulation result of the sense amplifier is given in Fig. 23; the amplifier output is shown by the green line. The circuit works for inputs up to 950 mV (with VDD = 1 V), which is higher than the bit line voltage. The large current-mirror PMOS transistors ensure a mid-rail DC level (500 mV) and a symmetrical output level, so a symmetrical inverter is used at the output. The propagation delay is 61 ps and the average power dissipation is 57 µW for this SA at a 100 mV differential input. As the difference between BL and BLB increases, SA operation becomes faster, but the bit lines take longer to charge and discharge. The output is single-ended, so no differential to single-ended conversion is needed; this is the main advantage of this type of amplifier. The layout of the current-mirror type SA is shown in Fig. 24.

5.3 Latch Type Sense Amplifier


A latch-type SA is shown in Fig. 25. A latch-type sense amplifier uses two cross-coupled inverters, as in the 6T SRAM cell, to amplify the difference between the bit line voltages. While the read enable is low, both access PMOS transistors M6 and M7 are switched on and the output nodes charge to the bit line voltages. When enable goes high, M5 switches on and sensing starts. In this phase both access transistors are switched off and the bit lines are decoupled from the sense amplifier, so any further change in the bit line voltages does not affect the read operation; this time can be used to precharge the bit lines, which saves time. Once the access transistors are off, the cross-coupled inverters regenerate and drive the outputs to full logic levels.
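The two phases described above can be captured in an idealised behavioural model. It is a sketch under assumptions: regeneration is treated as instantaneous and offset-free, the output on the BL side is assumed to resolve high when BL is the higher bit line, and the function name `latch_sense_amp` is hypothetical.

```python
def latch_sense_amp(v_bl: float, v_blb: float, enable: int, vdd: float = 1.0):
    """Idealised two-phase model of a latch-type sense amplifier.

    enable = 0: access PMOS M6/M7 conduct and the outputs follow the bit lines.
    enable = 1: M5 conducts, the bit lines are isolated and the cross-coupled
                inverters regenerate the sampled difference to the rails.
    Returns (out, out_b) in volts.
    """
    if enable == 0:
        return v_bl, v_blb
    if v_bl == v_blb:
        # An ideal latch has no way to decide without a differential input.
        raise ValueError("no differential input to resolve")
    return (vdd, 0.0) if v_bl > v_blb else (0.0, vdd)


if __name__ == "__main__":
    print(latch_sense_amp(0.9, 0.7, enable=0))  # sampling phase
    print(latch_sense_amp(0.9, 0.7, enable=1))  # regeneration phase
```

The model makes the role of positive feedback visible: once the latch is enabled, only the sign of the sampled difference matters, which is why the bit lines can be precharged again while the outputs settle.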

Figure 25 Latched Type Sense Amplifier

Figure 26 Simulation Result of Latched Type SA

This amplifier is fast because the cross-coupled inverters form a positive feedback loop that drives the outputs rapidly. The transistor sizing is as follows:

$(W/L)_{1,2} = 1400\,\text{nm}/80\,\text{nm}$, $(W/L)_{3,4} = 800\,\text{nm}/80\,\text{nm}$, $(W/L)_{5} = 1200\,\text{nm}/80\,\text{nm}$, $(W/L)_{6,7} = 1200\,\text{nm}/80\,\text{nm}$

Table 4
Simulation Result of Latched Type SA

BL Voltage   BLB Voltage   Propagation delay   Average power
900 mV       700 mV        31 ps               112 µW
700 mV       900 mV        32 ps               110 µW
1 V          900 mV        41 ps               103 µW
900 mV       1 V           39 ps               103 µW
800 mV       900 mV        43 ps               98 µW
900 mV       800 mV        42 ps               97 µW

Figure 27 Layout of latched type SA

The simulation result of the latch-type sense amplifier is shown in Fig. 26, where both outputs are plotted. While the SE signal is low, both outputs rise to the bit line voltages; when SE goes high, one output goes low and the other high. The complete results are given in Table 4.

As the table shows, a larger difference between the bit lines gives a shorter sensing time, but developing a larger difference takes more time. Power is also higher when the difference is larger. A latch-type sense amplifier needs differential to single-ended conversion; alternatively a single output can be taken with the other left dangling, but that causes unequal loading of the two outputs, so conversion is necessary. For simulation, a chain of two inverters with a large aspect ratio is used as the load. The layout of this amplifier is shown in Fig. 27.
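The trend read from Table 4 can be checked by grouping the six rows by the magnitude of the bit line difference and averaging delay and power. The data are copied from the table; the function name `mean_by_differential` and the tuple layout are illustrative.

```python
# (BL in mV, BLB in mV, delay in ps, average power in uW), copied from Table 4.
TABLE_4 = [
    (900, 700, 31, 112),
    (700, 900, 32, 110),
    (1000, 900, 41, 103),
    (900, 1000, 39, 103),
    (800, 900, 43, 98),
    (900, 800, 42, 97),
]


def mean_by_differential(rows):
    """Average (delay, power) for each |BL - BLB| value in mV."""
    groups = {}
    for bl, blb, delay, power in rows:
        groups.setdefault(abs(bl - blb), []).append((delay, power))
    return {
        dv: (
            sum(d for d, _ in vals) / len(vals),
            sum(p for _, p in vals) / len(vals),
        )
        for dv, vals in groups.items()
    }


summary = mean_by_differential(TABLE_4)

if __name__ == "__main__":
    for dv, (delay, power) in sorted(summary.items()):
        print(f"{dv} mV: {delay:.2f} ps, {power:.2f} uW")
```

With a 200 mV difference the average delay is 31.5 ps at 111 µW, against 41.25 ps at about 100 µW for a 100 mV difference, which is the speed versus power trade-off described in the text.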

6 CONCLUSION
Different types of static and dynamic decoders have been designed and analyzed in this work, together with two basic types of sense amplifier. Three static decoders were designed and analyzed. The conventional AND decoder suffers from a large transistor count and delay. AND gates are not naturally available in CMOS; they are realized by combining NAND and NOT gates. A decoder cannot be built without a gate that gives a unique output for one input combination. An AND gate gives a unique high output when all inputs are high, whereas a NAND gate gives a unique low output when all inputs are high. A NAND gate therefore cannot be used in the second stage of the decoder, and a decoder cannot be built from NAND gates alone. A decoder can, however, be built by alternating NAND and NOR gates, and a decoder was designed on this basis. This decoder has some serious issues, such as unequal path delays. To solve them a new decoder is proposed, which performs better than the other two. Delay and power dissipation in the proposed decoder are 60.49% and 52.54% of the traditional architecture and 82.35% and 73.80% of the universal block architecture, respectively.
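The argument about alternating NAND and NOR stages can be verified exhaustively for a four-input case. The sketch below is a logic-level illustration with hypothetical function names; it is not the gate netlist of any decoder in this work.

```python
from itertools import product


def nand(*inputs):
    """NAND: output is 0 only when every input is 1."""
    return 0 if all(inputs) else 1


def nor(*inputs):
    """NOR: output is 1 only when every input is 0."""
    return 0 if any(inputs) else 1


def and4_from_nand_nor(a, b, c, d):
    """4-input AND built from two NAND2 gates followed by one NOR2 gate."""
    return nor(nand(a, b), nand(c, d))


def nand_nand(a, b, c, d):
    """Two NAND stages in a row: this does NOT give a 4-input AND."""
    return nand(nand(a, b), nand(c, d))


# Exhaustive check of the NAND-NOR structure over all 16 input combinations.
matches_and = all(
    and4_from_nand_nor(*bits) == int(all(bits))
    for bits in product((0, 1), repeat=4)
)

if __name__ == "__main__":
    print("NAND-NOR equals AND4:", matches_and)
    print("NAND-NAND for 1,1,0,0:", nand_nand(1, 1, 0, 0))
```

The first-stage NAND outputs are active low, so a NOR in the second stage restores a unique active-high output, whereas a second NAND stage goes high for several input combinations and cannot select a single word line.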

The dynamic decoder work started with the conventional NOR decoder, which has large power dissipation. As the decoder size increases, power and delay increase drastically. A large decoder structure has been realized using small decoders and a selective precharging scheme. The performance of this decoder is improved and it has been deployed in a memory IC design. Another dynamic decoder, the sense amplifier decoder, was also designed and analyzed. Its power dissipation is very low.

Two basic sense amplifiers were also designed and analyzed. They are small, cell-compatible amplifiers. These decoders are used in the memory design, and all circuits were tested successfully in Cadence with UMC 90 nm technology. Layout, DRC check, RC extraction and post-layout simulation of the circuits were also done in Cadence with UMC 90 nm technology.

SRAM memories are used in high-speed computers and embedded applications, as processor registers and cache memory. The operating frequency of SRAM is compatible with modern processor speeds because they are fabricated in CMOS technology. Demand for SRAM will therefore remain high in the coming years, and efficient SRAM design and fabrication has large scope in the current and near-future market.

In future work, more efficient decoders can be realized by improving the available decoders. Other current-mode logic will be used to implement decoders. Sense amplifiers with cascade logic and current-mode sensing will be considered. Other memory components also need optimization and upgrading.

DISSEMINATION
A. K. Mishra, D. P. Acharya and P. Patra, “Novel Design Technique of Address Decoder for SRAM”, IEEE International Conference on Advanced Communication Control and Computing Technologies 2014, May 2014.

REFERENCES
[1] Michael A. Turi and José G. Delgado-Frias, “High-Performance Low-Power Selective Precharge Schemes for Address Decoders,” IEEE Transactions on Circuits and Systems, vol. 55, no. 9, pp. 917–921, 2008.

[2] Shivkaran Jain and Arun Kr. Chatterjee, “Nand gate architectures for memory decoder,” International Journal of Computers & Technology, pp. 610–614, 2013.

[3] I. Brzozowski, Ł. Zachara and A. Kos, “Universal Design Method of n-to-2^n Decoders,” Mixed Design of Integrated Circuits and Systems Conference, Poland, June 2013.

[4] B. S. Amrutur and M. A. Horowitz, “Fast low-power decoders for RAMs,” IEEE J. Solid-State Circuits, vol. 36, no. 10, pp. 1506–1515, Oct. 2001.

[5] Bharadwaj S. Amrutur, “Design and Analysis of Fast Low Power SRAMs,” Ph.D. Thesis, Stanford University, 1999.

[6] A. Pavlov and M. Sachdev, “CMOS SRAM Circuit Design and Parametric Test in Nano-Scaled Technologies,” Springer.

[7] Kevin Zhang, “Embedded Memories for Nano-Scale VLSIs,” Springer.

[8] Jan M. Rabaey, Anantha Chandrakasan and Borivoje Nikolic, “Digital Integrated Circuits: A Design Perspective,” PHI Learning, 2009.

[9] V. Sharma, “SRAM Design for Wireless Sensor Networks, Analog Circuits and Signal Processing,” Springer Science, New York, 2013.

[10] Betty Prince, “High Performance Memories: New Architecture DRAMs and SRAMs Evolution and Function,” John Wiley & Sons, 1996.

[11] David A. Hodges, “Digital integrated circuits design,” TMH.

[12] S. M. Kang and Y. Leblebici, CMOS Digital Integrated Circuits: Analysis and Design, Third Edition, McGraw-Hill, New York, 2003.

[13] Behzad Razavi, Design of Analog CMOS Integrated Circuits, McGraw-Hill, New York, 2001.

[14] Allen and Holberg, CMOS Analog Circuit Design, Oxford University Press, 2001.

[15] L. Wen, Z. Li and Y. Li, “High-performance dynamic circuit techniques with improved noise immunity for address decoders,” IET Circuits, Devices & Systems, received 3rd January 2012.
