Embedded Systems Design: A Unified Hardware/Software Introduction

Chapter 1: Introduction


• Embedded systems overview

– What are they?
• Design challenge – optimizing design metrics
• Technologies
– Processor technologies
– IC technologies
– Design technologies

Embedded systems overview

• Computing systems are everywhere

• Most of us think of “desktop” computers
– PC’s
– Laptops
– Mainframes
– Servers
• But there’s another type of computing system
– Far more common...

Embedded systems overview

• Embedded computing systems
– Computing systems embedded within electronic devices
– Hard to define. Nearly any computing system other than a desktop computer
– Billions of units produced yearly, versus millions of desktop units
– Perhaps 50 per household and per automobile

[Figure callouts: "Computers are in here... and here... and even here..."; "Lots more of these, though they cost a lot less each."]

A “short list” of embedded systems
Anti-lock brakes Modems
Auto-focus cameras MPEG decoders
Automatic teller machines Network cards
Automatic toll systems Network switches/routers
Automatic transmission On-board navigation
Avionic systems Pagers
Battery chargers Photocopiers
Camcorders Point-of-sale systems
Cell phones Portable video games
Cell-phone base stations Printers
Cordless phones Satellite phones
Cruise control Scanners
Curbside check-in systems Smart ovens/dishwashers
Digital cameras Speech recognizers
Disk drives Stereo systems
Electronic card readers Teleconferencing systems
Electronic instruments Televisions
Electronic toys/games Temperature controllers
Factory control Theft tracking systems
Fax machines TV set-top boxes
Fingerprint identifiers VCR’s, DVD players
Home security systems Video game consoles
Life-support systems Video phones
Medical testing systems Washers and dryers

And the list goes on and on

Some common characteristics of embedded systems
• Single-functioned
– Executes a single program, repeatedly
• Tightly-constrained
– Low cost, low power, small, fast, etc.
• Reactive and real-time
– Continually reacts to changes in the system’s environment
– Must compute certain results in real-time without delay

An embedded system example -- a digital camera

[Figure: digital camera chip block diagram. Blocks: CCD preprocessor, Pixel coprocessor, D2A, JPEG codec, Microcontroller, Multiplier/Accum, DMA controller, Display ctrl, Memory controller, ISA bus interface, UART, LCD ctrl.]

• Single-functioned -- always a digital camera

• Tightly-constrained -- Low cost, low power, small, fast
• Reactive and real-time -- only to a small extent

Design challenge – optimizing design metrics

• Obvious design goal:

– Construct an implementation with desired functionality
• Key design challenge:
– Simultaneously optimize numerous design metrics
• Design metric
– A measurable feature of a system’s implementation
– Optimizing design metrics is a key challenge

Design challenge – optimizing design metrics

• Common metrics
– Unit cost: the monetary cost of manufacturing each copy of the system,
excluding NRE cost
– NRE cost (Non-Recurring Engineering cost): The one-time
monetary cost of designing the system
– Size: the physical space required by the system
– Performance: the execution time or throughput of the system
– Power: the amount of power consumed by the system
– Flexibility: the ability to change the functionality of the system without
incurring heavy NRE cost

Design challenge – optimizing design metrics

• Common metrics (continued)

– Time-to-prototype: the time needed to build a working version of the system
– Time-to-market: the time required to develop a system to the point that it
can be released and sold to customers
– Maintainability: the ability to modify the system after its initial release
– Correctness, safety, many more

Design metric competition -- improving one may worsen others

• Expertise with both software and hardware is needed to optimize design metrics
– Not just a hardware or software expert, as is common
– A designer must be comfortable with various technologies in order to choose the best for a given application and constraints

[Figure: the metrics Power, Performance, Size, and NRE cost shown in competition, alongside the digital camera chip block diagram.]


Time-to-market: a demanding design metric

• Time required to develop a product to the point it can be sold to customers
• Market window
– Period during which the product would have highest sales
• Average time-to-market constraint is about 8 months
• Delays can be costly

[Figure: revenues ($) versus time (months).]

Losses due to delayed market entry

• Simplified revenue model
– Product life = 2W, peak at W
– Time of market entry defines a triangle, representing market penetration
– Triangle area equals revenue
• Loss
– The difference between the on-time and delayed triangle areas

[Figure: revenues ($) versus time, with on-time and delayed triangles. Labels: market rise, market fall, peak revenue, peak revenue from delayed entry, on-time entry, delayed entry; time axis marked D, W, 2W.]

Losses due to delayed market entry (cont.)

[Figure: revenues ($) versus time, with on-time and delayed triangles; time axis marked D, W, 2W.]

• Area = 1/2 * base * height
– On-time = 1/2 * 2W * W
– Delayed = 1/2 * (W-D+W) * (W-D)
• Percentage revenue loss = (D(3W-D) / (2W^2)) * 100%
• Try some examples
– Lifetime 2W=52 wks, delay D=4 wks
– (4*(3*26-4) / (2*26^2)) * 100% = 22%
– Lifetime 2W=52 wks, delay D=10 wks
– (10*(3*26-10) / (2*26^2)) * 100% = 50%
– Delays are costly!
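
The two worked examples can be checked with a short script. This is a minimal sketch of the simplified triangle model only; the function name `percentage_revenue_loss` is an illustrative choice, not something from the slides.

```python
def percentage_revenue_loss(W, D):
    """Revenue loss (%) for a product with lifetime 2W that enters the market D late.

    W and D must use the same time unit (weeks in the slide examples).
    """
    on_time = 0.5 * (2 * W) * W             # area of the on-time triangle
    delayed = 0.5 * (W - D + W) * (W - D)   # area of the delayed triangle
    return (on_time - delayed) / on_time * 100


# The two slide examples: lifetime 2W = 52 weeks, so W = 26.
for delay in (4, 10):
    print(f"D = {delay} wks: {percentage_revenue_loss(26, delay):.0f}% loss")
```

Computing the loss from the two areas rather than from the closed form makes it easy to confirm that both routes agree.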

NRE and unit cost metrics

• Costs:
– Unit cost: the monetary cost of manufacturing each copy of the system, excluding NRE cost
– NRE cost (Non-Recurring Engineering cost): The one-time monetary cost of designing the system
– total cost = NRE cost + unit cost * # of units
– per-product cost = total cost / # of units
= (NRE cost / # of units) + unit cost

• Example
– NRE=$2000, unit=$100
– For 10 units
– total cost = $2000 + 10*$100 = $3000
– per-product cost = $2000/10 + $100 = $300

Amortizing NRE cost over the units results in an additional $200 per unit
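
The cost formulas translate directly into code. The sketch below reproduces the example figures; the function names are illustrative assumptions.

```python
def total_cost(nre, unit, units):
    """Total cost = NRE cost + unit cost * number of units."""
    return nre + unit * units


def per_product_cost(nre, unit, units):
    """Per-product cost = total cost / number of units."""
    return total_cost(nre, unit, units) / units


# Slide example: NRE = $2000, unit cost = $100, 10 units.
print(total_cost(2000, 100, 10))        # 3000
print(per_product_cost(2000, 100, 10))  # 300.0
```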

NRE and unit cost metrics

• Compare technologies by costs -- best depends on quantity

– Technology A: NRE=$2,000, unit=$100
– Technology B: NRE=$30,000, unit=$30
– Technology C: NRE=$100,000, unit=$2

[Figure: two plots against number of units (volume), 0 to 2400, for technologies A, B, and C: total cost, $0 to $200,000, and per-product cost, $0 to $200.]

• But, must also consider time-to-market
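
To see how the best technology depends on quantity, the sketch below compares total cost for the three technologies listed above and computes the volumes at which two technologies cost the same. The helper names (`cheapest`, `break_even`) are illustrative, and the model ignores time-to-market.

```python
# (NRE cost, unit cost) in dollars, taken from the slide.
TECHNOLOGIES = {"A": (2_000, 100), "B": (30_000, 30), "C": (100_000, 2)}


def total_cost(name, units):
    nre, unit = TECHNOLOGIES[name]
    return nre + unit * units


def cheapest(units):
    """Name of the technology with the lowest total cost at this volume."""
    return min(TECHNOLOGIES, key=lambda name: total_cost(name, units))


def break_even(first, second):
    """Volume at which the two technologies have equal total cost."""
    nre1, unit1 = TECHNOLOGIES[first]
    nre2, unit2 = TECHNOLOGIES[second]
    return (nre2 - nre1) / (unit1 - unit2)


for units in (100, 1000, 3000):
    print(units, cheapest(units))

print(break_even("A", "B"))  # 400.0
print(break_even("B", "C"))  # 2500.0
```

Under these numbers A is cheapest below 400 units, B between 400 and 2500 units, and C above 2500 units.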

The performance design metric
• Widely used measure of a system, and widely abused
– Clock frequency, instructions per second – not good measures
– Digital camera example – a user cares about how fast it processes images, not
clock speed or instructions per second
• Latency (response time)
– Time between task start and end
– e.g., Cameras A and B process images in 0.25 seconds
• Throughput
– Tasks per second, e.g. Camera A processes 4 images per second
– Throughput can be more than latency seems to imply due to concurrency, e.g.
Camera B may process 8 images per second (by capturing a new image while
previous image is being stored).
• Speedup of B over A = B’s performance / A’s performance
– Throughput speedup = 8/4 = 2
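
The camera example can be written out as a small calculation. This sketch assumes performance is measured as throughput (images per second) or as the reciprocal of latency; the variable and function names are illustrative.

```python
def speedup(performance_b, performance_a):
    """Speedup of B over A = B's performance / A's performance."""
    return performance_b / performance_a


latency_a = latency_b = 0.25   # seconds per image, same for both cameras
throughput_a = 4               # images per second
throughput_b = 8               # images per second (capture overlaps with storing)

print(speedup(throughput_b, throughput_a))        # 2.0
print(speedup(1 / latency_b, 1 / latency_a))      # 1.0
```

The two results differ because concurrency raises B's throughput without changing its latency.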

Three key embedded system technologies

• Technology
– A manner of accomplishing a task, especially using
technical processes, methods, or knowledge
• Three key technologies for embedded systems
– Processor technology
– IC technology
– Design technology

Processor technology
• The architecture of the computation engine used to implement a system’s desired functionality
• Processor does not have to be programmable
– “Processor” not equal to general-purpose processor

[Figure: controller and datapath organization of three processor types, each implementing "total = 0, for i = 1 to …": General-purpose (“software”), Application-specific, and Single-purpose (“hardware”).]

Processor technology
• Processors vary in their customization for the problem at hand

total = 0
for i = 1 to N loop
total += M[i]
end loop

[Figure: the desired functionality mapped onto a general-purpose processor, an application-specific processor, and a single-purpose processor.]

General-purpose processors
• Programmable device used in a variety of applications
– Also known as “microprocessor”
• Features
– Program memory
– General datapath with large register file and general ALU
• User benefits
– Low time-to-market and NRE costs
– High flexibility
• “Pentium” the most well-known, but there are hundreds of others

[Figure: general-purpose processor with controller (control logic and state register, IR, PC) and datapath (general ALU), program memory, and assembly code for: total = 0, for i = 1 to …]

Single-purpose processors

• Digital circuit designed to execute exactly one program
– a.k.a. coprocessor, accelerator or peripheral
• Features
– Contains only the components needed to execute a single program
– No program memory
• Benefits
– Fast
– Low power
– Small size

[Figure: single-purpose processor with controller (control logic, state register) and datapath (index, +), and data memory.]

Application-specific processors

• Programmable processor optimized for a particular class of applications having common characteristics
– Compromise between general-purpose and single-purpose processors
• Features
– Program memory
– Optimized datapath
– Special functional units
• Benefits
– Some flexibility, good performance, size and power

[Figure: application-specific processor with controller (control logic and state register, IR, PC) and datapath (registers, custom ALU), data memory, program memory, and assembly code for: total = 0, for i = 1 to …]

IC technology

• The manner in which a digital (gate-level) implementation is mapped onto an IC
– IC: Integrated circuit, or “chip”
– IC technologies differ in their customization to a design
– IC’s consist of numerous layers (perhaps 10 or more)
• IC technologies differ with respect to who builds each layer and when

[Figure: IC package and IC cross-section. Labels: oxide, source, channel, drain, silicon substrate.]

IC technology

• Three types of IC technologies

– Full-custom/VLSI
– Semi-custom ASIC (gate array and standard cell)
– PLD (Programmable Logic Device)

Full-custom/VLSI

• All layers are optimized for an embedded system’s particular digital implementation
– Placing transistors
– Sizing transistors
– Routing wires
• Benefits
– Excellent performance, small size, low power
• Drawbacks
– High NRE cost (e.g., $300k), long time-to-market

Semi-custom

• Lower layers are fully or partially built
– Designers are left with routing of wires and maybe placing some blocks
• Benefits
– Good performance, good size, less NRE cost than a full-custom implementation (perhaps $10k to $100k)
• Drawbacks
– Still require weeks to months to develop

Embedded Systems Design: A Unified Hardware/Software Introduction, (c) 2000 Vahid/Givargis

PLD (Programmable Logic Device)

• All layers already exist

– Designers can purchase an IC
– Connections on the IC are either created or destroyed to
implement desired functionality
– Field-Programmable Gate Array (FPGA) very popular
• Benefits
– Low NRE costs, almost instant IC availability
• Drawbacks
– Bigger, expensive (perhaps $30 per unit), power hungry, slower

Moore’s law

• The most important trend in embedded systems

– Predicted in 1965 by Intel co-founder Gordon Moore
– IC transistor capacity has doubled roughly every 18 months for the past several decades

[Figure: logic transistors per chip (in millions), logarithmic scale, plotted by year from 1981 to 2009.]

Moore’s law

• Wow
– This growth rate is hard to imagine, most people underestimate
– How many ancestors do you have from 20 generations ago
• i.e., roughly how many people alive in the 1500’s did it take to make you?
• 2^20 = more than 1 million people
– (This underestimation is the key to pyramid schemes!)

Graphical illustration of Moore’s law

[Figure: timeline from 1981 to 2002 comparing a leading edge chip in 1981 (10,000 transistors) with a leading edge chip in 2002 (150,000,000 transistors).]

• Something that doubles frequently grows more quickly than most people realize!
– A 2002 chip can hold about 15,000 1981 chips inside itself
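
The 15,000 figure, and the doubling period it implies, can be checked from the two data points in the illustration. This is a rough sketch that assumes smooth exponential growth between 1981 and 2002; the variable names are illustrative.

```python
import math

transistors_1981 = 10_000
transistors_2002 = 150_000_000

ratio = transistors_2002 / transistors_1981        # how many 1981 chips fit in a 2002 chip
doublings = math.log2(ratio)                       # number of doublings over the period
months_per_doubling = (2002 - 1981) * 12 / doublings

print(f"capacity ratio: {ratio:,.0f}")
print(f"doublings: {doublings:.1f}")
print(f"months per doubling: {months_per_doubling:.1f}")
```

The result is close to the 18-month doubling period quoted for Moore's law.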

Design Technology

• The manner in which we convert our concept of desired system functionality into an implementation
– Compilation/Synthesis: Automates exploration and insertion of implementation details for lower level.
– Libraries/IP: Incorporates pre-designed implementation from lower abstraction level into higher level.
– Test/Verification: Ensures correct functionality at each level, thus reducing costly iterations between levels.

Specification               Compilation/Synthesis   Libraries/IP    Test/Verification
System specification        System synthesis        Hw/Sw/OS        Model simulat./checkers
Behavioral specification    Behavior synthesis      Cores           Hw-Sw cosimulators
RT specification            RT synthesis            RT components   HDL simulators
Logic specification         Logic synthesis         Gates/Cells     Gate simulators

To final implementation

Design productivity exponential increase



[Figure: design productivity in (K) Trans./Staff-Mo., plotted by year from 1981 to 2009.]

• Exponential increase over the past few decades

The co-design ladder

• In the past:
– Hardware and software design technologies were very different
– Recent maturation of synthesis enables a unified view of hardware and software
• Hardware/software “codesign”

[Figure: the co-design ladder. Sequential program code (e.g., C, VHDL) descends by two paths. Software path: compilers, assembly instructions, assemblers and linkers (1950's, 1960's), machine instructions, ending in microprocessor plus program bits: “software”. Hardware path: behavioral synthesis, register transfers, RT synthesis (1980's, 1990's), logic equations / FSM's, logic synthesis (1970's, 1980's), logic gates, ending in implementation: “hardware”.]

The choice of hardware versus software for a particular function is simply a tradeoff among various
design metrics, like performance, power, size, NRE cost, and especially flexibility; there is no
fundamental difference between what hardware or software can implement.

Independence of processor and IC technologies
• Basic tradeoff
– General vs. custom
– With respect to processor technology or IC technology
– The two technologies are independent

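[Figure: processor technologies and IC technologies, each arranged from general to customized.]
– Processor technology: General-purpose processor, ASIP, Single-purpose processor
– IC technology: PLD, Semi-custom, Full-custom
– General, providing improved: Flexibility, Maintainability, NRE cost, Time-to-prototype, Cost (low volume)
– Customized, providing improved: Power efficiency, Performance, Size, Cost (high volume)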

Design productivity gap

• While designer productivity has grown at an impressive rate over the past decades, the rate of improvement has not kept pace with chip capacity

[Figure: IC capacity (logic transistors per chip, in millions) and productivity ((K) Trans./Staff-Mo.) plotted by year from 1981 to 2009 on logarithmic scales, with the gap between the two curves marked.]

Design productivity gap
• 1981 leading edge chip required 100 designer months
– 10,000 transistors / 100 transistors/month
• 2002 leading edge chip requires 30,000 designer months
– 150,000,000 / 5000 transistors/month
• Designer cost increase from $1M to $300M

[Figure: IC capacity (logic transistors per chip, in millions) and productivity ((K) Trans./Staff-Mo.) plotted by year from 1981 to 2009 on logarithmic scales, with the gap between the two curves marked.]
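
The designer-month figures above follow from dividing chip size by productivity. In the sketch below, the rate of $10,000 per designer-month is an assumption inferred from the slide's own numbers ($1M for 100 designer months), not a stated value; the names are illustrative.

```python
# Assumption: implied by "$1M" for 100 designer-months on the slide.
COST_PER_DESIGNER_MONTH = 10_000


def designer_months(transistors, transistors_per_month):
    """Effort needed to design a chip at a given designer productivity."""
    return transistors / transistors_per_month


effort_1981 = designer_months(10_000, 100)
effort_2002 = designer_months(150_000_000, 5_000)

print(effort_1981, effort_1981 * COST_PER_DESIGNER_MONTH)  # 100.0 1000000.0
print(effort_2002, effort_2002 * COST_PER_DESIGNER_MONTH)  # 30000.0 300000000.0
```

Productivity grew 50-fold over the period while chip capacity grew 15,000-fold, which is why the effort rises 300-fold.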

The mythical man-month
• The situation is even worse than the productivity gap indicates
• In theory, adding designers to team reduces project completion time
• In reality, productivity per designer decreases due to complexities of team management
and communication
• In the software community, known as “the mythical man-month” (Brooks 1975)
• At some point, can actually lengthen project completion time! (“Too many cooks”)
• 1M transistors, 1 designer=5000 trans/month
• Each additional designer reduces each designer’s productivity by 100 trans/month
• So 2 designers produce 4900 trans/month each

[Figure: team and individual productivity (trans/month, 0 to 60000) and months until completion versus number of designers (0 to 40). Data labels for months until completion: 43, 24, 19, 16, 15, 16, 18, 23.]
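
The model in the bullets above can be simulated directly. This is a minimal sketch of that simplified model only (a linear productivity penalty per added designer); the function names and default parameters are illustrative.

```python
def individual_productivity(designers, base=5000, penalty=100):
    """Transistors per month per designer; each extra designer costs everyone 100."""
    return base - penalty * (designers - 1)


def months_to_completion(designers, transistors=1_000_000):
    """Project duration when the whole team works at the reduced individual rate."""
    team_productivity = designers * individual_productivity(designers)
    return transistors / team_productivity


for n in range(5, 45, 5):
    team = n * individual_productivity(n)
    print(f"{n:2d} designers: team {team:5d} trans/month, "
          f"{months_to_completion(n):.0f} months")

# Team size with the shortest schedule among 1 to 40 designers.
best = min(range(1, 41), key=months_to_completion)
print("shortest schedule at", best, "designers")
```

Past roughly 25 designers the team's total output falls, so adding people lengthens the schedule.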

Summary

• Embedded systems are everywhere

• Key challenge: optimization of design metrics
– Design metrics compete with one another
• A unified view of hardware and software is necessary to
improve productivity
• Three key technologies
– Processor: general-purpose, application-specific, single-purpose
– IC: Full-custom, semi-custom, PLD
– Design: Compilation/synthesis, libraries/IP, test/verification

