Java Concurrency


Chair of Software Engineering

Software Architecture
Bertrand Meyer, Michela Pedroni
ETH Zurich, February-May 2010
Lecture 14: Designing for concurrency
(Material prepared by Sebastian Nanz)


Concurrent Processes

Why is concurrency so important?

Traditionally, a specialized area of interest to a few experts:
    Operating systems
    Networking
    Databases
Multicore and the Internet make it relevant to every programmer!

What they say about concurrency

Intel Corporation: "Multi-core processing is taking the industry on a fast-moving and exciting ride into profoundly new territory. The defining paradigm in computing performance has shifted inexorably from raw clock speed to parallel operations and energy efficiency."

Rick Rashid, head of Microsoft Research: "Multicore processors represent one of the largest technology transitions in the computing industry today, with deep implications for how we develop software."

Bill Gates: "Multicore: This is the one which will have the biggest impact on us. We have never had a problem to solve like this. A breakthrough is needed in how applications are done on multicore devices."

Evolution of hardware (source: Intel)

Multiprocessing
Until a few years ago, systems with one processing unit were standard.
Today, most end-user systems have multiple processing units in the form of multi-core processors.

[Diagram: Process 1 and Process 2 executing their instructions on CPU 1 and CPU 2]

Multiprocessing: the use of more than one processing unit in a system.
Execution of processes is said to be parallel, as they are running at the same time.

Multitasking & multithreading


Even on systems with a single processing unit, programs may appear to run in parallel:
    Multitasking*
    Multithreading (within a process, see in a few slides)

[Diagram: Process 1 and Process 2 interleaving their instructions on a single CPU]

Multi-tasked execution of processes is said to be interleaved, as all are in progress, but only one is running at a time. (Closely related concept: coroutines.)

*This is common terminology, but "multiprocessing" was also used previously as a synonym for multitasking.

Processes
A (sequential) program is a set of instructions
A process is an instance of a program that is being
executed

Concurrency
Both multiprocessing and multithreading are examples of
concurrent computation
The execution of processes or threads is said to be
concurrent if it is either parallel or interleaved

Computation
To perform a computation is
    To apply certain actions
    To certain objects
    Using certain processors

[Diagram: a processor applies actions to objects]

Operating system processes


How are processes implemented in an operating system?
Structure of a typical process:
Process identifier: unique ID of a process.
Process state: current activity of a process.
Process context: program counter, register values
Memory: program text, global data, stack, and heap.
[Diagram: process structure: process ID; context (program counter, register values); memory (code, global data, heap, stack)]

The scheduler
A system program called the scheduler controls which
processes are running; it sets the process states:
Running: instructions are being executed.
Blocked: currently waiting for an event.
Ready: ready to be executed, but has not been
assigned a processor yet.
[Diagram: process states blocked, running, ready, with context switches between running and ready]

The context switch


The swapping of processes on a processing unit by the
scheduler is called a context switch
[Diagram: contexts of P1 and P2 stored in memory; the CPU registers hold the context of the running process]

Scheduler actions when switching processes P1 and P2:


P1.state := ready
Save register values as P1's context in memory
Use context of P2 to set register values
P2.state := running

Concurrency within programs


We also want to use concurrency within programs
compute
    do
        t1.do_task1
        t2.do_task2
    end

Sequential execution (one CPU): the tasks run one after the other, total time m + n

Concurrent execution (CPU 1 and CPU 2): the tasks run at the same time, total time max(m, n)

Threads (lightweight processes)


Make programs concurrent by associating them with threads
A thread is a part of an operating system process
Private to each thread:
    Thread identifier
    Thread state
    Thread context (program counter, register values)
    Memory: only the stack
Shared with the other threads of the process:
    Process ID
    Program text (code)
    Global data
    Heap

[Diagram: one process containing three threads, each with its own program counter, register values, and stack, sharing the code, global data, and heap]

Processes vs threads
Process:
    Has its own (virtual) memory space (in O-O programming, its own objects)
    Sharing of data (objects) with another process:
        Is explicit (good for reliability, security, readability)
        Is heavy (bad for ease of programming)
    Switching to another process: expensive (needs to back up one full context and restore another)
Thread:
    Shares memory with other threads
    Sharing of data is straightforward
        Simpler programs (good)
        Risks of confusion and errors: data races (bad)
    Switching to another thread: cheap


Concurrent Programming:
The Java Threads approach

Concurrent programs in Java


Associating a computation with a thread:

class Thread1 extends Thread {
    public void run() {
        // implement task1 here
    }
}

class Thread2 extends Thread {
    public void run() {
        // implement task2 here
    }
}

void compute() {
    Thread1 t1 = new Thread1();
    Thread2 t2 = new Thread2();
    t1.start();
    t2.start();
}

Write a class that inherits from class Thread (or implements interface Runnable)
Implement the method run()
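
The Runnable form mentioned above is a small variation; a minimal sketch, with the task body left as a placeholder:

// Alternative: implement Runnable and hand the task to a Thread object.
class Task1 implements Runnable {
    public void run() {
        // implement task1 here
    }
}

void compute() {
    Thread t1 = new Thread(new Task1());
    t1.start();
}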

Joining threads
Often the final results of thread executions need to be
combined:
return t1.getResult() + t2.getResult();

To wait for both threads to be finished, we join them:


t1.start();
t2.start();
t1.join();
t2.join();
return t1.getResult() + t2.getResult();

The join() method, invoked on a thread t, causes the caller


to wait until t is finished
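
A self-contained sketch of this pattern (the Worker class, its result field and getResult() are illustrative names, not from the slides):

// Sketch: two worker threads whose results are combined after join().
class Worker extends Thread {
    private final int input;
    private int result;              // written by the worker thread, read after join()

    Worker(int input) { this.input = input; }

    public void run() {
        result = input * input;      // stand-in for the real task
    }

    int getResult() { return result; }
}

class JoinDemo {
    public static void main(String[] args) throws InterruptedException {
        Worker t1 = new Worker(3);
        Worker t2 = new Worker(4);
        t1.start();
        t2.start();
        t1.join();                   // wait until t1 is finished
        t2.join();                   // wait until t2 is finished
        System.out.println(t1.getResult() + t2.getResult());   // prints 25
    }
}

Note that join() throws InterruptedException, which is why main declares it; join() also guarantees that the results written by the workers are visible to the caller.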


Java Threads
Mutual exclusion

Race conditions (1)


Consider a counter class:

class Counter {
    private int value = 0;
    public int getValue() {
        return value;
    }
    public void setValue(int someValue) {
        value = someValue;
    }
    public void increment() {
        value++;
    }
}

Assume two threads:

Thread 1:
    x.setValue(0);
    x.increment();
    int i = x.getValue();

Thread 2:
    x.setValue(2);

Race conditions (2)

Because of the interleaving of threads, various results can be obtained:

    x.setValue(2); x.setValue(0); x.increment(); int i = x.getValue();    ->  i == 1, x.value == 1
    x.setValue(0); x.setValue(2); x.increment(); int i = x.getValue();    ->  i == 3, x.value == 3
    x.setValue(0); x.increment(); x.setValue(2); int i = x.getValue();    ->  i == 2, x.value == 2
    x.setValue(0); x.increment(); int i = x.getValue(); x.setValue(2);    ->  i == 1, x.value == 2

Such dependence of the result on nondeterministic interleaving is a race condition (or data race)
Such errors can stay hidden for a long time and are difficult to find by testing
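
A small sketch (not from the slides, assuming Java 8 or later for the lambda) that usually makes such a race visible: two threads perform unsynchronized increments on the Counter class above, and updates get lost because value++ is not atomic:

// Sketch: two threads doing 100,000 unsynchronized increments each.
// The printed value is usually less than 200000 because value++ is a
// read-modify-write sequence that the threads can interleave.
class RaceDemo {
    public static void main(String[] args) throws InterruptedException {
        Counter x = new Counter();          // Counter from the earlier slide
        Runnable work = () -> {
            for (int k = 0; k < 100000; k++) {
                x.increment();
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        System.out.println(x.getValue());   // often < 200000
    }
}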

Synchronization
To avoid data races, threads (or processes) must synchronize with each other, i.e. communicate to agree on the appropriate sequence of actions
How to communicate:
    By reading and writing to shared sections of memory (shared memory synchronization)
        In the example, threads should agree that at any one time at most one of them can access the resource
    By explicit exchange of information (message passing synchronization)
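
To illustrate the second style, a minimal message-passing sketch using java.util.concurrent (not part of the slides, which use shared memory throughout):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Sketch: two threads communicate by exchanging messages over a queue
// instead of reading and writing shared variables directly.
class MessagePassingDemo {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<Integer> channel = new LinkedBlockingQueue<>();

        Thread sender = new Thread(() -> {
            try {
                channel.put(42);             // send a message
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        sender.start();
        int message = channel.take();        // receive (blocks until a message arrives)
        System.out.println(message);
        sender.join();
    }
}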

Mutual exclusion
Mutual exclusion (or mutex) is a form of synchronization
that avoids the simultaneous use of a shared resource

To identify the program parts that need attention, we introduce the notion of a critical section: a part of a program that accesses a shared resource, and should normally be executed by at most one thread at a time

Mutual exclusion in Java


Each object in Java has a mutex lock (can be held only by one thread at a time!) that can be acquired and released within synchronized blocks:

Object lock = new Object();

synchronized (lock) {
    // critical section
}

The following are equivalent:

synchronized type m(args) {
    // body
}

type m(args) {
    synchronized (this) {
        // body
    }
}

Example: mutual exclusion


To avoid data races in the example, we enclose the instructions to be executed atomically in synchronized blocks protected with the same lock object:

Thread 1:
    synchronized (lock) {
        x.setValue(0);
        x.increment();
        int i = x.getValue();
    }

Thread 2:
    synchronized (lock) {
        x.setValue(2);
    }
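
An alternative sketch (not from the slides) pushes the locking into the class itself by declaring Counter's methods synchronized, so every caller acquires the same per-object lock:

// Sketch: a counter whose methods are guarded by the object's own lock.
class SynchronizedCounter {
    private int value = 0;

    public synchronized int getValue() {
        return value;
    }

    public synchronized void setValue(int someValue) {
        value = someValue;
    }

    public synchronized void increment() {
        value++;    // atomic with respect to the other synchronized methods
    }
}

Note that this protects each individual call; a sequence such as setValue(0); increment(); getValue() still needs an enclosing synchronized block to be atomic as a whole.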


Java Threads
Condition synchronization

The producer-consumer problem


Consider two types of looping processes:
Producer: At each loop iteration, produces a data
item for consumption by a consumer
Consumer: At each loop iteration, consumes a data
item produced by a producer
Producers and consumers communicate via a shared buffer
(a generalized notion of bounded queue)
Producers append data items to the back of the queue and
consumers remove data items from the front

Condition synchronization
The producer-consumer problem requires that processes
access the buffer properly:
Consumers must wait if the buffer is empty
Producers must wait if the buffer is full

Condition synchronization is a form of synchronization where


processes are delayed until a condition holds
In producer-consumer we use two forms of synchronization:
Mutual exclusion: to prevent races on the buffer
Condition synchronization: to prevent improper access
of the buffer

Condition synchronization in Java (2)


The following methods can be called on a synchronized
object (i.e. only within a synchronized block, on the lock
object):
wait(): block the current thread and release the lock
until some thread does a notify() or notifyAll()
notify(): resume one blocked thread (chosen
nondeterministically), set its state to "ready"
notifyAll(): resume all blocked threads
No guarantee the notification mechanism is fair

Producer-Consumer problem: Consumer code


public void consume() throws InterruptedException {
    int value;
    synchronized (buffer) {
        while (buffer.size() == 0) {
            buffer.wait();
        }
        value = buffer.get();
    }
}

Consumer blocks if buffer.size() == 0 is true (waiting for a notify() from the producer)

Producer-Consumer problem: Producer code


public void produce() {
    int value = random.produceValue();
    synchronized (buffer) {
        buffer.put(value);
        buffer.notify();
    }
}

Producer notifies the consumer that the condition buffer.size() == 0 is no longer true
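
For reference, a self-contained bounded-buffer sketch along the lines of these two slides; the class name, the backing deque and the use of notifyAll() are choices of this sketch, not of the original slides:

import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: a bounded buffer with both wait conditions (empty and full),
// using notifyAll() so that waiting producers and consumers are all woken.
class BoundedBuffer {
    private final Deque<Integer> items = new ArrayDeque<>();
    private final int capacity;

    BoundedBuffer(int capacity) { this.capacity = capacity; }

    public synchronized void put(int value) throws InterruptedException {
        while (items.size() == capacity) {    // producer must wait while the buffer is full
            wait();
        }
        items.addLast(value);
        notifyAll();                          // wake consumers waiting on "empty"
    }

    public synchronized int get() throws InterruptedException {
        while (items.isEmpty()) {             // consumer must wait while the buffer is empty
            wait();
        }
        int value = items.removeFirst();
        notifyAll();                          // wake producers waiting on "full"
        return value;
    }
}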


Java Threads
Deadlocks

The problem of deadlock


The ability to hold resources exclusively is central to
providing process synchronization for resource access
Unfortunately, it brings about other problems!

A deadlock is the situation where a group of processes


blocks forever because each of the processes is waiting
for resources which are held by another process in the
group

Deadlock example in Java


Consider the class:

public class C extends Thread {
    private Object a;
    private Object b;
    public C(Object x, Object y) {
        a = x;
        b = y;
    }
    public void run() {
        synchronized (a) {
            synchronized (b) {
                ...
            }
        }
    }
}

... and this code being executed:

C t1 = new C(a1, b1);
C t2 = new C(b1, a1);
t1.start();
t2.start();
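
Here t1 acquires a1 then b1 while t2 acquires b1 then a1, so each thread can end up holding the lock the other one needs. A common fix, sketched below and not part of the slides, is to acquire the two locks in one globally agreed order (ignoring the rare case of equal identity hash codes):

// Sketch: acquire the two locks in a consistent global order to avoid deadlock.
public void run() {
    Object first = a;
    Object second = b;
    if (System.identityHashCode(a) > System.identityHashCode(b)) {
        first = b;
        second = a;
    }
    synchronized (first) {
        synchronized (second) {
            // ... work that needs both a and b ...
        }
    }
}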

Dining philosophers


Are deadlock & data races of the same kind?


No
Two kinds of concurrency issues (Lamport):
    Safety: bad things do not happen
    Liveness: some (good) thing will happen

Data from the field

Source for the next few slides:
    Learning from Mistakes: Real World Concurrency Bug Characteristics
    Yuanyuan (YY) Zhou
    University of Illinois, Urbana-Champaign
    Microsoft Faculty Summit, 2008
    See also her paper at ASPLOS 2008

105 real-world concurrency bugs from 4 large open-source programs

                 MySQL      Apache      Mozilla     OpenOffice
Software type    Server     Server      Client      GUI
Language         C++/C      Mainly C    C++         C++
LOC (M lines)    2          0.3         -           -
Bug history      6 years    7 years     10 years    8 years

                 MySQL      Apache      Mozilla     OpenOffice    Total
Non-deadlock     14         13          41          6             74
Deadlock         -          -           16          -             31

Classified based on root causes

Categories:
    Atomicity violation: the desired atomicity of a certain code region is violated
    Order violation: the desired order between two (sets of) accesses is flipped
    Others
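
As an illustration of the atomicity-violation pattern (a sketch, not taken from the study): each call below is individually synchronized, but the check and the action that depends on it are not atomic as a pair:

import java.util.Vector;

// Sketch: check-then-act on a synchronized collection.
// Another thread may remove the last element between isEmpty() and remove(0),
// so the pair of calls is not atomic even though each call is.
class AtomicityViolation {
    private final Vector<Integer> queue = new Vector<>();   // Vector's methods are synchronized

    Integer takeFirstOrNull() {
        if (!queue.isEmpty()) {          // check
            return queue.remove(0);      // act: may throw ArrayIndexOutOfBoundsException
        }
        return null;
    }
}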

Implications

[Bar chart: number of bugs per root cause (Atomicity, Order, Other), broken down by application (MySQL, Apache, Mozilla, OpenOffice); there is a 3-bug overlap between Atomicity and Order]

We should focus on atomicity violation and order violation
Bug detection tools for order violation bugs are desired

Note that order violations can be fixed by adding locks to ensure atomicity with the previous operation, thereby ensuring order. But the root cause is the incorrect assumption about execution order.

101 out of 105 (96%) bugs involve at most two threads

Most bugs can be reliably disclosed if we check all possible interleavings between each pair of threads
A few bugs cannot
    Example: intensive resource competition among many threads causes unexpected delays


Concurrent Programming:
The SCOOP approach

Then and now

Sequential programming:
    Used to be messy
    Still hard, but key improvements:
        Structured programming
        Data abstraction & object technology
        Design by Contract
        Genericity, multiple inheritance
        Architectural techniques

Concurrent programming:
    Used to be messy
    Still messy
        Example: threading models in most popular approaches
        Development level: sixties/seventies
        Only understandable through operational reasoning

SCOOP mechanism
Simple Concurrent Object-Oriented Programming
Evolved through the last two decades:
    Comm. ACM paper (1993)
    Chap. 30 of Object-Oriented Software Construction, 2nd edition, 1997
    Piotr Nienaltowski's ETH thesis, 2008
    Current work by Sebastian Nanz, Benjamin Morandi, Scott West and others at ETH
Implementation available as part of EVE (research version of EiffelStudio)

Object-oriented computation
To perform a computation is
    To apply certain actions
    To certain objects
    Using certain processors

[Diagram: a processor applies actions to objects]

What makes an application concurrent?


Processor:
    Thread of control supporting sequential execution of instructions on one or more objects
    Can be implemented as:
        Computer CPU
        Process
        Thread
        AppDomain (.NET)
    Will be mapped to computational resources

Feature call: sequential

[Diagram: the client executes "previous; x.r (a); next"; the supplier routine "r (x : A) do ... end" runs on the same processor, so the call completes before "next"]

Feature call: asynchronous

[Diagram: the client's handler executes "previous; x.r (a); next"; the supplier's handler executes "r (x : A) do ... end" on a different processor, so the client proceeds to "next" without waiting]

The fundamental difference


To wait or not to wait:
    If same processor, synchronous
    If different processor, asynchronous
Difference must be captured by syntax:
    x: T
    x: separate T    -- Potentially different processor

Fundamental semantic rule: x.r (a) waits for non-separate x, doesn't wait for separate x.

Consistency rules: avoiding traitors

nonsep: T
sep: separate T

nonsep := sep    -- nonsep becomes a traitor!
nonsep.p (a)     -- non-separate call on a separate object

Wait by necessity
No explicit mechanism needed for client to resynchronize with supplier after separate call.
The client will wait only when it needs to:

    x.f
    x.g (a)
    y.f
    value := x.some_query    -- Wait here!

Lazy wait (Denis Caromel: wait by necessity)


SCOOP
Mutual exclusion

Separate argument rule (1)


Target of a separate call must be a formal argument of the enclosing routine:

put (b: separate BUFFER [T]; value: T)
        -- Store value into buffer.
    do
        b.put (value)
    end

To use a separate object:

buffer: separate BUFFER [INTEGER]

create buffer
put (buffer, 10)

Separate argument rule (2)

The target of a separate call must be an argument of the enclosing routine

Separate call: x.f (...) where x is separate

Wait rule

A routine call with separate arguments will execute when all corresponding processors are available, and hold them exclusively for the duration of the routine

Since all processors of separate arguments are locked and held for the duration of the routine, mutual exclusion is provided for the corresponding objects

Dining philosophers
class PHILOSOPHER inherit
    PROCESS
        rename
            setup as getup
        redefine
            step
        end

feature {BUTLER}

    step
        do
            think; eat (left, right)
        end

    eat (l, r: separate FORK)
            -- Eat, having grabbed l and r.
        do end

end

Typical traditional (non-SCOOP) code



SCOOP
Condition synchronization

Condition synchronization
SCOOP has an elegant way of expressing condition
synchronization by reinterpreting preconditions as wait
conditions
Completed wait rule:

A call with separate arguments waits until:
    The corresponding objects are all available
    Preconditions hold

Producer-Consumer problem: Consumer code


consume (a_buffer: separate BUFFER [INTEGER])
    require
        not (a_buffer.count = 0)    -- Precondition becomes wait condition
    local
        value: INTEGER
    do
        value := a_buffer.get
    end

Consumer blocks if the condition a_buffer.count = 0 is found to be true; no explicit wait()/notify() is needed, the call simply waits until the precondition holds

Producer-Consumer problem: Producer code


produce (a_buffer: separate BUFFER [INTEGER])
    require
        not a_buffer.is_full
    local
        value: INTEGER
    do
        value := random.produceValue
        a_buffer.put (value)
    end

Very easy to provide a solution for bounded buffers
No need for notification: the SCOOP scheduler ensures that preconditions are automatically reevaluated at a later time

Contracts
put (buf: separate QUEUE [INTEGER]; v: INTEGER)
        -- Store v into buffer.
    require
        not buf.is_full    -- Precondition becomes wait condition
        v > 0
    do
        buf.put (v)
    ensure
        not buf.is_empty
    end

...
put (my_buffer, 10)


SCOOP
Deadlocks

Deadlocks
Deadlocks are still possible in SCOOP. Work on deadlock prevention, based on locking orders, is underway

class C
feature
    a: separate A
    b: separate A

    make (x: separate A; y: separate A)
        do
            a := x
            b := y
        end

    f do g (a) end
    g (x: separate A) do h (b) end
    h (y: separate A) do ... end
end

create c1.make (a, b)
create c2.make (b, a)
c1.f
c2.f

For more
Several concurrency courses in the ETH curriculum,
including our (Bertrand Meyer, Sebastian Nanz) Concepts
of Concurrent Computation (Spring semester)
Good textbooks:
Kramer
Herlihy

