Distributed Systems

(3rd Edition)

Chapter 04: Communication


Version: February 25, 2017
Communication: Foundations Layered Protocols

Basic networking model

[Figure: the OSI reference model — seven layers, each with its own protocol: Application (7), Presentation (6), Session (5), Transport (4), Network (3), Data link (2), Physical (1), on top of the physical network.]

Drawbacks
Focus on message-passing only
Often unneeded or unwanted functionality
Violates access transparency

The OSI reference model 2 / 49


Communication: Foundations Layered Protocols

Low-level layers

Recap
Physical layer: contains the specification and implementation of bits, and
their transmission between sender and receiver
Data link layer: prescribes the transmission of a series of bits into a frame
to allow for error and flow control
Network layer: describes how packets in a network of computers are to be
routed.

Observation
For many distributed systems, the lowest-level interface is that of the network
layer.

The OSI reference model 3 / 49


Communication: Foundations Layered Protocols

Transport Layer

Important
The transport layer provides the actual communication facilities for most
distributed systems.

Standard Internet protocols


TCP: connection-oriented, reliable, stream-oriented communication
UDP: unreliable (best-effort) datagram communication

The OSI reference model 4 / 49


Communication: Foundations Layered Protocols

Middleware layer

Observation
Middleware is invented to provide common services and protocols that can be
used by many different applications
A rich set of communication protocols
(Un)marshaling of data, necessary for integrated systems
Naming protocols, to allow easy sharing of resources
Security protocols for secure communication
Scaling mechanisms, such as for replication and caching

Note
What remains are truly application-specific protocols... such as?

Middleware protocols 5 / 49
Communication: Foundations Layered Protocols

An adapted layering scheme

[Figure: an adapted layering scheme — Application (application protocol), Middleware (middleware protocol), Operating system (host-to-host protocol), Hardware (physical/link-level protocol), on top of the network.]

Middleware protocols 6 / 49
Communication: Foundations Types of Communication

Types of communication

Distinguish...
[Figure: a client sends a request to a server via a storage facility; synchronization can take place at request submission, at request delivery, or after processing by the server.]

Transient versus persistent communication


Asynchronous versus synchronous communication

7 / 49
Communication: Foundations Types of Communication

Types of communication
Transient versus persistent
[Figure: client-server communication via a storage facility, with the three possible synchronization points.]

Transient communication: the communication server discards a message when it cannot be delivered at the next server, or at the receiver.
Persistent communication: a message is stored at a communication server for as long as it takes to deliver it.
8 / 49
Communication: Foundations Types of Communication

Types of communication

Places for synchronization


[Figure: the same client-server exchange, highlighting the three possible synchronization points listed below.]

At request submission
At request delivery
After request processing

9 / 49
Communication: Foundations Types of Communication

Client/Server

Some observations
Client/Server computing is generally based on a model of transient
synchronous communication:
Client and server have to be active at time of communication
Client issues request and blocks until it receives reply
Server essentially waits only for incoming requests, and subsequently
processes them

Drawbacks of synchronous communication


Client cannot do any other work while waiting for reply
Failures have to be handled immediately: the client is waiting
The model may simply not be appropriate (mail, news)

10 / 49
Communication: Foundations Types of Communication

Messaging

Message-oriented middleware
Aims at high-level persistent asynchronous communication:
Processes send each other messages, which are queued
Sender need not wait for immediate reply, but can do other things
Middleware often ensures fault tolerance

11 / 49
Communication: Remote procedure call Basic RPC operation

Basic RPC operation

Observations
Application developers are familiar with simple procedure model
Well-engineered procedures operate in isolation (black box)
There is no fundamental reason not to execute procedures on separate
machine
Conclusion
Communication between caller and callee can be hidden by using a procedure-call mechanism.

[Figure: the client calls the remote procedure and waits for the result; the request travels to the server, which calls the local procedure, returns the results, and sends a reply back to the client.]

12 / 49
Communication: Remote procedure call Basic RPC operation

Basic RPC operation


[Figure: RPC between a client machine and a server machine — the client process calls r = doit(a,b); the client stub builds a message containing proc: "doit" and the marshaled values of a and b, the operating systems transport it across the network, and the server stub unpacks the message and makes a local call to doit.]

1 Client procedure calls client stub.
2 Stub builds message; calls local OS.
3 OS sends message to remote OS.
4 Remote OS gives message to stub.
5 Stub unpacks parameters; calls server.
6 Server does local call; returns result to stub.
7 Stub builds message; calls OS.
8 OS sends message to client's OS.
9 Client's OS gives message to stub.
10 Client stub unpacks result; returns to client.
13 / 49
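To make the ten steps concrete, here is a minimal sketch in Python of a client stub and server stub that marshal parameters with pickle over a plain socket. HOST, PORT, and the procedure doit are illustrative assumptions; a real RPC system would generate the stubs from an interface definition.

import pickle, socket

HOST, PORT = "localhost", 9999

def doit(a, b):                                  # the procedure offered by the server
    return a + b

def client_stub(a, b):                           # steps 1-3 and 9-10
    s = socket.create_connection((HOST, PORT))
    s.sendall(pickle.dumps(("doit", a, b)))      # stub builds message; OS sends it
    result = pickle.loads(s.recv(4096))          # stub unpacks result
    s.close()
    return result                                # handed back as if it were a local call

def server_loop():                               # steps 4-8
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind((HOST, PORT))
    srv.listen(1)
    while True:
        conn, _ = srv.accept()
        proc, a, b = pickle.loads(conn.recv(4096))   # server stub unpacks parameters
        result = doit(a, b)                          # local call to "doit"
        conn.sendall(pickle.dumps(result))           # stub builds reply; OS sends it
        conn.close()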
Communication: Remote procedure call Parameter passing

RPC: Parameter passing

There’s more than just wrapping parameters into a message


Client and server machines may have different data representations (think
of byte ordering)
Wrapping a parameter means transforming a value into a sequence of
bytes
Client and server have to agree on the same encoding:

How are basic data values represented (integers, floats, characters)


How are complex data values represented (arrays, unions)

Conclusion
Client and server need to properly interpret messages, transforming them into
machine-dependent representations.

14 / 49
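As a small illustration of the encoding issue (a sketch only, not part of any RPC standard): both sides can agree on big-endian "network order" for basic values, so a little-endian client and a big-endian server still decode the same integer.

import struct

wire = struct.pack("!i", 258)             # marshal a 4-byte signed int in network (big-endian) order
value = struct.unpack("!i", wire)[0]      # unmarshal on the receiving side
print(value == 258)                       # True on any machine, whatever its native byte order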
Communication: Remote procedure call Parameter passing

RPC: Parameter passing

Some assumptions
Copy in/copy out semantics: while procedure is executed, nothing can be
assumed about parameter values.
All data that is to be operated on is passed by parameters. Excludes
passing references to (global) data.

Conclusion
Full access transparency cannot be realized.

A remote reference mechanism enhances access transparency


Remote reference offers unified access to remote data
Remote references can be passed as parameter in RPCs
Note: stubs can sometimes be used as such references

15 / 49
Communication: Remote procedure call Variations on RPC

Asynchronous RPCs

Essence
Try to get rid of the strict request-reply behavior, but let the client continue
without waiting for an answer from the server.

[Figure: asynchronous RPC — the client calls the remote procedure, waits only until the server accepts the request, and then continues; the server accepts the request and calls the local procedure. Results, if any, are returned later through a callback to the client.]

Asynchronous RPC 16 / 49
Communication: Remote procedure call Variations on RPC

Sending out multiple RPCs

Essence
Sending an RPC request to a group of servers.

[Figure: multicast RPC — the client calls remote procedures on several servers at once; each server calls its local procedure and returns its result through a callback to the client.]

Multicast RPC 17 / 49
Communication: Remote procedure call Example: DCE RPC

RPC in practice

[Figure: writing a client and a server in DCE RPC — uuidgen helps create the interface definition file; the IDL compiler generates a client stub, a header, and a server stub; client code and client stub (which #include the header) are compiled and linked with the runtime library into the client binary, and the server stub and server code likewise into the server binary.]

Writing a Client and a Server 18 / 49


Communication: Remote procedure call Example: DCE RPC

Client-to-server binding (DCE)

Issues
(1) The client must locate the server machine, and (2) it must locate the server on that machine.

[Figure: binding a client to a server in DCE — (1) the server registers its port in the DCE daemon's port table, (2) the server registers its service with the directory server, (3) the client looks up the server at the directory server, (4) the client asks the DCE daemon on the server machine for the server's port, (5) the client does the RPC.]

Binding a client to a server 19 / 49


Communication: Message-oriented communication Simple transient messaging with sockets

Transient messaging: sockets


Berkeley socket interface
Operation Description
socket Create a new communication end point
bind Attach a local address to a socket
listen Tell operating system what the maximum number of pending
connection requests should be
accept Block caller until a connection request arrives
connect Actively attempt to establish a connection
send Send some data over the connection
receive Receive some data over the connection
close Release the connection

[Figure: server side: socket, bind, listen, accept, receive, send, close; client side: socket, connect, send, receive, close. The accept/connect pair forms the synchronization point; send/receive carry the communication.]

20 / 49
Communication: Message-oriented communication Simple transient messaging with sockets

Sockets: Python code


Server
from socket import *
s = socket(AF_INET, SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
(conn, addr) = s.accept()          # returns new socket and addr. client
while True:                        # forever
    data = conn.recv(1024)         # receive data from client
    if not data: break             # stop if client stopped
    conn.send(str(data)+"*")       # return sent data plus an "*"
conn.close()                       # close the connection

Client
from socket import *
s = socket(AF_INET, SOCK_STREAM)
s.connect((HOST, PORT))            # connect to server (block until accepted)
s.send('Hello, world')             # send some data
data = s.recv(1024)                # receive the response
print data                         # print the result
s.close()                          # close the connection

21 / 49
Communication: Message-oriented communication Advanced transient messaging

Making sockets easier to work with

Observation
Sockets are rather low level and programming mistakes are easily made.
However, the way that they are used is often the same (such as in a
client-server setting).

Alternative: ZeroMQ
Provides a higher level of expression by pairing sockets: one for sending
messages at process P and a corresponding one at process Q for receiving
messages. All communication is asynchronous.

Three patterns
Request-reply
Publish-subscribe
Pipeline

Using messaging patterns: ZeroMQ 22 / 49


Communication: Message-oriented communication Advanced transient messaging

Request-reply

Server
import zmq
context = zmq.Context()

p1 = "tcp://"+ HOST +":"+ PORT1    # how and where to connect
p2 = "tcp://"+ HOST +":"+ PORT2    # how and where to connect
s = context.socket(zmq.REP)        # create reply socket

s.bind(p1)                         # bind socket to address
s.bind(p2)                         # bind socket to address
while True:
    message = s.recv()             # wait for incoming message
    if not "STOP" in message:      # if not to stop...
        s.send(message + "*")      # append "*" to message
    else:                          # else...
        break                      # break out of loop and end

Using messaging patterns: ZeroMQ 23 / 49


Communication: Message-oriented communication Advanced transient messaging

Request-reply

Client
import zmq
context = zmq.Context()

php = "tcp://"+ HOST +":"+ PORT    # how and where to connect
s = context.socket(zmq.REQ)        # create socket

s.connect(php)                     # block until connected
s.send("Hello World")              # send message
message = s.recv()                 # block until response
s.send("STOP")                     # tell server to stop
print message                      # print result

Using messaging patterns: ZeroMQ 24 / 49


Communication: Message-oriented communication Advanced transient messaging

Publish-subscribe
Server
import zmq, time

context = zmq.Context()
s = context.socket(zmq.PUB)              # create a publisher socket
p = "tcp://"+ HOST +":"+ PORT            # how and where to communicate
s.bind(p)                                # bind socket to the address
while True:
    time.sleep(5)                        # wait every 5 seconds
    s.send("TIME " + time.asctime())     # publish the current time

Client
import zmq

context = zmq.Context()
s = context.socket(zmq.SUB)              # create a subscriber socket
p = "tcp://"+ HOST +":"+ PORT            # how and where to communicate
s.connect(p)                             # connect to the server
s.setsockopt(zmq.SUBSCRIBE, "TIME")      # subscribe to TIME messages

for i in range(5):                       # Five iterations
    time = s.recv()                      # receive a message
    print time
Using messaging patterns: ZeroMQ 25 / 49
Communication: Message-oriented communication Advanced transient messaging

Pipeline

Source
import zmq, time, pickle, sys, random

context = zmq.Context()
me = str(sys.argv[1])
s = context.socket(zmq.PUSH)             # create a push socket
src = SRC1 if me == '1' else SRC2        # check task source host
prt = PORT1 if me == '1' else PORT2      # check task source port
p = "tcp://"+ src +":"+ prt              # how and where to connect
s.bind(p)                                # bind socket to address

for i in range(100):                     # generate 100 workloads
    workload = random.randint(1, 100)    # compute workload
    s.send(pickle.dumps((me,workload)))  # send workload to worker

Using messaging patterns: ZeroMQ 26 / 49


Communication: Message-oriented communication Advanced transient messaging

Pipeline

Worker
import zmq, time, pickle, sys

context = zmq.Context()
me = str(sys.argv[1])
r = context.socket(zmq.PULL)             # create a pull socket
p1 = "tcp://"+ SRC1 +":"+ PORT1          # address first task source
p2 = "tcp://"+ SRC2 +":"+ PORT2          # address second task source
r.connect(p1)                            # connect to task source 1
r.connect(p2)                            # connect to task source 2

while True:
    work = pickle.loads(r.recv())        # receive work from a source
    time.sleep(work[1]*0.01)             # pretend to work

Using messaging patterns: ZeroMQ 27 / 49


Communication: Message-oriented communication Advanced transient messaging

MPI: When lots of flexibility is needed

Representative operations
Operation       Description
MPI_bsend       Append outgoing message to a local send buffer
MPI_send        Send a message and wait until copied to local or remote buffer
MPI_ssend       Send a message and wait until transmission starts
MPI_sendrecv    Send a message and wait for reply
MPI_isend       Pass reference to outgoing message, and continue
MPI_issend      Pass reference to outgoing message, and wait until receipt starts
MPI_recv        Receive a message; block if there is none
MPI_irecv       Check if there is an incoming message, but do not block
The Message-Passing Interface (MPI) 28 / 49
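As a hedged sketch of how these primitives look in practice, the snippet below uses mpi4py (assuming it is installed and the script is started with something like mpiexec -n 2 python script.py); its lowercase send/recv/isend mirror the blocking and non-blocking operations listed above.

from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    comm.send({"job": 42}, dest=1, tag=0)         # blocking send (MPI_send-style)
    req = comm.isend("more work", dest=1, tag=1)  # non-blocking send (MPI_isend-style)
    req.wait()                                    # later: wait until the send has completed
else:
    first = comm.recv(source=0, tag=0)            # blocking receive (MPI_recv-style)
    second = comm.recv(source=0, tag=1)
    print(rank, first, second)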


Communication: Message-oriented communication Message-oriented persistent communication

Message-oriented middleware

Essence
Asynchronous persistent communication through support of middleware-level
queues. Queues correspond to buffers at communication servers.

Operations
Operation Description
put       Append a message to a specified queue
get       Block until the specified queue is nonempty, and remove the first message
poll      Check a specified queue for messages, and remove the first. Never block
notify    Install a handler to be called when a message is put into the specified queue

Message-queuing model 29 / 49
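The four operations map naturally onto a small wrapper around a local buffer. The sketch below is purely illustrative (it is not the API of any particular message-queuing product); it uses Python 3's queue module and assumes the notify handler should run asynchronously on put.

import queue, threading

class MessageQueue:
    def __init__(self):
        self._q = queue.Queue()
        self._handler = None

    def put(self, msg):                              # append a message to the queue
        self._q.put(msg)
        if self._handler is not None:                # notify: run the installed handler
            threading.Thread(target=self._handler, args=(msg,)).start()

    def get(self):                                   # block until nonempty, remove first message
        return self._q.get()

    def poll(self):                                  # never block; return None if queue is empty
        try:
            return self._q.get_nowait()
        except queue.Empty:
            return None

    def notify(self, handler):                       # install handler called when a message is put
        self._handler = handler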
Communication: Message-oriented communication Message-oriented persistent communication

General model

Queue managers
Queues are managed by queue managers. An application can put messages
only into a local queue. Getting a message is possible by extracting it from a
local queue only ⇒ queue managers need to route messages.

[Figure: general architecture of a message-queuing system — the source queue manager uses the logical queue-level address (name) to look up the contact address of the destination queue manager in an address-lookup database, and then forwards the message over the network to the destination queue manager.]

General architecture of a message-queuing system 30 / 49


Communication: Message-oriented communication Message-oriented persistent communication

Message broker

Observation
Message queuing systems assume a common messaging protocol: all
applications agree on message format (i.e., structure and data representation)

Broker handles application heterogeneity in an MQ system


Transforms incoming messages to target format
Very often acts as an application gateway
May provide subject-based routing capabilities (i.e., publish-subscribe
capabilities)

Message brokers 31 / 49
Communication: Message-oriented communication Message-oriented persistent communication

Message broker: general architecture

[Figure: general architecture of a message broker — the broker sits between source and destination applications; broker plugins, driven by rules, transform messages between application formats, with all parties communicating through the queuing layer on top of their local OS.]

Message brokers 32 / 49
Communication: Message-oriented communication Example: IBM’s WebSphere message-queuing system

IBM’s WebSphere MQ
Basic concepts
Application-specific messages are put into, and removed from queues
Queues reside under the regime of a queue manager
Processes can put messages only in local queues, or through an RPC
mechanism

Message transfer
Messages are transferred between queues
Message transfer between queues at different processes requires a channel
At each endpoint of a channel there is a message channel agent
Message channel agents are responsible for:
Setting up channels using lower-level network communication facilities (e.g., TCP/IP)
(Un)wrapping messages from/in transport-level packets
Sending/receiving packets

Overview 33 / 49
Communication: Message-oriented communication Example: IBM’s WebSphere message-queuing system

IBM’s WebSphere MQ
Schematic overview
[Figure: schematic overview of IBM's WebSphere MQ — a sending client talks to its queue manager through the MQ interface (via a stub and synchronous RPC); the queue manager consults its routing table and places the message in a send queue; message channel agents (MCAs) at both ends of a channel move messages asynchronously across the network to the receiving queue manager, which delivers them to the client's receive queue; further channels lead to other remote queue managers.]

Channels are inherently unidirectional


Automatically start MCAs when messages arrive
Any network of queue managers can be created
Routes are set up manually (system administration)
Overview 34 / 49
Communication: Message-oriented communication Example: IBM’s WebSphere message-queuing system

Message channel agents

Some attributes associated with message channel agents


Attribute            Description
Transport type       Determines the transport protocol to be used
FIFO delivery        Indicates that messages are to be delivered in the order they are sent
Message length       Maximum length of a single message
Setup retry count    Specifies maximum number of retries to start up the remote MCA
Delivery retries     Maximum times MCA will try to put received message into queue

Channels 35 / 49
Communication: Message-oriented communication Example: IBM’s WebSphere message-queuing system

IBM’s WebSphere MQ
Routing
By using logical names, in combination with name resolution to local queues, it
is possible to put a message in a remote queue

[Figure: message transfer in a network of queue managers QMA-QMD — each queue manager keeps an alias table mapping logical names (LA1, LA2) to queue managers, and a routing table mapping destination queue managers to local send queues (SQ1, SQ2); a message addressed to a logical name is resolved step by step as it is forwarded.]

Message transfer 36 / 49
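The two-step name resolution can be mimicked with two dictionaries; the table contents below are illustrative, not copied from the figure.

alias_table = {"LA1": "QMC", "LA2": "QMD"}                   # logical name -> destination queue manager
routing_table = {"QMB": "SQ1", "QMC": "SQ1", "QMD": "SQ2"}   # destination queue manager -> local send queue

def resolve(logical_name):
    qmgr = alias_table[logical_name]                         # which queue manager hosts the queue
    return routing_table[qmgr]                               # which local send queue leads towards it

print(resolve("LA1"))                                        # e.g. "SQ1": put the message in send queue SQ1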
Communication: Multicast communication Application-level tree-based multicasting

Application-level multicasting

Essence
Organize nodes of a distributed system into an overlay network and use that
network to disseminate data:
Oftentimes a tree, leading to unique paths
Alternatively, also mesh networks, requiring a form of routing

37 / 49
Communication: Multicast communication Application-level tree-based multicasting

Application-level multicasting in Chord

Basic approach
1 Initiator generates a multicast identifier mid.
2 Lookup succ(mid), the node responsible for mid.
3 Request is routed to succ(mid), which will become the root.
4 If P wants to join, it sends a join request to the root.
5 When request arrives at Q:
Q has not seen a join request before ⇒ it becomes forwarder; P
becomes child of Q. Join request continues to be forwarded.
Q knows about tree ⇒ P becomes child of Q. No need to forward
join request anymore.

38 / 49
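A hedged sketch of step 5, the join handling at a node Q; the Chord lookup and message transport are abstracted into a route_to_root callback, and the class and attribute names are illustrative only.

class Node:
    def __init__(self, ident):
        self.ident = ident
        self.children = {}                     # mid -> set of child nodes in that tree

    def handle_join(self, mid, sender, route_to_root):
        if mid not in self.children:           # Q has not seen a join request for mid before
            self.children[mid] = {sender}      # Q becomes a forwarder; sender becomes its child
            route_to_root(mid, self)           # the join request continues towards succ(mid)
        else:                                  # Q already knows about the tree
            self.children[mid].add(sender)     # sender simply becomes a child; no more forwarding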
Communication: Multicast communication Application-level tree-based multicasting

ALM: Some costs

Different metrics
[Figure: end hosts A, B, C, D connected through routers Ra-Re; the overlay network links the end hosts, while the physical links between the routers (with the indicated delays) carry the actual traffic.]

Link stress: How often does an ALM message cross the same physical link? Example: a message from A to D needs to cross ⟨Ra, Rb⟩ twice.
Stretch: Ratio in delay between the ALM-level path and the network-level path. Example: messages from B to C follow a path of length 73 at the ALM level, but 47 at the network level ⇒ stretch = 73/47 ≈ 1.55.

Performance issues in overlays 39 / 49


Communication: Multicast communication Flooding-based multicasting

Flooding

Essence
P simply sends a message m to each of its neighbors. Each neighbor will forward that message, except to P, and only if it had not seen m before.

[Figure: the size of a random overlay (number of edges, × 1000) as a function of the number of nodes, for pedge = 0.2, 0.4, and 0.6.]

Performance
The more edges, the more expensive!

Variation
Let Q forward a message with a certain probability p_flood, possibly even dependent on its own number of neighbors (i.e., node degree) or the degree of its neighbors.

40 / 49
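A toy sketch of probabilistic flooding over an overlay given as an adjacency dictionary; p_flood and the example graph are assumptions for illustration.

import random

def flood(graph, start, p_flood=0.5):
    received = {start}                           # nodes that have seen the message
    frontier = [start]                           # nodes that will forward it this step
    while frontier:
        nxt = []
        for node in frontier:
            for nb in graph[node]:
                if nb in received:               # a node handles a message only the first time
                    continue
                received.add(nb)
                if random.random() < p_flood:    # forward further only with probability p_flood
                    nxt.append(nb)
        frontier = nxt
    return received

graph = {1: [2, 3], 2: [1, 3, 4], 3: [1, 2, 4], 4: [2, 3]}
print(flood(graph, 1))                           # which nodes the message reached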
Communication: Multicast communication Gossip-based data dissemination

Epidemic protocols

Assume there are no write–write conflicts


Update operations are performed at a single server
A replica passes updated state to only a few neighbors
Update propagation is lazy, i.e., not immediate
Eventually, each update should reach every replica

Two forms of epidemics


Anti-entropy: Each replica regularly chooses another replica at random,
and exchanges state differences, leading to identical states at both
afterwards
Rumor spreading: A replica which has just been updated (i.e., has been
contaminated), tells a number of other replicas about its update
(contaminating them as well).

41 / 49
Communication: Multicast communication Gossip-based data dissemination

Anti-entropy

Principal operations
A node P selects another node Q from the system at random.
Pull: P only pulls in new updates from Q
Push: P only pushes its own updates to Q
Push-pull: P and Q send updates to each other

Observation
For push-pull it takes O(log(N)) rounds to disseminate updates to all N nodes
(round = when every node has taken the initiative to start an exchange).

Information dissemination models 42 / 49
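A toy simulation of the push-pull case, under the stated notion of a round (every node takes the initiative once); it is only meant to make the O(log N) behaviour visible, not to model any particular system.

import random

def push_pull_rounds(N):
    updated = [False] * N
    updated[0] = True                        # a single node holds the new update
    rounds = 0
    while not all(updated):
        rounds += 1
        for p in range(N):                   # each node contacts one random peer per round
            q = random.randrange(N)
            if updated[p] or updated[q]:     # push-pull: exchanging differences updates both
                updated[p] = updated[q] = True
    return rounds

print(push_pull_rounds(1000))                # typically a small number, on the order of log N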


Communication: Multicast communication Gossip-based data dissemination

Anti-entropy: analysis
Basics
Consider a single source, propagating its update. Let p_i be the probability that a node has not received the update after the i-th round.

Analysis: staying ignorant

With pull, p_{i+1} = (p_i)^2: the node was not updated during the i-th round and should contact another ignorant node during the next round.
With push, p_{i+1} = p_i (1 - 1/N)^{N(1-p_i)} ≈ p_i e^{-1} (for small p_i and large N): the node was ignorant during the i-th round and no updated node chooses to contact it during the next round.
With push-pull: (p_i)^2 · (p_i e^{-1})

[Figure: probability of not yet being updated as a function of the round, for push, pull, and push-pull, with N = 10,000.]

Information dissemination models 43 / 49


Communication: Multicast communication Gossip-based data dissemination

Anti-entropy performance

[Figure: probability that a node has not yet been updated as a function of the round number, for push, pull, and push-pull, with N = 10,000.]

Information dissemination models 44 / 49


Communication: Multicast communication Gossip-based data dissemination

Rumor spreading

Basic model
A server S having an update to report, contacts other servers. If a server is
contacted to which the update has already propagated, S stops contacting
other servers with probability pstop .

Observation
If s is the fraction of ignorant servers (i.e., servers that are unaware of the update), it can be shown that, with many servers,

s = e^{-(1/p_stop + 1)(1 - s)}

Information dissemination models 45 / 49


Communication: Multicast communication Gossip-based data dissemination

Formal analysis
Notations
Let s denote the fraction of nodes that have not yet been updated (susceptible), i the fraction of updated (infected) and active nodes, and r the fraction of updated nodes that gave up (removed).

From the theory of epidemics

(1) ds/dt = -s·i
(2) di/dt = s·i - p_stop·(1 - s)·i
⇒ di/ds = -(1 + p_stop) + p_stop/s
⇒ i(s) = -(1 + p_stop)·s + p_stop·ln(s) + C

Wrapup
i(1) = 0 ⇒ C = 1 + p_stop ⇒ i(s) = (1 + p_stop)·(1 - s) + p_stop·ln(s). We are looking for the case i(s) = 0, which leads to s = e^{-(1/p_stop + 1)(1 - s)}.

Information dissemination models 46 / 49


Communication: Multicast communication Gossip-based data dissemination

Rumor spreading
The effect of stopping

[Figure: the remaining fraction s of ignorant nodes as a function of p_stop.]

Consider 10,000 nodes:

1/p_stop    s           Ns
1           0.203188    2032
2           0.059520    595
3           0.019827    198
4           0.006977    70
5           0.002516    25
6           0.000918    9
7           0.000336    3

Note
If we really have to ensure that all servers are eventually updated, rumor
spreading alone is not enough

Information dissemination models 47 / 49
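The values of s in the table can be approximated by iterating the fixed-point equation s = e^{-(1/p_stop + 1)(1 - s)}; the sketch below is a simple illustration of that, with the iteration count chosen ad hoc.

import math

def remaining_fraction(p_stop, iterations=200):
    s = 0.5                                               # any starting guess strictly between 0 and 1
    for _ in range(iterations):
        s = math.exp(-(1.0 / p_stop + 1.0) * (1.0 - s))   # iterate s = e^{-(1/p_stop + 1)(1 - s)}
    return s

for k in range(1, 8):                                     # 1/p_stop = 1 .. 7, as in the table
    s = remaining_fraction(1.0 / k)
    print(k, round(s, 6), int(round(10000 * s)))          # fraction s and Ns for N = 10,000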


Communication: Multicast communication Gossip-based data dissemination

Deleting values

Fundamental problem
We cannot remove an old value from a server and expect the removal to
propagate. Instead, mere removal will be undone in due time using epidemic
algorithms

Solution
Removal has to be registered as a special update by inserting a death
certificate

Removing data 48 / 49
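A toy illustration of the problem and the fix: with plain deletion, anti-entropy re-introduces the value; with a death certificate (a timestamped tombstone treated as just another update), the deletion itself propagates. Replica layout and timestamps are illustrative.

def reconcile(a, b):
    """Push-pull exchange between two replicas: per key, keep the newest entry."""
    for key in set(a) | set(b):
        entries = [e for e in (a.get(key), b.get(key)) if e is not None]
        newest = max(entries, key=lambda e: e["ts"])
        a[key] = b[key] = newest

r1 = {"x": {"ts": 1, "value": 42, "deleted": False}}
r2 = {"x": {"ts": 1, "value": 42, "deleted": False}}

del r1["x"]                                           # plain removal at replica r1 ...
reconcile(r1, r2)                                     # ... is undone: r2 pushes x back into r1

r1["x"] = {"ts": 2, "value": None, "deleted": True}   # register removal as a death certificate
reconcile(r1, r2)                                     # now r2 also learns that x was deleted
print(r2["x"]["deleted"])                             # True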
Communication: Multicast communication Gossip-based data dissemination

Deleting values

When to remove a death certificate (it is not allowed to stay forever)
Run a global algorithm to detect whether the removal is known
everywhere, and then collect the death certificates (looks like garbage
collection)
Assume death certificates propagate in finite time, and associate a
maximum lifetime for a certificate (can be done at risk of not reaching all
servers)

Note
It is necessary that a removal actually reaches all servers.

Removing data 49 / 49
