
Chapter 3-Processes


CSE 5307: Distributed Systems

Chapter 3 - Processes

Dr.T.GopiKrishna, Assistant Professor


Big Data, Machine Learning and data science Specialization
Department of Computer Science and Engineering@SoEEC
ASTU, Adama
Introduction

 communication takes place between processes
 a process is a program in execution
 from the OS perspective, management and scheduling of processes
is important
 other important issues arise in distributed systems
 multithreading to enhance performance
 how are clients and servers organized
 process or code migration to achieve scalability and to
dynamically configure clients and servers
 software agents: a collection of processes trying to reach a
common goal

3.1 Threads and their Implementation
 Threads can be used in both distributed and nondistributed
systems
 Threads in Non distributed Systems
 a process has an address space (containing program text
and data) and a single thread of control, as well as other
resources such as open files, child processes, accounting
information, etc.
Figure: (a) three processes each with one thread (Process 1, Process 2, Process 3); (b) one process with three threads
 Each thread has its own program counter, registers, stack,
and state; but all threads of a process share address space,
global variables and other resources such as open files, etc.
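
The split between per-thread and shared state can be seen in a minimal Python sketch. The names `counter`, `worker`, and `run` are illustrative assumptions, not part of the slides: each thread keeps `local_sum` on its own stack, while all threads update the same global `counter`.

```python
import threading

counter = 0                      # global variable: shared by all threads of the process
lock = threading.Lock()          # protects the shared counter from lost updates

def worker(n_increments, results, index):
    global counter
    local_sum = 0                # local variable: lives on this thread's own stack
    for _ in range(n_increments):
        local_sum += 1
        with lock:
            counter += 1
    results[index] = local_sum

def run(n_threads=3, n_increments=1000):
    """Start n_threads threads and return (shared total, per-thread totals)."""
    global counter
    counter = 0
    results = [None] * n_threads
    threads = [threading.Thread(target=worker, args=(n_increments, results, i))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, results
```

Calling `run(3, 1000)` should leave the shared counter at 3000 while each thread reports its own private total of 1000.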

 Threads take turns in running
 Threads allow multiple executions to take place in the same
process environment, called multithreading
Thread Usage – Why do we need threads?
e.g., a word processor has different parts; parts for
 interacting with the user
 formatting the page as soon as changes are made
 timed savings (for auto recovery)
 spelling and grammar checking, etc.
1. Simplifying the programming model: since many activities are
going on at once
2. They are easier to create and destroy than processes since
they do not have any resources attached to them
3. Performance improves by overlapping activities if there is too
much I/O; i.e., to avoid blocking when waiting for input or
doing calculations, say in a spreadsheet
4. Real parallelism is possible in a multiprocessor system
 having finer granularity in terms of multiple threads per process
rather than processes provides better performance and makes it
easier to build distributed applications
 in nondistributed systems, threads can be used with shared data
instead of processes to avoid context switching overhead in
interprocess communication (IPC)

Figure: context switching as the result of IPC (process A to process B)
Thread Implementation
 threads are usually provided in the form of a thread package
 the package contains operations to create and destroy a thread, and
operations on synchronization variables such as mutexes and
condition variables
 two approaches of constructing a thread package
1. construct a thread library that is executed entirely in user mode
(the OS is not aware of threads)
 cheap to create and destroy threads; just allocate and free
memory
 context switching can be done using few instructions; store and
reload only CPU register values
 disadvantage: invoking a blocking system call blocks the
entire process to which the thread belongs, and therefore all other
threads in that process
2. implement them in the OS’s kernel
 let the kernel be aware of threads and schedule them
 expensive for thread operations such as creation and deletion
since each requires a system call
 solution: use a hybrid form of user-level and kernel-level threads,
called lightweight process (LWP)
 a LWP runs in the context of a single (heavy-weight) process, and
there can be several LWPs per process
 the system also offers a user-level thread package for some
operations such as creating and destroying threads, for thread
synchronization (mutexes and condition variables)
 the thread package can be shared by multiple LWPs

combining kernel-level lightweight processes and user-level threads
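
The synchronization operations a thread package offers (mutexes and condition variables) can be illustrated with a small bounded buffer. This is a sketch in Python, whose `threading` module is used here only as a stand-in for a generic thread package; the class `BoundedBuffer` and the function `demo` are hypothetical names.

```python
import threading
from collections import deque

class BoundedBuffer:
    """Fixed-capacity queue guarded by one mutex and two condition variables."""

    def __init__(self, capacity):
        self.items = deque()
        self.capacity = capacity
        self.mutex = threading.Lock()
        # Both condition variables share the same mutex.
        self.not_empty = threading.Condition(self.mutex)
        self.not_full = threading.Condition(self.mutex)

    def put(self, item):
        with self.not_full:                      # acquires the shared mutex
            while len(self.items) >= self.capacity:
                self.not_full.wait()             # releases the mutex while waiting
            self.items.append(item)
            self.not_empty.notify()

    def get(self):
        with self.not_empty:
            while not self.items:
                self.not_empty.wait()
            item = self.items.popleft()
            self.not_full.notify()
            return item

def demo(n=10):
    """One producer and one consumer exchange n items through a small buffer."""
    buf = BoundedBuffer(capacity=2)
    received = []

    def produce():
        for i in range(n):
            buf.put(i)

    def consume():
        for _ in range(n):
            received.append(buf.get())

    threads = [threading.Thread(target=produce), threading.Thread(target=consume)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

The `while` loops around `wait()` re-check the condition after every wake-up, which is the usual way to use condition variables safely.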


Threads in Distributed Systems
 Multithreaded Clients
 consider a Web browser; the fetching of each part of a page can be
implemented as a separate thread, each opening its own TCP/IP
connection to the server or to separate and replicated servers
 each can display the results as it gets its part of the page
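
A multithreaded client of this kind might look as follows. This is a sketch under stated assumptions: `fetch_part` only simulates a blocking network fetch with `time.sleep`, and the part names and delays are made up for illustration.

```python
import threading
import time

def fetch_part(name, delay, results):
    time.sleep(delay)                      # stands in for blocking network I/O
    results[name] = f"<content of {name}>" # the part can be displayed as soon as it arrives

def load_page(parts):
    """Fetch every part in its own thread; return the results and the elapsed time."""
    results = {}
    threads = [threading.Thread(target=fetch_part, args=(name, delay, results))
               for name, delay in parts.items()]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.monotonic() - start
```

With three parts that each take 0.1 s, the page should load in roughly 0.1 s rather than 0.3 s, because the waits overlap.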

 Multithreaded Servers
 servers can be constructed in three ways

1. Single-Threaded Process
 it gets a request, examines it, carries it out to completion
before getting the next request
 the server is idle while waiting for disk read, i.e., system calls
are blocking
2. Threads
 threads are more important for implementing servers
 e.g., a file server
 the dispatcher thread reads incoming requests for a file
operation from clients and passes each one to an idle worker thread
 the worker thread performs a blocking disk read, during which
another thread may continue, say the dispatcher or
another worker thread

a multithreaded server organized in a dispatcher/worker model
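
The dispatcher/worker organization can be sketched with a shared request queue. The function names and the fake "file contents" below are assumptions for illustration; a real worker would perform a blocking disk read where the comment indicates.

```python
import queue
import threading

def worker(requests, responses):
    while True:
        name = requests.get()              # an idle worker blocks here until a request arrives
        if name is None:                   # sentinel: no more work
            break
        # A real worker would perform a blocking disk read here.
        responses.put((name, f"contents of {name}"))

def serve(incoming, n_workers=3):
    """Dispatcher: hand each incoming request (a file name) to the worker pool."""
    requests, responses = queue.Queue(), queue.Queue()
    workers = [threading.Thread(target=worker, args=(requests, responses))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for name in incoming:                  # the dispatcher role
        requests.put(name)
    for _ in workers:
        requests.put(None)                 # one sentinel per worker
    for w in workers:
        w.join()
    return dict(responses.get() for _ in range(len(incoming)))
```

While one worker is blocked on its request, the dispatcher and the other workers can keep running, which is the point of the model.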


3. Finite-State Machine
 if threads are not available
 it gets a request, examines it, tries to fulfill the request from
cache, else sends a request to the file system; but instead of
blocking it records the state of the current request and proceeds
to the next request
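
The finite-state machine approach can be sketched as a single-threaded event loop. Everything here is a simulation under stated assumptions: `disk` is a dictionary standing in for the file system, and a "disk reply" event is simply queued behind the requests already waiting, to mimic a reply that arrives later.

```python
from collections import deque

def fsm_server(requests, cache, disk):
    """Single-threaded server that never blocks: it records request state instead."""
    events = deque(("request", name) for name in requests)
    pending = {}                 # state table: request id -> file name awaiting a disk reply
    replies = []
    next_id = 0
    while events:
        kind, payload = events.popleft()
        if kind == "request":
            if payload in cache:                       # served from cache immediately
                replies.append((payload, cache[payload]))
            else:
                pending[next_id] = payload             # record state, do not block
                events.append(("disk_reply", next_id)) # simulated asynchronous completion
                next_id += 1
        else:                                          # a disk reply has arrived
            name = pending.pop(payload)
            cache[name] = disk[name]
            replies.append((name, disk[name]))
    return replies
```

A request that hits the cache is answered before an earlier request that had to go to disk, because the server moves on instead of waiting.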
 Summary

Model                      Characteristics
Single-threaded process    No parallelism, blocking system calls
Threads                    Parallelism, blocking system calls (thread only)
Finite-state machine       Parallelism, nonblocking system calls

three ways to construct a server
3.2 Anatomy of Clients
 Two issues: user interfaces and client-side software for
distribution transparency

1. User Interfaces
 to create a convenient environment for the interaction of a
human user and a remote server; e.g. mobile phones with
simple displays and a set of keys
 GUIs are most commonly used
 The X Window System (or simply X)
 it has the X kernel: the part of the OS that controls the
terminal (monitor, keyboard, pointing device like a mouse)
 the X kernel contains all terminal-specific device drivers; its
interface is made available to applications through a library
called Xlib

the basic organization of the X Window System

2. Client-Side Software for Distribution Transparency
 in addition to the user interface, parts of the processing and
data level in a client-server application are executed at the
client side.
 moreover, client software can also include components to
achieve distribution transparency
e.g., replication transparency: assume a distributed system
with remote objects; the client proxy can send a request to
each replica, collect all responses, and pass a single
return value to the client application

• Essence: Often focused on providing distribution transparency
o access transparency: client-side stubs for RPCs and RMIs
o location/migration transparency: let client-side software
keep track of actual location
o replication transparency: multiple invocations handled by
the client stub
o failure transparency: can often be placed only at the client
(we’re trying to mask server and communication failures)

a possible approach to transparent replication of a remote object using a client-side solution


3.3 Servers: General Design Issues
 issues
 how to organize servers?
 where do clients contact a server?
 whether and how a server can be interrupted
 whether or not the server is stateless

1. how to organize servers?
 iterative server
 the server itself handles the request and returns the result
 concurrent server
 it passes a request to a separate process or thread and
waits for the next incoming request; e.g., a multithreaded
server
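
The two organizations can be compared using Python's standard `socketserver` module: `TCPServer` handles one request at a time (iterative), while `ThreadingTCPServer` hands each request to a new thread (concurrent). The handler and the helper functions `start` and `ask` are illustrative assumptions; the servers bind to port 0 so that the OS picks a free port.

```python
import socket
import socketserver
import threading

class UpperCaseHandler(socketserver.StreamRequestHandler):
    def handle(self):
        line = self.rfile.readline()       # read one request line
        self.wfile.write(line.upper())     # reply with the same line in upper case

def start(server_cls):
    """Start a server of the given class on a free local port, in the background."""
    server = server_cls(("127.0.0.1", 0), UpperCaseHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def ask(server, text):
    """Send one line to the server and return its one-line reply."""
    with socket.create_connection(server.server_address) as s:
        s.sendall(text.encode() + b"\n")
        return s.makefile().readline().strip()
```

For example, `ask(start(socketserver.ThreadingTCPServer), "hello")` should return `"HELLO"`; swapping in `socketserver.TCPServer` gives the same answer but serves clients strictly one after another.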

2. where do clients contact a server?
 using endpoints or ports at the machine where the server is
running where each server listens to a specific endpoint
 how do clients know the endpoint of a service?
 globally assign endpoints for well-known services; e.g. FTP
is on TCP port 21, HTTP is on TCP port 80
 for services that do not require preassigned endpoints, an
endpoint can be dynamically assigned by the local OS
 how can the client know this endpoint? two approaches
a) have a daemon running and listening to a well-known
endpoint like in DCE; it keeps track of all endpoints of
services on the collocated server
o the client will first contact the daemon which provides it
with the endpoint, and then the client contacts the specific
server.

Client-to-server binding using a daemon as in DCE
b) use a superserver (as in UNIX) that listens to all endpoints and
then forks a process to take care of the request; this is instead of
having a lot of servers running simultaneously and most of them
idle

Client-to-Server binding using a superserver as in UNIX
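
Approach (a), binding through a daemon, can be sketched with local sockets. This is a toy illustration, not DCE itself: the table `ENDPOINTS`, the service name `"time"`, and both handlers are hypothetical, and for simplicity the daemon also gets its port from the OS, whereas a real daemon would listen on a well-known port.

```python
import socket
import socketserver
import threading

ENDPOINTS = {}   # the daemon's table: service name -> dynamically assigned port

class DaemonHandler(socketserver.StreamRequestHandler):
    def handle(self):
        service = self.rfile.readline().decode().strip()
        port = ENDPOINTS.get(service, 0)           # 0 means "unknown service"
        self.wfile.write(f"{port}\n".encode())

class TimeServiceHandler(socketserver.StreamRequestHandler):
    def handle(self):
        self.wfile.write(b"time-service reply\n")

def start(handler):
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def call(port, request=b""):
    with socket.create_connection(("127.0.0.1", port)) as s:
        s.sendall(request)
        return s.makefile().readline().strip()

def lookup_and_call(daemon_port, service):
    """Step 1: ask the daemon for the endpoint. Step 2: contact the server itself."""
    port = int(call(daemon_port, service.encode() + b"\n"))
    return call(port)
```

A server registers by storing its port in `ENDPOINTS`; the client only ever needs to know the daemon's endpoint in advance.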


3. whether and how a server can be interrupted
 for instance, a user may want to interrupt a file transfer;
maybe it was the wrong file
 let the client exit the client application; this will break the
connection to the server; the server will tear down the
connection assuming that the client had crashed
or
 let the client send out-of-band data, i.e., data to be processed by
the server before any other data from the client; the server
may listen on a separate control endpoint, or the data can be
sent on the same connection as urgent data, as in TCP
Note: this requires that the OS supports high-priority scheduling of
specific threads or processes
4. whether or not the server is stateless
 a stateless server does not keep information on the state of its
clients; for instance a Web server
 a stateful server maintains information about its clients; for
instance a file server that allows a client to keep a local copy
of a file and to make update operations on it
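
The difference can be made concrete with two toy file servers. Both classes are hypothetical sketches: the stateless one answers each request from the request alone, while the stateful one remembers which clients hold a local copy so that it can tell whose copies become stale after an update.

```python
class StatelessFileServer:
    def __init__(self, files):
        self.files = files

    def read(self, name):
        # Nothing about the client is remembered between requests.
        return self.files[name]

class StatefulFileServer:
    def __init__(self, files):
        self.files = files
        self.copies = {}          # per-client state: file name -> clients holding a copy

    def checkout(self, client, name):
        self.copies.setdefault(name, set()).add(client)
        return self.files[name]

    def update(self, client, name, data):
        """Apply an update and return the other clients whose copies are now stale."""
        self.files[name] = data
        return sorted(self.copies.get(name, set()) - {client})
```

If the stateful server crashes, the `copies` table is lost and has to be recovered; the stateless server has nothing to recover, which is the usual trade-off between the two designs.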
3.4 Code Migration
 so far, communication was concerned with passing data
 we may also pass programs, even while they are running and in
heterogeneous systems
 code migration also involves moving data: when a
program migrates while running, its status, pending signals, and
other parts of its environment such as the stack and the program
counter also have to be moved

1. Reasons for Migrating Code
 to improve performance; move processes from heavily-loaded to
lightly-loaded machines (load balancing)
 to reduce communication: move a client application that performs
many database operations to a server if the database resides on
the server; then send only results to the client
 to exploit parallelism without the intricacies of parallel
programming: e.g., copies of a mobile program moving from site to
site searching the Web

 to have flexibility by dynamically configuring distributed systems:
instead of having a multitiered client-server application deciding
in advance which parts of a program are to be run where

the principle of dynamically configuring a client to communicate to a server;
the client first fetches the necessary software, and then invokes the
server
2. Models for Code Migration
 a process consists of three segments: code segment (set of
instructions), resource segment (references to external resources
such as files, printers, ...), and execution segment (to store the
current execution state of a process such as private data, the
stack, the program counter)
 Weak Mobility
 transfer only the code segment and maybe some initialization
data; in this case a program always starts from its initial state,
e.g. Java Applets
 execution can be by the target process (in its own address
space like in Java Applets) or by a separate process
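
Weak mobility can be imitated in a few lines of Python: only the code segment (here a source string) and some initialization data travel, and the target executes the code from its initial state in its own address space. `CODE_SEGMENT`, `run`, and `execute_in_target` are hypothetical names, and running received code with `exec` is shown purely for illustration; it is unsafe for untrusted code.

```python
# The "code segment" as it might arrive over the network: plain source text.
CODE_SEGMENT = '''
def run(init):
    total = 0
    for x in init["numbers"]:
        total += x
    return total
'''

def execute_in_target(code, init_data):
    """Execute migrated code inside the target process, starting from scratch."""
    namespace = {}                 # fresh namespace: no execution state travels with the code
    exec(code, namespace)          # load the code segment into the target's address space
    return namespace["run"](init_data)
```

Every call starts `run` from the beginning; nothing like a stack or program counter is transferred, which is what distinguishes this from strong mobility.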

 Strong Mobility
 transfer code and execution segments; helps to migrate a
process in execution
 can also be supported by remote cloning: an exact
copy of the original process is created on a different
machine and executed in parallel to the original process; in UNIX
this can be done by forking a child process and letting it
continue on a remote machine
 migration can be
 sender-initiated: initiated at the machine where the code resides
or is currently running; e.g., uploading programs to a server; may
need authentication or that the client is a registered one
 receiver-initiated: initiated by the target machine; e.g., Java
Applets; easier to implement

Summary of models of code migration

alternatives for code migration

3. Migration and Local Resources
 how to migrate the resource segment
 not always possible to move a resource; e.g., a reference to a TCP
port held by a process to communicate with other processes

Types of Process-to-Resource Bindings
 Binding by identifier (the strongest): a resource is referred by its
identifier; e.g., a URL to refer to a Web page or an FTP server
referred by its Internet address
 Binding by value (weaker): when only the value of a resource is
needed; in this case another resource can provide the same
value; e.g., standard libraries of programming languages such as
C or Java which are normally locally available, but their location
in the file system may vary from site to site
 Binding by type (weakest): a process needs a resource of a
specific type; reference to local devices, such as monitors,
printers, ...

 in migrating code, the above bindings cannot change, but the
references to resources can
 how can a reference be changed? depends whether the resource
can be moved along with the code, i.e., resource-to-machine
binding
Types of Resource-to-Machine Bindings
 Unattached Resources: can be easily moved with the migrating
program (such as data files associated with the program)
 Fastened Resources: such as local databases and complete
Web sites; moving or copying may be possible, but very costly
 Fixed Resources: intimately bound to a specific machine or
environment such as local devices and cannot be moved
 we have nine combinations to consider

                              Resource-to-machine binding
Process-to-resource binding   Unattached       Fastened         Fixed
By identifier                 MV (or GR)       GR (or MV)       GR
By value                      CP (or MV, GR)   GR (or CP)       GR
By type                       RB (or GR, CP)   RB (or GR, CP)   RB (or GR)

actions to be taken with respect to the references to local
resources when migrating code to another machine
 GR: Establish a global system wide reference
 MV: Move the resource
 CP: Copy the value of the resource
 RB: Rebind process to a locally available resource
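
The nine combinations can be encoded directly as a lookup table. The dictionary below copies the entries of the table above, with the first action in each list being the preferred one; the key spellings and the function name `preferred_action` are assumptions made for this sketch.

```python
# (process-to-resource binding, resource-to-machine binding) -> actions, preferred first
ACTIONS = {
    ("identifier", "unattached"): ["MV", "GR"],
    ("identifier", "fastened"):   ["GR", "MV"],
    ("identifier", "fixed"):      ["GR"],
    ("value", "unattached"):      ["CP", "MV", "GR"],
    ("value", "fastened"):        ["GR", "CP"],
    ("value", "fixed"):           ["GR"],
    ("type", "unattached"):       ["RB", "GR", "CP"],
    ("type", "fastened"):         ["RB", "GR", "CP"],
    ("type", "fixed"):            ["RB", "GR"],
}

def preferred_action(process_binding, machine_binding):
    """Return the first-choice action for a reference when code migrates."""
    return ACTIONS[(process_binding, machine_binding)][0]
```

For instance, `preferred_action("value", "unattached")` yields `"CP"`: a resource bound by value that can travel freely is simply copied.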

 Exercise: for each of the nine combinations, give example
resources

Migration in Heterogeneous Systems
 distributed systems are constructed on a heterogeneous
collection of platforms, each with its own OS and machine
architecture
 heterogeneity problems are similar to those of portability
o The target machine may not be suitable to execute the migrated
code
o The definition of process/thread/processor context is highly
dependent on local hardware, operating system and runtime system
solution: Make use of an abstract machine that is implemented
on different platforms
 easier in some languages; for scripting languages the source
code is interpreted; for Java an intermediary code is generated
by the compiler for a virtual machine
 in weak mobility
 since there is no runtime information, compile the source code
for each potential platform

 in strong mobility
 difficult to transfer the execution segment since there may be
platform-dependent information such as register values;
 one possible solution: restrict migration to specific points in
the execution of a program, i.e., only when a subroutine is
called
 the runtime system maintains a machine-independent
program stack, called migration stack; for this the compiler
must generate code to update the migration stack
whenever a subroutine is entered or exited

the principle of maintaining a migration stack to support migration of an
execution segment in a heterogeneous environment
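
The idea of a migration stack can be imitated in Python with a decorator that plays the role of the compiler-generated code: it pushes a machine-independent record on entry to a subroutine and pops it on exit. The names `migratable`, `process`, and `double` are hypothetical, and JSON is just one possible machine-independent encoding.

```python
import functools
import json

migration_stack = []     # machine-independent: only procedure names and plain values
snapshots = []           # serialized copies taken at points where migration could occur

def migratable(func):
    @functools.wraps(func)
    def wrapper(*args):
        # Update the migration stack whenever the subroutine is entered ...
        migration_stack.append({"procedure": func.__name__, "args": list(args)})
        try:
            return func(*args)
        finally:
            migration_stack.pop()          # ... and whenever it is exited
    return wrapper

@migratable
def double(x):
    # A subroutine call is a permitted migration point: capture the stack here.
    snapshots.append(json.dumps(migration_stack))
    return x * 2

@migratable
def process(x):
    return double(x + 1)
```

After `process(3)`, the last snapshot lists the frames for `process` and `double` with their arguments, which is the kind of information a target machine would need to rebuild the execution segment.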

3.5 Software Agents and Agent Technology
 A software agent is an autonomous unit (process) capable of
performing a task in collaboration (i.e., it communicates) with
other, possibly remote, agents
 it is capable of reacting to, and initiating changes (proactive) in its
environment, possibly with users and other agents
 a collaborative agent is an agent that forms part of a multiagent
system, in which agents seek to achieve some common goal
through collaboration; e.g., in arranging meetings
 a mobile agent is an agent having the capability to move
between different machines; e.g., to retrieve information
distributed across a large heterogeneous network such as
the Internet
 an interface agent is an agent that assists an end user in the
use of one or more applications and has learning capabilities (it is
adaptive); e.g., those that bring buyers and sellers together

 an information agent is an agent that manages information from
different sources, e.g., by ordering and filtering it; for example an
email agent filtering unwanted mail from its owner’s mailbox, or
automatically distributing incoming mail into appropriate
subject-specific mailboxes
Property        Common to all agents?   Description
Autonomous      Yes                     Can act on its own
Reactive        Yes                     Responds timely to changes in its environment
Proactive       Yes                     Initiates actions that affect its environment
Communicative   Yes                     Can exchange information with users and other agents
Continuous      No                      Has a relatively long lifespan
Mobile          No                      Can migrate from one site to another
Adaptive        No                      Capable of learning

some important properties by which different types of agents can be
distinguished
Agent Technology
 we need support to develop agent systems; for instance a
middleware consisting of generally-used components of agents in
distributed systems
 FIPA - Foundation for Intelligent Physical Agents - develops a
general model for software agents
 agent platform: where agents are registered and operate;
provides basic services such as creating and deleting agents,
locating agents, interagent communication, ...; it may include the
following
 agent management: keep track of the agents for the associated
platform; provides facilities for creating and deleting agents,
looking for the current endpoint for a specific agent by providing
a naming service to globally and uniquely identify an endpoint
 local directory service: where agents can look up what other
agents on the platform have to offer
 agent communication channel (ACC): for agents to
communicate by exchanging messages in multiagent systems

the general model of an agent platform
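
The three platform components can be sketched as one small class. This is a toy, in-memory illustration and not the FIPA reference model itself: the class `AgentPlatform`, its method names, and the `name@platform` identifier format are all assumptions.

```python
class AgentPlatform:
    def __init__(self, name):
        self.name = name
        self.inboxes = {}      # agent management: agent name -> received messages
        self.directory = {}    # local directory service: service -> agents offering it

    def create_agent(self, agent, services=()):
        self.inboxes[agent] = []
        for service in services:
            self.directory.setdefault(service, set()).add(agent)
        return f"{agent}@{self.name}"      # hypothetical globally unique identifier

    def delete_agent(self, agent):
        self.inboxes.pop(agent, None)
        for providers in self.directory.values():
            providers.discard(agent)

    def lookup(self, service):
        """Directory service: which agents on this platform offer the service?"""
        return sorted(self.directory.get(service, ()))

    def send(self, sender, receiver, purpose, content):
        """Agent communication channel: deliver a message to the receiver's inbox."""
        self.inboxes[receiver].append(
            {"sender": sender, "purpose": purpose, "content": content})
```

A buyer agent could call `lookup("book-sales")` to find sellers and then `send` them a message with purpose `"CFP"`.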

 Agent Communication Languages (ACL)
 ACL: an application-level protocol through which communication
between agents takes place
 a message has a purpose and content
 purpose
 to request a specific service
 to respond to a request
 to inform about an event
 to propose during negotiation

Message purpose    Description                                 Message content
INFORM             Inform that a given proposition is true     Proposition
QUERY-IF           Query whether a given proposition is true   Proposition
QUERY-REF          Query for a given object                    Expression
CFP                Ask for a proposal                          Proposal specifics
PROPOSE            Provide a proposal                          Proposal
ACCEPT-PROPOSAL    Tell that a given proposal is accepted      Proposal ID
REJECT-PROPOSAL    Tell that a given proposal is rejected      Proposal ID
REQUEST            Request that an action be performed         Action specification
SUBSCRIBE          Subscribe to an information source          Reference to source

examples of different message types in the FIPA ACL, giving the purpose
of a message, along with the description of the actual message content
 ACL messages consist of a header and the actual content
 the header has different fields:
 a field to identify the purpose of the message,
 fields to identify the sender and the receiver,
 a field to identify the language or encoding scheme for the
content (since an ACL does not prescribe the format or
language in which the message content is expressed),
 a field to identify a standardized mapping of symbols to their
meaning, called an ontology (needed if there is no common
understanding of how to interpret the data)

 e.g., to inform an agent about Dutch royalty relationships

         Field      Value
Header   Purpose    INFORM
         Sender     max@http://fanclub-beatrix.royalty-spotters.nl:7239
         Receiver   elke@iiop://royalty-watcher.uk:5623
         Language   Prolog
         Ontology   genealogy
Content             female(beatrix), parent(beatrix, juliana, bernhard)

simple example of a FIPA ACL message sent between two agents
using Prolog to express genealogy information
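
The same message can be written down as a small data structure. The field names below mirror the slide's table rather than the parameter names of the FIPA specification, so the class `ACLMessage` and the helper `header` should be read as an illustrative assumption.

```python
from dataclasses import dataclass, asdict

@dataclass
class ACLMessage:
    purpose: str      # header: what the message is for
    sender: str       # header
    receiver: str     # header
    language: str     # header: language in which the content is expressed
    ontology: str     # header: how symbols in the content map to meaning
    content: str      # the actual content, opaque to the ACL itself

def header(message):
    """Return only the header fields of a message."""
    fields = asdict(message)
    fields.pop("content")
    return fields

msg = ACLMessage(
    purpose="INFORM",
    sender="max@http://fanclub-beatrix.royalty-spotters.nl:7239",
    receiver="elke@iiop://royalty-watcher.uk:5623",
    language="Prolog",
    ontology="genealogy",
    content="female(beatrix), parent(beatrix, juliana, bernhard)",
)
```

The receiver can route and interpret the message from `header(msg)` alone; only an agent that understands Prolog and the genealogy ontology needs to parse `msg.content`.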

