Ch6. Transport Layer
Just as there are two types of network service, connection-oriented and connectionless, there are the same two types of transport service. If the transport service is so similar to the network service, why are there two distinct layers? The network layer is part of the communication subnet. What happens if the network layer offers connection-oriented service but is unreliable? Suppose that it frequently loses packets, or that routers crash from time to time. The users have no control over the subnet, so they cannot solve the problem of poor service by using better routers or putting more error handling in the data link layer. The only possibility is to put another layer on top of the network layer that improves the quality of service.
The transport service primitives allow transport users (e.g., application programs) to access the transport service. Each transport service has its own access primitives. The purpose of the transport layer is to provide a reliable service on top of an unreliable network. Therefore, it hides the imperfections of the network service so the user processes can just assume the existence of an error-free bit stream. The following transport service primitives allow application programs to establish, use, and release connections.
A message sent from transport entity to transport entity is called a Transport Protocol Data Unit (TPDU). When a frame arrives, the data link layer processes the frame header and passes the contents of the frame payload field up to the network entity. The network entity processes the packet header and passes the contents of the packet payload up to the transport entity.
Consider an application with a server and a number of remote clients. The server executes a LISTEN primitive, which blocks the server (i.e., the server is interested in handling requests) until a client turns up. When a client wants to talk to the server, it executes a CONNECT primitive. This blocks the client and sends a CONNECTION REQUEST TPDU to the server. When it arrives, the transport entity checks to see that the server is blocked on a LISTEN. It then unblocks the server and sends a CONNECTION ACCEPTED TPDU back to the client. When this TPDU arrives, the client is unblocked and the connection is established. When the connection is no longer needed, it must be released by issuing a DISCONNECT primitive.
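The primitive sequence above maps naturally onto the Berkeley socket interface. A minimal sketch in Python (the port number and message are illustrative, not part of the original example):

```python
import socket
import threading

PORT = 50007  # illustrative port number
ready = threading.Event()

def server():
    # LISTEN: block until a client turns up
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("127.0.0.1", PORT))
        s.listen(1)
        ready.set()
        conn, _ = s.accept()        # unblocks when a CONNECTION REQUEST arrives
        with conn:
            conn.sendall(b"hello")  # connection established; send data
        # DISCONNECT happens implicitly when conn is closed

t = threading.Thread(target=server)
t.start()
ready.wait()

# CONNECT: blocks until the CONNECTION ACCEPTED comes back
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as c:
    c.connect(("127.0.0.1", PORT))
    data = c.recv(1024)
t.join()
```

Here `accept` plays the role of the unblocking on LISTEN and `connect` that of the CONNECT primitive; closing either socket corresponds to DISCONNECT.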
Figure 6-4. A state diagram for a simple connection management scheme. Transitions labeled in italics are caused by packet arrivals. The solid lines show the client's state sequence. The dashed lines show the server's state sequence.
The transport service is implemented by a transport protocol used between the two transport entities. Both transport protocols and data link protocols have to deal with error control, sequencing, and flow control. However, significant differences between the two also exist. At the data link layer, two routers communicate directly via a physical channel, whereas at the transport layer this physical channel is replaced by the entire subnet.
Figure 6-7. (a) Environment of the data link layer. (b) Environment of the transport layer.
Differences between the Data Link Layer and the Transport Layer:
1. In the data link layer, each outgoing line uniquely specifies a particular router. In the transport layer, explicit addressing of destinations is required.
2. The process of establishing a connection over a wire is simple. In the transport layer, initial connection establishment is more complicated.
3. The subnet has storage capacity. In the data link layer, a frame may arrive or be lost, but it cannot bounce around for a while. In the transport layer, there is a nonnegligible probability that a packet may be stored for a number of seconds and then delivered later.
4. Buffering and flow control are needed in both layers, but in the transport layer they may require a different approach than the one used in the data link layer.
ADDRESSING
When an application process wishes to set up a connection to a remote application process, it must specify which one to connect to. The method used is to define transport addresses on which processes can listen for connection requests. These are called Transport Service Access Points (TSAPs). In the Internet, these points are (IP address, local port) pairs. Similarly, the end points in the network layer are called Network Service Access Points (NSAPs).
Example: A possible scenario for a transport connection over a connection-oriented network layer is as follows.
1. A time-of-day server on host 2 attaches itself to TSAP 122 to wait for an incoming call. A primitive such as LISTEN might be used.
2. An application process on host 1 wants to find out the time of day, so it issues a CONNECT request specifying TSAP 6 as the source and TSAP 122 as the destination.
3. The transport entity on host 1 selects a network address on its machine (if it has more than one) and sets up a network connection so that host 1's transport entity can talk to the transport entity on host 2.
4. The first thing the transport entity on host 1 says to its peer on host 2 is: "Good morning. I would like to establish a transport connection between my TSAP 6 and your TSAP 122. What do you say?"
5. The transport entity on host 2 then asks the time-of-day server at TSAP 122 if it is willing to accept a new connection.
How does the user process on host 1 know that the time-of-day server is attached to TSAP 122? One possibility is stable TSAP addresses, i.e., the time-of-day server has been attaching itself to TSAP 122 for years, and gradually all the network users have learned this. Stable TSAP addresses might work for a small number of key services. However, in general, user processes often talk to other user processes for a short time and do not have a TSAP address that is known in advance. Furthermore, if there are many server processes, most of which are rarely used, it is wasteful to have each of them active and listening to a stable TSAP address all day long. Therefore, a better scheme is needed.
One scheme, used by UNIX on the Internet, is known as the initial connection protocol, which works as follows. Each machine that wishes to offer service to remote users has a special process server that listens on a set of ports at the same time, waiting for a connection request. Users begin by issuing a CONNECT request, specifying the TSAP address (TCP port) of the service they want. If no server is waiting for them, they get a connection to the process server. After it gets the incoming request, the process server spawns the requested server, which then does the requested work, while the process server goes back to listening for new requests.
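A toy version of such a process server can be sketched in Python. The service table, port number, and replies are hypothetical stand-ins; a real process server (like UNIX inetd) would read its service table from a configuration file and fork the requested server as a separate process:

```python
import socket
import threading

# Hypothetical service table mapping service names to handlers
SERVICES = {
    b"time-of-day": lambda: b"12:00:00",
    b"echo-banner": lambda: b"hello",
}

def process_server(port, ready):
    """Listens on behalf of all services; dispatches the requested one on demand."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        s.bind(("127.0.0.1", port))
        s.listen(1)
        ready.set()
        conn, _ = s.accept()
        with conn:
            name = conn.recv(64)            # which service does the client want?
            handler = SERVICES.get(name)
            if handler:
                conn.sendall(handler())     # stand-in for spawning the real server
            else:
                conn.sendall(b"unknown service")

ready = threading.Event()
t = threading.Thread(target=process_server, args=(50008, ready))
t.start()
ready.wait()

with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as c:
    c.connect(("127.0.0.1", 50008))
    c.sendall(b"time-of-day")
    reply = c.recv(64)
t.join()
```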
Figure 6-9. How a user process in host 1 establishes a connection with a time-of-day server in host 2.
A file server, however, needs to run on special hardware (a machine with a disk) and cannot be created on demand. To handle this situation, a special process called a name server (or sometimes a directory server) exists. To find the TSAP address corresponding to a given service name, such as "time-of-day", a user sets up a connection to the name server. The user then sends a message specifying the service name, and the name server sends back the TSAP address. Then the user releases the connection with the name server and establishes a new one with the desired service. TSAP addresses can be either hierarchical addresses or flat addresses. Hierarchical address = <galaxy><star><planet><country><network><host><port>.
Establishing a Connection
Establishing a connection sounds easy. It would seem sufficient for one entity to just send a CONNECTION REQUEST TPDU to the destination and wait for a CONNECTION ACCEPTED reply. The problem occurs when the network can lose, store, and duplicate packets. One way to solve this problem is to use throw-away transport addresses. Each time a transport address is needed, a new one is generated; when a connection is released, the address is discarded. Another possibility is to give each connection an identifier (i.e., a sequence number incremented for each connection established), chosen by the initiating party and put in each TPDU, including the one requesting the connection. After each connection is released, each transport entity can update a table listing obsolete connections as (peer transport entity, connection identifier) pairs. Whenever a connection request comes in, it can be checked against the table to see whether it belongs to a previously released connection.
The drawback of this scheme is that when a machine crashes and loses its history, it will no longer know which connection identifiers have already been used. Therefore, we have to devise a mechanism to kill off aged packets, using one of the following techniques: 1. Restricted subnet design: prevents packets from looping. 2. Putting a hop counter in each packet. 3. Timestamping each packet: requires the router clocks to be synchronized. In practice, we need to guarantee not only that a packet is dead, but also that all acknowledgements to it are dead. If we wait a time T after a packet has been sent, we can be sure that all traces of it are gone and that neither it nor its acknowledgements will suddenly appear.
Figure 6-10. (a) TPDUs may not enter the forbidden region. (b) The resynchronization problem.
To get around a machine losing all memory, Tomlinson proposed to equip each host with a time-of-day clock. The clocks at different hosts need not be synchronized. Furthermore, the number of bits in the counter must equal or exceed the number of bits in the sequence number. Also, the clock is assumed to continue running even if the host goes down. The basic idea is to ensure that two identically numbered TPDUs are never outstanding at the same time. When a connection is set up, the low-order k bits of the clock are used as the initial sequence number. Therefore, each connection starts numbering its TPDUs with a different sequence number. The sequence space should be so large that by the time sequence numbers wrap around, old TPDUs with the same sequence number are long gone.
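The clock-based choice of initial sequence number can be sketched as follows; the microsecond tick rate and the 32-bit sequence width are illustrative assumptions, not part of the original text:

```python
import time

K = 32  # sequence-number width in bits (assumed for this sketch)

def initial_sequence_number(clock_ticks=None):
    """Take the low-order k bits of a free-running clock, as in Tomlinson's scheme."""
    if clock_ticks is None:
        clock_ticks = int(time.time() * 1_000_000)  # assumed microsecond ticks
    return clock_ticks & ((1 << K) - 1)

# Two connections opened at different times start with different numbers,
# so identically numbered TPDUs from an old and a new connection cannot
# be outstanding at the same time.
isn1 = initial_sequence_number(1_000_000)
isn2 = initial_sequence_number(2_000_000)
```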
To establish a connection, there is a potential problem in getting both sides to agree on the initial sequence number. For example, host 1 establishes a connection by sending a CONNECTION REQUEST TPDU containing the proposed initial sequence number and destination port number to a remote peer, host 2. The receiver, host 2, then acknowledges this request by sending a CONNECTION ACCEPTED TPDU back. If the CONNECTION REQUEST TPDU is lost but a delayed duplicate CONNECTION REQUEST suddenly shows up at host 2, the connection will be established incorrectly. To solve this problem, Tomlinson introduced the three-way handshake.
The three-way handshake protocol does not require both sides to begin sending with the same sequence number. Host 1 chooses a sequence number, x, and sends a CONNECTION REQUEST TPDU containing x to host 2. Host 2 replies with a CONNECTION ACCEPTED TPDU acknowledging x and announcing its own initial sequence number, y. Finally, host 1 acknowledges host 2's choice in the first data TPDU that it sends. The three-way handshake works in the presence of delayed duplicate control TPDUs.
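The exchange can be simulated in a few lines of Python; representing the TPDUs as plain dictionaries is purely illustrative:

```python
import random

def three_way_handshake():
    """Simulate the exchange: host 1 proposes x, host 2 answers with y and an
    acknowledgement of x, and host 1's first data TPDU acknowledges y."""
    x = random.randrange(2**32)                      # host 1's initial sequence number
    cr = {"type": "CR", "seq": x}                    # host 1 -> host 2: CONNECTION REQUEST
    y = random.randrange(2**32)                      # host 2's initial sequence number
    acc = {"type": "ACC", "seq": y, "ack": cr["seq"]}  # host 2 -> host 1: CONNECTION ACCEPTED
    data = {"type": "DATA", "seq": x, "ack": acc["seq"]}  # host 1 -> host 2: first data TPDU
    # The connection counts as established only if both acknowledgements match;
    # a delayed duplicate CR would produce an ACC whose ack host 1 never confirms.
    return acc["ack"] == x and data["ack"] == y

result = three_way_handshake()
```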
Figure 6-11. Three protocol scenarios for establishing a connection using a three-way handshake. CR denotes CONNECTION REQUEST. (a) Normal operation. (b) Old duplicate CONNECTION REQUEST appearing out of nowhere. (c) Duplicate CONNECTION REQUEST and duplicate ACK.
Releasing a Connection
There are two styles of terminating a connection: asymmetric release and symmetric release. Asymmetric release is the way the telephone works: when one party hangs up, the connection is broken. Asymmetric release may result in data loss. Suppose a connection is established and host 1 sends a TPDU to host 2. Then host 1 sends another TPDU. However, host 2 issues a DISCONNECT before the second TPDU arrives. The result is that the connection is released and the data are lost.
Symmetric release treats the connection as two separate unidirectional connections and requires each one to be released separately. Although symmetric release is a more sophisticated protocol that avoids data loss, it does not always work. There is a famous problem that deals with this issue, called the two-army problem.
Figure 6-14. Four protocol scenarios for releasing a connection. (a) Normal case of three-way handshake. (b) Final ACK lost. (c) Response lost. (d) Response lost and subsequent DRs lost.
A flow control scheme is needed on each connection to keep a fast transmitter from overrunning a slow receiver. If the subnet provides a datagram service, the sending transport entity must buffer outgoing TPDUs because they might have to be retransmitted. If the receiver knows that the sender buffers all TPDUs until they are acknowledged, the receiver may or may not dedicate specific buffers to specific connections. The receiver may maintain a single buffer pool shared by all connections. When a TPDU comes in, an attempt is made to acquire a buffer. If one is available, the TPDU is accepted; otherwise, it is discarded. Since the sender is prepared to retransmit TPDUs lost by the subnet, no harm is done by having the receiver drop TPDUs. The sender just keeps trying until it gets an acknowledgement.
In summary, if the network service is unreliable, the sender must buffer all TPDUs sent. With reliable network service, if the sender knows that the receiver always has buffer space, it need not retain copies of the TPDUs it sends. However, if the receiver cannot guarantee that every incoming TPDU will be accepted, the sender will have to buffer anyway.
Buffer Size
If most TPDUs are nearly the same size, the buffers can be organized as a pool of identically sized buffers, with one TPDU per buffer. If there is a wide variation in TPDU size, from a few characters typed at a terminal to thousands of characters from file transfers, a pool of fixed-size buffers presents problems. If the buffer size is chosen equal to the largest possible TPDU, space will be wasted whenever a short TPDU arrives. If the buffer size is chosen smaller than the maximum TPDU size, multiple buffers will be needed for long TPDUs. Another approach to the buffer size problem is to use variable-size buffers. The advantage is better memory utilization, at the price of more complicated buffer management. A third possibility is to dedicate a single large circular buffer per connection. This is a good approach when all connections are heavily loaded, but poor if some connections are lightly loaded.
Figure 6-15. (a) Chained fixed-size buffers. (b) Chained variable-sized buffers. (c) One large circular buffer per connection.
Dynamic Buffer Allocation: The sender requests a certain number of buffers. The receiver then grants as many of these as it can afford. Every time the sender transmits a TPDU, it must decrement its allocation, stopping when the allocation reaches zero. The receiver separately piggybacks both acknowledgements and buffer allocations onto the reverse traffic.
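A sketch of the sender side of this credit scheme; the class and message names are made up for illustration:

```python
class CreditSender:
    """Sender side of dynamic buffer allocation: transmit only while credit remains."""
    def __init__(self, credit):
        self.credit = credit            # number of buffers granted by the receiver
        self.sent = []

    def send(self, tpdu):
        if self.credit == 0:
            return False                # allocation exhausted: must wait for a grant
        self.credit -= 1                # each TPDU consumes one granted buffer
        self.sent.append(tpdu)
        return True

    def receive_ack(self, extra_credit):
        # acknowledgements and new buffer allocations ride back piggybacked
        self.credit += extra_credit

s = CreditSender(credit=2)
ok1 = s.send("m0")   # credit 2 -> 1
ok2 = s.send("m1")   # credit 1 -> 0
ok3 = s.send("m2")   # blocked: allocation reached zero
s.receive_ack(1)     # receiver grants one more buffer
ok4 = s.send("m2")   # now succeeds
```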
Figure 6-16. Dynamic buffer allocation. The arrows show the direction of transmission. An ellipsis (...) indicates a lost TPDU.
Multiplexing
Multiplexing plays a role in several layers of the network architecture. Several transport connections can be open at a time, even if only one network address is available on a host. It comes in two forms: upward multiplexing, in which several transport connections share one network connection, and downward multiplexing, in which one transport connection is distributed over several network connections.
Crash recovery
In case of a host or router crash: if the network provides datagram service, a TPDU may be lost and should then be retransmitted. If the network provides connection-oriented service, a new circuit will be built up, and the last TPDU, if lost, should be retransmitted.
For failures occurring in layer N, recovery is only possible in layer N+1. The transport layer can therefore recover from failures of the network layer, but only on the condition that enough status information is saved.
Figure 6-21. The example protocol as a finite state machine. Each entry has an optional predicate, an optional action, and the new state. The tilde indicates that no major action is taken. An overbar above a predicate indicates the negation of the predicate. Blank entries correspond to impossible or invalid events.
Figure 6-22. The example protocol in graphical form. Transitions that leave the connection state unchanged have been omitted for simplicity
Introduction to UDP
The UDP header.
The User Datagram Protocol (UDP) is a connectionless transport protocol, usually used for short messages. It transmits an 8-byte segment header followed by the payload. The ports identify the end points on the source and destination hosts. The source port is primarily needed when a reply must be sent back to the source.
The UDP length field includes the 8-byte header and the data. The checksum is optional and is 0 if not used. UDP does NOT do flow control, error control, or retransmission; the user process is responsible for those.
All UDP does is provide an interface for demultiplexing multiple processes using the ports. It is useful for client/server applications:
The client sends a short request (e.g., a DNS query) and the server sends a short reply back. If the request or reply is lost, the client will time out and can try again.
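This request/reply pattern with timeout and retry can be sketched with Python's socket module; the port number and message contents are illustrative:

```python
import socket
import threading

PORT = 50009  # illustrative port number
ready = threading.Event()

def short_reply_server():
    """Answers one short request with one short reply, UDP-style."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.bind(("127.0.0.1", PORT))
        ready.set()
        request, addr = s.recvfrom(512)
        s.sendto(b"reply:" + request, addr)

t = threading.Thread(target=short_reply_server)
t.start()
ready.wait()

reply = None
with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as c:
    c.settimeout(1.0)                 # if the request or reply is lost, time out...
    for attempt in range(3):          # ...and try again
        try:
            c.sendto(b"query", ("127.0.0.1", PORT))
            reply, _ = c.recvfrom(512)
            break
        except socket.timeout:
            continue
t.join()
```

The retry loop is the application-level substitute for the retransmission that UDP itself does not provide.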
Figure 6-24. Steps in making a remote procedure call. The stubs are shaded
Figure 6-25. (a) The position of RTP in the protocol stack. (b) Packet nesting
The Real-Time Transport Protocol (RTP) runs over UDP; it simply uses UDP as a carrier. Like UDP, RTP has no flow control, no error control, no acknowledgements, and no mechanism to request retransmissions. The UDP packets are embedded in IP packets. If the host is on an Ethernet, the IP packets are then put in Ethernet frames for transmission.
The Internet has two main protocols in the transport layer, a connection-oriented protocol (TCP) and a connectionless protocol (UDP). TCP (Transmission Control Protocol) was designed to provide a reliable end-to-end connection over an unreliable internetwork and to dynamically adapt to the properties of the internetwork. A TCP entity accepts user data streams from local processes, breaks them up into pieces not exceeding 64 KB (in practice, usually 1500 bytes), and sends each piece as a separate IP datagram. When IP datagrams containing TCP data arrive at a machine, they are given to the TCP entity, which reconstructs the original byte streams. The IP layer gives no guarantee that datagrams will be delivered properly, so it is up to TCP to time out and retransmit them. Also, it is up to TCP to reassemble the datagrams into messages in the proper sequence. Therefore, TCP provides the reliability that most users want and that IP does not provide.
Figure 6-28. (a) Four 512-byte segments sent as separate IP datagrams. (b) The 2048 bytes of data delivered to the application in a single READ call.
The sending and receiving TCP entities exchange data in the form of segments. A segment consists of a 20-byte header (plus an optional part) followed by zero or more data bytes. Two limits restrict the segment size. First, each segment, including the TCP header, must fit in the 65,535-byte IP payload. Second, each network has a maximum transfer unit (MTU). If a segment passes through a sequence of networks without being fragmented and then hits one whose MTU is smaller than the segment, the router at the boundary fragments the segment into two or more smaller segments. Each new segment gets its own TCP and IP headers, so fragmentation by routers increases the total overhead.
For lines with high bandwidth, high delay, or both, the 64-KB window is often a problem. On a T3 line (44.736 Mbps), it takes only 12 msec to output a full 64-KB window. If the round-trip propagation delay is 50 msec (typical for transcontinental fiber), the sender will be idle 3/4 of the time waiting for acknowledgements. On a satellite connection, the situation is even worse. A larger window would allow the sender to keep pumping data out, but with the 16-bit window size field there is no way to express such a size. Therefore, a window scale option has been proposed, allowing the sender and receiver to negotiate a scale factor. This number allows both sides to shift the window size field up to 16 bits to the left, allowing windows of up to 2^32 bytes. Another proposed option is the use of selective repeat instead of the go-back-n protocol.
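The arithmetic behind the T3 example can be checked directly:

```python
# Numbers from the text: T3 line, 64-KB window, 50-msec round-trip delay.
line_rate_bps = 44.736e6            # T3 line rate
window_bits = 64 * 1024 * 8         # full 64-KB window
rtt_s = 0.050                       # transcontinental round-trip time

drain_time_s = window_bits / line_rate_bps   # time to push out a full window
busy_fraction = drain_time_s / rtt_s         # sender is busy only this share of each RTT

# drain_time_s is about 11.7 msec against a 50-msec round trip, so the
# sender sits idle for roughly three-quarters of the time, as the text says.
```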
(a) TCP connection establishment in the normal case. (b) Call collision.
The states used in the TCP connection management finite state machine
TCP Connection Management Modeling
Connection management from the server's point of view: the server does a LISTEN and waits. When a SYN is received, it is acknowledged and the server goes to the SYN RCVD state. When the server's SYN+ACK is itself acknowledged, the three-way handshake is complete and the server goes to the ESTABLISHED state. Data transfer can then start.
When a client is done, it does a CLOSE, which causes a FIN to arrive at the server. When the server has finished too, it also does a CLOSE, and a FIN is sent to the client. Once the client acknowledges the FIN, the server releases the connection and deletes the connection record.
TCP Transmission Policy
Example: suppose the receiver has a 4-KB buffer. If the sender transmits a 2-KB segment, the receiver will acknowledge the segment and advertise a window of 4 - 2 = 2 KB. Now the sender transmits another 2 KB, which will be acknowledged, and the advertised window becomes 2 - 2 = 0. The sender must stop sending until it is informed that some data has been removed from the receiver's buffer, with two exceptions:
First, urgent data may be sent, e.g., to allow the user to kill a remote process.
Second, the sender may send a 1-byte segment to make the receiver reannounce the next byte expected and the new window size.
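The 4-KB buffer example above can be traced with a small sketch (sizes in KB, bookkeeping only, no real TCP):

```python
class ReceiveWindow:
    """The advertised window shrinks as segments arrive and grows back
    as the receiving application consumes the buffered data."""
    def __init__(self, buffer_kb):
        self.free = buffer_kb

    def segment_arrives(self, kb):
        self.free -= kb
        return self.free              # window advertised with the ACK

    def application_reads(self, kb):
        self.free += kb
        return self.free              # window reannounced to the sender

w = ReceiveWindow(4)
win1 = w.segment_arrives(2)   # ACK advertises 4 - 2 = 2 KB
win2 = w.segment_arrives(2)   # ACK advertises 2 - 2 = 0: sender must stop
win3 = w.application_reads(2) # window update lets the sender resume
```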
Nagle's Algorithm
When data comes into the sender one byte at a time, just send the first byte and buffer all the rest until the outstanding byte is acknowledged. Then send all the buffered characters in one TCP segment and start buffering again until the sent data is acknowledged.
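A minimal model of this buffering rule, tracking only which segments would go on the wire (the class is illustrative, not real TCP code):

```python
class NagleSender:
    """One byte at a time from the application: send the first byte at once,
    buffer the rest until the outstanding data is acknowledged."""
    def __init__(self):
        self.buffer = b""
        self.outstanding = False
        self.segments_sent = []

    def write(self, byte):
        if not self.outstanding:
            self.segments_sent.append(byte)   # first byte goes out immediately
            self.outstanding = True
        else:
            self.buffer += byte               # coalesce while waiting for the ACK

    def ack_received(self):
        if self.buffer:
            self.segments_sent.append(self.buffer)  # flush everything in one segment
            self.buffer = b""
            self.outstanding = True
        else:
            self.outstanding = False

s = NagleSender()
for ch in b"hello":                  # application delivers five single bytes
    s.write(bytes([ch]))
s.ack_received()
# Only two segments go on the wire instead of five tinygrams.
```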
TCP Congestion Control
First, TCP tries to detect congestion; it assumes that timeouts are due to congestion. TCP also tries to prevent congestion: at connection setup, the receiver specifies a window based on its buffer size, so if there is no congestion, a window overflow problem will not occur.
The solution takes two facts into account: network capacity and receiver capacity. Each sender maintains two windows: the window the receiver has granted and the congestion window. It also takes a third parameter into account: the threshold.
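A rough sketch of how the congestion window might evolve under these rules: exponential growth up to the threshold, linear growth beyond it, capped by the window the receiver has granted. The parameters and the halving-on-timeout rule follow the classic slow-start scheme; the exact numbers are illustrative:

```python
def congestion_window_trace(mss, receiver_window, threshold, rounds, timeout_at=None):
    """Return the congestion window size at the start of each round."""
    cwnd = mss
    trace = []
    for r in range(rounds):
        if r == timeout_at:
            threshold = max(cwnd // 2, mss)   # assume congestion caused the timeout
            cwnd = mss                        # restart from one segment
        trace.append(cwnd)
        if cwnd < threshold:
            cwnd = min(cwnd * 2, threshold)   # exponential (slow start) phase
        else:
            cwnd += mss                       # linear growth past the threshold
        cwnd = min(cwnd, receiver_window)     # never exceed the granted window
    return trace

trace = congestion_window_trace(mss=1, receiver_window=64, threshold=16, rounds=8)
# Doubles each round up to the threshold of 16, then grows by one MSS per round.
```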
TCP Timer Management
TCP uses multiple timers; the retransmission timer is the most important. The problem is how to determine it. This is more difficult in the Internet transport layer than in the data link layer (a whole network versus a point-to-point link). The timeout should be slightly more than the RTT and should take possible congestion into account.
If the timeout is set too short, unnecessary retransmissions will occur: more packets will be on the network, and performance will suffer, especially when a packet is lost. Also, the mean and variance of the acknowledgement arrival distribution can change rapidly as congestion builds up and is resolved. If the timeout is set too long, unnecessary time will be spent waiting for a packet that may have been lost. The solution is a highly dynamic algorithm that constantly adjusts the timeout interval.
For each connection, TCP maintains a variable, RTT (round-trip time). When a segment is sent, a timer is started, both to see how long the acknowledgement takes and to trigger a retransmission if it takes too long. If the ack gets back before the timer expires, TCP measures how long the acknowledgement took, say M. It then updates RTT according to the formula RTT = aRTT + (1 - a)M, where a is a smoothing factor that determines how much weight is given to the old value. Typically a = 7/8.
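The smoothing formula is a one-line exponentially weighted average; the initial RTT value here is an arbitrary illustration:

```python
def smoothed_rtt(measurements, alpha=7/8, initial=1.0):
    """Apply RTT = a*RTT + (1 - a)*M after each acknowledgement, with a = 7/8."""
    rtt = initial
    for m in measurements:
        rtt = alpha * rtt + (1 - alpha) * m
    return rtt

# One measurement of 2.0 s against an initial estimate of 1.0 s:
# 7/8 * 1.0 + 1/8 * 2.0 = 1.125, so the estimate moves only slightly
# toward the new sample, damping out transient spikes.
rtt = smoothed_rtt([2.0])
```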
Wireless TCP and UDP
Unfortunately, wireless transmission links are highly unreliable, so packets are often lost. This leads to retransmissions by TCP, transmission is slowed down, and matters get worse. Usually only the last kilometer of a path is wireless, so making the correct decision on a timeout is a matter of knowing where the problem occurred.
The Bakre and Badrinath solution is indirect TCP. It splits the TCP connection into two separate connections (see figure): one from the sender to the base station, the other from the base station to the receiver. The base station simply copies packets between the connections in both directions. Problem 1: it breaks the semantics of TCP. Problem 2: an ack received from the base station does not mean that the receiver has got the message.
A different solution, due to Balakrishnan, does not break TCP semantics. It makes several small modifications to the network layer code in the base station, adding a snooping agent there. Disadvantage: if the wireless link is very lossy, the source may still time out and the congestion control algorithm may be invoked.
Transactional TCP
Remote procedure calls are a way to implement client/server systems. UDP is used when the request and reply are small and fit in a single packet. See the previous figure for the normal sequence of packets for doing RPC over TCP. The question is how to get both the efficiency of UDP (just two messages) and the reliability of TCP. The solution is T/TCP (transactional TCP).
See the previous figure, part (b). The standard connection setup sequence is slightly modified to allow the transfer of data during setup. If the server's reply fits in one packet, the connection terminates after three messages. If not, the server sends multiple packets before closing down the connection.