The document discusses socket address structures, which are data structures used to identify sockets in networking applications. It describes the IPv4 and IPv6 socket address structures, which contain fields like the address family, port number, and IPv4/IPv6 address. It also discusses generic socket address structures, byte ordering issues between host and network byte order, and functions for converting between the orders.
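UNIT-II

Socket Address Structures

Most socket functions require a pointer to a socket address structure as an argument. Each supported protocol suite defines its own socket address structure.

IPv4 Socket Address Structure

An IPv4 socket address structure, commonly called an "Internet socket address structure," is named sockaddr_in and is defined by including the <netinet/in.h> header. The POSIX definition of the IPv4 socket address structure (SAS) is shown below.

    struct in_addr {
        in_addr_t s_addr;             /* 32-bit IPv4 address */
                                      /* network byte ordered */
    };

    struct sockaddr_in {
        uint8_t        sin_len;       /* length of structure (16) */
        sa_family_t    sin_family;    /* AF_INET */
        in_port_t      sin_port;      /* 16-bit TCP or UDP port number */
                                      /* network byte ordered */
        struct in_addr sin_addr;      /* 32-bit IPv4 address */
                                      /* network byte ordered */
        char           sin_zero[8];   /* unused */
    };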
    Datatype      Description                                             Header
    int8_t        Signed 8-bit integer                                    <sys/types.h>
    uint8_t       Unsigned 8-bit integer                                  <sys/types.h>
    int16_t       Signed 16-bit integer                                   <sys/types.h>
    uint16_t      Unsigned 16-bit integer                                 <sys/types.h>
    int32_t       Signed 32-bit integer                                   <sys/types.h>
    uint32_t      Unsigned 32-bit integer                                 <sys/types.h>
    sa_family_t   Address family of socket address structure              <sys/socket.h>
    socklen_t     Length of socket address structure, normally uint32_t   <sys/socket.h>
    in_addr_t     IPv4 address, normally uint32_t                         <netinet/in.h>
    in_port_t     TCP or UDP port, normally uint16_t                      <netinet/in.h>

Table 1: Datatype, description and header file of IPv4 SAS members

o The POSIX specification requires only three members in the structure: sin_family, sin_addr, and sin_port.
o The datatypes u_char, u_short, u_int, and u_long are all unsigned.
o Both the IPv4 address and the TCP or UDP port number are always stored in the structure in network byte order.
o The 32-bit IPv4 address can be accessed in two different ways. For example, if serv is defined as an Internet socket address structure, then serv.sin_addr references the 32-bit IPv4 address as an in_addr structure, while serv.sin_addr.s_addr
Srinu Bevara Page 2
references the same 32-bit IPv4 address as an in_addr_t (typically an unsigned 32-bit integer).

Generic Socket Address Structure

A socket address structure is always passed by reference when passed as an argument to any socket function. In ANSI C, void * is the generic pointer type, but the socket functions predate ANSI C, so a generic socket address structure is defined in the <sys/socket.h> header instead.

    struct sockaddr {
        uint8_t     sa_len;
        sa_family_t sa_family;      /* address family: AF_xxx value */
        char        sa_data[14];    /* protocol-specific address */
    };
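The socket functions are then defined as taking a pointer to the generic socket address structure, as shown here in the ANSI C function prototype for the bind function:

    int bind(int, struct sockaddr *, socklen_t);

Any call to these functions must therefore cast the pointer to the protocol-specific socket address structure to a pointer to the generic socket address structure.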
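The cast described above can be sketched as follows. This is a minimal illustration, not code from the original notes: the helper name bind_loopback is hypothetical, and it assumes a POSIX system where the loopback address 127.0.0.1 is available.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a TCP socket bound to 127.0.0.1:port.
 * Returns the descriptor, or -1 on error. */
int bind_loopback(unsigned short port)
{
    struct sockaddr_in serv;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&serv, 0, sizeof(serv));               /* clear unused members */
    serv.sin_family = AF_INET;
    serv.sin_port = htons(port);                  /* network byte order */
    serv.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* bind takes the generic type, so the IPv4 pointer is cast */
    if (bind(fd, (struct sockaddr *)&serv, sizeof(serv)) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Passing a port of 0 asks the kernel to pick a free port, which keeps the sketch from colliding with other services.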
IPv6 Socket Address Structure

The IPv6 socket address structure is named sockaddr_in6.

    #define SIN6_LEN                    /* required for compile-time tests */

    struct sockaddr_in6 {
        uint8_t         sin6_len;       /* length of this struct (28) */
        sa_family_t     sin6_family;    /* AF_INET6 */
        in_port_t       sin6_port;      /* transport layer port# */
                                        /* network byte ordered */
        uint32_t        sin6_flowinfo;  /* flow information, undefined */
        struct in6_addr sin6_addr;      /* IPv6 address */
                                        /* network byte ordered */
        uint32_t        sin6_scope_id;  /* set of interfaces for a scope */
    };
The sin6_flowinfo member is divided into two fields:
o The low-order 20 bits are the flow label.
o The high-order 12 bits are reserved.
The sin6_scope_id identifies the scope zone in which a scoped address is meaningful, most commonly an interface index for a link-local address.

New Generic Socket Address Structure

A new generic socket address structure, sockaddr_storage, was defined as part of the IPv6 sockets API. Unlike struct sockaddr, it is large enough to hold any socket address type supported by the system. It is defined by including the <netinet/in.h> header.

    struct sockaddr_storage {
        uint8_t     ss_len;       /* length of this struct (implementation dependent) */
        sa_family_t ss_family;    /* address family: AF_xxx value */
        /* implementation-dependent elements to provide:
         * a) alignment sufficient to fulfill the alignment requirements of
         *    all socket address types that the system supports.
         * b) enough storage to hold any type of socket address that the
         *    system supports.
         */
    };

The sockaddr_storage type differs from struct sockaddr in two ways:
o It satisfies the strictest alignment requirement of any socket address structure on the system.
o It is large enough to contain any socket address structure that the system supports.

Apart from ss_family (and ss_len, if present), its fields are opaque to the user, so a sockaddr_storage must be cast or copied to the appropriate socket address structure before any other field is accessed.

Comparison of Socket Address Structures
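As an illustration of "cast to the appropriate socket address structure", the sketch below inspects ss_family and then casts to the matching type. The helper name storage_port is hypothetical and only the AF_INET and AF_INET6 families are handled.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical helper: return the port (host byte order) held in a
 * sockaddr_storage, or -1 if the family is neither AF_INET nor AF_INET6. */
int storage_port(const struct sockaddr_storage *ss)
{
    switch (ss->ss_family) {
    case AF_INET:
        return ntohs(((const struct sockaddr_in *)ss)->sin_port);
    case AF_INET6:
        return ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
    default:
        return -1;   /* family not handled by this sketch */
    }
}
```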
Figure 1: Comparison of various socket address structures
Value-Result Arguments

Socket address structure passed from process to kernel

Three functions, bind, connect, and sendto, pass a socket address structure from the process to the kernel. One argument to these three functions is the pointer to the socket address structure and another argument is the integer size of the structure. Since the kernel is passed both the pointer and the size of what the pointer points to, it knows exactly how much data to copy from the process into the kernel.
Figure 2: Socket address structure passed from process to kernel
Socket address structure passed from kernel to process

Four functions, accept, recvfrom, getsockname, and getpeername, pass a socket address structure from the kernel to the process, the reverse direction from the previous scenario. Two of the arguments to these four functions are the pointer to the socket address structure and a pointer to an integer containing the size of the structure. The size changes from an integer to a pointer to an integer because it is both a value when the function is called (it tells the kernel the size of the structure so that the kernel does not write past the end of the structure when filling it in) and a result when the function returns (it tells the process how much information the kernel actually stored in the structure). This type of argument is called a value-result argument.
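A small sketch of a value-result argument in use. The helper name query_addr_len is hypothetical; it assumes that getsockname succeeds on a freshly created, unbound IPv4 socket, which is the usual behavior but is not stated in these notes.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: record the length passed in to getsockname and the
 * length the kernel reports back. Returns 0 on success, -1 on error. */
int query_addr_len(socklen_t *before, socklen_t *after)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof(ss);          /* value: room available */
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    *before = len;
    if (getsockname(fd, (struct sockaddr *)&ss, &len) < 0) {
        close(fd);
        return -1;
    }
    *after = len;                        /* result: bytes actually stored */
    close(fd);
    return 0;
}
```

The caller passes in the size of the largest structure it can accept and reads back the size of the structure the kernel filled in.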
Figure 3: Socket address structure passed from kernel to process

Byte Ordering Functions

Consider a 16-bit integer that is made up of 2 bytes. There are two ways to store the two bytes in memory: with the low-order byte at the starting address, known as little-endian byte order, or with the high-order byte at the starting address, known as big-endian byte order.
Figure 4: Little-endian byte order and big-endian byte order for a 16-bit integer

We must deal with these byte ordering differences as network programmers because networking protocols must specify a network byte order. For example, in a TCP segment, there is a 16-bit port number and a 32-bit IPv4 address. The sending protocol stack and the receiving protocol stack must agree on the order in which the bytes of these multibyte fields will be transmitted. The Internet protocols use big-endian byte ordering for these multibyte integers.
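One common way to find out which byte order the host uses is to store a known 16-bit value and look at the byte at the lowest address. The function name below is hypothetical; the technique is a sketch, not something defined in these notes.

```c
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 on a big-endian host. */
int host_is_little_endian(void)
{
    union {
        uint16_t s;
        uint8_t  c[2];
    } un;

    un.s = 0x0102;            /* high-order byte 0x01, low-order byte 0x02 */
    return un.c[0] == 0x02;   /* low-order byte stored first => little-endian */
}
```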
In theory, an implementation could store the fields in a socket address structure in host byte order and then convert to and from the network byte order when moving the fields to and from the protocol headers, saving us from having to worry about this detail. But both history and the POSIX specification say that certain fields in the socket address structures must be maintained in network byte order. Our concern is therefore converting between host byte order and network byte order. We use the following four functions to convert between these two byte orders.

    #include <netinet/in.h>

    uint16_t htons(uint16_t host16bitvalue);
    uint32_t htonl(uint32_t host32bitvalue);
    Both return: value in network byte order

    uint16_t ntohs(uint16_t net16bitvalue);
    uint32_t ntohl(uint32_t net32bitvalue);
    Both return: value in host byte order
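The effect of htons can be observed by copying its result into a byte array: whatever the host byte order, the high-order byte comes first. The helper name port_wire_bytes is hypothetical.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store the bytes of htons(port), in memory order,
 * into out[0] and out[1]. */
void port_wire_bytes(uint16_t port, unsigned char out[2])
{
    uint16_t net = htons(port);       /* host order -> network order */

    memcpy(out, &net, sizeof(net));   /* out[0] is the first byte in memory */
}
```

For port 8080 (0x1F90) the array holds 0x1F followed by 0x90 on both little-endian and big-endian hosts, and ntohs(htons(x)) always gives back x.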
In the names of these functions, h stands for host, n stands for network, s stands for short, and l stands for long. The terms "short" and "long" are historical artifacts from the Digital VAX implementation of 4.2BSD. We should instead think of s as a 16-bit value (such as a TCP or UDP port number) and l as a 32-bit value (such as an IPv4 address). Indeed, on the 64-bit Digital Alpha, a long integer occupies 64 bits, yet the htonl and ntohl functions operate on 32-bit values.

NOTE: These functions apply only to the multibyte integer values (such as port numbers and IPv4 addresses) that are stored in socket address structures and exchanged between sockets.

Byte Manipulation Functions

There are two groups of functions that operate on multibyte fields, without interpreting the data, and without assuming that the data is a null-terminated C string. We need these types of functions when dealing with socket address structures because we need to manipulate fields such as IP addresses, which can contain bytes of 0, but are not C character strings.

The first group of functions, whose names begin with b (for byte), are from 4.2BSD and are still provided by almost any system that supports the socket functions. The second group of functions, whose names begin with mem (for memory), are from the ANSI C standard and are provided with any system that supports an ANSI C library.

    #include <strings.h>

    void bzero(void *dest, size_t nbytes);
    void bcopy(const void *src, void *dest, size_t nbytes);
    int  bcmp(const void *ptr1, const void *ptr2, size_t nbytes);
    Returns: 0 if equal, nonzero if unequal

The following functions are the ANSI C functions:

    #include <string.h>

    void *memset(void *dest, int c, size_t len);
    void *memcpy(void *dest, const void *src, size_t nbytes);
    int   memcmp(const void *ptr1, const void *ptr2, size_t nbytes);
    Returns: 0 if equal, <0 or >0 if unequal (the sign depends on the first differing byte, compared as unsigned char)
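The sketch below uses the ANSI C functions on an IPv4 socket address structure. The helper names set_addr and same_endpoint are hypothetical. The address 10.0.0.1 used in the usage note contains zero bytes, which is why string functions such as strcpy or strcmp would not work here.

```c
#include <netinet/in.h>
#include <string.h>

/* Hypothetical helper: zero the structure, then copy in a 4-byte IPv4
 * address that is already in network byte order. */
void set_addr(struct sockaddr_in *sa, const unsigned char raw[4])
{
    memset(sa, 0, sizeof(*sa));
    sa->sin_family = AF_INET;
    memcpy(&sa->sin_addr, raw, sizeof(sa->sin_addr));
}

/* Hypothetical helper: returns 1 if both structures hold the same
 * IPv4 address and port, 0 otherwise. */
int same_endpoint(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return memcmp(&a->sin_addr, &b->sin_addr, sizeof(a->sin_addr)) == 0 &&
           a->sin_port == b->sin_port;
}
```

Calling set_addr twice with the bytes {10, 0, 0, 1} gives two structures for which same_endpoint returns 1; changing the port in one of them makes it return 0.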
In bcopy and memcpy, src might represent application space and dest might represent socket send buffer space (or the reverse, when copying out of socket receive buffer space).

inet_aton, inet_addr, and inet_ntoa Functions

These functions convert an IPv4 address between its dotted-decimal string form and the 32-bit network byte ordered binary value stored in a socket address structure. They work with IPv4 only.

    #include <arpa/inet.h>

    int inet_aton(const char *strptr, struct in_addr *addrptr);
    Returns: 1 if string was valid, 0 on error

    in_addr_t inet_addr(const char *strptr);
    Returns: 32-bit binary network byte ordered IPv4 address; INADDR_NONE if error

    char *inet_ntoa(struct in_addr inaddr);
    Returns: pointer to dotted-decimal string

inet_pton and inet_ntop Functions

The following functions perform the same conversions for IPv6 addresses. They can also be used for IPv4 addresses; the family argument (AF_INET or AF_INET6) specifies which.

    #include <arpa/inet.h>

    int inet_pton(int family, const char *strptr, void *addrptr);
    Returns: 1 if OK, 0 if input not a valid presentation format, -1 on error

    const char *inet_ntop(int family, const void *addrptr, char *strptr, size_t len);
    Returns: pointer to result if OK, NULL on error
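A round trip through inet_pton and inet_ntop can be sketched as follows. The helper name ipv4_round_trip is hypothetical. Note that POSIX declares the last parameter of inet_ntop as socklen_t, so the buffer size is passed with that type here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: parse a dotted-decimal string and format it back.
 * Returns 1 if the text survives the round trip unchanged, 0 if it parses
 * but is formatted differently, and -1 if it is not a valid IPv4 address. */
int ipv4_round_trip(const char *text)
{
    struct in_addr addr;
    char buf[INET_ADDRSTRLEN];

    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;                          /* not valid presentation format */
    if (inet_ntop(AF_INET, &addr, buf, (socklen_t)sizeof(buf)) == NULL)
        return -1;
    return strcmp(text, buf) == 0;
}
```

Changing AF_INET to AF_INET6, struct in_addr to struct in6_addr, and INET_ADDRSTRLEN to INET6_ADDRSTRLEN would give the IPv6 version of the same sketch.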
Figure 5: Summary of address conversion functions

sock_ntop Function

A basic problem with inet_ntop is that it requires the caller to pass a pointer to a binary address. This address is normally contained in a socket address structure, requiring the caller to know the format of the structure and the address family. To solve this problem, we use sock_ntop (a helper function declared in unp.h, not a standard library function), which takes a pointer to a socket address structure as an argument, calls the appropriate conversion function, and returns the presentation form of the address.

    #include "unp.h"

    char *sock_ntop(const struct sockaddr *sockaddr, socklen_t addrlen);
    Returns: non-null pointer if OK, NULL on error

readn, writen, and readline Functions

Stream sockets (e.g., TCP sockets) exhibit a behavior with the read and write functions that differs from normal file I/O. A read or write on a stream socket might input or output fewer bytes than requested, but this is not an error condition. The reason is that buffer limits might be reached for the socket in the kernel. All that is required to input or output the remaining bytes is for the caller to invoke the read or write function again. Some versions of UNIX also exhibit this behavior when writing more than 4,096 bytes to a pipe. This scenario is always a possibility on a stream socket with read, but is normally seen with write only if the socket is nonblocking. Nevertheless, we always call our writen function instead of write, in case the implementation returns a short count.

    #include "unp.h"

    ssize_t readn(int filedes, void *buff, size_t nbytes);
    ssize_t writen(int filedes, const void *buff, size_t nbytes);
    ssize_t readline(int filedes, void *buff, size_t maxlen);
    All return: number of bytes read or written, -1 on error
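The "call read or write again" loop can be sketched as below. These are illustrative versions with hypothetical names (readn_sketch, writen_sketch), not the implementations shipped with unp.h; restarting after EINTR is an added assumption about how interrupted calls should be handled.

```c
#include <errno.h>
#include <unistd.h>

/* Read until n bytes have arrived, or EOF or an error occurs first.
 * Returns the number of bytes read (less than n only at EOF), or -1. */
ssize_t readn_sketch(int fd, void *buf, size_t n)
{
    size_t left = n;
    char *p = buf;

    while (left > 0) {
        ssize_t r = read(fd, p, left);
        if (r < 0) {
            if (errno == EINTR)
                continue;          /* interrupted: call read again */
            return -1;
        }
        if (r == 0)
            break;                 /* EOF */
        left -= (size_t)r;
        p += r;
    }
    return (ssize_t)(n - left);
}

/* Write all n bytes, repeating after short counts. Returns n, or -1. */
ssize_t writen_sketch(int fd, const void *buf, size_t n)
{
    size_t left = n;
    const char *p = buf;

    while (left > 0) {
        ssize_t w = write(fd, p, left);
        if (w <= 0) {
            if (w < 0 && errno == EINTR)
                continue;          /* interrupted: call write again */
            return -1;
        }
        left -= (size_t)w;
        p += w;
    }
    return (ssize_t)n;
}
```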
Elementary TCP Sockets
This chapter describes the elementary socket functions required to write a complete TCP client and server.
Figure 6: Socket functions for elementary TCP client/server

Socket Function

To perform network I/O, the first thing a process must do is call the socket function, specifying the type of communication protocol desired (TCP using IPv4, UDP using IPv6, Unix domain stream protocol, etc.).

    #include <sys/socket.h>

    int socket(int family, int type, int protocol);
    Returns: non-negative descriptor if OK, -1 on error

The socket function creates a socket on demand (placing it in an unconnected state) and returns an integer identifying the socket (the descriptor). Its arguments specify:
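Family - the protocol family (for example, AF_INET or AF_INET6).
Type - the type of communication socket (for example, SOCK_STREAM or SOCK_DGRAM).
Protocol - the specific protocol, which accommodates multiple protocols within a family; 0 selects the default for the given family and type.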
Figure 7: Protocol family constants for socket function
Figure 8: type of socket for socket function
Figure 9: protocol of sockets for AF_INET or AF_INET6

Connect Function

The connect function is used by a TCP client to establish a connection with a TCP server.

    #include <sys/socket.h>

    int connect(int sockfd, const struct sockaddr *servaddr, socklen_t addrlen);
    Returns: 0 if OK, -1 on error
EX: connect (socket, destaddr, addrlen);

connect binds a permanent destination to a socket, placing it in a connected state. Sockets using connectionless service do not have to use connect (they can specify the address in every datagram), but may.

Socket - socket descriptor.
Destaddr - sockaddr structure (also includes the protocol port number) specifying the destination address.
Addrlen - length of the destination address (in bytes).
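Putting socket, the IPv4 socket address structure, and connect together, a TCP client's connection setup might look like the sketch below. The helper name tcp_connect_ipv4 is hypothetical, and the server address and port are whatever the caller supplies.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: connect to an IPv4 address given as text.
 * Returns the connected descriptor, or -1 on error. */
int tcp_connect_ipv4(const char *ip, unsigned short port)
{
    struct sockaddr_in servaddr;
    int fd;

    memset(&servaddr, 0, sizeof(servaddr));
    servaddr.sin_family = AF_INET;
    servaddr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &servaddr.sin_addr) != 1)
        return -1;                       /* not a valid IPv4 address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&servaddr, sizeof(servaddr)) < 0) {
        close(fd);                       /* a failed socket is not reused */
        return -1;
    }
    return fd;
}
```

The client does not call bind here, so the kernel chooses the local address and an ephemeral port when connect is called.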
bind Function

The bind function assigns a local protocol address to a socket. With the Internet protocols, the protocol address is the combination of either a 32-bit IPv4 address or a 128-bit IPv6 address, along with a 16-bit TCP or UDP port number.

    #include <sys/socket.h>

    int bind(int sockfd, const struct sockaddr *myaddr, socklen_t addrlen);
    Returns: 0 if OK, -1 on error
EX: bind (socket, localaddr, addrlen);

A socket is created without any association to local or destination addresses, so a program uses bind to establish a local address for it.

Socket - integer descriptor of the socket.
Localaddr - structure that specifies the local address to be bound.
Addrlen - integer length of the address (in bytes).

listen Function

The listen function is called only by a TCP server and it performs two actions:

1. When a socket is created by the socket function, it is assumed to be an active socket, that is, a client socket that will issue a connect. The listen function converts an unconnected socket into a passive socket, indicating that the kernel should accept incoming connection requests directed to this socket. In terms of the TCP state transition diagram, the call to listen moves the socket from the CLOSED state to the LISTEN state.
2. The second argument to this function specifies the maximum number of connections the kernel should queue for this socket.

    #include <sys/socket.h>

    int listen(int sockfd, int backlog);
    Returns: 0 if OK, -1 on error

This function is normally called after both the socket and bind functions and must be called before calling the accept function.

EX: listen (socket, backlog);

The server creates a socket, binds it to a well-known port, and waits for requests. To avoid rejecting service requests that cannot be handled immediately, a request queue is created using listen. It provides a mechanism to create the queue and then listen for incoming connections (passive mode). listen only works with sockets using a reliable stream service.
Socket - integer descriptor.
Backlog (qlength) - length of the request queue for that socket (max. = 5 on historical BSD systems).

accept Function

accept is called by a TCP server to return the next completed connection from the front of the completed connection queue. If the completed connection queue is empty, the process is put to sleep (assuming the default of a blocking socket).

    #include <sys/socket.h>

    int accept(int sockfd, struct sockaddr *cliaddr, socklen_t *addrlen);
    Returns: non-negative descriptor if OK, -1 on error

EX: accept (socket, addr, addrlen);

bind associates a socket with a port, but that socket is not connected to a foreign destination. When a request comes in, accept establishes the full connection. It blocks until a connection request arrives.

Addr - pointer to the sockaddr structure.
Addrlen - pointer to the integer size of the address.

fork and exec Functions

The fork function (including the variants of it provided by some systems) is the only way in Unix to create a new process.

    #include <unistd.h>

    pid_t fork(void);
    Returns: 0 in child, process ID of child in parent, -1 on error

If you have never seen this function before, the hard part in understanding fork is that it is called once but it returns twice. It returns once in the calling process (called the parent) with a return value that is the process ID of the newly created process (the child). It also returns once in the child, with a return value of 0. Hence, the return value tells the process whether it is the parent or the child.

The reason fork returns 0 in the child, instead of the parent's process ID, is that a child has only one parent and it can always obtain the parent's process ID by calling getppid. A parent, on the other hand, can have any number of children, and there is no way to obtain the process IDs of its children. If a parent wants to keep track of the process IDs of all its children, it must record the return values from fork.
All descriptors open in the parent before the call to fork are shared with the child after fork returns. We will see this feature used by network servers: The parent calls accept and then calls fork. The connected socket is then shared between the parent and child. Normally, the child then reads and writes the connected socket and the parent closes the connected socket.
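The two return values of fork can be seen in a short sketch. The helper name run_child is hypothetical, and waitpid (not covered in these notes) is used only so that the parent can collect the child's exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`.
 * Returns the exit status seen by the parent, or -1 on error. */
int run_child(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;                /* fork failed */
    if (pid == 0)
        _exit(code);              /* child: fork returned 0 */

    /* parent: fork returned the child's process ID */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```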
There are two typical uses of fork:

1. A process makes a copy of itself so that one copy can handle one operation while the other copy does another task. This is typical for network servers. We will see many examples of this later in the text.
2. A process wants to execute another program. Since the only way to create a new process is by calling fork, the process first calls fork to make a copy of itself, and then one of the copies (typically the child process) calls exec (described next) to replace itself with the new program. This is typical for programs such as shells.

exec replaces the current process image with the new program file, and this new program normally starts at the main function. The process ID does not change. We refer to the process that calls exec as the calling process and the newly executed program as the new program.

The differences in the six exec functions are: (a) whether the program file to execute is specified by a filename or a pathname; (b) whether the arguments to the new program are listed one by one or referenced through an array of pointers; and (c) whether the environment of the calling process is passed to the new program or whether a new environment is specified.

    #include <unistd.h>

    int execl(const char *pathname, const char *arg0, ... /* (char *) 0 */ );
    int execv(const char *pathname, char *const argv[]);
    int execle(const char *pathname, const char *arg0, ... /* (char *) 0, char *const envp[] */ );
    int execve(const char *pathname, char *const argv[], char *const envp[]);
    int execlp(const char *filename, const char *arg0, ... /* (char *) 0 */ );
    int execvp(const char *filename, char *const argv[]);
    All six return: -1 on error, no return on success

These functions return to the caller only if an error occurs. Otherwise, control passes to the start of the new program, normally the main function.
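The fork-then-exec pattern used by shells can be sketched with execl. The helper name run_shell is hypothetical, and the sketch assumes that /bin/sh exists on the system; waitpid is again used only to collect the exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `sh -c command` in a child process.
 * Returns the command's exit status, or -1 on error. */
int run_shell(const char *command)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* arguments listed one by one, terminated by a null pointer */
        execl("/bin/sh", "sh", "-c", command, (char *)0);
        _exit(127);               /* reached only if execl failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The line after execl runs only when the call fails, which matches the "no return on success" rule above.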
Figure 10: Relationship among the six exec functions

Note the following differences among these six functions:
1. The three functions in the top row specify each argument string as a separate argument to the exec function, with a null pointer terminating the variable number of arguments. The three functions in the second row have an argv array, containing pointers to the argument strings. This argv array must contain a null pointer to specify its end, since a count is not specified.
2. The two functions in the left column specify a filename argument. This is converted into a pathname using the current PATH environment variable. If the filename argument to execlp or execvp contains a slash (/) anywhere in the string, the PATH variable is not used. The four functions in the right two columns specify a fully qualified pathname argument.
3. The four functions in the left two columns do not specify an explicit environment pointer. Instead, the current value of the external variable environ is used for building an environment list that is passed to the new program. The two functions in the right column specify an explicit environment list. The envp array of pointers must be terminated by a null pointer.

Close (a system call from the traditional UNIX environment)

    close (socket descriptor);

When a client or server finishes with a socket, it calls close to release the socket's resources. close decrements the descriptor's reference count, and the connection is terminated only when the count reaches 0. If several processes share the same socket, the connection therefore stays open until all of them have closed it.

Order of Socket System Calls

    Client Side                  Server Side
    (depends on connection type) (depends on connection type)
    Socket                       Socket
    Connect                      Bind
    Write (may be repeated)      Listen
    Read (may be repeated)       Accept
    Close                        Read (may be repeated)
                                 Write (may be repeated)
                                 Close (go back to Accept)

Shutdown

    shutdown (socket, direction);

The shutdown function applies to full-duplex sockets (connected using a TCP socket) and is used to partially close the connection.

Socket - socket descriptor of a connected socket.
Direction - direction in which shutdown is desired:
    0 = terminate further input.
    1 = terminate further output.
    2 = terminate input / output (close).
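A half-close can be demonstrated without a network by using a connected pair of Unix domain stream sockets. This is an assumption made to keep the sketch self-contained: the text above describes TCP sockets, while socketpair (not covered in these notes) creates local sockets. The helper name half_close_demo is hypothetical. The constants SHUT_RD, SHUT_WR, and SHUT_RDWR from <sys/socket.h> are the usual names for directions 0, 1, and 2.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: shut down output on one end of a socket pair.
 * Returns 1 if the peer sees end-of-file while the opposite direction
 * still carries data, 0 if not, and -1 if the pair cannot be created. */
int half_close_demo(void)
{
    int sv[2];
    char c = 0;
    int ok;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    shutdown(sv[0], SHUT_WR);                 /* direction 1: no more output */

    ok = read(sv[1], &c, 1) == 0              /* peer sees end-of-file */
      && write(sv[1], "x", 1) == 1            /* other direction still open */
      && read(sv[0], &c, 1) == 1 && c == 'x';

    close(sv[0]);
    close(sv[1]);
    return ok;
}
```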
Concurrent Servers

Outline for a typical concurrent server:

    pid_t pid;
    int   listenfd, connfd;

    listenfd = Socket( ... );

    /* fill in sockaddr_in{} with server's well-known port */
    Bind(listenfd, ... );
    Listen(listenfd, LISTENQ);
When a connection is established, accept returns, the server calls fork, and the child process services the client (on connfd, the connected socket) and the parent process waits for another connection (on listenfd, the listening socket). The parent closes the connected socket since the child handles the new client.
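The accept-and-fork sequence described above can be sketched as follows. Several details are assumptions made for illustration and do not come from the outline: the helper names (listen_loopback, echo_client, serve_clients), the echo service performed by the child, the LISTENQ value of 5, the use of the loopback address with a kernel-chosen port, and the limit on the number of clients, which lets the function return instead of looping forever.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define LISTENQ 5   /* assumed backlog; the outline gives no value */

/* Create a listening socket on 127.0.0.1 and store the port in *port. */
int listen_loopback(unsigned short *port)
{
    struct sockaddr_in sa;
    socklen_t len = sizeof(sa);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&sa, 0, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    sa.sin_port = htons(0);              /* 0 lets the kernel choose a port */
    if (bind(fd, (struct sockaddr *)&sa, sizeof(sa)) < 0 ||
        listen(fd, LISTENQ) < 0 ||
        getsockname(fd, (struct sockaddr *)&sa, &len) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(sa.sin_port);
    return fd;
}

/* Work done by each child: echo bytes back until the client closes. */
static void echo_client(int connfd)
{
    char buf[512];
    ssize_t n;

    while ((n = read(connfd, buf, sizeof(buf))) > 0)
        if (write(connfd, buf, (size_t)n) != n)
            break;                       /* give up on error or short write */
}

/* Accept max_clients connections, forking one child per client. */
int serve_clients(int listenfd, int max_clients)
{
    int i;

    for (i = 0; i < max_clients; i++) {
        int connfd = accept(listenfd, NULL, NULL);
        pid_t pid;

        if (connfd < 0)
            return -1;
        pid = fork();
        if (pid < 0) {
            close(connfd);
            return -1;
        }
        if (pid == 0) {                  /* child */
            close(listenfd);             /* child does not accept */
            echo_client(connfd);
            close(connfd);
            _exit(0);
        }
        close(connfd);                   /* parent: child owns the client */
    }
    while (waitpid(-1, NULL, 0) > 0)     /* reap the children */
        ;
    return 0;
}
```

A long-running server would replace the counted loop with an endless one and reap children as they exit rather than at the end.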
Figure 11: Status of client/server before call to accept returns

First, Figure 11 shows the status of the client and server while the server is blocked in the call to accept and the connection request arrives from the client.
Figure 12: Status of client/server after return from accept

Immediately after accept returns, we have the scenario shown in Figure 12. The connection is accepted by the kernel and a new socket, connfd, is created. This is a connected socket and data can now be read and written across the connection.
Figure 13: Status of client/server after fork returns

The next step in the concurrent server is to call fork. Figure 13 shows the status after fork returns.
Figure 14: Status of client/server after parent and child close appropriate sockets

Notice that after fork both descriptors, listenfd and connfd, are shared (duplicated) between the parent and child. The next step is for the parent to close the connected socket and the child to close the listening socket. This is shown in Figure 14. This is the desired final state of the sockets. The child is handling the connection with the client and the parent can call accept again on the listening socket, to handle the next client connection.

getsockname and getpeername Functions

These two functions return either the local protocol address associated with a socket (getsockname) or the foreign protocol address associated with a socket (getpeername).
    #include <sys/socket.h>

    int getsockname(int sockfd, struct sockaddr *localaddr, socklen_t *addrlen);
    int getpeername(int sockfd, struct sockaddr *peeraddr, socklen_t *addrlen);
    Both return: 0 if OK, -1 on error

Notice that the final argument for both functions is a value-result argument. That is, both functions fill in the socket address structure pointed to by localaddr or peeraddr.

The term "name" in these function names is misleading. These two functions return the protocol address associated with one of the two ends of a network connection, which for IPv4 and IPv6 is the combination of an IP address and port number. These functions have nothing to do with domain names.

These two functions are required for the following reasons:
o After connect successfully returns in a TCP client that does not call bind, getsockname returns the local IP address and local port number assigned to the connection by the kernel.
o After calling bind with a port number of 0 (telling the kernel to choose the local port number), getsockname returns the local port number that was assigned.
o getsockname can be called to obtain the address family of a socket.
o In a TCP server that binds the wildcard IP address, once a connection is established with a client (accept returns successfully), the server can call getsockname to obtain the local IP address assigned to the connection. The socket descriptor argument in this call must be that of the connected socket, and not the listening socket.
o When a server is execed by the process that calls accept, the only way the server can obtain the identity of the client is to call getpeername.
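The sketch below reads the port numbers at both ends of a connected socket. The helper names port_of and endpoint_ports are hypothetical, and only AF_INET and AF_INET6 sockets are handled.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Extract the port (host byte order) from a filled-in address. */
static int port_of(const struct sockaddr_storage *ss, unsigned short *port)
{
    if (ss->ss_family == AF_INET)
        *port = ntohs(((const struct sockaddr_in *)ss)->sin_port);
    else if (ss->ss_family == AF_INET6)
        *port = ntohs(((const struct sockaddr_in6 *)ss)->sin6_port);
    else
        return -1;
    return 0;
}

/* Hypothetical helper: fetch the local and peer port numbers of a
 * connected socket. Returns 0 on success, -1 on error. */
int endpoint_ports(int fd, unsigned short *local, unsigned short *peer)
{
    struct sockaddr_storage ss;
    socklen_t len;

    len = sizeof(ss);                    /* value-result: reset before each call */
    if (getsockname(fd, (struct sockaddr *)&ss, &len) < 0 ||
        port_of(&ss, local) < 0)
        return -1;

    len = sizeof(ss);
    if (getpeername(fd, (struct sockaddr *)&ss, &len) < 0 ||
        port_of(&ss, peer) < 0)
        return -1;
    return 0;
}
```

On a connected pair of sockets, the local port reported for one end is the peer port reported for the other end.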