CMPSC 311, Introduction to Systems Programming
Introduction to Sockets
Reading
- CS:APP
- Sec. 1.8, Systems Communicate with Other Systems Using Networks
- Ch. 11, Network Programming, Intro
- Sec. 11.1, The Client-Server Programming Model
- Sec. 11.2, Networks
- Sec. 11.3, The Global IP Internet
- Sec. 11.4, The Sockets Interface
- APUE, Ch. 16
Network Programming with Sockets
- Client-Server Model
- IP Networks
- Sockets Interface
- functions - socket, connect, bind, listen, accept
A simple point-to-point model
- process --- host --- connection --- host --- process
Client-Server Model for application design
- Client (many, ephemeral)
- Server (one, permanent)
- these are processes, not specific machines
- Resource (managed by the server)
Service Transaction
- Request (client to server)
- Action (server)
- access the resource, perhaps modify it
- Response (server to client)
- Receipt (client)
Server design
- initialize
- loop forever
- wait for client request
- take action
- respond to client
- cleanup
Client design
- initialize
- send request to server
- wait for server to respond
- use response
Networks - host, connector, "wire"
- a Local Area Network is constructed with one type of "wire" and
one set of communication hardware and protocols
- for example, Ethernet with hubs and bridges
- a Wide Area Network is constructed from multiple Local Area
Networks
- similar principles, everything is compatible but a larger
geographic area
- an internet is constructed from multiple LANs and WANs, without
requiring them to be directly compatible
- routers are used for the interconnections
- the routers are responsible for translating data and protocols
between incompatible networks
IP Network - a specific way to identify hosts and exchange data
IP Internet - same but more general
- layered connectors, host --- hub --- bridge --- router
- layered protocol software
- use software abstractions to cover up the differences between
physical networks
Protocol = rules for exchanging data
- what to expect next?
- how to identify what arrives?
Naming scheme
- internet addresses
- layering -- hosts are identified by internet addresses, which are
uniform, not by LAN/WAN addresses, which are not
- this is an abstraction mechanism
Delivery mechanism
- packet = header + payload
- header = source address, destination address, packet size, etc.
- payload = data sent from source to destination
- layering -- a packet at one level of the network can become the
payload for a packet at the next level of the network, by adding
another header
- this is a form of encapsulation
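The layering idea above can be sketched in C. The header layout here is hypothetical and much simpler than a real IP or Ethernet header; the names toy_header and encapsulate are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified header; real protocol headers carry more fields. */
struct toy_header {
    uint32_t src;          /* source address */
    uint32_t dst;          /* destination address */
    uint32_t payload_len;  /* number of payload bytes that follow */
};

/*
 * Build a packet in out[]: the header first, then the payload bytes.
 * Returns the total packet size, or 0 if out[] is too small.
 */
size_t encapsulate(uint32_t src, uint32_t dst,
                   const void *payload, size_t len,
                   unsigned char *out, size_t outcap)
{
    struct toy_header h;

    assert(payload != NULL && out != NULL);
    if (len > UINT32_MAX || outcap < sizeof h || len > outcap - sizeof h)
        return 0;

    h.src = src;
    h.dst = dst;
    h.payload_len = (uint32_t) len;
    memcpy(out, &h, sizeof h);             /* header */
    memcpy(out + sizeof h, payload, len);  /* payload */
    return sizeof h + len;
}
```

Calling encapsulate() a second time with the first packet as the payload adds another header in front, which is what happens when a packet moves down one level of the network.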
Internet Protocol, building the Global IP Network in layers
client or server code | user code
sockets interface     | system calls
TCP/IP                | OS kernel code
hardware interface    | interrupts
network adapter       | hardware
TCP = Transmission Control Protocol
IP = Internet Protocol
Recall,
- process --- host --- connection --- host --- process
IP provides naming and delivery of packets (datagrams)
- host-to-host
- but, datagrams can be lost or duplicated
UDP = User Datagram Protocol (unreliable datagram delivery, process-to-process)
TCP provides reliable delivery of packets
- process-to-process
- based on expectation, acknowledgement, retransmission if necessary
process --- host --- connection --- host --- process
- The hosts might use different memory architectures.
- "Network Byte Order" = big-endian
- functions are provided to convert between host byte order (big- or
little-endian) and network byte order (big-endian)
<netinet/in.h>
htonl (host to network, long, 32 bits)
htons (host to network, short, 16 bits)
ntohl, ntohs (network to host, the inverse conversions)
- some of this can also be handled by the network adapter, in
hardware
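A minimal sketch of the conversion functions in use. The helper name network_bytes is hypothetical; it only exposes the byte layout that htonl() produces.

```c
#include <arpa/inet.h>   /* htonl, htons, ntohl, ntohs */
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Store the bytes of a 32-bit value, in network byte order, into out[0..3]. */
void network_bytes(uint32_t host_value, unsigned char out[4])
{
    uint32_t net = htonl(host_value);  /* no-op on a big-endian host */

    assert(out != NULL);
    memcpy(out, &net, sizeof net);     /* copy the in-memory representation */
}
```

On any host, the most significant byte ends up in out[0], and ntohl(htonl(x)) gives back x.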
IPv4 (Internet Protocol version 4), 32-bit IP address
IPv6, 128-bit IP address
A host is mapped to an IP address (numeric).
An IP address is mapped to an Internet Domain Name (symbolic).
Internet Address (IPv4)
struct in_addr
- one field, unsigned int s_addr, in network byte order
- 130.203.16.27, etc.
- "dotted decimal notation" is used so people don't have to read
hexadecimal
- hex 82cb101b --> "dotted hex" 82.cb.10.1b --> "dotted
decimal" 130.203.16.27
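The conversion between dotted decimal text and the 32-bit value can be done with the standard inet_pton() and inet_ntop() functions. This is a small sketch; the wrapper names dotted_to_host and host_to_dotted are hypothetical.

```c
#include <arpa/inet.h>    /* inet_pton, inet_ntop, htonl, ntohl */
#include <assert.h>
#include <netinet/in.h>   /* struct in_addr, INET_ADDRSTRLEN */
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>   /* AF_INET */

/* Parse dotted decimal text into a host-order value. 0 on success, -1 on bad input. */
int dotted_to_host(const char *text, uint32_t *out)
{
    struct in_addr addr;

    assert(text != NULL && out != NULL);
    if (inet_pton(AF_INET, text, &addr) != 1)
        return -1;
    *out = ntohl(addr.s_addr);   /* s_addr is kept in network byte order */
    return 0;
}

/* Format a host-order value as dotted decimal; buf needs INET_ADDRSTRLEN bytes. */
const char *host_to_dotted(uint32_t value, char *buf, size_t buflen)
{
    struct in_addr addr;

    memset(&addr, 0, sizeof addr);
    addr.s_addr = htonl(value);
    return inet_ntop(AF_INET, &addr, buf, (socklen_t) buflen);
}
```

With these, "130.203.16.27" parses to hex 82cb101b, matching the conversion chain above.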
Internet Domain Name
- because no one really wants to mess with numbers ... that's why
we have computers
- because applying some kind of hierarchy to the names helps to
understand them
Lookup Functions (old)
gethostbyname(), gethostbyaddr()
- These are obsolete (removed from the 2008 POSIX Standard), but
easier to explain.
#include <netdb.h>
struct hostent *gethostbyaddr(const void *addr, socklen_t len, int type);
struct hostent *gethostbyname(const char *name);
The return values are pointers to static data.
struct hostent {
    name, domain name
    aliases, NULL-terminated array of domain names
    address type (AF_INET, address family, Internet)
    address length
    addresses, NULL-terminated array of struct in_addr *
};
// Mac OS X
struct hostent {
    char *h_name;        /* official name of host */
    char **h_aliases;    /* alias list */
    int h_addrtype;      /* host address type */
    int h_length;        /* length of address */
    char **h_addr_list;  /* list of addresses from name server */
};
// example usage, but not complete
struct in_addr addr;      // internet address
addr.s_addr = something;  // unsigned int, in network byte order
struct hostent *hostp;
hostp = gethostbyaddr(&addr, sizeof(addr), AF_INET);
if (hostp != NULL)        // NULL means the lookup failed
    printf("%s\n", hostp->h_name);
Lookup Function (new)
int getaddrinfo(const char *restrict nodename,
                const char *restrict servname,
                const struct addrinfo *restrict hints,
                struct addrinfo **restrict res);
- nodename is either a valid host name or a numeric host address string
consisting of a dotted decimal IPv4 address or an IPv6 address
- servname is either a decimal port number or a service name listed in
services(5)
- hints is optional, information about the caller's socket; set to NULL
for defaults
- If the call is successful, *res points to a linked list of addrinfo
structures
struct addrinfo {
    int ai_flags;              /* input flags */
    int ai_family;             /* protocol family for socket */
    int ai_socktype;           /* socket type */
    int ai_protocol;           /* protocol for socket */
    socklen_t ai_addrlen;      /* length of socket-address */
    struct sockaddr *ai_addr;  /* socket-address for socket */
    char *ai_canonname;        /* canonical name for service location */
    struct addrinfo *ai_next;  /* pointer to next in list */
};
ai_family, ai_socktype, and ai_protocol can be used later in a call to
socket().
For each addrinfo structure in the list, the ai_addr member points to a
filled-in socket address structure of length ai_addrlen.
- struct sockaddr will be described shortly
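A small sketch of a lookup that needs no name server: with the AI_NUMERICHOST and AI_NUMERICSERV flags, getaddrinfo() only parses numeric strings, and getnameinfo() turns the resulting socket address back into text. The function name first_numeric_address is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Resolve host/service, then write the first address found, in numeric
 * form, to hostbuf and servbuf. Returns 0 on success, otherwise the
 * getaddrinfo() or getnameinfo() error code (see gai_strerror()).
 */
int first_numeric_address(const char *host, const char *service, int flags,
                          char *hostbuf, size_t hostlen,
                          char *servbuf, size_t servlen)
{
    struct addrinfo hints, *res;
    int error;

    assert(hostbuf != NULL && servbuf != NULL);
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    error = getaddrinfo(host, service, &hints, &res);
    if (error != 0)
        return error;

    /* res->ai_addr is a filled-in socket address of length res->ai_addrlen */
    error = getnameinfo(res->ai_addr, res->ai_addrlen,
                        hostbuf, (socklen_t) hostlen,
                        servbuf, (socklen_t) servlen,
                        NI_NUMERICHOST | NI_NUMERICSERV);
    freeaddrinfo(res);                /* the list is owned by the caller */
    return error;
}
```

Passing 0 for flags instead allows host names and service names, at the cost of a possible network lookup.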
Internet Connections
- point-to-point, between processes
- full-duplex, data moves in both directions
- reliable, bytes arrive in the order in which they were sent, and
are never lost or duplicated
Socket = connection endpoint
- socket address = internet address (32 or 128 bits) + port number
(16 bits)
- a client's port number is assigned by the OS in response to a
request, and is temporary
- a server's port number is advertised and permanent
- See /etc/services for a list of well-known ports (with web links to
IANA)
Connection = two socket addresses
- clientaddr:clientport, serveraddr:serverport
Sockets overview, as a sequence of function calls
Client  | action                   | Server
--------+--------------------------+-------
socket  |                          | socket
        |                          | bind
        |                          | listen
connect | connection request --->  | accept
write   | service request --->     | read
read    | <--- service response    | write
close   | EOF --->                 | read
        |                          | close
- client - read(), write() and close() using the return value from
socket()
- server - read(), write() and close() using the return value from
accept()
- After returning from accept(), the server could call fork() or
pthread_create() to allow the transaction with this client to be handled
independently of the other clients. After returning from close(), the
"child server" would then call exit() or pthread_exit() to clean up this
transaction. The "parent server" never terminates.
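The whole call sequence can be tried on one machine over the loopback interface. In this sketch the function name loopback_transaction is hypothetical, a forked child plays the server, and each side assumes a short message that arrives in a single read().

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send request to a one-shot echo server; returns the response length or -1. */
ssize_t loopback_transaction(const char *request, char *response, size_t cap)
{
    struct sockaddr_in addr;
    socklen_t addrlen = sizeof addr;
    int listenfd, clientfd;
    ssize_t n = -1;
    pid_t pid;

    assert(request != NULL && response != NULL);
    if (request[0] == '\0')
        return -1;                        /* an empty request would never be answered */

    /* Server side: socket, bind, listen (done before fork, so no race). */
    listenfd = socket(AF_INET, SOCK_STREAM, 0);
    if (listenfd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);             /* 0 asks the OS to choose a free port */
    if (bind(listenfd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        listen(listenfd, 8) < 0 ||
        getsockname(listenfd, (struct sockaddr *) &addr, &addrlen) < 0) {
        close(listenfd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(listenfd);
        return -1;
    }
    if (pid == 0) {                       /* child: the server */
        char buf[256];
        int connfd = accept(listenfd, NULL, NULL);
        if (connfd >= 0) {
            ssize_t got = read(connfd, buf, sizeof buf);       /* service request */
            if (got > 0 && write(connfd, buf, (size_t) got) < 0)
                _exit(1);                                      /* response failed */
            close(connfd);
        }
        close(listenfd);
        _exit(0);
    }

    /* parent: the client; addr now holds the port the server was given */
    close(listenfd);
    clientfd = socket(AF_INET, SOCK_STREAM, 0);
    if (clientfd >= 0 &&
        connect(clientfd, (struct sockaddr *) &addr, sizeof addr) == 0 &&
        write(clientfd, request, strlen(request)) >= 0)
        n = read(clientfd, response, cap);                     /* service response */
    if (clientfd >= 0)
        close(clientfd);
    if (n < 0)
        kill(pid, SIGKILL);               /* do not leave the server waiting */
    waitpid(pid, NULL, 0);
    return n;
}
```

A real server would loop on accept() and a real client would loop on read() until it has the whole response; both are omitted to keep the sequence visible.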
Sockets
- The OS kernel sees an end-point for communication.
- The client and server programs see an open file.
struct sockaddr, generic socket address, 16 bytes
- protocol family, 2 bytes, indicates meaning of additional data
- additional data, 14 bytes
struct sockaddr_in, Internet-style socket address, 16 bytes
- address family, always AF_INET
- port number, 16 bits, in network byte order (a server's port numbers
are advertised)
- IP address, 32 bits (IPv4), in network byte order
- padding, 8 bytes
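Filling in a struct sockaddr_in by hand might look like the sketch below. The helper name make_sockaddr_in is hypothetical; the field names sin_family, sin_port and sin_addr are the standard ones.

```c
#include <arpa/inet.h>    /* inet_pton, htons */
#include <assert.h>
#include <netinet/in.h>   /* struct sockaddr_in */
#include <string.h>
#include <sys/socket.h>   /* AF_INET, struct sockaddr */

/*
 * Fill *sa for a dotted decimal IPv4 address and a port in host byte order.
 * Returns 0 on success, -1 if ip is not a valid dotted decimal address.
 */
int make_sockaddr_in(const char *ip, unsigned short port, struct sockaddr_in *sa)
{
    assert(ip != NULL && sa != NULL);
    memset(sa, 0, sizeof *sa);          /* also clears the padding */
    sa->sin_family = AF_INET;           /* address family */
    sa->sin_port = htons(port);         /* port, network byte order */
    if (inet_pton(AF_INET, ip, &sa->sin_addr) != 1)
        return -1;                      /* IP address, network byte order */
    return 0;
}
```

The result is passed to connect() or bind() as (struct sockaddr *) sa together with sizeof(struct sockaddr_in).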
#include <sys/types.h>
#include <sys/socket.h>
int socket(int domain, int type, int protocol);
- simple usage: domain = AF_INET, type = SOCK_STREAM, protocol = 0
- called by the client and by the server, as two independent actions
- returns a socket descriptor, partially opened, eventually used with
read(), write(), etc., like a file descriptor from open()
- an "active socket", but it's not connected to anything yet
int connect(int sockfd, const struct sockaddr *server_addr,
            socklen_t addrlen);
- called by the client
- sockfd came from socket()
- wait for connection to server, or failure
- If successful, sockfd can now be used with read() and write() (the
socket is connected).
- second arg, pass struct sockaddr_in *, cast to struct sockaddr *
- before call, set the fields with info from gethostbyname() (the
server's address) and the server's port number
- connect() does not modify the structure; the OS chooses the client's
own address and temporary port
- third arg, pass sizeof(struct sockaddr_in)
int bind(int sockfd, const struct sockaddr *server_addr, socklen_t addrlen);
- called by the server
- sockfd came from socket()
- associates server's socket address with the socket descriptor
int listen(int sockfd, int backlog);
- This call, by the server, distinguishes server from client.
- sockfd is now a "listening socket". Up to this point, it was an
"active socket". It still isn't connected.
- backlog indicates how many outstanding requests to keep (pick a large
number)
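The server's socket(), bind() and listen() steps are often wrapped in one helper. The sketch below is in the spirit of the Open_listenfd() wrapper used in the later examples, but it is not that function; the names open_listen_socket and bound_port are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a listening socket on port (host byte order). Returns the
   descriptor, or -1 on error. */
int open_listen_socket(unsigned short port)
{
    struct sockaddr_in addr;
    int optval = 1;
    int listenfd = socket(AF_INET, SOCK_STREAM, 0);

    if (listenfd < 0)
        return -1;

    /* Let the server restart without waiting for old connections to expire. */
    if (setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR,
                   &optval, sizeof optval) < 0) {
        close(listenfd);
        return -1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);  /* any local interface */
    addr.sin_port = htons(port);               /* 0 lets the OS choose */

    if (bind(listenfd, (struct sockaddr *) &addr, sizeof addr) < 0 ||
        listen(listenfd, 1024) < 0) {          /* 1024 = backlog */
        close(listenfd);
        return -1;
    }
    return listenfd;
}

/* Port (host byte order) that a bound socket is using, or -1 on error. */
int bound_port(int sockfd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    assert(sockfd >= 0);
    if (getsockname(sockfd, (struct sockaddr *) &addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}
```

The returned descriptor is the listening socket; it is only ever passed to accept(), never to read() or write().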
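int accept(int listenfd, struct sockaddr *addr, socklen_t *addrlen);
- listenfd came from socket() and was made a "listening descriptor" by
listen().
- Wait for a connection request, return a "connected descriptor" for use
with read() and write().
- The client's info is stored through addr; *addrlen must hold the size
of that buffer before the call, and holds the actual address length
after it.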
client
- read(), write() and close() using the return value from socket() after
connect() has returned
server
- read(), write() and close() using the return value from accept() after
socket(), bind() and listen() have returned
Summary

Client
- create a socket by calling socket(), save return value as clientfd
- identify server by host address, port number
- associate server with clientfd by using connect()
- use clientfd with read(), write(), close()
Server
- once only,
- create a socket with socket(), save return value as listenfd
- identify server by host address, port number
- associate server with listenfd by using bind() and listen()
- in a loop, call accept() with listenfd
- upon return, the server has a new client
- save return value as connfd
- use connfd with read(), write(), close()
The point-to-point client-server socket connection is between clientfd
(in the client) and connfd (in the server). This
connection is ephemeral.
Example, an iterative echo server
Client
Server
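The echo() routine called by the concurrent servers below is not reproduced in these notes. A minimal sketch of what such a routine could look like, assuming plain read() and write() rather than the CS:APP I/O wrappers (write_all is a hypothetical helper):

```c
#define _POSIX_C_SOURCE 200809L   /* ask for the POSIX declarations */
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Write all n bytes, retrying after short writes. Returns 0, or -1 on error. */
static int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);
        if (w < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, try again */
            return -1;
        }
        buf += w;
        n -= (size_t) w;
    }
    return 0;
}

/* Send every byte received on connfd back to the client, until EOF. */
void echo(int connfd)
{
    char buf[4096];
    ssize_t n;

    assert(connfd >= 0);
    for (;;) {
        n = read(connfd, buf, sizeof buf);
        if (n < 0 && errno == EINTR)
            continue;               /* interrupted by a signal, try again */
        if (n <= 0)
            break;                  /* 0 = client closed (EOF), < 0 = error */
        if (write_all(connfd, buf, (size_t) n) < 0)
            break;
    }
}
```

The caller remains responsible for closing connfd, as the servers below do after echo() returns.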
Example, a concurrent echo server based on processes
#include "csapp.h"

void sigchld_handler(int sig)
{
    /* Reap every child that has already terminated, without blocking */
    while (waitpid(-1, 0, WNOHANG) > 0)
        ;
    return;
}

void echo(int connfd); // see above, echo.c

int main(int argc, char *argv[])
{
    int listenfd, connfd, port;
    socklen_t clientlen;
    struct sockaddr_in clientaddr;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }
    port = atoi(argv[1]);

    Signal(SIGCHLD, sigchld_handler);
    listenfd = Open_listenfd(port);
    while (1) {
        clientlen = sizeof(clientaddr);  /* reset: Accept() overwrites it */
        connfd = Accept(listenfd, (SA *) &clientaddr, &clientlen);
        if (Fork() == 0) {
            Close(listenfd); /* Child closes its listening socket */
            echo(connfd);    /* Child services client */
            Close(connfd);   /* Child closes connection with client */
            exit(0);         /* Child exits */
        }
        Close(connfd); /* Parent closes connected socket (important!) */
    }
}
Example, a concurrent echo server based on threads
#include "csapp.h"

void echo(int connfd); // see above, echo.c
void *thread(void *vargp);

int main(int argc, char *argv[])
{
    int listenfd, *connfdp, port;
    socklen_t clientlen;
    struct sockaddr_in clientaddr;
    pthread_t tid;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <port>\n", argv[0]);
        exit(0);
    }
    port = atoi(argv[1]);

    listenfd = Open_listenfd(port);
    while (1) {
        clientlen = sizeof(clientaddr);  /* reset: Accept() overwrites it */
        connfdp = Malloc(sizeof(int));   /* one heap copy per thread, so the
                                            next Accept() cannot overwrite it */
        *connfdp = Accept(listenfd, (SA *) &clientaddr, &clientlen);
        Pthread_create(&tid, NULL, thread, connfdp);
    }
}

/* thread routine */
void *thread(void *vargp)
{
    int connfd = *((int *)vargp);
    Pthread_detach(pthread_self());  /* no join needed for this thread */
    Free(vargp);
    echo(connfd);
    Close(connfd);
    return NULL;
}
Example, adapted from Mac OS X man page getaddrinfo(3)
The following code tries to connect to host "www.kame.net" and service
"http" via a stream socket. It loops through all the addresses
available, regardless of address family. If the destination resolves to
an IPv4 address, it will use an AF_INET socket. Similarly, if it
resolves to IPv6, an AF_INET6 socket is used. Observe that there is no
hardcoded reference to a particular address family. The code works even
if getaddrinfo() returns addresses that are not IPv4/v6.
/* needs <sys/types.h>, <sys/socket.h>, <netdb.h>, <string.h>,
   <unistd.h>, and <err.h> (err() and errx() are BSD extensions) */
struct addrinfo hints, *res;
memset(&hints, 0, sizeof(hints));
hints.ai_family = PF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;

int error = getaddrinfo("www.kame.net", "http", &hints, &res);
if (error)
{
    errx(1, "%s", gai_strerror(error)); // error reporting and exit
}

int s = -1;
const char *cause = NULL;
for (struct addrinfo *p = res; p != NULL; p = p->ai_next)
{
    s = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
    if (s < 0)
    {
        cause = "socket";
        continue;
    }
    if (connect(s, p->ai_addr, p->ai_addrlen) < 0)
    {
        cause = "connect";
        close(s);
        s = -1;
        continue;
    }
    break; /* okay, we got one */
}
if (s < 0)
{
    err(1, "%s", cause); // error reporting and exit
}
freeaddrinfo(res);
Last revised, 14 May 2012