Network Programming
#lecture note based on 15-213 Introduction to Computer Systems
- Network - system of boxes and wires
- LAN local area
- WAN wide area
- SAN storage area
- MAN metropolitan
- …
- internet - an interconnected set of networks
- Example networks
- Internet (capital I) - the most famous example of an internet
- ARPANET - a 1969 network connecting a few universities
Now, network implementation time
H2 Hardware Level
Recall:
- On the machine
- Slide some network adaptor into expansion slot on I/O bus
- So it’s almost like doing I/O
- Ethernet
- Hosts connect to a hub (a more primitive device than a router, etc…) via cable.
- Each Ethernet adapter has a unique 48-bit address (MAC address).
- e.g. 00:16:ea:e3:54:e6
- Operations
- Hosts send bits to other host in chunks, called frames
- Hub copies every bit from each port to every other port (basically broadcasting, so every host sees everything)
- Problem arises - traffic from different hosts can cross talk…
- Hubs connect to bridges, and bridges connect to each other (this used to be common; now bridges are cheap enough to replace hubs)
- Figuring out where to send stuff
- Bridges learn the best place to send stuff, using some heuristic
- This makes the network resilient to, say, losing the central machine
- LANs combine into a WAN by wiring routers together
- Logical structure
- There's a hierarchy. Different parts may not use the same protocol
- A host can reach another machine via different paths
- Packets can therefore take any of several routes
H2 Internet protocol
- Protocol components
- How to name something
- Some unique address for every host
- How to deliver something
- Standard transfer unit: the packet
- Standard packet format with
- header - size, destination, etc.
- payload - the data
- Protocol software
- Host to LAN: package it (add packet header, LAN frame header) and send, router receives
- Router to another LAN: unpack it, repack with another protocol, and send it to another LAN
- Destination host: LAN sends it to host, host unpacks it
- Issues
- Different protocols
- Fault tolerance
- Packet loss
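The header/payload packaging described above can be sketched in C. The header layouts and the names `pkt_header`, `frame_header`, `encapsulate` and `reframe` are invented for illustration; real IP and Ethernet headers carry more fields. The sketch also assumes both LANs use the same frame header size, which real networks need not.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layouts, for illustration only. */
struct pkt_header {            /* internet packet header */
    uint32_t src;              /* source host address */
    uint32_t dst;              /* destination host address */
    uint16_t len;              /* payload length in bytes */
};

struct frame_header {          /* LAN frame header */
    uint8_t dst_mac[6];
    uint8_t src_mac[6];
};

/* Host side: build [frame header][packet header][payload] in out.
   Returns the total bytes written, or 0 if out is too small. */
size_t encapsulate(uint8_t *out, size_t cap,
                   const struct frame_header *fh,
                   const struct pkt_header *ph,
                   const void *payload)
{
    size_t total = sizeof *fh + sizeof *ph + ph->len;
    if (total > cap)
        return 0;
    memcpy(out, fh, sizeof *fh);
    memcpy(out + sizeof *fh, ph, sizeof *ph);
    memcpy(out + sizeof *fh + sizeof *ph, payload, ph->len);
    return total;
}

/* Router side: drop the old frame header and put the next LAN's
   frame header in its place. The packet (header + payload) is
   left untouched. */
size_t reframe(uint8_t *frame, size_t len, const struct frame_header *next)
{
    assert(len >= sizeof *next);
    memcpy(frame, next, sizeof *next);
    return len;
}
```

The point of the sketch is that only the outer frame header changes at each hop, while the packet header and payload travel end to end.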
H2 Global IP Internet
The most famous example of an internet, based on the TCP/IP protocol family
- TCP/IP
- IP to name stuff
- Used to be 32-bit (IPv4), e.g. 128.2.203.179
- Prefix controlled by an entity, e.g. 128.2 belongs to CMU
- 127 is relative to the local machine, e.g. 127.0.0.1 is localhost
- IPv6 introduced in 1996, 128-bit, but slowly adopted
- Always stored in big-endian (network byte order)
- Call getaddrinfo / getnameinfo to get info
- Domain Name System (DNS) maps names to IP addresses - a system on top of IP
- A database representing some giant mapping function
- Maps a human-readable name to an IP address, e.g. www.cs.cmu.edu -> 128.2.217.3
- In terminal, call nslookup localhost, hostname, …
- Mapping is not one-to-one. One name can map to multiple hosts, multiple names can map to the same IP, and some valid names may not map to any IP
- UDP (User Datagram Protocol) - unreliable datagram delivery
- viz. put in a bit, maybe receive it on the other end
- TCP (Transmission Control Protocol)
- reliable byte stream
- viz. put in a bit, the other end will receive it, in the same order
- point-to-point communication - connects a pair of processes, not a group
- full-duplex - data can flow in both directions at the same time
- Internet components
- Backbone - specialised companies
- Colocation sites
- …
- Hardware
- Client / user code <-socket interface-> TCP/IP / kernel code <-hardware interface-> network adapter / hardware, firmware
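To make the network byte order point from the IP notes above concrete, here is a small sketch using the standard `inet_pton`, `ntohl` and `htons` calls. The helper names `ipv4_bytes` and `ipv4_host_order` are made up for this example.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Parse dotted-decimal text and copy out the 4 address bytes exactly
   as they sit in memory (network byte order).
   Returns 1 on success, 0 on malformed input. */
int ipv4_bytes(const char *text, uint8_t out[4])
{
    struct in_addr addr;
    if (inet_pton(AF_INET, text, &addr) != 1)
        return 0;
    memcpy(out, &addr.s_addr, 4);   /* s_addr is already big-endian */
    return 1;
}

/* Same address as a host-order integer, e.g. for prefix arithmetic.
   The caller must pass valid dotted-decimal text. */
uint32_t ipv4_host_order(const char *text)
{
    struct in_addr addr;
    int ok = inet_pton(AF_INET, text, &addr);
    assert(ok == 1);
    (void)ok;
    return ntohl(addr.s_addr);      /* network -> host byte order */
}
```

On any host, `ipv4_bytes("128.2.203.179", b)` leaves `b` as `{128, 2, 203, 179}`, and `ipv4_host_order` gives `0x8002CBB3`. Port numbers go through `htons` / `ntohs` in the same way.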
H2 Client-Server Transaction
- Server waiting for clients to send request
- Upon request, the kernel delivers the data to the right process (usually by port number), and the server sends a response back
- Port - 16-bit integer that identifies a process so data gets sent to the right process
- Ephemeral port - assigned when making a connection request
- e.g. phone call
- Well-known port - associated with some service
- 80 http
- 21 ftp
- 22 ssh
- 443 https
- Run cat /etc/services to list them
- Socket - an endpoint of a connection
- To the program it is just an int file descriptor
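A minimal sketch of the two ideas above: a socket is just an `int`, and the kernel can hand out a port on request. It assumes that binding to port 0 makes the kernel pick a free port (true on Linux and the BSDs, typically from the ephemeral range). The name `kernel_assigned_port` is invented here.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns the port number the kernel picked, or -1 on error. */
int kernel_assigned_port(void)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);   /* just an int descriptor */
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);                   /* 0 = let the kernel choose */

    socklen_t len = sizeof addr;
    int port = -1;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0 &&
        getsockname(fd, (struct sockaddr *)&addr, &len) == 0)
        port = ntohs(addr.sin_port);            /* back to host byte order */

    close(fd);                                  /* closed like any other fd */
    return port;
}
```

A client normally never calls `bind`. The kernel assigns its ephemeral port implicitly during `connect`.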
H3 Getting Socket Address Struct
Functions
- getaddrinfo returns a list of addrinfo structs containing possible socket addresses to try
- getnameinfo
- connect
- bind
The sockaddr struct
struct sockaddr {
// protocol family
uint16_t sa_family;
// address data, which could be shorter
// than 14 bytes depending on protocol
char sa_data[14];
};
If the address family is IPv4, the same bytes are interpreted as a sockaddr_in struct (but functions still take the generic sockaddr struct)
struct sockaddr_in {
uint16_t sin_family; // protocol family is now always AF_INET
uint16_t sin_port; // port number (network byte order)
struct in_addr sin_addr; // ip address (network byte order)
unsigned char sin_zero[8]; // padding to the same size as sockaddr
};
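As a sketch of how the two structs relate, the helper below fills a `sockaddr_in` and then reads it through a generic `sockaddr` pointer, the way the socket functions do. `make_ipv4_addr` and `family_of` are hypothetical names. `inet_pton` and `htons` are the standard calls.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Fill *out from dotted-decimal text and a host-order port.
   Returns 0 on success, -1 if the IP text is invalid. */
int make_ipv4_addr(struct sockaddr_in *out, const char *ip, uint16_t port)
{
    memset(out, 0, sizeof *out);          /* also zeroes the sin_zero padding */
    out->sin_family = AF_INET;
    out->sin_port = htons(port);          /* stored in network byte order */
    if (inet_pton(AF_INET, ip, &out->sin_addr) != 1)
        return -1;
    return 0;
}

/* Socket functions only see the generic struct; the family field
   tells them how to interpret the remaining bytes. */
int family_of(const struct sockaddr *sa)
{
    return sa->sa_family;
}
```

Usage would look like `make_ipv4_addr(&a, "128.2.203.179", 80)` followed by a cast, `(struct sockaddr *)&a`, whenever the address is passed to connect or bind.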
Now the addrinfo struct returned by getaddrinfo, essentially a node in a linked list of possible addresses.
struct addrinfo {
int ai_flags;
int ai_family; // args to socket function
int ai_socktype; // args to socket function
int ai_protocol; // args to socket function
char *ai_canonname; // canonical host name
size_t ai_addrlen; // size of ai_addr struct
struct sockaddr *ai_addr; // pointer to socket addr struct
struct addrinfo *ai_next; // the next node
};
getnameinfo is the opposite of getaddrinfo: it converts a socket address back into host and service strings
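A hedged sketch of the round trip: `getaddrinfo` turns host and service strings into a list of `addrinfo` nodes, and `getnameinfo` turns the first node's address back into strings. `resolve_first` is a made-up helper name. The hints chosen here (TCP only, any family) are one reasonable choice, not the only one.

```c
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Resolve host/service, then write the first result back out as
   numeric strings. Returns the number of list nodes, or -1 on error. */
int resolve_first(const char *host, const char *service,
                  char *ip, size_t iplen, char *port, size_t portlen)
{
    struct addrinfo hints, *list, *p;
    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;       /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;   /* TCP only, one node per address */

    int rc = getaddrinfo(host, service, &hints, &list);
    if (rc != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(rc));
        return -1;
    }

    int count = 0;
    for (p = list; p != NULL; p = p->ai_next)   /* walk via ai_next */
        count++;

    /* Opposite direction: socket address -> printable strings. */
    rc = getnameinfo(list->ai_addr, list->ai_addrlen,
                     ip, iplen, port, portlen,
                     NI_NUMERICHOST | NI_NUMERICSERV);

    freeaddrinfo(list);                /* the list is heap-allocated */
    return rc == 0 ? count : -1;
}
```

Passing a host name instead of a numeric address triggers a DNS lookup, so the number of nodes depends on the environment.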
H3 Client-server connection procedure
H3 Inside connect and listen
- Client connect
- Call getaddrinfo, get back a list of possible socket addresses
- Create a socket descriptor, which acts like a file descriptor
int socket(int domain, int type, int protocol);
int clientfd = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
- connect to connect (SA is shorthand for struct sockaddr)
int connect(int clientfd, SA *addr, socklen_t addrlen);
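The client steps above can be put together roughly as follows. This is a sketch in the spirit of the `open_clientfd` helper from the CS:APP textbook, not a copy of it, and the error handling is deliberately minimal.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Try every address getaddrinfo returns until one connects.
   Returns a connected descriptor, or -1 if none worked. */
int open_clientfd(const char *host, const char *port)
{
    struct addrinfo hints, *list, *p;
    int clientfd = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_socktype = SOCK_STREAM;            /* TCP */
    if (getaddrinfo(host, port, &hints, &list) != 0)
        return -1;

    for (p = list; p != NULL; p = p->ai_next) {
        clientfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (clientfd < 0)
            continue;                           /* try the next address */
        if (connect(clientfd, p->ai_addr, p->ai_addrlen) == 0)
            break;                              /* connected */
        close(clientfd);                        /* failed, clean up */
        clientfd = -1;
    }

    freeaddrinfo(list);
    return clientfd;
}
```

After a successful return, the descriptor can be used with read and write like an ordinary file descriptor.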
- Server sets up so that it can receive connections
- Same start as the client (up until creating the socket)
- bind to associate the socket with an address
int bind(int sockfd, SA *addr, socklen_t addrlen);
- listen to actually listen
int listen(int sockfd, int backlog);
where backlog is a hint for the number of pending requests the kernel queues (the kernel may not honor it exactly)
- accept to wait for a connection
int accept(int listenfd, SA *addr, socklen_t *addrlen);
which returns a descriptor unique to that connection, usually called connfd
- The descriptors
- listening descriptor - one per server
- connected descriptor - one for each connection
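The server side and the two kinds of descriptor can be sketched as below. `open_listenfd` loosely follows the CS:APP helper of the same name. `serve_one`, the backlog of 16, the `SO_REUSEADDR` option, and the one-shot echo behaviour are assumptions made for this example.

```c
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns a listening descriptor bound to port, or -1 on failure. */
int open_listenfd(const char *port)
{
    struct addrinfo hints, *list, *p;
    int listenfd = -1, on = 1;

    memset(&hints, 0, sizeof hints);
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_PASSIVE;                /* wildcard address for bind */
    if (getaddrinfo(NULL, port, &hints, &list) != 0)
        return -1;

    for (p = list; p != NULL; p = p->ai_next) {
        listenfd = socket(p->ai_family, p->ai_socktype, p->ai_protocol);
        if (listenfd < 0)
            continue;
        /* Lets the server restart without waiting for old state to expire. */
        setsockopt(listenfd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof on);
        if (bind(listenfd, p->ai_addr, p->ai_addrlen) == 0 &&
            listen(listenfd, 16) == 0)
            break;                              /* bound and listening */
        close(listenfd);
        listenfd = -1;
    }

    freeaddrinfo(list);
    return listenfd;
}

/* Accept one connection and echo back whatever one read returns.
   Returns the number of bytes echoed, or -1 on error. */
ssize_t serve_one(int listenfd)
{
    struct sockaddr_storage peer;               /* big enough for any family */
    socklen_t len = sizeof peer;
    int connfd = accept(listenfd, (struct sockaddr *)&peer, &len);
    if (connfd < 0)
        return -1;

    char buf[256];
    ssize_t n = read(connfd, buf, sizeof buf);
    if (n > 0)
        n = write(connfd, buf, (size_t)n);
    close(connfd);                              /* listenfd stays open */
    return n;
}
```

`listenfd` lives for the whole life of the server, while each call to accept yields a fresh `connfd` that is closed once that client has been served.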
H2 HTTP
Protocol on top of TCP
- Establish TCP connection
- Request content
- Respond with content, tagged with a MIME (Multipurpose Internet Mail Extensions) type, e.g. text/html
- Close connection
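To show what travels over the TCP connection in the steps above, here is a small sketch that builds a request and reads the status code from a response. It assumes HTTP/1.0-style text messages. `build_get_request` and `parse_status` are hypothetical helper names.

```c
#include <stdio.h>

/* Write a minimal GET request for path on host into buf.
   Returns the request length, or -1 if buf is too small. */
int build_get_request(char *buf, size_t cap, const char *host, const char *path)
{
    int n = snprintf(buf, cap,
                     "GET %s HTTP/1.0\r\n"     /* request line */
                     "Host: %s\r\n"            /* header */
                     "\r\n",                   /* empty line ends the headers */
                     path, host);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Extract the numeric status from a response's first line,
   e.g. "HTTP/1.0 200 OK". Returns -1 if it does not parse. */
int parse_status(const char *response)
{
    int code;
    return sscanf(response, "HTTP/%*d.%*d %d", &code) == 1 ? code : -1;
}
```

The request string would be written to a connected socket descriptor. The response comes back on the same descriptor, with a Content-Type header carrying the MIME type.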