Chapter 3
Transport Layer
Computer Networking: A Top Down Approach Featuring the Internet, 3rd edition.
Jim Kurose, Keith Ross. Addison-Wesley, July 2004.
Chapter 3: Transport Layer
Our goals:
understand principles behind transport layer services:
multiplexing/demultiplexing
reliable data transfer
flow control
congestion control
learn about transport layer protocols in the Internet:
UDP: connectionless transport
TCP: connection-oriented transport
TCP congestion control
Chapter 3 outline
3.1 Transport-layer services
3.2 Multiplexing and demultiplexing
3.3 Connectionless transport: UDP
3.4 Principles of reliable data transfer
3.5 Connection-oriented transport: TCP
segment structure
reliable data transfer
flow control
connection management
3.6 Principles of congestion control
3.7 TCP congestion control
Transport services and protocols
provide logical communication between app processes running on different hosts
transport protocols run in end systems
send side: breaks app messages into segments, passes to network layer
rcv side: reassembles segments into messages, passes to app layer
more than one transport protocol available to apps
[Figure: logical end-end transport between two end systems (application, transport, network, data link, physical layers), across intermediate nodes that implement only the network, data link, and physical layers]
Transport vs. network layer
network layer: logical communication between hosts
transport layer: logical communication between processes
relies on, and enhances, network layer services
Household analogy:
12 kids sending letters to 12 kids
processes = kids
app messages = letters in envelopes
hosts = houses
transport protocol = Ann and Bill (the kids who collect and distribute the mail within each house)
network-layer protocol = postal service
Internet transport-layer protocols
reliable, in-order delivery (TCP)
congestion control
flow control
connection setup
unreliable, unordered delivery: UDP
no-frills extension of “best-effort” IP
services not available:
delay guarantees
Multiplexing/demultiplexing
[Figure: three hosts, each with application, transport, network, link, and physical layers; processes (P1 to P4) attach to the transport layer through sockets]
Demultiplexing at rcv host: delivering received segments to correct socket
Multiplexing at send host: gathering data from multiple sockets, enveloping data with header (later used for demultiplexing)
How demultiplexing works
host receives IP datagrams
each datagram has source IP address, destination IP address
each datagram carries 1 transport-layer segment
each segment has source, destination port number (recall: well-known port numbers for specific applications)
host uses IP addresses and port numbers to direct segment to appropriate socket
TCP/UDP segment format (each row 32 bits wide):
source port # | dest port #
other header fields
application data (message)
Connectionless demultiplexing
Create sockets with port numbers:
// port numbers are 16 bits, so they must lie in the range 0-65535
DatagramSocket mySocket1 = new DatagramSocket(9111);
DatagramSocket mySocket2 = new DatagramSocket(9222);
UDP socket identified by two-tuple: (dest IP address, dest port number)
When host receives UDP segment:
checks destination port number in segment
directs UDP segment to socket with that port number
IP datagrams with different source IP addresses and/or source port numbers are directed to the same socket
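The lookup described above can be sketched as a table keyed by destination port alone. This is a minimal illustration, not how an operating system implements it: the class `UdpDemux`, its methods, and the socket names are hypothetical, and the destination IP half of the two-tuple is assumed to be the receiving host's own address.

```java
import java.util.HashMap;
import java.util.Map;

class UdpDemux {
    // Hypothetical socket table: destination port -> name of the socket bound to it.
    private final Map<Integer, String> socketsByPort = new HashMap<>();

    void bind(int port, String socketName) {
        if (port < 0 || port > 65535) {
            throw new IllegalArgumentException("port out of range: " + port);
        }
        socketsByPort.put(port, socketName);
    }

    // Source IP and source port are accepted but deliberately ignored:
    // only the destination port selects the socket.
    String demux(String srcIp, int srcPort, int destPort) {
        return socketsByPort.get(destPort); // null if no socket is bound to that port
    }
}
```

Segments from different senders that carry the same destination port come back with the same socket name, which is the behaviour the slide describes.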
Connectionless demux (cont)
DatagramSocket serverSocket = new DatagramSocket(6428);
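[Figure: two clients (IP: A and IP: B) exchange UDP segments with the server (IP: C) socket on port 6428]
one client sends SP: 9157, DP: 6428; server replies SP: 6428, DP: 9157
other client sends SP: 5775, DP: 6428; server replies SP: 6428, DP: 5775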
SP provides “return address”
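A small sketch of the "return address" idea: the reply is addressed to whatever source address and port the received datagram carried. `ReplyBuilder`, `replyTo`, and `fakeReceived` are hypothetical names; `fakeReceived` only stands in for a packet that `DatagramSocket.receive()` would normally fill in, so that nothing is sent on a real network.

```java
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.UnknownHostException;

class ReplyBuilder {
    // Build a reply addressed to the sender of `received`:
    // the received source IP/port become the reply's destination IP/port.
    static DatagramPacket replyTo(DatagramPacket received, byte[] payload) {
        return new DatagramPacket(payload, payload.length,
                received.getAddress(), received.getPort());
    }

    // Illustration only: a packet that looks as if it arrived from 10.0.0.1:srcPort.
    static DatagramPacket fakeReceived(int srcPort) {
        try {
            InetAddress client = InetAddress.getByAddress(new byte[] {10, 0, 0, 1});
            byte[] data = {1, 2, 3};
            return new DatagramPacket(data, data.length, client, srcPort);
        } catch (UnknownHostException e) {
            throw new IllegalStateException(e); // not expected for a 4-byte address
        }
    }
}
```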
Connection-oriented demux
TCP socket identified by 4-tuple:
source IP address
source port number
dest IP address
dest port number
recv host uses all four values to direct segment to appropriate socket
Server host may support many simultaneous TCP sockets:
each socket identified by its own 4-tuple
Web servers have different sockets for each connecting client
non-persistent HTTP will have a different socket for each request
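By contrast with UDP, a TCP lookup uses the full 4-tuple as its key. The sketch below is illustrative only: `TcpDemux` and the socket names are hypothetical, and IP addresses are plain strings for brevity.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TcpDemux {
    // Key is the 4-tuple (source IP, source port, dest IP, dest port).
    private final Map<List<Object>, String> sockets = new HashMap<>();

    private static List<Object> key(String srcIp, int srcPort, String dstIp, int dstPort) {
        return Arrays.<Object>asList(srcIp, srcPort, dstIp, dstPort);
    }

    void register(String srcIp, int srcPort, String dstIp, int dstPort, String socketName) {
        sockets.put(key(srcIp, srcPort, dstIp, dstPort), socketName);
    }

    // All four values take part in the lookup, so two clients that both
    // send to port 80 still reach different sockets.
    String demux(String srcIp, int srcPort, String dstIp, int dstPort) {
        return sockets.get(key(srcIp, srcPort, dstIp, dstPort));
    }
}
```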
Connection-oriented demux (cont)
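[Figure: clients (IP: A and IP: B) hold TCP connections to a Web server (IP: C, port 80); processes P1 to P6 each own one socket]
segment with S-IP: A, SP: 9157, D-IP: C, DP: 80
segment with S-IP: B, SP: 9157, D-IP: C, DP: 80
segment with S-IP: B, SP: 5775, D-IP: C, DP: 80
each segment is directed to a different server socket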
UDP: User Datagram Protocol [RFC 768]
“no frills,” “bare bones” Internet transport protocol
“best effort” service, UDP segments may be:
lost
delivered out of order to app
connectionless:
no handshaking between UDP sender, receiver
each UDP segment handled independently of others
Why is there a UDP?
no connection establishment (which can add delay)
simple: no connection state at sender, receiver
small segment header
no congestion control: UDP can blast away as fast as desired
UDP: more
often used for streaming multimedia apps
loss tolerant
rate sensitive
other UDP uses
DNS
SNMP
reliable transfer over UDP:
add reliability at application layer
application-specific error recovery!
UDP segment format (each row 32 bits wide):
source port # | dest port #
length | checksum
application data (message)
length: in bytes, of UDP segment, including header
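The layout above can be sketched with a `ByteBuffer`. This is an illustration under stated assumptions: `UdpHeader`, `encode`, and `field` are hypothetical names, the four header fields are taken to be 16 bits each (8 header bytes in total, as in RFC 768), and the checksum is passed in rather than computed.

```java
import java.nio.ByteBuffer;

class UdpHeader {
    static final int HEADER_BYTES = 8; // four 16-bit fields

    // Encode source port, dest port, length, and checksum, followed by the payload.
    static byte[] encode(int srcPort, int dstPort, int checksum, byte[] payload) {
        int length = HEADER_BYTES + payload.length;   // length covers header + data
        ByteBuffer buf = ByteBuffer.allocate(length); // big-endian (network order) by default
        buf.putShort((short) srcPort);
        buf.putShort((short) dstPort);
        buf.putShort((short) length);
        buf.putShort((short) checksum);
        buf.put(payload);
        return buf.array();
    }

    // Read an unsigned 16-bit field at the given byte offset.
    static int field(byte[] segment, int offset) {
        return ByteBuffer.wrap(segment).getShort(offset) & 0xFFFF;
    }
}
```

For a 4-byte payload the length field reads 12, since the header itself is counted.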
UDP checksum
Goal: detect “errors” (e.g., flipped bits) in transmitted segment
Sender:
treat segment contents as sequence of 16-bit integers
checksum: 1’s complement of the 1’s complement sum of segment contents
sender puts checksum value into UDP checksum field
Receiver:
compute checksum of received segment
check if computed checksum equals checksum field value:
NO - error detected
YES - no error detected (but undetected errors are still possible)
Internet Checksum Example
Note
When adding numbers, a carryout from the most significant bit needs to be added to the result
Example: add two 16-bit integers
                1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
                1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
wraparound    1 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
sum             1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
checksum        0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
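The wraparound rule can be sketched in a few lines. The example adds the two 16-bit words 1110011001100110 (0xE666) and 1101010101010101 (0xD555); the class and method names are hypothetical, and a real UDP checksum also covers a pseudo-header, which this sketch leaves out.

```java
class InternetChecksum {
    // 1's complement sum of 16-bit words: any carry out of bit 15 is added back in.
    static int onesComplementSum(int[] words) {
        int sum = 0;
        for (int w : words) {
            sum += w & 0xFFFF;
            if (sum > 0xFFFF) {
                sum = (sum & 0xFFFF) + 1; // wrap the carry-out around
            }
        }
        return sum;
    }

    // Checksum field value: the 1's complement (bitwise inverse) of the sum.
    static int checksum(int[] words) {
        return ~onesComplementSum(words) & 0xFFFF;
    }

    // Receiver side: recompute and compare with the value carried in the segment.
    static boolean verify(int[] words, int receivedChecksum) {
        return checksum(words) == (receivedChecksum & 0xFFFF);
    }
}
```

With these inputs the raw sum is 0x1BBBB, the wrapped sum is 0xBBBC, and the checksum is 0x4443; flipping one bit in either word makes `verify` return false.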
Principles of Reliable Data Transfer
important in app., transport, link layers
top-10 list of important networking topics!
Reliable data transfer: getting started
send side:
rdt_send(): called from above (e.g., by app.). Passed data to deliver to receiver upper layer
udt_send(): called by rdt, to transfer packet over unreliable channel to receiver
receive side:
rdt_rcv(): called when packet arrives on rcv-side of channel
deliver_data(): called by rdt to deliver data to upper layer
Reliable data transfer: getting started
We’ll:
incrementally develop sender, receiver sides of reliable data transfer protocol (rdt)
consider only unidirectional data transfer
but control info will flow in both directions!
use finite state machines (FSM) to specify sender, receiver
FSM notation: each arrow from one state to the next is labeled "event / actions"
event: event causing state transition
actions: actions taken on state transition
Rdt1.0: reliable transfer over a reliable channel
underlying channel perfectly reliable
no bit errors
no loss of packets
separate FSMs for sender, receiver:
sender sends data into underlying channel
receiver reads data from underlying channel
sender FSM:
state: Wait for call from above
rdt_send(data) / packet = make_pkt(data); udt_send(packet) -> Wait for call from above
receiver FSM:
state: Wait for call from below
rdt_rcv(packet) / extract(packet,data); deliver_data(data) -> Wait for call from below
Rdt2.0: channel with bit errors
underlying channel may flip bits in packet
checksum to detect bit errors
the question: how to recover from errors:
acknowledgements (ACKs): receiver explicitly tells sender that pkt received OK
negative acknowledgements (NAKs): receiver explicitly tells sender that pkt had errors
sender retransmits pkt on receipt of NAK
new mechanisms in rdt2.0 (beyond rdt1.0):
error detection
receiver feedback: control msgs (ACK,NAK) rcvr->sender
rdt2.0: FSM specification
sender FSM:
state: Wait for call from above
rdt_send(data) / sndpkt = make_pkt(data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK
state: Wait for ACK or NAK
rdt_rcv(rcvpkt) && isNAK(rcvpkt) / udt_send(sndpkt) -> Wait for ACK or NAK
rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call from above
receiver FSM:
state: Wait for call from below
rdt_rcv(rcvpkt) && corrupt(rcvpkt) / udt_send(NAK) -> Wait for call from below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt,data); deliver_data(data); udt_send(ACK) -> Wait for call from below
stop and wait: sender sends one packet, then waits for receiver response
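The rdt2.0 sender can be sketched as an event-driven state machine. This is a simplified illustration, not the slides' specification: `Rdt20Sender` is a hypothetical class, the `channel` list stands in for udt_send(), packets are plain strings, and the checksum is omitted.

```java
import java.util.ArrayList;
import java.util.List;

class Rdt20Sender {
    enum State { WAIT_FOR_CALL, WAIT_FOR_ACK_OR_NAK }

    State state = State.WAIT_FOR_CALL;
    String sndpkt;                                  // last packet sent, kept for retransmission
    final List<String> channel = new ArrayList<>(); // stand-in for udt_send()

    // rdt_send(data): accepted only while waiting for a call from above.
    boolean rdtSend(String data) {
        if (state != State.WAIT_FOR_CALL) {
            return false;                           // stop and wait: one packet at a time
        }
        sndpkt = data;
        channel.add(sndpkt);
        state = State.WAIT_FOR_ACK_OR_NAK;
        return true;
    }

    // rdt_rcv(rcvpkt) carrying receiver feedback: true = ACK, false = NAK.
    void rdtRcv(boolean isAck) {
        if (state != State.WAIT_FOR_ACK_OR_NAK) {
            return;
        }
        if (isAck) {
            state = State.WAIT_FOR_CALL;            // action Λ: nothing to send
        } else {
            channel.add(sndpkt);                    // NAK: retransmit the same packet
        }
    }
}
```

A second `rdtSend` call is refused until an ACK arrives, which is the stop-and-wait behaviour.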
rdt2.0: operation with no errors
[Figure: the rdt2.0 FSMs with the error-free path highlighted]
sender: rdt_send(data) / sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
receiver: rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) / extract(rcvpkt,data); deliver_data(data); udt_send(ACK)
sender: rdt_rcv(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call from above
rdt2.0: error scenario
[Figure: the rdt2.0 FSMs with the error path highlighted]
sender: rdt_send(data) / sndpkt = make_pkt(data, checksum); udt_send(sndpkt)
receiver: rdt_rcv(rcvpkt) && corrupt(rcvpkt) / udt_send(NAK)
sender: rdt_rcv(rcvpkt) && isNAK(rcvpkt) / udt_send(sndpkt) -> Wait for ACK or NAK
rdt2.0 has a fatal flaw!
What happens if ACK/NAK is corrupted?
sender doesn’t know what happened at receiver!
can’t just retransmit: possible duplicate
Handling duplicates:
sender adds sequence number to each pkt
sender retransmits current pkt if ACK/NAK garbled
receiver discards (doesn’t deliver up) duplicate pkt
rdt2.1: sender, handles garbled ACK/NAKs
state: Wait for call 0 from above
rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 0
state: Wait for ACK or NAK 0
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) / udt_send(sndpkt) -> Wait for ACK or NAK 0
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call 1 from above
state: Wait for call 1 from above
rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt) -> Wait for ACK or NAK 1
state: Wait for ACK or NAK 1
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isNAK(rcvpkt) ) / udt_send(sndpkt) -> Wait for ACK or NAK 1
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt) / Λ -> Wait for call 0 from above
rdt2.1: receiver, handles garbled ACK/NAKs
state: Wait for 0 from below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 1 from below
rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt) -> Wait for 0 from below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 0 from below
state: Wait for 1 from below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 0 from below
rdt_rcv(rcvpkt) && corrupt(rcvpkt) / sndpkt = make_pkt(NAK, chksum); udt_send(sndpkt) -> Wait for 1 from below
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq0(rcvpkt) / sndpkt = make_pkt(ACK, chksum); udt_send(sndpkt) -> Wait for 1 from below
rdt2.1: discussion
Sender:
seq # added to pkt
two seq. #’s (0,1) will suffice. Why?
must check if received ACK/NAK corrupted
twice as many states
state must “remember” whether “current” pkt has 0 or 1 seq. #
Receiver:
must check if received packet is duplicate
state indicates whether 0 or 1 is expected pkt seq #
note: receiver cannot know if its last ACK/NAK was received OK at sender
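The receiver's duplicate handling can be sketched with a single expected-sequence-number field. `Rdt21Receiver` is a hypothetical class: corruption is passed in as a flag instead of being derived from a checksum, the feedback is returned as a string, and `delivered` stands in for deliver_data().

```java
import java.util.ArrayList;
import java.util.List;

class Rdt21Receiver {
    int expectedSeq = 0;                              // state: waiting for 0 or for 1 from below
    final List<String> delivered = new ArrayList<>(); // stand-in for deliver_data()

    // Returns the feedback to send back: "ACK" or "NAK".
    String rdtRcv(boolean corrupt, int seq, String data) {
        if (corrupt) {
            return "NAK";
        }
        if (seq == expectedSeq) {
            delivered.add(data);
            expectedSeq = 1 - expectedSeq;            // alternate between 0 and 1
        }
        // A duplicate (unexpected seq #) is ACKed again but not delivered.
        return "ACK";
    }
}
```

Two sequence numbers suffice here because, under stop and wait, the receiver only has to tell a new packet from a retransmission of the previous one.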
rdt2.2: a NAK-free protocol
same functionality as rdt2.1, using ACKs only
instead of NAK, receiver sends ACK for last pkt received OK
receiver must explicitly include seq # of pkt being ACKed
duplicate ACK at sender results in same action as NAK: retransmit current pkt
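A sketch of the NAK-free idea: the receiver always answers with the sequence number of the last packet received OK, and the sender treats an ACK for the wrong number like a NAK. `Rdt22Receiver` and `senderMustRetransmit` are hypothetical names, and the initial value of `lastAcked` is an assumption made so that a corrupt first packet triggers a retransmission.

```java
class Rdt22Receiver {
    int expectedSeq = 0;
    int lastAcked = 1;      // assumption: before any packet arrives, act as if seq 1 was last OK
    int deliveredCount = 0; // counts calls that would reach deliver_data()

    // Returns the sequence number carried in the ACK that is sent back.
    int rdtRcv(boolean corrupt, int seq) {
        if (!corrupt && seq == expectedSeq) {
            deliveredCount++;
            lastAcked = seq;
            expectedSeq = 1 - expectedSeq;
        }
        return lastAcked;   // corrupt or duplicate: repeat the ACK for the last pkt received OK
    }

    // Sender side: a corrupt ACK or an ACK for the other seq # means "retransmit current pkt".
    static boolean senderMustRetransmit(int currentSeq, boolean ackCorrupt, int ackSeq) {
        return ackCorrupt || ackSeq != currentSeq;
    }
}
```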
rdt2.2: sender, receiver fragments
sender FSM fragment:
state: Wait for call 0 from above
rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt) -> Wait for ACK 0
state: Wait for ACK 0
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ) / udt_send(sndpkt) -> Wait for ACK 0
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0) / Λ -> next state
receiver FSM fragment:
state: Wait for 0 from below
rdt_rcv(rcvpkt) && (corrupt(rcvpkt) || has_seq1(rcvpkt)) / udt_send(sndpkt) -> Wait for 0 from below
transition into Wait for 0 from below:
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && has_seq1(rcvpkt) / extract(rcvpkt,data); deliver_data(data); sndpkt = make_pkt(ACK1, chksum); udt_send(sndpkt)
rdt3.0: channels with errors and loss
New assumption:
underlying channel can also lose packets (data or ACKs)
checksum, seq. #, ACKs, retransmissions will be of help, but not enough
Approach: sender waits “reasonable” amount of time for ACK
retransmits if no ACK received in this time
if pkt (or ACK) just delayed (not lost):
retransmission will be duplicate, but use of seq. #’s already handles this
receiver must specify seq # of pkt being ACKed
rdt3.0 sender
state: Wait for call 0 from above
rdt_send(data) / sndpkt = make_pkt(0, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK0
state: Wait for ACK0
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,1) ) / Λ -> Wait for ACK0
timeout / udt_send(sndpkt); start_timer -> Wait for ACK0
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,0) / stop_timer -> Wait for call 1 from above
state: Wait for call 1 from above
rdt_send(data) / sndpkt = make_pkt(1, data, checksum); udt_send(sndpkt); start_timer -> Wait for ACK1
state: Wait for ACK1
rdt_rcv(rcvpkt) && ( corrupt(rcvpkt) || isACK(rcvpkt,0) ) / Λ -> Wait for ACK1
timeout / udt_send(sndpkt); start_timer -> Wait for ACK1
rdt_rcv(rcvpkt) && notcorrupt(rcvpkt) && isACK(rcvpkt,1) / stop_timer -> Wait for call 0 from above
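The rdt3.0 sender can be sketched by folding the four states into a sequence number plus a waiting flag. This is an illustration only: `Rdt30Sender` is a hypothetical class, the timer is modelled as a boolean with `timeout()` called by the test rather than by a real clock, and packets are strings of the form "seq:data".

```java
import java.util.ArrayList;
import java.util.List;

class Rdt30Sender {
    int seq = 0;                    // seq # of the current packet (0 or 1)
    boolean waitingForAck = false;  // false: Wait for call; true: Wait for ACK
    boolean timerRunning = false;
    String sndpkt;
    final List<String> channel = new ArrayList<>(); // stand-in for udt_send()

    boolean rdtSend(String data) {
        if (waitingForAck) {
            return false;
        }
        sndpkt = seq + ":" + data;
        channel.add(sndpkt);
        timerRunning = true;        // start_timer
        waitingForAck = true;
        return true;
    }

    void rdtRcv(boolean corrupt, int ackSeq) {
        if (!waitingForAck || corrupt || ackSeq != seq) {
            return;                 // action Λ: ignore and let the timer expire
        }
        timerRunning = false;       // stop_timer
        waitingForAck = false;
        seq = 1 - seq;
    }

    void timeout() {
        if (!waitingForAck) {
            return;
        }
        channel.add(sndpkt);        // retransmit the current packet
        timerRunning = true;        // start_timer
    }
}
```

Unlike rdt2.2, a corrupt or wrong-numbered ACK does not trigger an immediate retransmission here; only the timeout does.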
rdt3.0 in action
Performance of rdt3.0
rdt3.0 works, but performance stinks
example: 1 Gbps link, 15 ms e-e prop. delay, 1KB packet:
$T_{transmit} = \frac{L}{R} = \frac{8\ \text{kb/pkt}}{10^{9}\ \text{b/sec}} = 8\ \text{microsec}$
L = packet length in bits, R = transmission rate in bps
U sender: utilization, the fraction of time sender is busy sending
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$ (times in msec, RTT = 30 msec)
1KB pkt every 30 msec -> 33kB/sec thruput over 1 Gbps link
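The arithmetic can be checked with a short sketch. `StopAndWait` and its method names are hypothetical; the inputs are the slide's own numbers (L = 8000 bits, R = 10^9 bps, RTT = 30 msec, i.e. twice the 15 msec propagation delay).

```java
class StopAndWait {
    // Transmission time L/R in seconds.
    static double transmitTime(double bits, double bitsPerSec) {
        return bits / bitsPerSec;
    }

    // Sender utilization: (L/R) / (RTT + L/R).
    static double utilization(double bits, double bitsPerSec, double rttSec) {
        double t = transmitTime(bits, bitsPerSec);
        return t / (rttSec + t);
    }

    public static void main(String[] args) {
        System.out.println(transmitTime(8000, 1e9));       // transmission time in seconds
        System.out.println(utilization(8000, 1e9, 0.030)); // fraction of time the sender is busy
    }
}
```

The sender is busy for 8 microsec out of every 30.008 msec, which is why the link carries only about 33 kB/sec.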
rdt3.0: stop-and-wait operation
[Figure: sender/receiver timeline]
t = 0: first packet bit transmitted
t = L / R: last packet bit transmitted
first packet bit arrives at receiver
last packet bit arrives at receiver, receiver sends ACK
t = RTT + L / R: ACK arrives, sender sends next packet
$U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008}{30.008} = 0.00027$