Network Protocols:
Design and Analysis
Polly Huang EE NTU
http://cc.ee.ntu.edu.tw/~phuang [email protected]
TCP Papers
[Jacobson88a]
[Fall96a]
Key ideas
• [Jacobson88a]:
– implementation of transport layer TCP
– theory behind TCP congestion control: packet conservation
– congestion control algorithm and how it relates to other TCP algorithms
• [Fall96a]:
– benefits of SACK (selective acknowledgements)
– compares several loss recovery algorithms
Agenda
• connection setup and teardown
• flow control
• congestion control theory
• congestion control practice (in TCP)
– slow start
– congestion avoidance
• loss recovery
• putting it together
• security
• performance
TCP Congestion Control
• two mechanisms:
– slow start
– congestion avoidance
• interacts very closely with loss repair:
– good retransmit timeout (RTO) estimation
– fast retransmit and recovery
– why?
• packet loss is the signal for congestion
• packet loss recovery can cause redundant work during times of congestion
• but need to recover from loss reasonably quickly or goodput drops to zero
TCP Congestion Principles
• underlying principle: packet conservation
– at equilibrium, inject packet into network only when one is removed
– basis for stability of physical systems
• components:
– how to get there: slow start
TCP Congestion Control Mechanisms
• new congestion window cwnd
– what the network can handle
– vs. flow control window (wnd): what the other end can handle
• sender limits tx
– min (wnd, cwnd)
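A minimal sketch of the rule above, counting windows in segments. The function name and the `in_flight` bookkeeping are illustrative assumptions, not part of the slides:

```python
def usable_window(wnd: int, cwnd: int, in_flight: int) -> int:
    """Segments the sender may still transmit right now.

    wnd:       flow control window advertised by the receiver
    cwnd:      congestion window kept by the sender
    in_flight: segments sent but not yet ACKed (hypothetical bookkeeping)
    """
    limit = min(wnd, cwnd)           # the tighter of the two windows wins
    return max(0, limit - in_flight)
```

With wnd = 10, cwnd = 4 and one segment in flight, the network (cwnd) is the limit and three more segments may go out.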
TCP Self-clocking
(Figure redrawn from [Jacobson88a]: sender and receiver joined through a bottleneck, with labels Pb, Pr, Ar, Ab, As. Pb: packet spacing at bottleneck; Pr: packet spacing at receiver; As: ACK spacing at sender.)
• depends on ACK stream to keep packets flowing
Slow Start
• How do we get the ACK clock started?
– Initialize cwnd = 1
– Upon receipt of every ACK, cwnd = cwnd + 1
• Implications
– how much in each RTT? increases multiplicatively (doubles each RTT)
– Will overshoot window and cause packet loss (but remember, packet loss is part of the plan)
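The doubling can be checked with a small loss-free simulation. This is a sketch that assumes one ACK per segment (no delayed ACKs); the function name is illustrative:

```python
def slow_start_rounds(rounds: int, initial_cwnd: int = 1) -> list:
    """Return cwnd (in segments) at the start of each RTT, assuming no loss."""
    cwnd = initial_cwnd
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        acks = cwnd              # every segment sent this RTT is ACKed once
        for _ in range(acks):
            cwnd += 1            # cwnd = cwnd + 1 on each ACK
    return history


print(slow_start_rounds(4))      # [1, 2, 4, 8]
```

Adding one segment per ACK is additive per ACK but multiplicative per RTT, because the number of ACKs in an RTT equals the window.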
Slow Start Example
(Figure redrawn from [Jacobson88a] Fig 2: packets sent in each RTT during slow start. 0R: packet 1; 1R: packets 2-3; 2R: packets 4-7; 3R: packets 8-15.)
Slow Start Time-Sequence Plot
(Figure: time-sequence plot; x-axis: time, y-axis: data (KB))
When to End Slow-Start?
• Want to end when the pipe is full
– do end when cwnd > ssthresh
– start with large ssthresh, but then refine it
• On packet loss
– ssthresh = cwnd / 2 (using cwnd at the time of the loss)
– then cwnd = 1 and go back to slow start
• assume that pipe size was somewhere between last good window (cwnd/2) and current window (cwnd)
• Eventually, ssthresh is right and transition to congestion avoidance
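A sketch of the loss reaction and the slow-start exit test described above. Function names are illustrative, and windows are counted in segments:

```python
def on_packet_loss(cwnd: float) -> tuple:
    """Return (new_cwnd, new_ssthresh) after a detected loss."""
    ssthresh = cwnd / 2      # halve first, using the window at the time of loss
    return 1, ssthresh       # then restart slow start from one segment


def in_slow_start(cwnd: float, ssthresh: float) -> bool:
    """Slow start ends once cwnd > ssthresh, as stated on the slide."""
    return cwnd <= ssthresh
```

For a loss at cwnd = 16 this yields cwnd = 1 and ssthresh = 8, so the next slow start stops doubling once it passes 8 segments.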
Congestion Avoidance
• upon receiving ACK
– Increase cwnd by 1/cwnd
– This is additive increase (over 1 RTT it adds up to increasing by 1 segment)
• why not multiplicative increase?
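To see why adding 1/cwnd per ACK amounts to about one segment per RTT, the sketch below applies the rule for one window's worth of ACKs. It assumes one ACK per segment; the function name is illustrative:

```python
def congestion_avoidance_rtt(cwnd: float) -> float:
    """Apply cwnd += 1/cwnd once per ACK for one RTT's worth of ACKs."""
    acks = int(cwnd)             # one ACK per segment in the current window
    for _ in range(acks):
        cwnd += 1.0 / cwnd       # additive increase, spread over the RTT
    return cwnd
```

Starting from cwnd = 10, the result is slightly below 11, because cwnd grows a little during the RTT and each later increment is marginally smaller.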
Congestion Window
(Figure: congestion window over time)
Problems So Far
• have way to fill pipe (slow start)
• have way to run at equilibrium (congestion avoidance)
• but tough transition
– no good initial ssthresh
– large ssthresh causes packet loss, every time
need approaches to quickly recover from packet loss (or explicit signal of congestion)
Agenda
• connection setup and teardown
• flow control
• congestion control
• loss recovery
• security
TCP Loss Recovery
• timeout and retransmit
• fast retransmit
• fast recovery
• New-Reno partial ACKs
• SACK
Fallback Mechanism: Timeout
• retransmission timer (RTO)
– if the RTO fires with no ACK received, reset cwnd and resend lowest unACK’ed segment
• but timeouts are very crude
– completely stop the ACK clock
– force slow-start again
– are often slow: a long time with no traffic
Digression: RTO Calculation
• Must estimate RTO
– don’t know it at start
– may change due to congestion or path change
• But need a good estimate
– too low => unnecessary retransmits
Initial Round-trip Estimator
Round trip times exponentially averaged:
• New RTT = α (old RTT) + (1 - α) (new sample)
• Recommended value for α: 0.8 - 0.9
• Retransmit timer set to β RTT, where β = 2
• On every RTO expiration, increase the timer (back off)
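A minimal sketch of this estimator, with α as the smoothing gain and β as the timer multiplier. The class name is illustrative, and seeding the average with the first sample is an assumption the slide does not state:

```python
class SimpleRttEstimator:
    """Exponentially averaged RTT with a fixed timer multiplier."""

    def __init__(self, first_sample: float, alpha: float = 0.9, beta: float = 2.0):
        self.alpha = alpha
        self.beta = beta
        self.rtt = first_sample      # assumption: seed with the first sample

    def update(self, sample: float) -> float:
        # New RTT = alpha * (old RTT) + (1 - alpha) * (new sample)
        self.rtt = self.alpha * self.rtt + (1 - self.alpha) * sample
        return self.rtt

    def rto(self) -> float:
        # Retransmit timer = beta * RTT
        return self.beta * self.rtt
```

With α = 0.9, an old RTT of 100 ms and a new sample of 200 ms give a smoothed RTT of about 110 ms, so a single outlier moves the estimate only slightly.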
Retransmission Ambiguity
(Figure: two timelines between A and B, each showing an original transmission, an RTO, a retransmission, and an ACK. The sample RTT could be measured from either the original transmission or the retransmission.)
Karn’s Retransmission Timeout Estimator
• Accounts for retransmission ambiguity
• If a segment has been retransmitted:
– Don’t count RTT sample on ACKs for this segment
– Keep backed off time-out for next packet
– Reuse RTT estimate only after one successful transmission
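The three rules above can be sketched as follows. The class name, the doubling backoff factor, and the α and β defaults are assumptions for illustration, not values given on this slide:

```python
class KarnTimer:
    """RTO bookkeeping that ignores ambiguous RTT samples."""

    def __init__(self, rtt: float, alpha: float = 0.9, beta: float = 2.0):
        self.rtt = rtt
        self.alpha = alpha
        self.beta = beta
        self.backoff = 1             # multiplier applied after timeouts

    def rto(self) -> float:
        return self.beta * self.rtt * self.backoff

    def on_timeout(self) -> None:
        self.backoff *= 2            # assumption: double the timer on each expiry

    def on_ack(self, sample: float, was_retransmitted: bool) -> None:
        if was_retransmitted:
            return                   # ambiguous sample: skip it, keep the backoff
        # Clean sample from a segment sent exactly once: trust it again.
        self.rtt = self.alpha * self.rtt + (1 - self.alpha) * sample
        self.backoff = 1
```

The backed-off timer survives until an ACK arrives for a segment that was sent only once, which is the "one successful transmission" condition.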
Jacobson’s Retransmission Timeout Estimator
• Key observation:
– Using β RTT for timeout doesn’t work (not adaptive enough with fixed β: at high loads, variance is high)
• Solution:
– If D denotes mean variation (measured)
– Timeout = RTT + 4D
– the margin over RTT is now adaptive (4D instead of a fixed β)
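A sketch of Timeout = RTT + 4D with a running mean deviation D. The gains g = 1/8 and h = 1/4 and the initial deviation of half the first sample follow common practice (RFC 6298) and are assumptions here, since the slide does not give them:

```python
class JacobsonEstimator:
    """Smoothed RTT plus mean deviation; timeout = RTT + 4D."""

    def __init__(self, first_sample: float, g: float = 0.125, h: float = 0.25):
        self.rtt = first_sample
        self.dev = first_sample / 2  # assumption: initial D as in RFC 6298
        self.g = g                   # gain for the RTT average
        self.h = h                   # gain for the deviation average

    def update(self, sample: float) -> None:
        err = sample - self.rtt      # error against the old estimate
        self.rtt += self.g * err
        self.dev += self.h * (abs(err) - self.dev)

    def timeout(self) -> float:
        return self.rtt + 4 * self.dev
```

Steady samples shrink D and tighten the timeout; a jittery path inflates D and widens it, which a fixed multiplier cannot do.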
TCP Loss Recovery
• timeout and retransmit
• fast retransmit
• fast recovery
• New-Reno partial ACKs
• SACK
Fast Retransmit
• Interpret n duplicate ACKs as loss indication
– in fact, send a dup ACK for every packet you get after a missing one
– but beware: now packet re-ordering causes problems
• Goal: avoid RTO by fixing the one missing segment
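Duplicate-ACK counting can be sketched as below. It assumes each cumulative ACK carries the number of the next segment the receiver expects, and uses n = 3 as the threshold; the function name is illustrative:

```python
def fast_retransmit_triggers(acks, threshold=3):
    """Return the ACK numbers whose duplicates reach the threshold.

    acks: cumulative ACK numbers in arrival order.
    """
    triggers = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            dup_count += 1
            if dup_count == threshold:   # fire once, on exactly the n-th dup
                triggers.append(ack)
        else:
            last_ack = ack               # new data was ACKed: reset the count
            dup_count = 0
    return triggers
```

The ACK stream 1, 2, 3, 3, 3, 3 triggers a retransmit for 3, while 1, 2, 2, 2, 4 (only two duplicates, as mild re-ordering might produce) triggers nothing.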
Fast Retransmit Example
(Figure from [Fall96a] figure 2: fast retransmit after 3 dup ACKs)
fast retx helps a lot,
Fast Recovery
• Problem: fast retx still forces slow-start, breaking the ACK clock
• Fast Recovery Solution: artificially inflate the cwnd as more dup ACKs come in
– cut cwnd, but instead of slow start, do additive increase for each ACK
– justification: each dup ACK represents a packet leaving the network, so we can increase cwnd
Fast Retransmit and Recovery
• If we get 3 duplicate ACKs for segment N
– Retransmit segment N
– Set ssthresh to 0.5*cwnd
– Set cwnd to ssthresh + 3 [why?]
• For every subsequent duplicate ACK
– Increase cwnd by 1 segment
• When new ACK received
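The steps above can be sketched as a small state machine. The class and attribute names are illustrative. The slide leaves the new-ACK step blank; setting cwnd back to ssthresh there is an assumption taken from standard Reno behavior (RFC 5681):

```python
class RenoSender:
    """Fast retransmit and fast recovery, windows counted in segments."""

    def __init__(self, cwnd: float, ssthresh: float):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.dup_acks = 0
        self.in_recovery = False
        self.retransmitted = []          # segments resent by fast retransmit

    def on_dup_ack(self, segment: int) -> None:
        self.dup_acks += 1
        if self.in_recovery:
            self.cwnd += 1               # each further dup ACK: one packet left the network
        elif self.dup_acks == 3:
            self.retransmitted.append(segment)
            self.ssthresh = 0.5 * self.cwnd
            self.cwnd = self.ssthresh + 3
            self.in_recovery = True

    def on_new_ack(self) -> None:
        if self.in_recovery:
            self.cwnd = self.ssthresh    # assumption (RFC 5681): deflate the window
            self.in_recovery = False
        self.dup_acks = 0
```

Starting from cwnd = 16, three dup ACKs give ssthresh = 8 and cwnd = 11; a fourth dup ACK raises cwnd to 12, and the next new ACK brings it back to 8 without a slow start.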
Fast Recovery Example
(Figure: fast retransmit after 3 dup ACKs, then fast recovery due to additional dup ACKs)
New-Reno Partial ACKs
• But fast retx and recovery only repair one lost segment per RTT
• New-Reno idea: use partial ACKs to stay in fast recovery and fix more lost segments
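A sketch of the partial-ACK idea. It assumes each ACK carries the next segment the receiver expects, and that `recover` is the highest segment sent when fast recovery began (the name follows RFC 6582; the function itself is illustrative):

```python
def newreno_recovery(acks, recover):
    """Return the segments retransmitted while staying in fast recovery.

    acks:    cumulative ACKs (next expected segment) arriving during recovery
    recover: highest segment number outstanding when recovery started
    """
    retransmits = []
    for ack in acks:
        if ack > recover:
            break                    # full ACK: everything is repaired, leave recovery
        retransmits.append(ack)      # partial ACK: it points at the next hole
    return retransmits
```

Suppose segments 1 to 10 were sent and 3 and 6 were lost. After 3 is fast-retransmitted, the ACK for 6 is partial, so 6 is resent at once rather than waiting for three more dup ACKs or a timeout; the following ACK for 11 ends recovery.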
New-Reno Example
(Figure: fast retransmit after 3 dup ACKs, fast recovery due to additional dup ACKs, then additional fast retx and recovery from New Reno)
SACK
• Forget these hacks, have receiver just tell sender what’s missing
• SACK: selective acknowledgement
– use TCP options to encode some info about multiple losses and avoid all of this guesswork
– but why is SACK deployment so much slower
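A sketch of how a sender could derive the holes from SACK information. For simplicity it uses segment numbers and inclusive (start, end) blocks; real SACK options carry byte sequence ranges, so this representation and the function name are assumptions:

```python
def missing_segments(cum_ack, sack_blocks):
    """Return the segments the receiver is known to be missing.

    cum_ack:     next segment the receiver expects (cumulative ACK)
    sack_blocks: (start, end) inclusive ranges received beyond cum_ack
    """
    missing = []
    next_expected = cum_ack
    for start, end in sorted(sack_blocks):
        missing.extend(range(next_expected, start))  # gap before this block
        next_expected = max(next_expected, end + 1)
    return missing
```

With a cumulative ACK of 3 and SACK blocks (4, 5) and (7, 9), the sender learns in one ACK that exactly segments 3 and 6 are missing, with no inference from dup ACK counts.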
Agenda
• connection setup and teardown
• flow control
• congestion control theory
• congestion control practice (in TCP)
• loss recovery
• putting it together
• security