TCP Reno - Research and Design of TCP Congestion Control Techniques

TCP Reno is a window¹-based congestion control mechanism. Its window-adjustment algorithm consists of three phases: slow-start, AIMD (additive increase/multiplicative decrease), and fast retransmit and recovery. A connection begins with the slow-start phase. The objective of slow-start is to enable a TCP connection to discover the available bandwidth by gradually increasing the amount of data injected into the network from the initial window size². Upon receiving an acknowledgement packet (ACK), the congestion window size (CWND) is increased by one packet. With reference to Fig. 2.1, the sender starts by transmitting one packet and waits for its ACK. When that ACK is received, the congestion window is incremented from one to two, and two packets can be sent. When both of these packets are acknowledged, the congestion window is increased to four, and so on.

¹ The TCP window refers to the amount of outstanding data that the sender can transmit without receiving acknowledgements.

² RFC 2581 allows an initial window of at most two packets, and RFC 3390 permits a larger initial window to shorten the startup period, specifically for connections running over networks with long propagation delays.

Table 2.1: RFCs for the TCP implementation.

RFC number  Topic
793         Transmission Control Protocol
1122        Requirements for Internet Hosts - Communication Layers
1323        TCP Extensions for High Performance
2018        TCP Selective Acknowledgement Options
2581        TCP Congestion Control
2582        The NewReno Modification to TCP’s Fast Recovery Algorithm
2914        Congestion Control Principles
3168        The Addition of Explicit Congestion Notification (ECN) to IP
3390        Increasing TCP’s Initial Window

[Figure 2.1: Packets in transit during slow-start. Source-to-destination time diagram with CWND = 1, 2, 4, 8.]
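The slow-start growth illustrated in Fig. 2.1 can be sketched in a few lines of Python. This is a minimal model under stated assumptions, not an actual TCP implementation: the function name `slow_start_rounds` and its parameters are hypothetical, the window is counted in whole packets, every packet is assumed to be acknowledged, and growth stops once the window reaches the threshold.

```python
def slow_start_rounds(initial_cwnd=1, ssthresh=8):
    """Return the CWND (in packets) at the start of each round trip.

    Assumptions (illustrative only): no losses, one ACK per packet,
    and slow-start ends as soon as CWND reaches ssthresh.
    """
    cwnd = initial_cwnd
    history = [cwnd]
    while cwnd < ssthresh:
        # One ACK arrives for each packet sent in this round trip.
        # range(cwnd) is evaluated once, so later increments do not add ACKs.
        for _ in range(cwnd):
            if cwnd < ssthresh:
                cwnd += 1          # +1 packet per ACK
        history.append(cwnd)
    return history


print(slow_start_rounds(1, 8))     # [1, 2, 4, 8]
```

Because each ACK adds one packet and a full window produces one ACK per packet, the window doubles every round trip, which matches the CWND = 1, 2, 4, 8 sequence in the figure.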

Since the CWND expands exponentially during the slow-start phase, packets sent at this increasing rate would quickly lead to network congestion. To avoid this, the AIMD phase begins when the CWND reaches the slow-start threshold (SSTHRESH).

In the AIMD phase, the CWND is increased by 1/CWND packet for each ACK received, so the window grows linearly, by about one packet per round-trip time. The process continues until a packet loss is detected, at which point the CWND is cut in half.
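To see why a per-ACK increment of 1/CWND gives linear growth, consider the following sketch. The helper names `aimd_on_ack` and `aimd_on_loss` are hypothetical, and the window is treated as a fractional number of packets for simplicity.

```python
def aimd_on_ack(cwnd):
    """Additive increase: each ACK adds 1/CWND packet."""
    return cwnd + 1.0 / cwnd


def aimd_on_loss(cwnd):
    """Multiplicative decrease: halve the window on a detected loss."""
    return cwnd / 2.0


cwnd = 10.0
for _ in range(10):            # roughly one window's worth of ACKs (one RTT)
    cwnd = aimd_on_ack(cwnd)
print(cwnd)                    # just under 11: about +1 packet per RTT

print(aimd_on_loss(8.0))       # 4.0
```

A window of 10 packets yields about 10 ACKs per round trip, each adding roughly 1/10 packet, so the window gains about one packet per round trip regardless of its size.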

There are two ways for TCP Reno to detect packet loss. One is based on the reception of three duplicate ACKs; the other is based on a retransmission timeout.

When a source receives three duplicate ACKs, it performs the fast retransmit and recovery algorithm. It retransmits the lost packet immediately, without waiting for a coarse-grained timer to expire. At the same time, the SSTHRESH is set to half of the CWND, and the CWND is then set to SSTHRESH plus the number of duplicate ACKs received.

The CWND is then increased by one packet for each additional duplicate ACK received. When the ACK for the retransmitted packet arrives, the CWND is set to SSTHRESH and the source re-enters the AIMD phase.
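The fast retransmit and recovery steps can be expressed as a small state machine. The sketch below is illustrative only: the class `RenoSender` and its fields are hypothetical names, the window is counted in packets, and window growth on ordinary ACKs outside fast recovery is deliberately omitted.

```python
from dataclasses import dataclass

DUP_ACK_THRESHOLD = 3  # three duplicate ACKs trigger fast retransmit


@dataclass
class RenoSender:
    cwnd: float
    ssthresh: float
    dup_acks: int = 0
    in_fast_recovery: bool = False
    retransmissions: int = 0

    def on_duplicate_ack(self):
        self.dup_acks += 1
        if self.in_fast_recovery:
            # Each further duplicate ACK inflates the window by one packet.
            self.cwnd += 1
        elif self.dup_acks == DUP_ACK_THRESHOLD:
            # Fast retransmit: resend the lost packet without waiting
            # for the retransmission timer.
            self.retransmissions += 1
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + self.dup_acks
            self.in_fast_recovery = True

    def on_new_ack(self):
        if self.in_fast_recovery:
            # The retransmitted packet was acknowledged: deflate the
            # window and return to the AIMD phase.
            self.cwnd = self.ssthresh
            self.in_fast_recovery = False
        self.dup_acks = 0


s = RenoSender(cwnd=16, ssthresh=32)
for _ in range(3):
    s.on_duplicate_ack()
print(s.ssthresh, s.cwnd)   # 8.0 11.0
s.on_duplicate_ack()
s.on_duplicate_ack()
print(s.cwnd)               # 13.0
s.on_new_ack()
print(s.cwnd)               # 8.0
```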

If serious congestion occurs and too few packets survive to trigger three duplicate ACKs, the congestion is detected by a coarse-grained retransmission timeout. When the retransmission timer expires, the SSTHRESH is set to half of the CWND, the CWND is reset to one, and the source restarts from the slow-start phase.

Fig. 2.2 shows an example of window evolution that includes all three window-adjustment phases of TCP Reno. The connection starts in the slow-start phase with an exponentially increasing rate. Since the connection has no knowledge of the available bandwidth of the network, the over-expanded window quickly causes severe congestion. After a retransmission timeout, the connection restarts from the slow-start phase. Once the CWND reaches the SSTHRESH, the window size increases linearly. After that, the pattern of periodic additive increase and multiplicative decrease of the window size continues throughout the lifetime of the connection.
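A window trace of the kind described above can be reproduced with a coarse per-round-trip model. The sketch below rests on several assumptions that are not taken from the text: the function `simulate_reno` and the `capacity` parameter are hypothetical, a loss is assumed whenever the window exceeds `capacity`, the first loss is treated as a timeout, and every later loss is treated as an isolated loss recovered by fast retransmit.

```python
def simulate_reno(rounds, capacity, initial_ssthresh):
    """Return the CWND (packets) observed at the start of each round trip."""
    cwnd, ssthresh = 1.0, float(initial_ssthresh)
    first_loss = True
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd > capacity:                 # assumed loss condition
            ssthresh = cwnd / 2
            if first_loss:
                cwnd = 1.0                  # timeout: restart slow-start
                first_loss = False
            else:
                cwnd = ssthresh             # fast recovery: halve the window
        elif cwnd < ssthresh:
            cwnd = min(cwnd * 2, ssthresh)  # slow-start: double per RTT
        else:
            cwnd += 1                       # AIMD: +1 packet per RTT
    return trace


print(simulate_reno(rounds=20, capacity=20, initial_ssthresh=64))
```

With these inputs the trace climbs 1, 2, 4, 8, 16, 32, falls back to 1 after the timeout, slow-starts again up to the new threshold of 16, and then settles into the sawtooth of linear increase and halving.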

[Figure 2.2: TCP Reno’s window evolution. CWND (packets) versus time (s).]

The fast retransmit and recovery algorithm of TCP Reno allows a connection to recover quickly from isolated packet losses. However, when multiple packets are dropped from a window of data, TCP Reno may suffer serious performance problems, since it retransmits at most one dropped packet per round-trip time and the CWND may be decreased more than once when multiple packet losses occur within one round-trip time. In this situation, TCP Reno operates at a very low rate and loses a significant amount of throughput.

A number of enhanced loss recovery algorithms have been proposed to address this problem. In the following subsections, we briefly describe three noted remedies for TCP Reno: TCP NewReno [13], SACK [14], and FACK [15].

2.1.1 TCP NewReno

TCP NewReno makes a small change to the connection source that may eliminate TCP Reno’s wait for a retransmission timeout when multiple packets are lost from a window. The change enhances the fast recovery algorithm of TCP Reno.

In TCP Reno, a partial ACK³ brings the connection out of fast recovery, which results in a retransmission timeout in case of multiple packet losses. In TCP NewReno, when a source receives a partial ACK, it does not leave fast recovery [5, 42, 13]. Instead, it assumes that the packet immediately following the most recently acknowledged packet has been lost, and retransmits that packet. Thus, when multiple packets are lost, TCP NewReno retransmits one lost packet per round-trip time until all of the lost packets from the same window have been recovered, without incurring a retransmission timeout. It remains in the fast recovery phase until all of the packets outstanding at the start of that phase have been acknowledged. Although this avoids unnecessary window reductions, the recovery time is still long. The implementation details of TCP NewReno are specified in RFC 2582.

³ A partial ACK is an acknowledgement that acknowledges some but not all of the packets outstanding at the start of that fast recovery phase.
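The partial-ACK handling of TCP NewReno can be illustrated with a short simulation. This is a simplified sketch, not the procedure of RFC 2582: the function `newreno_fast_recovery` and the name `recover` (the highest packet outstanding when fast recovery starts) are chosen for illustration, packets are numbered individually, at least one packet is assumed lost, retransmissions are assumed never to be lost, and no new data is sent during recovery.

```python
def newreno_fast_recovery(lost, recover):
    """Return the packets retransmitted, one per round trip.

    lost:    set of packet numbers dropped from the window (non-empty,
             all <= recover)
    recover: highest packet number outstanding at the start of recovery
    """
    missing = set(lost)
    retransmitted = []
    next_to_send = min(missing)       # fast retransmit of the first loss
    while True:
        retransmitted.append(next_to_send)
        missing.discard(next_to_send)
        # Cumulative ACK seen one round trip later: everything up to
        # the next hole, or the whole window if no hole remains.
        cum_ack = min(missing) - 1 if missing else recover
        if cum_ack >= recover:
            return retransmitted      # full ACK: leave fast recovery
        next_to_send = cum_ack + 1    # partial ACK: resend the next hole


print(newreno_fast_recovery({3, 6, 8}, recover=10))   # [3, 6, 8]
```

Three losses take three round trips to repair, which illustrates why the recovery time remains long even though no timeout occurs.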

2.1.2 SACK

Another way to deal with multiple packet losses is to tell the source which packets have arrived at the destination. Selective Acknowledgments (SACK) does exactly this. TCP adopts a cumulative acknowledgement strategy to acknowledge successfully transmitted packets, which improves the robustness of acknowledgement when the path back to the source has a high loss rate. However, the drawback of cumulative acknowledgement is that, after a packet loss, the source cannot find out which packets have been successfully transmitted. Therefore, it cannot recover more than one lost packet per round-trip time.

The SACK option [14] field contains a number of SACK blocks, where each SACK block reports a non-contiguous set of data that has been received and buffered.

The destination uses an ACK carrying the SACK option to inform the source of each contiguous block of data that has been received out of order at the destination.
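As an illustration of how a destination might derive SACK blocks from its receive buffer, consider the sketch below. It is a simplification: the function `sack_blocks` is a hypothetical helper, blocks are expressed as inclusive ranges of packet numbers, whereas the actual option defined in RFC 2018 carries byte sequence numbers and a limited number of blocks.

```python
def sack_blocks(received, cum_ack):
    """Group out-of-order packets above cum_ack into contiguous blocks.

    received: set of packet numbers held by the destination
    cum_ack:  highest packet number acknowledged cumulatively
    Returns a list of (first, last) packet ranges, inclusive.
    """
    blocks = []
    for seq in sorted(s for s in received if s > cum_ack):
        if blocks and seq == blocks[-1][1] + 1:
            blocks[-1] = (blocks[-1][0], seq)   # extend the current block
        else:
            blocks.append((seq, seq))           # start a new block
    return blocks


# Packets 3 and 6 are missing; 4, 5 and 7 arrived out of order.
print(sack_blocks({1, 2, 4, 5, 7}, cum_ack=2))   # [(4, 5), (7, 7)]
```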

When SACK blocks are received by the source, they are used to maintain an image of the receiver queue, i.e., which packets are missing and which have been received at the destination. A scoreboard is set up to track the transmitted and received packets according to the information previously reported in the SACK option. For each transmitted packet, the scoreboard records its sequence number and a flag bit that indicates whether the packet has been “SACKed”. A packet with the SACKed bit turned on does not need to be retransmitted, but packets with the SACKed bit off and a sequence number lower than that of the highest SACKed packet are eligible for retransmission. Whether its SACKed bit is on or off, a packet is removed from the retransmission buffer only when it has been cumulatively acknowledged.
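The scoreboard rules can be sketched as a small data structure. The class `Scoreboard` and its method names are hypothetical, and packets are identified by packet number rather than byte sequence number; the sketch only mirrors the bookkeeping described here and is not a complete SACK sender.

```python
from dataclasses import dataclass, field


@dataclass
class Scoreboard:
    # Maps packet number -> SACKed flag for every packet still buffered.
    sacked: dict = field(default_factory=dict)

    def on_send(self, seq):
        self.sacked[seq] = False

    def on_sack(self, blocks):
        """Mark every buffered packet covered by a (first, last) block."""
        for first, last in blocks:
            for seq in range(first, last + 1):
                if seq in self.sacked:
                    self.sacked[seq] = True

    def on_cumulative_ack(self, ack):
        """Only a cumulative ACK removes packets from the buffer."""
        for seq in [s for s in self.sacked if s <= ack]:
            del self.sacked[seq]

    def retransmit_candidates(self):
        """Unsacked packets below the highest SACKed packet."""
        sacked_seqs = [s for s, flag in self.sacked.items() if flag]
        if not sacked_seqs:
            return []
        highest = max(sacked_seqs)
        return sorted(s for s, flag in self.sacked.items()
                      if not flag and s < highest)


sb = Scoreboard()
for seq in range(1, 9):
    sb.on_send(seq)
sb.on_cumulative_ack(2)
sb.on_sack([(4, 5), (7, 7)])
print(sb.retransmit_candidates())   # [3, 6]
```

Packet 8 is not a candidate in this example because it lies above the highest SACKed packet, so nothing yet indicates that it was lost.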

The SACK TCP implementation still uses the same congestion control algorithms as TCP Reno. The main difference between SACK TCP and TCP Reno is their behavior in the event of multiple packet losses. SACK TCP refines the fast retransmit and fast recovery strategy of TCP Reno so that multiple lost packets in a single window can be recovered within one round-trip time.

2.1.3 FACK

Forward Acknowledgments (FACK) [15] was developed to decouple the congestion control algorithms from the data recovery algorithms. It uses the additional information provided by the SACK option to keep an explicit measure of the total amount of outstanding data in the network. The goal of the FACK algorithm is to perform precise congestion control during recovery. By accurately controlling the outstanding data in the network, FACK can improve the connection throughput during the data recovery phase.
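The text above does not state how the outstanding data is measured. As a hedged illustration, the sketch below uses the estimate commonly associated with the FACK proposal: the data sent beyond the most forward SACKed position, plus any retransmitted data still in flight. The function and parameter names (`fack_outstanding`, `can_send`, `snd_nxt`, `snd_fack`, `retran_data`) are assumptions made for this example, and quantities are counted in packets.

```python
def fack_outstanding(snd_nxt, snd_fack, retran_data):
    """Estimate the data outstanding in the network.

    snd_nxt:     next packet number to be sent
    snd_fack:    most forward packet position reported by SACK
    retran_data: retransmitted packets not yet acknowledged
    """
    return snd_nxt - snd_fack + retran_data


def can_send(cwnd, snd_nxt, snd_fack, retran_data):
    """Allow transmission while the estimate is below the window."""
    return fack_outstanding(snd_nxt, snd_fack, retran_data) < cwnd


print(fack_outstanding(snd_nxt=20, snd_fack=15, retran_data=2))     # 7
print(can_send(cwnd=8, snd_nxt=20, snd_fack=15, retran_data=2))     # True
```

Because the estimate depends only on SACK information and the sender's own counters, the sender can keep regulating its rate during recovery instead of relying on counting duplicate ACKs.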
