Aggressive TCP issues

Roman Elizarov, December 1999
Roman_Elizarov@baylor.edu

Abstract

This report gives the results of an attempt to implement an aggressive TCP. While it is not hard to make TCP behave absolutely wildly and flood the network with packets that effectively knock out all other competing protocols running over the network (unless some sort of fair queueing is done in the network), that approach drives the network into congestion collapse, which is not the goal and must be avoided. It turned out that an aggressive TCP, i.e. one that competes significantly better without provoking congestion collapse, is not an easy thing to build. Efforts are being undertaken by different network specialists to improve TCP performance in the presence of congestion. An overview of their work shows that there are few things that can be done without changing TCP too much.

Aggressive

The original purpose of this work was to modify the Linux kernel on one of the HackNet group machines in order to make its TCP aggressive. By aggressive we mean an implementation that is able to get more bandwidth than non-aggressive TCP implementations can. But we still want the data to get sent, and we do not want to drive the network into congestion collapse.

What to violate?

That is an easy question to answer. [RFC2581] gives explicit upper bounds on how aggressive TCP implementations may be, though it allows experimental implementations to relax or violate some of the requirements. There are not many requirements to break, and almost all of them are imposed for good reasons.

Actually, the limits on TCP aggressiveness are made in an effort to prevent congestion collapse, and I will justify this further below. Let us now look at the aggressiveness limits themselves.

Slow Start
Slow start was invented to prevent the congestion that could be created by injecting big bursts into the network. That is why we cannot eliminate it completely. While slow start can be made faster by increasing the initial window size and/or increasing the increment, this will not lead to a significant performance increase. Simulations show that the slow start part of TCP is used only at the first stage of data transfer. During the bulk transfer Fast Retransmit/Fast Recovery usually comes into play. Still, we cannot even make the first stage (when slow start is used) significantly faster, because injecting bursts will harm our own performance as well.
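The small payoff from a bigger initial window can be illustrated with a minimal sketch of slow-start growth (the function name and the 64-segment threshold are illustrative assumptions, not values from this report):

```python
def slow_start_rtts(initial_window, ssthresh):
    """Count the RTTs slow start needs to reach ssthresh when the
    congestion window doubles every RTT (one segment added per ACK)."""
    cwnd, rtts = initial_window, 0
    while cwnd < ssthresh:
        cwnd *= 2   # each ACKed segment adds one segment to cwnd
        rtts += 1
    return rtts

# Raising the initial window from 1 to 4 segments saves only two RTTs
# on the way to a 64-segment threshold:
print(slow_start_rtts(1, 64))  # 6
print(slow_start_rtts(4, 64))  # 4
```

Because the window grows exponentially, any constant-factor head start is worth only a constant number of RTTs, which is negligible during a bulk transfer.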
Congestion Avoidance
Congestion avoidance is justified by the fact that there is no reason to send too much data if it is not able to get through anyway (we are only creating congestion). Of course, for an aggressive TCP there may be a reason: to drive away other TCP connections that share the same link. But we still want to be able to detect the physical bottleneck and not congest ourselves. This constraint does not deserve much attention, for a reason similar to Slow Start: Fast Retransmit/Fast Recovery. The congestion avoidance backoff is just not used often.

We can still try to make the window increase faster, but by increasing it faster we run into congestion faster, and again end up with a smaller window. The only reasonable-looking solution is to eliminate as many backoffs as possible in an attempt to reach the highest possible window. The main source of backoffs during actual connections, as it turned out, is Fast Retransmit/Fast Recovery.
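The futility of a larger increment can be seen in a toy additive-increase/multiplicative-decrease loop (the drop-tail loss model, capacity, and increment values below are made-up assumptions for illustration):

```python
def average_window(increment, capacity=100, rounds=10000):
    """Average effective window over many RTTs for a sender that adds
    `increment` segments per RTT and halves its window on loss, with
    loss modeled simply as exceeding a fixed bottleneck capacity."""
    cwnd, total = 1.0, 0.0
    for _ in range(rounds):
        total += min(cwnd, capacity)
        if cwnd >= capacity:   # loss signal: multiplicative decrease
            cwnd /= 2
        else:                  # additive increase per RTT
            cwnd += increment
    return total / rounds

# A fivefold increment barely changes the long-run average window;
# the sawtooth just oscillates faster between capacity/2 and capacity.
print(round(average_window(1), 1))
print(round(average_window(5), 1))
```

Both runs settle near three quarters of the bottleneck capacity, which is the paragraph's point: increasing faster only makes you back off sooner.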

Fast Retransmit/Fast Recovery
A quick review of the references ([RFC2582] is a good source for them) shows that this is a hot area of research in which changes must be made very carefully in order to improve performance and not degrade it.

Fast retransmit is the primary source of congestion detection and the main mechanism that limits the growth of the window. Though, simulations showed that over long-delay links the implementation's window limit can serve as the performance bound instead.
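The loss signal that fast retransmit reacts to is a run of duplicate ACKs; a minimal sketch of that trigger (names are illustrative and this is not the Linux implementation discussed here):

```python
# The standard trigger: three duplicate ACKs for the same sequence
# number are taken as evidence that a segment was lost.
DUPACK_THRESHOLD = 3

def first_fast_retransmit(acks):
    """Return the index in the ACK stream at which fast retransmit
    fires, or None if it never does."""
    last_ack, dupacks = None, 0
    for i, ack in enumerate(acks):
        if ack == last_ack:
            dupacks += 1
            if dupacks == DUPACK_THRESHOLD:
                return i
        else:
            last_ack, dupacks = ack, 0
    return None

# Segment 1000 is lost; the receiver keeps ACKing 1000:
print(first_fast_retransmit([500, 1000, 1000, 1000, 1000]))  # 4
```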

ACK Clock

One of the main features of TCP that limits its performance is the ACK clock. There is not much a TCP sender can do (assuming data to send is always available) outside of its reaction to ACKs from the receiver. The only exceptions are the retransmit timers, which actually fire rarely.

This behaviour is a great constraint on TCP's ability to handle data transfer in the most efficient way, and it also prevents simple hacks to make it aggressive. It seems that having an internal clock whose frequency is adjusted based on feedback is superior, and that is how most realtime transfer protocols do their job. This would allow one to efficiently compete with TCP connections for the network bandwidth.
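The contrast between the two sending disciplines can be sketched as follows (the event times and function names are made-up assumptions; real pacing also needs a rate-adjustment rule, which is omitted here):

```python
def ack_clocked_sends(ack_times):
    """An ACK-clocked sender transmits one new segment per arriving ACK,
    so its send times simply mirror the ACK arrival times."""
    return list(ack_times)

def rate_paced_sends(start, end, interval):
    """A rate-paced sender transmits on its own internal timer,
    independent of when (or whether) ACKs arrive."""
    sends, t = [], start
    while t < end:
        sends.append(round(t, 6))
        t += interval
    return sends

# If ACKs stall, the ACK-clocked sender stalls with them,
# while the paced sender keeps transmitting through the gap:
print(ack_clocked_sends([0.0, 0.1, 0.2, 1.2]))
print(rate_paced_sends(0.0, 1.3, 0.1))
```

This is why an ACK-clocked sender cannot be made aggressive by simple means: its transmission schedule is dictated by the receiver's feedback, not by the sender itself.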

Fair Queueing

The presence of any kind of fairness-enforcing gateway (like the one proposed in [Lin97]) anywhere on the path renders any effort to create an aggressive TCP useless. That is one reason why it is very hard to evaluate the actual aggressiveness of an implementation over the real Internet, which, unfortunately, is the single interesting place to run simulations.

Simulations

A small number of simulations were done using HackNet workstations as a testbed. The Linux kernel was modified to write (using the printk function) different characters at different places in the TCP implementation in order to study TCP behaviour during an actual long-delay FTP data transfer (put) over the Internet to a remote host.

References

[RFC2581]
"TCP Congestion Control", M. Allman, V. Paxson and W. Stevens, April 1999
ftp://ftp.isi.edu/in-notes/rfc2581.txt
[RFC2582]
"The NewReno Modification to TCP's Fast Recovery Algorithm", S. Floyd and T. Henderson, April 1999
ftp://ftp.isi.edu/in-notes/rfc2582.txt
[Lin97]
"Dynamics of Random Early Detection", D. Lin and R. Morris, SIGCOMM, volume 27, number 4, October 1997
http://www.acm.org/sigcomm/sigcomm97/papers/p078.ps
