George Ou made this argument at the FCC En Banc hearing at Stanford on 4/25 (A/V here).
It's actually quite common throughout the world that TCP RSTs
are used.
...
Speaking of the 1:45 AM resets,
ISPs all over the world, they've found that up to 12% of sessions
get reset, all over the world. It's almost like there's this
12% of background noise of TCP resets that are happening that
may not be coming from Comcast but could be coming from any
device on the Internet, all routers, all firewalls
support that feature and we don't really know where it's coming
from.
Here's AT&T's response
to Vuze's claim that they use RSTs for "network management purposes"
(i.e., terminating connections they don't like):
In response to your specific question about AT&T's network management
practices, AT&T does not use "false reset messages" to manage its
network. We agree with Vuze that the use of the Vuze Plug-In to
measure network traffic has numerous limitations and deficiencies, and
does not demonstrate whether any particular network providers or their
customers are using TCP Reset messages for network management
purposes. Given that Vuze itself has recognized these problems with
the measurements generated by its Plug-In, we believe that Vuze should
not have published these misleading measurements, nor filed them with
the FCC. Moreover, as Vuze and others have acknowledged, TCP resets
are generated for many reasons wholly unrelated to the network
management practices of broadband network providers, which explains
why resets may appear on networks of companies such as AT&T who do not
use TCP resets for network management (see, e.g., An Analysis of TCP
Reset Behaviour on the Internet, University of Calgary (2004)).
I've reviewed the paper by Arlitt and Williamson to which AT&T is
referring (Ou didn't cite his sources), and while it's
interesting work, I don't think it really speaks to Ou's argument.
The RSTs that Arlitt and Williamson are talking about are primarily
ungraceful
terminations of TCP connections that would be ending anyway. The
authors suggest a number of cases here:
- Servers aggressively closing connections after short idle
times; if the client already has a request in flight, the
server responds to it with an RST.
- Clients responding to FINs from the server with an
RST. The reasons for this are a bit unclear.
- Servers closing connections with RSTs.
- Connection attempts to ports the server isn't listening
on, which it rejects with an RST.
In all of these cases but the last,
though, the Web transactions are actually
over, so while there may be some negative effects from not going
through the normal TCP termination handshake (cf. RFC 1337), neither side perceives
the transaction as having failed. And in the final case, the server
is explicitly rejecting the connection, so an RST seems appropriate as well.
It's also fairly straightforward to distinguish these cases
as a passive observer (as the authors have done) with the
appropriate tools.
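To illustrate, here's a rough sketch of that kind of passive classification in Python using scapy. The trace file name is hypothetical, and the three buckets are my own coarse rendering of the cases above, not the paper's exact taxonomy:

```python
# Rough passive classification of observed RSTs into the kinds of
# cases Arlitt and Williamson describe. The trace file and the
# three-way bucketing are illustrative assumptions.
from collections import defaultdict
from scapy.all import rdpcap, IP, TCP

FIN, SYN, RST, ACK = 0x01, 0x02, 0x04, 0x10

history = defaultdict(list)  # (src, sport, dst, dport) -> flags seen from that side
buckets = defaultdict(int)

for pkt in rdpcap("trace.pcap"):  # hypothetical capture file
    if IP not in pkt or TCP not in pkt:
        continue
    tcp = pkt[TCP]
    key = (pkt[IP].src, tcp.sport, pkt[IP].dst, tcp.dport)
    rkey = (pkt[IP].dst, tcp.dport, pkt[IP].src, tcp.sport)
    flags = int(tcp.flags)
    if flags & RST:
        peer = history[rkey]  # what the other side has sent so far
        if len(peer) == 1 and (peer[0] & SYN) and not (peer[0] & ACK):
            buckets["connection refused (RST to a bare SYN)"] += 1
        elif peer and (peer[-1] & FIN):
            buckets["RST in response to a FIN"] += 1
        else:
            buckets["ungraceful close / mid-stream abort"] += 1
    history[key].append(flags)

for cause, count in sorted(buckets.items()):
    print(f"{cause}: {count}")
```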
What Comcast has done, however, is something different: they
were (are?) using RSTs to abort other people's transactions.
The base rate of normal RSTs isn't really that useful for assessing
the appropriateness of third-party RSTs as a network traffic
management technique. As a hypothetical, imagine that Comcast
were forging FINs instead of RSTs. One could expand Ou's argument
to say "FINs are a natural feature of the Internet", but it doesn't
really follow that it's desirable to have third parties forging
FINs on your connections.
It does bear, on the other hand, on what we can infer from Vuze's
data.
Vuze hasn't really published that many details of their methodology,
but they claim to be measuring the total number of RSTs,
not just those on Azureus/BitTorrent connections
(incidentally, I'm not sure how I'd feel as a user
about installing an app that sniffed all the traffic on
my network and sent statistics to Vuze) [-- see update below; EKR]:
The Vuze Plug-In constantly monitors the rate of network
interruptions occurring from RST ("reset") packets by
measuring the total number of attempted network connections
and the total number of network connections that were
interrupted by a reset message. By comparing these two
values, one can calculate the ratio of network connections
interrupted by reset messages. We have chosen to reflect
the median
ratio in order to reduce variability in the data given the sample size.
The Plug-In collects data for all Internet connections, not
just connections occurring
due to use of the Vuze application, and logs it every ten minutes. Then, at the top of the
hour, the Plug-In aggregates the data into one-hour blocks
and transmits it to Vuze, Inc. By definition, each
source of data had the Vuze application installed and
launched in order to monitor connections.
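For what it's worth, the arithmetic Vuze describes is simple enough to sketch. Here's a minimal Python version of the per-hour ratio and median computation; the counter values are invented, and the real plug-in of course reports raw counts to Vuze rather than computing this locally:

```python
# A minimal sketch of the aggregation Vuze describes: hourly
# (attempts, resets) counts reduced to a median ratio. The numbers
# are invented sample data.
from statistics import median

hourly = [(4200, 310), (3900, 450), (4050, 380), (4400, 520)]

ratios = [resets / attempts for attempts, resets in hourly]
print(f"median reset ratio: {median(ratios):.1%}")
```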
But if you're measuring all RSTs and not attempting to determine
which ones are "normal" and which ones represent connection
failure, then it's not clear how representative your data is.
It is sort of interesting how much variation (about an order
of magnitude, from 2.5% to 24%) there is in the rate of RSTs,
but as Iljitsch van Beijnum observes, this could be the result of caching
proxies and the like in the network. You may not particularly
want your ISP interposing a proxy, but that's a different
question than whether they're actually blocking your P2P traffic.
This isn't the only possible reason, either. For instance, users
might just have different software profiles. Given that
Vuze claims to have 8000 users on 1200 ASs (with the data
being reported only for ASs with more than 20 users), there could
well just be a lot of statistical variation. Some evidence of this
is that the results from Comcast alone span from 14% to 24%.
In order to really make sense of data like Vuze's, we'd need to
distinguish normal RSTs from those injected into the network,
which requires more forensics (TTL inspection, IP IDs, etc.)
than Vuze's paper describes.
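To give a sense of what that forensics might look like: an RST injected by an in-path box typically arrives with a TTL (and IP ID sequence) inconsistent with the genuine endpoint's packets. Here's a rough Python/scapy sketch of the TTL check; the trace file and the threshold of 3 hops are illustrative assumptions, not a validated detector:

```python
# Sketch of a TTL-consistency check: flag RSTs whose IP TTL differs
# sharply from earlier packets sent by the same endpoint, which is a
# hint (not proof) that a middlebox injected them. The trace file
# and the 3-hop threshold are arbitrary assumptions.
from scapy.all import rdpcap, IP, TCP

RST = 0x04
last_ttl = {}  # (src, sport, dst, dport) -> TTL last seen from that side

for pkt in rdpcap("trace.pcap"):  # hypothetical capture file
    if IP not in pkt or TCP not in pkt:
        continue
    key = (pkt[IP].src, pkt[TCP].sport, pkt[IP].dst, pkt[TCP].dport)
    ttl = pkt[IP].ttl
    if int(pkt[TCP].flags) & RST and key in last_ttl:
        if abs(ttl - last_ttl[key]) > 3:
            print(f"suspicious RST on {key}: TTL {ttl}, previously {last_ttl[key]}")
    last_ttl[key] = ttl
```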
UPDATE (8:49 AM): I was wrong about this needing to be a packet sniffer. I just read the source (here; thanks to Danny McPherson for pointing out that I could download it).
They're just using netstat to read the network statistics and grabbing the reset counter
out of the results. On the other hand, this means that they're not even in principle
able to differentiate between RSTs generated on Azureus connections and those
on other connections, or between those generated by some man in the middle and
those generated by the endpoints. While the variation in reported RSTs remains interesting, you'd need a significantly more advanced tool than this to really diagnose what's going on.
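For reference, the general approach is easy enough to reproduce. This is a small Python sketch of scraping the reset counter from netstat -s; the counter line matched here is the one Linux prints (other OSes word it differently), and this is my illustration, not Vuze's actual code:

```python
# Sketch of the netstat-scraping approach: run netstat -s and pull
# out the TCP reset counter. The matched line is Linux's wording;
# this is an illustration, not Vuze's actual code.
import re
import subprocess

out = subprocess.run(["netstat", "-s"], capture_output=True, text=True).stdout
match = re.search(r"(\d+)\s+connection resets received", out)
if match:
    print(f"RSTs received since boot: {match.group(1)}")
else:
    print("couldn't find the reset counter in netstat output")
```

Note that a counter like this tells you nothing about which connections the RSTs landed on or who generated them, which is exactly the limitation described above.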