NANOG is all abuzz with the news that Level(3) has depeered
Cogent. What the heck does this mean and why do you care?
A Network of Networks
Most people's experience of getting Internet access is simple: you call
up your ISP and order a line. You pay them some chunk of money per
month and they carry traffic to and from your house. Now, any real
ISP has a big network, so they have lots of customers just like you.
The figure below shows the simplest such network, where every one of
the ISP's customers is connected to the same central router (this is called
a star configuration).
When customer C1 wants to talk to customer C2, they send traffic
to the ISP router which forwards it to C2. Return traffic follows
the same path in reverse.
This is of course a very simple network. A big ISP will have
more than one router in multiple locations. These routers
are somehow interconnected in a way we don't really care about
here. For the purposes of this discussion we can just think
of the ISP's network as one big opaque blob that knows how
to route traffic from any customer to any other customer.
What I've just described works great when all you want to do is
talk to other people on the same ISP, but as you may have noticed,
there's more than one ISP in the world. If a customer on ISP A
wants to talk to a customer on ISP B, they must be connected
somehow. The simplest such topology looks like this:
Clearly, you can extend this to three ISPs or more. If we ignore
the interior structure of the ISPs' networks, it looks something
like this:
In the figure above, each ISP is connected to the other two ISPs.
Now, think about what happens when
a customer of A wants to send a message to a customer on B.
A has two links, one to B and one to C. In order for this to
work properly, A has to know to send the message down link 1 rather than
link 2. Routing protocols (BGP in this case) are used to let the router(s) at A know which
hosts (or networks) are on which ISP and hence which link to
send packets down. The details of how this works are complicated,
but roughly speaking each ISP advertises the network addresses
that it knows how to reach. (For technical reasons, these are known
as prefixes.1) This lets each other ISP build up a
table of routes for the traffic to follow.
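To make the table-building step concrete, here is a minimal Python sketch of how ISP A might turn the advertisements it hears into a lookup table. The prefixes (reserved documentation ranges), the link labels, and the function names are all assumptions made up for illustration; a real BGP implementation is far more involved.

```python
import ipaddress

# Hypothetical advertisements heard by ISP A, as (link, prefix) pairs.
# Link 1 goes to B and link 2 goes to C, as in the text; the prefixes
# are documentation ranges picked purely for illustration.
advertisements = [
    ("link 1 (to B)", "198.51.100.0/24"),
    ("link 2 (to C)", "203.0.113.0/24"),
]

def build_table(adverts):
    """Turn (link, prefix) advertisements into a list of (network, link) routes."""
    return [(ipaddress.ip_network(prefix), link) for link, prefix in adverts]

def lookup(table, address):
    """Return the link for the most specific prefix containing address, or None."""
    addr = ipaddress.ip_address(address)
    matches = [(net, link) for net, link in table if addr in net]
    if not matches:
        return None
    # When several routes cover the same address, the longest prefix wins.
    best_net, best_link = max(matches, key=lambda m: m[0].prefixlen)
    return best_link

table = build_table(advertisements)
print(lookup(table, "198.51.100.7"))   # an address advertised by B
print(lookup(table, "203.0.113.20"))   # an address advertised by C
print(lookup(table, "192.0.2.1"))      # nobody advertised this one
```

The point is only that the advertisements are what let A pick link 1 over link 2: an address nobody has advertised simply has no route.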
Peering and Transit
What I've just described works great if every ISP is connected to
every other (the technical term for this is a full mesh), but
there are zillions of ISPs so that's not very convenient.
What actually happens is that there are a relatively small
number of big ISPs and that little ISPs, rather than being
connected to each other, connect to some big ISP, who carries
some or all of their traffic to the rest of the network.
So, if we imagine adding two such small ISPs to the
network we drew before, we get something like this:
In this network, ISP D has connected to ISP A. Any traffic to
any other part of the Internet not operated by D or by A must
go through A. The technical term for this is that ISP D is
buying transit from ISP A. E's situation is similar
except that they're buying transit from C.
At this point, it's worth noting that ISP D's position
with respect to ISP A is very much the same as your position
with respect to your ISP. In both cases, you're paying
someone else to carry your traffic to the rest of the Internet.
In fact, it's very common for end-user customers to connect
not a single host to their ISP but rather an entire network
consisting of a number of computers, sometimes distributed
over a variety of locations. You wouldn't be wrong
to think of an end-user like this as a degenerate sort of
ISP--one that doesn't have any customers but their own
users.
Now, it's easy to get your Internet service this way, but there
are disadvantages. The first disadvantage is that you're paying
someone else to give you service. And much like the situation with
your own ISP, the more bandwidth that
ISP D consumes on the link to ISP A, the more ISP A charges him.
The second disadvantage is that you're totally dependent on one ISP.
If something goes wrong on that ISP, then you're totally cut
off from the rest of the Internet. Finally, imagine that you're
served by ISP D and you want to communicate with someone who's served
by ISP E. Traffic needs to go from D to A to C to E. As the supply
lines get longer, latency and brittleness increase.2
A partial solution to the second and third problems
is to establish a connection to a second ISP. This gives you both
redundancy and a shorter path to that ISP's customers. The technical
term for this is multi-homing (if you only have one
connection, you're single-homed).
Consider the case
of ISP D and ISP E. They both have equivalently good connections to
the Internet, through a big ISP (A and C respectively)
that's connected to all the other
big ISPs. However, as noted before, traffic between them goes through
a fairly inefficient route (D,A,C,E). They can improve this situation
by connecting up directly, through link 6, as shown below.
You'll remember that I said that D pays A for transit and E pays
C. So, you might ask, does D pay E or the other way around? The
answer is, it depends. If D is much bigger than E, E may pay D (because
getting to its customers is more valuable to E) and if E is much bigger
than D, it may go the other way around. However, if they're roughly
the same size, they may choose to just connect and exchange
traffic for free. (Well, technically, without paying each other a fee. There are
still all the equipment costs associated with getting lines attached
to the same location, etc. This can sometimes be more expensive than
buying transit through an existing connection!)
This is called peering. Most of the big
ISPs do some peering and the very biggest ones (called Tier 1s) never
pay anyone for transit. They either peer or sell transit. Generally,
it's considered a point of prestige for carriers to peer rather
than buy transit--nobody wants to feel like they're not one of the
big boys.
In order to understand the situation with Cogent and Level(3), you
need to understand one more thing. When you peer with someone
else, you often don't carry their traffic to other parts
of the Internet. I.e., traffic from D to C goes D,A,C, not D,E,C.
The way that this works technically is that A advertises D's prefixes
to C but E does not. D, of course, advertises its prefixes to both
A and E, but E filters those prefixes when it advertises its own
routes to C. This means that the link between D and E
provides redundancy only for D-E communication. If D's link to A
goes down, he won't be able to talk to anyone but E.
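The filtering rule can be sketched in a few lines of Python. This is a toy model, not how any real router is configured: the `RELATIONSHIPS` table, the function names, and the assumption that A, B and C all peer with each other are mine. The rule it encodes is the one described above: an ISP passes along its own routes and its customers' routes to everyone, but routes learned from a peer or a provider go only to its customers.

```python
# RELATIONSHIPS[x][y] is the role y plays for x. D buys transit from A,
# E buys transit from C, and D and E peer over link 6. That A, B and C
# peer with one another is an assumption of this sketch.
RELATIONSHIPS = {
    "A": {"B": "peer", "C": "peer", "D": "customer"},
    "B": {"A": "peer", "C": "peer"},
    "C": {"A": "peer", "B": "peer", "E": "customer"},
    "D": {"A": "provider", "E": "peer"},
    "E": {"C": "provider", "D": "peer"},
}

def who_learns(origin, relationships):
    """Return the set of ISPs that end up with a route to origin's prefixes."""
    # exportable[isp] is True when isp may advertise the route to everyone
    # (it is isp's own route, or isp learned it from a customer).
    exportable = {origin: True}
    queue = [origin]
    while queue:
        isp = queue.pop(0)
        for neighbor, role in relationships[isp].items():
            # Peer- or provider-learned routes go to customers only.
            if not (exportable[isp] or role == "customer"):
                continue
            from_customer = relationships[neighbor][isp] == "customer"
            if neighbor not in exportable or (from_customer and not exportable[neighbor]):
                exportable[neighbor] = from_customer
                queue.append(neighbor)
    return set(exportable) - {origin}

def without_link(relationships, x, y):
    """Return a copy of the topology with the link between x and y removed."""
    return {isp: {n: r for n, r in nbrs.items() if {isp, n} != {x, y}}
            for isp, nbrs in relationships.items()}

print(sorted(who_learns("D", RELATIONSHIPS)))
print(sorted(who_learns("D", without_link(RELATIONSHIPS, "D", "A"))))
```

In this model everyone hears about D's prefixes while the link to A is up, but with that link removed only E does, because E never passes a peer's routes up to C.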
Level(3) and Cogent
With this background, we're now equipped to understand what's
going on between Level(3) and Cogent. Level(3) is a Tier 1
provider; they don't pay for transit. Cogent is an almost-Tier 1;
generally they peer, but occasionally they pay for transit, though
only to a few select networks.
Until very recently, Level(3) and Cogent peered, but Level(3) was
obviously unhappy with that relationship and wanted Cogent to
pay them for transit. Cogent didn't want to, probably partly
for financial reasons and partly for prestige reasons. When
negotiation didn't work out, Level(3) terminated the peering
relationship. Cogent responded by offering
free transit for a year to Level(3) customers, which
is an obvious attempt to take business away from Level(3).
Level(3) has temporarily reconnected Cogent until
November 9th.
With the peering link gone, and because Cogent isn't paying for transit to Level(3)
(and Level(3) certainly isn't paying for transit to Cogent),
packets can't pass between the two networks.
This only affects you if you (or your ISP)
are single-homed to either Level(3) or Cogent (which is a lot
of people). If you are,
you won't be able to talk to anyone else who is single-homed
with the other ISP. If you aren't, then you won't have a
problem.
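Here is a small Python sketch of who is affected. Everything in it is hypothetical: the customer names, the stand-in third provider `OtherTier1`, and the simplifying rule that two customers can talk only if they share a provider or their providers peer directly (peers don't carry each other's traffic onward, as described earlier).

```python
# Peering links among the big providers once the depeering takes effect.
# "OtherTier1" is a made-up stand-in for any third large network.
PEERINGS = {
    frozenset({"Level3", "OtherTier1"}),
    frozenset({"Cogent", "OtherTier1"}),
    # frozenset({"Level3", "Cogent"}) is missing: that is the depeering.
}

# Hypothetical customers and the providers they buy transit from.
CUSTOMERS = {
    "alice": {"Level3"},                # single-homed to Level(3)
    "bob": {"Cogent"},                  # single-homed to Cogent
    "carol": {"Level3", "OtherTier1"},  # multi-homed
    "dave": {"OtherTier1"},             # single-homed elsewhere
}

def can_talk(x, y, customers=CUSTOMERS, peerings=PEERINGS):
    """True if some provider of x is, or peers directly with, some provider of y."""
    for px in customers[x]:
        for py in customers[y]:
            if px == py or frozenset({px, py}) in peerings:
                return True
    return False

for pair in [("alice", "bob"), ("alice", "carol"), ("bob", "carol"), ("alice", "dave")]:
    print(pair, can_talk(*pair))
```

Only the pair that is single-homed on opposite sides of the dispute is cut off; anyone with a second provider, or a provider that peers with both, is unaffected in this model.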
Basically, what's going on here is a game of chicken. It's
valuable to both Level(3) and Cogent to have their customers
be able to talk to the other.
They're both suffering when
they're not connected, but they both figure that the other
will give in first.
Level(3) can give up by
turning back on the connection with Cogent. Cogent can give
up by agreeing to pay Level(3) for transit or someone else
for transit to Level(3). In the past, Cogent has pursued this
strategy at least twice, once successfully (Teleglobe) and once
unsuccessfully (OpenTransit). It will be interesting to see what the
result is this time.
Acknowledgement:
This post relies heavily on the
discussion of this event on NANOG (in particular this post by Richard A. Steenbergen), and on discussions with Dave Meyer.
All errors are, as usual, my own.
1 The way that Internet routing works is that a
route advertisement is for a contiguous block of IP addresses.
For instance, the route 192.168/16 means "any IP address whose
first two (most significant) bytes are 192.168". Because the
addresses are written most significant to least significant,
this means that any address in the block (e.g. 192.168.1.1)
must have the block's prefix.
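As a rough illustration of the footnote above, this Python sketch checks prefix membership by hand: it keeps only the most significant bits of the address and compares them with the block's. The helper names are my own, and it handles only dotted-quad IPv4 addresses.

```python
def to_int(dotted):
    """Convert a dotted-quad IPv4 address to a 32-bit integer."""
    a, b, c, d = (int(part) for part in dotted.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d

def in_prefix(address, prefix, length):
    """True if the first `length` bits of address match those of prefix."""
    # Mask with `length` one-bits at the most significant end.
    mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
    return (to_int(address) & mask) == (to_int(prefix) & mask)

print(in_prefix("192.168.1.1", "192.168.0.0", 16))  # first two bytes match
print(in_prefix("192.169.1.1", "192.168.0.0", 16))  # second byte differs
```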
2 The key parameter here is the number of ISPs (actually
Autonomous Systems, or ASes) that the traffic has to pass through.
BGP uses the AS_PATH parameter to carry this information.
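A toy Python comparison of the two ways D could reach E shows why the direct link is attractive. The path lists and the function name are assumptions for illustration, and real BGP route selection weighs other attributes (such as local preference) before it looks at path length.

```python
# Candidate AS paths from D to E, written as the ASes the traffic crosses
# after leaving D. Link 6 is the direct D-E connection from the text.
candidate_paths = {
    "via A": ["A", "C", "E"],
    "via link 6": ["E"],
}

def shortest_path(paths):
    """Return the name of the candidate with the fewest ASes in its path."""
    return min(paths, key=lambda name: len(paths[name]))

print(shortest_path(candidate_paths))
```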