Network Bottlenecks – When Your Router Drops Packets, Things Can Get Ugly


By Art Reisman

CTO – APconnections

As a general rule, when a network router sees more packets than it can send or receive on a link, it will drop the extra  packets. Intuitively, when your router is dropping packets, one would assume that the perceived slow down, per user, would be just a gradual shift slower.

What happens in reality is far worse…

1) Distant users get spiraling slower responses.

Martin Roth, a colleague of ours who founded one of the top performance analysis companies in the world, provided this explanation:

“Any device which is dropping packets “favors” streams with the shortest round trip time, because (according to the TCP protocol) the time after which a lost packet is recovered is depending on the round trip time. So when a company in Copenhagen/Denmark has a line to Australia and a line to Germany on the same internet router, and this router is discarding packets because of bandwidth limits/policing, the stream to Australia is getting much bigger “holes” per lost packet (up to 3 seconds) than the stream to Germany or another office in Copenhagen. This effect then increases when the TCP window size to Australia is reduced (because of the retransmissions), so there are fewer bytes per round trip and more holes between to round trips.”

In the screen shot above (courtesy of avenida.dk), the Bandwidth limit is 10 Mbit (= 1 Mbyte/s net traffic), so everything on top of that will get discarded. The problem is not the discards, this is standard TCP behaviour, but the connections that are forcefully closed because of the discards. After the peak in closed connections, there is a “dip” in bandwidth utilization, because we cut too many connections.

2) Once you hit a congestion point, where your router is forced to drop packets, overall congestion actually gets worse before it gets better.

When applications don’t get a response due to a dropped packet, instead of backing off and waiting, they tend to start sending re-tries, and this is why you may have noticed prolonged periods (3o seconds or more) of no service on a congested network. We call this the rolling brown out. Think of this situation as sort of a doubling down on bandwidth at the moment of congestion. Instead of easing into a full network and lightly bumping your head, all the devices demanding bandwidth ramp up their requests at precisely the moment when your network is congested, resulting in an explosion of packet dropping until everybody finally gives up.

How do you remedy outages caused by Congestion?

We have written extensively about solutions to prevent bottlenecks. Here is a quick summary with links:

1) The most obvious being to increase the size of your link.

2) Enforce rate limits per user.

3) Wse something more sophisticated like a Netequalizer, a device that is designed to specifically counter the effects of congestion.

From Martin Roth of Avenida.dk

“With NetEqualizer we may get the same number of discards, but we get fewer connections closed, because we “kick” the few connections with the high bandwidth, so we do not get the “dip” in bandwidth utilization.

The graphs (above) were recorded using 1 second intervals, so here you can see the bandwidth is reached. In a standard SolarWinds graph with 10 minute averages the bandwidth utilization would be under 20% and the customer would not know they are hitting the limit.”

———————————————————————-

The excerpt below was a message from a reseller who had been struggling with congestion issues at a hotel, he tried basic rate limits on his router first. Rate Limits will buy you some time , but on an oversold network you can still hit the congestion point, and for this you need a smarter device.

“…NetEq delivered a 500% gain in available bandwidth by eliminating rate caps, possible through a mix of connection limits and Equalization.  Both are necessary.  The hotel went from 750 Kbit max per accesspoint (entire hotel lobby fights over 750Kbit; divided between who knows how many users) to 7Mbit or more available bandwidth for single users with heavy needs.

The ability to fully load the pipe, then reach out and instantly take back up to a third of it for an immediate need like a speedtest was also really eye-opening.  The pipe is already maxed out, but there is always a third of it that can be immediately cleared in time to perform something new and high-priority like a speed test.”
 
Rate Caps: nobody ever gets a fast Internet connection.
Equalized: the pipe stays as full as possible, yet anybody with a business-class need gets served a major portion of the pipe on demand. “
– Ben Whitaker – jetsetnetworks.com

Are those rate limits on your router good enough?

How does your ISP actually enforce your Internet Speed


By Art Reisman, CTO, www.netequalizer.com

Art Reisman CTO www.netequalizer.com

YT

Have you ever wondered how your ISP manages to control the speed of your connection? If so, you might find the following article enlightening.  Below, we’ll discuss the various trade-offs used to control and break out bandwidth rate limits and the associated side effects of using those techniques.

Dropping Packets (Cisco term “traffic policing”)

One of the simplest methods for a bandwidth controller to enforce a rate cap is by dropping packets. When using the packet-dropping method, the bandwidth controlling device will count the total number of bytes that cross a link during a second.  If the target rate is exceeded during any single second, the bandwidth controller will drop packets for the remainder of that second. For example, if the bandwidth limit is 1 megabit, and the bandwidth controller counts 1 million bits gone by  in 1/2 a second, it will then drop packets for the remainder of the second.  The counter will then reset for the next second. From most evidence we have observed, rate caps enforced by many ISPs use the drop packet method, as it is the least expensive method supported on most basic routers.

So, what is wrong with dropping packets to enforce a bandwidth cap?

Well, when a link hits a rate cap and packets are dropped en masse, it can wreak havoc on a network. For example, the standard reaction of a Web browser when it perceives web traffic is getting lost is to re-transmit the lost data. For a better understanding of dropping packets, let’s use the analogy of a McDonald’s fast food restaurant.

Suppose the manager of the restaurant was told his bonus was based on making sure there was a never a line at the cash register. So, whenever somebody showed up to order food when all registers were occupied, the manager would open a trap door conveniently ejecting   the customer back out into the parking lot.  The customer, being extremely hungry, will come running back in the door (unless of course they die of starvation or get hit by a car) only to be ejected again. To make matters worse, let’s suppose a bus load of school kids arrive. As the kids file in to the McDonald’s, the remaining ones on the bus have no idea their classmates inside are getting ejected, so they keep streaming into the McDonald’s. Hopefully, you get the idea.

Well, when bandwidth shapers deploy packet-dropping technology to enforce a rate cap, you can get the same result seen with the trapdoor analogy in the McDonald’s. Web browsers and other user-based applications will beat their heads into the wall when they don’t get responses from their counterparts on the other end of the line. When packets are being dropped en masse,  the network tends to spiral out-of-control until all the applications essentially give up.  Perhaps you have seen this behavior while staying at a hotel with an underpowered Internet link. Your connectivity will alternate between working and then hanging up completely for a minute or so during busy hours. This can obviously be very maddening.

The solution to shaping bandwidth on a network without causing gridlock is to implement queuing.

Queuing Packets (Cisco term “traffic shaping”)

Queuing is the art of putting something in a line and making it wait before continuing on. Obviously, this is what fast food restaurants do in reality. They plan enough staff on hand to handle the average traffic throughout the day, and then queue up their customers when they are arriving at a faster rate then they can fill orders. The assumption with this model is that at some point during the day the McDonald’s will get caught up with the number of arriving customers and the lines will shrink away.

Another benefit of queuing is that wait times can perhaps be estimated by customers as they drive by and see the long line extending out into the parking lot, and thus, they will save their energy and not attempt to go inside.

But, what happens in the world of the Internet?

With queuing methods implemented, a bandwidth controller looks at the data rate of the incoming packets, and if deemed too fast, it will delay the packets in a queue. The packets will eventually get to their destination, albeit somewhat later than expected. Packets on queue can pile up very quickly, and without some help, the link would saturate. Computer memory to store the packets in the queue would also saturate and, much like the scenario mentioned above, the packets would eventually get dropped if they continued to come in at a faster rate than they were sent out.

TCP to the Rescue (keeping queuing under control)

Most internet applications use a service called TCP (transmission control protocol) to handle their data transfers. TCP has developed intelligence to figure out the speed of the link for which it is sending data on, and then can make adjustments. When the NetEqualizer bandwidth controller queues a packet or two, the TCP controllers on the customer end-point computers will sense the slower packets and back off the speed of the transfer. With just a little bit of queuing, the sender slows down a bit and dropping packets can be kept to a minimum.

Queuing Inside the NetEqualizer

The NetEqualizer bandwidth shaper uses a combination of queuing and dropping packets to get speed under control. Queuing is the first option, but when a sender does not back off eventually, their packets will get dropped. For the most part, this combination of queuing and dropping works well.

So far we have been inferring a simple case of a single sender and a single queue, but what happens if you have gigabit link with 10,000 users and you want to break off 100 megabits to be shared by 3000 users? How would a bandwidth shaper accomplish this? This is another area where a well-designed bandwidth controller like the NetEqualizer separates itself from the crowd.

In order to provide smooth shaping for a large group of users sharing a link, the NetEqualizer does several things in combination.

  1. It keeps track of all streams, and based on their individual speeds, the NetEqualizer will use different queue delays on each stream.
  2. Streams that back off will get minimal queuing
  3. Streams that do not back off may eventually have some of their packets dropped

The net effect of the NetEqualizer queuing intelligence is that all users will experience steady response times and smooth service.

Notes About UDP and Rate Limits

Some applications such as video do not use TCP to send data. Instead, they use a “send-and-forget” mechanism called UDP, which has no built-in back-off mechanism. Without some higher intelligence, UDP packets will continue to be sent at a fixed rate, even if the packets are coming too quickly for the receiver.  The good news is that even most UDP applications also have some way of measuring if their packets are getting to their destination. It’s just that with UDP, the mechanism of synchronization is not standardized.

Finally there are those applications that just don’t care if the packets get to their destination. Speed tests and viruses send UDP packets as fast as they can, regardless of whether the network can handle them or not. The only way to enforce a rate cap with such ill-mannered application is to drop the packets.

Hopefully this primer has given you a good introduction to the mechanisms used to enforce Internet Speeds, namely dropping packets & queuing.  And maybe you will think about this the next time you visit a fast food restaurant during their busy time…

%d bloggers like this: