By Art Reisman
CTO – APconnections
As a general rule, when a network router sees more packets than it can send or receive on a link, it will drop the extra packets. Intuitively, when your router is dropping packets, one would assume that the perceived slow down, per user, would be just a gradual shift slower.
What happens in reality is far worse…
1) Distant users get spiraling slower responses.
Martin Roth, a colleague of ours who founded one of the top performance analysis companies in the world, provided this explanation:
“Any device which is dropping packets “favors” streams with the shortest round trip time, because (according to the TCP protocol) the time after which a lost packet is recovered is depending on the round trip time. So when a company in Copenhagen/Denmark has a line to Australia and a line to Germany on the same internet router, and this router is discarding packets because of bandwidth limits/policing, the stream to Australia is getting much bigger “holes” per lost packet (up to 3 seconds) than the stream to Germany or another office in Copenhagen. This effect then increases when the TCP window size to Australia is reduced (because of the retransmissions), so there are fewer bytes per round trip and more holes between to round trips.”
In the screen shot above (courtesy of avenida.dk), the Bandwidth limit is 10 Mbit (= 1 Mbyte/s net traffic), so everything on top of that will get discarded. The problem is not the discards, this is standard TCP behaviour, but the connections that are forcefully closed because of the discards. After the peak in closed connections, there is a “dip” in bandwidth utilization, because we cut too many connections.
2) Once you hit a congestion point, where your router is forced to drop packets, overall congestion actually gets worse before it gets better.
When applications don’t get a response due to a dropped packet, instead of backing off and waiting, they tend to start sending re-tries, and this is why you may have noticed prolonged periods (3o seconds or more) of no service on a congested network. We call this the rolling brown out. Think of this situation as sort of a doubling down on bandwidth at the moment of congestion. Instead of easing into a full network and lightly bumping your head, all the devices demanding bandwidth ramp up their requests at precisely the moment when your network is congested, resulting in an explosion of packet dropping until everybody finally gives up.
How do you remedy outages caused by Congestion?
We have written extensively about solutions to prevent bottlenecks. Here is a quick summary with links:
1) The most obvious being to increase the size of your link.
2) Enforce rate limits per user.
3) Wse something more sophisticated like a Netequalizer, a device that is designed to specifically counter the effects of congestion.
From Martin Roth of Avenida.dk
“With NetEqualizer we may get the same number of discards, but we get fewer connections closed, because we “kick” the few connections with the high bandwidth, so we do not get the “dip” in bandwidth utilization.
The graphs (above) were recorded using 1 second intervals, so here you can see the bandwidth is reached. In a standard SolarWinds graph with 10 minute averages the bandwidth utilization would be under 20% and the customer would not know they are hitting the limit.”
The excerpt below was a message from a reseller who had been struggling with congestion issues at a hotel, he tried basic rate limits on his router first. Rate Limits will buy you some time , but on an oversold network you can still hit the congestion point, and for this you need a smarter device.
“…NetEq delivered a 500% gain in available bandwidth by eliminating rate caps, possible through a mix of connection limits and Equalization. Both are necessary. The hotel went from 750 Kbit max per accesspoint (entire hotel lobby fights over 750Kbit; divided between who knows how many users) to 7Mbit or more available bandwidth for single users with heavy needs.
Pros and Cons of Using Your Router as a Bandwidth ControllerJanuary 30, 2011 — netequalizer
So, you already have a router in your network, and rather than take on the expense of another piece of equipment, you want to double-up on functionality by implementing your bandwidth control within your router. While this is sound logic and may be your best decision, as always, there are some other factors to consider.
Here are a few things to think about:
1. Routers are optimized to move packets from one network to another with utmost efficiency. To do this function, there is often minimal introspection of the data, meaning the router does one table look-up and sends the data on its way. However, as soon as you start doing some form of bandwidth control, your router now must perform a higher-level of analysis on the data. Additional analysis can overwhelm a router’s CPU without warning. Implementing non-routing features, such as protocol sniffing, can create conditions that are much more complex than the original router mission. For simple rate limiting there should be no problem, but if you get into more complex bandwidth control, you can overwhelm the processing power that your router was designed for.
2. The more complex the system, the more likely it is to lock up. For example, that old analog desktop phone set probably never once crashed. It was a simple device and hence extremely reliable. On the other hand, when you load up an IP phone on your Windows PC, you will reduce reliability even though the function is the same as the old phone system. The problem is that your Windows PC is an unreliable platform. It runs out of memory and buggy applications lock it up.
This is not news to a Windows PC owner, but the complexity of a mission will have the same effect on your once-reliable router. So, when you start loading up your router with additional missions, it is increasingly more likely that it will become unstable and lock up. Worse yet, you might cause a subtle network problem (intermittent slowness, etc.) that is less likely to be identified and fixed. When you combine a bandwidth controller/router/firewall together, it can become nearly impossible to isolate problems.
3. Routing with TOS bits? Setting priority on your router generally only works when you control both ends of the link. This isn’t always an option with some technology. However, products such as the NetEqualizer can supply priority for VoIP in both directions on your Internet link.
4. A stand-alone bandwidth controller can be moved around your network or easily removed without affecting routing. This is possible because a bandwidth controller is generally not a routable device but rather a transparent bridge. Rearranging your network setup may not be an option, or simply becomes much more difficult, when using your router for other functions, including bandwidth control.
These four points don’t necessarily mean using a router for bandwidth control isn’t the right option for you. However, as is the case when setting up any network, the right choice ultimately depends on your individual needs. Taking these points into consideration should make your final decision on routing and bandwidth control a little easier.