A Brief History of Peer to Peer File Sharing and the Attempts to Block It


By Art Reisman

The following history is based on my notes and observations as both a user of peer-to-peer and as a network engineer tasked with cleaning it up.

Round One, Napster, Centralized Server, Circa 2002

Napster was a centralized service; unlike the peer-to-peer behemoths of today, there was never any question of where the copyrighted material was being stored and pirated from. Even though Napster did not condone pirated music and movies on its site, the courts decided that by allowing copyrighted material to exist on its servers, it was in violation of copyright law. Napster’s days of free love were soon over.

From a historical perspective, the importance of the decision to force the shutdown of Napster was that it gave rise to a whole new breed of p2p applications. We detailed this phenomenon in our 2008 article.

Round Two, Mega-Upload  Shutdown, Centralized Server, 2012

We again saw a doubling down on p2p client sites (they expanded) when the Mega-Upload site, a centralized sharing site, was shut down back in January 2012.

“On the legal side, the recent widely publicized MegaUpload takedown refocused attention on less centralized forms of file sharing (i.e. P2P). Similarly, improvements in P2P technology coupled with a growth in file sharing file size from content like Blue-Ray video also lead many users to revisit P2P.”

Read the full article from deepfield.net

The shutdown of Mega-Upload had a personal effect on me, as I had used it to distribute a 30-minute account from a 92-year-old WWII vet in which he recalled, in his own words, his experience of surviving a German prison camp.

Blocking by Signature, a.k.a. Layer 7 Shaping, a.k.a. Deep Packet Inspection. Late 1990s till Present

Initially hailed as the shining star for spotting illegal content on your network, this technology can be expensive and fails miserably in the face of newer encrypted p2p applications. It also gets quite expensive to keep up with the ever-changing application signatures, and yet it is still often the first line of defense attempted by ISPs.

We covered this topic in detail in our recent article, Layer 7 Shaping Dying With SSL.

Blocking by Website

Blocking the source sites where users download their p2p clients is still possible. We see this method applied mostly at private secondary schools, where content blocking is an accepted practice. This method does not work for computers and devices that already have p2p clients. Once a client is loaded, p2p files can come from anywhere and there is no centralized site to block.

Blocking Uninitiated Requests. Circa Mid-2000s

The idea behind this method is to prevent your network from serving up any content whatsoever! It sounds a bit harsh, but the average Internet consumer rarely, if ever, hosts anything intended for public consumption. Yes, at one time, during the early stages of the Internet, my geek friends would set up home pages similar to what everybody exposes on Facebook today. Now, with the advent of hosting sites, there is just no reason for a user to host content locally, and thus no need to allow access from the outside. Most firewalls have a setting to disallow uninitiated requests into your network (obviously with an exemption for your publicly facing servers).

We actually have an advanced version of this feature in our NetGladiator security device. We watch each IP address on your internal network and take note of outgoing requests; nobody comes in unless they were invited. For example, if we see a user on the network make a request to a Yahoo server, we expect a response to come back from a Yahoo server; however, if we see a Yahoo server contact a user on your network without a pending request, we block that incoming request. In the world of p2p, this should prevent an outside client from requesting and receiving a copyrighted file hosted on your network. After all, no p2p client is going to randomly send out invites to outside servers. Or would they?
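To make the mechanics concrete, here is a minimal sketch of the idea, assuming traffic has already been reduced to simple source/destination events. It is illustrative only and is not the actual NetGladiator code.

```python
# Minimal sketch of "block uninitiated requests" logic (hypothetical,
# not the actual NetGladiator implementation).

class UninitiatedRequestFilter:
    def __init__(self, internal_hosts):
        self.internal_hosts = set(internal_hosts)   # IPs on our LAN
        self.pending = set()                        # (internal_ip, external_ip) pairs we invited

    def outbound(self, src_ip, dst_ip):
        """An internal host contacts an outside server: remember the invitation."""
        if src_ip in self.internal_hosts:
            self.pending.add((src_ip, dst_ip))

    def inbound_allowed(self, src_ip, dst_ip):
        """An outside host contacts an internal host: allow only if invited."""
        return (dst_ip, src_ip) in self.pending


# Example usage with made-up addresses
fw = UninitiatedRequestFilter(["10.0.0.5"])
fw.outbound("10.0.0.5", "98.137.11.163")               # user requests a Yahoo page
print(fw.inbound_allowed("98.137.11.163", "10.0.0.5"))  # True: response was invited
print(fw.inbound_allowed("203.0.113.9", "10.0.0.5"))    # False: uninvited outside probe
```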

I spent a few hours researching this subject, and here is what I found (this may need further citations). It turns out that p2p distribution may be a bit more sophisticated and has ways to get around the block-uninitiated-requests firewall technique.

P2P networks such as Pirate Bay use a directory service of super nodes to keep track of what content peers have and where to find them. When you load up your p2p client for the first time, it just needs to find one super node to get connected; from there it can start searching for available files.

Note: You would think that if these super nodes were aiding and abetting the distribution of illegal content, the RIAA could just shut them down like it did Napster. There are two issues with this assumption:

1) The super nodes do not necessarily host content, hence they are not violating any copyright laws. They simply coordinate the network in the same way a DNS service keeps track of domain names and where to find servers.
2) The super nodes are not hosted by Pirate Bay; they are basically commandeered from the network of users, who unwittingly agree to perform this directory service when clicking the license agreement that nobody ever reads.

In my research I have talked to network administrators who claim that, despite blocking uninitiated outside requests on their firewalls, they still get RIAA notices. How can this be?

There are only two ways this can happen.

1) The RIAA is taking the liberty of simply accusing a network of hosting illegal content based on the directory listings of a super node. In other words, if they find a directory entry on a super node pointing to copyrighted files on your network, that might be information enough to accuse you.

2) More likely, and much more complex, is that the super nodes are brokering the transaction as a condition of being connected. Basically this means that when a p2p client within your network contacts a super node for information, the super node directs the client to send data to a third-party client on another network. Thus the transfer of information from inside your network looks to the firewall as if it was initiated from within. You may have to think about this, but it makes sense.

Behavior-based thwarting of p2p. Circa 2004 – NetEqualizer

Behavior-based shaping relies on spotting the unique footprint of a client sending and receiving p2p traffic. From our experience, these clients just do not know how to lay low and stay under the radar. It’s like a criminal smuggling drugs while doing 100 MPH on the highway; they just can’t help themselves. Part of the p2p methodology is to find as many sources of files as possible and then download from all sources simultaneously. Combine this behavior with the fact that most p2p consumers are trying to build up a library of content, and thus initiating many file requests, and you get a behavior footprint that can easily be spotted. By spotting this behavior and making life miserable for these users, you can achieve self-compliance on your network.
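As a rough illustration of what spotting that footprint can mean in practice (a simplified sketch with invented thresholds, not NetEqualizer's actual algorithm), a monitor might count how many distinct remote hosts each internal IP talks to over a short window and flag the outliers:

```python
# Simplified sketch of behavior-based p2p spotting: count distinct remote
# peers per internal host over a sliding time window and flag heavy talkers.
# The window and threshold are illustrative, not real product parameters.
import time
from collections import defaultdict

WINDOW_SECONDS = 60
PEER_THRESHOLD = 50          # distinct remote hosts within the window

flows = defaultdict(list)    # internal_ip -> [(timestamp, remote_ip), ...]

def record_flow(internal_ip, remote_ip, now=None):
    now = now if now is not None else time.time()
    flows[internal_ip].append((now, remote_ip))

def looks_like_p2p(internal_ip, now=None):
    now = now if now is not None else time.time()
    recent = [(t, r) for (t, r) in flows[internal_ip] if now - t <= WINDOW_SECONDS]
    flows[internal_ip] = recent                      # drop stale entries
    distinct_peers = {r for (_, r) in recent}
    return len(distinct_peers) > PEER_THRESHOLD
```

A host opening connections to dozens of peers within a minute trips the threshold, while a normal web surfer stays well below it.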

Read a smarter way to block p2p traffic.

Blocking the RIAA probing servers

If you know where the RIAA is probing from, you can deny all traffic from those addresses, preventing probes of files on your network and the ensuing nasty cease-and-desist letters.

What Is Deep Packet Inspection and Why the Controversy?


By Art Reisman, CTO, www.netequalizer.com

Editor’s note: Art Reisman is the CTO of APconnections. APconnections designs and manufactures the popular NetEqualizer bandwidth shaper. APconnections removed all deep packet inspection technology from their NetEqualizer product over 2 years ago.

Article Updated March 2012

As the debate over Deep Packet Inspection continues, network administrators are often faced with a difficult decision: ensure network quality or protect user privacy. However, the legality of the practice is now being called into question, adding a new twist to the mix. Yet, for many Internet users, deep packet inspection continues to be an ambiguous term in need of explanation. In the discussion that follows, deep packet inspection will be explored in the context of the ongoing debate.

Exactly what is deep packet inspection?

All traffic on the Internet travels around in what is called an IP packet. An IP packet is a string of characters moving from computer A to computer B. On the outside of this packet is the address where it is being sent. On the inside of the packet is the data that is being transmitted.

The string of characters on the inside of the packet can be conceptually thought of as the “payload,” much like the freight inside of a railroad car. These two elements, the address and the payload, comprise the complete IP packet.

When you send an e-mail across the Internet, all your text is bundled into packets and sent on to its destination. A deep packet inspection device literally has the ability to look inside those packets and read your e-mail (or whatever the content might be).
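As a toy illustration of what "looking inside a packet" means (plain keyword matching on a raw IPv4/TCP packet; real DPI engines rely on large, maintained signature libraries), a sketch might look like this:

```python
# Toy deep packet inspection: pull the payload out of a raw IPv4/TCP packet
# and scan it for a keyword. This only shows the mechanics of reading past
# the headers; it is not any vendor's signature engine.
def tcp_payload(raw_packet: bytes) -> bytes:
    ihl = (raw_packet[0] & 0x0F) * 4                        # IPv4 header length in bytes
    data_offset = (raw_packet[ihl + 12] >> 4) * 4           # TCP header length in bytes
    return raw_packet[ihl + data_offset:]

def contains_keyword(raw_packet: bytes, keyword: bytes) -> bool:
    return keyword in tcp_payload(raw_packet)
```

The input here is assumed to be a raw IPv4/TCP capture, for example from a packet capture library; the point is simply that the payload bytes are right there for any middlebox to read.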

Products sold that use DPI are essentially specialized snooping devices that examine the content (the payload inside) of Internet packets. Other terms sometimes used to describe techniques that examine Internet data are packet shapers, layer-7 traffic shaping, etc.

How is deep packet inspection related to net neutrality?

Net neutrality is based on the belief that nobody has the right to filter content on the Internet. Deep packet inspection is a method used for filtering. Thus, there is a conflict between the two approaches. The net neutrality debate continues to rage in its own right.

Why do some Internet providers use deep packet inspection devices?

There are several reasons:

1) Targeted advertising — If a provider knows what you are reading, they can display related advertising on the pages they control, such as your login screen or e-mail account.

2) Reducing “unwanted” traffic — Many providers are getting overwhelmed by types of traffic that they deem less desirable, such as BitTorrent and other forms of peer-to-peer. BitTorrent traffic can overwhelm a network with volume. By detecting and redirecting the BitTorrent traffic, or slowing it down, a provider can alleviate congestion.

3) Block offensive material — Many companies or institutions that perform content filtering are looking inside packets to find, and possibly block, offensive material or web sites.

4) Government spying — In the case of Iran (and to some extent China), DPI is used to keep tabs on the local population.

When is it appropriate to use deep packet inspection?

1) Full disclosure — Private companies/institutions/ISPs that notify employees that their Internet use is not considered private have the right to snoop, although I would argue that creating an atmosphere of mistrust is not the mark of a healthy company.

2) Law enforcement — Law enforcement agencies with a warrant issued by a judge would be the other legitimate use.

3) Intrusion detection and prevention — It is one thing to be acting as an ISP and to eavesdrop on a public conversation; it is entirely another paradigm if you are a private business examining the behavior of somebody coming in your front door. For example, in a private home it is within your rights to look through your peephole and not let shady characters into your home. In a private business it is a good idea to use deep packet inspection in order to block unwanted intruders from your network. Blocking bad guys before they break into and damage your network is perfectly acceptable.

4) Spam filtering — Most consumers are very happy to have their ISP or email provider remove spam. I would categorize this type of DPI as implied disclosure. For example, in Gmail you do have the option to turn spam filtering off, and although most consumers may not realize that Google is reading their mail (humans don’t read it, but computer scanners do), their motives are understood. What consumers may not realize is that their email provider is also reading everything they do in order to serve targeted advertising.

Does Content Filtering Use Deep Packet Inspection?

For the most part, no. Content filtering is generally done at the URL level. URLs are generally considered public information, as network equipment needs to look them up anyway. We have only encountered content filters at private institutions that are within their rights.

What about spam filtering, does that use Deep Packet Inspection?

Yes, many spam filters will look at content, and most people could not live without their spam filter. However, with spam filtering most people have opted in at one point or another, hence it is generally done with permission.

What is all the fuss about?

It seems that consumers are finally becoming aware of what is going on behind the scenes as they surf the Internet, and they don’t like it. What follows are several quotes and excerpts from articles written on the topic of deep packet inspection. They provide an overview not only of how DPI is currently being used, but also the many issues that have been raised with the practice.

For example, this is an excerpt from a recent PC World article:

Not that we condone other forms of online snooping, but deep packet inspection is the most egregious and aggressive invasion of privacy out there….It crosses the line in a way that is very frightening.

Paul Stephens, director of policy and advocacy for the Privacy Rights Clearinghouse, as quoted in the E-Commerce Times on November 14, 2008. Read the full article here.

Recently, Comcast had their hand slapped for re-directing Bittorrent traffic:

Speaking at the Stanford Law School Center for Internet and Society, FCC Chairman Kevin Martin said he’s considering taking action against the cable operator for violating the agency’s network-neutrality principles. Seems Martin was troubled by Comcast’s dissembling around the BitTorrent issue, not to mention its efforts to pack an FCC hearing on Net neutrality with its own employees.

— Digital Daily, March 10, 2008. Read the full article here.

Later in 2008, the FCC came down hard on Comcast.

In a landmark ruling, the Federal Communications Commission has ordered Comcast to stop its controversial practice of throttling file sharing traffic.

By a 3-2 vote, the commission on Friday concluded that Comcast monitored the content of its customers’ internet connections and selectively blocked peer-to-peer connections.

Wired.com, August 1, 2008. Read the full article here.

To top everything off, some legal experts are warning companies practicing deep packet inspection that they may be committing a felony.

University of Colorado law professor Paul Ohm, a former federal computer crimes prosecutor, argues that ISPs such as Comcast, AT&T and Charter Communications that are or are contemplating ways to throttle bandwidth, police for copyright violations and serve targeted ads by examining their customers’ internet packets are putting themselves in criminal and civil jeopardy.

Wired.com, May 22, 2008. Read the full article here.

However, it looks like things are going the other way in the U.K., as Britain’s Virgin Media has announced they are dumping net neutrality in favor of targeting BitTorrent.

The UK’s second largest ISP, Virgin Media, will next year introduce network monitoring technology to specifically target and restrict BitTorrent traffic, its boss has told The Register.

The Register, December 16, 2008. Read the full article here.

Canadian ISPs confess en masse to deep packet inspection in January 2009.

With the amount of attention being paid to Comcast recently, a lot of people around the world have begun to look at their ISPs and wonder exactly what happens to their traffic once it leaves. This is certainly true for Canada, where several Canadian ISPs have come under the scrutiny of the CRTC, the regulatory agency responsible for Canada. After investigation, it was determined that all large ISPs in Canada filter P2P traffic in some fashion.

Tech Spot, January 21, 2009. Read the full article here.

In April 2009, U.S. lawmakers announced plans to introduce legislation that would limit how ISPs could track users. Online privacy advocates spoke out in support of such legislation.

In our view, deep packet inspection is really no different than postal employees opening envelopes and reading letters inside. … Consumers simply do not expect to be snooped on by their ISPs or other intermediaries in the middle of the network, so DPI really defies legitimate expectations of privacy that consumers have.

Leslie Harris, president and CEO of the Center for Democracy and Technology, as quoted on PCWorld.com on April 23, 2009. Read the full article here.

The controversy continues in the U.S. as AT&T is accused of traffic shaping, lying and blocking sections of the Internet.

7/26/2009 could mark a turning point in the life of AT&T, when the future looks back on history, as the day that the shady practices of an ethically challenged company finally caught up with them: traffic filtering, site banning, and lying about service packages can only continue for so long before the FCC, along with the bill-paying public, takes a stand.

Kyle Brady, July 27, 2009. Read the full article here.

[February 2011 Update] The Egyptian government uses DPI to filter elements of its Internet traffic, and this act in itself became the news story. In this news piece, Al Jazeera takes the opportunity to put out an unflattering report on Narus, the company that makes the DPI technology and sold it to the Egyptians.

While the debate over deep packet inspection will likely rage on for years to come, APconnections made the decision to fully abandon the practice over two years ago, having since proved the viability of alternative approaches to network optimization. Network quality and user privacy are no longer mutually exclusive goals.

Created by APconnections, the NetEqualizer is a plug-and-play bandwidth control and WAN/Internet optimization appliance that is flexible and scalable. When the network is congested, NetEqualizer’s unique “behavior shaping” technology dynamically and automatically gives priority to latency sensitive applications, such as VoIP and email. Click here for a full price list.

Does Lower-Cost Bandwidth Foretell a Decline in Expensive Packet Shapers?


This excerpt is from a recent interview with Art Reisman and has some good insight into the future of bandwidth control appliances.

Are you seeing a drop off in layer 7 bandwidth shapers in the marketplace?

In the early stages of the Internet, up until the early 2000s, the application signatures were not that complex and they were fairly easy to classify. Plus, the cost of bandwidth was in some cases 10 times more expensive than 2010 prices. These two factors made the layer 7 solution a cost-effective idea. But over time, as bandwidth costs dropped, speeds got faster and the hardware and processing power required in the layer 7 shapers actually rose. So, now in 2010, with much cheaper bandwidth, the layer 7 shaper market is less effective and more expensive. IT people still like the idea, but slowly over time price and performance are winning out. I don’t think the idea of a layer 7 shaper will ever go away, because there are always new IT people coming into the market and they go through the same learning curve. There are also many WAN-type installations that combine layer 7 with compression for an effective boost in throughput. But even the business ROI for those installations is losing some luster as bandwidth costs drop.

So, how is the NetEqualizer doing in this tight market where bandwidth costs are dropping? Are customers just opting to toss their NetEqualizer in favor of adding more bandwidth?

There are some that do not need shaping at all, but then there are many customers that are moving from $50,000 solutions to our $10,000 solution as they add more bandwidth. At the lower price points, bandwidth shapers still make sense with respect to ROI. Even with lower bandwidth costs, users will almost always clog the network with new, more aggressive applications. You still need a way to gracefully stop them from consuming everything, and the NetEqualizer at our price point is a much more attractive solution.

Related article on Packeteer’s recent decline in revenue

Related article: Layer 7 becoming obsolete with SSL

The Inside Scoop on Where the Market for Bandwidth Control Is Going


Editor’s Note: The modern traffic shaper appeared in the market in the late 1990s. Since then market dynamics have changed significantly. Below we discuss these changes with industry pioneer and APconnections CTO Art Reisman.

Editor: Tell us how you got started in the bandwidth control business?

Back in 2002, after starting up a small ISP, my partners and I were looking for a tool that we could plug in to take care of the resource contention without spending too much time on it. At the time, we had a T1 to share among about 100 residential users and it was costing us $1200 per month, so we had to do something.

Editor: So what did you come up with?

I consulted with my friends at Cisco on what they had. Quite a few of my peers from Bell Labs had migrated to Cisco on the coattails of Kevin Kennedy, who was also from Bell Labs. After consulting with them and confirming there was nothing exactly turnkey at Cisco, we built the Linux Bandwidth Arbitrator (LBA) for ourselves.

How was the Linux Bandwidth Arbitrator distributed and what was the industry response?

We put out an early version for download on a site called Freshmeat. Most of the popular stuff on that site is home-user-oriented utilities and tools for Linux. Given that the LBA was not really a consumer tool, it rose like a rocket on that site. We were getting thousands of downloads a month, and about 10 percent of those were installing it someplace.

What did you learn from the LBA project?

We eventually bundled layer 7 shaping into the LBA. At the time that was the biggest feature request. We loosely partnered with the Layer 7 project and a group at the Computer Science Department at the University of Colorado to perfect our layer 7 patterns and filter. Some of the other engineers and I soon realized that layer 7 filtering, although cool and cutting edge, was a losing game with respect to time spent and costs. It was not impossible, but in reality it was akin to trying to conquer all software viruses and only getting half of them. The viruses that remain will multiply and take over because they are the ones running loose. At the same time we were doing layer 7, the core idea of Equalizing, the way we did fairness allocation on the LBA, was getting rave reviews.

What did you do next ?

We bundled the LBA into a CD for install and put a fledgling GUI interface on it. Many of the commercial users were happy to pay for the convenience, and from there we started catering to the commercial market, and now here we are with the modern version of the NetEqualizer.

How do you perceive the layer 7 market going forward?

Customers will always want layer 7 filtering. It is the first thing they think of, from the CIO on down. It appeals almost instinctively to people. The ability to identify traffic by type of application and then prioritize it by type is quite appealing. It is as natural as ordering from a restaurant menu.

We are not the only ones declaring a decline in deep packet inspection; we found this opinion on another popular blog regarding bandwidth control:

The bottom line is that while Deep Packet Inspection presentations include nifty graphs and seemingly exciting possibilities, it is only effective in streamlining small, very predictable networks. The basic concept is fundamentally flawed. The problem with large networks is not that bandwidth needs to be shifted from “bad” protocols to “good” protocols. The problem is volume. Volume must be managed in a way that maintains the strategic goals of the network administration. Nearly always this can be achieved with a macro approach of allocating a fair share to each entity that uses the network. Any attempt to micro-manage large networks ordinarily makes them worse, or at least simply results in shifting bottlenecks from one place to another.

So why did you get away from layer 7 support in the NetEqualizer back in 2007?

When trying to contain an open Internet connection it does not work very well. The costs to implement were going up and up. The final straw was when encrypted p2p hit the cloud. Encrypted p2p cannot be specifically classified. It essentially tunnels through $50,000 investments in layer 7 shapers, rendering them impotent. Just because you can easily sell a technology does not make it right.

We are here for the long haul to educate customers. Most of our NetEqualizers stay in service as originally intended for years without licensing upgrades. Most expensive layer 7 shapers are mothballed after about 12 months or are just scaled back to do simple reporting. Most products are driven by channel sales, and the channel does not like to work very hard to educate customers about alternative technology. They (the channel) are interested in margins, just as a bank likes to collect fees to increase profit. We, on the other hand, sell for the long haul on value, not just on what we can turn quickly because customers like what they see at first glance.

Are you seeing a drop off in layer 7 bandwidth shapers in the marketplace?

In the early stages of the Internet, up until the early 2000s, the application signatures were not that complex and they were fairly easy to classify. Plus, the cost of bandwidth was in some cases 10 times more expensive than 2010 prices. These two factors made the layer 7 solution a cost-effective idea. But over time, as bandwidth costs dropped, speeds got faster and the hardware and processing power required in the layer 7 shapers actually rose. So, now in 2010, with much cheaper bandwidth, the layer 7 shaper market is less effective and more expensive. IT people still like the idea, but slowly over time price and performance are winning out. I don’t think the idea of a layer 7 shaper will ever go away, because there are always new IT people coming into the market and they go through the same learning curve. There are also many WAN-type installations that combine layer 7 with compression for an effective boost in throughput. But even the business ROI for those installations is losing some luster as bandwidth costs drop.

So, how is the NetEqualizer doing in this tight market where bandwidth costs are dropping? Are customers just opting to toss their NetEqualizer in favor of adding more bandwidth?

There are some that do not need shaping at all, but then there are many customers that are moving from $50,000 solutions to our $10,000 solution as they add more bandwidth. At the lower price points, bandwidth shapers still make sense with respect to ROI.  Even with lower bandwidth costs, users will almost always clog the network with new more aggressive applications. You still need a way to gracefully stop them from consuming everything, and the NetEqualizer at our price point is a much more attractive solution.

Equalizing Compared to Application Shaping (Traditional Layer-7 “Deep Packet Inspection” Products)


Editor’s Note: (Updated with new material March 2012)  Since we first wrote this article, many customers have implemented the NetEqualizer not only to shape their Internet traffic, but also to shape their company WAN.  Additionally, concerns about DPI and loss of privacy have bubbled up. (Updated with new material September 2010)  Since we first published this article, “deep packet inspection”, also known as Application Shaping, has taken some serious industry hits with respect to US-based ISPs.   

==============================================================================================
Author’s Note: We often get asked how NetEqualizer compares to Packeteer (Bluecoat), NetEnforcer (Allot), Network Composer (Cymphonix), Exinda, and a plethora of other well-known companies that do Application Shaping (aka “packet shaping”, “deep packet inspection”, or “Layer-7” shaping).   After several years of these questions, and of discussing the different aspects of application shaping with former and current IT administrators, we’ve developed a response that should clarify the differences between NetEqualizer’s behavior-based approach and the rest of the pack.
We thought of putting our response into a short, bullet-by-bullet table format, but then decided that since this decision often involves tens of thousands of dollars, 15 minutes of education on the subject, with content to support the bullet chart, was in order.  If you want to skip the details, see our Summary Table at the end of this article.

However, if you’re looking to really understand the differences, and to have the question answered as objectively as possible, please take a few minutes to read on…
==============================================================================================

How NetEqualizer compares to Bluecoat, Allot, Cymphonix, & Exinda

In the following sections, we will cover specifically when and where Application Shaping is used, how it can be used to your advantage, and also when it may not be a good option for what you are trying to accomplish.  We will also discuss how Equalizing, NetEqualizer’s behavior-based shaping, fits into the landscape of application shaping, and how in many cases Equalizing is a much better alternative.

Download the full article (PDF)  Equalizing Compared To Application Shaping White Paper


$1000 Discount Offered Through NetEqualizer Cash For Conversion Program


After witnessing the overwhelming popularity of the government’s Cash for Clunkers new car program, we’ve decided to offer a similar deal to potential NetEqualizer customers. Therefore, this week, we’re announcing the launch of our Cash for Conversion program. The program offers owners of select brands (see below) of network optimization technology a $1000 credit toward the list-price purchase of NetEqualizer NE2000-10 or higher models (click here for a full price list). All owners have to do is send us their old (working or not) or out-of-license bandwidth control technology. Products from the following manufacturers will be accepted:

  • Exinda
  • Packeteer/Blue Coat
  • Allot
  • Cymphonix
  • Procera

In addition to receiving the $1000 credit toward a NetEqualizer, program participants will also have the peace of mind of knowing that their old technology will be handled responsibly through refurbishment or electronics recycling programs.

Only the listed manufacturers’ products will qualify. Offer good through the Labor Day weekend (September 7, 2009). For more information, contact us at 303-997-1300 or admin@apconnections.net.

Hitchhiker’s Guide To Network And WAN Optimization Technology


Manufacturers make all sorts of claims about speeding up your network with special technologies. In the following pages we’ll take a look at the different types of technologies, explaining them in such a way that you, the consumer, can make an informed decision on what is right for you.

Table of Contents

  • Compression – Relies on data patterns that can be represented more efficiently. Best suited for point-to-point leased lines.
  • Caching – Relies on human behavior, accessing the same data over and over. Best suited for point-to-point leased lines, but also viable for Internet connections and VPN tunnels.
  • Protocol Spoofing – Best suited for point-to-point WAN links.
  • Application Shaping – Controls data usage based on spotting specific patterns in the data. Best suited for both point-to-point leased lines and Internet connections. Very expensive to maintain in terms of initial cost, ongoing costs, and labor.
  • Equalizing – Makes assumptions on what needs immediate priority based on data usage. Excellent choice for Internet connections and clogged VPN tunnels.
  • Connection Limits – Prevents access gridlock in routers and access points. Best suited for Internet access where p2p usage is clogging your network.
  • Simple Rate Limits – Prevents one user from getting more than a fixed amount of data. Best suited as a stopgap first effort for remedying a congested Internet connection on a limited budget.

Compression

At first glance, the term compression seems intuitively obvious. Most people have at one time or another extracted a compressed Windows Zip file. Examining the file sizes pre- and post-extraction reveals there is more data on the hard drive after the extraction. WAN compression products use some of the same principles, only they compress the data on the WAN link and decompress it automatically once delivered, thus saving space on the link and making the network more efficient. Even though you likely understand compression of a Windows file conceptually, it would be wise to understand what is really going on under the hood during compression before making an investment to reduce network costs. Some questions to consider: How does compression really work? Are there situations where it may not work at all?

How it Works

A good, easy-to-visualize analogy to data compression is the use of shorthand when taking dictation. By using a single symbol for common words, a scribe can take written dictation much faster than if he were to spell out each entire word. Thus the basic principle behind compression techniques is to use shortcuts to represent common data. Commercial compression algorithms, although similar in principle, vary widely in practice. Each company offering a solution typically has its own trade secrets that it closely guards for a competitive advantage.

There are a few general rules common to all strategies. One technique is to encode a repeated character within a data file. For a simple example, let’s suppose we were compressing this very document and, as a format separator, we had a row of solid dashes.

The data for this solid dash line consists of the ASCII character "-" repeated approximately 160 times. When transporting the document across a WAN link without compression, this line of the document would require 160 bytes of data, but with clever compression we can encode it using a special notation, "-" x 160.

The compression device at the front end would read the 160-character line and realize: “Duh, this is stupid. Why send the same character 160 times in a row?” So it would incorporate a special code to depict the data more efficiently.
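The dashed-line example is essentially run-length encoding. Here is a minimal sketch of that idea; commercial WAN compressors use far more sophisticated, proprietary dictionary schemes, so treat this only as an illustration of the principle.

```python
# Minimal run-length encoder/decoder illustrating the "-" x 160 idea.

def rle_encode(text: str) -> list:
    encoded, i = [], 0
    while i < len(text):
        run = 1
        while i + run < len(text) and text[i + run] == text[i]:
            run += 1
        encoded.append((text[i], run))    # (character, repeat count)
        i += run
    return encoded

def rle_decode(encoded: list) -> str:
    return "".join(ch * count for ch, count in encoded)

line = "-" * 160
packed = rle_encode(line)                 # [('-', 160)]: a few bytes instead of 160
assert rle_decode(packed) == line         # lossless round trip
```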

Perhaps that was obvious, but it is important to know a little bit about compression techniques to understand the limits of their effectiveness. There are many types of data that cannot be efficiently compressed.

For example, many image and voice recordings are already optimized, and there is very little improvement in data size that can be accomplished with compression techniques. The companies that sell compression-based solutions should be able to provide you with profiles of what to expect based on the type of data sent on your WAN link.

Caching

Suppose you are the administrator for a network, and you have a group of 1,000 users who wake up promptly at 7:00 am each morning and immediately go to MSNBC.com to retrieve the latest news from Wall Street. This synchronized behavior would create 1,000 simultaneous requests for the same remote page on the Internet.

Or, in the corporate world, suppose the CEO of a multinational 10,000-employee business, right before the holidays, put out an all-points 20-page PDF file on the corporate site describing the new bonus plan. As you can imagine, all the remote WAN links might get bogged down for hours while each and every employee tried to download this file.

Well it does not take a rocket scientist to figure out that if somehow the MSNBC home page could be stored locally on an internal server that would alleviate quite a bit of pressure on your WAN link.

And in the case of the CEO memo, if a single copy of the PDF file was placed locally at each remote office it would alleviate the rush of data.

Caching does just that.

Offered by various vendors, caching can be very effective in many situations, and vendors can legitimately claim tremendous WAN speed improvements in some cases. Caching servers have built-in intelligence to store the most recently and most frequently requested information, thus preventing future requests from traversing the WAN link unnecessarily.

You may know that most desktop browsers do their own form of caching already. Many web servers keep a timestamp of their last update to data, and browsers such as the popular Internet Explorer will use a cached copy of a remote page after checking the timestamp.
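A bare-bones sketch of that freshness check follows. It is illustrative only: the fetch_from_origin callback stands in for a real HTTP request, and real browser and proxy caches follow the full HTTP caching rules rather than a simple age limit.

```python
# Bare-bones cache in the spirit of a browser or branch-office proxy:
# serve a local copy if it is still fresh, otherwise go to the origin.
import time

class SimpleCache:
    def __init__(self, max_age_seconds=300):
        self.max_age = max_age_seconds
        self.store = {}                      # url -> (fetched_at, content)

    def get(self, url, fetch_from_origin):
        entry = self.store.get(url)
        if entry and time.time() - entry[0] < self.max_age:
            return entry[1]                  # fresh: serve locally, no WAN trip
        content = fetch_from_origin(url)     # stale or missing: fetch from the origin
        self.store[url] = (time.time(), content)
        return content
```

With 1,000 users requesting the same home page, only the first request crosses the WAN link; the other 999 are served from the local store until the copy ages out.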

So what is the downside of caching?

There are two main issues that can arise with caching:

  1. Keeping the cache current. If you access a cached page that is not current, then you are at risk of getting old and incorrect information. Some things you may never want cached, for example the results of a transactional database query. It’s not that these problems are insurmountable, but there is always the risk that the data in the cache will not be synchronized with changes.
  2. Volume. There are some 60 million web sites out on the Internet alone. Each site contains upwards of several megabytes of public information. The amount of data is staggering, and even the smartest caching scheme cannot account for the variation in usage patterns among users and the likelihood that they will hit an un-cached page.

Protocol Spoofing

Historically, there are client-server applications that were developed for an internal LAN. Many of these applications are considered chatty. For example, to complete a transaction between a client and server, tens of messages may be transmitted when perhaps one or two would suffice. Everything was fine until companies, for logistical and other reasons, extended their LANs across the globe using WAN links to tie different locations together.

To get a better picture of what goes on in a chatty application, an analogy may help. Suppose you were sending a letter to family members with your summer vacation pictures, and, for some insane reason, you decided to put each picture in a separate envelope and mail them individually on the same mail run. Obviously, this would be extremely inefficient.

What protocol spoofing accomplishes is to fake out the client or server side of the transaction and then send a more compact version of the transaction over the Internet, i.e., put all the pictures in one envelope and send it on your behalf, thus saving you postage.
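A rough sketch of the "one envelope" idea is shown below, with an invented SpoofingProxy class standing in for whatever a real WAN accelerator does at each end of the link:

```python
# Rough sketch of protocol spoofing: a local proxy answers the chatty client
# immediately and ships one batched message across the WAN instead of dozens.
# send_over_wan is a stand-in for the real long-haul transfer.

class SpoofingProxy:
    def __init__(self, send_over_wan):
        self.send_over_wan = send_over_wan
        self.batch = []

    def client_message(self, msg):
        self.batch.append(msg)
        return "ACK"                          # answer locally; the client thinks the server replied

    def flush(self):
        if self.batch:
            self.send_over_wan(self.batch)    # one "envelope" instead of many
            self.batch = []

sent = []
proxy = SpoofingProxy(sent.append)
for i in range(10):
    proxy.client_message(f"picture {i}")
proxy.flush()
print(len(sent))                              # 1 WAN transfer instead of 10
```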

You might ask: why not fix the inefficiencies in these chatty applications rather than write software to work around the problem?

Good question, but that would be the subject of a totally different white paper on how IT organizations must evolve with legacy technology; it’s beyond the scope of this one.

Application Shaping

One of the most popular and intuitive forms of optimizing bandwidth is a method called “application shaping”, with aliases of “traffic shaping”, “bandwidth control”, and perhaps a few others thrown in for good measure. For the IT manager who is held accountable for everything that can and will go wrong on a network, or the CIO who needs to manage network usage policies, this is a dream come true. If you can divvy up portions of your WAN link to various applications, then you can take control of your network and ensure that important traffic has sufficient bandwidth.

At the center of application shaping is the ability to identify traffic by type. Is this Citrix traffic, streaming audio, Kazaa peer-to-peer, or something else?

The Fallacy of Internet Ports and Application Shaping

Many applications are expected to use Internet ports when communicating across the Internet. An Internet port is part of an Internet address, and many firewall products can easily identify ports and block or limit them. For example, the “FTP” application commonly used for downloading files uses the well-known “port 21”. The fallacy with this scheme, as many operators soon find out, is that there are many applications that do not consistently use a fixed port for communication. Many application writers have no desire to be easily classified. In fact, they don’t want IT personnel to block them at all, so they deliberately design applications to not conform to any formal port assignment scheme. For this reason, any product that purports to block or alter application flows by port should be avoided if your primary mission is to control applications by type.

So, if standard firewalls are inadequate at blocking applications by port what can help?

As you are likely aware, all traffic on the Internet travels around in what is called an IP packet. An IP packet can very simply be thought of as a string of characters moving from Computer A to Computer B. The string of characters is called the “payload,” much like the freight inside a railroad car. On the outside of this payload, or data, is the address where it is being sent. These two elements, the address and the payload, comprise the complete IP packet. In the case of different applications on the Internet, we would expect to see different kinds of payloads. For example, consider a skyscraper being transported from New York to Los Angeles. How could this be done using a freight train? Common sense suggests that one would disassemble the office tower, stuff it into as many freight cars as it takes to transport it, and then, when the train arrived in Los Angeles, the workers on the other end would hopefully have the instructions on how to reassemble the tower.

Well, this analogy works with almost anything that is sent across the Internet, only the payload is some form of data, not a physical hunk of bricks, metal and wires. If we were sending a Word document as an e-mail attachment, guess what? The contents of the document would be disassembled into a bunch of IP packets and sent to the receiving e-mail client where it would be re-assembled. If I looked at the payload of each Internet packet in transit I could actually see snippets of the document in each packet and could quite easily read the words as they went by.

At the heart of all current application shaping products is special software that examines the content of Internet packets, and through various pattern matching techniques determines what type of application a particular flow is.

Once a flow is classified, the application shaping tool can enforce the operator’s policies on that flow. Here are some examples:

  • Limit AIM messenger traffic to 100kbps
  • Reserve 500kbps for ShoreTel voice traffic

The list of rules you can apply to traffic types and flow is unlimited.
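A toy sketch of how classification and policy enforcement fit together is shown below; the signatures and rate numbers are placeholders rather than any vendor's real signature database:

```python
# Toy application shaper: classify a flow by matching payload patterns,
# then look up the operator's policy for that class. The signatures and
# rate limits are placeholders, not a real vendor database.

SIGNATURES = {
    b"BitTorrent protocol": "bittorrent",
    b"AIM": "aim",
}

POLICIES_KBPS = {
    "aim": 100,          # e.g. limit AIM messenger traffic to 100kbps
    "bittorrent": 50,
    "unknown": None,     # no limit for unclassified traffic
}

def classify(payload: bytes) -> str:
    for pattern, app in SIGNATURES.items():
        if pattern in payload:
            return app
    return "unknown"

def rate_limit_for(payload: bytes):
    return POLICIES_KBPS[classify(payload)]

print(rate_limit_for(b"\x13BitTorrent protocol..."))   # 50
print(rate_limit_for(b"ordinary web page text"))       # None (unknown class)
```

Note how everything that fails to match a signature lands in the "unknown" bucket, which is exactly the weakness discussed in the next section.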

The Downside to Application Shaping

Application shaping does work and is a very well-thought-out, logical way to set up a network. After all, complete control over all types of traffic should allow an operator to run a clean ship, right? But as with any euphoric ideal, there are drawbacks in reality that you should be aware of.

  1. The number of applications on the Internet is a moving target. The best application shaping tools do a very good job of identifying several thousand of them, and yet there will always be some traffic that is unknown (estimated at ten percent by experts from the leading manufacturers). The unknown traffic is lumped into the unknown classification, and an operator must make a blanket decision on how to shape this class. Is it important? Is it not? Suppose the important traffic was streaming audio for a webcast and is not classified. Well, you get the picture. Although the theory behind application shaping by type is a noble one, the cost for a company to keep current is large and there are cracks.
  2. Even if the application spectrum could be completely classified, the spectrum of applications constantly changes. You must keep licenses current to ensure you have the latest detection capabilities. And even then it can be quite a task to constantly analyze and change the mix of policies on your network. As bandwidth costs lessen, how much human time should be spent divvying up and creating ever more complex policies to optimize your WAN traffic?

Equalizing

Take a minute to think about what is really going on in your network to make you want to control it in the first place.

We can only think of a few legitimate reasons to do anything at all to your WAN: “The network is slow”, or “My VoIP call got dropped”.

If such words were never uttered, then life would be grand.

So you really only have to solve these issues to be successful. Who cares about the actual speed of the WAN link or the number and types of applications running on your network or what port they are using, if you never hear these two complaints?

Equalizing goes to the heart of congestion using the basic principle of time. The reason why a network is slow or a voice call breaks up is that the network is stupid. The network grants immediate access to anybody who wants to use it, no matter what their need is. That works great much of the day, when networks have plenty of bandwidth to handle all traffic demands, but it is the peak usage demands that play havoc.

Combine the above statement with some simple human behavior factors. People notice slowness when real-time activities break down: accessing a web page, sending an e-mail, a chat session, a voice call. All these activities will generate instant complaints if response times degrade from the “norm”.

The other fact of human network behavior is that there are bandwidth-intensive applications: peer-to-peer, large e-mail attachments, database backups. These bandwidth-intensive activities are attributable to a very small number of active users at any one time, which makes them all the more insidious, as they can consume well over ninety percent of a network’s resources at any time. Also, most of these bandwidth-intensive applications can be spread out over time without notice from the user.

That database backup, for example: does it really need to complete in three minutes at 5:30 on a Friday, or can it run over six minutes and finish at 5:33? That would give your network perhaps fifty percent more bandwidth at no additional cost, and nobody would notice. It is unlikely the user backing up their local disk drive is waiting for it to complete with stopwatch in hand.

It is these unchanging human-factor interactions that allow equalizing to work today, tomorrow, and well into the future without need for upgrading. It looks at the behavior of the applications and usage patterns. By adhering to some simple rules of behavior, the real-time applications can be distinguished from the heavy non-real-time activities and thus be granted priority on the fly without any specific policies set by the IT manager.

How Equalizing Technology Balances Traffic

Each connection on your network constitutes a traffic flow. Flows vary widely from short dynamic bursts, for example, when searching a small website, to large persistent flows, as when performing peer-to-peer file sharing.

Equalizing is determined from the answers to these questions:

  1. How persistent is the flow?
  2. How many active flows are there?
  3. How long has the flow been active?
  4. How much total congestion is currently on the trunk?
  5. How much bandwidth is the flow using relative to the link size?

Once these answers are known, Equalizing adjusts flows by adding latency to low-priority tasks so that high-priority tasks receive sufficient bandwidth. Nothing more needs to be said and nothing more needs to be administered to make it happen; once set up, it need not be revisited.
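A highly simplified sketch of that decision follows; the thresholds and penalty values are invented for illustration and are not NetEqualizer's actual tuning.

```python
# Highly simplified equalizing decision: when the trunk is congested, add
# latency to the largest, longest-lived flows and leave short bursts alone.
# Thresholds and penalties are illustrative only.

def added_latency_ms(flow_kbps, flow_age_seconds, trunk_utilization, link_kbps):
    if trunk_utilization < 0.85:
        return 0                                  # no congestion: leave everything alone
    share_of_link = flow_kbps / link_kbps
    persistent = flow_age_seconds > 10            # short dynamic bursts are untouched
    if persistent and share_of_link > 0.05:
        return int(200 * share_of_link)           # bigger hogs get more added latency
    return 0

# A long-lived download using 20% of a congested 10 Mbps link gets slowed,
# while a brief web page load sails through untouched.
print(added_latency_ms(2000, 120, 0.95, 10000))   # 40 ms penalty
print(added_latency_ms(300, 2, 0.95, 10000))      # 0
```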

Exempting Priority Traffic

Many people point out that although equalizing technology sounds promising, it may be prone to mistakes with such a generic approach to traffic shaping. What if a user has a high-priority, bandwidth-intensive video stream that must get through? Wouldn’t it be the target of a misapplied rule to slow it down?

The answer is yes, but what we have found is that high-bandwidth priority streams are usually few in number and known by the administrator; they rarely if ever pop up spontaneously, so it is quite easy to exempt such flows since they are the rare exception. This is much easier than trying to classify every flow on your network at all times.

Connection Limits

Often overlooked as a source of network congestion is the number of connections a user generates. A connection can be defined as a single user communicating with a single Internet site. Take accessing the Yahoo home page, for example. When you access the Yahoo home page, your browser goes out to Yahoo and starts following various links on the page to retrieve all the data. This data is typically not all at the same Internet address, so your browser may access several different public Internet locations to load the Yahoo home page, perhaps as many as ten connections over a short period of time. Routers and access points on your local network must keep track of these “connections” to ensure that the data gets routed back to the correct browser. Although ten connections to the Yahoo home page over a few seconds is not excessive, there are very poorly behaved applications (most notably Gnutella, BearShare, and BitTorrent) which are notorious for opening up hundreds or even thousands of connections in a short period of time. This type of activity is just as detrimental to your network as other bandwidth-eating applications and can bring your network to a grinding halt. The solution is to make sure any traffic management solution deployed incorporates some form of connection limiting features.
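A minimal sketch of a per-user connection limiter is shown below; the cap of 50 simultaneous connections is an arbitrary example, not a recommended setting.

```python
# Minimal per-user connection limiter: track open connections per internal
# IP and refuse new ones past a cap. The cap of 50 is an arbitrary example.
from collections import defaultdict

MAX_CONNECTIONS_PER_IP = 50
open_connections = defaultdict(set)   # internal_ip -> set of (remote_ip, remote_port)

def allow_new_connection(internal_ip, remote_ip, remote_port):
    if len(open_connections[internal_ip]) >= MAX_CONNECTIONS_PER_IP:
        return False                               # over the cap: refuse or delay
    open_connections[internal_ip].add((remote_ip, remote_port))
    return True

def close_connection(internal_ip, remote_ip, remote_port):
    open_connections[internal_ip].discard((remote_ip, remote_port))
```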

Simple Rate Limits

The most common and widely used form of bandwidth control is the simple rate limit. This involves putting a fixed rate cap on a single IP address, as is often the case with rate plans promised by ISPs to their user community. “2 meg up and 1 meg down” is a common battle cry, but what happens in reality with such rate plans?

Although setting simple rate limits is far superior to running a network wide open, we often call this “set, forget, and pray”!

Take, for example, six users sharing a T1, where each user gets a rate of 256kbps up and 256kbps down. If all six users consume their full share of 256 kilobits per second at once, that is the maximum amount a T1 can handle (6 x 256kbps is roughly the full 1.5-megabit T1 payload). Although it is unlikely that you will hit gridlock with just six users, when the number of users reaches thirty, gridlock becomes likely, and with forty or fifty users it becomes a certainty to happen quite often. It is not uncommon for schools, wireless ISPs, and executive suites to have sixty to as many as 200 users sharing a single T1 with simple fixed user rate limits as the only control mechanism.
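For completeness, here is what a simple per-user rate cap amounts to, sketched as a token bucket. The 256kbps figure matches the example above, and the implementation is illustrative rather than any ISP's actual shaper.

```python
# Sketch of a simple fixed rate cap as a token bucket: each user may send
# at most rate_kbps on average, with a small burst allowance. Illustrative only.
import time

class TokenBucket:
    def __init__(self, rate_kbps, burst_kbits=64):
        self.rate = rate_kbps * 1000          # tokens (bits) added per second
        self.capacity = burst_kbits * 1000
        self.tokens = self.capacity
        self.last = time.time()

    def allow(self, packet_bits):
        now = time.time()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if packet_bits <= self.tokens:
            self.tokens -= packet_bits
            return True                       # under the cap: forward the packet
        return False                          # over the cap: drop or queue it

user_cap = TokenBucket(rate_kbps=256)
print(user_cap.allow(12000))   # a 1500-byte packet fits within the burst allowance
```

Note what the bucket does not do: it knows nothing about the other users sharing the T1, which is exactly why fixed caps alone cannot prevent busy-hour gridlock.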

Yes, simple fixed user rate limiting does resolve the trivial case where one or two users, left unchecked, can use all available bandwidth; however, unless your network is never oversold, there is no guarantee that busy-hour conditions will not result in gridlock.

Conclusion

The common thread to all WAN optimization techniques is that they all must make intelligent assumptions about data patterns or human behavior to be effective. After all, in the end, the speed of the link is just that: a fixed speed that cannot be exceeded. All of these techniques have their merits and drawbacks; the trick is finding the solution best suited to your network needs. Hopefully the background information contained in this document will help you, the consumer, make an informed decision.
