By Art Reisman, CTO, http://www.netequalizer.com
Disclaimer: It is considered controversial and by some definitions illegal for a US-based ISP to use deep packet inspection on the public Internet.
At APconnections, we subscribe to the philosophy that there is more to be gained by explaining your technology secrets than by obfuscating them with marketing babble. Read on to learn how I hunt down aggressive P2P traffic.
In order to create a successful tool for blocking a P2P application, you must first figure out how to identify P2P traffic. I do this by looking at the output data dump from a P2P session.
To see what is inside the data packets I use a custom sniffer that we developed. Then to create a traffic load, I use a basic Windows computer loaded up with the latest utorrent client.
Editor’s Note: The last time I used a P2P engine on a Windows computer, I ended up reloading my Windows OS once a week. Downloading random P2P files is sure to bring in the latest viruses, and unimaginable filth will populate your computer.
The custom sniffer is built into our NetGladiator device, and it does several things:
1) It detects and dumps the data inside packets as they cross the wire to a file that I can look at later.
2) It maps non-printable characters to printable ASCII characters. This way, when I dump the contents of an IP packet to a file, I don’t get all kinds of special characters embedded in the file. Since P2P data is mostly encoded music and video files, you can’t view the data without this filter. If you try, you’ll get all kinds of garbled scrolling on the screen when you look at the raw data with a text editor.
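The second step can be sketched in a few lines of Python. This is only an illustration of the idea, not the NetGladiator code: the function name to_printable, the placeholder argument, and the sample bytes are all made up for this example.

```python
def to_printable(payload: bytes, placeholder: str = "x") -> str:
    """Replace every byte outside printable ASCII (0x20-0x7E) with a placeholder.

    Hypothetical helper for illustration; the real sniffer's mapping rules
    are not published in this article.
    """
    return "".join(
        chr(byte) if 0x20 <= byte <= 0x7E else placeholder
        for byte in payload
    )


# Made-up payload: readable text mixed with control bytes and a high byte.
raw = b"ping\x00\x01\xfe:ok\n"
print(to_printable(raw))  # prints "pingxxx:okx"
```

Note that the newline is replaced as well, which keeps each dumped packet on a single line of the output file.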
So what does the raw data output dump of a P2P client look like?
Here is a snippet of some of the utorrent raw data I was looking at just this morning. The sniffer has converted the non-printable characters to “x”.
You can clearly see some repeating data patterns forming below. That is the key to identifying anything with layer 7. Sometimes it is obvious, while sometimes you really have to work to find a pattern.
Packet 1 exx_0ixx`12fb*!s[`|#l0fwxkf)d1:ad2:id20:c;&h45h”2x#5wg;|l{j{e1:q4:ping1:t4:ka 31:v4:utk21:y1:qe
Packet 2 exx_0jxx`1kmb*!su,fsl0′_xk<)d1:ad2:id20:c;&h45h”2x#5wg;|l{j{e1:q4:ping1:t4:xv4^1:v4:utk21:y1:qe
Packet 3 exx_0kxx`1exb*!sz{)8l0|!xkvid1:ad2:id20:c;&h45h”2x#5wg;|l{j{e1:q4:ping1:t4:09hd1:v4:utk21:y1:qe
Packet 4 exx_0lxx`19-b*!sq%^:l0tpxk-ld1:ad2:id20:c;&h45h”2x#5wg;|l{j{e1:q4:ping1:t4:=x{j1:v4:utk21:y1:qe
The next step is to develop a layer 7 regular expression to identify the patterns in the data. In the output you’ll notice that the string “exx” appears at the start of every packet, and that is what you look for. A repeating pattern is a good place to start.
The regular expression I decided to use looks something like:
exx.0.xx.*qe
This translates to: match any string containing “exx”, followed by any single character (“.”), followed by “0”, followed by any single character, followed by “xx”, followed by any sequence of characters ending with “qe”.
Note: When I tested this regular expression, it turned out to catch only a fraction of the utorrent traffic, but it is a start. What you don’t want to do is make your regular expression so simple that you get false positives. A layer 7 product that creates a high degree of false positives is pretty useless.
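To try the expression outside of a layer 7 device, it can be run against the sanitized packets with Python’s standard re module. This is a minimal sketch: the function name looks_like_utorrent and the benign sample line are assumptions made for the example, while the pattern and the two packets are copied from the article.

```python
import re

# The expression from the article. It is unanchored, so search() finds it
# anywhere in the payload; "." matches any single character.
UTORRENT_PATTERN = re.compile(r"exx.0.xx.*qe")


def looks_like_utorrent(sanitized_payload: str) -> bool:
    """Return True if the sanitized packet text matches the expression."""
    return UTORRENT_PATTERN.search(sanitized_payload) is not None


# Packets 1 and 3 from the dump above, plus one made-up benign line.
PACKET_1 = r"exx_0ixx`12fb*!s[`|#l0fwxkf)d1:ad2:id20:c;&h45h”2x#5wg;|l{j{e1:q4:ping1:t4:ka 31:v4:utk21:y1:qe"
PACKET_3 = r"exx_0kxx`1exb*!sz{)8l0|!xkvid1:ad2:id20:c;&h45h”2x#5wg;|l{j{e1:q4:ping1:t4:09hd1:v4:utk21:y1:qe"
BENIGN = "GET /index.html HTTP/1.1"  # hypothetical non-P2P payload

for payload in (PACKET_1, PACKET_3, BENIGN):
    print(looks_like_utorrent(payload), payload[:24])
```

The two sample packets match and the benign line does not. Because “.*” is greedy and the pattern is unanchored, a real device would probably want to bound how much of each packet it scans; that limit is left out of this sketch.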
The next thing I do with my new regular expression is test it for detection accuracy and false positives.
To test detection accuracy, clear your test network of everything except the P2P target you are trying to catch, then run your layer 7 device with the new regular expression and see how well it does.
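One way to put a number on that test is a simple hit rate: the fraction of observed connections that the device tagged. The sketch below assumes each connection is recorded as an (address, tag) pair, with None for an untagged connection; the record layout, the tag text, and the sample addresses are made up for illustration.

```python
from typing import List, Optional, Tuple

# (remote address, tag) pairs; tag is None when the expression did not match.
Connection = Tuple[str, Optional[str]]


def hit_rate(connections: List[Connection]) -> float:
    """Fraction of connections that carry a detection tag (0.0 if none seen)."""
    if not connections:
        return 0.0
    tagged = sum(1 for _, tag in connections if tag is not None)
    return tagged / len(connections)


# Hypothetical capture from a test network running only the P2P client.
observed = [
    ("192.0.2.10", "utorrent"),
    ("192.0.2.11", "utorrent"),
    ("192.0.2.12", None),
    ("192.0.2.13", "utorrent"),
]
print(f"hit rate: {hit_rate(observed):.0%}")  # prints "hit rate: 75%"
```

Since the test network carries nothing but the target application, every untagged connection counts as a miss.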
Below is an example from my NetGladiator in a new sniffer mode. In this mode I have the layer 7 detection on, and I can analyze the detection accuracy. In the output below, the sniffer puts a tag on every connection that matches my utorrent regular expression. In this case, my tag is the word “dad” at the end of the row. Notice how every connection is tagged. This means I am getting a 100 percent hit rate for utorrent. Obviously I doctored the output for this post :)
Index  SRCP  DSTP  Wavg  Avg  IP1              IP2              Ptcl  Port  Pool  TOS
0      0     0     17    53   255.255.255.255  95.85.150.34     —     2     99    dad
1      0     0     16    48   255.255.255.255  95.82.250.60     —     2     99    dad
2      0     0     16    48   255.255.255.255  95.147.1.179     —     2     99    dad
3      0     0     18    52   255.255.255.255  95.252.60.94     —     2     99    dad
4      0     0     12    24   255.255.255.255  201.250.236.194  —     2     99    dad
5      0     0     18    52   255.255.255.255  2.3.200.165      —     2     99    dad
6      0     0     10    0    255.255.255.255  99.251.180.164   —     2     99    dad
7      0     0     88    732  255.255.255.255  95.146.136.13    —     2     99    dad
8      0     0     12    0    255.255.255.255  189.202.6.133    —     2     99    dad
9      0     0     12    24   255.255.255.255  79.180.76.172    —     2     99    dad
10     0     0     16    48   255.255.255.255  95.96.179.38     —     2     99    dad
11     0     0     11    16   255.255.255.255  189.111.5.238    —     2     99    dad
12     0     0     17    52   255.255.255.255  201.160.220.251  —     2     99    dad
13     0     0     27    54   255.255.255.255  95.73.104.105    —     2     99    dad
14     0     0     10    0    255.255.255.255  95.83.176.3      —     2     99    dad
15     0     0     14    28   255.255.255.255  123.193.132.219  —     2     99    dad
16     0     0     14    32   255.255.255.255  188.191.192.157  —     2     99    dad
17     0     0     10    0    255.255.255.255  95.83.132.169    —     2     99    dad
18     0     0     24    33   255.255.255.255  99.244.128.223   —     2     99    dad
19     0     0     17    53   255.255.255.255  97.90.124.181    —     2     99    dad
A bit more on reading this sniffer output…
Notice columns 4 and 5 (Wavg and Avg), which indicate data transfer rates in bytes per second. In all but one case (the Avg value for index 7), these numbers are below 100 bytes per second, which is a very small data transfer. This is mostly because as soon as a connection is identified as utorrent, the NetGladiator drops all future packets on that connection, so it never really gets going.
One thing I did notice is that the modern utorrent protocol hops around very quickly from connection to connection. It attempts not to show its cards. Why do I mention this? Because in layer 7 shaping of P2P, speed of detection is everything. If you wait a few milliseconds too long to analyze and detect a torrent, it is already too late, because the torrent has transferred enough data to keep it going. It’s just conjecture, but I suspect this is one of the main reasons why utorrent is so popular: by hopping from source to source, it is very hard for an ISP to block without the latest equipment. I recently wrote a companion article regarding the speed of the technology behind a good layer 7 device.
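The drop-on-detect behavior described above can be sketched as a small connection table. This is not the NetGladiator implementation, just one plausible shape for it: the class Layer7Filter, the ConnKey tuple, the allow method, and the sample addresses and payloads are all hypothetical names chosen for this example.

```python
import re
from typing import NamedTuple, Set


class ConnKey(NamedTuple):
    """Identifies one connection (hypothetical layout for this sketch)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int


class Layer7Filter:
    """Drops every packet on a connection once one payload has matched."""

    def __init__(self, pattern: str) -> None:
        self._pattern = re.compile(pattern)
        self._blocked: Set[ConnKey] = set()

    def allow(self, conn: ConnKey, sanitized_payload: str) -> bool:
        """Return True to forward the packet, False to drop it."""
        if conn in self._blocked:
            return False  # already identified: drop without re-scanning
        if self._pattern.search(sanitized_payload):
            self._blocked.add(conn)  # first match blocks the connection
            return False
        return True


layer7 = Layer7Filter(r"exx.0.xx.*qe")
torrent = ConnKey("10.0.0.5", "192.0.2.10", 51413, 6881)
web = ConnKey("10.0.0.5", "192.0.2.20", 40000, 80)

print(layer7.allow(torrent, "hello"))              # True: nothing matched yet
print(layer7.allow(torrent, "exx_0ixx-demo-qe"))   # False: match, now blocked
print(layer7.allow(torrent, "hello"))              # False: connection stays blocked
print(layer7.allow(web, "GET /index.html"))        # True: unrelated connection
```

The set lookup comes first so that an identified connection costs almost nothing per packet. A client that hops to a new connection starts with a clean slate, though, which is why the speed of the first match matters so much.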
The last part of testing a regular expression involves looking for false positives. For this we use a commercial-grade simulator. Our simulator uses a series of pre-programmed web crawlers that visit tens of thousands of web pages an hour at our test facility. We then run our layer 7 device with the new regular expression and make sure that none of the web crawlers accidentally gets blocked while reading those pages. If this test passes, we are good to go with our new regular expression.
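The same check can be approximated offline by running a candidate expression over a corpus of known-benign payloads and listing anything it flags. The sketch below is an assumption-laden stand-in for the crawler test, not a description of our simulator: the function false_positives and the three sample HTTP lines are made up for illustration.

```python
import re
from typing import Iterable, List


def false_positives(pattern: str, benign_payloads: Iterable[str]) -> List[str]:
    """Return every benign payload that the pattern would wrongly flag."""
    compiled = re.compile(pattern)
    return [payload for payload in benign_payloads if compiled.search(payload)]


# Hypothetical benign traffic; a real corpus would hold many thousands of pages.
BENIGN_CORPUS = [
    "GET /index.html HTTP/1.1",
    "Host: www.example.com",
    "Accept: text/html",
]

# The full expression flags nothing in this tiny corpus...
print(len(false_positives(r"exx.0.xx.*qe", BENIGN_CORPUS)))  # prints 0
# ...while an over-simplified one flags every line.
print(len(false_positives(r"ex", BENIGN_CORPUS)))  # prints 3
```

The second call shows why an expression that is too simple is useless in practice: “ex” appears in “index”, “example”, and “text”, so ordinary web browsing would be blocked.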
Editor’s Note: Our primary bandwidth shaping product manages P2P without using deep packet inspection.
The layer 7 techniques described here can be run on our NetGladiator Intrusion Prevention System. We also advise that public ISPs check their country’s regulations before deploying a deep packet inspection device on a public network.

Behind the Scenes on the latest Comcast Ruling on Net Neutrality
April 7, 2010, by netequalizer
Yesterday a federal appeals court ruled in favor of Comcast, and against the FCC, regarding Comcast’s right to manipulate consumer traffic. As usual, the news coverage was a bit oversimplified and generic. Below we present a breakdown of the players involved, and our educated opinion as to their motivations.
1) The Large Service Providers for Internet Service: Comcast, Time Warner, Qwest
These companies all want to get a return on their investment, charging the most money the market will tolerate. They will also try to increase market share by consolidating provider choices in local markets. Since they are directly visible to the public, they will also try to serve the public’s interest, for without popular support they will get regulated into oblivion. Case in point: the original Comcast problems stemmed from consumers who were angry after learning that their P2P downloads were being redirected and/or blocked.
Any and all government regulation will be opposed at every turn, as it is generally not good for private business. In the face of a strong headwind, don’t be surprised if Large Service Providers try to reach a compromise quickly to alleviate any uncertainty. Uncertainty can be more costly than regulation.
To be fair, Large Service Providers are staffed top to bottom with honest, hard-working people, but their decision-making as an entity will ultimately be based on profit. To be the most profitable, they will want to prevent third-party Traditional Content Providers from flooding their networks with videos. That was the original reason why Comcast thwarted bittorrent traffic. All of the Large Service Providers either are, or are plotting to become, content providers, and hence they have two motives to restrict unwanted traffic. Motive one is to keep their capacities in line with their capabilities for all generic traffic. Motive two is to thwart other content providers, thus making their own content more attractive. For example, whose movie service are you going to subscribe to: a generic cloud provider such as Netflix, whose movies run choppy, or your local provider, with better quality by design?
2) The Traditional Content Providers: Google, YouTube, Netflix etc.
They have a vested interest in expanding their reach by providing expanded video content. Google, with nowhere to go for new revenue in the search engine and advertising business, will be attempting an end-run around Large Service Providers to take market share. The only things standing in its way are the shortcomings in the delivery mechanism. Google has even gone so far as to build out an extensive, heavily subsidized, fiber test network of its own. Much of the hubbub about Net Neutrality is based on a market play to force Large Service Providers to shoulder the Traditional Content Providers’ delivery costs. An analogy from the bird world would be the brown-headed cowbird, where the mother lays her eggs in another bird’s nest, and then lets her chicks be raised by an unknowing other species. Without their own direct-to-the-consumer delivery mechanism, the Traditional Content Providers must keep pounding at the FCC for rulings in their favor. Part of the strategy is to rile consumers against the Large Service Providers with the Net Neutrality cry.
3) The FCC
The FCC is a government organization trying to take its existing powers, which were granted for the airwaves, and extend them to the Internet. As with any regulatory body, things start out well-intentioned (protection of consumers, etc.), but then it quickly becomes self-absorbed with its mission. The original reason for the FCC was that the public airwaves for television and radio have limited frequencies for broadcasts. You can’t make a bigger pipe than what the frequencies will allow, and hence it made sense to have a regulatory body oversee this vital resource. In the early stages of commercial radio, there was a real issue of competing entities broadcasting over each other in an arms race for the most powerful signal. Along those lines, the FCC has continually expanded its mission. For example, the government deciding what words can be uttered on primetime is an extension of this power.
Now, with the Internet, the FCC’s goal will be to regulate whatever it can, slowly creating rules for the “good of the people”. Will these rules be for the better? Most likely not; left alone, the Internet was fine, but agencies will be agencies.
4) The Administration and current Congress
The current Administration has touted its support of Net Neutrality, but has perhaps been so overburdened with the battle on health care and other pressing matters that no regulation has been passed. In the aftermath of the FCC getting slapped down in court, which limited its current powers, I would not be surprised to see a round of legislation on this issue to regulate Large Service Providers in the near future. The Administration will be painted as protecting consumers against big greedy companies that need to be reined in, as we have seen with banks, insurance companies, etc. I hope that we do not end up with an Internet Czar, but some regulation is inevitable, if for nothing else than a revenue stream to tap into.
5) The Public
The Public will be the dupes in all of this, ignorant voting blocs lobbied with various scare tactics. The demographics of this debate will be much different from those of the health care lobby. People concerned for and against Internet regulation will be in income brackets with higher education and employment rates than the typical entitlement lobbies that support regulation. It is certainly not going to be the AARP or a union lobbyist leading the charge to regulate the Internet; hence legislation may be a bit delayed.
6) Al Gore
Not sure if he has a dog in this fight; we just threw him in here for fun.
7) NetEqualizer
Honestly, bandwidth control will always be needed, as long as there is more demand for bandwidth than there is bandwidth available. We will not be lobbying for or against Net Neutrality.
8) The Courts
This is an area where I am a bit weak in understanding how a Court will follow legal precedent. However, it seems to me that almost any court can rule from the bench by finding the precedent it wants and ignoring others if it so chooses. Ultimately, Congress can pass new laws to regulate just about anything with impunity. There is no constitutional protection regarding Internet access. Most likely the FCC will be the agency carrying out enforcement once the laws are in place.