Stick a Fork in Third Party Caching (Squid Proxy)


I was just going through our blog archives and noticed that many of the caching articles we promoted circa 2011 are still getting hits. Many of those hits come from less developed countries where bandwidth is relatively expensive compared to the Western world. I hope that businesses and ISPs looking for a miracle from caching will find this article, as it applies to all third-party caching engines, not just the one we used to offer as an add-on to the NetEqualizer.

So why do I make such a bold statement about third-party caching becoming obsolete?

#1) There have been some recent changes in the way Google provides YouTube content, which makes caching it almost impossible.  All of their YouTube videos are generated dynamically and broken up into segments, to allow differential custom advertising.  (I yearn for the days without the ads!)

#2) Almost all pages and files on the Internet are marked “Do not Cache” in their HTTP headers. Some of them will still cache effectively, but you must assume the designer plans on making dynamic, on-the-fly changes to their content. Caching an obsolete page and delivering it to an end user could actually result in serious issues, and perhaps even a lawsuit, if you cause some form of economic harm by ignoring the “do not cache” directive.

#3) Streaming content as well as most HTML content is now encrypted, and since we are not the NSA, we do not have a back door to decrypt and deliver from our caching engines.

As you may have noticed I have been careful to point out that caching is obsolete on third-party caching engines, not all caching engines, so what gives?

Some of the larger content providers, such as Netflix, will work with larger ISPs to provide large caching servers for their proprietary and encrypted content. This is a win-win for both Netflix and the Last Mile ISP.  There are some restrictions on who Netflix will support with this technology.  The point is that it is Netflix providing the caching engine, for their content only, with their proprietary software, and a third-party engine cannot offer this service.  There may be other content providers providing a similar technology.  However, for now, you can stick a fork in any generic third-party caching server.

Caching Your iOS Updates Made Easy


If you have talked to us about caching in recent months, you probably know that we are now lukewarm on open-ended third-party caching servers. The simple unencrypted content of the Internet circa 2010 has been replaced by dynamically generated pages and increasingly encrypted content. It’s not that the caching servers don’t work; it’s just that if they follow good-practice rules, the amount of data a caching server can actually cache has diminished greatly over the last few years.

The good news is that Apple has realized the strain they put on business and ISP networks when their updates come out. They have recently released an easy-to-implement, low-cost caching solution specifically for Apple content. In fact, one of our customers noted in a recent discussion group that they are using an old Mac mini to cache iOS updates for an entire college campus.

Other notes on Caching Options

Akamai offers a cloud solution, usually hosted at larger providers. If you are buying bandwidth in bulk, you can sometimes piggyback on their savings and get a discount on cached traffic.

There is also a service offered by Netflix for larger providers. However, last I checked you must be pushing 10 gigabits of sustained Netflix traffic to qualify.

So You Think you Have Enough Bandwidth?


There are really only two tiers of bandwidth: video for all, and video not for all. It is a fairly black-and-white problem. If you secure enough bandwidth that 25 to 30 percent of your users can simultaneously watch video feeds and still leave some headroom on your circuit, congratulations – you have reached bandwidth nirvana.

Why is video the lynchpin in this discussion?

Aside from the occasional iOS/Windows update, most consumers really don’t use that much bandwidth on a regular basis. Skype, chat, email, and gaming, all used together, do not consume as much bandwidth as video. Hence, the marker species for congestion is video.

Below, I present some of the metrics to see if you can mothball your bandwidth shaper.

1) How to determine future bandwidth demand.
Believe it or not, you can outrun your bandwidth demand. If your latest bandwidth upgrade is large enough to handle the average video load per customer, then it is possible that no further upgrades will be needed, at least for the foreseeable future.

In the “video for all” scenario, the rule of thumb is to assume 25 percent of your subscribers are watching video at any one time. If you can carry that load and still have 20 percent of your bandwidth left over, you have reached the video-for-all threshold.

To put some numbers to this:
Assume 2,000 subscribers and a 1 gigabit link. The average video feed requires about 2 megabits (note that some HD video is higher than this). At 25 percent concurrency, 500 subscribers watching video would consume the entire 1 gigabit, leaving nothing for anybody else, hence you will run out of bandwidth.

Now if you have 1.5 gigabits for 2,000 subscribers, you have likely reached the video-for-all threshold, and you will most likely be able to support them without any advanced intelligent bandwidth control. A simple 10 megabit rate cap per subscriber is likely all you would need.
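For readers who like to plug in their own numbers, here is a minimal back-of-the-envelope sketch of the calculation above. The subscriber count, per-stream rate, and 25 percent concurrency figure are just the assumptions from this example, not universal constants.

# Back-of-the-envelope "video for all" check.
# The inputs mirror the example above; substitute your own numbers.
subscribers = 2000
concurrent_video_share = 0.25   # rule of thumb: 25% watching video at once
stream_mbps = 2.0               # average video feed; some HD runs higher
link_mbps = 1500.0              # try 1000.0 to see the congested case

video_load = subscribers * concurrent_video_share * stream_mbps
headroom = (link_mbps - video_load) / link_mbps

print("Peak video load: %.0f megabits" % video_load)
if video_load < link_mbps and headroom >= 0.20:
    print("Video-for-all threshold reached (%.0f%% headroom left)" % (headroom * 100))
else:
    print("Not there yet - some form of bandwidth control is still needed")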

2) Honeymoon periods are short-lived.
The reprieve in congestion after a bandwidth upgrade is usually short-lived because the operator either does not have a good intelligent bandwidth control solution, or removes the existing solution in the mistaken belief that they have reached the “video for all” level. In reality, they are still in video-not-for-all territory and are lulled into a false sense of security for a brief honeymoon period. Right after the upgrade things are okay, but it takes a while for a user base to fill the void created by a new bandwidth upgrade.

Bottom line: Unless you have the numbers to support 25 to 30 percent of your user base running video you will need some kind of bandwidth control.

Surviving iOS updates


The birds outside my office window are restless. I can see the strain in the Comcast cable wires as they droop, heavy with the burden of additional bits, weighing them down like a freak ice storm. It is time, once again, for Apple to update every device in the Universe with their latest iOS release.

Assuming you are responsible for a network with a limited Internet pipe, and you are staring down 100 or more users about to hit the accept button for their update, what can you do to prevent your network from being gridlocked?

The most obvious option to gravitate to is caching. I found this nice article (thanks Luke) on the Squid settings used for a previous iOS update in 2013. Having worked with Squid quite a bit helping our customers, I was not surprised by the amount of tuning required to get this to work, and I suspect there will be additional changes needed to make it work in 2014.

If you have a Squid caching solution already up and running it is worth a try, but I am on the fence about recommending a Squid install from scratch. Why? Because we are seeing diminishing returns from Squid caching each year due to the amount of dynamic content. Translation: very few things on the Internet come from the same place with the same filename anymore, and many content providers are marking much of their content as non-cacheable.

If you have a NetEqualizer in place you can easily blunt the effects of the data crunch with a standard default set-up. The NetEqualizer will automatically spread the updates out over time, especially during peak hours when there is contention. This will allow other applications on your network to function normally during the day. I doubt anybody doing the update will notice the difference.

Finally, if you are desperate, you might be able to block access to the iOS update servers on your firewall. This might seem a bit harsh, but then again Apple did not consult with you, and besides, isn’t that what the free Internet at Starbucks is for?

Here is a snippet pulled from a forum on how to block it.

iOS devices check for new versions by polling the server mesu.apple.com. This is done via HTTP, port 80. Specifically, the URL is:

http://mesu.apple.com/assets/com_apple_MobileAsset_SoftwareUpdate/com_apple_MobileAsset_SoftwareUpdate.xml

If you block or redirect mesu.apple.com, you will inhibit the check for software updates. If you are really ambitious, you could redirect the query to a cached copy of the XML, but I haven’t tried that. Please remove the block soon; you wouldn’t want to prevent those security updates, would you?
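If you do put a block like that in place, a quick way to sanity-check it from inside your network is to try fetching the update catalog yourself. Below is a rough sketch using the catalog URL quoted above; keep in mind Apple could change the URL at any time.

# Quick reachability check for the iOS update catalog.
# If the firewall block is working, this should fail or time out.
import urllib.request

URL = ("http://mesu.apple.com/assets/com_apple_MobileAsset_SoftwareUpdate/"
       "com_apple_MobileAsset_SoftwareUpdate.xml")

try:
    resp = urllib.request.urlopen(URL, timeout=5)
    print("Catalog reachable, HTTP status", resp.getcode())
except Exception as err:
    print("Catalog not reachable (block appears to be working):", err)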

Stuck on a Desert Island, Do You Take Your Caching Server or Your NetEqualizer?


Caching is a great idea and works well, but I’ll take my NetEqualizer with me if forced to choose between the two on my remote island with a satellite link.

Yes, there are a few circumstances where a caching server might have a nice impact. Our most successful deployments are in educational environments where the same video is watched repeatedly as an assignment; but for most wide-open installations, expectations of performance far outweigh reality. Let’s have a look at what works, and also drill down on expectations that are based on marginal assumptions.

From my personal archive of experience here are some of the expectations attributed to caching that perhaps are a bit too optimistic.

“Most of my users go to their Yahoo or Facebook home page every day when they log in, and that is the bulk of all they do.”

– I doubt this customer’s user base is that conformist :), and they’ll find out once they install their caching solution. But even if it were true, only some of the content on Facebook and Yahoo is static. A good portion of these pages is dynamic by default and ever-changing, and they are marked as dynamic in their URLs, which means the bulk of the page must be reloaded each time. For caching to have an impact, the users in this scenario would have to stick to their home pages and not look at friends’ photos or other pages.

“We expect to see a 30 percent hit rate when we deploy our cache.”

You won’t see a 30 percent hit rate unless somebody designs a specific robot army to test your cache, hitting the same pages over and over again. Perhaps on iOS update day you might see the bulk of your hits going to the same large file and get a significant performance boost for a day. But overall you will be doing well if you get a 3 or 4 percent hit rate.

“I expect the cache hits to take pressure off my Internet link.”

Assuming you want your average user to experience a fast-loading Internet, this is where you really want your NetEqualizer (or a similar intelligent bandwidth controller) over your caching engine. The smart bandwidth controller can re-arrange traffic on the fly, ensuring interactive hits get the best response. A caching engine does not have that intelligence.

Let’s suppose you have a 100 megabit link to the Internet, and you install a cache engine that effectively gets a 6 percent hit rate. That would be an exceptional hit rate.

So what is the end-user experience with a 6 percent hit rate compared to pre-cache?

– First off, it is not the hit rate that matters when looking at total bandwidth. Many of those hits will likely be smallish image files from the Yahoo home page or other common sites, which account for less than 1 percent of your actual traffic. Most of your traffic is likely dominated by large file downloads, and only a portion of those may be coming from cache.

– A 6 percent hit rate means a 94 percent miss rate, and if your Internet was slow from congestion before the caching server, it will still be slow 94 percent of the time.

– Putting in a caching server would be like upgrading your bandwidth from 100 megabits to 104 megabits to relieve congestion. The cache hits may add to the total throughput in your reports, but the 100 megabit bottleneck is still there, and to the end user there is little or no difference in perceived speed at this point. A portion of your Internet access is still marginal or unusable during peak times, and other than the occasional web page or video loading nice and snappy, users are getting duds most of the time.
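To see why a 6 percent request hit rate only behaves like a few extra megabits, it helps to separate the hit rate by requests from the hit rate by bytes. The object sizes in this sketch are illustrative assumptions only, chosen to show the skew toward small cached objects:

# Why a 6% request hit rate is not a 6% bandwidth gain.
# Object sizes below are illustrative assumptions, not measurements.
link_mbps = 100.0
request_hit_rate = 0.06
avg_hit_kbytes = 500.0     # cache hits skew toward smaller, popular objects
avg_miss_kbytes = 1000.0   # misses include the big downloads and video

hit_bytes = request_hit_rate * avg_hit_kbytes
miss_bytes = (1 - request_hit_rate) * avg_miss_kbytes
byte_hit_rate = hit_bytes / (hit_bytes + miss_bytes)

print("Hit rate by bytes: %.1f%%" % (byte_hit_rate * 100))
print("Effective capacity: about %.0f megabits" % (link_mbps * (1 + byte_hit_rate)))
print("Every miss still squeezes through the %.0f megabit bottleneck." % link_mbps)

With these assumptions the byte hit rate lands around 3 percent, which is roughly where the 104 megabit figure above comes from.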

Even the largest caching server is insignificant in how much data it can store.

– The Internet is vast and your cache is not. Think of a tiny ant standing on top of Mount Everest. YouTube takes in 100 hours of new content every minute of every day. A small commercial caching server can store about 1/1000 of what YouTube uploads in a day, not to mention yesterday, the day before, and last year. It’s just not going to be in your cache.

So why is a NetEqualizer bandwidth controller so much better than a caching server at changing user perception of speed? Because the NetEqualizer is designed to keep Internet access from crashing, and this is accomplished by reducing the footprint of large file transfers and video downloads during peak times. Yes, these videos and downloads may be slow or sporadic, but they weren’t going to work anyway, so why let them crush the interactive traffic? In the end neither caching nor equalizing is perfect, but in real-world trials the equalizer changes the user experience from slow to fast for all interactive transactions, while caching is hit or miss (pun intended).

Squid Caching Can be Finicky


Editor’s Note: For the past few weeks we have been tuning and testing our caching engine, working closely with some of the developers who contribute to the Squid open source project.

Following are some of my  observations and discoveries regarding Squid Caching from our testing process.

Our primary mission was to make sure YouTube files cache correctly (which we have done). One of the tricky aspects of caching a YouTube file is that many of these files are considered dynamic content. Basically, this means a portion of their content may change with each access; sometimes the URL itself is just a pointer to a server where the content is generated fresh with each new access.

An extreme example of dynamic content would be your favorite stock quote site. During the business day much of the information on these pages is changing constantly, and thus it is obsolete within seconds. A poorly designed caching engine would do much more harm than good if it served up out-of-date stock quotes.

Caching engines by default try not to cache dynamic content, and for good reason. There are two different methods a caching server uses to decide whether or not to cache a page:

1) The web designer can explicitly set flags (such as the Cache-Control headers discussed below) to tell caching engines whether a page is safe to cache or not.

In a recent test I set up a crawler to walk through the Excite web site and all its URLs. I use this crawler to create load in our test lab as well as to fill up our caching engine with repeatable content. I set my Squid configuration file to cache all content smaller than 4K. Normally this would generate a great many cacheable hits, but for some reason none of the Excite content would cache. Upon further analysis, our Squid consultant found the problem:

  “I have completed the initial analysis. The problem is the excite.com
server(s). All of the “200 OK” excite.com responses that I have seen
among the first 100+ requests contain Cache-Control headers that
prohibit their caching by shared caches. There appears to be only two
kinds of Cache-Control values favored by excite:

Cache-Control: no-store, no-cache, must-revalidate, post-check=0,
               pre-check=0

and

Cache-Control: private,public

Both are deadly for a shared Squid cache like yours. Squid has options
to overwrite most of these restrictions, but you should not do that for
all traffic as it will likely break some sites.”
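If you are curious what a given site is telling shared caches, you can inspect the Cache-Control header yourself. A minimal sketch using the Python standard library follows; excite.com is just the example from the test above, and any URL will do.

# Print the Cache-Control header a server sends back.
import urllib.request

url = "http://www.excite.com/"   # any page you are curious about
resp = urllib.request.urlopen(url, timeout=10)
cache_control = resp.headers.get("Cache-Control", "(no Cache-Control header)")

print("Cache-Control:", cache_control)
if "no-store" in cache_control or "no-cache" in cache_control or "private" in cache_control:
    print("A well-behaved shared cache such as Squid will not keep this page.")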

2) The second method is more passive than deliberate directives. Caching engines look at the actual URL of a page to gain clues about its permanence. A “?” in the URL implies dynamic content and is generally a red flag to the caching server. And herein lies the issue with caching YouTube files: almost all of them have a “?” embedded within their URL.

Fortunately, YouTube videos are normally permanent and unchanging once they are uploaded. I am still getting a handle on these pages, but it seems the dynamic part is used for the insertion of different advertisements on the front end of the video. Our Squid caching server uses a normalizing technique to keep the root of the URL consistent and thus serve up the correct base YouTube video every time. Over the past two years we have had to replace our normalization technique twice in order to consistently cache YouTube files.
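The normalization itself is conceptually simple, even though the details keep changing. The sketch below only illustrates the idea: strip the parts of the URL that vary between requests and key the cache on the part that identifies the video. The hostnames and parameter names kept here are assumptions for the illustration, not the exact ones our rewriter uses.

# Illustrative URL normalization for cache keys.
# Real video URLs change format over time; the parameters kept here
# ("id", "itag") are assumptions for the sketch.
from urllib.parse import urlparse, parse_qs

KEEP = ("id", "itag")   # parameters assumed to identify the actual video

def cache_key(url):
    parts = urlparse(url)
    query = parse_qs(parts.query)
    kept = sorted((k, v[0]) for k, v in query.items() if k in KEEP)
    # Collapse the varying cache hostnames and signatures into one key.
    return "videoplayback?" + "&".join("%s=%s" % kv for kv in kept)

a = "http://v5.cache.example.com/videoplayback?id=abc123&itag=34&signature=XYZ"
b = "http://v9.cache.example.com/videoplayback?id=abc123&itag=34&signature=QRS"
print(cache_key(a) == cache_key(b))   # True: same video, same cache key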

Caching in the Cloud is Here


By Art Reisman, CTO APconnections (www.netequalizer.com)

I just got a note from a customer, a university, saying that their ISP is offering them 200 megabit Internet at a fixed price. The kicker is, they can also have access to a 1 gigabit feed specifically for YouTube at no extra cost. The only explanation for this is that their upstream ISP has an extensive in-network YouTube cache. I am just kicking myself for not seeing this coming!

I was well aware that many of the larger ISPs cached Netflix and YouTube on a large scale, but this is the first I have heard of a bandwidth provider offering a special reduced rate for YouTube to a customer downstream. I am just mad at myself for not predicting this type of offer, and for hearing about it from a third party.

As for the NetEqualizer, we have already made adjustments in our licensing for this differential traffic to come through at no extra charge beyond your regular license level, in this case 200 megabits. So if for example, you have a 350 megabit license, but have access to a 1Gbps YouTube feed, you will pay for a 350 megabit license, not 1Gbps.  We will not charge you for the overage while accessing YouTube.

A Novel Idea on How to Cache Data Completely Transparently


By Art Reisman

Recently I got a call from a customer claiming our Squid proxy was not retrieving videos from cache when expected.

This prompted me to set up a test in our lab where I watched  four videos over and over. With each iteration, I noticed that the proxy would  sometimes go out and fetch a new copy of a video, even though the video was already in the local cache, thus confirming the customer’s observation.

Why does this happen?

I have not delved down into the specific Squid code yet, but I think it has to do with the dynamic redirection performed by YouTube in the cloud, and the way the Squid proxy interprets the URL. If you look closely at YouTube URLs, there is a CGI component in the name: the word “watch” followed by a question mark “?”. The URLs are not static. Even though I may be watching the same YouTube video on successive tries, the cloud is getting the actual video from a different place each time, and so the Squid proxy thinks it is new.

Since caching old copies of data is a big no-no, my Squid proxy, when in doubt, errs on the side of caution and fetches a new copy.

The other hassle with using a proxy caching server  is the complexity of  setting up port re-direction (special routing rules). By definition the Proxy must fake out the client making the request for the video. Getting this re-direction to work requires some intimate network knowledge and good troubleshooting techniques.

My solution for the above issues is to just toss the traditional Squid proxy altogether and invent something easier to use.

Note: I have run the following idea by the naysayers (all of my friends who think I am nuts), and yes, there are still some holes in it. I’ll present their points after I present my case.

My caching idea

To get my thought process started, I tossed all that traditional tomfoolery with re-direction and URL name caching out the window.

My caching idea is to cache streams of data without regard to URL or filename.  Basically, this would require a device to save off streams of characters as they happen.  I am already very familiar with implementing this technology; we do it with our CALEA probe.  We have already built technology that can capture raw streams of data, store, and then index them, so this does not need to be solved.

Figuring out if a subsequent stream matched a stored stream would be a bit more difficult but not impossible.

The benefits of this stream-based caching scheme as I see them:

1) No routing or redirection needed; the device could be plugged into any network link by any weekend warrior.

2) No URL confusion.  Even if a stream (video) was kicked off from a different URL, the proxy device would recognize the character stream coming across the wire to be the same as a stored stream in the cache, and then switch over to the cached stream when appropriate, thus saving the time and energy of fetching the rest of the data from across the Internet.

The pure beauty of this solution is that just about any consumer could plug it in without any networking or routing knowledge.

How this could be built

Some rough details on how this would be implemented…

The proxy would cache the most recent 10,000 streams.

1) A stream would be defined as occurring when continuous data was transferred in one direction from an IP and port to another IP and port.

2) The stream would terminate and be stored when the port changed.

3) The server would compare the beginning parts of new streams to streams already in cache, perhaps the first several thousand characters.  If there was a match, it would fake out the sender and receiver and step in the middle and continue sending the data.
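To make the idea concrete, here is a toy sketch of the matching logic, assuming streams arrive as byte strings. It only illustrates the fingerprint-and-compare step; capturing live traffic, evicting old streams, and stepping into the middle of a TCP conversation are all glossed over.

# Toy sketch of stream-fingerprint caching.
# Fingerprint = hash of the first PREFIX_BYTES of a stream; a new stream
# that begins with the same bytes can be served from the stored copy.
import hashlib

PREFIX_BYTES = 4096
MAX_STREAMS = 10000
cache = {}   # fingerprint -> full stream contents

def fingerprint(data):
    return hashlib.sha1(data[:PREFIX_BYTES]).hexdigest()

def store(stream):
    if len(cache) < MAX_STREAMS:          # eviction policy left out
        cache[fingerprint(stream)] = stream

def lookup(prefix):
    """Return a stored stream whose beginning matches this prefix, if any."""
    return cache.get(fingerprint(prefix))

store(b"A" * 5000 + b"... rest of the video ...")
print("cache hit" if lookup(b"A" * 5000) else "cache miss")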

What could go wrong

Now for the major flaws in this technology that must be overcome.

1) Since there is no title on the stream from the sender, there would always be the chance that the match was a coincidence.  For example, an advertisement appended to multiple YouTube videos might fool the caching server. The initial sequence of bytes would match the advertisement and not the following video.

2) Since we would be interrupting a client-server transaction mid-stream, the server would have to be cut-off in the middle of the stream when the proxy took over.  That might get ugly as the server tries to keep sending. Faking an ACK back to the sending server would also not be viable, as the sending server would continue to send data, which is what we are trying to prevent with the cache.

Next step, (after I fix our traditional URL matching problem for the customer) is to build an experimental version of stream-based caching.

Stay tuned to see if I can get this idea to work!

The World’s Biggest Caching Server


Caching solutions are used in all shapes and sizes to speed up Internet data retrieval. From your desktop keeping a local copy of the last web page viewed, to your cable company keeping an entire library of Netflix movies, there is broad diversity in the scope and size of caching solutions.

So, what is the biggest caching server out there? Moreover, if I found the world’s largest caching server, would it store just a tiny, microscopic subset of the total data available from the public Internet? Is it possible that somebody has actually cached the entire Internet? A caching server the size of the Internet seems absurd, but I decided to investigate anyway, and so with an open mind, I set out to find the biggest caching server in the world. Below I have detailed my research and findings.

As always I started with Google, but not in the traditional sense. If you think about Google, they seem to have every public page on the Internet indexed. That is a huge amount of data, and I suspect they are the world’s biggest caching server. Asserting that Google is the world’s largest caching server seems logical, but somewhat hollow and unsubstantiated, so my next step was to quantify the assertion.

To figure out how much data is actually stored by Google,  in a weird twist of logic, I figured the best way to estimate the size of the stored data would be to determine what data is not stored in Google.

I would need to find a good way to stumble into some truly random web pages without using Google to find them, and then specifically test to see if Google knew about those pages by  asking Google to search for unique, deep rooted, text strings within those sites.

Rather than ramble too much, I’ll just walk through one of my experiments below.

To find a random Web site, I started with one of those random web site stumblers. As advertised, it took me to a random web site titled “Finest Polynesian Tiki Objects”. From there, I looked for unique text strings on the Tiki site. The idea here is to find a sentence of text that is not likely to be found anywhere but on this site – in essence, something deep enough that it is not a deliberately indexed title already submitted to Google. I poked around on the Tiki site and found some seemingly innocuous text on their merchant page: “Presenting Genuine Witco Art – every piece will come with a scanned”. I put that exact string in my Google search box and presto, there it was.

[Screenshot: Google search result for the Witco Art text string]

Wow it looks like Google has this somewhat random page archived and indexed because it came up in my search.

A sample set of two data points is not large enough to extrapolate from and draw conclusions, so I repeated my experiment a few more times and here are more samples of what I found….

Try number two.

Random Web Site

http://www.genarowlandsband.com/contact.php

Search String In Google

“For booking or general whatnot, contact Bob. Heck, just write to say hello if you feel like it.”

[Screenshot: Google search result for the Gena Rowlands Band contact page string]

It worked again; Google found the exact page from a search on a string buried deep within the page.

And then I did it again.

[Screenshot: Google search result for a third random page]

And again Google found the page.

The conclusion is that Google has cached close to 100 percent of the publicly accessible text on the Internet. In fairness to Google’s competitors they also found the same Web pages using the same search terms.

So how much data is cached in terms of a raw number?

 

There are plenty of public statistics on the number of Web sites/pages connected to the Internet, and there is also data detailing the average size of a Web page. What I have not determined is how much of the video and image content is cached by Google. I do know they are working on image search engines, but for now, to be conservative, I’ll base my estimates on text only.

So roughly there are 15 billion Web pages, and the average amount of text per page is 25 thousand bytes. (Note that most of the Web by volume is video and images; text is actually a small percentage.)

So to get a final number, I multiply 15 billion (15,000,000,000) by 25 thousand (25,000) and I get…

375,000,000,000,000 bytes cached – roughly 375 terabytes of text.
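Spelled out as a quick calculation (the page count and average text size are just the rough public figures quoted above):

# Rough size of Google's text cache, using the figures above.
pages = 15 * 10**9           # ~15 billion indexed pages
avg_text_bytes = 25 * 10**3  # ~25 KB of text per page

total_bytes = pages * avg_text_bytes
print("%d bytes of text" % total_bytes)               # 375,000,000,000,000
print("about %.0f terabytes" % (total_bytes / 1e12))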

 

 

Notice that the name of the site or the band does not appear in my search string – nothing to tip off the Google search engine about what I am looking for – and presto!

Caching Success: Urban Myth or Reality?


Editor’s Note:

Caching is a bit overrated as a means of eliminating congestion and speeding up Internet access. Yes, there are some nice caching tricks that create fleeting illusions of speed, but in the end, caching alone will fail to mitigate problems due to congestion. The following article, adapted from our previous November 2011 posting, details why.

You might be surprised to learn that Internet link congestion cannot be mitigated with a caching server alone. Contention can only be eliminated by:

1) Increasing bandwidth

2) Some form of intelligent bandwidth control

3) Or a combination of 1) and 2)

A common assumption about caching is that with a decent caching solution you will be able to cache a large portion of common web content, such that a significant amount of your user traffic will not traverse your backbone. Unfortunately, our real-world experience has shown that after the implementation of a caching solution, the overall congestion on your Internet link shows no improvement.

For example: Let’s take the case of an  Internet trunk  that delivers 100 megabits, and is heavily saturated prior to implementing a caching  solution. What happens when you add a caching server to the mix?

From our experience, a good hit rate to cache will likely not exceed 5 percent. Yes, we have heard claims of 50 percent, but we have not seen this in practice and suspect it is just best-case vendor hype or a very specialized solution targeted at Netflix (not general caching). We have been selling a caching solution and discussing other caching solutions with customers for almost three years, and like any urban myth, claims of high-percentage caching hits are impossible to track down.

Why is the hit rate at best only 5 percent?

The Internet is huge relative to a cache, and you can only cache a tiny fraction of total Internet content. Even Google, with billions invested in data storage, does not come close. You can attempt to keep trending popular content in the cache, but the majority of access requests to the Internet tend to be somewhat random and impossible to anticipate. Yes, a good number of hits locally resolve a Yahoo home page, but many more users are going to do unique things. For example, common hits like email and Facebook are all very different for each user and are not a shared resource that can be maintained in the cache. User hobbies are also all different, and thus users traverse different web pages and watch different videos. The point is you can't anticipate this data and keep it in a local cache any more reliably than guessing the weather long term. You can get a small statistical advantage, and that accounts for the 5 percent that you get right.

 

Even with caching at a 5 percent hit rate, your backbone link usage will not decline.

With caching in place, any gain in efficiency will be countered by a corresponding increase in total usage. Why is this?

If you assume an optimistic 10 percent hit rate to cache, you will get a boost and obviously handle 10 percent more traffic than you did prior to caching; however, your main pipe won’t carry any less.

This is worth repeating: if you cache 10 percent of your data, that does not mean your Internet pipe usage will go from 100 percent to 90 percent; it is not a zero-sum game. The net effect is that your main pipe will remain 100 percent full, and you will get 10 percent on top of that from your cache. Thus your net usage of the Internet appears to be 110 percent. The problem is that you still have a congested pipe, and the web pages and files that are not stored in cache will still suffer – you have not solved your congestion problem!

Perhaps I am beating a dead horse with examples, but just one more.

Let’s start with a very congested 100 megabit Internet link. Web hits are slow, YouTube takes forever, email responses are slow, and Skype calls break up. To solve these issues, you put in a caching server.

Now 10 percent of your hits come from cache, but since you did nothing to mitigate overall bandwidth usage, your users will simply eat up the extra 10 percent from cache and then some. It is like giving a drug addict a free hit of their preferred drug. If you serve up a fast YouTube, it will just encourage more YouTube usage.

Even with a good caching solution in place, if somebody tries to access Grandma’s Facebook page, it will have to come over the congested link, and it may time out and not load right away. Or, if somebody makes a Skype call it will still be slow. In other words, the 90 percent of the hits not in cache are still slow even though some video and some pages play fast, so the question is:

If 10 percent of your traffic is really fast, and 90 percent is doggedly slow, did your caching solution help?

The answer is yes, of course it helped, 10 percent of users are getting nice, uninterrupted YouTube. It just may not seem that way when the complaints keep rolling in. :)

Ever Wonder Why Your Video (YouTube) Over the Internet is Slow Sometimes?


By: Art Reisman


Art Reisman is the CTO of APconnections. He is Chief Architect on the NetGladiator and NetEqualizer product lines.

I live in a nice suburban neighborhood with both DSL and Cable service options for my Internet. My speed tests always show better than 10 megabits of download speed, and yet sometimes, a basic YouTube or iTunes download just drags on forever. Calling my provider to complain about broken promises of Internet speed is futile. Their call center people in India have the patience of saints; they will wear me down with politeness despite my rudeness and screaming. Although I do want to believe in some kind of Internet Santa Claus, I know first hand that streaming unfettered video for all is just not going to happen. Below I’ll break down some of the limitations for video over the Internet, and explain some of the seemingly strange anomalies for various video performance problems.

The factors dictating the quality of video over the Internet are:

1) How many customers are sharing the link between your provider and the rest of the Internet

Believe it or not, your provider pays a fee to connect up to the Internet. Perhaps not in the same exact way a consumer does, but the more traffic they connect up to the rest of the Internet the more it costs them. There are times when their connection to the Internet is saturated, at which point all of their customers will experience slower service of some kind.

2) The server(s) where the video is located

It is possible that the content host’s site has overloaded servers and their disk drives are just not fast enough to maintain decent quality. This is usually what your operator will claim, regardless of whether it is their fault or not. :)

3) The link from the server to the Internet location of your provider

Somewhere between the content video server and your provider there could be a bottleneck.

4) The “last mile”  link between you and your provider (is it dedicated or shared?)

For most cable and DSL customers, you have a direct wire back to your provider. For wireless broadband, it is a completely different story. You are likely sharing the airwaves to your nearest tower with many customers.

So why is my video slow sometimes for YouTube but not for Netflix?

The reason why I can watch some Netflix movies, and a good number of popular YouTube videos, without any issues on my home system is that my provider uses a trick called caching to host some content locally. By hosting the video content locally, the provider can ensure that items 2 and 3 (above) are not an issue. Many urban cable operators also have a dedicated wire from their office to your residence, which eliminates issues with item 4 (above).

Basically, caching is nothing new for a cable operator. Even before the Internet, cable operators had movies on demand that you could purchase. With movies on demand, cable operators maintained a server with local copies of popular movies in their main office, and when you called them they would actually throw a switch of some kind and send the movie down the coaxial cable from their office to your house. Caching today is a bit more sophisticated than that but follows the same principles. When you watch a NetFlix movie, or YouTube video that is hosted on your provider’s local server (cache),  the cable company can send the video directly down the wire to your house. In most setups, you don’t share your local last mile wire, and hence the movie plays without contention.

Caching is great, and through predictive management (guessing what is going to be used the most), your provider often has the content you want in a local copy and so it downloads quickly.  However, should you truly surf around to get random or obscure YouTube videos, your chances of a slower video will increase dramatically, as it is not likely to be stored in your provider’s cache.

Try This: The next time you watch a (not popular) YouTube video that is giving you problems, kill it, and try a popular trending video. More often than not, the popular trending video will run without interruption. If you repeat this experiment a few times and get the same results, you can be certain that your provider is caching some video to speed up your experience.

In case you need more proof that this is “top of mind” for Internet Providers, check out the January 1st 2012, CED Magazine article on the Top Broadband 50 for 2011 (read the whole article here).  #25 (enclosed below) is tied to improving video over the Internet.

#25: Feeding the video frenzy with CDNs

So everyone wants their video anywhere, anytime and on any device. One way of making sure that video is poised for rapid deployment is through content delivery networks. The prime example of a cable CDN is the Comcast Content Distribution Network (CCDN), which allows Comcast to use its national backbone to tie centralized storage libraries to regional and local cache servers.

Of course, not every cable operator can afford the grand-scale CDN build-out that Comcast is undertaking, but smaller MSOs can enjoy some of the same benefits through partnerships. – MR

Our Take on Network Instruments 5th Annual Network Global Study


Editor’s Note: Network Instruments released their “Fifth Annual State of the Network Global Study” on March 13th, 2012. You can read their full study here. Their results were based on responses by 163 network engineers, IT directors, and CIOs in North America, Asia, Europe, Africa, Australia, and South America. Responses were collected from October 22, 2011 to January 3, 2012.

What follows is our take (or my two cents) on the key findings around Bandwidth Management and Bandwidth Monitoring from the study.

Finding #1: Over the next two years, more than one-third of respondents expect bandwidth consumption to increase by more than 50%.

Part of me says “well, duh!” but that is only because we hear that from many of our customers. So I guess if you were an Executive, far removed from the day-to-day, this would be an important thing to have pointed out to you. Basically, this is your wake up call (if you are not already awake) to listen to your Network Admins who keep asking you to allocate funds to the network. Now is the time to make your case for more bandwidth to your CEO/President/head guru. Get together budget and resources to build out your network in anticipation of this growth – so that you are not caught off guard. Because if you don’t, someone else will do it for you.

Finding #2: 41% stated network and application delay issues took more than an hour to resolve.

You can and should certainly put monitoring on your network to be able to see and react to delays. However, another way to look at this, admittedly biased from my bandwidth shaping background, is to get rid of the delays!

If you are still running an unshaped network, you are missing out on maximizing your existing resource. Think about how smoothly traffic flows on roads, because there are smoothing algorithms (traffic lights) and rules (speed limits) that dictate how traffic moves, hence “traffic shaping.” Now, imagine driving on roads without any shaping in place. What would you do when you got to a 4-way intersection? Whether you just hit the accelerator to speed through, or decided to stop and check out the other traffic probably depends on your risk-tolerance and aggression profile. And the result would be that you make it through OK (live) or get into an ugly crash (and possibly die).

Similarly, your network traffic, when unshaped, can live (getting through without delays) or die (getting stuck waiting in a queue) trying to get to its destination. Whether you look at deep packet inspection, rate limiting, equalizing, or a home-grown solution, you should definitely look into bandwidth shaping. Find a solution that makes sense to you, will solve your network delay issues, and gives you a good return-on-investment (ROI). That way, your Network Admins can spend less time trying to find out the source of the delay.

Finding #3: Video must be dealt with.

24% believe video traffic will consume more than half of all bandwidth in 12 months.
47% say implementing and measuring QoS for video is difficult.
49% have trouble allocating and monitoring bandwidth for video.

Again, no surprise if you have been anywhere near a network in the last 2 years. YouTube use has exploded and become the norm on both consumer and business networks. Add that to the use of video conferencing in the workplace to replace travel, and Netflix or Hulu to watch movies and TV, and you can see that video demand (and consumption) has risen sharply.

Unfortunately, there is no quick, easy fix to make sure that video runs smoothly on your network. However, a combination of solutions can help you to make video run better.

1) Get more bandwidth.

This is just a basic fact-of-life. If you are running a network of < 10Mbps, you are going to have trouble with video, unless you only have one (1) user on your network. You need to look at your contention ratio and size your network appropriately.

2) Cache static video content.

Caching is a good start, especially for static content such as YouTube videos. One caveat to this, do not expect caching to solve network congestion problems (read more about that here) – as users will quickly consume any bandwidth that caching has freed up. Caching will help when a video has gone viral, and everyone is accessing it repeatedly on your network.

3) Use bandwidth shaping to prioritize business-critical video streams (servers).

If you have a designated video-streaming server, you can define rules in your bandwidth shaper to prioritize this server. The risk of this strategy is that you could end up giving all your bandwidth to video; you can reduce the risk by rate capping the bandwidth portioned out to video.

As I said, this is just my take on the findings. What do you see? Do you have a different take? Let us know!

YouTube Dominates Video Viewership in U.S.


Editor’s Note: Updated July 27th, 2011 with material from www.pewinternet.org:

YouTube studies are continuing to confirm what I’m sure we all are seeing – that Americans are creating, sharing and viewing video online more than ever, this according to a Pew Research Center Internet & American Life Project study released Tuesday.

According to Pew, fully 71% of online Americans use video-sharing sites such as YouTube and Vimeo, up from 66% a year earlier. The use of video-sharing sites on any given day also jumped five percentage points, from 23% of online Americans in May 2010 to 28% in May 2011.  This figure (28%) is slightly lower than the 33% Video Metrix reported in June, but is still significant.

To download or read the full study, click on this link:  http://pewinternet.org/Reports/2011/Video-sharing-sites/Report.aspx

———————————————————————————————————————————————————

YouTube viewership in May 2011 was approximately 33 percent of video viewed on the Internet in the U.S., according to data from the comScore Video Metrix released on June 17, 2011.

Google sites, driven primarily by video viewing at YouTube.com, ranked as the top online video content property in May with 147.2 million unique viewers, which was 83 percent of the total unique viewers tracked.  Google Sites had the highest number of viewing sessions with more than 2.1 billion, and highest time spent per viewer at 311 minutes, crossing the five-hour mark for the first time.

To read more on the data released by comScore, click here.  comScore, Inc. (NASDAQ: SCOR) is a global leader in measuring the digital world and preferred source of digital business analytics. For more information, please visit www.comscore.com/companyinfo.

This trend further confirms why our NetEqualizer Caching Option (NCO) is geared to caching YouTube videos. While NCO will cache any file sized from 2MB-40MB traversing port 80, the main target content is YouTube.  To read more about the NetEqualizer Caching Option to see if it’s a fit for your organization, read our YouTube Caching FAQ or contact Sales at sales@apconnections.net.

YouTube Caching Results: detailed analysis from live systems


Since the release of YouTube caching support on our NetEqualizer bandwidth controller, we have been able to review several live systems in the field. Below we will go over the basic hit rate of YouTube videos and explain in detail how this affects the user experience. The analysis below is based on an actual snapshot from a mid-sized state university with a 64 gigabyte cache and approximately 2,000 students in residence.

The Squid Proxy server provides a wide range of statistics. You can easily spend hours examining them and become exhausted with MSOS, an acronym for “meaningless stat overload syndrome”.  To save you some time we are going to look at just one stat from one report.  From the Squid Statistics Tab on the NetEqualizer, we selected the Cache Client List option. This report shows individual Cache stats for all clients on your network. At the very bottom is a summary report totaling all squid stats and hits for all clients.

TOTALS

  • ICP : 0 Queries, 0 Hits (0%)
  • HTTP: 21990877 Requests, 3812 Hits (0%)

At first glance it appears as if the ratio of actual cache hits, 3,812, to HTTP requests, 21,990,877, is extremely low. As with all statistics, the obvious conclusion can be misleading. First off, the NetEqualizer cache is deliberately tuned NOT to cache HTTP requests smaller than 2 megabytes. This is done for a couple of reasons:

1) Generally, there is no advantage to caching small Web pages, as they normally load up quickly on systems with NetEqualizer fairness in place. They already have priority.

2) With a few exceptions for popular web sites, small web hits are widely varied and fill up the cache, taking away space that we would like to use for our target content, YouTube videos.

Breaking down the amount of data in a typical web site versus a YouTube hit.

It is true that web sites today can often exceed a megabyte. However, rarely does a 2 megabyte web site load up as a single hit. It is composed of many sub-links, each of which generates a web hit in the summary statistics. A simple HTTP page typically triggers about 10 HTTP requests for perhaps 100K bytes of data total. A more complex page may generate 500K. For example, when you go to the CNN home page there are quite a few small links, and each link increments the HTTP counter. On the other hand, a YouTube hit generates one hit for about 20 megabytes of data. When we start to look at actual data cached instead of total web hits, the ratio of cached to not cached is quite different.

Our cache setup is also designed to cache only objects from 2 megabytes to 40 megabytes, with an estimated average of 20 megabytes. When we look at actual data cached (instead of hits), this gives us about 400 gigabytes of regular HTTP data, of which about 76 gigabytes came from the cache. Conservatively, about 10 percent of all HTTP data came from cache by this rough estimate. This number is much more significant than the raw HTTP hit statistics reveal.
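Here is the rough arithmetic behind those numbers, laid out as a sketch. The 20 megabyte average per cached video comes from the discussion above; the per-request average for ordinary web objects is an assumed figure chosen to reproduce the ~400 gigabyte total, and the raw byte ratio it produces is then rounded down to the conservative 10 percent quoted above.

# Cache effectiveness by bytes rather than by hits.
# Averages below are estimates/assumptions, not measured values.
http_requests = 21990877
cache_hits = 3812
avg_cached_object_mb = 20.0    # cached objects range from 2 MB to 40 MB
avg_http_request_kb = 18.0     # assumed: small page elements dominate requests

cached_gb = cache_hits * avg_cached_object_mb / 1000.0
regular_gb = http_requests * avg_http_request_kb / 1e6

print("Data served from cache: about %.0f GB" % cached_gb)    # ~76 GB
print("Regular HTTP data:      about %.0f GB" % regular_gb)   # ~400 GB
print("Hit rate by requests: %.3f%%" % (100.0 * cache_hits / http_requests))
print("Hit rate by bytes:    about %.0f%%" % (100.0 * cached_gb / regular_gb))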

Even more telling is the effect these hits have on the user experience.

YouTube streaming data, although not the majority of data on this customer system, is very time-sensitive while at the same time being very bandwidth intensive.  The subtle boost made possible by caching 10 percent of the data on this system has a discernible effect on the user experience. Think about it, if 10 percent of your experience on the Web is video, and you were resigned to it timing out and bogging down, you will notice the difference when those YouTube videos play through to completion, even if only half of them come from cache.

For a more detailed technical overview of NetEqualizer YouTube caching (NCO) click here.

Setting Up a Squid Proxy Caching Co-Resident with a Bandwidth Controller


Editor’s Note: It was a long road to get here (building the NetEqualizer Caching Option (NCO), a new feature offered on the NE3000 & NE4000), and for those following in our footsteps, or just curious about the intricacies of YouTube caching, we have laid open the details.

This evening, I’m burning the midnight oil. I’m monitoring Internet link statistics at a state university with several thousand students hammering away on their residential network. Our bandwidth controller, along with our new NetEqualizer Caching Option (NCO), which integrates Squid for caching, has been running continuously for several days and all is stable. From the stats I can see, about 1,000 YouTube videos have been played out of the local cache over the past several hours. Without the caching feature installed, most of the YouTube videos would have played anyway, but there would be interruptions as the Internet link coughed and choked with congestion. Now, with NCO running smoothly, the most popular videos will run without interruptions.

Getting the NetEqualizer Caching Option to this stable state was a long and winding road. Here’s how we got there.

First, some background information on the initial problem.

To use a Squid proxy server, your network administrator must put hooks in your router so that all Web requests go to the Squid proxy server before heading out to the Internet. Sometimes the Squid proxy server will have a local copy of the requested page, but most of the time it won’t. When a local copy is not present, it sends your request on to the Internet to get the page (for example the Yahoo! home page) on your behalf. The Squid server will then store a local copy of the page in its cache (storage area) while simultaneously sending the results back to you, the original requesting user. If you make a subsequent request to the same page, Squid will quickly check to see whether the content has been updated since it was stored the first time, and if it can, it will send you the local copy. If it detects that the local copy is no longer valid (the content has changed), then it will go back out to the Internet and get a new copy.
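In rough pseudocode terms, the decision loop described above looks something like the toy cache-aside sketch below. This is not Squid’s actual implementation; freshness here is reduced to a simple age check just to show the shape of the logic.

# Toy cache-aside logic, loosely mirroring the proxy behavior described
# above. Squid's real freshness rules are far more involved.
import time
import urllib.request

cache = {}             # url -> (fetch_time, body)
MAX_AGE_SECONDS = 300  # treat a copy older than this as possibly stale

def fetch(url):
    entry = cache.get(url)
    if entry and time.time() - entry[0] < MAX_AGE_SECONDS:
        return entry[1]                       # serve the local copy
    body = urllib.request.urlopen(url, timeout=10).read()
    cache[url] = (time.time(), body)          # store while serving
    return body

page = fetch("http://www.yahoo.com/")   # first request goes to the Internet
page = fetch("http://www.yahoo.com/")   # repeat request comes from the cache
print(len(page), "bytes served")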

Now, if you add a bandwidth controller to the mix, things get interesting quickly. In the case of the NetEqualizer, it decides when to invoke fairness based on the congestion level of the Internet trunk. However, with the bandwidth controller unit (BCU) on the private side of the Squid server, the actual Internet traffic cannot be distinguished from local cache traffic. The setup looks like this:

Internet->router->Squid->bandwidth controller->users

The BCU in this example won’t know what is coming from cache and what is coming from the Internet. Why? Because the data coming from the Squid cache comes over the same path as the new Internet data. The BCU will erroneously think all the traffic is coming from the Internet and will shape cached traffic as well as Internet traffic, thus defeating the higher speeds provided by the cache.

In this situation, the obvious solution would be to switch the position of the BCU to a setup like this:

Internet->router->bandwidth controller->Squid->users

This configuration would be fine except that now all the port 80 HTTP traffic (cached or not) will appear like it is coming from the Squid proxy server and your BCU will not be able to do things like put rate limits on individual users.

Fortunately, with our NetEqualizer 5.0 release, we’ve created an integration between the NetEqualizer and a co-resident Squid (our NetEqualizer Caching Option) such that everything works correctly. (The NetEqualizer still sees and acts on all traffic as if it were between the user and the Internet. This required some creative routing and actual bug fixes to the bridging and routing in the Linux kernel. We also had to develop a communication module between the NetEqualizer and the Squid server so the NetEqualizer gets advance notice when data is originating in cache and not the Internet.)

Which do you need, Bandwidth Control or Caching?

At this point, you may be wondering: if Squid caching is so great, why not just dump the BCU and be done with the complexity of trying to run both? Well, while the Squid server alone will do a fine job of accelerating the access times of large files such as video when they can be fetched from cache, a common misconception is that the caching server will provide big relief for your Internet pipe. This has not been the case in our real-world installations.

The fallacy of caching as a panacea for all things congested is that it assumes demand and overall usage are static, which is unrealistic. The cache is of finite size, and users will generally start watching more YouTube videos when they see improvements in speed and quality (prior to Squid caching, they might have given up because of slowness), including videos that are not in cache. So, the Squid server will have to fetch new content all the time, using additional bandwidth and quickly negating any improvements. Therefore, if you had a congested Internet pipe before caching, you will likely still have one afterward, leading to slow access for e-mail, Web chat, and other non-cacheable content. The solution is to include a bandwidth controller in conjunction with your caching server. This is what NetEqualizer 5.0 now offers.

In no particular order, here is a list of other useful information — some generic to YouTube caching and some just basic notes from our engineering effort. This documents the various stumbling blocks we had to overcome.

1. There was the issue of just getting a standard Squid server to cache YouTube files.

It seemed that the URL tags on these files change with each access, like a counter, and a normal Squid server is fooled into believing the files have changed. By default, when a file changes, a caching server goes out and gets the new copy. In the case of YouTube files, the content is almost always static. However, the caching server thinks they are different when it sees the changing file names. Without modifications, the default Squid caching server will re-retrieve the YouTube file from the source and not the cache because the file names change. (Read more on caching YouTube with Squid…).

2. We had to move to a newer Linux kernel to get a recent version of Squid (2.7), which supports the hooks for YouTube caching.

A side effect was that the new kernel destabilized some of the timing mechanisms we use to implement bandwidth control. These subtle bugs were not easily reproduced with our standard load generation tools, so we had to create a new simulation lab capable of simulating thousands of users accessing the Internet and YouTube at the same time. Once we built this lab, we were able to re-create the timing issues in the kernel and have them patched.

3. It was necessary to set up a firewall re-direct (also on the NetEqualizer) for port 80 traffic back to the Squid server.

This configuration, and the implementation of an extra bridge, were required to get everything working. The details of the routing within the NetEqualizer were customized so that we would be able to see the correct IP addresses of Internet sources and users when shaping. (As mentioned above, if you do not take care of this, all IPs (traffic) will appear as if they are coming from the proxy server.)

4. The firewall has a table called ConnTrack (not to be confused with NetEqualizer connection tracking, but similar).

The connection tracking table on the firewall tends to fill up and crash the firewall, denying new requests for re-direction if you are not careful. If you just make the connection table arbitrarily enormous, that can also cause your system to lock up. So, you must measure and size this table based on experimentation. This was another reason for us to build our simulation lab.

5. There was also the issue of the Squid server using all available Linux file descriptors.

Linux comes with a default limit for security reasons, and when the Squid server hits this limit (it does all kinds of file reading and writing, keeping descriptors open), it locks up.
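Both of these limits (items 4 and 5) can be watched from the box itself. Below is a small monitoring sketch, assuming a Linux system with the conntrack module loaded; the /proc paths are standard, but the safe thresholds depend entirely on your hardware, and the descriptor limit shown is for the current process rather than for Squid itself.

# Watch the two limits discussed above on a Linux box.
# Requires the nf_conntrack module to be loaded for the /proc entries.
import resource

def read_int(path):
    with open(path) as f:
        return int(f.read().strip())

conntrack_count = read_int("/proc/sys/net/netfilter/nf_conntrack_count")
conntrack_max = read_int("/proc/sys/net/netfilter/nf_conntrack_max")
soft_fds, hard_fds = resource.getrlimit(resource.RLIMIT_NOFILE)

print("conntrack: %d of %d entries in use" % (conntrack_count, conntrack_max))
print("file descriptors (this process): soft %d, hard %d" % (soft_fds, hard_fds))
if conntrack_count > 0.8 * conntrack_max:
    print("Warning: connection tracking table is over 80% full")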

Tuning changes that we made to support Caching with Squid

a. To limit the file size of a cached object to between 2 megabytes (2MB) and 40 megabytes (40MB):

  • minimum_object_size 2000000 bytes
  • maximum_object_size 40000000 bytes

If you allow smaller cached objects it will rapidly fill up your cache and there is little benefit to caching small pages.

b. We turned off the Squid keep reading flag

  • quick_abort_min 0 KB
  • quick_abort_max 0 KB

When set, this flag tells Squid to continue reading a file even if the user leaves the page; for example, when watching a video, if the user aborts in their browser, the Squid cache continues to read the file. I suppose this could now be turned back on, but during testing it was quite obnoxious to see data transfers taking place to the Squid cache when you thought nothing was going on.

c. We also explicitly told the Squid what DNS servers to use in its configuration file. There was some evidence that without this the Squid server may bog down, but we never confirmed it. However, no harm is done by setting these parameters.

  • dns_nameservers   x.x.x.x

d. You have to be very careful to set the cache size so that it does not exceed your actual capacity. Squid is not smart enough to check your real capacity, so it will fill up your file system space if you let it, which in turn causes a crash. When testing with small RAM disks of less than four gigs of cache, we found that the Squid logs will also fill up your disk space and cause a lockup. The logs are refreshed once a day on a busy system. With a large number of pages being accessed, the log can easily use close to one (1) gig of space, and then, to add insult to injury, the log backup program makes a backup. On a normal-sized caching system there should be ample space for logs.

e. Squid has a short-term buffer not related to caching. It is just a buffer where it stores data from the Internet before sending it to the client. Remember all port 80 (HTTP) requests go through the squid, cached or not, and if you attempt to control the speed of a transfer between Squid and the user, it does not mean that the Squid server slows the rate of the transfer coming from the Internet right away. With the BCU in line, we want the sender on the Internet to back off right away if we decide to throttle the transfer, and with the Squid buffer in between the NetEqualizer and the sending host on the Internet, the sender would not respond to our deliberate throttling right away when the buffer was too large (Link to Squid caching parameter).

f. How do you determine the effectiveness of your YouTube caching?

I use the Squid client cache statistics page. Down at the bottom there is an entry that lists hits versus requests.

TOTALS

  • ICP : 0 Queries, 0 Hits (0%)
  • HTTP: 21990877 Requests, 3812 Hits (0%)

At first glance, it may appear that the hit rate is not all that effective, but let’s look at these stats another way. A simple HTTP page generates about 10 HTTP requests for perhaps 80K bytes of data total. A more complex page may generate 500K. For example, when you go to the CNN home page there are quite a few small links, and each link increments the HTTP counter. On the other hand, a YouTube hit generates one hit for about 20 megabytes of data. So if we do a little math based on bytes cached, the summary of HTTP hits and requests above does not account for total data. Since our cache only stores objects from 2 megabytes to 40 megabytes, with an estimated average of 20 megabytes, this gives us about 400 gigabytes of regular HTTP data and 76 gigabytes of data that came from the cache. About 20 percent of all HTTP data came from cache by this rough estimate, which is quite significant.
