Since the release of YouTube caching support on our NetEqualizer bandwidth controller, we have been able to review several live systems in the field. Below we will go over the basic hit rate of YouTube videos and explain in detail how this effects the user experience. The analysis below is based on an actual snapshot from a mid-sized state university, using a 64 Gigabyte cache, and approximately 2000 students in residence.
The Squid Proxy server provides a wide range of statistics. You can easily spend hours examining them and become exhausted with MSOS, an acronym for “meaningless stat overload syndrome”. To save you some time we are going to look at just one stat from one report. From the Squid Statistics Tab on the NetEqualizer, we selected the Cache Client List option. This report shows individual Cache stats for all clients on your network. At the very bottom is a summary report totaling all squid stats and hits for all clients.
- ICP : 0 Queries, 0 Hits (0%)
- HTTP: 21990877 Requests, 3812 Hits (0%)
At first glance it appears as if the ratio of actual cache hits, 3812, to HTTP requests, 21990877, is extremely low. As with all statistics the obvious conclusion can be misleading. First off, the NetEqualizer cache is deliberately tuned to NOT cache HTTP requests smaller than 2 Megabytes. This is done for a couple of reasons:
1) Generally, there is no advantage to caching small Web pages, as they normally load up quickly on systems with NetEqualizer fairness in place. They already have priority.
2) With a few exceptions of popular web sites , small web hits are widely varied and fill up the cache – taking away space that we would like to use for our target content, Youtube Videos.
Breaking down the amount of data in a typical web site versus a Youtube hit.
It is true that web sites today can often exceed a Megabyte. However ,rarely does a web site of 2 Megabytes load up as a single hit. It is comprised of many sub-links, each of which generates a web hit in the summary statistics. A simple HTTP page typically triggers about 10 HTTP requests for perhaps 100K bytes of data total. A more complex page may generate 500K. For example, when you go to the CNN home page there are quite a few small links, and each link increments the HTTP counter. On the other hand, a YouTube hit generates one hit for about 20 megabits of data. When we start to look at actual data cached instead of total Web Hits, the ratio of cached to not cached is quite different.
Our cache set up is also designed to only cache Web pages from 2 megabytes to 40 megabytes, with an estimated average of 20 megabytes. When we look at actual data cached (instead of hits) this gives us about 400 gigabytes of regular HTTP data of which about 76 Gigabytes came from the cache. Conservatively about 10 percent of all HTTP data came from cache by this rough estimate. This number is much more significant than the HTTP statistics reveal.
Even more telling, is that effect these hits have on user experience.
YouTube streaming data, although not the majority of data on this customer system, is very time-sensitive while at the same time being very bandwidth intensive. The subtle boost made possible by caching 10 percent of the data on this system has a discernible effect on the user experience. Think about it, if 10 percent of your experience on the Web is video, and you were resigned to it timing out and bogging down, you will notice the difference when those YouTube videos play through to completion, even if only half of them come from cache.
For a more detailed technical overview of NetEqualizer YouTube caching (NCO) click here.