The World’s Biggest Caching Server


Caching solutions come in all shapes and sizes to speed up Internet data retrieval. From your desktop keeping a local copy of the last web page viewed, to your cable company keeping an entire library of Netflix movies, there is broad diversity in the scope and size of caching solutions.

So, what is the biggest caching server out there? Moreover, if I found the world’s largest caching server, would it store just a tiny subset of the total data available on the public Internet? Is it possible that somebody has actually cached everything on the Internet? A caching server the size of the Internet seems absurd, but I decided to investigate anyway, and so, with an open mind, I set out to find the biggest caching server in the world. Below I have detailed my research and findings.

As always I started with Google, but not in the traditional sense. If you think about Google, they seem to have every public page on the Internet indexed. That is a huge amount of data, and I suspect they are the world’s biggest caching server. Asserting that Google is the world’s largest caching server seems logical, but somewhat hollow and unsubstantiated, so my next step was to quantify my assertion.

To figure out how much data Google actually stores, in a weird twist of logic, I figured the best approach would be to determine what data is not stored in Google.

I would need to find a good way to stumble onto some truly random web pages without using Google to find them, and then test whether Google knew about those pages by asking it to search for unique, deeply buried text strings from those sites.

Rather than ramble too much, I’ll just walk through one of my experiments below.

To find a random Web site, I started with one of those random web site stumblers. As advertised, it took me to a random web site titled “Finest Polynesian Tiki Objects”. From there, I looked for unique text strings on the Tiki site. The idea here is to find a sentence of text from this site that is not likely to be found anywhere but on this site; in essence, something deep enough so as not to be a deliberately indexed title already submitted to Google. I poked around on the Tiki site and found some seemingly innocuous text on their merchant site: “Presenting Genuine Witco Art – every piece will come with a scanned”. I put that exact string in my Google search box and presto, there it was.


Wow, it looks like Google has this somewhat random page archived and indexed, because it came up in my search.

A sample of one or two data points is not large enough to extrapolate from and draw conclusions, so I repeated my experiment a few more times. Here are more samples of what I found.

Try number two.

Random Web Site

http://www.genarowlandsband.com/contact.php

Search String In Google

“For booking or general whatnot, contact Bob. Heck, just write to say hello if you feel like it.”


It worked again: Google found the exact page from a search on a string buried deep in the page.

And then I did it again.


And again Google found the page.

The conclusion is that Google appears to have cached close to 100 percent of the publicly accessible text on the Internet. In fairness to Google’s competitors, they also found the same Web pages using the same search terms.
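How much weight can a handful of successful searches carry? The short Python sketch below is one rough way to put a number on it. It assumes the sampled pages are truly random and independent (which a web site stumbler may not guarantee), and the function names are made up for illustration. If every one of n sampled pages is found, then any true coverage rate p below alpha^(1/n) would have produced an all-hits run with probability p^n, which is less than alpha, so alpha^(1/n) serves as a one-sided lower bound.

```python
import math


def coverage_lower_bound(hits_in_a_row: int, confidence: float = 0.95) -> float:
    """Lower bound on true coverage when every sampled page was found.

    Assumes independent, uniformly random samples (an assumption, not
    something the experiment above guarantees).
    """
    alpha = 1.0 - confidence
    return alpha ** (1.0 / hits_in_a_row)


def samples_needed(target_coverage: float, confidence: float = 0.95) -> int:
    """Smallest all-hits sample size whose lower bound reaches target_coverage."""
    alpha = 1.0 - confidence
    return math.ceil(math.log(alpha) / math.log(target_coverage))


print(round(coverage_lower_bound(3), 3))  # 0.368
print(samples_needed(0.95))               # 59
```

In other words, three hits out of three is consistent with near-total coverage, but on its own it only rules out coverage below roughly 37 percent. Under these assumptions it would take about 59 consecutive hits to support a claim of 95 percent coverage or better.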

So how much data is cached in terms of a raw number?

There are plenty of public statistics for the number of Web sites and pages connected to the Internet, and there is also data detailing the average size of a Web page. What I have not determined is how much of the video and images are cached by Google. I do know they are working on image search engines, but for now, to be conservative, I’ll base my estimates on text only.

So roughly there are 15 billion Web pages, and the average amount of text per page is 25 thousand bytes. (Note: most of the Web is video and images; text is actually a small percentage.)

So to get a final number I multiply 15 billion (15,000,000,000) by 25 thousand (25,000) and I get…

375,000,000,000,000 bytes cached, or roughly 375 terabytes.
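For anyone who wants to redo the arithmetic with different inputs, here is a minimal Python sketch of the same estimate. The page count and average text size are the rough figures quoted above, not measured values, and the function names are only illustrative.

```python
# Rough inputs from the text above; both are estimates, not measurements.
PAGES_ON_THE_WEB = 15_000_000_000   # about 15 billion pages
AVG_TEXT_BYTES_PER_PAGE = 25_000    # about 25 thousand bytes of text per page


def estimated_cache_bytes(pages: int, avg_bytes_per_page: int) -> int:
    """Total bytes of text if every page's text is cached once."""
    return pages * avg_bytes_per_page


def to_terabytes(num_bytes: int) -> float:
    """Convert bytes to decimal terabytes (1 TB = 10**12 bytes)."""
    return num_bytes / 10**12


total = estimated_cache_bytes(PAGES_ON_THE_WEB, AVG_TEXT_BYTES_PER_PAGE)
print(f"{total:,} bytes")          # 375,000,000,000,000 bytes
print(f"{to_terabytes(total)} TB")  # 375.0 TB
```

Changing either constant shows how sensitive the total is to the assumed page count and average text size.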

 

 

A note on the second experiment above: notice that the name of the site or the band does not appear in my search string. There was nothing to tip off the Google search engine as to what I was looking for, and presto!

Internet Regulation, what is the world coming to?


A friend of mine just forwarded an article titled “How Net Neutrality Rules Could Undermine the Open Internet”.

Basically, Net Neutrality advocates are now worried that bringing the FCC in to help enforce Neutrality will set a legal precedent allowing wide-reaching control over other aspects of the Internet, for example, some form of content control extending into gray areas.

Let’s look at the history of the FCC for precedents.

The FCC came into existence to manage the wireless spectrum and enforce its use, essentially so you did not get 1000 radio/TV stations blasting signals over each other in every city. A very necessary and valid government service. Without it, there would be utter anarchy in the airwaves. Imagine roads without traffic signals, or airports without control towers.

At some point in time, their control over frequencies extended into content and accessibility mandates. How did this come about? Simply put, it is the normal progression of government asserting control over a resource. It is what it is, neither good nor bad, just a reflection of a society that looks to government to make things “right”. And like an escaped non-native species in the Hawaiian Islands, it tends to take as much real estate as the ecosystem will allow.

What I do know for certain is that the FCC, once in the door at regulating anything on the Internet, will continue to grow in order to make things “right” and “fair” during our browsing experience.

At best we can hope the inevitable progression of control by the FCC gets thwarted at every turn, allowing us a few more good years of the good old Internet as we know it. I’ll take the current Internet flaws for a few more years while I can.

For more information on non-native species invading Hawaii’s ecosystem, check out this blog from the Kohala Watershed Partnership.

For an overview of Net Neutrality, check out this Net Neutrality for Dummies article explaining the act’s possible effects on the everyday Internet user.

For a discussion on the possible lawlessness of the FCC’s control over the Internet, read this blog entitled “Is the FCC Lawless?”.

NetEqualizer News: May 2013


May 2013

Greetings!

Enjoy another issue of NetEqualizer News! This month, we preview our upcoming integration of our Microsoft Excel Dynamic Real-Time Reporting Tool into NetEqualizer, discuss our new Hotel Management System Integration Offering, and feature a story from a happy NetEqualizer Customer. As always, feel free to pass this along to others who might be interested in NetEqualizer News.

A message from Art…
Art Reisman, CTO – APconnections

In May, my thoughts turn to the BolderBoulder, a large 10K running race that I compete in each year. The race has 50,000+ participants and is split into two: a “people’s race” and a “professionals’ race” (International Team Challenge). I compete first and then watch the professionals’ race, which is usually won by someone from Ethiopia or Kenya, as professionals fly in from all over the world for this race. By May my goal is always to train hard, so that I am at my peak performance on Memorial Day for the run. I work hard to get ready, and plan to run a personal best this year!

We love it when we hear back from you – so if you have a story you would like to share with us of how we have helped you, let us know. Email me directly at art@apconnections.net. I would love to hear from you!

New! Our Hotel Management System Integration Offering

APconnections is excited to announce our Hotel Management System Integrated Offering (HMSIO). We have partnered with Global Gossip, LLC, a leader in the lodging managed network services industry, to offer an end-to-end network managed services solution for our hotel & lodging customers.
We are combining strengths to offer NetEqualizer, the best in bandwidth shaping, with Global Gossip’s world class managed network services offering. We now can offer our hotel and lodging customers a full suite of capabilities to manage your wireless network, such as authentication, 24/7/365 support, cloud-based monitoring access, and network design services.

Hotel Management System Integrated Offering has grown organically from Global Gossip’s own use of NetEqualizers in its wireless services solutions in remote places all over the world, including many U.S. National Parks. For more details, check out our HMSIO Data Sheet, or contact us at:

sales@apconnections.net

-or-

toll-free U.S. (888-287-2492),

worldwide (303) 997-1300 x. 103


NetEqualizer Featured Customer

Every so often, NetEqualizer News features a customer who has benefited greatly from our technology and has told us about it!

This month, we feature Gordon College, and Russ Leathe, Director of Network and Computing Services. Here is what Russ had to say about his experience with NetEqualizer:

“We had an incident over the weekend I wanted to tell you about:
One of our web servers got hit with a ‘zero-day’ malware. We noticed our bandwidth was completely pegged even though our student population was on, or leaving for Spring-Break (so our bandwidth consumption should have been trending downwards, not upwards). We maintain over 100 servers, 95% of which are in a VM environment. Needless to say, finding the exposed culprit would be like finding the proverbial “needle in a haystack”. Alas, NetEQ to the rescue.
We used NTOP to discover our ‘Top Talkers’. The Inbound bandwidth was saturated, which was unusual and we pinpointed it to one machine. We quickly wrote a bandwidth rule for that web-server and things returned to normal.
We found the malware and inoculated the server…all within an hour’s time. Normally, this could have taken hours or a few days.

Thanks again… for creating such a great solution for Higher ED!!”

Thanks Russ!

Coming Soon: Microsoft Excel Dynamic Real-Time Reporting Integration

One of our most popular unpublished tools, which we release to customers who request it, is our Dynamic Real-Time Reporting tool. It sends data from your NetEqualizer to Excel so that you can monitor usage from your local PC.

The next generation of this software has arrived.

Coming soon, we will be releasing a built-in version of this tool so that you can get the same benefits of its reporting features right on your NetEqualizer. It will require no setup and will be completely web-based.

Here is a quick screenshot preview:

[Screenshot: preview of the built-in reporting tool]

You’ll be able to view active connections, connections which are bandwidth hogs, IP to country translation, and more!

This tool is free to customers with valid NetEqualizer Software and Support. If you are not current with NSS, contact us today!

sales@apconnections.net

-or-

toll-free U.S. (888-287-2492),

worldwide (303) 997-1300 x. 103


Best Of The Blog

You Heard it Here First, Our Prediction on How Video Will Evolve to Conserve Bandwidth

By Art Reisman – CTO – APconnections

Editor’s Note:

I suspect somebody out there has already thought of this, but in my quick Internet search I could not find any references to this specific idea, so I am taking journalistic first claim and unofficial first rights to this idea.

The best example I can think of to illustrate efficiency in video is the old style of cartoon, as parodied by South Park. If you ever watch South Park animation, the production quality is deliberately cheesy: very few moving parts with fixed backgrounds. In the South Park case, the intention was obviously not to save production costs. The cheap animation is part of the comedy. That was not always the case. This sort of stop animation cartoon evolved in the early days, before computer animation took over the work of human artists working frame by frame. The fewer moving parts in a scene, the less work for the animator. They could re-use existing drawings of a figure and just change the orientation of the mouth in perhaps three positions to animate talking.

Modern video compression tries to take advantage of some of the inherent static data from image to image, such that each new frame is transmitted with less information. At best, this is a hit or miss proposition. There are likely many frivolous moving parts in a background that are not necessary on the small screen of a handheld device.

My prediction is we will soon see a collaboration between production of video and Internet transport providers that allows for the average small device video production to have a much smaller footprint in transit.

Some of the basics of this technique would involve…

Photo Of The Month

This picture of downtown Helsinki, Finland was taken on a recent visit to a customer site by one of our staff members.

APconnections and Global Gossip Announce Joint Network Solution Offering for Lodging Industry


Editor’s Note:  This release went out on May 16, 2013 11:05 AM Mountain Daylight Time.

LAFAYETTE, Colo.–(BUSINESS WIRE)–APconnections, an innovation-driven technology company that delivers best-in-class network traffic management appliances, and Global Gossip, a leader in network managed services for the lodging industry, today announced the joint Hotel Management System Integrated Offering (HMSIO).

“Working with APconnections on this joint solution offers tremendous potential. Since the integration of NetEqualizer into our head-end stack we have been able to offer a much improved end user Wi-Fi experience and overall greater customer satisfaction.”
Sam Beskur
Director of U.S. Operations
Global Gossip


The joint offering combines the strengths of the NetEqualizer behavior-based bandwidth shaping appliance, with Global Gossip’s world-class managed network services offering. HMSIO will offer hotel and lodging customers a full suite of capabilities to manage their wireless networks, including customized authentication, behavior-based bandwidth shaping, 24/7/365 support, a cloud-based monitoring portal, and network design services. With HMSIO, hospitality and lodging customers can provide a “low noise”, high-quality, wireless Internet experience to guests along with unmatched excellence in customer support. Learn more in our HMSIO Data Sheet.

Global Gossip’s Director of U.S. Operations, Sam Beskur, says, “Working with APconnections on this joint solution offers tremendous potential. Since the integration of NetEqualizer into our head-end stack we have been able to offer a much improved end user Wi-Fi experience and overall greater customer satisfaction.”

APconnections’ CEO, Art Reisman, stated, “We have been looking for the right partner to offer an end-to-end network solution to our lodging industry customers. With their worldwide footprint and excellent technical support, Global Gossip’s network services are a great complement to our NetEqualizer bandwidth shaping products.”

About Global Gossip

Global Gossip (http://hsia.globalgossip.com) has been developing network and communication solutions since 1999 and currently manages and maintains over three hundred wired and wireless access networks globally. Our service locations span seven countries and include locations as remote and bandwidth challenged as the central Australian desert to high throughput networks in downtown London, England. Global Gossip has offices in Denver, Colorado; Sydney, Australia; and London, England.

About APconnections

APconnections is a privately held company founded in 2003 and is based in Lafayette, Colorado, USA (http://netequalizer.com). Our flexible and scalable network traffic management solutions can be found at thousands of customer sites in public and private organizations of all sizes across the globe, including: Fortune 500 companies, major universities, K-12 schools, Internet providers, libraries, and government agencies on six continents.

Contacts

APconnections, Inc.
Sandy McGregor, 303-997-1300 x.104
sandy@apconnections.net
or
Global Gossip LLC
Stephanie Dickens, 720-378-5087
sdickens@globalgossip.net

You heard it here first, our prediction on how video will evolve to conserve bandwidth


Editor’s Note:

I suspect somebody out there has already thought of this, but in my quick Internet search I could not find any references to this specific idea, so I am taking journalistic first claim and unofficial first rights to this idea.

The best example I can think of to illustrate efficiency in video is the old style of cartoon, as parodied by South Park. If you ever watch South Park animation, the production quality is deliberately cheesy: very few moving parts with fixed backgrounds. In the South Park case, the intention was obviously not to save production costs. The cheap animation is part of the comedy. That was not always the case. This sort of stop animation cartoon evolved in the early days, before computer animation took over the work of human artists working frame by frame. The fewer moving parts in a scene, the less work for the animator. They could re-use existing drawings of a figure and just change the orientation of the mouth in perhaps three positions to animate talking.

Modern video compression tries to take advantage of some of the inherent static data from image to image, such that each new frame is transmitted with less information. At best, this is a hit or miss proposition. There are likely many frivolous moving parts in a background that are not necessary on the small screen of a handheld device.
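To make the idea of static data from image to image concrete, here is a toy Python sketch. It is not how any real codec works (real compression uses far more sophisticated motion prediction and transforms); it only illustrates that when most of a frame stays the same, sending just the changed pixels takes much less data than sending the whole frame. The frame layout and function names are invented for illustration.

```python
def frame_delta(previous, current):
    """Return (index, new_value) pairs for pixels that differ between frames."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(previous, current))
        if old != new
    ]


def apply_delta(previous, delta):
    """Rebuild the current frame from the previous frame plus the delta."""
    frame = list(previous)
    for index, value in delta:
        frame[index] = value
    return frame


# Toy 4x4 grayscale frames, flattened to 16 values.
background = [10] * 16
frame1 = list(background)
frame1[5] = 200  # a single bright "moving part"
frame2 = list(background)
frame2[6] = 200  # the moving part shifts one pixel to the right

delta = frame_delta(frame1, frame2)
print(delta)                                # [(5, 10), (6, 200)]
print(len(delta), "of", len(frame2), "pixels sent")
```

With a fixed background and one moving part, only 2 of the 16 pixels need to be sent. A busy background that changes everywhere would make the delta nearly as large as the frame itself.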

My prediction is we will soon see a collaboration between production of video and Internet transport providers that allows for the average small device video production to have a much smaller footprint in transit.

Some of the basics of this technique would involve:

1) Deliberately blurring the background, or sending it separately from the action. Think of a wide shot of a breakaway lay-up in a basketball game. All you really need to see is the player and the basket in the frame. The brain is going to ignore background details such as the crowd; they might as well be static character animations, especially on the scale of your iPhone screen, which is not the same experience as your 56 inch HD flat screen.

2) Many of the videos in circulation on the Internet are newscasts of a talking head giving the latest headlines. If you wanted to be extreme, you could make the production such that the head is tiny and animate it like a South Park character. This would take a much smaller footprint but technically still be video, and it would be much more likely to play through without pausing.

3) The content sender can actually send a different production of the same video for low-bandwidth clients.
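As a rough illustration of item 3, the sketch below picks which production of a video to send based on the client’s available bandwidth. The production names and bandwidth thresholds are purely hypothetical, chosen only to mirror the ideas in the list above; a real service would measure bandwidth and define its own tiers.

```python
# Hypothetical productions of the same video, ordered from richest to leanest.
# Both the names and the kbps thresholds are made-up illustration values.
PRODUCTIONS = [
    ("full_production", 3000),    # everything moving, full detail
    ("static_background", 800),   # action sent over a fixed or blurred background
    ("animated_head", 150),       # tiny cartoon-style talking head
]


def pick_production(client_kbps: float) -> str:
    """Return the richest production the client's bandwidth can sustain."""
    for name, required_kbps in PRODUCTIONS:
        if client_kbps >= required_kbps:
            return name
    # Fall back to the leanest production rather than refusing to play.
    return PRODUCTIONS[-1][0]


print(pick_production(5000))  # full_production
print(pick_production(1000))  # static_background
print(pick_production(50))    # animated_head
```

The point of the design is that the choice is made among versions the producer deliberately created, rather than leaving a compression engine to guess which pixels matter.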

Note: the reason the production side of the house must get involved with the compression and delivery side of video is that compression engines can only make assumptions about what is important and what is not when removing information (pixels) from a video.

With a smart production engine geared toward the Internet, there are big savings here. Video is busting out all over the Internet, and conserving from the production side only makes sense if you want to get your content deployed and viewed everywhere.

The security industry also does something similar, taking advantage of fixed cameras on fixed backgrounds.

Related: How much YouTube can the Internet Handle

Related: Out of the box ideas on how to speed up your Internet

Blog dedicated to video compression, Euclid Discoveries.