Google's Infrastructure is its Strategic Advantage


Back in the day, when PC stocks were kings on Wall Street, a pesky college kid named Michael Dell figured out that he could do an end run around the then-established PC makers by developing a smarter way of making and selling boxes. His strategy was simple: get components and PCs from the factories in Asia to the U.S. as fast as possible, but only after he had charged for the machine.

By squeezing the supply chain as hard as he could, he turned Dell into a fearsome (and loathsome) competitor. With his help, the supply chain for the PC era came to consist of foundries, ships, U.S. assembly plants and UPS trucks. Google (GOOG), with over $200 billion in market capitalization, is following a similar strategy, fine-tuning and adapting it for the web and broadband.

Instead of trucks and assembly plants, however, Google’s supply chain is made up of fiber networks, data centers, switches, servers and storage devices. From that perspective, its business model is no different from Dell’s (DELL): Google has to deliver search results (information, if you want to be generous about their other projects) as fast as possible, at as low a cost as possible.

To better understand Google and its business model, one needs to break it down into three data inputs.

  • Relevancy of results.
  • Speed of search.
  • Cost of executing a search query.

While their results aren’t optimal, they are good enough, just as Microsoft Windows was good enough to dominate the market. Google, according to Hitwise, now has 64 percent of the total search market. And although a typical Google query can often be an act of futility, we put up with it because the results are fast. If they’re wrong, we can just start all over again.

The faster the results show up on our browsers, the less inclined we’ll be to switch to a rival search engine, no matter how great the rival’s search methodology may be. The faster (and more efficient) its infrastructure, the more easily Google can keep serving the ad-based money machine.

In other words, the company has to make sure that the speed of its search is really, really fast. Any random search on Google these days takes between 0.06 and 0.12 seconds. Now that is really, really fast. Google does this by indexing the Internet quite well. The magic is in delivering the search results from this index at lightning speed, and that requires an infrastructure — oodles of bandwidth and specialized hardware — that is finely tuned, much like a Formula One car.

Against this backdrop, it makes perfect sense for Google to build its own servers, storage systems, Internet switches and perhaps, sometime in the future, even optical transport systems. To put the scale in perspective: imagine connecting thousands of hosts (storage and server systems) at speeds of, say, 10 gigabits per second, in a manner that allows any-to-any connections.
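
A quick back-of-envelope (with illustrative numbers, not Google’s actual figures) shows why any-to-any connectivity at that scale is so demanding: the bisection bandwidth the fabric must supply grows linearly with the number of hosts.

```python
# Rough scale estimate for an any-to-any fabric. Illustrative numbers only,
# not Google's actual deployment.
def fabric_bisection_gbps(num_hosts: int, link_gbps: float = 10.0) -> float:
    """Bisection bandwidth needed so that any half of the hosts can talk
    to the other half at full line rate simultaneously."""
    return (num_hosts / 2) * link_gbps

# 10,000 hosts, each with a 10 Gbps link:
needed = fabric_bisection_gbps(10_000)
print(f"{needed:,.0f} Gbps = {needed / 1000:.0f} Tbps of bisection bandwidth")
```

Fifty terabits per second of cross-sectional capacity is far beyond what any single off-the-shelf switch of the era could supply, which is the argument for custom gear.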

The number of racks, fiber, routers and everything in between is mind-boggling. If this system were built using gear from established hardware makers, it would take a superhuman effort to make it all work together. In other words, the sheer cost of keeping such a beast going would suck up a major share of the infrastructure budget.

A better option is gear customized for your processes, which gives you a major operational-expenditure advantage. During the telecom bubble, large service providers were brought to their knees by operational expenditures.

With the exception of optical systems, Google has built or is building the gear. It has been rumored to be a big buyer of dark fiber to connect its data centers, which helps explain why the company spent nearly $3.8 billion over the past seven quarters on capital expenditures.

You can argue that building customized gear is an expensive strategy, but when you are the scale of Google, it starts to become less of an issue. Why? Because process-optimized infrastructure ensures that Google’s cost of executing a query keeps going down.

To sum it up, Google’s gigantic infrastructure is the big barrier to entry for its rivals, and will remain so, as long as the company keeps spending billions on it. That said, there’s another thing Google could learn from Dell: Maintain the quality of your search results — customers will only put up with shoddiness for so long.

Note #1: Ethan, you are absolutely right about the software aspect of Google architecture, and I was going to do a separate post. This one is already 750 words.

Note #2: Earth2Tech has a post about Google’s vertically integrated green energy strategy.

87 Comments

Jayson

I can agree with most of this article, except the part about their search results. Every now and again I will jump on Bing or Yahoo, and they just don’t give me as many quality results. Google is just better. And now they have picked up their real-time search game, adding Twitter and the like. Type “Los Angeles weather” in the search bar and you get the results before you’re done typing. No need to even hit enter. You can’t get those types of results anywhere else. So now it’s their job to keep innovating.

irin

Hi everyone,
I have a question regarding Google’s business. Just a bit curious, as we were having a discussion in class: does Google get paid for the search results, or do they get paid only for the ads and sponsored links when people click on them? I mean, if I search for the term “economy” and they provide me with a few pages of search results and I click on one of them, does Google get paid?
Please let me know if anyone knows.
Thanks in advance

Omer Altay

Great post, totally on point.

The next question is: why do we even need the Internet, once they have everything served up from their own infrastructure and massive supercomputer?

Giga reader

Interesting read, Om. While Goog’s infrastructure is a competitive advantage, I would argue that it is not Google’s only one. Otherwise, it would be in Google’s best interest to collect the Ricardian rent on it by allowing others to build on it.

I believe Google’s strengths run deeper. I suspect the infrastructure they’ve built out is closely aligned to how they carry out their other activities – the process of developing new products, their software stack, etc. The intertwining of these different activities makes Google’s infrastructure more valuable to itself than to others (at least for the present), and, in my mind, justifies Google’s experiments to churn out its dizzying array of products.

The more interesting question is how sustainable is the advantage?

Robin Harris

I’ve done a lot of investigation of Google’s infrastructure. For more on software and hardware on Google check out:

For Microsoft’s best idea, look at this article on a Microsoft Research project named Boxwood. Honestly though, I don’t think Ballmer gets it. Ozzie maybe.


Michael_ONeil

Om,

Interesting post – thanks! But fwiw (and I know it’s peripheral to your main point), the premise at the beginning is incorrect. Dell’s initial breakthrough wasn’t in supply chain, it was in distribution. As “PCs Etc.”, it didn’t have enough clout to elbow its way onto retail shelves, so it developed a different way of reaching customers. The supply chain stuff came later…

Borislav Agapiev

Om,

I enjoy your articles a lot, but for this one I will take you to task. First, I am amazed that you would quote “0.12-0.06” seconds as what it takes to do a search. Have you ever checked your TCP/IP round-trip delay (e.g. by ping)? I got 100+ ms a minute ago from my FiOS; now it is 70. This is the floor for the delay, and it makes no sense to talk about a 60 ms search when the total becomes dominated by transport.
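
A toy latency budget makes the point (numbers illustrative, taken from the ping figures above):

```python
# Toy latency budget: what the user perceives vs. the server-side figure.
# Numbers are illustrative, not measurements of Google itself.
def perceived_latency_ms(rtt_ms: float, server_ms: float) -> float:
    """Time from request to first byte of results, ignoring DNS, TLS and
    rendering: one network round trip plus server-side processing."""
    return rtt_ms + server_ms

# A 70-100 ms ping floor dominates a 60 ms server-side search time:
for rtt in (70, 100):
    total = perceived_latency_ms(rtt, 60)
    print(f"RTT {rtt} ms + 60 ms search = {total} ms perceived")
```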

The other point is that the speed is simply about perception; once you get to fractions of a second, nobody can notice anyway.

Google has always played the speed card, and I agree, it is cool but is no big deal these days, you can do really fast search with amazingly small resources (hint: see Gigablast).

The fact is, despite the oohing and aahing by mesmerized onlookers and assorted Wall Street types, Google search results have barely changed from 10 years ago. The quality of a multitude of results is visibly deteriorating as they are being search-spammed to death by tens of thousands of COMPANIES, not to mention legions of individuals. This is hardly a secret, as it is being openly discussed among search experts.

It is true that Google has amassed enormous infrastructure, and they are playing the card of giving the impression that all of it is absolutely REQUIRED, since they keep mum about what it is being used for. But how much of it is really needed, and how efficiently is it deployed?

The size of the crawlable Web is small: 20B pages will fit into 200 TB, which today one can jam into a couple of racks for two hundred grand. Sure, you want to back it up, cache it, have copies for speedup, etc., but that hardly makes for an imposing infrastructure.
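
The arithmetic behind that claim is quick to sketch (the per-server capacity is my assumption, not a quoted figure):

```python
# Back-of-envelope behind the numbers above: 20B pages in 200 TB.
pages = 20e9
total_bytes = 200e12                  # 200 TB
bytes_per_page = total_bytes / pages
print(f"~{bytes_per_page / 1e3:.0f} KB per page on average")

# How many machines for one copy? Per-server capacity is an assumption.
tb_per_server = 2                     # assumed circa-2007 storage server
servers = 200 / tb_per_server
print(f"~{servers:.0f} servers for a single copy of the crawl")
```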

As for bandwidth, that is really funny, since for $100K/mo one can get 10 Gbps in the Bay Area. You can crawl 10B pages/DAY with that; well, nobody dares to try it, including Google, Yahoo and Microsoft. The moral of the story is that crawling DOES NOT SCALE! Google probably spends more than $100K/mo on coffee, so crawling is certainly not any kind of “infrastructure advantage”.
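
A sanity check on that crawl figure, assuming an average page size of ~10 KB (my assumption, consistent with the 20B-pages-in-200-TB number above):

```python
# Sanity check on the 10 Gbps crawl claim; ~10 KB/page is an assumption.
link_bps = 10e9                        # 10 Gbps of crawl bandwidth
bytes_per_day = link_bps / 8 * 86_400  # bits -> bytes, times seconds per day
page_bytes = 10e3                      # assumed average page size
pages_per_day = bytes_per_day / page_bytes
print(f"~{pages_per_day / 1e9:.1f} billion pages per day")
```

At that page size the link indeed sustains roughly ten billion pages a day, matching the claim.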

The fact is, IMHO, that Google has been getting a free ride, as they have not been faced with serious competition (from Microsoft, Yahoo, or anyone else). Couple that with the perception, perfectly exemplified by your article, that it takes some mythical, enormous resources to do search and, presto, we get Google invincibility.

There are other claims you make in there, e.g. “indexing the Internet really well”. What is so special about that? It is well known that indexing is a game of RAM: if you stuff your index into RAM, it will be really fast, and that is it. The vocabulary of a really large Internet index, with all the languages’ words, product names, acronyms, people’s names, parts of URLs and whatever else, is a few billion terms these days (Google actually admits that in one of their earlier papers). Throw in a few billion more for N-grams (the most frequent combinations of 2, 3, 4... keywords; they have actually released these numbers) and you end up with, say, tens of TB of RAM. The cost of the RAM itself is not much, a few hundred grand, perhaps a few million for the machines to stuff the RAM in, but what is the big deal about it?
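
For what it’s worth, here is a rough sizing sketch that lands in the same “tens of TB” range; every figure is assumed rather than measured:

```python
# Rough in-RAM inverted-index sizing; every figure here is assumed.
pages = 20e9                          # crawlable web, from above
postings_per_page = 500               # assumed distinct indexed terms/page
bytes_per_posting = 2                 # assumed, with delta + varint coding
index_bytes = pages * postings_per_page * bytes_per_posting
print(f"~{index_bytes / 1e12:.0f} TB of postings to hold in RAM")
```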

You also mention “imagine connecting hosts with 10 Gbps and any-to-any connections”. You seem to be confusing connectivity speed with the throughput of the switching fabric. But even with a full crossbar, what is the point? Will that help Google rotate their index faster? Are Google’s customers happy with the current rotation frequency? Will it allow them to have, say, an incremental index instead of the current write-once one?

OK, I will stop here. Don’t get me wrong, I like your stuff; I just thought to give you a little workout on this one :)

Finally, I guess as a disclaimer too, one can ask: if I know and say all this, what am I doing about it? Fair question. I have the experience (I founded two search startups) and the resources now, and I am doing something about it; I will have more to say pretty soon :)

Tricia Duryee

Hey Om, great post. I heard a while ago that Google owned fiber, but didn’t know how it fit into the picture. If infrastructure is really their game, I’m wondering if we could apply your thinking to what they are doing in wireless? You can see my post on the subject here.

tak

Just because of this post, I will keep coming here rather than TechCrunch.

rohit

@ Parag,

Microsoft and Live Search may benefit from money being poured into them, but if they are stacking bricks, a Concorde they do not get…

For more on Microsoft Research’s ideas on Rethinking Data Centers, see powerpoint from Chuck Thacker’s talk at Stanford in Oct
http://yuba.stanford.edu/~nbehesht/netseminar/seminars/10_25_07.ppt

From his talk it is clear that what is ‘practice’ at Google and others is merely ‘talk’ at Microsoft.

