Facts are hard to find when you google Google

Net Results/Karlin Lillington: The Google initial public offering documents were filed last week amid much hoopla over whether or not this IPO signals the return of the ditzy tech glory days of the turn of the century.

Well, let's hope there's not a return to any such thing. The glory was always relative - sometimes quite literally, as that specially permissioned investor group, "friends and family", as well as big institutional investors, creamed off the cash rather than the small fellows (if they sold in time, that is).

But one does hope it's a sign that the tech investment sector is showing some life, because of course, there are many solid companies out there who had to drop any consideration of going public due to market jitters, especially in this sector. Once bitten, twice shy.

But the Google IPO is more than just any old IPO and, for that reason, it is raising pulse rates all over the globe. It has been one of the few pure dotcoms to shine both financially and functionally. It makes money - lots of it - and its search engine technology remains phenomenal.

A sure sign of Google's value to the plain people is that it has become a verb, unlike many of the other high-profile internet companies. For all their success, no one talks about "ebaying" something, or getting "amazoned". But we do "google" things all the time, including people.

Sometimes it seems that every piece of information in existence must be google-able. Except, that is, if you want to google Google.

The company is well-known for being outwardly friendly and quirky and fun, while being inwardly secretive and protective. That's just one of the reasons why so many wanted to get a copy of Google's prospectus - so many that they almost brought down the Securities and Exchange Commission website. Inside it, buried among the requisite legal statements of this and that, are little nuggets of information about the company that have been hard to excavate before, including salaries of key executives and the fact that the company will lose its exclusive license to its search technologies in 2011.

But as the Los Angeles Times pointed out, people still don't know some pretty basic things about the company, like how many people use its search engine daily, how many searches it performs, how much server power this requires, and how it performs them.

Oh, sure, the company has regularly stated some figures around some of these areas. But a fascinating article tucked away recently in the MIT Technology Review suggests that the company either can't do its mathematical sums, or is telling a few porkies. The article reports a presentation given by Dr Martin Farach-Colton, professor of computer science at Rutgers University in the US. During the talk, he showed a slide with some of the figures that Google does release:

150 million queries/day

1,000 queries/sec (peak)

10,000+ servers

More than four tera-ops/sec at daily peak

Index: 3 billion Web pages

4 billion total documents

4+ petabytes disk storage.

As Mr Simson Garfinkel, the article's author, states: "Let's see: 'Four tera-ops/sec' means 4,000 billion operations per second; a top-of-the-line server can do perhaps two billion operations per second, so that translates to perhaps 2,000 servers - not 10,000. Four petabytes is 4x10^15 bytes of storage; spread that over 10,000 servers and you'd have 400 gigabytes per server, which again seems wrong, since Farach-Colton had previously said that Google puts two 80-gigabyte hard drives into each server.

"And then there is that issue of 150 million queries per day. If the system is handling a peak load of 1,000 queries per second, that translates to a peak rate of 86.4 million queries per day - or perhaps 40 million queries per day if you assume that the system spends only half its time at peak capacity. No matter how you crank the math, Google's statistics are not self-consistent."
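For readers who want to check Garfinkel's sums, here is a rough sketch in Python. The inputs are the slide figures and Garfinkel's own estimates as quoted above; the variable and function names are invented for illustration, and the sketch assumes decimal units (one gigabyte taken as 10^9 bytes), which is what the quoted arithmetic appears to use.

```python
# Slide figures, as quoted in the article.
QUERIES_PER_DAY = 150_000_000
PEAK_QUERIES_PER_SEC = 1_000
SERVERS = 10_000
PEAK_OPS_PER_SEC = 4 * 10**12      # "four tera-ops/sec"
DISK_BYTES = 4 * 10**15            # "4+ petabytes"

# Garfinkel's estimates, as quoted (not measured values).
OPS_PER_SERVER = 2 * 10**9         # "perhaps two billion operations per second"
DRIVES_PER_SERVER = 2
DRIVE_BYTES = 80 * 10**9           # "two 80-gigabyte hard drives"

SECONDS_PER_DAY = 24 * 60 * 60
GIGABYTE = 10**9                   # assumption: decimal gigabytes


def servers_implied_by_ops() -> int:
    """Servers needed to reach the stated peak operation rate."""
    return PEAK_OPS_PER_SEC // OPS_PER_SERVER


def disk_per_server_gb() -> int:
    """Stated disk storage spread evenly over the stated server count."""
    return DISK_BYTES // SERVERS // GIGABYTE


def reported_disk_per_server_gb() -> int:
    """Disk per server according to the two-drive description."""
    return DRIVES_PER_SERVER * DRIVE_BYTES // GIGABYTE


def max_queries_per_day_at_peak() -> int:
    """Queries per day if the system ran at peak rate around the clock."""
    return PEAK_QUERIES_PER_SEC * SECONDS_PER_DAY


def average_rate_implied_by_daily_total() -> float:
    """Average queries per second implied by the stated daily total."""
    return QUERIES_PER_DAY / SECONDS_PER_DAY


if __name__ == "__main__":
    print("servers implied by ops:", servers_implied_by_ops())
    print("GB per server from slide:", disk_per_server_gb())
    print("GB per server from drives:", reported_disk_per_server_gb())
    print("max queries/day at peak:", max_queries_per_day_at_peak())
    print("avg queries/sec implied:", round(average_rate_implied_by_daily_total()))
```

On these inputs the stated daily total works out to an average of roughly 1,700 queries a second, which is higher than the stated peak of 1,000. That is another way of seeing the mismatch Garfinkel describes.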

Dr Farach-Colton, who spent two years on sabbatical working at Google, describes the figures as "crazily low" - and deliberately low. He describes Google's PR department as vetting any talks given by Google executives to make sure the numbers stay consistently on the wildly conservative side. That 10,000-plus server figure includes a lot of "plus", he says.

But why? Garfinkel argues that it is to Google's benefit to hide such details that would enable other search contenders to try and benchmark themselves against Google's achievements. And it means the company always seems to perform at astonishing levels of efficiency and speed given the hardware. My, what algorithms they must have!

Of course I jest somewhat here, because Google is phenomenal, it does have amazing technology, and I live most of my online search-and-retrieve life through Google. For that I humbly thank them.

But I do think all this has some relevance for their new Dublin-based European HQ. They've left open the door to employment figures, suggesting recently that job numbers might rise significantly over time here.

They also have been rather secretive about their server purchases (rumours are that at least 70 were bought from Dell alone). I'll bet they aren't going to let any visiting journalists stay long enough near the Irish server racks to start doing any mental calculations.

If they routinely underdeclare their number of servers, and if they are entering a phase of growth and expansion post-IPO (it's on the cards - they are already dabbling in new areas and Google observers know something is up), then who knows how large the Dublin facility is really expected to be? Lots of room, then, for adding in new functions and further jobs in Ireland.

Which possibly adds a little more interest for Irish IPO watchers.

klillington@irish-times.ie

weblog: http://weblog.techno-culture.com