"Enabling Grids for E-Science in Europe" will be the largest computing project in the world but solid business models will be required to ensure its success, writes Karlin Lillington
It promises to be the largest grid computing project in the world and the biggest computing project ever undertaken in Europe.
But even as EGEE - Enabling Grids for E-Science in Europe - got a formal kick-off at a conference in Cork last week, the project is confronting the same old niggling questions from investors, commercial interests and researchers themselves: where is the business model?
"Many people are worried and discussing the business model," concedes Dr Fabrizio Gagliardi, the affable Italian project director of EGEE.
"But there are likely to be several models. Many people feel they could sell computing on demand."
EGEE, backed by €32 million from the EU and 70 partners in 27 countries, is without doubt the global glamour project in the very hot technology corner of grid computing.
Grid computing is a mating between the internet and computer clusters (groups of computers in the same physical place that are linked and act as a lower-cost supercomputer).
Taking their name from the electricity grid - because it is foreseen that they could supply computing power as a kind of utility - grids are based on the possibility of joining together thousands of individual computers.
The computers can be located anywhere on a network, and they might be - and typically will be - dispersed across the world.
Speaking at Trinity College as part of the Royal Irish Academy's lecture series, Dr Gagliardi said the ambitious goal of EGEE is ultimately to link some 20,000 computers, usually described as CPUs (central processing units). The result will be the world's largest collective computing intelligence, able to crunch through currently unimaginable amounts of data.
Driving the project is Dr Gagliardi's employer, CERN, the famous Geneva-based nuclear physics laboratory, which is partway through building a massive, underground, ring-shaped atomic collider that is 27 kilometres in circumference and will operate at about minus 271 degrees centigrade.
Called the Large Hadron Collider (LHC), it will be able to explore new realms of science - perhaps even prove or disprove the Big Bang theory of the universe's origin.
But it will need to be able to store and analyse 12-15 petabytes of data a year (a petabyte is 10 to the power of 15 bytes) - an amount of information that would need a 20 km high stack of more than 20 million CDs for storage.
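The CD comparison can be checked with some rough arithmetic. The sketch below assumes a CD holds 700 megabytes and is 1.2 mm thick; neither figure comes from the article, and the function names are purely illustrative.

```python
# Rough check of the "stack of CDs" comparison.
# ASSUMPTIONS (not from the article): a CD holds 700 MB and is 1.2 mm thick.

PETABYTE = 10**15          # bytes, as defined in the article
CD_BYTES = 700 * 10**6     # assumed CD capacity in bytes
CD_THICKNESS_MM = 1.2      # assumed CD thickness in millimetres


def cds_needed(petabytes: int) -> int:
    """Number of whole CDs needed to hold the given number of petabytes."""
    total_bytes = petabytes * PETABYTE
    return -(-total_bytes // CD_BYTES)  # ceiling division on integers


def stack_height_km(cd_count: int) -> float:
    """Height in kilometres of a stack of cd_count CDs."""
    return cd_count * CD_THICKNESS_MM / 1_000_000  # millimetres to kilometres


if __name__ == "__main__":
    for pb in (12, 15):
        n = cds_needed(pb)
        print(f"{pb} PB: {n:,} CDs, stack about {stack_height_km(n):.1f} km high")
```

Under those assumptions, 12-15 petabytes works out at roughly 17 to 21 million discs in a stack between 20 and 26 km high, which is broadly in line with the figure quoted.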
The EGEE already has a good substrate - the infrastructure from a predecessor three-year grid project called the European DataGrid (EDG), which wrapped up on March 31st, the day before EGEE was launched.
The back-to-back nature of the projects is deliberate. "We have all this capital investment, plus regional efforts," explains Dr Gagliardi. "I think it would be a disaster if somehow there is not a major project to take over this (DataGrid) infrastructure project. What EGEE is proposing is to leverage national resources for a broader European benefit."
The EDG at peak shared 1,000 computers, says Dr Gagliardi, and involved 21 organisations (including Trinity College Dublin) and €10 million in EU funding, with twice that amount coming from commercial partners.
A community of users has been designated for pilot projects for EGEE - scientists in two distinct research areas that have immediate need of the grid's incredible muscle.
These are research communities involved with the LHC and scientists working in the area of biomedical grids, who want to be able to do data mining on genome databases as well as index the huge medical databases in hospitals.
Advanced software developed for the DataGrid project - the crucial "middleware" that enables the dispersed CPUs to communicate seamlessly, process data and secure it from hackers or from being corrupted - will be brought forward to EGEE and will continue to evolve, Dr Gagliardi says.
He describes middleware as one of the trickiest elements of creating the grid because it is the essential "glue which combines all the different elements that create the web of data".
It needs to allow many different types of computers running different operating systems and varied software programs to join hands and exchange data.
Middleware for DataGrid - which will form the basis for EGEE's middleware - has partly come from development groups within CERN and been made freely available to other organisations and individuals as "open source" software.
Other middleware elements have come from commercial partners, such as Oracle, HP, IBM and Microsoft, which are eager to have the testbed of such a grid project for their own development teams.
Commercial partners are crucial to the success of the grid, says Dr Gagliardi.
"If we want to build a sustainable model of computing, we need the involvement of industry."
And industry has been quick to use its involvement with the previous DataGrid project as part of major advertising campaigns directed at business customers.
Even if there are currently no commercial applications being marketed to business clients, such involvement both heightens the cutting-edge profile of the major industry players and helps prepare a future market for on-demand computing, said several industry insiders attending the Cork EGEE conference last week.
In addition, while research organisations might work on infrastructure, industry needs to decide on the all-important standards to be used for building middleware and other applications that will sit on top of the infrastructure.
And academia and research organisations need commercial applications for grids to make these networks viable in the long term or they will be too costly to build, manage and maintain.
But if the public face of EGEE is smiling at industry, privately many in the research community grumble that industry is sitting back and waiting for the cash-squeezed research community to develop a costly infrastructure that industry can then commercially exploit.
One researcher pointedly asked a panel of industry representatives at the Cork conference why they were not more involved in the up-front development and funding of grids. The panel's responses were polite but sidestepped the key issue of who should be digging in pockets to fund grids and why.
Many grid pioneers at the conference - from the original grid theorist Dr Carl Kesselman of the University of Southern California to French researchers Drs Guy Wormser and Christian Suagez, members of EGEE's industry forum - speak of the revolutionary potential of grids, which could be economically and socially transforming.
In addition to pushing out human knowledge about the universe, they could be used in projects ranging from high-end medical research to school assignments, or to model natural disasters to help manage fire, flood and earthquake relief.
But grids have yet to receive any real interest from European politicians whose countries would benefit, scientists say privately with disappointment.
Without strong international political support, grids will be far more difficult to implement, because the mammoth projects are both costly and daunting to manage.
Dr Gagliardi and others say the key challenge in building out the EGEE will not be perfecting the middleware - itself a formidable task - or raising funding, or even finding commercial applications and customers for them. The real difficulty will be in co-ordinating the human administration, getting the many existing organisations from the DataGrid project, and the various national governments, to work smoothly together.
That, observers say, is where political will - perhaps even more than the business model - must become an essential part of Europe's great grid vision.
klillington_at_irish-times.ie weblog: http://weblog.techno-culture.com