Lawrence Berkeley National Laboratory
Implementing Production Grids
- Author(s): Johnston, William E.
- et al.
Starting from Section 2, "The Grid Context," we lay out our view of a Grid architecture, and this definition provides a structure for the subsequent detailed description. In particular, we identify what differentiates a Grid from other structures for distributed computing, e.g. hierarchical clusters. The section poses the question of what constitutes a minimum set of Grid services (the Grid Common Services, the neck of the hourglass model) and what must be added to make the Grid usable for particular communities. Issues of interoperability and heterogeneity are addressed, and these are perhaps the most important distinguishing features of a Grid. Section 3, "The Anticipated Grid Usage Model Will Determine What Gets Deployed, and When," addresses the question of Grid building from the points of view of different types of Grid usage. This is an important point because differing usage patterns require different middleware, which is why the distinction of a minimal common set of Grid services and tools is so important. The underlying case studies have a supercomputing background, so attention is given to the problems of coupling and synchronicity of resources, which do not arise in other sorts of Grids, e.g. Data Grids and Grids based on the SETI@home concept (e.g. Entropia). This is why interoperability is so important: different usages of Grids will result in different middleware, scheduling strategies and tools for collaboration. The work of the Global Grid Forum is vital in ensuring that standards are defined so that these can interoperate. Nobody is going to be able to produce a commercial product or a Grid-in-a-box that can address the requirements of all Grid usage patterns. Indeed, much of the strength of the Grid concept is that it clearly recognizes this.
The Globus team, whose work is the basis for the Grid building described here, understood this very well and have produced a toolkit of sufficient flexibility and robustness to allow the building of many different types of Grid. Section 3 also analyses different data usage patterns in Data Grids. This highlights the realization in Grid computing that the distribution of data is even more important than the distribution of computing resources, since the curation and storage of data is becoming a key issue in tera- and peta-scale computing. The importance of workflow management has also come to the fore. The integration of message passing with the Grid is discussed primarily in the context of MPICH-G2, which provides access both to highly optimized vendor-supplied MPI for intra-machine communication and to socket-based communication for inter-machine communication. It is important that Globus, as core essential middleware, can interoperate with the best tools from anywhere in the world, and a few examples of this are given. Section 4, "Grid Support for Collaboration," describes how the Grid Common Services promote collaboration via the mechanisms for enabling secure resource sharing in Virtual Organizations (VOs). The Access Grid has an important role in enabling the human side of such collaboration, and in building trust and working relationships in a VO, and is mentioned as an aside. Sections 5, "Building an Initial Multi-site, Computational and Data Grid," and 6, "Cross-Site Trust Management," provide a detailed account of Grid building. The sociology and working practices of the administrators and users of a Grid are treated together with the technical details of Grid deployment and certificate management.
Some detail is provided on building an identity Certification Authority and on the interoperability issues this raises. Section 7, "Transition to a Prototype-Production Grid," fills in the essential steps necessary for Grid building. Section 7.3, "The Model for the Grid Information System," describes the issue of Grid Information Service mechanisms. The strengths of the Globus model for GIS, which has been built on top of e