Abstract
This paper describes scientific data discovery for the earth sciences in the context of data Grids and Grid computing. Requirements and use cases illustrate current challenges arising from the size, distribution, and minimal annotation of data. The paper discusses semantics, the characterization of provenance in large data archives, and the targeted community of users. Solutions implemented by the Earth System Grid and the Natural Environment Research Council (NERC) DataGrid include a prototype ontology, metadata schemas, search mechanisms, and discovery architectures. The use of Semantic Web technologies has facilitated the development of meaningful annotations of data content and opened the door to data discovery in federated systems.
Cite this article
Pouchard, L., Woolf, A. & Bernholdt, D. Data Grid discovery and Semantic Web technologies for the earth sciences. Int J Digit Libr 5, 72–83 (2005). https://doi.org/10.1007/s00799-004-0085-9