Abstract
Tasked with designing a metadata management system for a large scientific data repository, we find that the customary database application development procedure exhibits several disadvantages in this environment. Data cannot be accessed until the system is fully designed and implemented, specialized data modeling skills are required to design an appropriate schema, and once designed, such schemas are intolerant of change. We minimize setup and maintenance costs by automating the database design, data load, and data transformation tasks. Data creators are responsible only for extracting data from heterogeneous sources according to a simple RDF-based data model. The system then loads the data into a generic RDBMS schema. Additional grouping structures to support query formulation and processing are discovered by the system or defined by the users via a web interface. Discovered and imposed structures constitute emergent semantics for otherwise disorganized information.
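As a rough illustration of the load-then-discover idea, the sketch below stores subject/property/value statements in a single generic table and then groups resources by the set of properties they carry. The table name, column names, sample statements, and the grouping heuristic are all assumptions made for this example; they are not taken from the system described in the paper.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample statements (subject, property, value); not from the paper.
TRIPLES = [
    ("run1", "region", "estuary"),
    ("run1", "year", "2003"),
    ("run2", "region", "plume"),
    ("run2", "year", "2004"),
    ("file1", "path", "/data/a.nc"),
]


def load_triples(conn, triples):
    """Load statements into one generic table; no domain-specific schema is designed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS triples (subject TEXT, property TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO triples VALUES (?, ?, ?)", triples)
    conn.commit()


def discover_signatures(conn):
    """Group subjects by the exact set of properties they use (an assumed heuristic)."""
    props = defaultdict(set)
    for subject, prop in conn.execute(
        "SELECT DISTINCT subject, property FROM triples"
    ):
        props[subject].add(prop)
    groups = defaultdict(list)
    for subject, prop_set in props.items():
        groups[frozenset(prop_set)].append(subject)
    return {sig: sorted(subjects) for sig, subjects in groups.items()}


conn = sqlite3.connect(":memory:")
load_triples(conn, TRIPLES)
signatures = discover_signatures(conn)
```

Here `run1` and `run2` share the property set `{region, year}` and fall into one group, while `file1` forms its own. Such groups are one simple way that structure could emerge from data loaded without a hand-designed schema.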
Keywords
- Resource Description Framework
- Query Formulation
- Resource Description Framework Data
- Data Creator
- Resource Description Framework Graph
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Howe, B., Tanna, K., Turner, P., Maier, D. (2004). Emergent Semantics: Towards Self-Organizing Scientific Metadata. In: Bouzeghoub, M., Goble, C., Kashyap, V., Spaccapietra, S. (eds) Semantics of a Networked World. Semantics for Grid Databases. ICSNW 2004. Lecture Notes in Computer Science, vol 3226. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30145-5_11
DOI: https://doi.org/10.1007/978-3-540-30145-5_11
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23609-2
Online ISBN: 978-3-540-30145-5
eBook Packages: Springer Book Archive