Querying multiple bioinformatics information sources: can semantic web research help?

Published: 01 December 2002

Abstract

Advances in Semantic Web and ontology research have pushed the role of semantics to a new frontier: the semantic composition of Web services. A good example of such composition is the querying of multiple bioinformatics data sources. Supporting effective querying over a large collection of bioinformatics data sources presents a number of unique challenges. First, queries over bioinformatics data sources are often complex associative queries over multiple Web documents, and most associations are defined by string matching of textual fragments in two documents. Second, most of the queries required by genomics researchers involve complex data extraction and sophisticated workflows that implement the complex associative access. Third, and not least, complex genomics-specific queries are often reused many times by genomics researchers, either directly or with some refinement, and are considered part of the research results. In this short article, we present a list of challenging issues in supporting effective querying over bioinformatics data sources and illustrate them through a selection of representative search scenarios provided by biologists. We end the article with a discussion of how state-of-the-art research and technological development in the Semantic Web, ontologies, Internet data management, and Internet computing systems can help address these issues.

Published In

ACM SIGMOD Record, Volume 31, Issue 4
December 2002
104 pages
ISSN: 0163-5808
DOI: 10.1145/637411

Publisher

Association for Computing Machinery
New York, NY, United States

      Publication History

      Published: 01 December 2002
      Published in SIGMOD Volume 31, Issue 4

      Check for updates

      Qualifiers

      • Article

      Contributors

      Other Metrics

      Bibliometrics & Citations

      Bibliometrics

      Article Metrics

      • Downloads (Last 12 months)8
      • Downloads (Last 6 weeks)0
      Reflects downloads up to 16 Nov 2024

      Other Metrics


Cited By

  • (2017) PIBAS FedSPARQL: a web-based platform for integration and exploration of bioinformatics datasets. Journal of Biomedical Semantics 8:1. DOI: 10.1186/s13326-017-0151-z. Online publication date: 20-Sep-2017
  • (2015) Managing changes in distributed biomedical ontologies using hierarchical distributed graph transformation. International Journal of Data Mining and Bioinformatics 11:1 (53-83). DOI: 10.1504/IJDMB.2015.066334. Online publication date: 1-Dec-2015
  • (2013) Introduction. Data Intensive Computing for Biodiversity (1-6). DOI: 10.1007/978-3-642-38047-1_1. Online publication date: 1-Jun-2013
  • (2013) Framework for Biodiversity Information Retrieval in Malaysia. Advances in Biomedical Infrastructure 2013 (15-24). DOI: 10.1007/978-3-642-37137-0_4. Online publication date: 2013
  • (2012) iBIRA – integrated bioinformatics information resource access. Reference Services Review 40:2 (326-343). DOI: 10.1108/00907321211228354. Online publication date: 11-May-2012
  • (2010) A data warehouse approach to semantic integration of pseudomonas data. Proceedings of the 7th international conference on Data integration in the life sciences (90-105). DOI: 10.5555/1884477.1884488. Online publication date: 25-Aug-2010
  • (2010) A Data Warehouse Approach to Semantic Integration of Pseudomonas Data. Data Integration in the Life Sciences (90-105). DOI: 10.1007/978-3-642-15120-0_8. Online publication date: 2010
  • (2009) Bio-medical Ontologies Maintenance and Change Management. Biomedical Data and Applications (143-168). DOI: 10.1007/978-3-642-02193-0_6. Online publication date: 2009
  • (2007) Engineering agent-mediated integration of bioinformatics analysis tools. Multiagent and Grid Systems 3:2 (245-258). DOI: 10.5555/1375358.1375365. Online publication date: 1-Apr-2007
  • (2007) An ontology-based framework for bioinformatics workflows. International Journal of Bioinformatics Research and Applications 3:3 (268-285). DOI: 10.1504/IJBRA.2007.015003. Online publication date: 1-Sep-2007