Bio-STEER: A Semantic Web workflow tool for Grid computing in the life sciences

Published: 01 March 2007

Abstract

Life science research is becoming ever more computationally intensive. From a computational resource perspective, Grid computing is therefore a logical approach to meeting many of the computational needs of life science research. However, there are several barriers to the widespread use of Grid computing in the life sciences. In this paper, we attempt to address one particular barrier: the difficulty life scientists face in using Grid computing. Life science research often involves connecting multiple applications together to form a workflow, and constructing such a workflow is complex. Combined with the difficulty of using Grid services, composing a meaningful workflow from Grid services can present a challenge to life scientists. Our proposed solution is a Semantic Web-enabled computing environment called Bio-STEER. In Bio-STEER, bioinformatics Grid services are mapped to Semantic Web services described in OWL-S. We also defined an ontology in OWL to model bioinformatics applications. A graphical user interface (GUI) helps users construct a scientific workflow by showing a list of services that are semantically sound; that is, the output of one service is semantically compatible with the input of the connecting service. Through this user-friendly GUI, Bio-STEER can help users take full advantage of Grid services and easily construct the workflows they need.
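The notion of a "semantically sound" connection can be illustrated with a small sketch. It is not Bio-STEER's implementation: the abstract says services are described in OWL-S and typed against an OWL ontology, whereas the sketch below replaces that machinery with a hand-written class hierarchy and a simple subclass walk. All class names, service names, and the functions `is_subclass` and `compatible_successors` are hypothetical and chosen only for illustration. The assumed rule is that a service may follow another when the first service's output type is the same as, or a subclass of, the second service's input type.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical toy hierarchy (child class -> parent class).
# This stands in for an OWL ontology; it is NOT the Bio-STEER ontology.
SUBCLASS_OF: Dict[str, str] = {
    "DNASequence": "Sequence",
    "ProteinSequence": "Sequence",
    "Sequence": "BioData",
    "Alignment": "BioData",
    "Tree": "BioData",
}


@dataclass(frozen=True)
class Service:
    """A service described only by its name and typed input/output."""
    name: str
    input_type: str
    output_type: str


def is_subclass(cls: str, ancestor: str) -> bool:
    """Return True if cls is ancestor itself or one of its descendants."""
    current: Optional[str] = cls
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)  # None once the root is passed
    return False


def compatible_successors(service: Service, catalog: List[Service]) -> List[Service]:
    """List catalog services whose input accepts this service's output."""
    return [
        candidate
        for candidate in catalog
        if candidate != service
        and is_subclass(service.output_type, candidate.input_type)
    ]


# Hypothetical catalog; the service names are illustrative only.
CATALOG = [
    Service("FetchDNA", "AccessionID", "DNASequence"),
    Service("AlignSequences", "Sequence", "Alignment"),
    Service("BuildTree", "Alignment", "Tree"),
    Service("ProteinOnlyTool", "ProteinSequence", "ProteinSequence"),
]

if __name__ == "__main__":
    for svc in CATALOG:
        names = [s.name for s in compatible_successors(svc, CATALOG)]
        print(f"{svc.name} -> {names}")
```

In this sketch a GUI would call `compatible_successors` each time the user adds a service and offer only the returned services as next steps. A real system would delegate the subclass test to an OWL reasoner instead of walking a dictionary.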




Published In

Future Generation Computer Systems  Volume 23, Issue 3
March, 2007
245 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Client/server
  2. Distributed systems
  3. Integrated environments
  4. Semantics
  5. User interface
  6. Web-based services
  7. Workflow management

Qualifiers

  • Article

Cited By

  • (2013) A roadmap to domain specific programming languages for environmental modeling. Proceedings of the 2013 ACM workshop on Domain-specific modeling. 10.1145/2541928.2541934 (27-32). Online publication date: 27-Oct-2013
  • (2011) BioTRON. Proceedings of the 2011 ACM Symposium on Applied Computing. 10.1145/1982185.1982206 (77-82). Online publication date: 21-Mar-2011
  • (2010) Helping biologists effectively build workflows, without programming. Proceedings of the 7th international conference on Data integration in the life sciences. 10.5555/1884477.1884487 (74-89). Online publication date: 25-Aug-2010
  • (2010) Approaching cardiac modeling challenges to computer science with CellML-based web tools. Future Generation Computer Systems. 10.1016/j.future.2009.09.002 26:3 (462-470). Online publication date: 1-Mar-2010
  • (2009) BLAST Application with Data-Aware Desktop Grid Middleware. Proceedings of the 2009 9th IEEE/ACM International Symposium on Cluster Computing and the Grid. 10.1109/CCGRID.2009.91 (284-291). Online publication date: 18-May-2009
  • (2009) The data playground. Future Generation Computer Systems. 10.1016/j.future.2008.09.009 25:4 (453-459). Online publication date: 1-Apr-2009
  • (2009) A mechanism for grid service composition behavior specification and verification. Future Generation Computer Systems. 10.1016/j.future.2008.02.013 25:3 (378-383). Online publication date: 1-Mar-2009
  • (2007) SWAMI. Proceedings of the 4th international conference on Data integration in the life sciences. 10.5555/1768933.1768942 (48-58). Online publication date: 27-Jun-2007
  • (2007) Grid Services Base Library. Future Generation Computer Systems. 10.1016/j.future.2006.07.009 23:3 (517-522). Online publication date: 1-Mar-2007
