Abstract
BLASTZ is a sequence alignment tool designed mainly for aligning neutrally evolved biological sequences, and it has become the tool of choice for aligning noncoding sequences. However, its running time makes it impractical for high-throughput alignment of long sequences, such as the alignment of the human and mouse genomes. To improve performance and efficiency for alignment at genome scale, BLASTZ was implemented on a computing grid using the GLOBUS toolkit. A dynamic load balancing technique was introduced to improve performance on a grid whose resources have heterogeneous characteristics, such as differing computational power. The implementation is also shown to be robust to disturbances caused by other processes running on the grid.
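The abstract does not spell out the load balancing scheme, so the following is only a minimal sketch of the general idea, assuming a master that hands the next alignment task to whichever worker becomes idle first. The function names, the task costs, and the worker speeds are hypothetical and are not taken from the paper. The sketch simulates completion times rather than running BLASTZ, and compares this dynamic dispatch with a static round-robin split on workers of different speeds.

```python
import heapq


def static_makespan(task_costs, speeds):
    """Assign tasks round-robin up front, ignoring worker speed."""
    finish = [0.0] * len(speeds)
    for i, cost in enumerate(task_costs):
        w = i % len(speeds)
        finish[w] += cost / speeds[w]
    # The run ends when the slowest worker finishes its share.
    return max(finish)


def dynamic_makespan(task_costs, speeds):
    """Give each task to the worker that becomes idle first."""
    # Min-heap of (time at which the worker is idle, worker index).
    idle = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(idle)
    counts = [0] * len(speeds)
    makespan = 0.0
    for cost in task_costs:
        t, w = heapq.heappop(idle)
        done = t + cost / speeds[w]
        counts[w] += 1
        makespan = max(makespan, done)
        heapq.heappush(idle, (done, w))
    return makespan, counts


if __name__ == "__main__":
    # Hypothetical workload: 12 equal tasks, three workers of unequal speed.
    costs = [4.0] * 12
    speeds = [4.0, 2.0, 1.0]
    print("static :", static_makespan(costs, speeds))
    print("dynamic:", dynamic_makespan(costs, speeds))
```

With these illustrative values the static split finishes at time 16.0, because the slowest worker receives as many tasks as the fastest, while the dynamic dispatch finishes at time 8.0 with the fastest worker taking 7 of the 12 tasks. A real grid deployment would also have to account for data transfer and job submission overhead, which this sketch ignores.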
Cite this article
Chen, C., Rajapakse, J.C. Grid-Enabled BLASTZ: Application to Comparative Genomics. J VLSI Sign Process Syst Sign Im 48, 301–309 (2007). https://doi.org/10.1007/s11265-007-0065-6