Abstract
Multiple Sequence Alignment (MSA) is a fundamental process that involves aligning a set of sequences based on their identical form and structure through dynamic programming. MSA is a crucial tool for temporal analyses such as classification, aggregation, and speech recognition. The process uses a penalty score to assess the similarity among the sequences, with or without gaps. In bioinformatics, MSA is widely used to identify conserved regions and functional sites and to reveal similarities and differences between homologous biological sequences consisting of nucleic acid bases or protein amino acids. This provides a basis for genomics research and drug design. However, the large scale of biological sequences often results in high time complexity, making it difficult to achieve efficient processing. To address this issue, a new multi-sequence alignment algorithm called CUK-band has been proposed, which combines the improved central star alignment and the Compute Unified Device Architecture (CUDA) platform. Experimental results have demonstrated that CUK-band is significantly faster than previous methods. Compared to existing GPU approaches such as CMSA, CUK-band has achieved an accelerated running time of over 1.5X.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Du, Y., He, J., Du, C.: A novel binary particle swarm optimization for multiple sequence alignment. In: ICIC. pp. 13–25 (2019)
Lindegger, J., Senol Cali, D., Alser, M., Gómez-Luna, J., Ghiasi, N.M., Mutlu, O.: Scrooge: a fast and memory-frugal genomic sequence aligner for CPUs, GPUs, and ASICs. Bioinform. 39(5), btad151 (2023)
Liu, Z.P., Liu, S., Chen, R., Huang, X., Wu, L.Y.: Structure alignment-based classification of RNA-binding pockets reveals regional RNA recognition motifs on protein surfaces. BMC Bioinform. 18, 1–13 (2017)
Li, L., Sun, L., Chen, G., Wong, C.W., Ching, W.K., Liu, Z.P.: LogBTF: gene regulatory network inference using Boolean threshold network model from single-cell gene expression data. Bioinform. 39(5), btad256 (2023)
Shen, C., Mao, D., Tang, J., Liao, Z., Chen, S.: Prediction of lncRNA-protein interactions based on kernel combinations and graph convolutional networks. IEEE J. Biomed. Health Inform. (2023)
Liu, W., Schmidt, B., Voss, G., Müller-Wittig, W.: GPU-ClustalW: Using graphics hardware to accelerate multiple sequence alignment. In: HiPC. pp. 363–374 (2006)
Manavski, S.A., Valle, G.: CUDA compatible GPU cards as efficient hardware accelerators for Smith-Waterman sequence alignment. BMC Bioinform. 9(S-2), 1–9 (2008)
Carroll, T.C., Ojiaku, J.T., Wong, P.W.: Semiglobal sequence alignment with gaps using GPU. IEEE ACM Trans. Comput. Biol. Bioinform. 17(6), 2086–2097 (2019)
Zou, Q., Shan, X., Jiang, Y.: A novel center star multiple sequence alignment algorithm based on affine gap penalty and k-band. Phys. Proc. 33, 322–327 (2012)
Ye, W., Chen, Y., Zhang, Y., Xu, Y.: H-BLAST: a fast protein sequence alignment toolkit on heterogeneous computers with GPUs. Bioinform. 33(8), 1130–1138 (2017)
Chen, X., Wang, C., Tang, S., Yu, C., Zou, Q.: CMSA: a heterogeneous CPU/GPU compting system for multiple similar RNA/DNA sequence alignment. BMC Bioinform. 18, 1–10 (2017)
Pérez-Serrano, J., Sandes, E., Magalhaes Alves de Melo, A.C., Ujaldón, M.: DNA sequences alignment in Multi-GPUs: acceleration and energy payoff. BMC Bioinform. 19, 161–176 (2018)
Prasad, D.V.V., Jaganathan, S.: Improving the performance of Smith-Waterman sequence algorithm on GPU using shared memory for biological protein sequences. Clust. Comput. 22(Suppl 4), 9495–9504 (2019)
Alawneh, L., Shehab, M.A., Al-Ayyoub, M., Jararweh, Y., Al-Sharif, Z.A.: A scalable multiple pairwise protein sequence alignment acceleration using hybrid CPU-GPU approach. Clust. Comput. 23, 2677–2688 (2020)
Wei, Y., Zou, Q., Tang, F., Yu, L.: WMSA: a novel method for multiple sequence alignment of DNA sequences. Bioinform. 38(22), 5019–5025 (2022)
Chen, J., Chao, J., Liu, H., Yang, F., Zou, Q., Tang, F.: WMSA 2: a multiple DNA/RNA sequence alignment tool implemented with accurate progressive mode and a fast win-win mode combining the center star and progressive strategies. Briefings Bioinform. 24(4), bbad190 (2023)
Wang, Y., Chen, Z., Han, Y.: Accelerating the Smith-Waterman algorithm by GPU for high-throughput sequence alignment. In: MICML. pp. 77–84 (2023)
Tang, F., et al.: HAlign 3: fast multiple alignment of ultra-large numbers of similar DNA/RNA sequences. Mol. Biol. Evol. 39(8), msac166 (2022)
Zhang, P., Liu, H., Wei, Y., Zhai, Y., Tian, Q., Zou, Q.: FMAlign2: a novel fast multiplenucleotide sequence alignment method for ultralong datasets. Bioinform. 40(1), btae014 (2024)
de Oliveira Sandes, E.F., Miranda, G., Martorell, X., Ayguade, E., Teodoro, G., Melo, A.C.M.: CUDAlign 4.0: Incremental speculative traceback for exact chromosome-wide alignment in GPU clusters. IEEE Trans. Parallel Distributed Syst. 27(10), 2838–2850 (2016)
Awan, M.G., et al.: ADEPT: a domain independent sequence alignment strategy for GPU architectures. BMC Bioinform. 21, 1–29 (2020)
Schmidt, B., Kallenborn, F., Chacon, A., Hundt, C.: CUDASW++ 4.0: ultra-fast GPU-based Smith-Waterman protein sequence database search. bioRxiv, 1–18 (2023)
Hung, C.L., Lin, Y.S., Lin, C.Y., Chung, Y.C., Chung, Y.F.: CUDA ClustalW: An efficient parallel algorithm for progressive multiple sequence alignment on Multi-GPUs. Comput. Biol. Chem. 58, 62–68 (2015)
Kalare, K.W., Obaidat, M.S., Tembhurne, J.V., Meshram, C., Hsiao, K.F.: Parallelization of global sequence alignment on graphics processing unit. In: CCCI. pp. 1–5. IEEE (2020)
Suzuki, H., Kasahara, M.: Acceleration of nucleotide semi-global alignment with adaptive banded dynamic programming. BioRxiv p. 130633 (2017)
Su, W., Liao, X., Lu, Y., Zou, Q., Peng, S.: Multiple sequence alignment based on a suffix tree and center-star strategy: A linear method for multiple nucleotide sequence alignment on spark parallel framework. J. Comput. Biol. 24(12), 1230–1242 (2017)
Perez-Wohlfeil, E., Trelles, O., Guil, N.: Irregular alignment of arbitrarily long DNA sequences on GPU. J. Supercomput. 79(8), 8699–8728 (2023)
Aljouie, A., Zhong, L., Roshan, U.: High scoring segment selection for pairwise whole genome sequence alignment with the maximum scoring subsequence and GPUs. Int. J. Comput. Biol. Drug Des. 13(1), 71–81 (2020)
Park, S., et al.: SALoBa: maximizing data locality and workload balance for fast sequence alignment on GPUs. In: IPDPS, pp. 728–738 (2022)
Fang, W., Jiang, H., Lu, H., Sun, J., Wu, X., Lin, J.C.W.: GPU-based efficient parallel heuristic algorithm for high-utility itemset mining in large transaction datasets. Trans. Knowl. Data Eng. (2023)
Sayers, E.W., et al.: GenBank 2023 update. Nucleic Acids Res. 51(D1), D141–D144 (2023)
Acknowledgement
The authors gratefully acknowledge the reviewers' comments and suggestions. This work is sponsored in part by National Natural Science Foundation of China (No. 62106175), and Postgraduate Scientific Research Innovation Practice Program of Tianjin University of Technology (YJ2350).
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kong, X., Shen, C., Tang, J. (2024). CUK-Band: A CUDA-Based Multiple Genomic Sequence Alignment on GPU. In: Huang, DS., Pan, Y., Zhang, Q. (eds) Advanced Intelligent Computing in Bioinformatics. ICIC 2024. Lecture Notes in Computer Science(), vol 14882. Springer, Singapore. https://doi.org/10.1007/978-981-97-5692-6_8
Download citation
DOI: https://doi.org/10.1007/978-981-97-5692-6_8
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5691-9
Online ISBN: 978-981-97-5692-6
eBook Packages: Computer ScienceComputer Science (R0)