DOI: 10.1145/3569966.3570012
research-article

Accelerating minimap2 for long-read sequencing on NUMA multi-core CPU

Published: 20 December 2022

Abstract

Recent advances in third-generation sequencing technology allow large volumes of long reads to be generated rapidly, and mapping these long reads to a reference sequence is one of the first and most time-consuming steps in downstream genomics applications. Minimap2, the state-of-the-art long-read aligner available today, is both fast and accurate. However, although NUMA multi-core CPUs are gradually becoming the processors of mainstream computers, minimap2 has not been specifically optimised or adapted for the NUMA multi-core architecture. Frequent remote memory accesses, resource contention and idle hardware resources result in performance far below the theoretical peak of NUMA multi-core CPUs. To address these problems and achieve full utilisation of resources, we propose three optimisation strategies: (1) copying the index to each NUMA node and binding threads to the cores of that node, (2) designing a new mechanism for overlapping IO and computation, and (3) adaptively adjusting batch_size based on IO and computation time. We obtained three sets of human genome sequencing data from the ENA database and ran performance tests on an FT 2000+ MCD-FP92 NUMA multi-core CPU system. The three strategies proposed in this paper effectively improve the performance of minimap2, with a maximum speedup of 13%.
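
The second and third strategies in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `read_batches`, `overlapped_map`, `adjust_batch_size` and all of their parameters are hypothetical names introduced here. The sketch assumes a single reader thread that prefetches batches into a bounded queue while the main thread maps the previous batch, which is one common way to overlap IO with computation. The doubling and halving rule in `adjust_batch_size` is a placeholder of ours, since the abstract does not state the paper's actual adjustment policy. NUMA index replication and thread binding are not covered.

```python
import queue
import threading


def read_batches(reads, batch_size):
    """Yield consecutive slices of `reads`; stands in for reading from disk."""
    for start in range(0, len(reads), batch_size):
        yield reads[start:start + batch_size]


def overlapped_map(reads, batch_size, map_read, depth=2):
    """Map every read while a background thread prefetches the next batch.

    `depth` bounds how many batches may be buffered ahead of the consumer.
    """
    batches = queue.Queue(maxsize=depth)
    done = object()  # sentinel marking the end of the input

    def producer():
        for batch in read_batches(reads, batch_size):
            batches.put(batch)  # blocks when the buffer is full
        batches.put(done)

    reader = threading.Thread(target=producer, daemon=True)
    reader.start()

    results = []
    while True:
        batch = batches.get()  # blocks only if IO is behind computation
        if batch is done:
            break
        results.extend(map_read(read) for read in batch)
    reader.join()
    return results


def adjust_batch_size(batch_size, io_time, compute_time, lo=1, hi=1024):
    """Placeholder rule (an assumption, not the paper's policy).

    Grow the batch when IO is the slower stage, shrink it when computation
    is the slower stage, and clamp the result to [lo, hi].
    """
    if io_time > compute_time:
        batch_size *= 2
    elif compute_time > io_time:
        batch_size //= 2
    return max(lo, min(hi, batch_size))
```

In this sketch a caller would time the read and map stages of each batch and pass those measurements to `adjust_batch_size` before requesting the next batch. In a real aligner `map_read` would be the alignment kernel; here any function of one read will do.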



Published In

CSSE '22: Proceedings of the 5th International Conference on Computer Science and Software Engineering
October 2022
753 pages
ISBN:9781450397780
DOI:10.1145/3569966
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. Accelerate minimap2
  2. Gene Sequencing
  3. Minimap2
  4. NUMA multi-core CPU

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

CSSE 2022

Acceptance Rates

Overall Acceptance Rate 33 of 74 submissions, 45%

