DOI: 10.1145/3459930.3469534
research-article
Open access

Predicting drug resistance in M. tuberculosis using a long-term recurrent convolutional network

Published: 01 August 2021

Abstract

Motivation: Drug resistance in Mycobacterium tuberculosis (MTB) is a growing threat to human health worldwide. One way to mitigate the risk of drug resistance is to enable clinicians to prescribe the right antibiotic drugs to each patient through methods that predict drug resistance in MTB using whole-genome sequencing (WGS) data. Existing machine learning methods for this task typically convert the WGS data from a given bacterial isolate into features corresponding to single-nucleotide polymorphisms (SNPs) or short sequence segments of a fixed length K (K-mers). Here, we introduce a gene burden-based method for predicting drug resistance in MTB. We define one numerical feature per gene corresponding to the number of mutations in that gene in a given isolate. This representation greatly reduces the number of model parameters. We further propose a model architecture that considers both gene order and locality structure through a Long-term Recurrent Convolutional Network (LRCN) architecture, which combines convolutional and recurrent layers.
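As a rough illustration of the gene burden representation described above, the following sketch counts the mutations that fall inside each gene of one isolate. The gene names, coordinates, and variant positions are invented for this example, and the helper `gene_burden` is hypothetical; it is not taken from the authors' preprocessing pipeline. The only idea carried over from the abstract is that each gene contributes one feature equal to its mutation count.

```python
from bisect import bisect_right

# Hypothetical gene annotations: (name, start, end), 1-based inclusive,
# sorted by start and non-overlapping. Real coordinates would come from
# a genome annotation; these values are invented for illustration.
GENES = [
    ("geneA", 100, 499),
    ("geneB", 600, 1199),
    ("geneC", 1300, 1500),
]


def gene_burden(mutation_positions, genes=GENES):
    """Return one count per gene: the number of mutations inside it.

    The output keeps the genes in genomic order, so a downstream model
    that depends on gene order sees a consistent layout.
    """
    starts = [start for _, start, _ in genes]
    counts = [0] * len(genes)
    for pos in mutation_positions:
        # Index of the last gene whose start is <= pos (or -1 if none).
        idx = bisect_right(starts, pos) - 1
        if idx >= 0 and pos <= genes[idx][2]:
            counts[idx] += 1
        # Intergenic positions are ignored in this sketch.
    return counts


# One isolate with five called variants (positions are made up).
isolate_variants = [120, 450, 550, 700, 1500]
features = gene_burden(isolate_variants)
print(dict(zip((name for name, _, _ in GENES), features)))
```

Because the feature vector has one entry per gene rather than one per SNP or K-mer, its length is bounded by the number of annotated genes, which is what keeps the number of model parameters small.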
Results: We find that using these strategies yields a substantial, statistically significant improvement over state-of-the-art methods on a large dataset of M. tuberculosis isolates, and suggest that this improvement is driven by our method's ability to account for the order of the genes in the genome and their organization into operons.
Availability: The implementations of our feature preprocessing pipeline and our LRCN model are publicly available, as is our complete dataset.
Supplementary information: Additional data are available in the Supplementary Materials document.
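The architecture named in the abstract, convolutional layers followed by recurrent layers applied to a gene-ordered feature vector, can be sketched as a single forward pass. This is a minimal NumPy illustration under stated assumptions, not the authors' released model: the layer sizes, the pooling step, the random untrained weights, and the synthetic burden vector are all invented here, and the function names are hypothetical. The intent is only to show how a convolution captures local neighbourhoods of genes while an LSTM reads the resulting sequence in genomic order.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv1d_relu(x, kernels, bias):
    """Valid 1-D convolution over a (length, in_channels) sequence.

    kernels has shape (width, in_channels, filters).
    Returns an array of shape (length - width + 1, filters).
    """
    width = kernels.shape[0]
    steps = x.shape[0] - width + 1
    out = np.stack([
        np.tensordot(x[t:t + width], kernels, axes=([0, 1], [0, 1]))
        for t in range(steps)
    ])
    return np.maximum(out + bias, 0.0)


def max_pool1d(x, size):
    """Non-overlapping max pooling along the sequence axis."""
    steps = x.shape[0] // size
    return x[:steps * size].reshape(steps, size, x.shape[1]).max(axis=1)


def lstm_last_state(x, W, U, b):
    """Run a single-layer LSTM and return the final hidden state.

    W: (features, 4 * hidden), U: (hidden, 4 * hidden), b: (4 * hidden,).
    Gate order along the last axis: input, forget, candidate, output.
    """
    hidden = U.shape[0]
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x_t in x:
        z = x_t @ W + h @ U + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h


def lrcn_forward(burden, params, pool=2):
    """Map one gene-ordered burden vector to a resistance probability."""
    x = burden.astype(float)[:, None]  # (genes, 1 channel)
    x = conv1d_relu(x, params["k"], params["kb"])  # local gene windows
    x = max_pool1d(x, pool)
    h = lstm_last_state(x, params["W"], params["U"], params["b"])
    return float(sigmoid(h @ params["w_out"] + params["b_out"]))


# All sizes and weights below are arbitrary, untrained placeholders.
rng = np.random.default_rng(0)
n_genes, width, filters, hidden = 20, 3, 4, 5
params = {
    "k": rng.normal(scale=0.1, size=(width, 1, filters)),
    "kb": np.zeros(filters),
    "W": rng.normal(scale=0.1, size=(filters, 4 * hidden)),
    "U": rng.normal(scale=0.1, size=(hidden, 4 * hidden)),
    "b": np.zeros(4 * hidden),
    "w_out": rng.normal(scale=0.1, size=hidden),
    "b_out": 0.0,
}
burden = rng.poisson(0.5, size=n_genes)  # synthetic mutation counts
prob = lrcn_forward(burden, params)
print(f"predicted resistance probability: {prob:.3f}")
```

Since the weights are random, the printed probability carries no meaning; a usable model would need to be trained on labelled isolates, typically with a deep learning framework rather than hand-written layers.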

Published In

BCB '21: Proceedings of the 12th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
August 2021
603 pages
ISBN: 9781450384506
DOI: 10.1145/3459930
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery, New York, NY, United States

Author Tags

1. antimicrobial resistance
2. deep learning
3. infectious disease
4. next-generation sequencing
5. tuberculosis

Acceptance Rates

Overall Acceptance Rate: 254 of 885 submissions, 29%
