Abstract
Accurate prediction of host phenotypes from a microbial sample and identification of the associated microbial markers are important for understanding how the microbiome affects the pathogenesis and progression of various diseases within the host. PopPhy-CNN is a deep learning tool that predicts host phenotypes using a convolutional neural network (CNN). By representing samples as annotated taxonomic trees and then representing these trees as matrices, PopPhy-CNN exploits the CNN’s innate ability to explore locally similar microbes on the taxonomic tree. PopPhy-CNN can also be used to evaluate the importance of each taxon in the prediction of host status. Here, we describe the underlying methodology, architecture, and core utility of PopPhy-CNN, and we demonstrate its use on a microbial dataset.
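To make the tree-to-matrix idea concrete, the following is a minimal sketch and not the PopPhy-CNN implementation. It assumes a hypothetical toy taxonomy (`PARENT`), hypothetical relative abundances (`sample`), and one plausible layout: each node is annotated with the summed abundance of its descendants, and each tree depth becomes one matrix row, padded with zeros. The function names `populate` and `tree_to_matrix` are illustrative only.

```python
from collections import defaultdict

# Hypothetical toy taxonomy, given as child -> parent (None marks the root).
PARENT = {
    "Bacteria": None,
    "Firmicutes": "Bacteria",
    "Bacteroidetes": "Bacteria",
    "Lactobacillus": "Firmicutes",
    "Clostridium": "Firmicutes",
    "Bacteroides": "Bacteroidetes",
}


def children_of(parent_map):
    """Invert the child -> parent map into parent -> [children]."""
    kids = defaultdict(list)
    for child, parent in parent_map.items():
        if parent is not None:
            kids[parent].append(child)
    return kids


def populate(node, kids, observed, totals):
    """Annotate each node with its own abundance plus that of its descendants."""
    total = observed.get(node, 0.0)
    for child in kids.get(node, []):
        total += populate(child, kids, observed, totals)
    totals[node] = total
    return total


def tree_to_matrix(parent_map, observed):
    """Lay the annotated tree out as a matrix: one row per depth, zero-padded."""
    kids = children_of(parent_map)
    root = next(n for n, p in parent_map.items() if p is None)
    totals = {}
    populate(root, kids, observed, totals)

    rows = []
    level = [root]
    while level:
        rows.append([totals[n] for n in level])
        # Children stay in the order of their parents, so siblings remain
        # adjacent and a convolution filter sees related taxa together.
        level = [c for n in level for c in kids.get(n, [])]

    width = max(len(r) for r in rows)
    return [r + [0.0] * (width - len(r)) for r in rows]


# Hypothetical relative abundances observed at the leaves of one sample.
sample = {"Lactobacillus": 0.5, "Clostridium": 0.25, "Bacteroides": 0.25}
matrix = tree_to_matrix(PARENT, sample)
for row in matrix:
    print(row)
```

Because siblings occupy neighbouring columns and parents sit in the row above, a small 2D filter sliding over such a matrix covers taxonomically related taxa, which is the locality property the abstract refers to. The resulting matrix could then be fed to any standard CNN as a single-channel image.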
Copyright information
© 2021 Springer Science+Business Media, LLC, part of Springer Nature
About this protocol
Cite this protocol
Reiman, D., Farhat, A.M., Dai, Y. (2021). Predicting Host Phenotype Based on Gut Microbiome Using a Convolutional Neural Network Approach. In: Cartwright, H. (eds) Artificial Neural Networks. Methods in Molecular Biology, vol 2190. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-0826-5_12
DOI: https://doi.org/10.1007/978-1-0716-0826-5_12
Publisher Name: Humana, New York, NY
Print ISBN: 978-1-0716-0825-8
Online ISBN: 978-1-0716-0826-5
eBook Packages: Springer Protocols