Abstract
Personalized medicine is a major focus for scientists seeking effective treatments for diseases. This approach considers a patient's genetic make-up as well as their preferences, beliefs, attitudes, knowledge and social context. Deep learning techniques play an important role in bioinformatics and have achieved notable results on its tasks. Metagenomic data analysis is important for developing and evaluating methods and tools for personalized medicine. Metagenomic data is usually high-dimensional, which makes it difficult for humans to interpret. Visualizing metagenomic data provides insights that can help researchers explore patterns in the data. Moreover, these visualizations can be fed into deep learning models such as Convolutional Neural Networks for prediction tasks. In this study, we propose a visualization method for metagenomic data in which features are arranged in the visualization using the K-means clustering algorithm. Experiments on metagenomic datasets for three diseases (Colorectal Cancer, Obesity and Type 2 Diabetes) show that the proposed approach not only provides a robust visualization method in which clusters can be observed in the images, but also improves disease prediction performance with deep learning algorithms.
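The feature-arrangement idea can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how features are represented for clustering or how the image is filled, so the sketch assumes that each feature is described by its abundance profile across samples, that features are ordered by cluster label, and that each sample is written row by row into a zero-padded square image. The function names (`kmeans_labels`, `arrange_features`, `to_images`) and the toy data are hypothetical.

```python
import numpy as np

def kmeans_labels(points, k, n_iter=50, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per row of `points`."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct rows chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels

def arrange_features(abundance, k, seed=0):
    """Return a feature order that places features of the same cluster side by side."""
    # Assumption: a feature is represented by its column of the
    # samples-by-features abundance matrix.
    labels = kmeans_labels(abundance.T, k, seed=seed)
    order = np.argsort(labels, kind="stable")
    return order, labels

def to_images(abundance, order):
    """Write each sample's reordered features row by row into a square image."""
    n_samples, n_features = abundance.shape
    side = int(np.ceil(np.sqrt(n_features)))
    padded = np.zeros((n_samples, side * side))  # unused pixels stay zero
    padded[:, :n_features] = abundance[:, order]
    return padded.reshape(n_samples, side, side)

# Toy data standing in for a real abundance table: 6 samples x 10 features.
rng = np.random.default_rng(42)
abundance = rng.random((6, 10))
order, labels = arrange_features(abundance, k=3)
images = to_images(abundance, order)  # one 4x4 image per sample
```

Each resulting image could then be passed to a convolutional network as a single-channel input. The number of clusters `k` is a free parameter in this sketch.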
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Nguyen, H.T., Tran, T.B., Luong, H.H., Le, T.P., Tran, N.C., Truong, QD. (2020). K-Means Clustering for Features Arrangement in Metagenomic Data Visualization. In: Hernes, M., Wojtkiewicz, K., Szczerbicki, E. (eds) Advances in Computational Collective Intelligence. ICCCI 2020. Communications in Computer and Information Science, vol 1287. Springer, Cham. https://doi.org/10.1007/978-3-030-63119-2_7
DOI: https://doi.org/10.1007/978-3-030-63119-2_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-63118-5
Online ISBN: 978-3-030-63119-2
eBook Packages: Computer Science (R0)