Abstract
This research proposes a novel method for improving the prediction of microbe-disease associations and the identification of candidate drug treatments using computational techniques. The approach integrates data from the Human Microbe-Disease Association Database (HMDAD) with Conditional Tabular Generative Adversarial Networks (CTGAN) for data augmentation and a combination of Graph Neural Networks (GNN) and Graph Attention Networks (GAT) for modeling the relationships between microbes and diseases. In addition, Bidirectional Encoder Representations from Transformers (BERT) is used to encode diseases and drugs, improving the prediction of disease-drug associations. Experimental analysis validates the efficacy of these approaches on complex biological datasets and offers insights into disease etiology and potential treatment pathways. The results highlight the value of combining generative models, graph-based deep learning, and transformer-based natural language processing models in biomedical research and its applications.
Data availability
The dataset used in this work is publicly available, and no new dataset was generated. The required data will be provided on request.
Funding
The authors received no funding support for the experimental work or the writing of the manuscript.
Author information
Contributions
All authors contributed to the conceptualization of the work and the development of the solution. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors have no conflict of interest to declare.
Ethics approval
This study does not involve, and does not violate, any moral or ethical considerations.
Consent for publication
All authors are aware of the publication of this manuscript and have agreed to it.
About this article
Cite this article
Naik, A., Patwardhan, I. & Joshi, A. CGDGMDA-Net: discovering microbe-disease and drug associations through CTGAN and graph-based deep learning. Netw Model Anal Health Inform Bioinforma 13, 48 (2024). https://doi.org/10.1007/s13721-024-00484-z
DOI: https://doi.org/10.1007/s13721-024-00484-z