Abstract
Emerging evidence indicates that long non-coding RNAs (lncRNAs) play a crucial role in human disease. Discovering disease-gene associations is a fundamental and critical biomedical task that helps biologists and physicians uncover the complex pathogenic mechanisms underlying a phenotype. With high-throughput sequencing technology and various clinical biomarkers available to measure the similarities between lncRNAs and disease phenotypes, network-based semi-supervised learning has been commonly used to address this class-imbalanced, large-scale data problem. However, most existing approaches are based on linear models and suffer from two major limitations: 1) they implicitly consider a local-structure representation for each candidate; 2) they are unable to capture nonlinear associations between lncRNAs and diseases. In this paper, we propose a new framework for the lncRNA-disease association task, named GNN-IMC, that combines a Graph Neural Network (GNN) with inductive matrix completion. With the help of the GNN, we generate subgraphs based on (lncRNA, disease) pairs from the observed association matrix and map these subgraphs to their corresponding associations. In addition, GNN-IMC is inductive: it can generalize to lncRNAs and diseases unseen during training (given that their associations exist), and can even transfer to new tasks. Empirical results demonstrate that the proposed deep learning algorithm outperforms all other state-of-the-art methods on most metrics.
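The subgraph-generation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the hop count, the node features, or the GNN architecture, so everything below is an assumption: the function name `enclosing_subgraph`, the `hops` parameter, and the choice to mask the target entry are hypothetical and only show one plausible way to cut a local subgraph around a (lncRNA, disease) pair out of a binary association matrix.

```python
import numpy as np

def enclosing_subgraph(assoc, lnc, dis, hops=1):
    """Collect the lncRNAs and diseases within `hops` of a target pair.

    assoc: binary matrix (n_lncRNA x n_disease), 1 = observed association.
    Hypothetical helper; the target entry is masked so that the label
    being predicted is not visible inside the extracted subgraph.
    """
    a = assoc.copy()
    a[lnc, dis] = 0  # hide the link under prediction (assumption)
    lnc_nodes, dis_nodes = [lnc], [dis]
    lnc_frontier, dis_frontier = {lnc}, {dis}
    for _ in range(hops):
        rows = np.array(sorted(lnc_frontier), dtype=int)
        cols = np.array(sorted(dis_frontier), dtype=int)
        # Diseases linked to the lncRNA frontier, lncRNAs linked to the disease frontier.
        new_dis = set(np.flatnonzero(a[rows, :].any(axis=0)).tolist())
        new_lnc = set(np.flatnonzero(a[:, cols].any(axis=1)).tolist())
        dis_frontier = new_dis - set(dis_nodes)
        lnc_frontier = new_lnc - set(lnc_nodes)
        lnc_nodes += sorted(lnc_frontier)
        dis_nodes += sorted(dis_frontier)
    # Local association matrix; the target pair sits at position (0, 0).
    sub = a[np.ix_(lnc_nodes, dis_nodes)]
    return lnc_nodes, dis_nodes, sub

# Toy association matrix: 3 lncRNAs x 3 diseases.
assoc = np.array([[1, 1, 0],
                  [0, 1, 0],
                  [0, 0, 1]])
lnc_nodes, dis_nodes, sub = enclosing_subgraph(assoc, lnc=0, dis=0, hops=2)
```

In a pipeline of this kind, each extracted `sub` (together with some node labeling) would be passed to a GNN that outputs an association score for the target pair. Because the model only sees local structure rather than global row or column identities, it can in principle be applied to pairs that were not present during training.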
Acknowledgement
This work was supported by the National Key R&D Program of China [No. 2019YFB1404700] and the Natural Science Foundation of Shandong Province [No. ZR2017LF019].
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Yuan, L. et al. (2020). LncRNA-Disease Association Prediction Based on Graph Neural Networks and Inductive Matrix Completion. In: Huang, DS., Jo, KH. (eds) Intelligent Computing Theories and Application. ICIC 2020. Lecture Notes in Computer Science, vol 12464. Springer, Cham. https://doi.org/10.1007/978-3-030-60802-6_23
DOI: https://doi.org/10.1007/978-3-030-60802-6_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60801-9
Online ISBN: 978-3-030-60802-6
eBook Packages: Computer Science, Computer Science (R0)