lncRNA-LSTM: Prediction of Plant Long Non-coding RNAs Using Long Short-Term Memory Based on p-nts Encoding

  • Conference paper
Intelligent Computing Methodologies (ICIC 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11645)

Abstract

Long non-coding RNAs (lncRNAs) play an important role in regulating biological activities. Traditional feature engineering methods for lncRNA prediction rely on prior experience and require manual feature extraction from related datasets. Moreover, the structure of plant lncRNAs is complex, which makes it difficult to extract discriminative features. This paper proposes lncRNA-LSTM, a method for lncRNA recognition based on long short-term memory (LSTM) networks. K-means clustering is first used to address the imbalance in sample sizes. The RNA sequences are then encoded with p-nts encoding according to their characteristics and fed into a recurrent neural network consisting of an embedding layer, an LSTM layer and a fully connected layer. lncRNA-LSTM is more effective than support vector machine, Naive Bayes and other models that fuse open reading frame, secondary structure and k-mer features. On the same Zea mays dataset, lncRNA-LSTM achieves 96.2% accuracy, which is 0.053, 0.173, 0.211 and 0.162 higher than that of CPC2, CNCI, PLEK and LncADeep, respectively, and its precision and recall are also considerably better and more robust.
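The abstract names p-nts encoding as the input to the embedding layer but does not define it. The following is a minimal sketch only, under the assumption that p-nts encoding means cutting each RNA sequence into consecutive, non-overlapping tokens of p nucleotides and mapping each token to an integer ID. The names `build_vocab`, `encode`, `PAD_ID` and `UNK_ID`, the non-overlapping stride, the handling of ambiguous bases and the padding scheme are all hypothetical choices made for illustration, not details taken from the paper.

```python
from itertools import product

ALPHABET = "ACGU"
PAD_ID = 0  # assumption: ID 0 is reserved for padding
UNK_ID = 1  # assumption: ID 1 marks tokens with characters outside ALPHABET (e.g. N)


def build_vocab(p):
    """Map every possible p-nucleotide token to an integer ID, starting at 2."""
    tokens = ("".join(t) for t in product(ALPHABET, repeat=p))
    return {token: idx for idx, token in enumerate(tokens, start=2)}


def encode(seq, p, vocab, max_len):
    """Encode a sequence as a fixed-length list of token IDs.

    The sequence is upper-cased, T is converted to U, and the result is cut
    into non-overlapping windows of p nucleotides (an assumed reading of
    "p-nts"). A trailing remainder shorter than p is dropped. The ID list is
    truncated or right-padded with PAD_ID to exactly max_len entries.
    """
    seq = seq.upper().replace("T", "U")
    tokens = [seq[i:i + p] for i in range(0, len(seq) - p + 1, p)]
    ids = [vocab.get(token, UNK_ID) for token in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


vocab = build_vocab(3)
print(len(vocab))                             # 64 tokens for p = 3
print(encode("ATGCCGNAA", 3, vocab, 5))       # [16, 24, 1, 0, 0]
```

Integer sequences of this kind are the usual input to an embedding layer, which is why the sketch pads every sequence to a common length; the value of p and the maximum length would need to be chosen for the dataset at hand.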

Acknowledgment

The current study was supported by the National Natural Science Foundation of China (Nos. 61872055 and 31872116).

Author information

Corresponding author

Correspondence to Yushi Luan.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Meng, J., Chang, Z., Zhang, P., Shi, W., Luan, Y. (2019). lncRNA-LSTM: Prediction of Plant Long Non-coding RNAs Using Long Short-Term Memory Based on p-nts Encoding. In: Huang, DS., Huang, ZK., Hussain, A. (eds) Intelligent Computing Methodologies. ICIC 2019. Lecture Notes in Computer Science, vol 11645. Springer, Cham. https://doi.org/10.1007/978-3-030-26766-7_32

  • DOI: https://doi.org/10.1007/978-3-030-26766-7_32

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-26765-0

  • Online ISBN: 978-3-030-26766-7

  • eBook Packages: Computer Science, Computer Science (R0)
