DOI: 10.1145/3543377.3543388
research-article

GAN-PCL: An Efficient Protein Subchloroplast Site Predictor with GAN-based Data Augmented and Feature Fusion

Published: 08 August 2022

Abstract

Chloroplasts are essential for photosynthesis, and proteins are distributed across different chloroplast regions to perform different functions. Although many computational methods for protein subchloroplast localization have been proposed, prediction accuracy remains limited because samples are scarce and severely imbalanced. Developing a model that generates high-quality samples to supplement existing data and improve prediction performance is of great significance for studying chloroplast abundance in early plant development and the regulation of photosynthetic efficiency in agricultural production. This paper proposes GAN-PCL, a protein subchloroplast site predictor based on data augmentation with a generative adversarial network (GAN) and sequence feature fusion, which addresses this problem effectively. Experimental results show that GAN-PCL achieves good prediction performance and generalization ability compared with current advanced predictors.

Published In

ICBBT '22: Proceedings of the 14th International Conference on Bioinformatics and Biomedical Technology
May 2022
190 pages
ISBN:9781450396387
DOI:10.1145/3543377
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Data augmentation
  2. Deep learning
  3. GAN
  4. Subchloroplast localization

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICBBT 2022
