Ensemble of Diversely Trained Support Vector Machines for Protein Fold Recognition

Abdollah Dehzangi &
Abdul Sattar

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7802)

Included in the following conference series:

Asian Conference on Intelligent Information and Database Systems

Abstract

Protein Fold Recognition (PFR) is the task of assigning a given protein to a fold based on its major secondary structure. PFR is considered an important step toward protein structure prediction and drug design, yet it remains an unsolved problem in biological science and bioinformatics. In this study, we explore the impact of two novel feature extraction methods, namely overlapped segmented distribution and overlapped segmented autocorrelation, which provide more local discriminatory information for PFR than previously proposed methods in the literature. We evaluate the proposed feature extraction methods using 15 promising physicochemical attributes of the amino acids. We then propose an ensemble of Support Vector Machines (SVMs) that are diversely trained on features extracted from different physicochemical attributes, which improves protein fold prediction accuracy by up to 5% over similar studies in the literature.
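
The abstract does not define the two feature extraction methods, so the following is only a rough sketch of the general idea of computing autocorrelation features on overlapping sequence segments. Everything in it is an assumption made for illustration: the Kyte-Doolittle hydropathy index stands in for one physicochemical attribute (it is not confirmed to be one of the 15 used in the paper), and the function names, the equal-width segmentation, the overlap size, and the lag range are hypothetical choices rather than the authors' definitions.

```python
# Illustrative sketch only: NOT the paper's definition of overlapped
# segmented autocorrelation. Segmentation scheme, overlap, and lags are
# assumptions. Only the 20 standard residues are handled.

# Kyte-Doolittle hydropathy index, used here as an example attribute.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def autocorrelation(values, max_lag):
    """Return [AC(1), ..., AC(max_lag)], where AC(d) is the mean of
    values[i] * values[i + d] over all valid i (0.0 if the lag does not fit)."""
    n = len(values)
    feats = []
    for d in range(1, max_lag + 1):
        if d >= n:
            feats.append(0.0)
            continue
        total = sum(values[i] * values[i + d] for i in range(n - d))
        feats.append(total / (n - d))
    return feats


def overlapping_segments(values, n_segments, overlap):
    """Split values into n_segments near-equal windows, each widened by
    `overlap` positions on both sides (clipped at the sequence ends)."""
    n = len(values)
    segments = []
    for k in range(n_segments):
        start = max(0, k * n // n_segments - overlap)
        end = min(n, (k + 1) * n // n_segments + overlap)
        segments.append(values[start:end])
    return segments


def segmented_autocorrelation(sequence, scale=HYDROPATHY,
                              n_segments=2, overlap=2, max_lag=2):
    """Map a protein sequence to a fixed-length feature vector of
    n_segments * max_lag autocorrelation values."""
    values = [scale[aa] for aa in sequence]
    feats = []
    for segment in overlapping_segments(values, n_segments, overlap):
        feats.extend(autocorrelation(segment, max_lag))
    return feats
```

Under these assumptions, one such feature vector could be computed per physicochemical attribute, each vector could train its own SVM, and the per-attribute predictions could be combined (for example by voting) to form a diversely trained ensemble; the combination rule actually used in the paper is not stated in the abstract.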

Author information

Authors and Affiliations

Institute for Integrated and Intelligent Systems (IIIS), Griffith University, Brisbane, Australia
Abdollah Dehzangi & Abdul Sattar
National ICT Australia (NICTA), Brisbane, Australia
Abdollah Dehzangi & Abdul Sattar

Editor information

Editors and Affiliations

Faculty of Computer Science and Information Systems, Department of Software Engineering, Universiti Teknologi Malaysia, 81310, Johor Bahru, Johor, Malaysia
Ali Selamat & Habibollah Haron
Institute of Informatics, Division of Knowledge Management Systems, Wrocław University of Technology, Str. Wybrzeże Wyspiańskiego 27, 50-370, Wrocław, Poland
Ngoc Thanh Nguyen

About this paper

Cite this paper

Dehzangi, A., Sattar, A. (2013). Ensemble of Diversely Trained Support Vector Machines for Protein Fold Recognition. In: Selamat, A., Nguyen, N.T., Haron, H. (eds) Intelligent Information and Database Systems. ACIIDS 2013. Lecture Notes in Computer Science, vol 7802. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36546-1_35

DOI: https://doi.org/10.1007/978-3-642-36546-1_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36545-4
Online ISBN: 978-3-642-36546-1
eBook Packages: Computer Science, Computer Science (R0)