DOI: 10.1145/2811411.2811534
research-article

Prediction of protein stability changes upon one-point mutations using machine learning

Published: 09 October 2015

Abstract

This paper describes a new approach to predicting protein stability changes upon amino acid mutations. The main goal is to create a meta-tool that combines the outputs of eight well-established prediction tools and, through a suitable consensus method, improves the overall prediction accuracy. The optimal combination of the tools' outputs is sought using a range of machine learning methods. Of all the methods tested, KStar showed the highest prediction accuracy on the training dataset, which was compiled from experimentally validated mutations in the ProTherm database, and it was therefore chosen as the consensus maker. The general prediction ability is validated on a testing dataset composed of multi-point amino acid mutations, also extracted from the ProTherm database. Since multi-point mutations were not used to train any of the integrated tools, we consider this comparison objective. The resulting KStar-based meta-tool improves the correlation coefficient by 0.130 on the training dataset and by 0.239 on the testing dataset, in both cases compared with the most successful integrated tool. Based on these results, we claim that machine learning methods may help identify the strengths and weaknesses of contemporary protein stability prediction tools.
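The consensus idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation. KStar (K*) is an instance-based learner that uses an entropy-based distance; the sketch below swaps in a plain inverse-distance-weighted nearest-neighbour average as a simpler stand-in. It also assumes that the consensus maker predicts a numeric stability change and that the "correlation coefficient" is Pearson's r, neither of which the abstract states. All data are synthetic, and the names `consensus_predict`, `make_synthetic`, and `pearson` are hypothetical.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def consensus_predict(train_X, train_y, query, k=5):
    """Inverse-distance-weighted mean of the k nearest training targets.

    Each row of train_X holds the outputs of the integrated tools for one
    mutation. This is a stand-in for KStar, not KStar itself.
    """
    nearest = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )[:k]
    # The small constant avoids division by zero for exact matches.
    weights = [1.0 / (d + 1e-9) for d, _ in nearest]
    return sum(w * t for w, (_, t) in zip(weights, nearest)) / sum(weights)


def make_synthetic(n, n_tools=8, seed=0):
    """Synthetic data (assumption): each tool reports the true value plus noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        true_value = rng.uniform(-3.0, 3.0)
        X.append([true_value + rng.gauss(0.0, 1.0) for _ in range(n_tools)])
        y.append(true_value)
    return X, y


train_X, train_y = make_synthetic(200, seed=1)
test_X, test_y = make_synthetic(50, seed=2)

# Consensus prediction for every test mutation.
consensus = [consensus_predict(train_X, train_y, row) for row in test_X]

# Baseline: the single tool with the highest correlation on the test set.
best_single_r = max(pearson([row[i] for row in test_X], test_y) for i in range(8))
consensus_r = pearson(consensus, test_y)

print(f"best single tool r = {best_single_r:.3f}, consensus r = {consensus_r:.3f}")
```

The comparison printed at the end mirrors the one in the abstract: the consensus score is set against the best individual tool rather than against an average of the tools.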


Cited By

  • (2020) A sequence embedding method for enzyme optimal condition analysis. BMC Bioinformatics 21:1. DOI: 10.1186/s12859-020-03851-5. Online publication date: 10-Nov-2020.

Published In

RACS '15: Proceedings of the 2015 Conference on Research in Adaptive and Convergent Systems
October 2015
540 pages
ISBN: 9781450337380
DOI: 10.1145/2811411

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. machine learning
  2. protein mutation
  3. protein stability
  4. protherm
  5. stability prediction

Qualifiers

  • Research-article

Conference

RACS '15

Acceptance Rates

RACS '15 paper acceptance rate: 75 of 309 submissions (24%)
Overall acceptance rate: 393 of 1,581 submissions (25%)

Article Metrics

  • Downloads (last 12 months): 2
  • Downloads (last 6 weeks): 2
Reflects downloads up to 13 Nov 2024
