Essential Latent Knowledge for Protein-Protein Interactions: Analysis by an Unsupervised Learning Approach

Published: 01 April 2005

Abstract

Protein-protein interactions play central roles in many cellular functions, including DNA replication, transcription and translation, signal transduction, and metabolic pathways. A recent increase in the number of known protein-protein interactions has made predicting unknown interactions important for understanding living cells. However, the interactions obtained experimentally so far are often incomplete and contradictory; consequently, existing computational prediction methods integrate evidence (latent knowledge of proteins) from different, more reliable sources. Analyzing the relationships between proteins and this latent knowledge is important for understanding cellular processes. For this analysis, we propose a new probabilistic model for protein-protein interactions that takes the latent knowledge of proteins into account. We further present an efficient learning algorithm for this model, based on the EM algorithm. In a supervised test setting, the proposed method outperformed five competing methods by a statistically significant margin in all cases. Using the probability parameters of a trained model, we further identify the latent knowledge that is essential for predicting protein-protein interactions. Overall, our experimental results confirm that the proposed model is especially effective for analyzing protein-protein interactions from the viewpoint of the latent knowledge of proteins.
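The abstract does not give the model's equations, so the following is only a minimal sketch of the general idea of fitting a latent-variable model of pairwise co-occurrence with EM. It assumes a generic aspect model in the style of probabilistic latent semantic analysis, P(x, y) = Σ_z P(z) P(x|z) P(y|z), where x and y index the two members of an observed pair and z is a latent cluster. The function names `fit_aspect_model` and `pair_probability`, the toy data, and the smoothing constant are all hypothetical and are not taken from the paper.

```python
import math
import random


def fit_aspect_model(pairs, n_items, n_latent, n_iter=100, seed=0):
    """Fit P(x, y) = sum_z P(z) P(x|z) P(y|z) to observed (x, y) pairs by EM.

    Illustrative assumption: this is a generic aspect model, not the
    model proposed in the paper.
    """
    rng = random.Random(seed)

    def random_dist(size):
        # Random strictly positive weights, normalized to sum to 1.
        weights = [rng.random() + 0.1 for _ in range(size)]
        total = sum(weights)
        return [w / total for w in weights]

    p_z = random_dist(n_latent)
    p_x = [random_dist(n_items) for _ in range(n_latent)]
    p_y = [random_dist(n_items) for _ in range(n_latent)]
    eps = 1e-9  # tiny smoothing so that no probability becomes exactly zero
    history = []  # log-likelihood under the parameters at the start of each iteration

    for _ in range(n_iter):
        cnt_z = [0.0] * n_latent
        cnt_x = [[0.0] * n_items for _ in range(n_latent)]
        cnt_y = [[0.0] * n_items for _ in range(n_latent)]
        log_lik = 0.0
        for x, y in pairs:
            # E-step: posterior responsibility of each latent cluster for this pair.
            joint = [p_z[z] * p_x[z][x] * p_y[z][y] for z in range(n_latent)]
            total = sum(joint)
            log_lik += math.log(total)
            for z in range(n_latent):
                resp = joint[z] / total
                cnt_z[z] += resp
                cnt_x[z][x] += resp
                cnt_y[z][y] += resp
        history.append(log_lik)

        # M-step: re-estimate each distribution from the expected counts.
        n = len(pairs)
        p_z = [(c + eps) / (n + eps * n_latent) for c in cnt_z]
        p_x = [[(c + eps) / (cnt_z[z] + eps * n_items) for c in cnt_x[z]]
               for z in range(n_latent)]
        p_y = [[(c + eps) / (cnt_z[z] + eps * n_items) for c in cnt_y[z]]
               for z in range(n_latent)]

    return p_z, p_x, p_y, history


def pair_probability(model, x, y):
    """Joint probability the fitted model assigns to the pair (x, y)."""
    p_z, p_x, p_y, _ = model
    return sum(p_z[z] * p_x[z][x] * p_y[z][y] for z in range(len(p_z)))


# Hypothetical toy data: items 0-2 pair only with each other, as do items 3-5.
toy_pairs = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)] * 5
model = fit_aspect_model(toy_pairs, n_items=6, n_latent=2)
print("log-likelihood, first and last iteration:", model[3][0], model[3][-1])
print("score of pair (0, 1):", pair_probability(model, 0, 1))
```

In a sketch like this, the fitted conditional distributions play the role the abstract assigns to "the probability parameters of a trained model": inspecting which items carry high probability under each latent cluster is one way to see what the model has learned.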

Cited By

  • (2012) Mining from protein–protein interactions. Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 2:5 (400-410). DOI: 10.1002/widm.1065. Online publication date: 1-Sep-2012
  • (2007) Predicting functional protein-protein interactions based on computational methods. Proceedings of the 2007 international conference on Life System Modeling and Simulation (354-363). DOI: 10.5555/2393672.2393717. Online publication date: 14-Sep-2007

Published In

IEEE/ACM Transactions on Computational Biology and Bioinformatics, Volume 2, Issue 2
April 2005
95 pages

Publisher

IEEE Computer Society Press

Washington, DC, United States

Author Tags

  1. Biology and genetics
  2. data mining
  3. machine learning
  4. mining methods and algorithms

Qualifiers

  • Article
