DOI: 10.1145/2382936.2383053
research-article

A collective NMF method for detecting protein functional module from multiple data sources

Published: 07 October 2012

Abstract

Detecting functional modules from protein-protein interaction (PPI) networks is an active research area with many practical applications. Two concerns persist, however: high-throughput experiments produce false interactions, and a single PPI network carries too little information to give satisfactory results. To address these problems, we propose a soft clustering method based on Collective Non-negative Matrix Factorization (CoNMF) that efficiently integrates gene ontology (GO) information, gene expression data, and PPI networks. In our method, the three data sources are formed into two graphs with similarity adjacency matrices, and these graphs are approximated by matrix factorizations that share a common factor, which provides a straightforward interpretation of the clustering results. Extensive experiments show that integrating multiple biological data sources improves module detection performance, and that CoNMF yields superior results compared to other multi-source data fusion methods, identifying a larger number of more precise protein modules that have actual biological meaning and a certain degree of overlap.
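
The shared-factor idea can be illustrated with a minimal sketch. The abstract does not state the objective or the update rules, so everything here is an assumption: a Frobenius-norm loss summed over the graphs, standard Lee and Seung multiplicative updates, and the hypothetical names `collective_nmf` and `soft_modules`. The two toy similarity matrices are made up for illustration and are not data from the paper.

```python
import numpy as np

def collective_nmf(graphs, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate every similarity matrix A_v by H @ W_v.T with one shared H >= 0.

    Assumed objective (not taken from the paper): sum_v ||A_v - H W_v^T||_F^2.
    """
    rng = np.random.default_rng(seed)
    n = graphs[0].shape[0]
    H = rng.random((n, k))                                  # shared factor
    Ws = [rng.random((A.shape[1], k)) for A in graphs]      # one factor per graph
    losses = []
    for _ in range(n_iter):
        # Graph-specific factors: multiplicative update with H held fixed.
        for v, A in enumerate(graphs):
            Ws[v] *= (A.T @ H) / (Ws[v] @ (H.T @ H) + eps)
        # Shared factor: numerator and denominator accumulate over all graphs.
        num = sum(A @ W for A, W in zip(graphs, Ws))
        den = H @ sum(W.T @ W for W in Ws) + eps
        H *= num / den
        losses.append(sum(np.linalg.norm(A - H @ W.T) ** 2
                          for A, W in zip(graphs, Ws)))
    return H, Ws, losses

def soft_modules(H, threshold=0.3):
    """Row-normalise H into memberships; a protein may belong to several modules."""
    P = H / (H.sum(axis=1, keepdims=True) + 1e-12)
    return [np.flatnonzero(P[:, j] >= threshold) for j in range(H.shape[1])]

# Toy input: six proteins in two planted groups, seen through two graphs.
block = np.kron(np.eye(2), np.ones((3, 3)))
A_first = block + 0.05          # stand-in for one similarity graph
A_second = 0.8 * block + 0.10   # stand-in for the other similarity graph

H, Ws, losses = collective_nmf([A_first, A_second], k=2)
for j, members in enumerate(soft_modules(H)):
    print(f"module {j}: proteins {members.tolist()}")
```

Because each row of the shared factor `H` holds one protein's nonnegative weights over the modules, thresholding the normalised rows gives overlapping modules directly, which is one reading of the "straightforward interpretation" mentioned above.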

Published In

BCB '12: Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine
October 2012
725 pages
ISBN: 9781450316705
DOI: 10.1145/2382936

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

BCB '12 paper acceptance rate: 33 of 159 submissions, 21%
Overall acceptance rate: 254 of 885 submissions, 29%

Cited By

  • (2023) Decision of the Optimal Rank of a Nonnegative Matrix Factorization Model for Gene Expression Data Sets Utilizing the Unit Invariant Knee Method: Development and Evaluation of the Elbow Method for Rank Selection. JMIR Bioinformatics and Biotechnology 4 (e43665). DOI: 10.2196/43665. Online publication date: 6-Jun-2023
  • (2021) Semi-supervised community detection on attributed networks using non-negative matrix tri-factorization with node popularity. Frontiers of Computer Science 15:4. DOI: 10.1007/s11704-020-9203-0. Online publication date: 11-Feb-2021
  • (2018) Detection of Protein Complexes Based on Penalized Matrix Decomposition in a Sparse Protein–Protein Interaction Network. Molecules 23:6 (1460). DOI: 10.3390/molecules23061460. Online publication date: 15-Jun-2018
  • (2018) Overlapping functional modules detection in PPI network with pair‐wise constrained non‐negative matrix tri‐factorisation. IET Systems Biology 12:2 (45-54). DOI: 10.1049/iet-syb.2017.0084. Online publication date: 7-Feb-2018
  • (2013) MultiFacTV: module detection from higher-order time series biological data. BMC Genomics 14:S4. DOI: 10.1186/1471-2164-14-S4-S2. Online publication date: 1-Oct-2013
