DOI: 10.1145/1148170.1148214
Article

Latent semantic analysis for multiple-type interrelated data objects

Published: 06 August 2006

Abstract

Co-occurrence data is common in many real-world applications. Latent Semantic Analysis (LSA) has been used successfully to identify semantic relations in such data. However, LSA can handle only a single co-occurrence relationship between two types of objects. In practice, there are many cases where multiple types of objects exist and any pair of object types may have a co-occurrence relation. All of these co-occurrence relations can be exploited to alleviate data sparseness or to represent objects more meaningfully. In this paper, we propose a novel algorithm, M-LSA, which conducts latent semantic analysis by incorporating all pairwise co-occurrences among multiple types of objects. Based on the mutual reinforcement principle, M-LSA identifies the most salient concepts in the co-occurrence data and represents all the objects in a unified semantic space. M-LSA is general, and we show that several variants of LSA are special cases of our algorithm. Experimental results show that M-LSA outperforms LSA in multiple applications, including collaborative filtering, text clustering, and text categorization.
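The idea of placing several object types in one semantic space can be illustrated with a minimal sketch. It is not the paper's exact formulation: it assumes, for illustration only, that the pairwise co-occurrence matrices are stacked into one symmetric block matrix and that the leading eigenvectors of that matrix serve as shared coordinates. The function name `unified_embedding` and its parameters are hypothetical, and any weighting or normalization of the individual relations is omitted.

```python
import numpy as np

def unified_embedding(sizes, relations, k):
    """Embed objects of several types into one k-dimensional space.

    sizes:     number of objects of each type, e.g. [n_users, n_items]
    relations: dict mapping a type pair (i, j) to a co-occurrence
               matrix of shape (sizes[i], sizes[j])
    k:         number of latent dimensions to keep

    Illustrative assumption: no weighting or normalization of relations.
    """
    offsets = np.concatenate(([0], np.cumsum(sizes)))
    n = int(offsets[-1])

    # Symmetric block matrix: block (i, j) holds the co-occurrences
    # between type i and type j; diagonal blocks stay zero.
    R = np.zeros((n, n))
    for (i, j), M in relations.items():
        M = np.asarray(M, dtype=float)
        R[offsets[i]:offsets[i + 1], offsets[j]:offsets[j + 1]] = M
        R[offsets[j]:offsets[j + 1], offsets[i]:offsets[i + 1]] = M.T

    # eigh returns eigenvalues in ascending order; keep the k largest.
    vals, vecs = np.linalg.eigh(R)
    top = np.argsort(vals)[::-1][:k]

    # Split the shared eigenvectors back into one coordinate block per type.
    parts = [vecs[offsets[t]:offsets[t + 1]][:, top] for t in range(len(sizes))]
    return vals[top], parts
```

With only two object types and a single relation, the block matrix has the singular values of that relation as its largest eigenvalues, so the sketch reduces to ordinary LSA on one co-occurrence matrix. Adding a third type only requires more entries in `relations`, for example `{(0, 1): A, (1, 2): B, (0, 2): C}`.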

Published In

SIGIR '06: Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
August 2006
768 pages
ISBN:1595933697
DOI:10.1145/1148170
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. LSA
  2. M-LSA
  3. multiple-type
  4. mutual reinforcement principle

Qualifiers

  • Article

Conference

SIGIR06: The 29th Annual International SIGIR Conference
August 6 - 11, 2006
Seattle, Washington, USA

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 4
  • Downloads (last 6 weeks): 1
Reflects downloads up to 24 Nov 2024.

Cited By

  • (2023) AI-Empowered Persuasive Video Generation: A Survey. ACM Computing Surveys 55:13s (1-31). DOI: 10.1145/3588764. Online publication date: 13-Jul-2023
  • (2018) The Contribution of Stemming and Semantics in Arabic Topic Segmentation. ACM Transactions on Asian and Low-Resource Language Information Processing 17:2 (1-25). DOI: 10.1145/3152464. Online publication date: 11-Jan-2018
  • (2018) Classify social image by integrating multi-modal content. Multimedia Tools and Applications 77:6 (7469-7485). DOI: 10.1007/s11042-017-4657-2. Online publication date: 1-Mar-2018
  • (2018) Multimode co-clustering for analyzing terrorist networks. Information Systems Frontiers 20:5 (1053-1074). DOI: 10.1007/s10796-016-9712-4. Online publication date: 1-Oct-2018
  • (2018) Research on the Gene Regulation Network Construction Algorithm. International Conference on Applications and Techniques in Cyber Security and Intelligence ATCI 2018 (1107-1112). DOI: 10.1007/978-3-319-98776-7_135. Online publication date: 5-Nov-2018
  • (2016) Multi-modal learning for social image classification. 2016 12th International Conference on Natural Computation, Fuzzy Systems and Knowledge Discovery (ICNC-FSKD) (1174-1179). DOI: 10.1109/FSKD.2016.7603345. Online publication date: Aug-2016
  • (2016) Trinity: Walking on a User-Object-Tag Heterogeneous Network for Personalised Recommendations. Journal of Computer Science and Technology 31:3 (577-594). DOI: 10.1007/s11390-016-1648-0. Online publication date: 6-May-2016
  • (2016) Integrating multiple types of features for event identification in social images. Multimedia Tools and Applications 75:6 (3301-3322). DOI: 10.1007/s11042-014-2436-x. Online publication date: 1-Mar-2016
  • (2015) Using heterogeneous patent network features to rank and discover influential inventors. Frontiers of Information Technology & Electronic Engineering 16:7 (568-578). DOI: 10.1631/FITEE.1400394. Online publication date: 12-Jul-2015
  • (2014) Collaborative Filtering beyond the User-Item Matrix. ACM Computing Surveys 47:1 (1-45). DOI: 10.1145/2556270. Online publication date: 1-May-2014