DOI: 10.1145/2502081.2502180
poster

Cross-media topic mining on wikipedia

Published: 21 October 2013

Abstract

As a collaborative wiki-based encyclopedia, Wikipedia provides a huge number of articles across various categories. In addition to their text, Wikipedia articles contain many images, which make the articles more intuitive for readers. To better organize these visual and textual data, one promising area of research is to jointly model the topics embedded across multi-modal (i.e., cross-media) data from Wikipedia. In this work, we propose to learn projection matrices that map the data from heterogeneous feature spaces into a unified latent topic space. Unlike previous approaches, we impose l1 regularizers on the projection matrices, so that only a small number of relevant visual/textual words are associated with each topic, which makes our model more interpretable and robust. Furthermore, the correlations between Wikipedia data in different modalities are explicitly considered in our model. The effectiveness of the proposed topic extraction algorithm is verified by several experiments conducted on real Wikipedia datasets.
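The abstract does not give the objective or the optimization procedure, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a Sparse-LSA-style formulation in which text and image features share one orthonormal document-topic matrix U, and each modality has its own l1-regularized projection matrix. The function names, the parameter `lam`, and the alternating update scheme are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(M, lam):
    """Element-wise soft-thresholding: the proximal operator of lam * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - lam, 0.0)


def fit_shared_sparse_topics(X_text, X_image, n_topics, lam=0.1, n_iter=50, seed=0):
    """Hypothetical sketch: learn sparse per-modality projections onto shared topics.

    Assumed objective (not taken from the paper):
        min  0.5 * ||X - U A||_F^2 + lam * ||A||_1   s.t.  U^T U = I,
    where X = [X_text, X_image] stacks both modalities column-wise, so the
    same document-topic matrix U must explain the text and the image features.

    X_text  : (n_docs, d_text)  textual-word features
    X_image : (n_docs, d_image) visual-word features
    Requires n_docs >= n_topics.
    """
    rng = np.random.default_rng(seed)
    X = np.hstack([X_text, X_image])
    n_docs = X.shape[0]

    # Start from a random matrix with orthonormal columns.
    U, _ = np.linalg.qr(rng.standard_normal((n_docs, n_topics)))

    for _ in range(n_iter):
        # With U orthonormal, the l1-regularized update of A has a closed form.
        A = soft_threshold(U.T @ X, lam)
        # With A fixed, the best orthonormal U is an orthogonal Procrustes solution.
        P, _, Qt = np.linalg.svd(X @ A.T, full_matrices=False)
        U = P @ Qt

    A = soft_threshold(U.T @ X, lam)
    d_text = X_text.shape[1]
    # Split the projection back into its textual and visual parts.
    return U, A[:, :d_text], A[:, d_text:]
```

Under these assumptions, each row of the two returned projection matrices corresponds to one topic, and its nonzero entries pick out the few textual or visual words tied to that topic. Raising `lam` zeroes more entries, which is the interpretability effect the abstract attributes to the l1 regularizers.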

Cited By

  • (2017) Document Classification Based on Text and Image Features. Mining Multimedia Documents, pages 107-116. DOI: 10.1201/9781315399744-9. Online publication date: 2-May-2017
  • (2016) A Mixed Generative-Discriminative Based Hashing Method. IEEE Transactions on Knowledge and Data Engineering, 28(4):845-857. DOI: 10.1109/TKDE.2015.2507127. Online publication date: 1-Apr-2016
  • (2016) Mining user interests from social media by fusing textual and visual features. 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), pages 1-8. DOI: 10.1109/APSIPA.2016.7820713. Online publication date: Dec-2016
  • (2014) Aligning plot synopses to videos for story-based retrieval. International Journal of Multimedia Information Retrieval, 4(1):3-16. DOI: 10.1007/s13735-014-0065-9. Online publication date: 11-Sep-2014

    Published In

    MM '13: Proceedings of the 21st ACM international conference on Multimedia
    October 2013
    1166 pages
    ISBN:9781450324045
    DOI:10.1145/2502081
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cross media
    2. sparsity
    3. topic modeling
    4. wikipedia

    Qualifiers

    • Poster

    Conference

    MM '13: ACM Multimedia Conference
    October 21 - 25, 2013
    Barcelona, Spain

    Acceptance Rates

    MM '13 Paper Acceptance Rate 47 of 235 submissions, 20%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%
