DOI: 10.1145/1631272.1631303
research-article

Comprehensive query-dependent fusion using regression-on-folksonomies: a case study of multimodal music search

Published: 19 October 2009

Abstract

The combination of heterogeneous knowledge sources is widely regarded as an effective way to boost retrieval accuracy in many information retrieval domains. Although various information retrieval technologies have been developed recently, multimodal music search has not kept pace with the enormous growth of data on the Internet. In this paper, we study the problem of integrating multiple online information sources to conduct effective query-dependent fusion (QDF) of multiple search experts for music retrieval. We have developed a novel framework that constructs a knowledge space of users' information needs from online folksonomy data. With this framework, a large number of comprehensive queries can be constructed automatically to train a QDF system that generalizes better to unseen user queries. In addition, our framework models the QDF problem as a regression of the optimal combination strategy on the query. In contrast to previous approaches, the regression model of QDF (RQDF) offers superior modeling capability with fewer constraints and more efficient computation. To validate our approach, we collected a large-scale test collection from different online sources, such as Last.fm, Wikipedia, and YouTube. All test data will be released to the public to foster research in multimodal music search. Our performance study indicates that the proposed folksonomy-RQDF approach significantly improves the accuracy, efficiency, and robustness of multimodal music search. Moreover, since no human involvement is required to collect training examples, our approach is feasible and practical for system development.
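
To make the idea of regressing a combination strategy on a query concrete, the following is a minimal sketch and not the paper's implementation. The abstract does not name the regressor, so a simple Nadaraya-Watson kernel regression is swapped in here. The two-dimensional query features, the two search experts (text and audio), the training pairs, and all function names are hypothetical and chosen only for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def rbf(a: Vector, b: Vector, gamma: float = 1.0) -> float:
    """Gaussian similarity between two query feature vectors."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def predict_weights(query: Vector,
                    training: List[Tuple[Vector, Vector]],
                    gamma: float = 1.0) -> List[float]:
    """Nadaraya-Watson regression: the predicted fusion weights are the
    similarity-weighted mean of the weight vectors of the training queries."""
    sims = [rbf(query, feats, gamma) for feats, _ in training]
    total = sum(sims)
    n_experts = len(training[0][1])
    if total == 0.0:
        # No training query is similar at all: fall back to uniform weights.
        return [1.0 / n_experts] * n_experts
    return [sum(s * w[i] for s, (_, w) in zip(sims, training)) / total
            for i in range(n_experts)]


def fuse(expert_scores: List[Dict[str, float]],
         weights: Vector) -> List[Tuple[str, float]]:
    """Weighted linear fusion of per-expert document scores, best first."""
    fused: Dict[str, float] = {}
    for scores, w in zip(expert_scores, weights):
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + w * s
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical training pairs: (query features, best weights for [text, audio]).
TRAINING = [
    ((1.0, 0.0), (0.9, 0.1)),  # text-like query: the text expert works best
    ((0.0, 1.0), (0.2, 0.8)),  # audio-like query: the audio expert works best
]

# Hypothetical scores returned by the text expert and the audio expert.
TEXT_SCORES = {"song_a": 0.9, "song_b": 0.1}
AUDIO_SCORES = {"song_a": 0.2, "song_b": 0.7}

weights = predict_weights((1.0, 0.0), TRAINING, gamma=5.0)
ranking = fuse([TEXT_SCORES, AUDIO_SCORES], weights)
```

In this sketch, a text-like query receives weights close to those of the nearest training query, so the text expert dominates the fused ranking. Unlike query-class methods, no hard assignment of the query to a class is needed, which is one way to read the abstract's claim of fewer constraints.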



      Published In

      MM '09: Proceedings of the 17th ACM international conference on Multimedia
      October 2009
      1202 pages
      ISBN: 9781605586083
      DOI: 10.1145/1631272

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. folksonomy
      2. multimodal search
      3. music
      4. query-dependent fusion

      Qualifiers

      • Research-article

      Conference

      MM09: ACM Multimedia Conference
      October 19 - 24, 2009
      Beijing, China

      Acceptance Rates

      Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


      Cited By

      • (2017) Tight Kernel Bounds for Problems on Graphs with Small Degeneracy. ACM Transactions on Algorithms 13:3 (1-22). DOI: 10.1145/3108239. Online publication date: 9-Aug-2017
      • (2017) Max-Sum Diversification, Monotone Submodular Functions, and Dynamic Updates. ACM Transactions on Algorithms 13:3 (1-25). DOI: 10.1145/3086464. Online publication date: 13-Jul-2017
      • (2017) Sharing Policies in Multiuser Privacy Scenarios. ACM Transactions on Computer-Human Interaction 24:1 (1-29). DOI: 10.1145/3038920. Online publication date: 6-Mar-2017
      • (2017) Introducing Mood Self-Tracking at Work. ACM Transactions on Computer-Human Interaction 24:1 (1-28). DOI: 10.1145/3014058. Online publication date: 3-Feb-2017
      • (2017) Coloring 3-Colorable Graphs with Less than n^{1/5} Colors. Journal of the ACM 64:1 (1-23). DOI: 10.1145/3001582. Online publication date: 24-Mar-2017
      • (2016) On Effective Location-Aware Music Recommendation. ACM Transactions on Information Systems 34:2 (1-32). DOI: 10.1145/2846092. Online publication date: 7-Apr-2016
      • (2015) Disjointness domains for fine-grained aliasing. ACM SIGPLAN Notices 50:10 (898-916). DOI: 10.1145/2858965.2814280. Online publication date: 23-Oct-2015
      • (2013) Content-based copy detection through multimodal feature representation and temporal pyramid matching. ACM Transactions on Multimedia Computing, Communications, and Applications 10:1 (1-20). DOI: 10.1145/2542205.2542208. Online publication date: 27-Dec-2013
      • (2013) A survey of music similarity and recommendation from music context data. ACM Transactions on Multimedia Computing, Communications, and Applications 10:1 (1-21). DOI: 10.1145/2542205.2542206. Online publication date: 27-Dec-2013
      • (2013) Detecting profilable and overlapping communities with user-generated multimedia contents in LBSNs. ACM Transactions on Multimedia Computing, Communications, and Applications 10:1 (1-22). DOI: 10.1145/2502415. Online publication date: 27-Dec-2013
