DOI: 10.1145/1555400.1555446
research-article

Document relevance assessment via term distribution analysis using fourier series expansion

Published: 15 June 2009

Abstract

In addition to the frequency of terms in a document collection, the distribution of terms plays an important role in determining the relevance of documents for a given search query. This paper introduces term distribution analysis using Fourier series expansion, a novel approach for calculating an abstract representation of term positions in a document corpus. Based on this approach, two methods for improving the evaluation of document relevance are proposed: (a) a function-based ranking optimization that represents a user-defined document region, and (b) a query expansion technique based on overlapping the term distributions in the top-ranked documents. Experimental results demonstrate the effectiveness of the proposed approach in providing new possibilities for optimizing the retrieval process.
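
The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each term occurrence is a unit impulse at its normalized position, mapped to an angle in [0, 2π), and that the resulting density is approximated by a truncated Fourier series. The function names (`fourier_coefficients`, `region_mass`, `overlap`), the truncation order, and the scoring rules are all hypothetical choices made for illustration.

```python
import math


def fourier_coefficients(positions, doc_length, order=10):
    """Truncated Fourier coefficients of a term's position density.

    Assumption: each occurrence is a unit impulse at the centre of its
    word slot, mapped to an angle in [0, 2*pi); impulses are averaged so
    the density integrates to 1.
    """
    if not positions:
        raise ValueError("term does not occur in the document")
    angles = [2 * math.pi * (p + 0.5) / doc_length for p in positions]
    m = len(angles)
    a = [1 / math.pi]  # a_0 for a density that integrates to 1
    b = [0.0]
    for k in range(1, order + 1):
        a.append(sum(math.cos(k * x) for x in angles) / (math.pi * m))
        b.append(sum(math.sin(k * x) for x in angles) / (math.pi * m))
    return a, b


def region_mass(coeffs, start, end):
    """Integrate the truncated series over a document region.

    `start` and `end` are fractions of the document length in [0, 1].
    The result approximates the share of occurrences inside the region;
    truncation can push it slightly below 0 or above 1.
    """
    a, b = coeffs
    lo, hi = 2 * math.pi * start, 2 * math.pi * end
    total = a[0] / 2 * (hi - lo)
    for k in range(1, len(a)):
        total += a[k] * (math.sin(k * hi) - math.sin(k * lo)) / k
        total -= b[k] * (math.cos(k * hi) - math.cos(k * lo)) / k
    return total


def overlap(c1, c2):
    """Inner product of two truncated series over one period (Parseval)."""
    (a1, b1), (a2, b2) = c1, c2
    acc = a1[0] * a2[0] / 2
    for k in range(1, min(len(a1), len(a2))):
        acc += a1[k] * a2[k] + b1[k] * b2[k]
    return math.pi * acc


# Hypothetical 100-word document: word offsets of three terms.
query_term = fourier_coefficients([2, 5, 8], doc_length=100)
nearby_term = fourier_coefficients([3, 6, 9], doc_length=100)
distant_term = fourier_coefficients([50, 55, 60], doc_length=100)

# Region-based weighting: how much of the query term lies in the first quarter?
print(region_mass(query_term, 0.0, 0.25))

# Expansion candidates: terms whose distribution overlaps the query term's.
print(overlap(query_term, nearby_term), overlap(query_term, distant_term))
```

In this sketch a document scores higher for a user-defined region when more of the query term's mass falls inside it, and a candidate expansion term scores higher when its position distribution overlaps that of the query term. Both scores depend on the chosen truncation order, which controls how sharply positions are resolved.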

    Information & Contributors

    Information

    Published In

    JCDL '09: Proceedings of the 9th ACM/IEEE-CS joint conference on Digital libraries
    June 2009
    502 pages
    ISBN:9781605583228
    DOI:10.1145/1555400
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. fourier series
    2. query expansion
    3. ranked retrieval
    4. term distribution

    Qualifiers

    • Research-article

    Conference

    JCDL '09
    JCDL '09: Joint Conference on Digital Libraries
    June 15 - 19, 2009
    Austin, TX, USA

    Acceptance Rates

    Overall Acceptance Rate 415 of 1,482 submissions, 28%

    Cited By

    • (2018) Query Expansion Based on Central Tendency and PRF for Monolingual Retrieval. Information Retrieval and Management, pages 479-501. DOI: 10.4018/978-1-5225-5191-1.ch022. Online publication date: 2018
    • (2016) Query Expansion based on Central Tendency and PRF for Monolingual Retrieval. International Journal of Information Retrieval Research, 6:4, pages 30-50. DOI: 10.4018/IJIRR.2016100103. Online publication date: 1-Oct-2016
    • (2013) Query Expansion Based on Equi-Width and Equi-Frequency Partition. Multilingual Information Access in South Asian Languages, pages 13-22. DOI: 10.1007/978-3-642-40087-2_2. Online publication date: 2013
    • (2010) Information Retrieval via Truncated Hilbert Space Expansions. Proceedings of the 2010 10th IEEE International Conference on Computer and Information Technology, pages 690-697. DOI: 10.1109/CIT.2010.135. Online publication date: 29-Jun-2010
