DOI: 10.1145/253495.253524

The cluster hypothesis revisited

Published: 05 June 1985

Abstract

A new means of evaluating the cluster hypothesis is introduced and the results of such an evaluation are presented for four collections. The results of retrieval experiments comparing a sequential search, a cluster-based search, and a search of the clustered collection in which individual documents are scored against the query are also presented. These results indicate that while the absolute performance of a search on a particular collection is dependent on the pairwise similarity of the relevant documents, the relative effectiveness of clustered retrieval versus sequential retrieval is independent of this factor. However, retrieval of entire clusters in response to a query usually results in a poorer performance than retrieval of individual documents from clusters.
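
To make the role of pairwise similarity among relevant documents concrete, the sketch below counts, for each relevant document, how many of its k nearest neighbours are also relevant. The abstract does not spell out the evaluation procedure, so this nearest-neighbour check, the cosine measure, the function names, and the sparse-dictionary document format are all assumptions made for illustration, not the paper's confirmed method.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevant_neighbor_counts(docs, relevant, k=5):
    """Histogram of how many of each relevant document's k nearest
    neighbours are themselves relevant (hypothetical helper).

    docs:     dict mapping doc id -> sparse term-weight vector
    relevant: set of doc ids judged relevant to one query
    Returns a Counter mapping hit count -> number of relevant documents.
    """
    histogram = Counter()
    for doc_id in relevant:
        others = [d for d in docs if d != doc_id]
        # Most similar first; ties are broken by doc id so runs are repeatable.
        others.sort(key=lambda d: (-cosine(docs[doc_id], docs[d]), d))
        hits = sum(1 for d in others[:k] if d in relevant)
        histogram[hits] += 1
    return histogram
```

A histogram weighted toward high counts would suggest that relevant documents sit close together for that query, while one weighted toward zero would suggest they are scattered across the collection.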

Published In

SIGIR '85: Proceedings of the 8th annual international ACM SIGIR conference on Research and development in information retrieval
June 1985
288 pages
ISBN:0897911598
DOI:10.1145/253495
  • Chairman: Jean M. Tague

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
