No abstract available.
Applying summarization techniques for term selection in relevance feedback
Query-expansion is an effective Relevance Feedback technique for improving performance in Information Retrieval. In general query-expansion methods select terms from the complete contents of relevant documents. One problem with this approach is that ...
Temporal summaries of new topics
We discuss technology to help a person monitor changes in news coverage over time. We define temporal summaries of news stories as extracting a single sentence from each event within a news topic, where the stories are presented one at a time and ...
Generic text summarization using relevance measure and latent semantic analysis
In this paper, we propose two generic text summarization methods that create text summaries by ranking and extracting sentences from the original documents. The first method uses standard IR methods to rank sentence relevances, while the second method ...
A new approach to unsupervised text summarization
The paper presents a novel approach to unsupervised text summarization. The novelty lies in exploiting the diversity of concepts in text for summarization, which has not received much attention in the summarization literature. A diversity-based approach ...
Vector-space ranking with effective early termination
Considerable research effort has been invested in improving the effectiveness of information retrieval systems. Techniques such as relevance feedback, thesaural expansion, and pivoting all provide better quality responses to queries when tested in ...
Static index pruning for information retrieval systems
We introduce static index pruning methods that significantly reduce the index size in information retrieval systems.We investigate uniform and term-based methods that each remove selected entries from the index and yet have only a minor effect on ...
Using event segmentation to improve indexing of consumer photographs
Automatic albuming --- the automatic organization of photographs, either as an end in itself or for use in other applications -- is an application that promises to be of great assistance to photographers. Relatively sophisticated image content analysis ...
Ranking retrieval systems without relevance judgments
The most prevalent experimental methodology for comparing the effectiveness of information retrieval systems requires a test collection, composed of a set of documents, a set of query topics, and a set of relevance judgments indicating which documents ...
Evaluation by highly relevant documents
Given the size of the web, the search engine industry has argued that engines should be evaluated by their ability to retrieve highly relevant pages rather than all possible relevant pages. To explore the role highly relevant documents play in retrieval ...
Meta-scoring: automatically evaluating term weighting schemes in IR without precision-recall
In this paper, we present a method that can automatically evaluate performance of different term weighting schemes in information retrieval without resorting to precision-recall based on human relevance judgments. Specifically, the problem is: given two ...
Improving cross language retrieval with triangulated translation
Most approaches to cross language information retrieval assume that resources providing a direct translation between the query and document languages exist. This paper presents research examining the situation where such an assumption is false. Here, ...
Improving query translation for cross-language information retrieval using statistical models
Dictionaries have often been used for query translation in cross-language information retrieval (CLIR). However, we are faced with the problem of translation ambiguity, i.e. multiple translations are stored in a dictionary for a word. In addition, a ...
Evaluating a probabilistic model for cross-lingual information retrieval
This work proposes and evaluates a probabilistic cross-lingual retrieval system. The system uses a generative model to estimate the probability that a document in one language is relevant, given a query in another language. An important component of ...
Document language models, query models, and risk minimization for information retrieval
We present a framework for information retrieval that combines document models and query models using a probabilistic ranking function based on Bayesian decision theory. The framework suggests an operational retrieval model that extends recent ...
Relevance based language models
We explore the relation between classical probabilistic models of information retrieval and the emerging language modeling approaches. It has long been recognized that the primary obstacle to effective performance of classical models is the need to ...
A statistical learning learning model of text classification for support vector machines
This paper develops a theoretical learning model of text classification for Support Vector Machines (SVMs). It connects the statistical properties of text-classification tasks with the generalization performance of a SVM in a quantitative way. Unlike ...
A study of thresholding strategies for text categorization
Thresholding strategies in automated text categorization are an underexplored area of research. This paper presents an examination of the effect of thresholding strategies on the performance of a classifier under various conditions. Using k-Nearest ...
On feature distributional clustering for text categorization
We describe a text categorization approach that is based on a combination of feature distributional clusters with a support vector machine (SVM) classifier. Our feature selection approach employs distributional clustering of words via the recently ...
Iterative residual rescaling
We consider the problem of creating document representations in which inter-document similarity measurements correspond to semantic similarity. We first present a novelsubspace-basedframework for formalizing this task. Using this framework, we derive a ...
Expressive retrieval from XML documents
The emergence of XML as a standard interchange format for structured documents/data has given rise to many XML query language proposals. However, some of these languages do not support information retrieval-style ranked queries based on textual ...
XIRQL: a query language for information retrieval in XML documents
Based on the document-centric view of XML, we present the query language XIRQL. Current proposals for XML query languages lack most IR-related features, which are weighting and ranking, relevance-oriented search, datatypes with vague predicates, and ...
Empirical investigations on query modification using abductive explanations
In this paper we report on a series of experiments designed to investigate query modification techniques motivated by the area of abductive reasoning. In particular we use the notion of abductive explanation, explanations being a description of data ...
Generic summaries for indexing in information retrieval
This paper examines the use of generic summaries for indexing in information retrieval. Our main observations are that: (1) With or without pseudo-relevance feedback, a summary index may be as effective as the corresponding fulltext index forprecision-...
Automatic generation of concise summaries of spoken dialogues in unrestricted domains
Automatic summarization of open domain spoken dialogues is a new research area. This paper introduces the task, the challenges involved, and presents an approach to obtain automatic extract summaries for multi-party dialogues of four different genres, ...
Enhanced topic distillation using text, markup tags, and hyperlinks
Topic distillation is the analysis of hyperlink graph structure to identify mutually reinforcing authorities (popular pages) and hubs (comprehensive lists of links to authorities). Topic distillation is becoming common in Web search engines, but the ...
Transparent Queries: investigation users' mental models of search engines
Typically, commercial Web search engines provide very little feedback to the user concerning how a particular query is processed and interpreted. Specifically, they apply key query transformations without the users knowledge. Although these ...
Why batch and user evaluations do not give the same results
Much system-oriented evaluation of information retrieval systems has used the Cranfield approach based upon queries run against test collections in a batch mode. Some researchers have questioned whether this approach can be applied to the real world, ...
Evaluating a content based image retrieval system
Content Based Image Retrieval (CBIR) presents special challenges in terms of how image data is indexed, accessed, and how end systems are evaluated. This paper discusses the design of a CBIR system that uses global colour as the primary indexing key, ...
Evaluating topic-driven web crawlers
Due to limited bandwidth, storage, and computational resources, and to the dynamic nature of the Web, search engines cannot index every Web page, and even the covered portion of the Web cannot be monitored continuously for changes. Therefore it is ...
Cited By
- Ferro N (2017). What Does Affect the Correlation Among Evaluation Measures?, ACM Transactions on Information Systems, 36:2, (1-40), Online publication date: 30-Apr-2018.
-
Jensen D, Giraud-Carrier C and Davis N A Method for Computing Lexical Semantic Distance Using Linear Functionals, SSRN Electronic Journal, 10.2139/ssrn.3199389
Index Terms
- Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval