Index Selection for Databases: A Hardness Study and a Principled Heuristic Solution
We study the index selection problem: Given a workload consisting of SQL statements on a database, and a user-specified storage constraint, recommend a set of indexes that have the maximum benefit for the given workload. We present a formal statement ...
Augmenting a Conceptual Model with Geospatiotemporal Annotations
While many real-world applications need to organize data based on space (e.g., geology, geomarketing, environmental modeling) and/or time (e.g., accounting, inventory management, personnel management), existing conventional conceptual models do not ...
A Modeling Technique for the Performance Analysis of Web Searching Applications
This paper proposes a methodological approach for the performance analysis of Web-based searching applications on the Internet. It specifically investigates the behavior of the Client/Server (C/S), Remote-Evaluation (REV) and Mobile-Agent (MA) ...
The Hierarchical Degree-of-Visibility Tree
In this paper, we present a novel structure called the Hierarchical Degree-of-Visibility Tree (HDoV-tree) for visibility query processing in visualization systems. The HDoV-tree builds on and extends the R-tree such that 1) the search space is pruned ...
Cluster Analysis for Gene Expression Data: A Survey
DNA microarray technology has now made it possible to simultaneously monitor the expression levels of thousands of genes during important biological processes and across collections of related samples. Elucidating the patterns hidden in gene expression ...
HARP: A Practical Projected Clustering Algorithm
In high-dimensional data, clusters can exist in subspaces that hide themselves from traditional clustering methods. A number of algorithms have been proposed to identify such projected clusters, but most of them rely on some user parameters to guide the ...
Information Retrieval in Document Image Databases
With the rising popularity and importance of document images as an information source, information retrieval in document image databases has become a growing and challenging problem. In this paper, we propose an approach with the capability of matching ...
Evaluation of Edge Caching/Offloading for Dynamic Content Delivery
As dynamic content becomes increasingly dominant, an important research question is how edge resources such as client-side proxies, which are otherwise underutilized for such content, can be put to use. However, it is unclear what will be ...
Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach
Jian Pei, Jiawei Han, Behzad Mortazavi-Asl, Jianyong Wang, Helen Pinto, Qiming Chen, Umeshwar Dayal, Mei-Chun Hsu
Sequential pattern mining is an important data mining problem with broad applications. However, it is also a difficult problem since the mining may have to generate or examine a combinatorially explosive number of intermediate subsequences. Most of the ...
View Adaptation in the Fragment-Based Approach
View adaptation relies on adapting a set of materialized views in response to schema changes of source relations and/or after view redefinition. Recently, several view selection methods that are based on materializing fragments of the view rather than ...