article
Free
Fast parallel similarity search in multimedia databases

Most similarity search techniques map the data objects into some high-dimensional feature space. The similarity search then corresponds to a nearest-neighbor search in the feature space which is computationally very intensive. In this paper, we present ...

article
Free
Similarity-based queries for time series data

We study a set of linear transformations on the Fourier series representation of a sequence that can be used as the basis for similarity queries on time-series data. We show that our set of transformations is rich enough to formulate operations such as ...

article
Free
Meaningful change detection in structured data

Detecting changes by comparing data snapshots is an important requirement for difference queries, active databases, and version and configuration management. In this paper we focus on detecting meaningful changes in hierarchically structured data, such ...

article
Free
Improved query performance with variant indexes

The read-mostly environment of data warehousing makes it possible to use more complex indexes to speed up queries than in situations where concurrent updates are present. The current paper presents a short review of current indexing technology, ...

article
Free
Highly concurrent cache consistency for indices in client-server database systems

In this paper, we present four approaches to providing highly concurrent B+-tree indices in the context of a data-shipping, client-server OODBMS architecture. The first performs all index operations at the server, while the other approaches support ...

article
Free
Concurrency and recovery in generalized search trees

This paper presents general algorithms for concurrency control in tree-based access methods as well as a recovery protocol and a mechanism for ensuring repeatable read. The algorithms are developed in the context of the Generalized Search Tree (GiST) ...

article
Free
Range queries in OLAP data cubes

A range query applies an aggregation operation over all selected cells of an OLAP data cube where the selection is specified by providing ranges of values for numeric dimensions. We present fast algorithms for range queries for two types of aggregation ...

article
Free
Cubetree: organization of and bulk incremental updates on the data cube

The data cube is an aggregate operator which has been shown to be very powerful for On Line Analytical Processing (OLAP) in the context of data warehousing. It is, however, very expensive to compute, access, and maintain. In this paper we define the “...

article
Free
Maintenance of data cubes and summary tables in a warehouse

Data warehouses contain large amounts of information, often collected from a variety of independent sources. Decision-support functions in a warehouse, such as on-line analytical processing (OLAP), involve hundreds of complex aggregate queries over ...

article
Free
Database buffer size investigation for OLTP workloads

It is generally accepted that On-Line Transaction Processing (OLTP) systems benefit from large database memory buffers. As enterprise database systems become larger and more complex, hardware vendors are building increasingly large systems capable of ...

article
Free
Database performance in the real world: TPC-D and SAP R/3

Traditionally, database systems have been evaluated in isolation on the basis of standardized benchmarks (e.g., Wisconsin, TPC-C, TPC-D). We argue that very often such a performance analysis does not reflect the actual use of the DBMSs in the “real ...

article
Free
The BUCKY object-relational benchmark

According to various trade journals and corporate marketing machines, we are now on the verge of a revolution—the object-relational database revolution. Since we believe that no one should face a revolution without appropriate armaments, this paper ...

article
Free
The STRIP rule system for efficiently maintaining derived data

Derived data is maintained in a database system to correlate and summarize base data which records real world facts. As base data changes, derived data needs to be recomputed. This is often implemented by writing active rules that are triggered by ...

article
Free
An array-based algorithm for simultaneous multidimensional aggregates

Computing multiple related group-bys and aggregates is one of the core operations of On-Line Analytical Processing (OLAP) applications. Recently, Gray et al. [GBLP95] proposed the “Cube” operator, which computes group-by aggregations over all possible ...

article
Free
Online aggregation

Aggregation in traditional database systems is performed in batch mode: a query is submitted, the system processes a large volume of data over a long period of time, and, eventually, the final answer is returned. This archaic approach is frustrating to ...

article
Free
Balancing push and pull for data broadcast

The increasing ability to interconnect computers through internet-working, wireless networks, high-bandwidth satellite, and cable networks has spawned a new class of information-centered applications based on data dissemination. These applications ...

article
Free
InfoSleuth: agent-based semantic integration of information in open and dynamic environments

The goal of the InfoSleuth project at MCC is to exploit and synthesize new technologies into a unified system that retrieves and processes information in an ever-changing network of information sources. InfoSleuth has its roots in the Carnot project at ...

article
Free
STARTS: Stanford proposal for Internet meta-searching

Document sources are available everywhere, both within the internal networks of organizations and on the Internet. Even individual organizations use search engines from different vendors to index their internal document collections. These search engines ...

article
Free
On saying “Enough already!” in SQL

In this paper, we study a simple SQL extension that enables query writers to explicitly limit the cardinality of a query result. We examine its impact on the query optimization and run-time execution components of a relational DBMS, presenting two ...

article
Free
A framework for implementing hypothetical queries

Previous approaches to supporting hypothetical queries have been “eager”: some representation of the hypothetical state (or the corresponding delta) is materialized, and query evaluation is filtered through that representation. This paper develops a ...

article
Free
High-performance sorting on networks of workstations

We report the performance of NOW-Sort, a collection of sorting implementations on a Network of Workstations (NOW). We find that parallel sorting on a NOW is competitive to sorting on the large-scale SMPs that have traditionally held the performance ...

article
Free
Dynamic itemset counting and implication rules for market basket data

We consider the problem of analyzing market-basket data and present several important contributions. First, we present a new algorithm for finding large itemsets which uses fewer passes over the data than classic algorithms, and yet uses fewer candidate ...

article
Free
Beyond market baskets: generalizing association rules to correlations

One of the most well-studied problems in data mining is mining for association rules in market basket data. Association rules, whose significance is measured via support and confidence, are intended to identify rules of the type, “A customer purchasing ...

article
Free
Scalable parallel data mining for association rules

One of the important problems in data mining is discovering association rules from databases of transactions where each transaction consists of a set of items. The most time consuming operation in this discovery process is the computation of the ...

article
Free
Efficiently supporting ad hoc queries in large datasets of time sequences

Ad hoc querying is difficult on very large datasets, since it is usually not possible to have the entire dataset on disk. While compression can be used to decrease the size of the dataset, compressed data is notoriously difficult to index or access. In ...

article
Free
DEVise: integrated querying and visual exploration of large datasets

DEVise is a data exploration system that allows users to easily develop, browse, and share visual presentation of large tabular datasets (possibly containing or referencing multimedia objects) from several sources. The DEVise framework is being ...

article
Free
Partitioned garbage collection of a large object store

We present new techniques for efficient garbage collection in a large persistent object store. The store is divided into partitions that are collected independently using information about inter-partition references. This information is maintained on ...

article
Free
Size separation spatial join

We introduce a new algorithm to compute the spatial join of two or more spatial data sets, when indexes are not available on them. Size Separation Spatial Join (S3J) imposes a hierarchical decomposition of the data space and, in contrast with previous ...

article
Free
Building a scaleable geo-spatial DBMS: technology, implementation, and evaluation

This paper presents a number of new techniques for parallelizing geo-spatial database systems and discusses their implementation in the Paradise object-relational database system. The effectiveness of these techniques is demonstrated using a variety of ...

article
Free
A toolkit for negotiation support interfaces to multi-dimensional data

CoDecide is an experimental user interface toolkit that offers an extension to spreadsheet concepts specifically geared towards support for cooperative analysis of the kinds of multi-dimensional data encountered in data warehousing. It is distinguished ...
