Nothing Special   »   [go: up one dir, main page]

skip to main content
10.1145/508791.508963acmconferencesArticle/Chapter ViewAbstractPublication PagessacConference Proceedingsconference-collections
Article

A parallel index for semistructured data

Published: 11 March 2002 Publication History

Abstract

Database systems are increasingly being used to manage semistructured data, which may not have a fixed structure or set of relationships between data items. Indexes which use tree structures to manage semistructured data become unbalanced and difficult to parallelize due to the complex nature of the data. We propose a mechanism by which an unbalanced vertical tree is managed in a balanced way by additional layers of horizontal index. Then, the vertical tree can be partitioned among parallel computing nodes in a balanced fashion. We discuss how to construct, search and update such a horizontal structure using the example of a Patricia trie index. We also present simulation results that demonstrate the speedup offered by such parallelism, for example, with three-way parallelism, our techniques can provide almost a factor of three speedup.

References

[1]
DBLP computer science bibliography. http://www.informatik.uni-trier.de/-ley/db/.
[2]
Serge Abiteboul. Querying semi-structured data. In Proc. ICDT, 1997.
[3]
Barbara T. Blaustein and Charles W. Kaufman. Updating replicated data during communications failures. In Proc. VLDB, pages 49-58, 1985.
[4]
D. Comer. The ubiquitous b-tree. Computing Surveys, 11(2):121-137, 1979.
[5]
Brian Cooper and Moshe Shadmon. The Index Fabric: A mechanism for indexing and querying the same data in many different ways, 2000. RightOrder Incorporated Technical Report.
[6]
Alin Deutsch, Mary Fernandez, and Dan Suciu. Storing semistructured data with STORED. In Proc. SIGMOD, 1999.
[7]
David J. DeWitt and Jim Gray. Parallel database systems: The future of high performance database systems. CACM, 35(6):85-98, 1992.
[8]
A. A. Diwan, Sanjeeva Rane, S. Seshadri, and S. Sudarshan. Clustering techniques for minimizing external path length. In Proc. VLDB, pages 342-353, 1996.
[9]
Brian Cooper et al. A fast index for semistructured data. In Proc. VLDB, September 2001.
[10]
Jason McHugh et al. Lore: A database management system for semistructured data. SIGMOD Record, 26(3):54-66, September 1997.
[11]
D. Gifford. Weighted voting for replicated data. In Proc. SOSP, pages 49-58, 1979.
[12]
Joseph M. Hellerstein, Jeffrey F. Naughton, and Avi Pfeffer. Generalized search trees for database systems. In Proc. VLDB, pages 562-573, September 1995.
[13]
Donald Knuth. The Art of Computer Programming, Vol. III, Sorting and Searching, Third Edition. Addison Wesley, Reading, MA, 1998.
[14]
M. Tamer Ozsu and Patrick Valduriez. Principles of Distributed Database Systems (Second Edition). Prentice Hall, Upper Saddle River, New Jersey, 1999.

Cited By

View all

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Conferences
SAC '02: Proceedings of the 2002 ACM symposium on Applied computing
March 2002
1200 pages
ISBN:1581134452
DOI:10.1145/508791
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 March 2002

Permissions

Request permissions for this article.

Check for updates

Qualifiers

  • Article

Conference

SAC02
Sponsor:
SAC02: 2002 ACM Symposium on Applied Computing
March 11 - 14, 2002
Madrid, Spain

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 18 Nov 2024

Other Metrics

Citations

Cited By

View all

View Options

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media