DOI: 10.1145/956863.956905
Article

A reliable storage management layer for distributed information retrieval systems

Published: 03 November 2003

Abstract

We present a storage management layer that facilitates the implementation of parallel information retrieval systems, and related applications, on networks of workstations. The storage management layer automates the process of adding and removing nodes, and implements a dispersed mirroring strategy to improve reliability. When nodes are added and removed, the document collection managed by the system is redistributed for load balancing purposes. The use of dispersed mirroring minimizes the impact of node failures and system modifications on query performance.
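The abstract does not spell out the placement algorithm, so the following is only a minimal sketch of one plausible reading of "dispersed mirroring": each document gets a primary node and a mirror node, and the mirrors of any one node's documents are spread across all the other nodes instead of being held by a single partner. The function names `place` and `failover_load`, the round-robin rule, and the node labels are all illustrative assumptions, not the authors' design.

```python
from collections import defaultdict


def place(doc_ids, nodes):
    """Assign each document a (primary, mirror) pair of nodes.

    Assumed scheme, not taken from the paper: primaries are chosen
    round-robin, and the mirror offset rotates through 1..n-1 so that
    one node's mirror copies are dispersed over all other nodes.
    """
    n = len(nodes)
    if n < 2:
        raise ValueError("mirroring needs at least two nodes")
    placement = {}
    for i, doc in enumerate(doc_ids):
        p = i % n
        # The offset is always in 1..n-1, so the mirror never
        # lands on the primary node itself.
        offset = 1 + (i // n) % (n - 1)
        placement[doc] = (nodes[p], nodes[(p + offset) % n])
    return placement


def failover_load(placement, failed):
    """Count the documents each survivor serves from its mirror
    copies when the node `failed` is unavailable."""
    extra = defaultdict(int)
    for primary, mirror in placement.values():
        if primary == failed:
            extra[mirror] += 1
    return dict(extra)


nodes = ["a", "b", "c", "d"]
docs = list(range(12))
layout = place(docs, nodes)
# Node "a" fails: its documents are picked up evenly by the survivors.
print(failover_load(layout, "a"))

# A node joins: recomputing the placement rebalances the collection.
# A real system would also try to limit how much data has to move.
rebalanced = place(docs, nodes + ["e"])
```

Under these assumptions, the failure of one node adds only a small share of extra work to each survivor, which is the effect the abstract attributes to dispersed mirroring. Recomputing the placement after a membership change stands in for the redistribution step.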

Cited By

  • (2006) A pipelined architecture for distributed text query evaluation. Information Retrieval 10:3 (205-231). DOI: 10.1007/s10791-006-9014-4. Online publication date: 5-Oct-2006
  • (2004) Approximating the top-m passages in a parallel question answering system. Proceedings of the thirteenth ACM international conference on Information and knowledge management (454-462). DOI: 10.1145/1031171.1031259. Online publication date: 13-Nov-2004

      Published In

      CIKM '03: Proceedings of the twelfth international conference on Information and knowledge management
      November 2003
      592 pages
      ISBN:1581137230
      DOI:10.1145/956863
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. cluster computing
      2. distributed information retrieval
      3. self-managing systems

      Qualifiers

      • Article

      Conference

      CIKM03

      Acceptance Rates

      Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
