Abstract
An Archival Repository reliably stores digital objects for long periods of time (decades or centuries). The archival nature of the system requires new techniques for storing, indexing, and replicating digital objects. In this paper we discuss the specialized indexing needs of a write-once archive. We also present a reliability algorithm for effectively replicating sets of related objects. We describe a data import utility for archival repositories. Finally, we discuss and evaluate a prototype repository we have built, the Stanford Archival Vault (SAV).
This material is based upon work supported by the National Science Foundation under Award 9811992.
Copyright information
© 2000 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cooper, B., Crespo, A., Garcia-Molina, H. (2000). Implementing a Reliable Digital Object Archive. In: Borbinha, J., Baker, T. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2000. Lecture Notes in Computer Science, vol 1923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45268-0_13
DOI: https://doi.org/10.1007/3-540-45268-0_13
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-41023-2
Online ISBN: 978-3-540-45268-3
eBook Packages: Springer Book Archive