Abstract
The iRODS system, created by the San Diego Supercomputer Center, is a rule-oriented data management system that allows users to create sets of rules defining how data is to be managed. Each rule corresponds to a particular action or operation (such as checksumming a file), and the system is flexible enough to allow the user to create new rules for new types of operations. The iRODS system can interface to any storage system (provided an iRODS driver is built for that system) and relies on its metadata catalogue to provide a virtual file system that can handle files of any size and type.
However, some storage systems (such as tape systems) do not handle small files efficiently and prefer small files to be packaged up (or “bundled”) into larger units. We have developed a system that can bundle small data files of any type into larger units, called mounted collections. The system can create collection families and contains its own extensible metadata, including metadata recording which family a collection belongs to. The mounted collection system can work standalone and is being incorporated into the iRODS system to enhance the system's flexibility in handling small files.
In this paper we describe the motivation for creating a mounted collection system, its architecture, and how it has been incorporated into the iRODS system. We describe the different technologies used to create the mounted collection system and provide some performance numbers.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Weise, A., Wan, M., Schroeder, W., Hasan, A. (2008). Managing Groups of Files in a Rule Oriented Data Management System (iRODS). In: Bubak, M., van Albada, G.D., Dongarra, J., Sloot, P.M.A. (eds) Computational Science – ICCS 2008. ICCS 2008. Lecture Notes in Computer Science, vol 5103. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-69389-5_37
DOI: https://doi.org/10.1007/978-3-540-69389-5_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-69388-8
Online ISBN: 978-3-540-69389-5
eBook Packages: Computer Science (R0)