Abstract
Digital material has to be preserved not only against loss or corruption, but also against changes in its ecosystem. A fairly general way to view the digital preservation problem is as one of dependency management. In this paper, we present a rule-based approach to dependency management that can also model converters and emulators. We show that this modeling approach enables the automatic reasoning needed to reduce the human effort required for checking (and monitoring) whether a task on a digital object is performable. We provide examples demonstrating how real-world converters and emulators can be modeled, and show how the preservation services can be implemented. Subsequently, we detail an implementation based on semantic web technologies, describe the prototype system Epimenides, which demonstrates the feasibility of the approach, and finally report promising evaluation results.
Notes
For reasons of space, we do not include examples for this case. We provide examples only for the case of software, since this case is in general more challenging.
This paper elaborates on the ideas first presented in [32]. In comparison to that paper, the current paper presents a more expressive modeling approach (accounting for parameters and exceptions), shows how real emulators and converters can be modeled, provides implementation details, reports our experience from implementing this approach using semantic web tools, presents the system Epimenides, and reports results from its evaluation so far in the context of the ongoing APARSEN NoE.
Open Archival Information System (ISO 14721:2003).
KEEP has created an emulation framework (EF) [5] that provides additional services intended to build a more solid ground for the emulation preservation strategy. KEEP depends on existing and future emulators and has not created an emulator itself.
In an implementation over Prolog, we could use the retract feature to delete a fact from the database.
Multipurpose internet mail extensions (MIME) is an internet standard that extends the format of email.
The experiments were carried out using Virtuoso version 06.01.3127, running on a dual-core Linux machine with 3 GB RAM.
References
Anderson, D., Delve, J., Konstantelos, L., Ciuffreda, A., Dobreva, M.: TOTEM: Trusted Online Technical Environment Metadata: a long-term solution for a relational database/RDF ontologies. In: Proceedings of the 8th International Conference on Preservation of Digital Objects (iPRES 2011), 1–4 Nov 2011, Singapore (2011)
Becker, C., Kulovits, H., Rauber, A., Hofman, H.: Plato: a service oriented decision support system for preservation planning. In: Proceedings of the 8th ACM/IEEE-CS Joint Conference on Digital Libraries, pp. 367–370. ACM (2008)
Becker, C., Rauber, A.: Decision criteria in digital preservation: what to measure and how. JASIST 62(6), 1009–1028 (2011)
Bellard, F.: QEMU, a fast and portable dynamic translator. In: Proceedings of the USENIX Annual Technical Conference, FREENIX Track, pp. 41–46 (2005)
Bergmeyer, W.: The KEEP emulation framework. In: Proceedings of the 1st International Workshop on Semantic Digital Archives (SDA 2011) (2011)
Ceri, S., Gottlob, G., Tanca, L.: What you always wanted to know about datalog (and never dared to ask). IEEE Trans Knowl Data Eng 1(1), 146–166 (1989)
Conway, E., Dunckley, M., McIlwrath, B., Giaretta, D.: Preservation network models: creating stable networks of information to ensure the long term use of scientific data. In: Proceedings of PV2009, Madrid, Spain, pp. 1–3 (2009)
Conway, E., Matthews, B., Giaretta, D., Lambert, S., Wilson, M., Draper, N.: Managing risks in the preservation of research data with preservation networks. Int. J. Digit. Curation 7(1), 3–15 (2012)
Doerr, M., Tzitzikas, Y.: Information carriers and identification of information objects: an ontological approach. CoRR, Digital Libraries. arXiv:1201.0385v1 [cs.DL] (2012)
Giaretta, D. (ed.): Advanced Digital Preservation. Springer, Berlin (2011)
Elenius, D., Martin, D., Ford, R., Denker, G.: Reasoning about resources and hierarchical tasks using OWL and SWRL. In: Proceedings of the 8th International Semantic Web Conference (ISWC’2009) (2009)
Erling, O., Mikhailov, I.: RDF support in the Virtuoso DBMS. In: Proceedings of the 1st Conference on Social Semantic Web (2007)
Granger, S.: Emulation as a digital preservation strategy. D-Lib Magazine 6(10), (2000)
Granger, S.: Digital preservation and emulation: from theory to practice. ICHIM 2, 289–296 (2001)
Haslhofer, B., Roochi, E.M., Schandl, B., Zander, S.: Europeana RDF store report. Technical report, University of Vienna, Vienna, March 2011. http://eprints.cs.univie.ac.at/2833/ (2011)
Horrocks, I., Patel-Schneider, P.F., Boley, H., Tabet, S., Grosof, B., Dean, M.: SWRL: a semantic web rule language combining OWL and RuleML. W3C Member Submission, vol. 21, p. 79 (2004)
Howe, J.: Crowdsourcing: Why the power of the crowd is driving the future of business. Crown Publishing Group, New York (2008)
Lohman, B., Kiers, B., Michel, D., van der Hoeven, J.: Emulation as a business solution: the emulation framework. In: Proceedings of the 8th International Conference on Preservation of Digital Objects (iPres’2011) (2011)
Lorie, R.A.: Long term preservation of digital information. In: Proceedings of the 1st ACM/IEEE-CS Joint Conference on Digital Libraries. JCDL ’01, pp. 346–352. ACM, New York (2001)
Marketakis, Y., Tzanakis, M., Tzitzikas, Y.: PreScan: Towards automating the preservation of digital objects. In: Proceedings of the International Conference on Management of Emergent Digital Ecosystems MEDES’2009, Lyon, France, Oct 2009 (2009)
Marketakis, Y., Tzitzikas, Y.: Dependency management for digital preservation using semantic web technologies. Int. J. Digit. Libr. 10(4), 159–177 (2009)
McGuinness, D.L., Van Harmelen, F.: OWL web ontology language overview. W3C Recomm. 10(2004-03), 10 (2004)
Rechert, K., von Suchodoletz, D., Welte, R.: Emulation based services in digital preservation. In: Proceedings of the 10th Annual Joint Conference on Digital Libraries, pp. 365–368. ACM (2010)
Sabou, M., Bontcheva, K., Scharl, A.: Crowdsourcing research opportunities: lessons from natural language processing. In: I-KNOW, p. 17 (2012)
Shaon, A., Giaretta, D., Crompton, S., Conway, E., Matthews, B., Marelli, F., Di Giammatteo, U., Marketakis, Y., Tzitzikas, Y., Guarino, R., Brocks, H., Engel, F.: Towards a long-term preservation infrastructure for earth science data. In: Proceedings of the 9th International Conference on Digital Preservation (iPres’2012) (2012)
Strubulis, C., Tzitzikas, Y., Doerr, M., Flouris, G.: Evolution of workflow provenance information in the presence of custom inference rules. In: 3rd International Workshop on the role of Semantic Web in Provenance Management (SWPM12), co-located with ESWC12, Heraklion, Crete (2012)
Theodoridou, M., Tzitzikas, Y., Doerr, M., Marketakis, Y., Melessanakis, V.: Modeling and querying provenance by extending CIDOC CRM. J. Distrib. Parallel Databases 27(2), 169–210 (2010)
Tzitzikas, Y.: Dependency management for the preservation of digital information. In: Proceedings of the 18th International Conference on Database and Expert Systems Applications, DEXA’2007, Regensburg, Germany, Sept 2007 (2007)
Tzitzikas, Y., Flouris, G.: Mind the (intelligibility) gap. In: Proceedings of the 11th European Conference on Research and Advanced Technology for Digital Libraries, ECDL’07, Budapest, Hungary, September 2007. Springer (2007)
Tzitzikas, Y., Kampouraki, M., Analyti, A.: Curating the specificity of ontological descriptions under ontology evolution. J. Data Semant. 3(2), 75–106 (2014)
Tzitzikas, Y., Marketakis, Y., Antoniou, G.: Task-based dependency management for the preservation of digital objects using rules. In: Proceedings of the 6th Hellenic Conference on Artificial Intelligence, SETN-2010, Athens, Greece (2010)
Tzitzikas, Y., Marketakis, Y., Kargakis, Y.: Conversion and emulation-aware dependency reasoning for curation services. In: Proceedings of the 9th Annual International Conference on Digital Preservation (iPres2012) (2012)
Van der Hoeven, J., Lohman, B., Verdegem, R.: Emulation for digital preservation in practice: the results. Int. J. Digit. Curation 2(2), 123–132 (2008)
Van Der Hoeven, J.R., Van Diessen, R.J., Van Der Meer, K.: Development of a universal virtual computer (UVC) for long-term preservation of digital objects. J. Inf. Sci. 31(3), 196–208 (2005)
von Suchodoletz, D., Rechert, K., van der Hoeven, J., Schroder, J.: Seven steps for reliable emulation strategies-solved problems and open issues. In: Proceedings of the 7th International Conference on Preservation of Digital Objects (iPRES’2010), pp. 19–24 (2010)
Waters, D., Garrett, J.: Preserving digital information report of the task force on archiving of digital information. In: Commissioned by the Commission on Preservation and Access and the Research Libraries Group Inc. Commission on Preservation and Access, Washington DC (1996)
Acknowledgments
Work done in the context of NoE APARSEN (Alliance Permanent Access to the Records of Science in Europe, FP7, Proj. No 269977), and SCIDIP-ES (SCIence Data Infrastructure for Preservation—Earth Science, FP7, for an overview see [25]). Many thanks to Rene van Horik from DANS for his active participation, and to Anastasia Analyti for proofreading the paper.
Appendix
Appendix A: Screen dumps from Epimenides
Figure 14 shows the initial screen of the system, where the user can log in to load a personal or other profile. Figure 15 shows the first screen, which allows the user to upload an atomic file or a zipped collection of files.
The system analyzes the contents of the zip file and for each of the included files it suggests a task. This is shown in Fig. 16.
Figure 17 shows the results of this analysis. We can see that the first file is in red because the selected task, i.e. Rendering, cannot be performed over that file (digital object). In contrast, the selected tasks for the other two files can be performed, and for this reason they are marked in green.
The user can explore the dependencies of each digital object. For example, Fig. 18 shows what happens if the user chooses to explore the dependencies of the “rendering” task. We can see all the rules of the selected task that are available in the system. The atoms of each rule are green or red: green atoms are available in the profile of the user, while red ones are not. Moreover, the user can click on an atom to explore its dependencies, i.e. to see the rules or the facts of that atom.
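The green/red marking described above can be illustrated with a minimal Python sketch. It is not the Epimenides implementation (which is based on semantic web technologies); the rule base, the atom names and the functions `derivable` and `colour_rules` are all hypothetical. The sketch also assumes that an atom counts as available if it is either a fact in the user's profile or can be derived from other rules.

```python
from typing import Dict, FrozenSet, List, Set

# Hypothetical rule base: head atom -> alternative rule bodies (sets of atoms).
Rules = Dict[str, List[FrozenSet[str]]]


def derivable(atom: str, facts: Set[str], rules: Rules,
              _visiting: FrozenSet[str] = frozenset()) -> bool:
    """Return True if `atom` is a profile fact or some rule for it has
    a body whose atoms are all derivable (assumption made for this sketch)."""
    if atom in facts:
        return True
    if atom in _visiting:  # guard against cyclic rules
        return False
    visiting = _visiting | {atom}
    return any(
        all(derivable(b, facts, rules, visiting) for b in body)
        for body in rules.get(atom, [])
    )


def colour_rules(task: str, facts: Set[str], rules: Rules) -> List[Dict[str, str]]:
    """For each rule of `task`, map every body atom to 'green' or 'red'."""
    return [
        {b: ("green" if derivable(b, facts, rules) else "red") for b in sorted(body)}
        for body in rules.get(task, [])
    ]


# Illustrative data only; these atoms do not come from the paper.
rules: Rules = {
    "render": [
        frozenset({"pdf_file", "pdf_viewer"}),
        frozenset({"pdf_file", "pdf2html", "web_browser"}),
    ],
    "pdf_viewer": [frozenset({"viewer_binary", "os"})],
}
facts: Set[str] = {"pdf_file", "web_browser", "os"}

for coloured in colour_rules("render", facts, rules):
    print(coloured)
```

In this sketch a task would be shown in green when `derivable(task, facts, rules)` holds, that is, when at least one of its rules has only green atoms; adding a missing module (for instance a converter) to `facts` can turn a red task green.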
Cite this article
Tzitzikas, Y., Kargakis, Y. & Marketakis, Y. Assisting digital interoperability and preservation through advanced dependency reasoning. Int J Digit Libr 15, 103–127 (2015). https://doi.org/10.1007/s00799-014-0131-1
DOI: https://doi.org/10.1007/s00799-014-0131-1