research-article

Virtual machine time travel using continuous data protection and checkpointing

Published: 01 January 2008

Abstract

Virtual machine (VM) time travel enables reverting a virtual machine's state, both transient and persistent, to past points in time. This capability can be used to improve virtual machine availability, to enable forensics on past VM states, and to recover from operator errors. We present an approach to virtual machine time travel that combines Continuous Data Protection (CDP) storage support with live-migration-based virtual machine checkpointing. In particular, we present a novel CDP approach that enables efficient reverts of the storage state to past points in time and makes it possible to undo a revert; both are achieved using a simple branched-temporal data structure. We also present the design and implementation of a simple live-migration-based checkpointing mechanism in Xen.
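
To make the idea of a branched-temporal structure more concrete, the following is a minimal Python sketch, not the paper's actual design. It assumes a toy block store in which every write is logged with a logical timestamp, a revert opens a new branch that sees its parent's history only up to the revert point, and undoing a revert simply switches back to the branch that was active before. The names `Branch`, `BranchedStore`, `revert`, and `undo_revert` are hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from dataclasses import dataclass, field


@dataclass
class Branch:
    parent: "int | None"   # id of the branch this one forked from (None for the root)
    fork_time: int         # parent history is visible up to this time, inclusive
    writes: dict = field(default_factory=dict)  # block -> ([times], [values])


class BranchedStore:
    """Toy CDP-style block store: logs every write, supports revert and undo."""

    def __init__(self):
        self.branches = [Branch(parent=None, fork_time=-1)]
        self.current = 0      # id of the active branch
        self.clock = 0        # logical timestamp, incremented on each write
        self.previous = []    # branches that were active before each revert

    def write(self, block, data):
        """Append a new version of `block` on the active branch; return its time."""
        self.clock += 1
        times, values = self.branches[self.current].writes.setdefault(block, ([], []))
        times.append(self.clock)
        values.append(data)
        return self.clock

    def read(self, block, at=None):
        """Return the newest version of `block` visible at time `at` (default: now)."""
        branch_id = self.current
        limit = self.clock if at is None else at
        while branch_id is not None:
            branch = self.branches[branch_id]
            times, values = branch.writes.get(block, ([], []))
            i = bisect_right(times, limit)  # number of versions with time <= limit
            if i:
                return values[i - 1]
            # Nothing on this branch: continue in the parent, clipped to the fork point.
            limit = min(limit, branch.fork_time)
            branch_id = branch.parent
        return None

    def revert(self, to_time):
        """Revert storage to `to_time` by opening a new branch; no data is copied."""
        self.previous.append(self.current)
        self.branches.append(Branch(parent=self.current, fork_time=to_time))
        self.current = len(self.branches) - 1

    def undo_revert(self):
        """Undo the most recent revert by reactivating the earlier branch."""
        self.current = self.previous.pop()
```

In this sketch a revert is cheap because it only adds a branch record, and the writes made before the revert remain in the log, which is what makes undoing the revert possible. For example, after writing two versions of a block, calling `revert` with the first write's timestamp makes `read` return the first version, and a following `undo_revert` makes it return the second version again.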


Published In

ACM SIGOPS Operating Systems Review, Volume 42, Issue 1
January 2008
133 pages
ISSN: 0163-5980
DOI: 10.1145/1341312

Publisher

Association for Computing Machinery

New York, NY, United States
