Safe and automatic live update for operating systems

Published: 16 March 2013

Abstract

Increasingly many systems have to run all the time with no downtime allowed. Consider, for example, systems controlling electric power plants and e-banking servers. Nevertheless, security patches and a constant stream of new operating system versions need to be deployed without stopping running programs. These factors naturally lead to a pressing demand for live update, that is, upgrading all or part of the operating system without rebooting. Unfortunately, existing solutions require significant manual intervention and thus work reliably only for small operating system patches.
In this paper, we describe an automated system for live update that can safely and automatically handle major upgrades without rebooting. We have implemented our ideas in Proteos, a new research OS designed with live update in mind. Proteos relies on system support and nonintrusive instrumentation to handle even very complex updates with minimal manual effort. The key novelty is the idea of state quiescence, which allows updates to happen only in safe and predictable system states. A second novelty is the ability to automatically perform transactional live updates at the process level, ensuring a safe and stable update process. Unlike prior solutions, Proteos supports automated state transfer, state checking, and hot rollback. We have evaluated Proteos on 50 real updates and on novel live update scenarios. The results show that our techniques can effectively support both simple and complex updates, while outperforming prior solutions in terms of flexibility, security, reliability, and stability of the update process.
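
The update flow named in the abstract (quiescence, state transfer, state checking, hot rollback) can be illustrated with a minimal sketch. It is not the Proteos implementation: `ProcessVersion`, `live_update`, the `in_flight_requests` counter, and the `transfer` and `check` callbacks are all hypothetical names, and the quiescence test is a simplifying assumption.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

State = Dict[str, int]


@dataclass
class ProcessVersion:
    """Hypothetical stand-in for one version of an OS server process."""
    name: str
    state: State = field(default_factory=dict)
    in_flight_requests: int = 0  # requests started but not yet finished

    def is_quiescent(self) -> bool:
        # Assumed quiescence test: no request is partially processed.
        return self.in_flight_requests == 0


def live_update(
    old: ProcessVersion,
    new: ProcessVersion,
    transfer: Callable[[State], State],
    check: Callable[[State], bool],
) -> ProcessVersion:
    """Return the version that should keep running after the attempt."""
    # Only update from a safe, predictable state.
    if not old.is_quiescent():
        return old
    try:
        # State transfer works on a copy, so the old version stays intact.
        candidate = transfer(dict(old.state))
        # State checking: reject a transferred state that looks invalid.
        if not check(candidate):
            return old  # rollback: the old version simply resumes
    except Exception:
        return old  # any failure during transfer also rolls back
    # Commit: the new version takes over with the transferred state.
    new.state = candidate
    return new


old = ProcessVersion("server-v1", {"open_files": 3})
new = ProcessVersion("server-v2")
running = live_update(
    old,
    new,
    transfer=lambda s: {**s, "cache_hits": 0},  # new version adds a field
    check=lambda s: s["open_files"] >= 0,
)
print(running.name)  # prints "server-v2"
```

Because the old version is never modified until the commit step, rolling back in this sketch amounts to returning the old version unchanged, which is the property that makes the update transactional.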

Published In

ACM SIGPLAN Notices, Volume 48, Issue 4
ASPLOS '13
April 2013
540 pages
ISSN: 0362-1340
EISSN: 1558-1160
DOI: 10.1145/2499368

  • ASPLOS '13: Proceedings of the eighteenth international conference on Architectural support for programming languages and operating systems
    March 2013
    574 pages
    ISBN: 9781450318709
    DOI: 10.1145/2451116

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 16 March 2013
Published in SIGPLAN Volume 48, Issue 4

Author Tags

  1. automatic updates
  2. live update
  3. operating systems
  4. state checking
  5. state transfer
  6. update safety

Qualifiers

  • Research-article

Cited By

  • (2024) Retcon: Live Updates for Embedded Event-Driven Applications. 2024 23rd ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), pages 126-137. DOI: 10.1109/IPSN61024.2024.00015. Online publication date: 13-May-2024
  • (2023) Efficient Scheduler Live Update for Linux Kernel with Modularization. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3, pages 194-207. DOI: 10.1145/3582016.3582054. Online publication date: 25-Mar-2023
  • (2023) Runtime software patching: Taxonomy, survey and future directions. Journal of Systems and Software, volume 200, article 111652. DOI: 10.1016/j.jss.2023.111652. Online publication date: Jun-2023
  • (2022) Live synthesis. Innovations in Systems and Software Engineering, 18:3, pages 443-454. DOI: 10.1007/s11334-022-00447-5. Online publication date: 31-Mar-2022
  • (2021) Live Synthesis. Automated Technology for Verification and Analysis, pages 153-169. DOI: 10.1007/978-3-030-88885-5_11. Online publication date: 18-Oct-2021
  • (2020) Reboot-Oriented IoT: Life Cycle Management in Trusted Execution Environment for Disposable IoT devices. Proceedings of the 36th Annual Computer Security Applications Conference, pages 428-441. DOI: 10.1145/3427228.3427293. Online publication date: 7-Dec-2020
  • (2017) Piston. Proceedings of the 33rd Annual Computer Security Applications Conference, pages 141-153. DOI: 10.1145/3134600.3134611. Online publication date: 4-Dec-2017
  • (2016) Dynamic and coordinated software reconfiguration in distributed data stream systems. Journal of Internet Services and Applications, 7:1. DOI: 10.1186/s13174-016-0050-z. Online publication date: 9-Aug-2016
  • (2014) Mutable checkpoint-restart. Proceedings of the 15th International Middleware Conference, pages 133-144. DOI: 10.1145/2663165.2663328. Online publication date: 8-Dec-2014
  • (2014) Towards Patching Memory Leak Bugs in Off-The-Shelf Software. Proceedings of the 2014 IEEE International Symposium on Software Reliability Engineering Workshops, pages 433-436. DOI: 10.1109/ISSREW.2014.44. Online publication date: 3-Nov-2014
