DOI: 10.1145/1377943.1377957
research-article

Got predictability?: experiences with fault-tolerant middleware

Published: 01 November 2007 Publication History

Abstract

Unpredictability in COTS-based systems often manifests as occasional instances of uncontrollably high response times. A particular category of COTS systems, fault-tolerant (FT) middleware, is used in critical enterprise and embedded applications where predictability is of paramount importance. Our prior empirical study, which used a client-server microbenchmark, suggested that hard bounds for the maximum latency are difficult to establish a priori, but that the unpredictability may be confined to less than 1% of the requests. In this paper, we present empirical data from 7 different three-tier FT-middleware applications that strongly support this "magical 1%" hypothesis. We conducted a controlled experiment with 7 teams of students from a graduate-level course at Carnegie Mellon University. Each team, starting from a common three-tier architecture, independently implemented and evaluated an original application using middleware (either CORBA or EJB) and a custom-implemented fault-tolerance mechanism (relying on either state-machine or primary-backup replication) for the middle-tier server. This experiment shows that unpredictability may not be avoidable, even in the absence of faults, and that, in some cases, the random latency outliers are larger than the time needed to recover from a fault. The data also reveal a statistically significant result: across all 7 applications, unpredictability is confined to the highest 1% of the recorded end-to-end latencies and is not correlated with the request rate, the size of the messages exchanged, or the number of clients. This suggests that strict predictability is hard to achieve in FT-middleware systems and that developers of critical FT applications should focus on guaranteeing bounds for statistical measures, such as the 99th percentile of the latency.
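To make the closing recommendation concrete, the following minimal sketch contrasts the 99th-percentile latency with the maximum latency for a set of recorded end-to-end latencies. It is an illustration only, not the authors' analysis code: the function names `percentile` and `summarize`, the nearest-rank percentile definition, and the synthetic latency values are all assumptions introduced here.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("samples must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Report the 99th percentile, the maximum, and the fraction of
    requests whose latency exceeds the 99th percentile."""
    p99 = percentile(latencies_ms, 99)
    above = sum(1 for x in latencies_ms if x > p99)
    return {
        "p99": p99,
        "max": max(latencies_ms),
        "tail_fraction": above / len(latencies_ms),
    }


# Hypothetical data (not from the paper): 995 requests near 2-3 ms
# plus 5 large outliers, all in milliseconds.
latencies_ms = [2.0 + 0.001 * i for i in range(995)]
latencies_ms += [250.0, 300.0, 410.0, 800.0, 1200.0]

stats = summarize(latencies_ms)
print(stats)
```

In this synthetic data set the maximum is dominated by a handful of outliers, while the 99th percentile stays close to the typical latency, which is why a percentile bound can be a more stable target than a bound on the maximum.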


Cited By

  • (2019) A study of unpredictability in fault-tolerant middleware. Computer Networks: The International Journal of Computer and Telecommunications Networking 57:3 (682-698). DOI: 10.1016/j.comnet.2012.10.015. Online publication date: 6-Jan-2019


Information & Contributors

Information

Published In

ACM Other conferences
MC '07: Proceedings of the 2007 ACM/IFIP/USENIX international conference on Middleware companion
November 2007
118 pages
ISBN: 9781595939357
DOI: 10.1145/1377943
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Conference

Middleware07
Middleware07: 8th International Middleware Conference
November 26 - 30, 2007
Newport Beach, California
