Article

Tashkent: uniting durability with transaction ordering for high-performance scalable database replication

Authors:

Sameh Elnikety,

Steven Dropsho,

Fernando Pedone

EuroSys '06: Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006

Pages 117 - 130

https://doi.org/10.1145/1217935.1217947

Published: 18 April 2006

Abstract

In stand-alone databases, the functions of ordering the transaction commits and making the effects of transactions durable are performed in one single action, namely the writing of the commit record to disk. For efficiency, many of these writes are grouped into a single disk operation. In replicated databases in which all replicas agree on the commit order of update transactions, these two functions are typically separated. Specifically, the replication middleware determines the global commit order, while the database replicas make the transactions durable.

The contribution of this paper is to demonstrate that this separation causes a significant scalability bottleneck. It forces some of the commit records to be written to disk serially, where in a stand-alone system they could have been grouped together in a single disk write. Two solutions are possible: (1) move durability from the database to the replication middleware, or (2) keep durability in the database and pass the global commit order from the replication middleware to the database.

We implement these two solutions. Tashkent-MW is a pure middleware solution that combines durability and ordering in the middleware, and treats an unmodified database as a black box. In Tashkent-API, we modify the database API so that the middleware can specify the commit order to the database, thus combining ordering and durability inside the database. We compare both Tashkent systems to an otherwise identical replicated system, called Base, in which ordering and durability remain separated. Under high update transaction loads, both Tashkent systems greatly outperform Base in throughput and response time.
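The counting argument behind the bottleneck can be illustrated with a small Python sketch. It is a toy model, not the paper's implementation: the `SimulatedLog` class, the two commit functions, and the group size are hypothetical names and values chosen for illustration. The sketch only counts simulated disk flushes when commit records must be written one at a time in the agreed order, versus when ordering and durability are combined so that several records share one flush.

```python
from typing import List


class SimulatedLog:
    """Toy commit log that counts flushes (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.records: List[int] = []   # commit records made durable, in order
        self.flushes = 0               # number of simulated disk writes
        self._pending: List[int] = []  # records appended but not yet flushed

    def append(self, txn_id: int) -> None:
        self._pending.append(txn_id)

    def flush(self) -> None:
        # One simulated disk write makes every pending record durable.
        if self._pending:
            self.records.extend(self._pending)
            self._pending.clear()
            self.flushes += 1


def commit_serially(log: SimulatedLog, ordered_txns: List[int]) -> None:
    """Ordering and durability separated: each commit record is flushed
    on its own so that the externally chosen order is preserved."""
    for txn_id in ordered_txns:
        log.append(txn_id)
        log.flush()


def commit_grouped(log: SimulatedLog, ordered_txns: List[int],
                   group_size: int) -> None:
    """Ordering and durability combined: records are appended in the
    agreed order and several of them share a single flush."""
    for count, txn_id in enumerate(ordered_txns, start=1):
        log.append(txn_id)
        if count % group_size == 0:
            log.flush()
    log.flush()  # flush any leftover records (no-op if none are pending)


if __name__ == "__main__":
    txns = list(range(8))  # global commit order chosen by the middleware

    serial_log = SimulatedLog()
    commit_serially(serial_log, txns)

    grouped_log = SimulatedLog()
    commit_grouped(grouped_log, txns, group_size=4)

    print("serial flushes:", serial_log.flushes)    # 8
    print("grouped flushes:", grouped_log.flushes)  # 2
```

Both variants leave the commit records in the same order; only the number of simulated disk writes differs, which is the effect the abstract attributes to grouping.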



Index Terms

Tashkent: uniting durability with transaction ordering for high-performance scalable database replication
1. Information systems
  1. Data management systems
    1. Database management system engines

Recommendations

SIPRe: a partial database replication protocol with SI replicas
SAC '08: Proceedings of the 2008 ACM symposium on Applied computing

Database replication has been researched as a solution to overcome the problems of performance and availability of distributed systems. Full database replication, based on group communication systems, is an attempt to enhance performance that works well ...
Tashkent+: memory-aware load balancing and update filtering in replicated databases
EuroSys'07 Conference Proceedings

We present a memory-aware load balancing (MALB) technique to dispatch transactions to replicas in a replicated database. Our MALB algorithm exploits knowledge of the working sets of transactions to assign them to replicas in such a way that they execute ...


Information & Contributors

Information

Published In


EuroSys '06: Proceedings of the 1st ACM SIGOPS/EuroSys European Conference on Computer Systems 2006

April 2006

420 pages

ISBN:1595933220

DOI:10.1145/1217935

Conference Chair: Yolande Berbers (K. U. Leuven, Belgium)
Program Chair: Willy Zwaenepoel (EPFL)

ACM SIGOPS Operating Systems Review Volume 40, Issue 4
Proceedings of the 2006 EuroSys conference
October 2006
383 pages
ISSN:0163-5980
DOI:10.1145/1218063

Copyright © 2006 Authors.

Sponsors

SIGOPS: ACM Special Interest Group on Operating Systems

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 18 April 2006


Qualifiers

Article

Conference

EUROSYS06

Sponsor:

SIGOPS

EUROSYS06: Eurosys 2006 Conference

April 18 - 21, 2006

Leuven, Belgium

Acceptance Rates

Overall Acceptance Rate 241 of 1,308 submissions, 18%


Bibliometrics

Total Citations: 78
Total Downloads: 467
Downloads (Last 12 months): 22
Downloads (Last 6 weeks): 3

Reflects downloads up to 23 Nov 2024

Citations

Cited By

Zhou W, Peng Q, Zhang Z, Zhang Y, Ren Y, Li S, Fu G, Cui Y, Li Q, Wu C, Han S, Wang S, Li G, Yu G (2023). GeoGauss: Strongly Consistent and Light-Coordinated OLTP for Geo-Replicated SQL Database. Proceedings of the ACM on Management of Data, 1:1, pages 1-27. Online publication date: 30-May-2023.
https://dl.acm.org/doi/10.1145/3588916

Georgiou M, Panayiotou M, Odysseos L, Paphitis A, Sirivianos M, Herodotou H, Li G, Li Z, Idreos S, Srivastava D (2021). Attaining Workload Scalability and Strong Consistency for Replicated Databases with Hihooi. Proceedings of the 2021 International Conference on Management of Data, pages 2721-2725. Online publication date: 9-Jun-2021.
https://dl.acm.org/doi/10.1145/3448016.3452746

Lu Y, Yu X, Madden S (2019). STAR. Proceedings of the VLDB Endowment, 12:11, pages 1316-1329. Online publication date: 1-Jul-2019.
https://dl.acm.org/doi/10.14778/3342263.3342270

Lee J, Han W, Na H, Park C, Kim K, Kim D, Lee J, Cha S, Moon S (2018). Parallel replication across formats for scaling out mixed OLTP/OLAP workloads in main-memory databases. The VLDB Journal — The International Journal on Very Large Data Bases, 27:3, pages 421-444. Online publication date: 1-Jun-2018.
https://dl.acm.org/doi/10.1007/s00778-018-0503-z

Lee J, Moon S, Kim K, Kim D, Cha S, Han W (2017). Parallel replication across formats in SAP HANA for scaling out mixed OLTP/OLAP workloads. Proceedings of the VLDB Endowment, 10:12, pages 1598-1609. Online publication date: 1-Aug-2017.
https://dl.acm.org/doi/10.14778/3137765.3137767

Rabl T, Jacobsen H, Chirkova R, Yang J, Suciu D (2017). Query Centric Partitioning and Allocation for Partially Replicated Database Systems. Proceedings of the 2017 ACM International Conference on Management of Data, pages 315-330. Online publication date: 9-May-2017.
https://dl.acm.org/doi/10.1145/3035918.3064052

Gao X, Chiueh T (2016). Towards Seamless Resynchronization for Active-Active Database Clustering. 2016 IEEE 22nd International Conference on Parallel and Distributed Systems (ICPADS), pages 1127-1134. Online publication date: Dec-2016.
https://doi.org/10.1109/ICPADS.2016.0148

Bernstein P, Das S, Ding B, Pilman M, Sellis T, Davidson S, Ives Z (2015). Optimizing Optimistic Concurrency Control for Tree-Structured, Log-Structured Databases. Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, pages 1295-1309. Online publication date: 27-May-2015.
https://dl.acm.org/doi/10.1145/2723372.2737788

Silva J, Lourenço J, Paulino H, Wainwright R, Corchado J, Bechini A, Hong J (2015). Boosting locality in multi-version partial data replication. Proceedings of the 30th Annual ACM Symposium on Applied Computing, pages 1309-1314. Online publication date: 13-Apr-2015.
https://dl.acm.org/doi/10.1145/2695664.2695851

Dhamane R, Martínez M, Vianello V, Peris R, Desai B, Bernardino J, Almeida A, Desai B (2014). Performance evaluation of database replication systems. Proceedings of the 18th International Database Engineering & Applications Symposium, pages 288-293. Online publication date: 7-Jul-2014.
https://dl.acm.org/doi/10.1145/2628194.2628214
