DOI: 10.1145/2452376.2452429
research-article

Elastic online analytical processing on RAMCloud

Published: 18 March 2013

Abstract

A shared-nothing architecture is the state of the art for deploying a distributed analytical in-memory database management system: it preserves the in-memory performance advantage by processing data locally on each node, but it is difficult to scale out. Modern switched-fabric communication links such as InfiniBand narrow the performance gap between local and remote DRAM data access to a single order of magnitude. Based on these premises, we introduce a distributed in-memory database architecture that separates the query execution engine from data access. This separation enables a) the use of a large-scale DRAM-based storage system such as Stanford's RAMCloud and b) the push-down of bandwidth-intensive database operators into the storage system. We address the resulting challenges, such as finding the optimal operator execution strategy and partitioning scheme. We demonstrate that such an architecture delivers both the elasticity of a shared-storage approach and the performance characteristics of operating on local DRAM.
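
The operator push-down mentioned in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's implementation and not RAMCloud's API: the `StorageNode` class, its methods, and the `values_sent` counter are hypothetical names introduced here, and the "network" is only a counter of values shipped to the query engine. It contrasts pulling a whole column partition to the query engine with running a filter-and-sum operator next to the data.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StorageNode:
    """Hypothetical storage node holding one partition of an integer column."""
    values: List[int]
    values_sent: int = 0  # values shipped over the simulated network

    def read_all(self) -> List[int]:
        # Data pull: ship the whole partition to the query engine.
        self.values_sent += len(self.values)
        return list(self.values)

    def pushed_down_sum(self, predicate: Callable[[int], bool]) -> int:
        # Operator push-down: filter and aggregate next to the data,
        # then ship a single partial result.
        self.values_sent += 1
        return sum(v for v in self.values if predicate(v))


def sum_by_pull(nodes: List[StorageNode], predicate: Callable[[int], bool]) -> int:
    # The query engine fetches every value and evaluates the operator itself.
    return sum(v for node in nodes for v in node.read_all() if predicate(v))


def sum_by_pushdown(nodes: List[StorageNode], predicate: Callable[[int], bool]) -> int:
    # The query engine only combines the partial sums computed on each node.
    return sum(node.pushed_down_sum(predicate) for node in nodes)


def make_nodes() -> List[StorageNode]:
    # Two partitions of 1000 values each (assumed toy data).
    return [StorageNode(list(range(0, 1000))), StorageNode(list(range(1000, 2000)))]


def compare():
    is_even = lambda v: v % 2 == 0
    pull_nodes, push_nodes = make_nodes(), make_nodes()
    pull_result = sum_by_pull(pull_nodes, is_even)
    push_result = sum_by_pushdown(push_nodes, is_even)
    pull_sent = sum(n.values_sent for n in pull_nodes)
    push_sent = sum(n.values_sent for n in push_nodes)
    return pull_result, push_result, pull_sent, push_sent
```

On this toy data, `compare()` returns the same sum for both strategies, while the pull variant ships 2000 values and the push-down variant ships 2. Which strategy is preferable in a real system depends on factors the sketch ignores, such as operator selectivity, CPU load on the storage nodes, and the partitioning scheme.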

Published In

EDBT '13: Proceedings of the 16th International Conference on Extending Database Technology
March 2013
793 pages
ISBN: 9781450315975
DOI: 10.1145/2452376

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. RAMCloud
  2. analytics
  3. elasticity
  4. in-memory

Qualifiers

  • Research-article

Conference

EDBT/ICDT '13

Acceptance Rates

Overall Acceptance Rate 7 of 10 submissions, 70%

Cited By

  • (2022) Recent implications towards sustainable and energy efficient AI and big data implementations in cloud-fog systems: A newsworthy inquiry. Journal of King Saud University - Computer and Information Sciences, 34:10 (8867-8887). DOI: 10.1016/j.jksuci.2021.11.002. Online publication date: Nov-2022
  • (2019) Memory-Side Protection With a Capability Enforcement Co-Processor. ACM Transactions on Architecture and Code Optimization, 16:1 (1-26). DOI: 10.1145/3302257. Online publication date: 8-Mar-2019
  • (2018) The New Hardware Development Trend and the Challenges in Data Management and Analysis. Data Science and Engineering, 3:3 (263-276). DOI: 10.1007/s41019-018-0072-6. Online publication date: 24-Sep-2018
  • (2017) Characterizing Performance and Energy-Efficiency of the RAMCloud Storage System. 2017 IEEE 37th International Conference on Distributed Computing Systems (ICDCS) (1488-1498). DOI: 10.1109/ICDCS.2017.51. Online publication date: Jun-2017
  • (2017) An Empirical Evaluation of How The Network Impacts The Performance and Energy Efficiency in RAMCloud. Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (1027-1034). DOI: 10.1109/CCGRID.2017.127. Online publication date: 14-May-2017
  • (2016) The end of slow networks. Proceedings of the VLDB Endowment, 9:7 (528-539). DOI: 10.14778/2904483.2904485. Online publication date: 1-Mar-2016
  • (2015) Benchmarking Elastic Query Processing on Big Data. Big Data Benchmarking (37-44). DOI: 10.1007/978-3-319-20233-4_5. Online publication date: 14-Jun-2015
  • (2014) In Memory Data Processing Systems. Encyclopedia of Business Analytics and Optimization (1182-1191). DOI: 10.4018/978-1-4666-5202-6.ch109. Online publication date: 2014
  • (2014) Parallel join executions in RAMCloud. 2014 IEEE 30th International Conference on Data Engineering Workshops (182-190). DOI: 10.1109/ICDEW.2014.6818325. Online publication date: Mar-2014
