DOI: 10.1145/3035918.3056102

LittleTable: A Time-Series Database and Its Uses

Published: 09 May 2017

Abstract

We present LittleTable, a relational database that Cisco Meraki has used since 2008 to store usage statistics, event logs, and other time-series data from our customers' devices.
LittleTable optimizes for time-series data by clustering tables in two dimensions. By partitioning rows by timestamp, it allows quick retrieval of recent measurements without imposing any penalty for retaining older history. By further sorting within each partition by a hierarchically-delineated key, LittleTable allows developers to optimize each table for the specific patterns with which they intend to access it.
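The two-dimensional clustering described above can be illustrated with a small in-memory sketch. This is not LittleTable's implementation: the class name `ClusteredTable`, the fixed partition width, and the use of Python tuples such as `(network, device)` for the hierarchical key are all assumptions made for illustration. Rows are first bucketed by timestamp, and each bucket is kept sorted by key, so a query touches only the partitions that overlap its time range and then seeks directly to the requested key prefix.

```python
from bisect import bisect_left, insort
from collections import defaultdict


class ClusteredTable:
    """Toy table: partitioned by timestamp, sorted by hierarchical key.

    Hypothetical sketch; names and the fixed partition width are assumptions.
    """

    def __init__(self, partition_seconds=3600):
        self.partition_seconds = partition_seconds
        # partition id -> list of (key, ts, value), kept sorted by key then ts
        self.partitions = defaultdict(list)

    def insert(self, key, ts, value):
        # Dimension 1: the timestamp selects the partition.
        pid = ts // self.partition_seconds
        # Dimension 2: rows stay sorted by key inside the partition.
        # Values should be mutually comparable (ties on key and ts compare them).
        insort(self.partitions[pid], (key, ts, value))

    def query(self, key_prefix, start_ts, end_ts):
        """Rows whose key starts with key_prefix and start_ts <= ts < end_ts."""
        first = start_ts // self.partition_seconds
        last = (end_ts - 1) // self.partition_seconds
        out = []
        # Only partitions overlapping the time range are visited, so older
        # history costs nothing when reading recent data.
        for pid in sorted(p for p in self.partitions if first <= p <= last):
            rows = self.partitions[pid]
            # A one-element tuple holding the prefix sorts before every row
            # whose key begins with that prefix, so bisect finds the start.
            i = bisect_left(rows, (key_prefix,))
            while i < len(rows) and rows[i][0][:len(key_prefix)] == key_prefix:
                if start_ts <= rows[i][1] < end_ts:
                    out.append(rows[i])
                i += 1
        return out


table = ClusteredTable(partition_seconds=100)
table.insert(("net1", "devA"), 10, 1)
table.insert(("net1", "devB"), 20, 2)
table.insert(("net2", "devC"), 30, 3)
table.insert(("net1", "devA"), 150, 4)

# All devices in net1 during the first partition.
recent_net1 = table.query(("net1",), 0, 100)
# One device across both partitions.
dev_a_history = table.query(("net1", "devA"), 0, 200)
```

Choosing the key order is what lets a developer match a table to its access pattern: in this sketch a `(network, device)` key makes per-network scans contiguous, while a different order would favor different queries.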
LittleTable further optimizes for time-series data by capitalizing on the reduced consistency and durability needs of our applications, three of which we present here. In particular, our applications are single-writer and append-only. At most one process inserts a given type of data collected from a given device, and applications never update rows written in the past, simplifying both lock management and crash recovery. Our most recently written data is also recoverable, as it can generally be re-read from the devices themselves, allowing LittleTable to safely lose some amount of recently-written data in the event of a crash.
As a result of these optimizations, LittleTable is fast and efficient, even on a single processor and spinning disk. Querying an uncached table of 128-byte rows, it returns the first matching row in 31 ms, and it returns 500,000 rows/second thereafter, approximately 50% of the throughput of the disk itself. Today Meraki stores 320 TB of data across several hundred LittleTable servers system-wide.
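The throughput figures quoted above can be checked with a little arithmetic. The sketch below uses only the numbers stated in the abstract (128-byte rows, 500,000 rows/second, roughly 50% of disk throughput, 31 ms to the first row). The constant names, and the simple model of query time as first-row latency plus a steady scan rate, are assumptions for illustration rather than anything the paper specifies.

```python
ROW_BYTES = 128              # row size stated in the abstract
ROWS_PER_SECOND = 500_000    # steady-state scan rate stated in the abstract
FRACTION_OF_DISK = 0.50      # "approximately 50%" of the disk's throughput
FIRST_ROW_SECONDS = 0.031    # latency to the first matching row

# Bytes delivered per second while scanning: 128 * 500,000 = 64,000,000.
scan_bytes_per_second = ROW_BYTES * ROWS_PER_SECOND

# If that is about half of what the disk can deliver, the implied disk
# throughput is about 128,000,000 bytes/second.
implied_disk_bytes_per_second = scan_bytes_per_second / FRACTION_OF_DISK


def estimated_query_seconds(n_rows):
    """Assumed model: first-row latency, then rows at the steady scan rate."""
    return FIRST_ROW_SECONDS + max(n_rows - 1, 0) / ROWS_PER_SECOND


print(scan_bytes_per_second / 1e6, "MB/s scanned")
print(implied_disk_bytes_per_second / 1e6, "MB/s implied disk throughput")
print(round(estimated_query_seconds(1_000_000), 3), "s for one million rows")
```

Under this assumed model, the first-row latency dominates small queries and the scan rate dominates large ones.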

Published In

SIGMOD '17: Proceedings of the 2017 ACM International Conference on Management of Data
May 2017
1810 pages
ISBN: 9781450341974
DOI: 10.1145/3035918
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. cloud computing
  2. clustering
  3. databases
  4. internet of things
  5. partitioning
  6. time-series data

Qualifiers

  • Research-article

Conference

SIGMOD/PODS'17
Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
