research-article

RINSE: interactive data series exploration with ADS+

Published: 01 August 2015

Abstract

Numerous applications continuously produce large amounts of data series, and in several time-critical scenarios analysts need to be able to query these data as soon as they become available. ADS+, an adaptive index data structure specifically tailored to indexing and querying very large data series collections, has recently been proposed as a solution to this problem. The main idea is that instead of building the complete index over the complete data set up front and querying only later, we interactively and adaptively build parts of the index, only for the parts of the data on which users pose queries. The net effect is that instead of waiting for extended periods of time for index creation, users can immediately start exploring the data series. In this work, we present a demonstration of ADS+: we introduce RINSE, a system that allows users to experience the benefits of the ADS+ adaptive index through an intuitive web interface. Users can explore large datasets and find patterns of interest using nearest neighbor search. They can draw queries (data series) using a mouse or touch screen, or they can select from a predefined list of data series. RINSE can scale to large data sizes while drastically reducing the data-to-query delay: by the time state-of-the-art indexing techniques finish indexing 1 billion data series (and before answering even a single query), adaptive data series indexing can already answer 3 × 10^5 queries.
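
The adaptive idea described in the abstract, building only the parts of the index that queries actually touch, can be illustrated with a toy sketch. This is not the ADS+ data structure or the RINSE code: the class `LazyIndex`, the helper `coarse_key`, and the one-bit-per-segment summary are hypothetical names and simplifications chosen for illustration. The sketch assumes that every series has the same length, divisible by the number of segments, and it returns only an approximate nearest neighbor taken from the single bucket that the query falls into.

```python
import math
from collections import defaultdict


def paa(series, segments):
    """Piecewise aggregate approximation: the mean of each equal-length segment."""
    step = len(series) // segments
    return [sum(series[i * step:(i + 1) * step]) / step for i in range(segments)]


def coarse_key(series, segments=2):
    """Hypothetical summary: one bit per segment (is the segment mean >= 0?)."""
    return tuple(int(m >= 0) for m in paa(series, segments))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class LazyIndex:
    """Toy adaptive index: buckets are filled with raw data only when queried."""

    def __init__(self, raw_data, segments=2):
        self.raw = raw_data
        self.segments = segments
        # Cheap up-front pass: record only which bucket each series id belongs to.
        self.assigned = defaultdict(list)
        for i, s in enumerate(raw_data):
            self.assigned[coarse_key(s, segments)].append(i)
        # Expensive part, deferred: bucket key -> list of (id, series).
        self.materialized = {}

    def query(self, q):
        """Return the id of an approximate nearest neighbor of q, or None."""
        key = coarse_key(q, self.segments)
        if key not in self.materialized:
            # First query on this bucket: materialize it now, and only it.
            ids = self.assigned.get(key, [])
            self.materialized[key] = [(i, self.raw[i]) for i in ids]
        candidates = self.materialized[key]
        if not candidates:
            return None
        return min(candidates, key=lambda c: euclidean(q, c[1]))[0]


data = [[1, 1, 1, 1], [-1, -1, -1, -1], [1, 1, -1, -1], [2, 2, 2, 2]]
index = LazyIndex(data)
nearest = index.query([1.2, 1.2, 1.2, 1.2])
```

After the single query above, only the bucket that the query fell into has been materialized; the other buckets still cost nothing beyond the initial assignment pass, which is the trade-off that lets users start querying before a full index exists.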

Published In

Proceedings of the VLDB Endowment, Volume 8, Issue 12
Proceedings of the 41st International Conference on Very Large Data Bases, Kohala Coast, Hawaii
August 2015
728 pages
ISSN: 2150-8097

Publisher

VLDB Endowment
