Article (DOI: 10.1145/1031171.1031203)

A dimensionality reduction technique for efficient similarity analysis of time series databases

Published: 13 November 2004

Abstract

Efficiently searching for similarities among time series and discovering interesting patterns is an important and non-trivial problem with applications in many domains. The high dimensionality of the data makes the analysis very challenging, and many dimensionality reduction methods have been proposed to address it. PCA (Piecewise Constant Approximation) and its variants have been shown to be efficient for time series indexing and similarity retrieval. However, in certain applications, the many false alarms introduced by the approximation may reduce overall performance dramatically. In this paper, we introduce a new piecewise dimensionality reduction technique based on vector quantization. The new technique, PVQA (Piecewise Vector Quantized Approximation), partitions each sequence into equal-length segments and uses vector quantization to represent each segment by the closest codeword (under a chosen distance metric) from a codebook of key-sequences. Calculations become more efficient because the new representation has significantly lower dimensionality. We demonstrate the utility and efficiency of the proposed technique on real and simulated datasets. By exploiting prior knowledge about the data, the proposed technique generally outperforms PCA and its variants in similarity searches.
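
The encoding step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the distance metric, how the codebook is built, or how distances are computed in the reduced space, so the Euclidean metric, the hand-made codebook, and the function names `split_segments`, `encode`, and `approx_distance` are all assumptions made for illustration.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)


def split_segments(series, seg_len):
    """Partition a series into consecutive equal-length segments."""
    if len(series) % seg_len != 0:
        raise ValueError("series length must be a multiple of seg_len")
    return [tuple(series[i:i + seg_len]) for i in range(0, len(series), seg_len)]


def encode(series, codebook, seg_len):
    """Represent each segment by the index of its nearest codeword.

    Assumption: 'closest' means smallest Euclidean distance.
    """
    return [
        min(range(len(codebook)), key=lambda k: dist(seg, codebook[k]))
        for seg in split_segments(series, seg_len)
    ]


def approx_distance(code_a, code_b, codebook):
    """Approximate distance between two encoded series.

    Assumption: segment distances are taken between the codewords
    themselves and combined as in a Euclidean norm.
    """
    total = sum(dist(codebook[i], codebook[j]) ** 2 for i, j in zip(code_a, code_b))
    return total ** 0.5


# Hypothetical codebook of four key-sequences of length 2. In practice a
# codebook would be learned from training data rather than written by hand.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0), (1.0, 0.0)]

a = [0.1, 0.0, 0.9, 1.1, 0.2, 0.8]
b = [0.0, 0.1, 0.1, 0.9, 1.0, 0.1]

code_a = encode(a, codebook, seg_len=2)  # three indices instead of six values
code_b = encode(b, codebook, seg_len=2)
approx = approx_distance(code_a, code_b, codebook)
```

Each six-point series is reduced to three codeword indices, and distances between codewords could be precomputed once per codebook, which is one plausible reading of where the efficiency gain comes from.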

Published In

CIKM '04: Proceedings of the thirteenth ACM international conference on Information and knowledge management
November 2004
678 pages
ISBN:1581138741
DOI:10.1145/1031171

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data mining
  2. dimensionality reduction
  3. time series

Qualifiers

  • Article

Conference

CIKM04: Conference on Information and Knowledge Management
November 8 - 13, 2004
Washington, D.C., USA

Acceptance Rates

Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
