Mining Massive Time Series Data: With Dimensionality Reduction Techniques

Justin Borg¹² &
Joseph G. Vella¹²

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1244)

Included in the following conference series:

International Conference on Advances in Computing and Data Sciences

Abstract

Before data mining algorithms are applied to time series data, a pre-processing step is needed that reduces the volume of the input data while incurring only an acceptable loss of data quality. Input size reduction is an important step in optimizing time series processing, e.g. in data mining computations. During the last two decades, various time series dimensionality reduction techniques have been proposed. However, no study has been dedicated to gauging how effectively these techniques produce a reduced representation of the input time series that, when passed to various data mining algorithms, yields good-quality results. This paper gives empirical evidence by comparing three reduction techniques on various data sets and applying their output to four different data mining algorithms. The comparison is evaluated by running the data mining methods over both the original and the reduced data sets. The results show that it is sometimes feasible to use the reduced representations instead of the original time series data. One dimensionality reduction technique generated results with over 83% average accuracy when compared to its benchmark results.
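
The evaluation idea described in the abstract (reduce the series, run the same mining task on the original and on the reduced data, compare the outcomes) can be illustrated with a minimal sketch. The abstract does not name the three reduction techniques or the four mining algorithms, so the choices here are assumptions made purely for illustration: Piecewise Aggregate Approximation (PAA) stands in for the reduction technique, a one-nearest-neighbour search stands in for the mining task, and the `agreement` function is a hypothetical measure, not necessarily the accuracy measure used in the paper.

```python
import numpy as np


def paa(series, n_segments):
    """Piecewise Aggregate Approximation: the mean of each equal-width segment."""
    x = np.asarray(series, dtype=float)
    if x.size % n_segments != 0:
        # Simplifying assumption of this sketch, not a limit of PAA itself.
        raise ValueError("series length must be divisible by n_segments")
    return x.reshape(n_segments, -1).mean(axis=1)


def nearest_index(query, candidates):
    """Index of the candidate closest to the query in Euclidean distance."""
    diffs = np.asarray(candidates, dtype=float) - np.asarray(query, dtype=float)
    return int(np.argmin(np.linalg.norm(diffs, axis=1)))


def agreement(queries, candidates, n_segments):
    """Fraction of queries whose nearest neighbour is the same on the
    reduced data as on the original data (hypothetical measure)."""
    reduced_candidates = np.array([paa(c, n_segments) for c in candidates])
    hits = 0
    for q in queries:
        on_original = nearest_index(q, candidates)
        on_reduced = nearest_index(paa(q, n_segments), reduced_candidates)
        hits += on_original == on_reduced
    return hits / len(queries)
```

With series of length 8 reduced to 2 segments, each series shrinks to a quarter of its size; an `agreement` value close to 1 would suggest that, for this task and data, the reduced representation could be used in place of the original.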

Author information

Authors and Affiliations

Faculty of ICT, Department of Computer Information Systems, University of Malta, Msida, Malta
Justin Borg & Joseph G. Vella

Authors

Justin Borg
Joseph G. Vella

Corresponding author

Correspondence to Joseph G. Vella.

Editor information

Editors and Affiliations

University of KwaZulu-Natal, Durban, South Africa
Mayank Singh
Jaypee University of Information Technology, Waknaghat, Himachal Pradesh, India
P. K. Gupta
Jaypee University of Engineering and Technology, Guna, Madhya Pradesh, India
Vipin Tyagi
Institute of Information Theory and Automation, Prague, Czech Republic
Jan Flusser
University of Ottawa, Ottawa, ON, Canada
Tuncer Ören
University of Malta, Valletta, Malta
Gianluca Valentino

About this paper

Cite this paper

Borg, J., Vella, J.G. (2020). Mining Massive Time Series Data: With Dimensionality Reduction Techniques. In: Singh, M., Gupta, P., Tyagi, V., Flusser, J., Ören, T., Valentino, G. (eds) Advances in Computing and Data Sciences. ICACDS 2020. Communications in Computer and Information Science, vol 1244. Springer, Singapore. https://doi.org/10.1007/978-981-15-6634-9_45

DOI: https://doi.org/10.1007/978-981-15-6634-9_45
Published: 18 July 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-6633-2
Online ISBN: 978-981-15-6634-9
eBook Packages: Computer Science, Computer Science (R0)
