Interval Feature Transformation for Time Series Classification Using Perceptually Important Points
Figure 1. Examples of time series segments for four data sets: ECGFiveDays (ECG = electrocardiogram) (a), FreezerSmallTrain (b), Ham (c), and Wine (d).
Figure 2. Influence of the choice of parameter τ on the OliveOil data set: τ = 1 (a), τ = 25 (b), and τ = 50 (c).
Figure 3. Split point.
Figure 4. Results of a noise sensitivity analysis on the OliveOil (a) and ShapeletSim (b) datasets. IFT: Interval Feature Transformation; RF: Random Forest; SVM: Support Vector Machine; XGBOOST: eXtreme Gradient Boosting; GBDT: Gradient Tree Boosting; SNR: Signal-to-Noise Ratio.
Figure 5. Critical difference diagram for the six classifiers derived from the results in Table 3. LS: Learning Time-Series Shapelets; BOSS: Bag-of-SFA Symbols; CD: Critical Difference; 1NN-DTW: One-Nearest-Neighbor Classifier with Dynamic Time Warping; TSF: Time Series Forest; SAXVSM: Symbolic Aggregate approXimation and Vector Space Model.
Figure 6. Two classes of time series from the ECGFiveDays data set.
Figure 7. (a) Sixty-eight feature vectors extracted by the proposed method. (b) Interval features found by the proposed method.
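The title and Section 3.1 below center on perceptually important points (PIPs). As background, the following is a minimal sketch of the classic iterative PIP selection procedure using a vertical-distance criterion; the function name `find_pips` and the choice of distance measure are illustrative assumptions, not the authors' implementation (Euclidean and perpendicular-distance variants are also common).

```python
import numpy as np

def vertical_distance(series, left, right, idx):
    """Vertical distance from series[idx] to the straight line joining
    (left, series[left]) and (right, series[right])."""
    y_line = series[left] + (series[right] - series[left]) * (idx - left) / (right - left)
    return abs(series[idx] - y_line)

def find_pips(series, k):
    """Select k perceptually important points: start with the two endpoints,
    then repeatedly add the point with the largest vertical distance to the
    line segment joining its two neighbouring PIPs."""
    series = np.asarray(series, dtype=float)
    n = len(series)
    pips = [0, n - 1] if n > 1 else [0]    # endpoints are always PIPs
    while len(pips) < min(k, n):
        best_idx, best_dist = None, -1.0
        for left, right in zip(pips[:-1], pips[1:]):
            for idx in range(left + 1, right):
                d = vertical_distance(series, left, right, idx)
                if d > best_dist:
                    best_idx, best_dist = idx, d
        if best_idx is None:               # no interior candidates remain
            break
        pips.append(best_idx)
        pips.sort()
    return pips

# Example: find_pips([0., 2., 1., 4., 3., 6., 0.], k=4) returns [0, 4, 5, 6]
```

The per-dataset parameter k listed in the experimental table below plausibly plays this role (number of PIPs per series), though that reading is an assumption here.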
Abstract
1. Introduction
2. Related Work
3. Interval Feature Transformation
3.1. Perceptually Important Points (PIPs)
3.2. Discriminative Feature Vector Selection
3.2.1. Internal Feature Vector
3.2.2. Selection Process
3.2.3. Selection Measure
3.3. Data Transform
4. Experimental Studies
4.1. Experimental Design
4.2. Evaluating Indicator
4.3. Results
5. Discussion
Author Contributions
Funding
Conflicts of Interest
References
- Ghaderpour, E.; Pagiatakis, S.D. Least-Squares Wavelet Analysis of Unequally Spaced and Non-stationary Time Series and Its Applications. Math. Geosci. 2017, 49, 819–844. [Google Scholar] [CrossRef]
- Abdel-Hamid, O.; Deng, L.; Yu, D. Exploring convolutional neural network structures and optimization techniques for speech recognition. Interspeech 2013, 11, 73–75. [Google Scholar]
- Abdel-Hamid, O.; Mohamed, A.R.; Jiang, H.; Penn, G. Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition. In Proceedings of the 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, Kyoto, Japan, 25–30 March 2012. [Google Scholar]
- Wang, J.; Liu, P.; She, M.F.; Nahavandi, S.; Kouzani, A. Bag-of-words representation for biomedical time series classification. Biomed. Signal Process. Control 2013, 8, 634–644. [Google Scholar] [CrossRef] [Green Version]
- Lines, J. Time Series Classification through Transformation and Ensembles. Ph.D. Thesis, University of East Anglia, Norwich, UK, 2015. [Google Scholar]
- Fulcher, B.D. Feature-based time-series analysis. In Feature Engineering for Machine Learning and Data Analytics; CRC Press: Boca Raton, FL, USA, 2018. [Google Scholar]
- Masip, D.; Vitrià, J. Boosted discriminant projections for nearest neighbor classification. Pattern Recognit 2006, 39, 164–170. [Google Scholar] [CrossRef]
- Goldstein, M. kn-nearest neighbor classification. IEEE Trans. Inf. Theory 1972, 18, 627–630. [Google Scholar] [CrossRef]
- Ratanamahatana, C.A.; Keogh, E. Making time-series classification more accurate using learned constraints. In Proceedings of the Fourth SIAM International Conference on Data Mining, Lake Buena Vista, FL, USA, 22–24 April 2004. [Google Scholar] [CrossRef] [Green Version]
- Jeong, Y.S.; Jeong, M.K.; Omitaomu, O.A. Weighted dynamic time warping for time series classification. Pattern Recognit. 2011, 44, 2231–2240. [Google Scholar] [CrossRef]
- Yu, D.; Yu, X.; Hu, Q.; Liu, J.; Wu, A. Dynamic time warping constraint learning for large margin nearest neighbor classification. Inf. Sci. 2011, 181, 2787–2796. [Google Scholar] [CrossRef]
- Deng, H.; Runger, G.; Tuv, E.; Vladimir, M. A Time Series Forest for Classification and Feature Extraction. Inf. Sci. 2013, 239, 142–153. [Google Scholar] [CrossRef] [Green Version]
- Ye, L.; Keogh, E. Time series shapelets: A new primitive for data mining. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Paris, France, 28 June–1 July 2009; pp. 947–956. [Google Scholar] [CrossRef]
- Hills, J.; Lines, J.; Baranauskas, E.; Mapp, J.; Bagnall, A. Classification of time series by shapelet transformation. Data Min. Knowl. Discov. 2013, 28, 851–881. [Google Scholar] [CrossRef] [Green Version]
- Faouzi, J.; Janati, H. pyts: A python package for time series classification. J. Mach. Learn. Res. 2020, 21, 1–6. [Google Scholar]
- Schäfer, P. The BOSS is concerned with time series classification in the presence of noise. Data Min. Knowl. Discov. 2015, 29, 1505–1530. [Google Scholar] [CrossRef]
- Patel, P.; Keogh, E.; Lin, J.; Lonardi, S. Mining Motifs in Massive Time Series Databases. In Proceedings of the 2002 IEEE International Conference on Data Mining, Maebashi City, Japan, 9–12 December 2002. [Google Scholar]
- Senin, P.; Malinchik, S. SAX-VSM: Interpretable Time Series Classification Using SAX and Vector Space Model. In Proceedings of the 2013 IEEE 13th International Conference on Data Mining, Dallas, TX, USA, 7–10 December 2013. [Google Scholar]
- Rodríguez, J.J.; Alonso, C.J.; Boström, H. Boosting interval based literals. Intell. Data Anal. 2001, 5, 245–262. [Google Scholar] [CrossRef] [Green Version]
- Lu, W.; Chen, X.; Pedrycz, W.; Liu, X.; Yang, J. Using interval information granules to improve forecasting in fuzzy time series. Int. J. Approx. Reason. 2015, 57, 1–18. [Google Scholar] [CrossRef]
- Fulcher, B.D.; Jones, N.S. Highly Comparative Feature-Based Time-Series Classification. IEEE Trans. Knowl. Data Eng. 2014, 26, 3026–3037. [Google Scholar] [CrossRef] [Green Version]
- Nanopoulos, A.; Alcock, R.; Manolopoulos, Y. Feature-based Classification of Time-series Data. Int. J. Comput. Res. 2001, 10, 49–61. [Google Scholar]
- Lin, J.; Li, Y. Finding Structural Similarity in Time Series Data Using Bag-of-Patterns Representation. In International Conference on Scientific and Statistical Database Management; Springer: Berlin/Heidelberg, Germany, 2009; pp. 461–477. [Google Scholar] [CrossRef]
- Sutcliffe, P.R. Fourier transformation as a method of reducing the sampling interval of a digital time series. Comput. Geosci. 1988, 14, 125–129. [Google Scholar] [CrossRef]
- Hariharan, G. Wavelet Analysis—An Overview. In Wavelet Solutions for Reaction–Diffusion Problems in Science and Engineering; Forum for Interdisciplinary Mathematics; Springer: Singapore, 2019. [Google Scholar]
- Huang, N.E.; Shen, Z.; Long, S.R.; Wu, M.C.; Shih, H.H.; Zheng, Q.; Yen, N.C.; Tung, C.C.; Liu, H.H. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc. R. Soc. Lond. Ser. A Math. Phys. Eng. Sci. 1998, 454, 903–995. [Google Scholar] [CrossRef]
- Ren, H.; Wang, Y.L.; Huang, M.Y.; Chang, Y.L.; Kao, H.M. Ensemble empirical mode decomposition parameters optimization for spectral distance measurement in hyperspectral remote sensing data. Remote Sens. 2014, 6, 2069–2083. [Google Scholar] [CrossRef] [Green Version]
- Yu, J.; Yin, J.; Zhou, D.; Zhang, J. A Pattern Distance-Based Evolutionary Approach to Time Series Segmentation. In Intelligent Control and Automation; Springer: Berlin/Heidelberg, Germany, 2006. [Google Scholar]
- Tsinaslanidis, P.E.; Kugiumtzis, D. A prediction scheme using perceptually important points and dynamic time warping. Expert Syst. Appl. 2014, 41, 6848–6860. [Google Scholar] [CrossRef]
- Jiménez, P.; Nogal, M.; Caulfield, B.; Pilla, F. Perceptually important points of mobility patterns to characterise bike sharing systems: The Dublin case. J. Transp. Geogr. 2016, 54, 228–239. [Google Scholar] [CrossRef]
- Yu, H.H.; Chen, C.H.; Tseng, S. Mining Emerging Patterns from Time Series Data with Time Gap Constraint. Int. J. Innov. Comput. Inf. Control 2011, 7, 5515–5528. [Google Scholar]
- Ye, L.; Keogh, E. Time series shapelets: A novel technique that allows accurate, interpretable and fast classification. Data Min. Knowl. Discov. 2011, 22, 149–182. [Google Scholar] [CrossRef] [Green Version]
- Mueen, A.; Keogh, E.; Young, N.E. Logical-Shapelets: An Expressive Primitive for Time Series Classification. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Diego, CA, USA, 21 August 2011; pp. 1154–1162. [Google Scholar]
- Dau, H.A.; Keogh, E.; Kamgar, K.; Yeh, C.-C.M.; Zhu, Y.; Gharghabi, S.; Ratanamahatana, C.A.; Chen, Y.; Hu, B.; Begum, N.; et al. The UCR Time Series Classification Archive. Available online: https://www.cs.ucr.edu/~eamonn/time_series_data_2018/ (accessed on 1 October 2018).
- Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
- Demšar, J. Statistical comparisons of classifiers over multiple data sets. J. Mach. Learn. Res. 2006, 7, 1–30. [Google Scholar]
- Friedman, M. A comparison of alternative tests of significance for the problem of m rankings. Ann. Math. Stat. 1940, 11, 86–92. [Google Scholar] [CrossRef]
- Nemenyi, P.B. Distribution-Free Multiple Comparisons. Ph.D. Thesis, Princeton University, Princeton, NJ, USA, 1963. [Google Scholar]
- Physical Activity Monitoring for Aging People. Available online: http://www.pamap.org (accessed on 1 February 2011).
# | Dataset | Type | Train | Test | Length | Class | k | τ | |
---|---|---|---|---|---|---|---|---|---|
1 | BirdChicken | Image | 20 | 20 | 512 | 2 | 25 | 5 | 256 |
2 | FreezerRegularTrain | Sensor | 150 | 2850 | 301 | 2 | 10 | 2 | 150 |
3 | ShapeletSim | Simulated | 20 | 180 | 500 | 2 | 25 | 5 | 250 |
4 | ToeSegmentation1 | Motion | 40 | 228 | 277 | 2 | 16 | 5 | 138 |
5 | Worms | Motion | 181 | 77 | 900 | 5 | 10 | 1 | 450 |
6 | Rock | Spectrum | 20 | 50 | 2844 | 4 | 5 | 1 | 100 |
7 | Meat | Spectro | 60 | 60 | 448 | 3 | 20 | 2 | 224 |
8 | Beef | Spectro | 30 | 30 | 470 | 5 | 40 | 5 | 235 |
9 | InlineSkate | Motion | 100 | 550 | 1882 | 7 | 8 | 1 | 400 |
10 | Coffee | Spectro | 28 | 28 | 286 | 2 | 15 | 5 | 143 |
11 | DodgerLoopGame | Sensor | 20 | 138 | 288 | 2 | 20 | 2 | 144 |
12 | DodgerLoopWeekend | Sensor | 20 | 138 | 288 | 2 | 20 | 2 | 144 |
13 | ECGFiveDays | ECG | 23 | 861 | 136 | 2 | 14 | 5 | 68 |
14 | Ham | Spectro | 109 | 105 | 431 | 2 | 20 | 10 | 215 |
15 | Herring | Image | 64 | 64 | 512 | 2 | 20 | 10 | 256 |
16 | PowerCons | Power | 180 | 180 | 144 | 2 | 24 | 2 | 72 |
17 | Wine | Spectro | 57 | 54 | 234 | 2 | 12 | 5 | 117 |
18 | Yoga | Image | 300 | 3000 | 426 | 2 | 5 | 2 | 213 |
19 | FaceFour | Image | 24 | 88 | 350 | 4 | 20 | 10 | 175 |
20 | OliveOil | Spectro | 30 | 30 | 570 | 4 | 7 | 50 | 200 |
21 | Fish | Image | 175 | 175 | 463 | 7 | 6 | 1 | 231 |
22 | Plane | Sensor | 105 | 105 | 144 | 7 | 10 | 1 | 72 |
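Figure 4 reports a noise sensitivity analysis at several SNR levels. As an illustration of how such a setup is commonly built (not necessarily the authors' procedure), the sketch below adds white Gaussian noise to a series at a target SNR in dB, using SNR = 10·log10(P_signal/P_noise); the function name `add_noise_at_snr` is hypothetical.

```python
import numpy as np

def add_noise_at_snr(series, snr_db, rng=None):
    """Add white Gaussian noise so that the resulting signal-to-noise
    ratio is approximately `snr_db` decibels."""
    rng = np.random.default_rng() if rng is None else rng
    series = np.asarray(series, dtype=float)
    signal_power = np.mean(series ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=series.shape)
    return series + noise
```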
# | 1NN-DTW | IFT + RF | IFT + SVM | IFT + XGBOOST | IFT + GBDT
---|---|---|---|---|---|
1 | 0.7500 | 0.8000 | 0.9000 | 0.7500 | 0.7500 |
2 | 0.9042 | 0.8968 | 0.9035 | 0.8940 | 0.8958 |
3 | 0.6944 | 0.9833 | 0.9944 | 0.9944 | 0.9778 |
4 | 0.7368 | 0.8728 | 0.8816 | 0.8246 | 0.8114 |
5 | 0.4935 | 0.6623 | 0.5974 | 0.6104 | 0.6234 |
6 | 0.6400 | 0.6000 | 0.6200 | 0.5600 | 0.6000 |
7 | 0.9333 | 0.9500 | 0.8500 | 0.9500 | 0.9167 |
8 | 0.6667 | 0.6667 | 0.6667 | 0.4667 | 0.4667 |
9 | 0.3836 | 0.3491 | 0.3400 | 0.3582 | 0.3109 |
10 | 1.0000 | 0.9643 | 0.9643 | 0.8929 | 0.7857 |
11 | 0.8768 | 0.6693 | 0.6693 | 0.5433 | 0.6142 |
12 | 0.9493 | 0.8016 | 0.7778 | 0.7460 | 0.7778 |
13 | 0.7677 | 0.7027 | 0.8281 | 0.7247 | 0.7607 |
14 | 0.5905 | 0.6381 | 0.5619 | 0.5741 | 0.5238 |
15 | 0.5312 | 0.6719 | 0.5625 | 0.6250 | 0.5938 |
16 | 0.9667 | 0.9056 | 0.9222 | 0.9167 | 0.9333 |
17 | 0.5741 | 0.7222 | 0.7037 | 0.7407 | 0.5926 |
18 | 0.8363 | 0.7767 | 0.6720 | 0.7693 | 0.7553 |
19 | 0.8977 | 0.6250 | 0.6477 | 0.5568 | 0.4659 |
20 | 0.8333 | 0.7333 | 0.7667 | 0.7667 | 0.7333 |
21 | 0.8229 | 0.7600 | 0.8114 | 0.7886 | 0.7086 |
22 | 1.0000 | 1.0000 | 0.9810 | 0.9905 | 1.0000 |
Average | 0.7659 | 0.7614 | 0.7556 | 0.7293 | 0.7090 |
Win/tie/lose | – | 9/1/12 | 7/1/14 | 6/1/15 | 5/2/15
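The Win/tie/lose row above counts, for each IFT-based classifier, the datasets on which it beats, ties with, or loses to the 1NN-DTW baseline. A minimal sketch of how such counts can be reproduced from two accuracy columns (assuming a tie means equal reported accuracy; `win_tie_lose` is an illustrative helper, not code from the paper):

```python
import numpy as np

def win_tie_lose(method_acc, baseline_acc, tol=0.0):
    """Count datasets where the method beats, ties with, or loses to the
    baseline; a tie is declared when accuracies differ by at most `tol`."""
    diff = np.asarray(method_acc) - np.asarray(baseline_acc)
    wins = int(np.sum(diff > tol))
    ties = int(np.sum(np.abs(diff) <= tol))
    loses = int(np.sum(diff < -tol))
    return wins, ties, loses

# Example: applied to the IFT + SVM and 1NN-DTW columns above, this yields (7, 1, 14).
```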
# | 1NN-DTW | LS | BOSS | TSF | SAXVSM | IFT |
---|---|---|---|---|---|---|
1 | 0.7500 | 0.9000 | 0.9500 | 0.9000 | 0.5000 | 0.9000 |
2 | 0.9042 | 0.8079 | 0.8242 | 0.9958 | 0.6800 | 0.9035 |
3 | 0.6944 | 0.9889 | 0.6389 | 0.5000 | 0.6500 | 0.9944 |
4 | 0.7368 | 0.9167 | 0.7982 | 0.7632 | 0.8991 | 0.8816 |
5 | 0.4935 | 0.5325 | 0.6234 | 0.5714 | 0.2208 | 0.6623 |
6 | 0.6400 | 0.3200 | 0.4200 | 0.8800 | 0.1200 | 0.6200 |
7 | 0.9333 | 0.3333 | 0.8333 | 0.9333 | 0.7500 | 0.9500 |
8 | 0.6667 | 0.4000 | 0.5667 | 0.7667 | 0.5000 | 0.6667 |
9 | 0.3836 | 0.3000 | 0.2800 | 0.3673 | 0.1564 | 0.3582 |
10 | 1.0000 | 0.5000 | 0.8214 | 1.0000 | 0.9643 | 0.9643 |
11 | 0.8768 | 0.8583 | 0.5906 | 0.8031 | 0.6378 | 0.6693 |
12 | 0.9493 | 0.9762 | 0.7937 | 0.9841 | 0.7937 | 0.8016 |
13 | 0.7677 | 0.4971 | 0.6655 | 0.9872 | 0.7793 | 0.8281 |
14 | 0.5905 | 0.6190 | 0.5905 | 0.7524 | 0.5810 | 0.6381 |
15 | 0.5312 | 0.5938 | 0.6250 | 0.5625 | 0.4062 | 0.6719 |
16 | 0.9667 | 0.7833 | 0.9111 | 1.0000 | 0.7278 | 0.9333 |
17 | 0.5741 | 0.5000 | 0.4074 | 0.6667 | 0.5370 | 0.7407 |
18 | 0.8363 | 0.5473 | 0.6900 | 0.7463 | 0.6083 | 0.7767 |
19 | 0.8977 | 0.3977 | 0.5795 | 0.7614 | 0.9091 | 0.6477 |
20 | 0.8333 | 0.4000 | 0.8000 | 0.8667 | 0.5000 | 0.7667 |
21 | 0.8229 | 0.2229 | 0.5257 | 0.5429 | 0.3943 | 0.8114 |
22 | 1.0000 | 0.9905 | 0.9619 | 0.9429 | 0.9810 | 1.0000 |
Average rank | 2.7045 | 4.3182 | 4.1818 | 2.5227 | 4.7727 | 2.5000 |
Win | 6 | 1 | 1 | 9 | 1 | 6 |
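The critical difference diagram in Figure 5 is derived from the average ranks in this table via the Friedman test with the Nemenyi post-hoc analysis (Demšar; Friedman; Nemenyi in the references). The sketch below is illustrative rather than the authors' code: it computes average ranks from an accuracy matrix and the Nemenyi critical difference for six classifiers over 22 datasets, with q_0.05 ≈ 2.850 taken from Demšar's table; the function names are assumptions made here.

```python
import numpy as np
from scipy.stats import rankdata

def average_ranks(accuracies):
    """accuracies: (n_datasets, n_classifiers) array, higher is better.
    Returns the Friedman average rank of each classifier (rank 1 = best),
    with ties receiving the average of the tied ranks."""
    ranks = np.vstack([rankdata(-row) for row in accuracies])
    return ranks.mean(axis=0)

def nemenyi_cd(n_classifiers, n_datasets, q_alpha=2.850):
    """Nemenyi critical difference (Demšar, 2006); q_alpha = 2.850 is the
    critical value for 6 classifiers at alpha = 0.05."""
    return q_alpha * np.sqrt(n_classifiers * (n_classifiers + 1) / (6.0 * n_datasets))

# Six classifiers on 22 datasets: rank gaps smaller than ~1.61 are not significant.
print(nemenyi_cd(6, 22))
```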
© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Yan, L.; Liu, Y.; Liu, Y. Interval Feature Transformation for Time Series Classification Using Perceptually Important Points. Appl. Sci. 2020, 10, 5428. https://doi.org/10.3390/app10165428