Learning Entropy: On Shannon vs. Machine-Learning-Based Information in Time Series

  • Conference paper
  • Database and Expert Systems Applications - DEXA 2022 Workshops (DEXA 2022)

Abstract

The paper discusses learning-based information (\(L\)) and Learning Entropy (\(LE\)) in contrast to classical Shannon probabilistic information (\(I\)) and probabilistic entropy (\(H\)). It is shown that \(L\) corresponds to the recently introduced Approximate Individual Sample-point Learning Entropy (\(AISLE\)). For data series, \(LE\) should then be defined as the mean value of \(L\), which is finally in proper accordance with Shannon's concept of entropy \(H\). The distinction of \(L\) from \(I\) is explained via real-time anomaly detection of individual time-series data points (states). First, the principal distinction between the information concepts \(I\) and \(L\) is demonstrated with respect to the data-governing law, which \(L\) considers explicitly while \(I\) does not. Second, it is shown that \(L\) can be applied to much shorter datasets than \(I\), because the learning system is pre-trained and able to generalize from a smaller dataset. Then, floating-window trajectories of the covariance matrix norm, the trajectory of the approximate variance fractal dimension, and especially the windowed Shannon entropy trajectory are compared to \(LE\) on multichannel EEG featuring an epileptic seizure. The results on real time series show that \(L\), i.e., \(AISLE\), can be a useful counterpart to Shannon entropy that also allows a more detailed search for anomaly onsets (change points).
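To make the contrast concrete, the following minimal sketch (the histogram estimator, the simple linear predictor, and all parameter values are illustrative assumptions, not the paper's implementation) evaluates the Shannon information \(I\) of a sample against an estimated probability distribution, and the learning-based information \(L\) as the weight-update effort of an incrementally adapted predictor.

```python
import numpy as np

def shannon_information(x, history, bins=16):
    """Shannon information I = -log2 p(x), with p(x) estimated from a
    histogram of previously observed samples (illustrative estimator)."""
    hist, edges = np.histogram(history, bins=bins)
    p = hist / hist.sum()
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    return -np.log2(max(p[idx], 1e-12))

def learning_information(y, n_taps=4, mu=0.1):
    """Learning-based information L(k): magnitude of the weight increments of
    an incrementally (gradient-descent) adapted linear predictor; unusually
    large |dw| means the current sample departs from the learned governing law."""
    w = np.zeros(n_taps)
    L = np.zeros(len(y))
    for k in range(n_taps, len(y)):
        xk = np.asarray(y[k - n_taps:k])[::-1]   # recent history as predictor input
        e = y[k] - w @ xk                        # prediction error
        dw = mu * e * xk / (1.0 + xk @ xk)       # normalized gradient-descent update
        w += dw
        L[k] = np.abs(dw).sum()                  # learning effort at sample k
    return L
```

A sample that is rare in amplitude carries high \(I\) whether or not a predictor can already explain it, whereas \(L\) stays low for samples that obey an already learned governing law and rises only when that law changes.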


Abbreviations

\(AISLE\): Approximate Individual Sample Learning Entropy
\(\alpha\): Vector of re-sampling setups for estimation of \(VFD\)
\(\beta\): Vector of detection sensitivities for \(LE\)
\(H\): Shannon entropy
\(k\): Discrete index of time
\(LE\): Learning Entropy (\(AISLE\))
\(n\): Number of channels
\(n_{w_i}\): Length of vector \(\mathbf{w}_i\)
\(r\): Order of \(LE\)
\(\sigma(.)\): Standard deviation
\(VFD\): Variance fractal dimension
\(\mathbf{w}_i\): Vector of all adaptive parameters (neural weights) of the \(i\)-th channel
\(x\), \(\mathbf{x}\), \(\mathbf{X}\): Scalar, vector, matrix (multidimensional array)
\(y_i(k)\): Measured data sample of the \(i\)-th channel at time \(k\)
\(\tilde{y}_i(k)\): Filter output (neural predictor output) of the \(i\)-th channel at time \(k\)
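To connect the nomenclature above, a minimal sketch of the \(AISLE\) evaluation follows. It reflects the published Learning Entropy idea as we read it (per-sample counting of unusually large weight increments against their recent mean magnitude, over the sensitivity vector \(\beta\)); the window length, the \(\beta\) values, the predictor producing the increments, and the array layout are illustrative assumptions.

```python
import numpy as np

def aisle(dW, beta=(2.0, 4.0, 8.0, 16.0), m=100):
    """Per-sample Learning Entropy L(k) ~ AISLE: the fraction of weight
    increments that are unusually large compared with their recent mean
    magnitude, evaluated over the sensitivity vector beta.

    dW   : (K x nw) array of weight increments collected while adaptive
           predictors (one per channel, weights w_i) learn the data;
           for order r > 1, use the r-th order differences of the weights.
    beta : vector of detection sensitivities
    m    : floating-window length defining the 'usual' increment magnitude
    """
    dW = np.abs(np.asarray(dW))
    K, nw = dW.shape
    le = np.zeros(K)
    for k in range(m, K):
        usual = dW[k - m:k].mean(axis=0)              # recent mean |dw| per weight
        hits = sum(np.sum(dW[k] > b * usual) for b in beta)
        le[k] = hits / (nw * len(beta))               # normalized to [0, 1]
    return le
```

The \(LE\) of a whole data series is then the mean of this per-sample value, in line with the definition recalled in the abstract.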


Acknowledgment

The research reported in this paper has been funded by the European Interreg Austria-Czech Republic project “PredMAIn (ATCZ279)”.

The anonymized real EEG dataset is courtesy of the Department of Neurology, Faculty of Medicine in Hradec Kralove, Charles University, Czech Republic.

Author information

Correspondence to Ivo Bukovsky.

Appendix

(See Figs. 5, 6 and 7)

Fig. 4. Comparison of AISLE with the other compared methods on artificial multichannel data (15) with noisy intervals at k = 400…500 and k = 700…800; covDEMT \(\left\| \mathbf{covX}^{\Delta} \right\|\) does not detect anything (due to the various frequencies and phases of the sinusoidal channels), VFDT and H detect the changes only gradually, and AISLE instantly detects the increased learning effort when new dynamical behavior appears at k > 400 and k > 700.
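For reference, the covDEMT trajectory \(\left\| \mathbf{covX}^{\Delta} \right\|\) can be read as the norm of the covariance matrix of the current data window; a minimal sketch under that reading follows (window length, step, and the choice of the Frobenius norm are assumptions, not the authors' settings).

```python
import numpy as np

def cov_norm_trajectory(X, win=256, step=16):
    """covDEMT-style trajectory: norm of the covariance matrix of the
    n-channel data X (shape: samples x channels) over a floating window."""
    X = np.asarray(X)
    ks, vals = [], []
    for k in range(win, X.shape[0] + 1, step):
        C = np.atleast_2d(np.cov(X[k - win:k, :], rowvar=False))  # n x n covariance
        ks.append(k)
        vals.append(np.linalg.norm(C, ord='fro'))                 # ||covX|| for this window
    return np.array(ks), np.array(vals)
```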

Fig. 5. The data Y are EEG channels with a known epileptic seizure starting at about t = 44 s; its detection with the discussed methods is shown in the following figures.

Fig. 6. Evaluation of the multichannel EEG data (Fig. 5): covDEMT does not indicate the seizure onset at t = 44 s, VFDT increases only gradually, and both H and AISLE indicate the seizure onset rather instantly. The detail of how early H vs. AISLE detect the seizure is shown in Fig. 7.
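The VFDT compared here can be sketched, for a single channel, as the variance fractal dimension of each floating window, estimated from the log-log slope of increment variances over the re-sampling lags \(\alpha\); the lags, window length, and step below are illustrative assumptions.

```python
import numpy as np

def vfd(window, alphas=(1, 2, 4, 8)):
    """Approximate variance fractal dimension of one data window:
    Var[x(k+a) - x(k)] ~ a**(2H)  =>  H = slope/2,  VFD = 2 - H  (E = 1)."""
    log_a, log_v = [], []
    for a in alphas:
        v = np.var(window[a:] - window[:-a])
        if v > 0:
            log_a.append(np.log(a))
            log_v.append(np.log(v))
    if len(log_a) < 2:
        return np.nan                        # degenerate (e.g. constant) window
    slope = np.polyfit(log_a, log_v, 1)[0]   # least-squares log-log slope
    return 2.0 - 0.5 * slope

def vfd_trajectory(y, win=256, step=16, alphas=(1, 2, 4, 8)):
    """Floating-window VFD trajectory of a single-channel series y."""
    y = np.asarray(y, dtype=float)
    ks = np.arange(win, len(y) + 1, step)
    return ks, np.array([vfd(y[k - win:k], alphas) for k in ks])
```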

Fig. 7. (Detail of Fig. 6) Both methods, i.e., the DEM-based Shannon entropy trajectory H and the Approximate Individual Sample Learning Entropy (AISLE), were found practically feasible for early seizure indication, while AISLE has the potential for earlier detection (bottom axis) because it is, in principle, more sensitive to individual data samples.
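Reading "DEM" as the floating data-window matrix, the Shannon entropy trajectory H can be sketched per window from a histogram estimate of the sample distribution; the bin count, window length, and averaging across channels below are illustrative assumptions.

```python
import numpy as np

def window_entropy(window, bins=16):
    """Shannon entropy H = -sum p*log2(p) of one data window, with p
    estimated from a histogram (illustrative estimator)."""
    hist, _ = np.histogram(window, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return -(p * np.log2(p)).sum()

def entropy_trajectory(X, win=256, step=16, bins=16):
    """Floating-window Shannon entropy trajectory, averaged over the
    channels of X (shape: samples x channels)."""
    X = np.asarray(X)
    ks = np.arange(win, X.shape[0] + 1, step)
    H = [np.mean([window_entropy(X[k - win:k, i], bins)
                  for i in range(X.shape[1])]) for k in ks]
    return ks, np.array(H)
```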


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Bukovsky, I., Budik, O. (2022). Learning Entropy: On Shannon vs. Machine-Learning-Based Information in Time Series. In: Kotsis, G., et al. Database and Expert Systems Applications - DEXA 2022 Workshops. DEXA 2022. Communications in Computer and Information Science, vol 1633. Springer, Cham. https://doi.org/10.1007/978-3-031-14343-4_38


  • DOI: https://doi.org/10.1007/978-3-031-14343-4_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-14342-7

  • Online ISBN: 978-3-031-14343-4

  • eBook Packages: Computer Science (R0)
