Abstract
The paper discusses learning-based information (\(L\)) and Learning Entropy (\(LE\)) in contrast to the classical Shannon probabilistic information (\(I\)) and probabilistic entropy (\(H\)). It is shown that \(L\) corresponds to the recently introduced Approximate Individual Sample Learning Entropy (\(AISLE\)). For data series, \(LE\) should then be defined as the mean value of \(L\), which finally brings it into proper accordance with Shannon's concept of entropy \(H\). The distinction between \(L\) and \(I\) is explained via real-time anomaly detection of individual time-series data points (states). First, the principal distinction between the information concepts \(I\) and \(L\) is demonstrated with respect to the data-governing law, which \(L\) considers explicitly (while \(I\) does not). Second, it is shown that \(L\) can be applied to much shorter datasets than \(I\), because the learning system is pre-trained and able to generalize from a smaller dataset. Then, floating-window trajectories of the covariance matrix norm, the trajectory of the approximate variance fractal dimension, and especially the windowed Shannon entropy trajectory are compared to \(LE\) on multichannel EEG featuring an epileptic seizure. The results on real time series show that \(L\), i.e., \(AISLE\), can be a useful counterpart to Shannon entropy that also allows a more detailed search for anomaly onsets (change points).
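To make the concept concrete, the following minimal Python sketch evaluates \(L(k)\), i.e., \(AISLE\), for a single channel under simplifying assumptions: a linear predictor adapted by normalized gradient descent (NLMS), first-order weight increments (\(r=1\)), and illustrative values of the floating-window length and the sensitivity vector \(\beta\). The function names and parameter values are hypothetical and do not reproduce the paper's exact setup (e.g., pre-training and warm-up of the floating window are omitted).

```python
import numpy as np

def aisle(dw_window, dw_now, betas):
    """AISLE at one sample k: the fraction of weight increments |dw_j(k)| that
    exceed beta times their recent floating-window average, averaged over the
    vector of detection sensitivities beta (illustrative definition)."""
    mean_dw = dw_window.mean(axis=0) + 1e-12        # recent average |dw| per weight
    return float(np.mean([np.mean(dw_now > b * mean_dw) for b in betas]))

def learning_information_trajectory(y, n_in=10, mu=0.1, m_win=100, betas=(2, 4, 8, 16)):
    """Single-channel illustration: a linear predictor adapted by normalized
    gradient descent (NLMS); L(k) is the learning-based information of y(k)."""
    w = np.zeros(n_in)                              # adaptive weights
    dw_buf = np.zeros((m_win, n_in))                # floating window of |dw|
    L = np.zeros(len(y))
    for k in range(n_in, len(y)):
        x = y[k - n_in:k][::-1]                     # input vector of past samples
        e = y[k] - w @ x                            # prediction error y(k) - y~(k)
        dw = mu * e * x / (x @ x + 1e-12)           # NLMS weight increment
        w = w + dw
        L[k] = aisle(dw_buf, np.abs(dw), betas)     # learning-based information L(k)
        dw_buf = np.roll(dw_buf, -1, axis=0)
        dw_buf[-1] = np.abs(dw)
    return L
```

In this reading, the Learning Entropy of a data series then follows as the mean of the returned \(L(k)\) trajectory.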
Abbreviations
- \(AISLE\): Approximate Individual Sample Learning Entropy
- \(\alpha\): Vector of re-sampling setups for estimation of \(VFD\)
- \(\beta\): Vector of detection sensitivities for \(LE\)
- \(H\): Shannon Entropy
- \(k\): Discrete index of time
- \(LE\): Learning Entropy (\(AISLE\))
- \(n\): Number of channels
- \(n_{w_i}\): Length of vector \(\mathbf{w}_i\)
- \(r\): Order of \(LE\)
- \(\sigma(\cdot)\): Standard deviation
- \(VFD\): Variance Fractal Dimension
- \(\mathbf{w}_i\): Vector of all adaptive parameters (neural weights) of the \(i\)-th channel
- \(x\), \(\mathbf{x}\), \(\mathbf{X}\): Scalar, vector, matrix (multidimensional array)
- \(y_i(k)\): Measured data sample of the \(i\)-th channel at time \(k\)
- \(\tilde{y}_i(k)\): Filter output (neural predictor output) of the \(i\)-th channel at time \(k\)
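As a classical baseline counterpart to \(LE\), the floating-window Shannon entropy trajectory \(H(k)\) compared in the experiments can be estimated, for example, from a histogram of the most recent samples of a channel \(y_i(k)\). The sketch below is illustrative only; the window length and bin count are assumptions, not the paper's settings.

```python
import numpy as np

def windowed_shannon_entropy(y, win=512, bins=32):
    """Sliding-window Shannon entropy trajectory H(k) of one channel,
    estimated from a histogram of the last `win` samples."""
    H = np.full(len(y), np.nan)
    for k in range(win, len(y)):
        counts, _ = np.histogram(y[k - win:k], bins=bins)
        p = counts[counts > 0] / win               # empirical bin probabilities
        H[k] = -np.sum(p * np.log2(p))             # Shannon entropy in bits
    return H
```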
References
Shannon, C.E.: A mathematical theory of communication. Bell Syst. Tech. J. 27, 379–423 (1948). https://doi.org/10.1002/j.1538-7305.1948.tb01338.x
Markou, M., Singh, S.: Novelty detection: a review—part 1: statistical approaches. Sig. Process. 83, 2481–2497 (2003). https://doi.org/10.1016/j.sigpro.2003.07.018
Markou, M., Singh, S.: Novelty detection: a review—part 2: neural network based approaches. Sig. Process. 83, 2499–2521 (2003). https://doi.org/10.1016/j.sigpro.2003.07.019
Pincus, S.M.: Approximate entropy as a measure of system complexity. Proc. Natl. Acad. Sci. U. S. A. 88, 2297–2301 (1991)
Richman, J.S., Moorman, J.R.: Physiological time-series analysis using approximate entropy and sample entropy. Am. J. Physiol. Heart Circ. Physiol. 278, H2039–H2049 (2000)
Bukovsky, I.: Learning entropy: multiscale measure for incremental learning. Entropy 15, 4159–4187 (2013). https://doi.org/10.3390/e15104159
Bukovsky, I., Kinsner, W., Homma, N.: Learning entropy as a learning-based information concept. Entropy 21, 166 (2019). https://doi.org/10.3390/e21020166
Bukovsky, I., Homma, N.: An approach to stable gradient-descent adaptation of higher order neural units. IEEE Trans. Neural Netw. Learn. Syst. 28, 2022–2034 (2017). https://doi.org/10.1109/TNNLS.2016.2572310
Bukovsky, I., Dohnal, G., Benes, P.M., Ichiji, K., Homma, N.: Letter on convergence of in-parameter-linear nonlinear neural architectures with gradient learnings. IEEE Trans. Neural Netw. Learn. Syst., pp. 1–4 (2021). https://doi.org/10.1109/TNNLS.2021.3123533
Bukovsky, I., Vrba, J., Cejnek, M.: Learning entropy: a direct approach. In: IEEE International Joint Conference on Neural Networks. IEEE, Vancouver (2016)
Mandic, D.P., Goh, V.S.L.: Complex Valued Nonlinear Adaptive Filters: Noncircularity, Widely Linear and Neural Models. Wiley (2009)
Sanei, S., Chambers, J.: EEG Signal Processing. Wiley, Chichester, England; Hoboken, NJ (2007)
Kinsner, W., Grieder, W.: Amplification of signal features using variance fractal dimension trajectory. In: 2009 8th IEEE International Conference on Cognitive Informatics, ICCI 2009, pp. 201–209 (2009). https://doi.org/10.1109/COGINF.2009.5250750
Bukovsky, I., Kinsner, W., Maly, V., Krehlik, K.: Multiscale Analysis of False Neighbors for state space reconstruction of complicated systems. In: 2011 IEEE Workshop on Merging Fields of Computational Intelligence and Sensor Technology (CompSens), pp. 65–72 (2011). https://doi.org/10.1109/MFCIST.2011.5949517
Bukovsky, I., Kinsner, W., Bila, J.: Multiscale analysis approach for novelty detection in adaptation plot. In: Sensor Signal Processing for Defence, SSPD 2012, pp. 1–6 (2012). https://doi.org/10.1049/ic.2012.0114
Vorburger, P., Bernstein, A.: Entropy-based concept shift detection. In: 2006 6th International Conference on Data Mining, ICDM 2006, pp. 1113–1118 (2006). https://doi.org/10.1109/ICDM.2006.66
Amigó, J., Balogh, S., Hernández, S.: A brief review of generalized entropies. Entropy 20, 813 (2018). https://doi.org/10.3390/e20110813
Bereziński, P., Jasiul, B., Szpyrka, M.: An entropy-based network anomaly detection method. Entropy 17, 2367–2408 (2015). https://doi.org/10.3390/e17042367
Mahmoud, S., Martinez-Gil, J., Praher, P., Freudenthaler, B., Girkinger, A.: Deep learning rule for efficient changepoint detection in the presence of non-linear trends. In: Kotsis, G., et al. (eds.) DEXA 2021. CCIS, vol. 1479, pp. 184–191. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-87101-7_18
Acknowledgment
The research reported in this paper has been funded by the European Interreg Austria-Czech Republic project “PredMAIn (ATCZ279)”.
The anonymized real EEG dataset is courtesy of the Department of Neurology, Faculty of Medicine in Hradec Kralove, Charles University, Czech Republic.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG