Abstract
We propose a new method that combines adaptive processes with a class of entropy estimators for streaming data. Starting from an initial estimate obtained from a batch of initial data, the model parameters are re-estimated at each step by combining prior knowledge with the new observation (or block of observations). This extends the maximum entropy technique to a dynamical setting and distinguishes between the entropic contributions of the signal and of the error. Furthermore, it provides a suitable approximation of standard generalized maximum entropy (GME) problems when exact solutions are hard to evaluate. We test the method through numerical simulations at various sample sizes and batch dimensions. We then extend the analysis to intermediate cases between streaming generalized cross entropy (Streaming GCE) and standard GCE, i.e., using blocks of observations of different sizes to update the estimates, and we incorporate collinearity effects as well. The role of time in the balance between the entropic contributions of signal and errors is further explored through a variation of the Streaming GCE algorithm, namely Weighted Streaming GCE. Finally, we discuss the results, highlighting the main characteristics of the method, its range of application, and future perspectives.
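The update scheme described above can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose formulas are not given in this abstract, but one possible reading under stated assumptions: a linear model y = x·β + e, each coefficient and the error expressed as expectations over fixed finite supports, and a cross-entropy update solved one observation at a time, with the posterior weights of one step reused as the prior of the next. The function names (`tilt`, `gce_update`, `streaming_gce`), the supports, the uniform starting prior (in place of an initial batch estimate), and the choice to reset the error prior to uniform at every step are all assumptions of this sketch.

```python
import math


def tilt(prior, support, scale):
    """Return weights proportional to prior[m] * exp(scale * support[m])."""
    # Log-sum-exp form avoids overflow; the 1e-300 floor guards against log(0).
    logits = [math.log(max(q, 1e-300)) + scale * s for q, s in zip(prior, support)]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def gce_update(q, u, z, v, x, y, tol=1e-10):
    """Cross-entropy update for one observation (x, y).

    Minimizes KL(p||q) + KL(w||u) subject to
    y = sum_j x[j] * E_p[z[j]] + E_w[v]. The solution is an exponential
    tilt of the priors with a single multiplier, found here by bisection
    (the fitted value is non-decreasing in the multiplier).
    """
    def fit(lam):
        p = [tilt(qj, zj, lam * xj) for qj, zj, xj in zip(q, z, x)]
        w = tilt(u, v, lam)
        value = sum(xj * dot(zj, pj) for xj, zj, pj in zip(x, z, p)) + dot(v, w)
        return value, p, w

    # Expand the bracket until it contains y, or give up.
    lo, hi = -1.0, 1.0
    for _ in range(60):
        if fit(lo)[0] <= y:
            break
        lo *= 2.0
    for _ in range(60):
        if fit(hi)[0] >= y:
            break
        hi *= 2.0
    if not fit(lo)[0] <= y <= fit(hi)[0]:
        raise ValueError("y is outside the range reachable with these supports")

    for _ in range(200):
        lam = 0.5 * (lo + hi)
        value, p, w = fit(lam)
        if abs(value - y) < tol:
            break
        if value < y:
            lo = lam
        else:
            hi = lam
    return p, w


def streaming_gce(stream, z, v):
    """Process (x, y) pairs one at a time; return the coefficient estimates per step."""
    q = [[1.0 / len(zj)] * len(zj) for zj in z]   # assumed uniform starting prior
    u = [1.0 / len(v)] * len(v)                   # error prior, reset at every step
    estimates = []
    for x, y in stream:
        q, _ = gce_update(q, u, z, v, x, y)       # posterior becomes the next prior
        estimates.append([dot(zj, qj) for zj, qj in zip(z, q)])
    return estimates


if __name__ == "__main__":
    z = [[-5.0, -2.5, 0.0, 2.5, 5.0]] * 2         # hypothetical coefficient supports
    v = [-1.0, 0.0, 1.0]                          # hypothetical error support
    stream = [((1.0, 0.0), 2.0), ((0.0, 1.0), -1.0), ((1.0, 1.0), 1.2)]
    for step, beta in enumerate(streaming_gce(stream, z, v), start=1):
        print(step, [round(b, 3) for b in beta])
```

Because each update matches only the newest observation, the sketch separates what is attributed to the coefficients (signal) from what is absorbed by the error weights at that step. Updating with blocks of observations instead would require one multiplier per observation in the block and a multivariate solver in place of the bisection.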
Ethics declarations
Conflict of interest
The authors declare no conflict of interest.
Additional information
Communicated by M. Squillante.
Cite this article
Angelelli, M., Ciavolino, E. & Pasca, P. Streaming generalized cross entropy. Soft Comput 24, 13837–13851 (2020). https://doi.org/10.1007/s00500-019-04632-w
DOI: https://doi.org/10.1007/s00500-019-04632-w