Abstract
In the frame of designing a knowledge discovery system, we have developed stochastic models based on high-order hidden Markov models. These models are capable to map sequences of data into a Markov chain in which the transitions between the states depend on the n previous states according to the order of the model. We study the process of achieving information extraction from spatial and temporal data by means of an unsupervised classification. We use therefore a French national database related to the land use of a region, named Ter Uti, which describes the land use both in the spatial and temporal domain. Land-use categories (wheat, corn, forest, ...) are logged every year on each site regularly spaced in the region. They constitute a temporal sequence of images in which we look for spatial and temporal dependencies.
The temporal segmentation of the data is done by means of a second-order Hidden Markov Model (HMM2) that appears to have very good capabilities to locate stationary segments, as shown in our previous work in speech recognition. The spatial classification is performed by defining a fractal scanning of the images with the help of a Hilbert–Peano curve that introduces a total order on the sites, preserving the relation of neighborhood between the sites. We show that the HMM2 performs a classification that is meaningful for the agronomists.
Spatial and temporal classification may be achieved simultaneously by means of a two levels HMM2 that measures the a posteriori probability to map a temporal sequence of images onto a set of hidden classes.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Agrawal R, Mannila H, Srikant R, Toivonen H, Verkano AI (1996) Fast discovery of association rules. In: MI Fayyad U, (ed), Advances in Knowledge Discovery and Data Mining, pages 307–328. AAAI Press, Menlo Park
Agrawal R, Srikant R (1995) Mining sequential pattern. In: Eleventh international conference on data engineering (ICDE'95), pp 3–14
Ambroise C, Dang MV, Govaert G (1996) Clustering of spatial data by the EM algorithm. In: GeoENV-96, 1st European conference on geostatistics for environmental application, Lisbonne, pp 20–22
Aycard O, Charpillet F, Fohr D, Mari J-F Sep(1997) Place learning and recognition using hidden markov models. In: Proceedings IEEE-RSJ on international conference on intelligent robots and systems, Grenoble, France, pp 1741–1746
Bahl LR, Brown PF, De Souza PV, Mercer RL (1986) Maximum mutual information estimation of hidden markov model parameters. In: Proceedings of IEEE-international conference on acoustics, speech, and signal processing, Tokyo, pp 49–52
Baker JK (1974) Stochastic modeling for automatic speech understanding. In: Reddy DR (ed), Speech recognition, Academic Press, New York, pp 521–542
Benmiloud B, Pieczynski W (1995) Estimation des paramètres dans les chaînes de Markov cachés et segmentation d'images. Traitement du signal 12(5):433–454
Benoît M, Le Ber F, Mari J-F June(2001) Recherche des successions de cultures et de leurs évolutions : analyse par HMM des données Ter-Uti en Lorraine. Agreste Vision La statistique agricole 31:23–30
Berndt, DJ (1996) Finding patterns in time series. In: Fayyad UM, Piatetsky-Shapiro G, Smyth P,Uthurusamy R (eds) Advances in knowledge discovery and data mining, AAAI Press/The MIT Press, Menlopark, pp 229–248
Bize L, Muri F, Samson F, Rodolphe F, Dusko Ehrlich S, Prum B, Bessières P (1999) Searching gene transfers on bacillus subtilis using hidden markov models. In: RECOMB'99
Brachman RJ, Selfridge PG, Terveen LG, Altman B, Borgida A, Halper F, Kirk T, Lazar A, McGuinness DL, Resnick LA (1993) Integrated support for data archaeology. Int J Intell Cooperative Inf
Bréhélin L, Gascuel O, Caraux G (1999) Apprentissage de séquences de vecteurs booléens à l'aide de modèles de markov cachés avec patterns. Application au test de circuits intégrés. In: Conférence d'apprentissage, pp 25–35
Casterad MA, Herrero J (1998) Irrivol: a method to estimate the yearly and monthly water applied in an irrigation district. Water Resources Research 34:3045–3049
Dempster AP, Laird NM, Rubin DB (1977) Maximum-likelihood from incomplete data via the EM algorithm. J Roy Stat Soc Ser B (methodological) 39:1–38
Geman S, Geman D (1984) Stochastic relaxation, gibbs distribution, and the bayesian restoration of images. IEEE transaction on pattern analysis and machine intelligence, p 6
Hergalant S, Aigle B, Decaris B, Mari J-F, Leblond P (2003) Hmm, an efficient way to detect transcriptional promoters in bacterial genomes? In: European conference on computational biology – ECCB'2003, Paris, Poster in conjonction with the french national conference on Bioinformatics (JOBIM 2003) pp 417–419
Huo Q, Chan C, Leebah C-H (1995) Bayesian adaptive learning of the parameters of hidden markov model for speech recognition. IEEE Trans Acoutics Speech Signal Processing, 3(5):334–345
Adibi J, Shen W-M (2001) Self similar layered hidden markov model. In: Fifth european conference on principles of knowledge discovery in databases, Freiburg, Germany
Jelinek F Apr (1976) Continuous speech recognition by statistical methods. IEEE Trans acoutics, speech signal processing 64(4):532–556
Mari J-F Nov (1985) Reconnaissance de mots enchaînés à l'aide de modèles markoviens discrets. In: Actes congrès AFCET reconnaissance des formes et intelligence artificielle, Grenoble pp 859–867
Mari J-F, Haton J-P, Kriouile A Jan (1997) Automatic word recognition based on second-order hidden markov models. IEEE Trans speech and audio processing, 5:22–25
Mari J-F, Napoli A Sep(1997) Modèles stochastiques pour la classification de signaux temporels. In: Actes des cinquièmes rencontres de la société francophone de classification, Lyon, France pp 51–54
Mari J-F, Schott R Jan(2001) Probabilistic and Statistical Methods in Computer Science. Kluwer Dordrecht
Mesmin D (2002) Estimation de l'assolement d'un territoire. Technical report, Mémoire DESS Statistiques et traitement du signal, Université Blaise Pascal, Clermont–Ferrand, 2002. 73
Meybeck M, de Marsilly G, Fustec E (1998) La Seine en son bassin, fonctionnement d'un système fluvial anthropisé. Elsevier, Amsterdam, 750p
Mignolet C, Schott C, Mari J-F, Benoît M Feb(2003) Typologies des successions de cultures et des techniques culturales dans le bassin de la seine. Rapport intermédiaire, Institute national pour la recherche agronomique (INRA)
Mury F (1997) Comparaison d'algorithmes d'identification de chaînes de Markov cachées et application à la détection de régions homogènes dans les séquences d'ADN. Thèse de doctorat, Université René Descartes, Paris V
Quian W, Titterington DM (1990) Parameter estimation for hidden gibbs chains. Stat Prob Lett 10:49–58
Rabiner L, Juang B (1995) Fundamentals of speech recognition. Prentice Hall
Robert CP, Celeux G, Diebolt J (1993) Bayesian estimation of hidden markov chains: a stochastic implementation. Stat Prob Lett 16:77–83
Saon G (1997) Modèles markoviens uni- et bidimensionnels pour la reconnaissance de l'écriture manuscrite hors-ligne. PhD thesis, Université Henri Poincaré - Nancy I, Vandœuvre-lès-Nancy
Saon G, Belaid A (1996) Recognition of unconstrained handwritten words using markov random fields and HMMs. In: Vth (ed) workshop on frontiers in hand-writing recognition, University of Essex, England
Fine S, Singer Y, Tishby N (1998) The hierarchical hidden markov model: analysis and applications. Machine Learning, pp 41 – 62
Tou JT, Gonzales R (1974) Pattern recognition principles. Addison-Wesley, Reading
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Mari, JF., Ber, F. Temporal and spatial data mining with second-order hidden markov models. Soft Comput 10, 406–414 (2006). https://doi.org/10.1007/s00500-005-0501-0
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00500-005-0501-0