Abstract
Analyzing the relationships among a large number of sequences is a significant problem in many applications, such as business processes, sport, voting, and weblogs. Such analysis is generally based on clustering the sequences and constructing a network of relationships, and interpreting and validating the results requires the knowledge of a domain expert. In this paper, we propose a methodology that provides insight into a sequence dataset prior to the analysis and independently of a domain expert. This information may be used to direct the analysis, identify sequences of interest, and expose special patterns in the sequences. The methodology leverages tools such as the transition matrix, Shannon entropy, the complexity index, and pairwise state occurrence. Because these methods have low computational complexity, the approach can be applied to large datasets and helps identify the subsets that should be inspected more closely with more sophisticated tools. The ability of these tools to extract relevant information was validated on two datasets, one from a business process simulation and the other from a robot soccer game simulation.
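To make two of the tools named in the abstract concrete, the following is a minimal sketch of a first-order transition matrix and a within-sequence Shannon entropy. It is not the authors' implementation: the function names `transition_matrix` and `shannon_entropy`, the toy sequences, and the choice of Python are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def transition_matrix(sequences):
    """Row-normalised first-order transition rates between states.

    `sequences` is an iterable of state lists (hypothetical toy input).
    Rows with no outgoing transitions are left as all zeros.
    """
    states = sorted({s for seq in sequences for s in seq})
    counts = {a: Counter() for a in states}
    for seq in sequences:
        # Count each adjacent pair (state at t, state at t + 1).
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    matrix = {}
    for a in states:
        total = sum(counts[a].values())
        matrix[a] = {b: (counts[a][b] / total if total else 0.0) for b in states}
    return matrix


def shannon_entropy(seq):
    """Shannon entropy, in bits, of the state distribution in one sequence."""
    n = len(seq)
    return -sum((c / n) * log2(c / n) for c in Counter(seq).values())


# Toy data: two short sequences over the states "A" and "B".
toy = [list("ABAB"), list("AABB")]
rates = transition_matrix(toy)
entropies = [shannon_entropy(seq) for seq in toy]
```

Both functions make a single pass over the data, which is consistent with the low computational cost the abstract attributes to these measures. A sequence that stays in one state has entropy zero, so sorting sequences by entropy is one cheap way to flag candidates for closer inspection.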
Acknowledgement
This work was supported by The Ministry of Education, Youth and Sports from the National Programme of Sustainability (NPS II) project “IT4Innovations excellence in science - LQ1602”.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Martinovič, T., Janurová, K., Martinovič, J., Slaninová, K., Svatoň, V. (2019). Sequence Analysis for Relationship Pattern Extraction. In: Saeed, K., Chaki, R., Janev, V. (eds) Computer Information Systems and Industrial Management. CISIM 2019. Lecture Notes in Computer Science, vol 11703. Springer, Cham. https://doi.org/10.1007/978-3-030-28957-7_29
DOI: https://doi.org/10.1007/978-3-030-28957-7_29
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-28956-0
Online ISBN: 978-3-030-28957-7
eBook Packages: Computer Science, Computer Science (R0)