Abstract
In highly complex and flexible environments, event logs tend to be highly heterogeneous, and clustering-based methods are candidate techniques for simplifying the process models mined from them. To compensate for the information lost during clustering, semantic information may be extracted from event logs and organized into knowledge structures such as process ontologies using ontology learning methods. In this article, we propose an overall computational framework for event log pre-processing and then focus on one component of the framework, namely event log aggregation. We develop a detailed system architecture for this component and implement and evaluate a research prototype, SemAgg. We use phrase-based semantic similarity between normalized event names to aggregate event logs in a hierarchical form. We discuss the practical implications of this work for learning lower-level process ontology classes and for further process mining and analytics.
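To make the aggregation idea concrete, the following is a minimal illustrative sketch and not the SemAgg implementation. It assumes that event names are normalized by splitting on case changes and punctuation, and it substitutes a simple token-overlap (Jaccard) score for the phrase-based semantic similarity measure, whose details are not given in this abstract. The function names `normalize`, `phrase_similarity`, and `aggregate`, the average-linkage merging rule, and the `threshold` parameter are all hypothetical choices made for illustration.

```python
import re
from itertools import combinations


def normalize(name):
    """Split an event name into lowercase word tokens (handles camelCase, '_', '-')."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return tuple(re.findall(r"[a-z0-9]+", spaced.lower()))


def phrase_similarity(a, b):
    """Jaccard overlap of token sets. ASSUMPTION: a stand-in for a semantic measure."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def aggregate(event_names, threshold=0.3):
    """Average-linkage agglomerative merging; nested tuples encode the hierarchy."""
    # Each cluster is (tree, member token tuples); start with one cluster per name.
    clusters = [(n, [normalize(n)]) for n in sorted(set(event_names))]

    def linkage(c1, c2):
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        return sum(phrase_similarity(x, y) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Pick the most similar pair of clusters.
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) < threshold:
            break  # nothing left that is similar enough to merge
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [tree for tree, _ in clusters]


if __name__ == "__main__":
    names = ["Submit_Order", "submitOrder", "Cancel Order", "Send Invoice"]
    print(aggregate(names))
```

In this sketch, a higher `threshold` yields flatter, more conservative groupings, while a lower one merges more event names into deeper hierarchies. Replacing `phrase_similarity` with a lexical-resource-based measure would bring the sketch closer to a semantic, rather than purely lexical, aggregation.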
Cite this article
Deokar, A.V., Tao, J. Semantics-based event log aggregation for process mining and analytics. Inf Syst Front 17, 1209–1226 (2015). https://doi.org/10.1007/s10796-015-9563-4
DOI: https://doi.org/10.1007/s10796-015-9563-4