Abstract
The aim of a process discovery algorithm is to construct from event data a process model that describes the underlying, real-world process well. Intuitively, the better the quality of the input event data, the better the quality of the resulting discovered model should be. However, existing process discovery algorithms do not guarantee this relationship. We demonstrate this by using a range of quality measures for both event data and discovered process models. This paper is a call to the community of IS engineers to complement their process discovery algorithms with properties that relate qualities of their inputs to those of their outputs. To this end, we distinguish four incremental stages for the development of such algorithms, along with concrete guidelines for the formulation of relevant properties and experimental validation. We use these stages to reflect on the state of the art, which shows the need to move forward in our thinking about algorithmic process discovery.
Notes
1. The source code is available at: https://github.com/ArchitectureMining/SamplingFramework.
Acknowledgments
Artem Polyvyanyy was in part supported by the Australian Research Council project DP180102839.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
van der Werf, J.M.E.M., Polyvyanyy, A., van Wensveen, B.R., Brinkhuis, M., Reijers, H.A. (2021). All that Glitters Is Not Gold. In: La Rosa, M., Sadiq, S., Teniente, E. (eds) Advanced Information Systems Engineering. CAiSE 2021. Lecture Notes in Computer Science, vol 12751. Springer, Cham. https://doi.org/10.1007/978-3-030-79382-1_9
DOI: https://doi.org/10.1007/978-3-030-79382-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-79381-4
Online ISBN: 978-3-030-79382-1
eBook Packages: Computer Science, Computer Science (R0)