Abstract
Automated process discovery techniques allow us to extract business process models from event logs. The quality of models discovered by these techniques can be assessed with respect to various criteria related to simplicity and accuracy. One of these criteria, namely precision, captures the extent to which the behavior allowed by a process model is observed in the log. While several measures of precision have been proposed, a recent study has shown that none of them fulfills a set of five axioms that capture intuitive properties behind the concept of precision. In addition, existing precision measures suffer from scalability issues when applied to models discovered from real-life event logs. This paper presents a family of precision measures based on the idea of comparing the k-th order Markovian abstraction of a process model against that of an event log. We demonstrate that this family of measures fulfills the aforementioned axioms for a suitably chosen value of k. We also empirically show that representative exemplars of this family of measures outperform a commonly used precision measure in terms of scalability and that they closely approximate two precision measures that have been proposed as possible ground truths.
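The abstract-and-compare idea can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not taken from the paper: the model's behavior is given as a finite collection of traces, a state is a window of at most k consecutive activity labels, and the comparison is a plain ratio of shared edges. The names `markovian_edges`, `naive_precision`, `START` and `END` are hypothetical, and the ratio below is only a stand-in for the measures the paper defines.

```python
from typing import Iterable, Sequence, Set, Tuple

State = Tuple[str, ...]
Edge = Tuple[State, State]

START: State = ("-",)  # hypothetical marker for the artificial start state
END: State = ("+",)    # hypothetical marker for the artificial end state


def markovian_edges(traces: Iterable[Sequence[str]], k: int) -> Set[Edge]:
    """Collect edges between consecutive windows of length k over each trace.

    A trace shorter than k is kept whole as a single state (an assumption
    of this sketch).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    edges: Set[Edge] = set()
    for trace in traces:
        t = tuple(trace)
        if not t:
            edges.add((START, END))
            continue
        if len(t) <= k:
            windows = [t]
        else:
            windows = [t[i:i + k] for i in range(len(t) - k + 1)]
        path = [START, *windows, END]
        # Each pair of neighbouring states on the path becomes one edge.
        edges.update(zip(path, path[1:]))
    return edges


def naive_precision(model_traces, log_traces, k: int) -> float:
    """Share of the model's abstraction edges that also occur in the log's.

    This set-based ratio is an illustrative simplification, not the
    paper's measure.
    """
    model_edges = markovian_edges(model_traces, k)
    if not model_edges:
        return 1.0
    log_edges = markovian_edges(log_traces, k)
    return len(model_edges & log_edges) / len(model_edges)


log = [("a", "b", "c")]
model = [("a", "b", "c"), ("a", "c", "b")]
print(naive_precision(model, log, k=2))  # 0.5: half of the model's edges appear in the log
```

In this toy input the model allows one trace that the log never exhibits, so three of its six second-order edges are unmatched. A larger k makes the abstraction more sensitive to ordering differences, at the cost of a larger state space.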
Notes
1. A third accuracy criterion in automated process discovery is generalization: the extent to which the process model captures behavior that, while not observed in the log, is implied by it.
2. To improve readability, in the rest of this paper we refer to \({\tau }_{\varSigma }\) as \({\tau }\), omitting the set \(\varSigma \).
3. In the case \(\mathscr {B}_{P} = {\varGamma }_{\varSigma }\), \(P\) corresponds to the flower model.
4. The support of a multiset is the set containing the distinct elements of the multiset.
5. The operator \(\oplus \) is the concatenation operator.
6. Formally, \(\exists {\widehat{\tau }}\in \mathscr {B}_{P_2} \setminus \mathscr {B}_{P_1}\), s.t. for \(k^* = \left| {\widehat{\tau }}\right| \Longrightarrow \exists (-, {\widehat{\tau }}) \in E_{P_2} \setminus E_{P_1}\).
7. Available at http://apromore.org/platform/tools.
8. The public data used in the experiments can be found at https://doi.org/10.6084/m9.figshare.6376592.v1.
9. Some values differ from those in [16] as we used each measure’s latest implementation.
References
Adriansyah, A., Munoz-Gama, J., Carmona, J., van Dongen, B., van der Aalst, W.: Measuring precision of modeled behavior. IseB 13(1), 37–67 (2015)
Augusto, A., Conforti, R., Dumas, M., La Rosa, M.: Automated discovery of structured process models from event logs: the discover-and-structure approach. DKE (2017)
Augusto, A., et al.: Automated discovery of process models from event logs: review and benchmark. TKDE (2018, to appear)
Augusto, A., Conforti, R., Dumas, M., La Rosa, M., Polyvyanyy, A.: Split miner: automated discovery of accurate and simple business process models from event logs. KAIS (2018)
Augusto, A., Conforti, R., Dumas, M., La Rosa, M.: Split miner: discovering accurate and simple business process models from event logs. In: IEEE ICDM. IEEE (2017)
Conforti, R., La Rosa, M., ter Hofstede, A.: Filtering out infrequent behavior from business process event logs. IEEE TKDE 29(2), 300–314 (2017)
De Weerdt, J., De Backer, M., Vanthienen, J., Baesens, B.: A robust F-measure for evaluating discovered process models. In: IEEE Symposium on CIDM. IEEE (2011)
Greco, G., Guzzo, A., Pontieri, L., Sacca, D.: Discovering expressive process models by clustering log traces. IEEE TKDE 18(8), 1010–1027 (2006)
Kuhn, H.W.: The Hungarian method for the assignment problem. NRL 2(1–2), 83–97 (1955)
Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from event logs - a constructive approach. In: Colom, J.-M., Desel, J. (eds.) PETRI NETS 2013. LNCS, vol. 7927, pp. 311–329. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38697-8_17
Leemans, S.J.J., Fahland, D., van der Aalst, W.M.P.: Discovering block-structured process models from event logs containing infrequent behaviour. In: Lohmann, N., Song, M., Wohed, P. (eds.) BPM 2013. LNBIP, vol. 171, pp. 66–78. Springer, Cham (2014). https://doi.org/10.1007/978-3-319-06257-0_6
Leemans, S., Fahland, D., van der Aalst, W.: Scalable process discovery and conformance checking. Softw. Syst. Model. (2016)
Muñoz-Gama, J., Carmona, J.: A fresh look at precision in process conformance. In: Hull, R., Mendling, J., Tai, S. (eds.) BPM 2010. LNCS, vol. 6336, pp. 211–226. Springer, Heidelberg (2010). https://doi.org/10.1007/978-3-642-15618-2_16
Rozinat, A., van der Aalst, W.: Conformance checking of processes based on monitoring real behavior. ISJ 33(1), 64–95 (2008)
Tax, N., Lu, X., Sidorova, N., Fahland, D., van der Aalst, W.: The imprecisions of precision measures in process mining. Inf. Process. Lett. 135, 1–8 (2018)
van Dongen, B.F., Carmona, J., Chatain, T.: A unified approach for measuring precision and generalization based on anti-alignments. In: La Rosa, M., Loos, P., Pastor, O. (eds.) BPM 2016. LNCS, vol. 9850, pp. 39–56. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-45348-4_3
vanden Broucke, S., De Weerdt, J.: Fodina: a robust and flexible heuristic process discovery. DSS 100, 109–118 (2017)
Weijters, A., Ribeiro, J.: Flexible heuristics miner (FHM). In: CIDM. IEEE (2011)
Acknowledgements
This research is partly funded by the Australian Research Council (DP180102839) and the Estonian Research Council (IUT20-55).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Augusto, A., Armas-Cervantes, A., Conforti, R., Dumas, M., La Rosa, M., Reissner, D. (2018). Abstract-and-Compare: A Family of Scalable Precision Measures for Automated Process Discovery. In: Weske, M., Montali, M., Weber, I., vom Brocke, J. (eds) Business Process Management. BPM 2018. Lecture Notes in Computer Science, vol 11080. Springer, Cham. https://doi.org/10.1007/978-3-319-98648-7_10
DOI: https://doi.org/10.1007/978-3-319-98648-7_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-98647-0
Online ISBN: 978-3-319-98648-7
eBook Packages: Computer Science (R0)