This repository extends the Event Knowledge Graph for BPIC17 Data repository with a module for multi-perspective concept drift detection. This module reproduces the experiments and results provided in the corresponding paper.
The code assumes that Neo4j is installed.
Install Neo4j:
- Use the Neo4j Desktop (recommended), or
- Neo4j Community Server
PromG should be installed as a Python package using pip: `pip install promg==1.0.10`.

The source code for PromG can be found in the PromG Core GitHub repository.
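To confirm that the pinned version is the one installed in the active Python environment, a small check along these lines can help. This is a minimal sketch using only the standard library; the helper names `installed_version` and `describe_install` are illustrative and are not part of PromG.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

EXPECTED_VERSION = "1.0.10"  # the version pinned in the pip command above


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def describe_install(found: Optional[str], expected: str = EXPECTED_VERSION) -> str:
    """Turn a detected version into a short status message."""
    if found is None:
        return f"promg is not installed; run: pip install promg=={expected}"
    if found != expected:
        return f"promg {found} is installed, but {expected} is expected"
    return f"promg {found} is installed"


if __name__ == "__main__":
    print(describe_install(installed_version("promg")))
```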
- Create a new graph database in Neo4j Desktop
  - Select `+Add` (top right corner)
  - Choose Local DBMS or Remote Connection
  - Follow the prompted steps (the default password we assume is `12345678`)
- Install APOC (see https://neo4j.com/labs/apoc/)
  - Install the Neo4j APOC Core library:
    - Select the database in Neo4j Desktop
    - On the right, click on the `Plugins` tab > Open the `APOC` section > Click the `Install` button
    - Wait until a green check mark shows up next to `APOC`; that means it's good to go!
  - Install the Neo4j APOC Extended library:
    - Download the appropriate release (same version numbers as your Neo4j version)
      - Look for the release that matches the version number of your Neo4j database.
      - Download the file `apoc-[your neo4j version]-extended.jar`
    - Locate the `plugins` folder of your database: Select the Neo4j Server in Neo4j Desktop > Click the three dots > Select `Open Folder` > Select `Plugins`
    - Put `apoc-[your neo4j version]-extended.jar` into the `plugins` folder of your database
    - Restart the server (database)
  - Configure extra settings using the configuration file `$NEO4J_HOME/conf/apoc.conf`
    - Locate the `conf` folder of your database: Select the Neo4j Server in Neo4j Desktop > Click the three dots > Select `Open Folder` > Select `Conf`
    - Create the file `apoc.conf`
    - Add the following line to `apoc.conf`: `apoc.import.file.enabled=true`
- Make sure to allocate enough memory to your database; the advised setting is `dbms.memory.heap.max_size=10G`
  - Select the Neo4j Server in Neo4j Desktop > Click the three dots > Select `Settings`
  - Locate `dbms.memory.heap.max_size=512m`
  - Change `512m` to `10G`
- Configuration: `config.yaml`
  - Set the URI in `config.yaml` to the URI of your server. The default value is `bolt://localhost:7687`.
  - Set the password in `config.yaml` to the password of your server. The default value is `12345678`.
  - Set the import directory in `config.yaml` to the import directory of your Neo4j server. You can determine the import directory as follows:
    - Select the Neo4j Server in Neo4j Desktop > Click the three dots > Select `Open Folder` > Select `Import`
    - This opens the import directory, so you can now copy its path.
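The `apoc.conf` step above can also be scripted instead of editing the file by hand. The following is a minimal sketch, not part of this repository: the function name `ensure_apoc_import_enabled` is hypothetical, and `conf_dir` stands for the `conf` folder that Neo4j Desktop opens via `Open Folder`.

```python
from pathlib import Path

# The single line that the setup steps ask to add to apoc.conf.
APOC_SETTING = "apoc.import.file.enabled=true"


def ensure_apoc_import_enabled(conf_dir: str) -> bool:
    """Make sure apoc.conf in conf_dir contains APOC_SETTING.

    Creates the file if it does not exist and replaces an existing
    line for the same key. Returns True if the file was written,
    False if the setting was already present.
    """
    conf_file = Path(conf_dir) / "apoc.conf"
    lines = []
    if conf_file.exists():
        lines = conf_file.read_text(encoding="utf-8").splitlines()

    if APOC_SETTING in (line.strip() for line in lines):
        return False  # nothing to do

    key = APOC_SETTING.split("=", 1)[0]
    # Drop any other value for the same key, keep everything else as is.
    kept = [line for line in lines if line.split("=", 1)[0].strip() != key]
    kept.append(APOC_SETTING)
    conf_file.write_text("\n".join(kept) + "\n", encoding="utf-8")
    return True
```

Calling `ensure_apoc_import_enabled(...)` with the path of your own `conf` folder leaves other settings in the file untouched. The server needs a restart afterwards to pick up the change.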
We provide data and scripts for BPI Challenge 2017; store the original data in CSV format in the directory `/data`.

The datasets are available from:

Esser, Stefan, & Fahland, Dirk. (2020). Event Data and Queries for Multi-Dimensional Event Data in the Neo4j Graph Database (Version 1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3865222
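Before running the import, it can be useful to confirm that the CSV files are where the scripts expect them. The sketch below only lists the CSV files in a given folder; the helper name `list_csv_files` is illustrative, and the expected file names are deliberately not assumed here.

```python
from pathlib import Path
from typing import List


def list_csv_files(data_dir: str) -> List[str]:
    """Return the sorted names of the CSV files directly inside data_dir."""
    folder = Path(data_dir)
    if not folder.is_dir():
        raise FileNotFoundError(f"data directory not found: {folder}")
    return sorted(
        path.name
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() == ".csv"
    )
```

An empty result for your data directory indicates that the downloaded data still has to be copied there.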
The library can be installed in Python using pip: `pip install promg==1.0.10`.

The source code for PromG can be found in the PromG Core GitHub repository.
- `config.yaml`: Configuration of the database.
- `config_analysis.yaml`: Configuration of the clustering (number of clusters, filter for variants to consider in the clustering) and concept drift analysis (penalty, window size, features to use).
One script, `main.py`, implements all queries for constructing, manipulating, and analyzing the event knowledge graph.

The script implements the following steps:

0. `step_clear_db`: Clears the event knowledge graph.
1. `step_populate_graph`: Imports a normalized event table of BPIC17 from CSV files and executes several data modeling queries to construct an event knowledge graph using the semantic header. See the Event Knowledge Graph for BPIC17 Data repository for more information on the semantic header.
2. `step_build_tasks`: Identifies and constructs task instances. Requires step 1.
3. `step_cluster_tasks`: Clusters and aggregates task instances based on the clustering result. Requires steps 1-2.
4. `step_visualize_clusters`: Visualizes the task instance variants grouped per cluster as an aid for inspecting the result of the task clustering. Requires steps 1-3.
5. `step_process_level_drift_detection`: Detects change points in the entire process based on a set of specified features (see Configuration files). Requires steps 1-3.
6. `step_actor_drift_detection`: Detects change points at the level of individual actors based on a set of specified features (see Configuration files). Requires steps 1-3.
7. `step_collab_drift_detection`: Detects change points at the level of actor interactions (see Configuration files). Requires steps 1-3.
8. `step_compare_to_process_drift`: Compares detected change points in specific actors or collabs (set in `config_analysis.yaml`) to detected process-level change points and writes the results to a text file. Requires steps 1-3 and 5-7.
9. `step_calculate_magnitude_signal_changes`: Calculates the magnitude of the signal changes for each feature in the specified feature set for each detected process-level change point. Requires steps 1-3 and 5.
10. `step_calculate_overall_average_max_signal_change`: Calculates the average and maximum signal change across all change points and all features for each feature set. Requires steps 1-3, 5 and 8.
11. `step_detailed_change_signal_analysis`: Outputs a detailed comparison of task/task variant features vs. control-flow-based features; see Table 2 in the corresponding paper. Requires steps 1-3, 5, 9 and 10.

All steps specified in `main.py` can be switched on/off. Please note that most steps assume graph constructs or other results from preceding steps.
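Because steps can be toggled independently, it is easy to enable a step whose prerequisites are switched off. The sketch below shows one way to spot this before a run. It is an illustration only and does not reflect how `main.py` itself handles the switches: the helper `switched_off_prerequisites` is hypothetical, and the step numbers assume that `step_clear_db` is step 0 and `step_populate_graph` is step 1, which is the numbering the "Requires" notes appear to use.

```python
from typing import Dict, Iterable, List

# Step names in the order listed above; the list index is the step number
# (assumption: step_clear_db is step 0, step_populate_graph is step 1).
STEPS = [
    "step_clear_db",
    "step_populate_graph",
    "step_build_tasks",
    "step_cluster_tasks",
    "step_visualize_clusters",
    "step_process_level_drift_detection",
    "step_actor_drift_detection",
    "step_collab_drift_detection",
    "step_compare_to_process_drift",
    "step_calculate_magnitude_signal_changes",
    "step_calculate_overall_average_max_signal_change",
    "step_detailed_change_signal_analysis",
]

# Prerequisites transcribed as stated in the "Requires ..." notes above.
REQUIRES: Dict[int, List[int]] = {
    2: [1],
    3: [1, 2],
    4: [1, 2, 3],
    5: [1, 2, 3],
    6: [1, 2, 3],
    7: [1, 2, 3],
    8: [1, 2, 3, 5, 6, 7],
    9: [1, 2, 3, 5],
    10: [1, 2, 3, 5, 8],
    11: [1, 2, 3, 5, 9, 10],
}


def switched_off_prerequisites(enabled: Iterable[str]) -> Dict[str, List[str]]:
    """Map each enabled step to its prerequisites that are not enabled."""
    enabled_set = set(enabled)
    unknown = enabled_set - set(STEPS)
    if unknown:
        raise ValueError(f"unknown step names: {sorted(unknown)}")

    report: Dict[str, List[str]] = {}
    for number, name in enumerate(STEPS):
        if name not in enabled_set:
            continue
        off = [STEPS[r] for r in REQUIRES.get(number, []) if STEPS[r] not in enabled_set]
        if off:
            report[name] = off
    return report
```

A non-empty report is not necessarily an error: results of earlier runs remain in the graph database, so a switched-off prerequisite is fine as long as it has been executed before and the graph has not been cleared since.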
- Set the configuration in `config.yaml`.
  - For database settings, see Create a new graph database.
  - Make sure the set password matches that of the Neo4j server.
- Set the configuration in `config_analysis.yaml`.
- Turn desired steps on/off in `main.py`.
- Start the Neo4j server.
- Run `main.py`.
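A common reason for `main.py` failing right at the start is that the Neo4j server is not running yet. The following sketch checks whether anything is listening on the Bolt address before the script is started. It uses only the standard library, the function names are illustrative, and it tests the TCP port only; it does not verify the password or that the listener is actually Neo4j.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse

DEFAULT_URI = "bolt://localhost:7687"  # default URI mentioned for config.yaml
DEFAULT_PORT = 7687


def bolt_endpoint(uri: str = DEFAULT_URI) -> Tuple[str, int]:
    """Split a bolt:// URI into (host, port), falling back to port 7687."""
    parsed = urlparse(uri)
    host = parsed.hostname or "localhost"
    port = parsed.port or DEFAULT_PORT
    return host, port


def server_is_listening(uri: str = DEFAULT_URI, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the Bolt endpoint succeeds."""
    try:
        with socket.create_connection(bolt_endpoint(uri), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if server_is_listening():
        print("Neo4j port is reachable; main.py can be started.")
    else:
        print("Nothing is listening on the Bolt port; start the Neo4j server first.")
```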