Declare Miner

Declarative Process Mining

The discovery of declarative models in general, and of Declare models in particular, addresses two important drawbacks of existing process discovery techniques:

  • the produced models tend to be large and complex, especially in flexible environments where process executions involve multiple alternatives
  • they offer limited possibilities to guide the mining process towards specific properties of interest

In a declarative model, the process behaviour is described as a set of rules that must be satisfied during the process execution. The discovery of declarative models can easily be guided by choosing the rule templates of interest.

In [1], the authors propose an approach for the discovery of Declare models allowing analysts to specify which kinds of templates they are interested in. This feature allows analysts to shape the discovery process to extract the properties that are most relevant for them. The proposed approach has been implemented as a plug-in in the process mining tool ProM: the Declare Miner.

Declare Miner

The Declare Miner is part of ProM 6.1. The Declare Miner plug-in allows users to discover a Declare model from a log by specifying a number of settings. There are two versions of the plug-in. The first one, the Declare Miner, requires a Declare language as input. The second one, the Declare Miner Default, uses a predefined Declare language and does not require any language as input.

The functionality of the plug-in can best be described by an example.

Example use of the plug-in

Since the miner operates on a log, a log must first be loaded in ProM. The Declare Miner discovers meaningful Declare models only when the log contains atomic activities, so the user may need to filter the log first, for instance to extract all the events with the same event type.
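The filtering step described above can be sketched outside ProM as follows. This is only an illustration, not the ProM filter itself: it assumes each event is a dictionary carrying the standard XES attribute keys `concept:name` and `lifecycle:transition`, and the function name `keep_event_type` is hypothetical.

```python
def keep_event_type(log, event_type="complete"):
    """Keep only the events whose lifecycle transition equals event_type.

    `log` is assumed to be a list of traces, each trace a list of event dicts.
    """
    return [
        [event for event in trace
         if event.get("lifecycle:transition") == event_type]
        for trace in log
    ]


# A tiny hand-made log: every activity has a "start" and a "complete" event.
example_log = [
    [
        {"concept:name": "register", "lifecycle:transition": "start"},
        {"concept:name": "register", "lifecycle:transition": "complete"},
        {"concept:name": "pay", "lifecycle:transition": "start"},
        {"concept:name": "pay", "lifecycle:transition": "complete"},
    ],
]

atomic_log = keep_event_type(example_log)
activity_names = [event["concept:name"] for event in atomic_log[0]]
```

After filtering, each activity appears once per execution, so the log can be treated as a sequence of atomic activities.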

To use the Declare Miner, it is also necessary to import an XML file specifying the Declare language that will be used to generate the constraints in the discovered model. This XML file is part of the Declare System. It is called template.xml and is located in the sub-directory xml of the directory where the Declare System has been installed. Note that this file is automatically updated by the Declare Designer when new templates are defined.
 
A standard version of this XML file is available for download.

A second XML file specifying a Declare language must also be imported. In this file, the templates (with the same names as in the first file) are associated with the formulas used to identify the interesting witnesses for a constraint. An interesting witness for a constraint is a process instance where the constraint is not only satisfied but also activated (see [1] for more details).

The Declare Miner Default does not require the user to specify any Declare language as input. The only input required is the log, and the plug-in uses standard Declare as its language.

After the Declare Miner plug-in has been started, it asks the user for the necessary settings. In particular, the following dialog appears:  

In this dialog, it is possible to specify the value of the metric Percentage of Events (PoE). This metric allows users to specify the percentage of event classes to be used to generate the discovered constraints. For instance, assuming that PoE=40%, the discovered constraints will only involve the most frequent 40% of the event classes available in the log. This is useful when users are not interested in discovering constraints involving event classes that rarely occur in the log.

The dialog also shows the list of all the event classes available in the log ordered by frequency (the frequency of each event class is specified in square brackets). This list allows users to select one by one the event classes to be considered in the discovery process.

The metric PoE can also be used to speed up the discovery process, since the execution time of the underlying algorithm grows exponentially with the number of event classes considered.
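The effect of PoE can be illustrated with a short sketch. It assumes a log represented as a list of traces, each a list of event class names. The function name `select_event_classes`, the rounding up of the cut-off, and the alphabetical tie-breaking are assumptions made for the example; the plug-in may handle these details differently.

```python
import math
from collections import Counter


def select_event_classes(traces, poe_percent):
    """Return the most frequent event classes, keeping poe_percent of them."""
    counts = Counter(event for trace in traces for event in trace)
    # Most frequent first; ties are broken alphabetically (an assumption).
    ranked = sorted(counts, key=lambda name: (-counts[name], name))
    # Rounding up is an assumption, not documented plug-in behaviour.
    keep = math.ceil(len(ranked) * poe_percent / 100)
    return ranked[:keep]


example_log = [
    ["register", "check", "pay"],
    ["register", "check", "reject"],
    ["register", "check"],
    ["register", "pay", "archive"],
]

# Five event classes in total, so PoE=40% keeps the two most frequent ones.
selected = select_event_classes(example_log, 40)
```

Here `register` occurs four times and `check` three times, so with PoE=40% only constraints over these two event classes would be generated.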

After selecting the event classes to be used to generate the discovered model, it is possible to move to the next dialog:

In this dialog, it is possible to specify which templates (among the ones included in the specified Declare language) will be used to generate the Declare constraints in the discovered model.

In particular, when a template is selected in the list on the top-right hand side of the dialog, a description of the selected template is shown at the bottom. A panel is also shown to tune the mining settings for that specific template (see below for more details).

On the top-left hand side of the dialog, it is possible to select or deselect all the templates in the list and to specify further mining options. In particular, it is possible to include all the mined constraints in a single view (combined model option) or to group them by template and show each group in a different view (multiple models option).

Tune the level of interest

After every prefix of a process instance, a constraint can be evaluated in one of four ways: satisfied, when the constraint is satisfied and cannot be violated anymore; temporarily satisfied, when the constraint is satisfied but can be violated in the future; temporarily violated, when the constraint is violated but can be brought back to a satisfied state by executing a further sequence of events; violated, when the constraint is violated regardless of future events. Accordingly, three kinds of LTL semantics for truncated paths can be defined:

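  • weak semantics: temporarily satisfied and temporarily violated are both treated as satisfied
  • neutral semantics: temporarily satisfied is treated as satisfied, and temporarily violated as violated
  • strong semantics: temporarily satisfied and temporarily violated are both treated as violated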

In the yellow panel (tune the level of interest) of the dialog in the picture above, it is possible to set the semantics to be associated with each template. Note that the neutral semantics is the typical LTL semantics for a Declare template. The weak semantics allows for more flexibility in the discovery process but less reliability, because everything temporarily satisfied or temporarily violated is considered satisfied. The strong semantics, by contrast, guarantees strong reliability but less flexibility, because everything temporarily satisfied or temporarily violated is considered violated. Finally, it is good practice to use the weak semantics when the log is known to contain partial process instances. In this case, it is desirable to discover a constraint even if it is temporarily violated in some process instances, because a temporarily violated constraint in a partial process instance can become satisfied in its continuation.
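The way the three semantics collapse the four possible evaluations into a plain satisfied/violated verdict can be sketched as follows. The names `Truth` and `resolve` are hypothetical and serve only this illustration; they are not part of the plug-in.

```python
from enum import Enum


class Truth(Enum):
    SATISFIED = "satisfied"
    TEMP_SATISFIED = "temporarily satisfied"
    TEMP_VIOLATED = "temporarily violated"
    VIOLATED = "violated"


def resolve(value, semantics):
    """Return True (satisfied) or False (violated) under the given semantics."""
    if value is Truth.SATISFIED:
        return True
    if value is Truth.VIOLATED:
        return False
    # Only the two temporary values depend on the chosen semantics.
    if semantics == "weak":
        return True
    if semantics == "strong":
        return False
    if semantics == "neutral":
        return value is Truth.TEMP_SATISFIED
    raise ValueError(f"unknown semantics: {semantics}")


# A truncated trace leaves a constraint temporarily violated:
verdicts = {s: resolve(Truth.TEMP_VIOLATED, s)
            for s in ("weak", "neutral", "strong")}
```

For a temporarily violated constraint, only the weak semantics yields a satisfied verdict, which is why it suits logs containing partial process instances.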

In the same (yellow) panel, it is possible to specify, for each template, one of the following metrics:

  • Percentage of Instances (PoI). This metric can be used to specify (for the selected template) that a Declare constraint can be discovered even if it is not satisfied in all the process instances of the log. For example, assuming that PoI=95%, a constraint will be discovered if it is satisfied in at least 95% of the process instances. This metric lets users deal with noisy data, since infrequent behaviour that can likely be considered noise is discarded.
  • Percentage of Interesting Witnesses (PoIW). This metric can be used to specify (for the selected template) that a Declare constraint can be discovered only if the log includes at least a given percentage of interesting witnesses for that constraint. For example, assuming that PoIW=10%, a constraint will be discovered only if at least 10% of the process instances in the log are interesting witnesses for it, that is, instances in which it is both activated and satisfied.
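The two thresholds can be illustrated for a single candidate constraint, response(a, b), which requires every occurrence of a to be eventually followed by b. The sketch below is not the plug-in's algorithm. It assumes complete traces, assumes that response(a, b) is activated whenever a occurs, and counts an interesting witness as a trace in which the constraint is both satisfied and activated. All function names are hypothetical.

```python
def response_satisfied(trace, a, b):
    """True if every occurrence of a is eventually followed by b."""
    pending = False
    for event in trace:
        if event == a:
            pending = True
        elif event == b:
            pending = False
    return not pending


def metrics(log, a, b):
    """Return (percentage satisfied, percentage of interesting witnesses)."""
    satisfied = [t for t in log if response_satisfied(t, a, b)]
    witnesses = [t for t in satisfied if a in t]  # satisfied and activated
    return 100 * len(satisfied) / len(log), 100 * len(witnesses) / len(log)


def is_discovered(log, a, b, poi, poiw):
    """Apply the PoI and PoIW thresholds to response(a, b)."""
    pct_satisfied, pct_witnesses = metrics(log, a, b)
    return pct_satisfied >= poi and pct_witnesses >= poiw


# 2 witnesses, 7 vacuously satisfied traces, 1 violating trace.
example_log = [["a", "b"]] * 2 + [["c"]] * 7 + [["a", "c"]]
result = metrics(example_log, "a", "b")
```

In this log the constraint is satisfied in 90% of the traces but only 20% of them are interesting witnesses, so it passes PoI=90% with PoIW=10% and is rejected when PoI is raised to 95% or PoIW to 30%.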

By default, the Declare Miner produces a model whose constraints are generated considering all the event classes in the log, are satisfied in 100% of the process instances, and have an unspecified number of interesting witnesses.

When the templates to be used to generate the discovered model have been selected and the settings for each of them have been tuned, it is possible to start the discovery. The discovery process may take some time, especially for logs including a large number of event classes.

After the discovery has finished, a Declare Model object will be visualized.

If the multiple models option has been selected, different tabs will be visualized, each tab containing all the discovered constraints for a given template:

If the combined (single) model option has been selected, a single model will be visualized including all the discovered constraints for the selected templates:

The discovered Declare model can be exported to the file system. The files generated by the export plug-in can be opened and modified using the Declare Designer. 

Declare2LTL for Declare models checking

The Declare2LTL plug-in is also included in ProM 6.1. To check a log with respect to a discovered Declare model, the Declare model can be converted into an LTL model using the Declare2LTL plug-in. The plug-in takes a Declare model as input and generates the corresponding LTL model, which can be used as input to the LTL Checker (also available in ProM 6.1).
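To give an idea of what such a conversion involves, the sketch below maps a few common Declare templates to their usual LTL formulas (F for "eventually", G for "always", U for "until"). It is only an illustration: the textual syntax, the dictionary `TEMPLATES`, and the functions `to_ltl` and `model_to_ltl` are assumptions and do not reflect the actual output format of the Declare2LTL plug-in.

```python
# Usual LTL readings of a few Declare templates (output syntax is assumed).
TEMPLATES = {
    "existence": "F({A})",
    "responded existence": "F({A}) -> F({B})",
    "response": "G({A} -> F({B}))",
    "precedence": "(!{B} U {A}) | G(!{B})",
}


def to_ltl(template, a, b=None):
    """Instantiate the LTL formula of a template for concrete activities."""
    return TEMPLATES[template].format(A=a, B=b)


def model_to_ltl(constraints):
    """A model holds when all its constraints hold, so conjoin the formulas."""
    return " & ".join(f"({to_ltl(*constraint)})" for constraint in constraints)


example_model = [
    ("existence", "register"),
    ("response", "register", "pay"),
]
formula = model_to_ltl(example_model)
```

The resulting formula for the example model is the conjunction of one formula per constraint, which is the kind of input an LTL checker evaluates against each process instance.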

[1] F. M. Maggi, A. J. Mooij, and W. M. P. van der Aalst, “User-Guided Discovery of Declarative Process Models,” in 2011 IEEE Symposium on Computational Intelligence and Data Mining, 2011.