
Quality Dimensions

azaras
edited June 2013 in - Development

Hello Dear friends,

I have a problem using the ProM algorithms. Please help me so that I can write my paper.

I want to compute the quality dimension values, but I don't know which packages I must use.

As in the attached picture, I want to obtain a table that contains fitness, simplicity, precision and generalization. I used the compute-fitness packages in ProM 6 to calculate fitness, but I don't know what to do for the other dimensions. Please guide me step by step in computing these values.

 

I also have another problem that I hope you can help me with.

I generated event logs from one dataset with the Nitro and XESame software.

When I run the heuristics miner on the event log generated by Nitro and compute fitness, I obtain 1, while on the event log generated by XESame I obtain -2.49. When I do the same with the genetic algorithm, the obtained values also differ.

Why do the fitness values differ so much? The resulting models are also different, although I selected the same attributes when generating the event logs.

Please answer my questions, dear friends.

 

Thanks so much.

 

Answers

  • Dear Azaras,

    There are two ways you can obtain values for the different quality dimensions.

    The first way is probably the easiest for you. You can obtain the replay fitness metric by running the conformance checker of Arya Adriansyah. Precision and Generalization scores can be obtained using the ET Conformance plugin.
    For these plug-ins you need a Petri net and corresponding event log.
    Simplicity should be calculated manually, depending on how you want to measure this quality dimension.

    The quality dimensions shown in the picture you attached are from a paper that we wrote [1].
    There the four quality dimensions (replay fitness, precision, generalization and simplicity) are calculated on an internal process model format called process trees.
    Hence, to reproduce this, you need to translate your input process model (Petri net, BPMN model, ...) to our process tree notation. Then, using methods you can find in the EvolutionaryTreeMiner [2] package in ProM (development), you can obtain the figures.

    As to your second question: I don't know why the heuristics miner returns a negative fitness value. Please also be aware that the fitness value returned by the heuristics miner should not be compared to the replay fitness from Adriansyah's alignment-based conformance checker, since they are calculated in completely different ways.
    It is however quite logical that different mining algorithms result in different replay fitness values since the resulting process models are different.

    Hope this helps, good luck with your paper.

    References:
    [1] On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery by
    JCAM Buijs, Boudewijn F van Dongen, Wil MP van der Aalst
    Via: http://link.springer.com/chapter/10.1007/978-3-642-33606-5_19 and/or http://wwwis.win.tue.nl/~wvdaalst/publications/p688.pdf
    [2] EvolutionaryTreeMiner ProM6 package development source code: https://svn.win.tue.nl/trac/prom/browser/Packages/EvolutionaryTreeMiner/Trunk
    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology
  • Hi, dear Jbuijs,

    Thanks for your guidance. I will do what you suggested and ask my questions after that.

    Best regards,

    azar.

  • Hello,

    Is there a formula for generalization, simplicity and precision, as there is for fitness? If yes, please give me references for them.

    thanks.

  • Dear JBuijs,

    Hi,

    I want to use the EvolutionaryTreeMiner package to obtain the quality dimensions, but I don't know how to use it in ProM. Can you guide me step by step?

    How do I add this package to ProM? The link above is the source code of the EvolutionaryTreeMiner package, and I don't know how to add it to ProM.

    Please help me.

    Thanks for your attention,

    Best regards,

    azar.

  • Dear Azar,

    To answer your first question, there are several formulas to calculate these.
    Please have a look at the two publications ([1] and [2]) mentioned below.
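
    As a rough illustration of what such a formula looks like (a sketch under assumptions, not the exact code of any ProM plug-in): alignment-based replay fitness is commonly written as 1 minus the cost of the optimal alignment divided by a worst-case cost, where the worst case treats every event as a log move and adds the moves of the shortest run through the model. The function name, the move labels and the unit costs below are hypothetical choices for this example.

```python
def alignment_fitness(alignment, trace_len, shortest_model_run_len):
    """Replay fitness of one trace from an (assumed optimal) alignment.

    alignment: list of move labels, each "sync", "log" or "model"
               (hypothetical labels chosen for this sketch).
    trace_len: number of events in the trace.
    shortest_model_run_len: number of moves in the shortest run of the model.
    Unit cost 1 is assumed for every log move and every model move.
    """
    cost = sum(1 for move in alignment if move != "sync")
    # Worst case: every event is a log move, plus the shortest model run.
    worst_case = trace_len + shortest_model_run_len
    if worst_case == 0:
        return 1.0
    return 1.0 - cost / worst_case


# Made-up example: 4 events in the trace, one of them cannot be replayed,
# and the model needs one extra move that the log does not contain.
example = ["sync", "sync", "log", "sync", "model"]
print(round(alignment_fitness(example, trace_len=4, shortest_model_run_len=3), 3))
```

    With an optimal alignment this value stays between 0 and 1, which is one reason it should not be put next to a fitness value that is computed in a different way.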

    Regarding the Evolutionary Tree Miner (ETM) package/plugin, if you download ProM 6.3, the package is included and you can run it.

    [1] Adriansyah et al., Replaying history on process models for conformance checking and performance analysis http://wires.wiley.com/WileyCDA/WiresArticle/wisId-WIDM1045.html
    [2] Buijs et al., On the Role of Fitness, Precision, Generalization and Simplicity in Process Discovery http://wwwis.win.tue.nl/~wvdaalst/publications/p688.pdf
    Joos Buijs

  • I recently got a question via e-mail about this topic and I thought I might as well post the question, and the answers.

    The question:


    First, I ran Check conformance using ET-Conformance with a log and a Petri net as inputs in ProM 6.3, without changing any setting. I get the first screenshot attached, where I can see only the precision value.

    Selecting the settings "minimal disconformant traces" and "show prefix automaton" does not seem to add any information. With "compute interval of confidence" I get no output visualization either, due to [ERROR]null.



    Second, I ran the other plug-in, Check Precision based on Align-ETConformance, selecting "measure behavioral appropriateness", and I get the second attached screenshot, where I see only precision, with a value different from the previous one. In particular I get:

    precision: 0.625 with algorithm 1 align precision

    precision: NaN with representative-align and all-align

    The ETC precision algorithm doesn't allow me to go on, since some transitions are not mapped to any event class (the log traces contain fewer events than the net).

    So I suppose this measure is calculated differently from the previous plug-in?



    Since I am new to these ProM measures, I would like to understand which precision value I should keep, or whether the two values have different meanings (in my example all 11 traces in the log have fitness 1).

    And where can I get the generalization score?




    Joos Buijs

  • Answer by Jorge Munoz-Gama:

    I think there has been a misunderstanding: the ETConformance package only computes precision (for generalization there are other approaches; I think Arya has one and he can tell you more about it).



    Regarding precision, long story short, there are two approaches: pure ETC [1,2] (computes precision directly from the log) and Align-ETC [3,4] (first computes the alignments, then computes precision over those alignments). It has been shown that pure ETC, although faster, has some problems when the log is not completely fitting (see [3,4]). Therefore, I encourage you to always use the Align-ETC precision techniques (e.g., ap, ...) unless time is an issue.
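
    To make the idea of computing precision directly from the log more concrete, here is a minimal sketch in the style of the escaping-edges idea behind ETC: for every event in the log, compare what the model allows after the prefix seen so far with what the log actually does after that prefix. It is a simplification under loud assumptions, not the ETConformance implementation: the model is represented as a finite set of allowed traces instead of a Petri net, the log is assumed to fit the model, and all function names are hypothetical.

```python
from collections import defaultdict


def enabled_after(prefix, model_traces):
    """Activities the (toy) model allows directly after `prefix`."""
    n = len(prefix)
    return {t[n] for t in model_traces if len(t) > n and t[:n] == prefix}


def etc_style_precision(log, model_traces):
    """1 - (escaping edges / allowed edges), summed over every event in the log.

    log: list of traces, each a tuple of activity names (assumed to fit the model).
    model_traces: set of tuples, a stand-in for the behaviour of a real model.
    """
    # Activities the log itself shows after each prefix.
    reflected = defaultdict(set)
    for trace in log:
        for i, activity in enumerate(trace):
            reflected[trace[:i]].add(activity)

    escaping_total = 0
    allowed_total = 0
    for trace in log:
        for i in range(len(trace)):
            prefix = trace[:i]
            allowed = enabled_after(prefix, model_traces)
            # Escaping edges: allowed by the model but never seen in the log.
            escaping_total += len(allowed - reflected[prefix])
            allowed_total += len(allowed)
    if allowed_total == 0:
        return 1.0
    return 1.0 - escaping_total / allowed_total


# Made-up example: the model allows three runs, the log only ever shows one.
model = {("a", "b", "c"), ("a", "c", "b"), ("a", "d")}
log = [("a", "b", "c"), ("a", "b", "c")]
print(round(etc_style_precision(log, model), 3))
```

    In this toy example the model allows b, c and d after a, but the log only shows b, so two of the five allowed edges per trace escape and the precision is 0.6. When the log does not fit the model, the prefixes in the log may not exist in the model at all, which is where the alignment-based variants come in.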



    The Align-ETC techniques are part of the ETConformance package, and the latest version is in the plugin "Check Precision based on Align-ETConformance".

    The default approach (and the one I recommend) is "1-ALIGN" with sequence, looking only at 'precision' in the results (not forward, not balanced). The details of the other options will be published soon [4], but I think they won't be necessary: 1-Alignment + sequence is what has been used in all the papers so far (e.g., in this survey comparing discovery algorithms [5]).



    I hope this solves your problems/doubts. Please don't hesitate to contact me with any other questions.



    JORGE



    [1] Jorge Munoz-Gama, Josep Carmona: A Fresh Look at Precision in Process Conformance. BPM 2010: 211-226
    [2] Jorge Munoz-Gama, Josep Carmona: Enhancing precision in Process Conformance: Stability, confidence and severity. CIDM 2011: 184-191
    [3] Arya Adriansyah, Jorge Munoz-Gama, Josep Carmona, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Alignment Based Precision Checking. Business Process Management Workshops 2012: 137-149
    [4] Arya Adriansyah, Jorge Munoz-Gama, Josep Carmona, Boudewijn F. van Dongen, Wil M. P. van der Aalst: Measuring Precision of Modeled Behavior. Information Systems and E-Business Management (to appear)
    [5] Seppe Vanden Broucke, Cédric Delvaux, João Freitas, Taisiia Rogova, Jan Vanthienen and Bart Baesens. Uncovering the Relationship between Event Log Characteristics and Process Discovery Techniques. BPI 2013.
    Joos Buijs

  • Answer by Arya Adriansyah:

    As Jorge mentioned, I implemented a generalization plug-in in ProM 6. If you have the PNetAlignmentAnalysis package installed, you should have a “Measure Precision/Generalization” plug-in. This is an implementation of the approach in [1].



    I’d say that the generalization metric implemented in the plug-in has many
    weaknesses, but it is to my knowledge, the best that we have so far to measure
    pure generalization for arbitrary Petri nets.



    Regards,

    Arya



    Reference

    [1] Aalst, W.M.P. van der, Adriansyah, A., & Dongen, B.F. van. Replaying
    History on Process Models for Conformance Checking and Performance Analysis.
    WIREs Data Mining Knowledge & Discovery 2012, 2: 182-192. doi:
    10.1002/widm.1045
    Joos Buijs

  • Hi azaras,
    Did you find an explanation for the negative fitness value from the heuristics miner?
    I ran into the same thing and I need to understand the results.
    Could anyone help, please?

    I read the papers that Joos mentioned. They are really helpful, but I couldn't find the answer to my question there.

    Thanks

    Amirah 
  • Dear friends,
    I have four metrics (fitness, precision, generalization and simplicity) for multiple models.
    How can I compare those models? Which of these metrics should be preferred?
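
    One possible way to compare models on all four dimensions at once is a weighted average, where the weights express which dimension matters most for your purpose. This is only a sketch: the function name, the weights and the scores below are made up for illustration, and nothing in this thread prescribes particular values.

```python
def weighted_quality(scores, weights):
    """Weighted average of quality dimension scores (each assumed to lie in [0, 1])."""
    total_weight = sum(weights.values())
    return sum(weights[dim] * scores[dim] for dim in weights) / total_weight


# Hypothetical weights: here replay fitness is considered most important.
weights = {"fitness": 4, "precision": 2, "generalization": 1, "simplicity": 1}

# Hypothetical scores for two discovered models.
models = {
    "model_A": {"fitness": 1.0, "precision": 0.5, "generalization": 0.8, "simplicity": 0.9},
    "model_B": {"fitness": 0.9, "precision": 0.9, "generalization": 0.7, "simplicity": 0.8},
}

# Rank the models from best to worst overall score.
ranking = sorted(models, key=lambda name: weighted_quality(models[name], weights), reverse=True)
for name in ranking:
    print(name, round(weighted_quality(models[name], weights), 3))
```

    The ranking depends entirely on the chosen weights, so it is worth reporting the four individual values next to any combined score.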