Dataset for Algorithm Evaluation

lubken Posts: 8
edited March 2011 in Event Logs
Hey,

I am currently writing my Master's thesis on the evaluation of process mining algorithms. For this I need two datasets to evaluate on. I have one from a company I cooperate with. For the other, I would like to ask whether any datasets are publicly available, or could be provided to me confidentially, like the datasets presented in the Genetic Mining paper (Bezwaar, BezwaarWOZ, Afschriften, Bouwvergunning).

An alternative would be to create a dataset artificially (http://www.processmining.it/sw/plg), but I would prefer real-life logs.

Thank you for your help.

Best regards

Lukas


Comments

  • JBuijs Posts: 912
    edited March 2011
    Hi,

    For the Business Process Intelligence workshop at the BPM'11 conference, a challenge has been published. It includes a large, real-life event log.
    You might want to check out http://www.win.tue.nl/bpi2011/doku.php?id=challenge and/or http://data.3tu.nl/repository/uuid:d9769f3d-0ab0-4fb8-803b-0d1120ffcf54

    I think you can use this event log for your testing purposes.
    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology
  • lubken Posts: 8
    Thank you, that looks great.

    Has this log been used in any article yet? I read the information at the link, but would be interested in a deeper analysis from other papers.

    Thanks
  • JBuijs Posts: 912
    Hi Lubken,

    Well, the idea is actually that a lot of people start analysing this log and prepare a report in a contest setting. Whoever creates the most creative, best, most interesting, ... report wins.
    So, no, there are no papers that use this dataset yet, partly because it has been released only recently.

    A master's student in our group is working on, and has almost finished, a dataset for testing process discovery algorithms. I will ask him if his dataset is ready.