Data pre-processing for process mining
I am currently writing my thesis in the area of process mining and would appreciate ideas on how to modify my event log for the software (I plan to use Disco, ProM and also one commercial product).
My issue is that my data set has several activities (Activity A, Activity B, Activity C etc.), and every activity can have several statuses (for example, Activity A could have the statuses 'Active', 'Closed', 'Lost' etc.). In addition, I have the Case ID and the TimeStamp. Usually the software asks you to map the data to a unique Case ID, an Activity Type and a TimeStamp. However, my data looks like this for a single CaseID: at the first TimeStamp, Activity A changes to status "Active"; at the second TimeStamp, Activity A changes to status "Active" and Activity B changes to status "20%"; at the third TimeStamp, Activity B changes to status "40%" and Activity C changes to status "Lost". So several activities can change in parallel at the same TimeStamp.
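To make the situation concrete, here is a minimal sketch of the data described above, written as one row per status change. The column order, the case identifier "case-1", the integer timestamps and the helper `changes_at` are placeholders chosen for illustration, not my real data.

```python
# Hypothetical layout: (case_id, timestamp, activity, status).
# Timestamps 1, 2, 3 stand in for the first, second and third TimeStamp.
changes = [
    ("case-1", 1, "Activity A", "Active"),
    ("case-1", 2, "Activity A", "Active"),
    ("case-1", 2, "Activity B", "20%"),
    ("case-1", 3, "Activity B", "40%"),
    ("case-1", 3, "Activity C", "Lost"),
]


def changes_at(rows, case_id, timestamp):
    """Return all (activity, status) pairs recorded for one case at one timestamp."""
    return [
        (activity, status)
        for row_case, row_ts, activity, status in rows
        if row_case == case_id and row_ts == timestamp
    ]


# Two activities change in parallel at the second timestamp.
print(changes_at(changes, "case-1", 2))
```

In this layout the pair (CaseID, TimeStamp) is not unique, which is exactly the parallel-change situation described above.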
I was thinking of creating an Activity table and a Case table. The Activity table would store the CaseID, an ActivityType (for example 'Activity A changed' or 'Activity A and B changed') and a TimeStamp. The Case table would contain the extension for every CaseID at a given TimeStamp. However, when I tried this, I got a message about duplicates.
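The Activity table idea could be sketched roughly as follows. This is only an illustration under assumptions: the short activity names ("A", "B", "C"), the integer timestamps and the function `build_activity_table` are hypothetical, and the label format simply follows the 'Activity A and B changed' pattern mentioned above.

```python
from collections import defaultdict

# Hypothetical input rows: (case_id, timestamp, activity, status).
changes = [
    ("case-1", 1, "A", "Active"),
    ("case-1", 2, "A", "Active"),
    ("case-1", 2, "B", "20%"),
    ("case-1", 3, "B", "40%"),
    ("case-1", 3, "C", "Lost"),
]


def build_activity_table(rows):
    """Collapse parallel changes into one row per (case_id, timestamp)."""
    grouped = defaultdict(list)
    for case_id, timestamp, activity, _status in rows:
        grouped[(case_id, timestamp)].append(activity)

    table = []
    for (case_id, timestamp), activities in sorted(grouped.items()):
        # Deduplicate and sort the names so the label is stable.
        names = sorted(set(activities))
        label = "Activity " + " and ".join(names) + " changed"
        table.append((case_id, timestamp, label))
    return table


for row in build_activity_table(changes):
    print(row)
```

With this grouping each (CaseID, TimeStamp) pair appears only once in the Activity table, while the individual statuses would still have to be kept in the Case table.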
I would be happy to receive advice on how I should modify the data set for the process analysis.