Hi, I am dealing with a process log in which some events are handled by a machine and some by human intervention. All machine events are deterministic (they do not change), whereas the human part (whenever a human interacts) is highly random. How can I mine such a process?
I am looking for a plugin or technique that mines the process on the basis of events rather than cases. Kindly help if anyone has any clue about how to sort this out. I can explain more if needed.
Hi Eric, thanks for your reply.
I am sharing an abstracted log as the attached file. It is a typical online chat process between a chatbot server and a customer. The process kicks off with a question raised by the machine, which is answered by the customer. During the process flow, the machine guesses the solution to the customer's query and suggests a workaround to the customer at the end of the process. The 'Answer given by' column in the data indicates whether a reply came from the machine or from the customer.
All machine-related activities are triggered by an answer from the customer, so the only branching possible in the process occurs when a customer interacts. The machine part is fixed, and all activities performed by the machine are deterministic.
Query1: Am I right to say that the machine part and the human part are both activities of this process?
Query2: Is there any way in process mining to examine the two portions of the process separately, given that all cases are random and there is no strict format in the actual log? I want to see what makes this process so lengthy and how to make it more efficient: which changes I can suggest in the machine portion and which in the customer portion.
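To make Query2 concrete, here is a minimal sketch (plain Python, outside ProM) of the kind of separation I have in mind. The field names `case_id`, `activity` and `answer_given_by`, as well as the toy events, are assumptions for illustration only and not the real log.

```python
from collections import defaultdict

# Hypothetical toy log: field names and values are assumptions, not the real data.
log = [
    {"case_id": "c1", "activity": "Ask issue type", "answer_given_by": "machine"},
    {"case_id": "c1", "activity": "Reply: billing", "answer_given_by": "customer"},
    {"case_id": "c1", "activity": "Suggest workaround", "answer_given_by": "machine"},
    {"case_id": "c2", "activity": "Ask issue type", "answer_given_by": "machine"},
    {"case_id": "c2", "activity": "Reply: login", "answer_given_by": "customer"},
    {"case_id": "c2", "activity": "Reply: still failing", "answer_given_by": "customer"},
    {"case_id": "c2", "activity": "Suggest workaround", "answer_given_by": "machine"},
]


def split_by_actor(events, actor_field="answer_given_by"):
    """Return {actor: {case_id: [activities in original order]}}."""
    result = defaultdict(lambda: defaultdict(list))
    for event in events:
        result[event[actor_field]][event["case_id"]].append(event["activity"])
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {actor: dict(cases) for actor, cases in result.items()}


sublogs = split_by_actor(log)

# Number of events each actor contributes per case, as a rough indicator
# of which portion makes a case long.
events_per_actor = {
    actor: {case_id: len(trace) for case_id, trace in cases.items()}
    for actor, cases in sublogs.items()
}
```

Each sub-log keeps the original order of events within a case, so the machine and customer portions could then be exported and mined separately.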
Apologies for the long reply. Please provide your detailed suggestions.
Thanks for your reply. I have done the control-flow discovery part and have also run ActiTraC and AHC using ProM6 (nightly builds too). Now, at a more advanced stage, I am trying to do conformance analysis to identify bottlenecks, the most frequent paths, etc.
Assuming that the answers are already categorized, what exactly do you mean by 'Later on, you could enhance this model with guards, which may be discovered based on the answers as given (also by humans).'?
Can you suggest any plugin, documentation, or algorithm that would allow me to check conformance of the human and machine parts separately? (Or maybe it is not a good idea?)
Thanks for your reply. I took some time to get back as I was going through the literature and trying to sort out my confusion related to conformance. Below are two queries based on our discussion:
1) The "Decision-Tree Miner" plugin is not working for me in either ProM6.10 or the Lite version. After the configuration screen, when I click Continue, it takes me back to the Actions screen. Is there anything I am doing wrong?
2) Can you suggest any plugin or literature that performs event-level clustering/filtering? For example, in the Explorer view of the log, can I filter my log from Event 1 to Event 15? I need this because I want to slice my large process log into several segments, such as the first 25% of the process, the last 40% of events, etc.
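To illustrate the kind of event-level slicing meant in 2), here is a minimal sketch in plain Python (not a ProM plugin). It assumes each case is available as an ordered list of events; the helper names and the `event_no` field are hypothetical.

```python
def slice_by_index(trace, first, last):
    """Keep events first..last of a trace (1-based, inclusive)."""
    return trace[first - 1:last]


def head_percent(trace, percent):
    """Keep the first `percent` % of events (rounded down)."""
    n = len(trace) * percent // 100
    return trace[:n]


def tail_percent(trace, percent):
    """Keep the last `percent` % of events (rounded down)."""
    n = len(trace) * percent // 100
    # Slice from an explicit start index so that n == 0 yields an empty list.
    return trace[len(trace) - n:]


# Hypothetical case with 20 events, numbered 1..20.
trace = [{"event_no": i} for i in range(1, 21)]

first_fifteen = slice_by_index(trace, 1, 15)   # Event 1 to Event 15
first_quarter = head_percent(trace, 25)        # first 25% of the case
last_forty = tail_percent(trace, 40)           # last 40% of the case
```

Applying one of these helpers to every case would give a sliced log that can be loaded back into the mining tool.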
Thanks once again for your time. I am really not able to handle the log that I am dealing with. Your expert advice will definitely help.
Thanks for your reply. I have seen your work on ProM and have read your research papers. It's a blessing to have people like you and Eric around to reply to our queries.
Regarding 1) I have tested the plugin on public data (BPI & Artificial Loan Process) and it's working fine, so there may be some issue with my data set. I am reviewing my data set and will try the plugin again.
Regarding 2) I am using Excel for pre-processing now. I wanted to split my event log, which has very long cases (280 events per case on average), into manageable segments, but I was not sure on which basis to split the cases. One way is to split the cases by percentage of length, such as 25% of the length for each split. But I wanted to see whether there is any literature in which the split is done on some other statistical basis. I am searching the literature on this.
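For reference, the percentage-of-length split I am considering could be sketched as follows in plain Python. The function name is hypothetical, and this covers only the simple equal-length split, not any other statistical basis.

```python
def split_into_segments(trace, k):
    """Split a trace into k consecutive segments of near-equal length."""
    n = len(trace)
    # Segment boundaries at 0, n/k, 2n/k, ..., n (integer division).
    bounds = [i * n // k for i in range(k + 1)]
    return [trace[bounds[i]:bounds[i + 1]] for i in range(k)]


# Hypothetical case with 280 events (the average case length in my log),
# split into four segments of 25% each.
case = list(range(1, 281))
segments = split_into_segments(case, 4)
segment_lengths = [len(s) for s in segments]
```

When the case length is not divisible by the number of segments, the segment lengths differ by at most one event, and no event is dropped or duplicated.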
Thanks for your assistance once again.