Hardware sizing/specs for a real project

Hi all, I haven't found any reference on sizing the hardware to use with ProM 6.

I'm experimenting/learning on a laptop with Win10 Pro, 8 GB of RAM, an SSD and an i7 CPU, and sometimes I get performance problems with large/complicated logs (1).
I have a chance to start a real project at a client and I'm worried we will get blocked by performance issues.

Any clue? Could anybody provide the specs of a machine used in a real project?

Thanks in advance

(1) For example, if you mine the BPI Challenge 2012 log with the visual inductive miner plugin and use the org:resource attribute as classifier, the system starts working, consuming a lot of RAM/CPU, but never finishes.

Answers

  • Hi Luis,

    There are no hard requirements that we can provide, as each dataset and algorithm is different. Your setup seems to be sufficient to do most work.

    My main recommendation is to start with the question: what do you want to know?
    How do you then need to filter/scope the data?

    Of course exploring the data is one of the first things to do, but no useful process model will be discovered from the raw data without filtering.
    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology
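
To make the filtering advice above concrete, here is a minimal sketch in plain Python. It is not ProM's API: the toy log, the function names `class_counts` and `keep_frequent_classes`, and the `min_count` threshold are all assumptions made for illustration. Only the attribute keys `concept:name` and `org:resource` follow the usual XES naming. The idea is to first check how many distinct classes a classifier produces, and then drop events from rare classes before handing the log to a miner.

```python
from collections import Counter

# Invented toy log (assumption): each trace is a list of events, and each
# event is a dict keyed by XES-style attribute names.
toy_log = [
    [{"concept:name": "register", "org:resource": "r1"},
     {"concept:name": "check", "org:resource": "r2"},
     {"concept:name": "decide", "org:resource": "r3"}],
    [{"concept:name": "register", "org:resource": "r1"},
     {"concept:name": "check", "org:resource": "r4"},
     {"concept:name": "decide", "org:resource": "r3"}],
    [{"concept:name": "register", "org:resource": "r5"},
     {"concept:name": "decide", "org:resource": "r3"}],
]


def class_counts(log, classifier_key):
    """Count how often each classifier value occurs across all events."""
    return Counter(event[classifier_key] for trace in log for event in trace)


def keep_frequent_classes(log, classifier_key, min_count):
    """Keep only events whose classifier value occurs at least min_count
    times in the whole log; traces left empty are dropped."""
    counts = class_counts(log, classifier_key)
    filtered = []
    for trace in log:
        kept = [e for e in trace if counts[e[classifier_key]] >= min_count]
        if kept:
            filtered.append(kept)
    return filtered


if __name__ == "__main__":
    for key in ("concept:name", "org:resource"):
        print(key, "->", len(class_counts(toy_log, key)), "distinct classes")
    scoped = keep_frequent_classes(toy_log, "org:resource", min_count=2)
    print("events kept:", sum(len(t) for t in scoped))
```

Even in this tiny log the resource classifier yields more classes than the activity classifier, which hints at why a resource-based classifier on a large log can be much harder to mine. A frequency threshold is only one possible way to scope the data; the right filter depends on the question being asked.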
  • Hi Joos, thanks for your quick answer.
    Let me reformulate my question because I see it wasn't clear - my fault.

    I didn't ask for a recipe to size the hardware but for an example of sizing in a real project.

    The question, reformulated, would be: Could somebody provide examples of the hardware/software used in a real project, along with the volume of data and the algorithms used?


    Thanks again.
    Regards

    PS. I find your last statement very relevant: "no useful process model will be discovered from the raw data without filtering"



  • Hi Luis,

    OK, I have used the following PC for years to do process mining, also on real datasets (e.g. BPI15-like size):
    Intel Xeon CPU with 4 cores at 2.8 GHz
    12 GB memory
    SSD as main disk ('normal' secondary disk)

    But as I said, you could work with a dual-core 8 GB PC just fine :)

    Joos Buijs

    Senior Data Scientist and process mining expert at APG (Dutch pension fund executor).
    Previously Assistant Professor in Process Mining at Eindhoven University of Technology