Analyze That

Process Mining workshop @Technionista

On Friday, April 12, 2019, I was invited to give a small presentation to a new class of 'Techionistas': women who are jumping on the data science bandwagon.

in the Johan Cruijff Arena
In a café with my name on it! ;-)

Next are the slides that I used in my presentation...

For process mining we need data in a specific format. CaseID and Activity are mandatory, timestamp is almost always needed and resource is optional (as are many other fields that could be used for further analysis).

For case id you can think of typical business IDs like order number, quote number, customer number, patient number and so on. It's worth spending some time to carefully select your case id. Take clickstream analysis, for example: would you use the session id or the customer id (which groups multiple web site sessions by one customer)? Activities are mostly statuses, but surprisingly the resource can be used as the activity as well! This allows you to do a handover of work analysis, which is discussed later in this presentation.

Timestamps are mostly needed to show the order in which activities happened. I once did an analysis of care paths where I didn't care about time but only wanted to compare care paths and see if some patients took different paths. We then discussed these deviating paths with doctors so they could learn from each other.
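To make the format concrete, here is a minimal Python sketch of such an event log and of how timestamps turn loose events into an ordered trace per case. The orders, activities and names are invented for illustration; they do not come from the slides.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp, resource).
# Case id and activity are mandatory, the timestamp gives the order,
# the resource is optional extra information.
events = [
    ("order-1", "Create order",  "2019-04-01 09:00", "Anna"),
    ("order-2", "Create order",  "2019-04-01 09:30", "Bob"),
    ("order-1", "Ship order",    "2019-04-03 14:00", "Carl"),
    ("order-1", "Approve order", "2019-04-02 11:00", "Bob"),
    ("order-2", "Cancel order",  "2019-04-02 10:00", "Anna"),
]


def build_traces(events):
    """Group events per case and sort each case by timestamp."""
    per_case = defaultdict(list)
    for case_id, activity, timestamp, _resource in events:
        moment = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        per_case[case_id].append((moment, activity))
    return {
        case_id: [activity for _moment, activity in sorted(rows)]
        for case_id, rows in per_case.items()
    }


traces = build_traces(events)
for case_id, activities in traces.items():
    print(case_id, " -> ".join(activities))
```

Note that the rows of order-1 are deliberately out of order in the input: only the timestamp puts 'Approve order' before 'Ship order'.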

A resource can be used as a filter to analyse the process flow for a single employee. Be careful with this option, as some countries don't allow it (Germany, for example) and you could jump to conclusions like "this employee is really slow". Maybe she is handling your most difficult issues!

The next slides are taken from a Fluxicon slide deck to introduce the concept of process mining. It shows how the algorithm combines different cases into one process map.
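I don't know the exact algorithm the tools use internally, but a common simplification of the idea is the directly-follows graph: for every case you count which activity directly follows which, and the counts of all cases together form the process map. A rough sketch with made-up cases (the `START` and `END` markers are my own addition):

```python
from collections import Counter

# Hypothetical traces: case id -> activities in order of occurrence.
traces = {
    "case-1": ["A", "B", "C"],
    "case-2": ["A", "C"],
    "case-3": ["A", "B", "C"],
}


def directly_follows(traces):
    """Count how often one activity is directly followed by another."""
    edges = Counter()
    for activities in traces.values():
        # Artificial start and end nodes show where cases enter and leave.
        path = ["START"] + activities + ["END"]
        for source, target in zip(path, path[1:]):
            edges[(source, target)] += 1
    return edges


process_map = directly_follows(traces)
for (source, target), count in sorted(process_map.items()):
    print(f"{source} -> {target}: {count}")
```

The edge counts correspond to the numbers you typically see on the arrows of a process map: here two cases go from A to B and one case skips B.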

I worked as an auditor at the start of my career. That's why I love conformance checking. With tools like Disco it's possible to do some conformance checking like 3-way matching (match purchase order, receipt of goods and vendor invoice). Other options, like providing the tool with a process map and replaying the log over it, are not possible in Disco. For this you could use ProM, an open-source process mining tool that is a bit harder to use but allows for more flexibility. In the example above it checks if a step in the supplied process map also occurred in the log and vice versa. This way you can compute a conformance ratio of (1,283 - 151)/1,283 = 88.2%. Here 151 cases missed an activity: T10 was expected according to the model, but didn't occur in the log.
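The arithmetic behind that ratio, plus a very naive check for a missing activity, can be sketched in Python as follows. This is only an illustration: real conformance checking in ProM replays the log on the model, which is far more thorough than an "is this activity present?" test, and the toy traces are invented.

```python
def conformance_ratio(total_cases, deviating_cases):
    """Share of cases that follow the model."""
    return (total_cases - deviating_cases) / total_cases


def cases_missing(traces, expected_activity):
    """Naive check: which cases never executed the expected activity?"""
    return sorted(
        case_id
        for case_id, activities in traces.items()
        if expected_activity not in activities
    )


# Numbers from the slide: 1,283 cases, 151 of them without T10.
ratio = conformance_ratio(1283, 151)
print(f"{ratio:.1%}")

# Hypothetical toy log to show the check itself.
toy_traces = {
    "case-1": ["T1", "T10", "T11"],
    "case-2": ["T1", "T11"],
    "case-3": ["T1", "T10", "T11"],
}
print(cases_missing(toy_traces, "T10"))
```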

Typical analysis questions for handover of work are:

  • who works with whom?
  • who gets all the difficult tasks as third line support?
  • who is a bottleneck because all cases pass him/her?
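A handover of work analysis boils down to using the resource instead of the activity and counting who passes work to whom. A minimal sketch, with invented tickets and names:

```python
from collections import Counter

# Hypothetical log: case id -> resources in the order they worked on the case.
case_resources = {
    "ticket-1": ["Anna", "Bob", "Carl"],
    "ticket-2": ["Anna", "Anna", "Carl"],
    "ticket-3": ["Bob", "Carl"],
}


def handovers(case_resources):
    """Count handovers of work between two different resources."""
    counts = Counter()
    for resources in case_resources.values():
        for giver, receiver in zip(resources, resources[1:]):
            if giver != receiver:  # same person twice is not a handover
                counts[(giver, receiver)] += 1
    return counts


def cases_per_resource(case_resources):
    """Count in how many cases each resource is involved."""
    counts = Counter()
    for resources in case_resources.values():
        counts.update(set(resources))
    return counts


print(handovers(case_resources).most_common())
print(cases_per_resource(case_resources).most_common(1))
```

In this toy log every ticket passes Carl, which would make him the candidate bottleneck from the third question.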

Most process mining tools have really powerful filters that would take you a long time to program in another tool. The slide above shows a timeframe filter that lets you, for example, select only the cases that were completed or started in a certain period.
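To give a feeling for what such a timeframe filter does, here is a small Python sketch of two of its modes. The cases and dates are made up, and the function names are my own, not those of any tool.

```python
from datetime import date

# Hypothetical case summaries: case id -> (first event date, last event date).
cases = {
    "case-1": (date(2019, 1, 5), date(2019, 1, 20)),
    "case-2": (date(2018, 12, 20), date(2019, 1, 10)),
    "case-3": (date(2019, 1, 25), date(2019, 2, 15)),
}


def started_in(cases, period_start, period_end):
    """Cases whose first event falls inside the period."""
    return sorted(
        case_id
        for case_id, (start, _end) in cases.items()
        if period_start <= start <= period_end
    )


def completed_in(cases, period_start, period_end):
    """Cases whose last event falls inside the period."""
    return sorted(
        case_id
        for case_id, (_start, end) in cases.items()
        if period_start <= end <= period_end
    )


january = (date(2019, 1, 1), date(2019, 1, 31))
print(started_in(cases, *january))
print(completed_in(cases, *january))
```

Even this simple version needs the first and last event per case to be known up front, which is part of why a ready-made filter saves so much time.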

Many process mining projects fail because somebody in a company gets interested and then tries to do everything by himself. In the age of GDPR the privacy officer also claims a seat at the table as the data that is used could contain really sensitive information.

I have seen (and done) quite a few projects from a push perspective. Somebody really wants to start using process mining but the organisation is not ready. Either the applications don't expose the right data, the data is too messy or the project lacks support from higher echelons. Because big data is a hype I also receive calls like "could you help us with big data / process mining?". This is a typical push example: we don't have any pain points right now, but we would love to apply this new technique in our organisation and boast about it on LinkedIn. ;-)

The next slides show some findings from a project I did for Zorginstituut Nederland (Healthcare Institute Netherlands), a body that helps health care institutions with IT-related stuff and manages (messaging) standards.

This process map shows a maximum recursion of 4 at activity 'Overlijden' (deceased). That seems strange as most people typically don't die 4 times. In this case the team was really important because the business analyst told me that our analysis was wrong. The domain expert didn't agree with her and explained why this could happen.
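Assuming that "maximum recursion" means the highest number of times an activity occurs within a single case, the number can be reproduced with a few lines of Python. The traces below are invented (only the activity name 'Overlijden' comes from the project), so this shows the calculation, not the actual data.

```python
from collections import Counter

# Hypothetical traces: case id -> activities in order of occurrence.
traces = {
    "client-1": ["Intake", "Care delivered", "Overlijden"],
    "client-2": ["Intake", "Overlijden", "Overlijden", "Overlijden", "Overlijden"],
    "client-3": ["Intake", "Care delivered", "Care delivered"],
}


def max_repetitions(traces):
    """Highest number of occurrences of each activity within one case."""
    result = Counter()
    for activities in traces.values():
        for activity, count in Counter(activities).items():
            result[activity] = max(result[activity], count)
    return result


def cases_repeating(traces, activity):
    """Cases in which the activity occurs more than once."""
    return sorted(
        case_id
        for case_id, activities in traces.items()
        if activities.count(activity) > 1
    )


print(max_repetitions(traces)["Overlijden"])
print(cases_repeating(traces, "Overlijden"))
```

Listing the cases behind a suspicious number is what makes the conversation with the domain expert possible.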

From the perspective of a health care institution it would be really good news if clients only started receiving care after their request is approved. However, this analysis shows a large difference between laws in the number of days outstanding. The Zorginstituut likes to see this, as they manage the highly structured WLZ messaging between institutions and government bodies, which keeps the delay to only 7 days on average. The WMO, on the other hand, is executed by municipalities, and every municipality can impose its own rules and standards. Most care institutions have to work with tens of municipalities, which makes this process a mess and results in an average delay of 49 days!

For this process health care institutions have to send invoices to insurers for care provided. Insurer 1 is much slower (11 days on average) than Insurer 2 (3 days). Maybe they also check invoices better as they are the only insurer that also rejected invoices.

Predicting next steps will be the future of process mining, especially in combination with chat-based messages like those shown above in the Microsoft Teams app. You can then simply ask the model what the status of a case is and what it thinks will happen next. This allows you to act before things go wrong.

Left side: process for cases without purchase order / Right side: cases with purchase order

This demo example clearly shows that purchase orders help structure the purchase-to-pay process.

Process flows per bucketized invoice amount

In the next demo I showed that small invoices (0-0.5K, 3 buckets) have a lot more variability in their processes. Based on this picture a client could consider skipping some activities like approvals for smaller amounts.
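One way to put a number on "more variability" is to count the distinct process variants (unique activity sequences) per amount bucket. A small sketch, where the invoices, the activities and the bucket boundaries are all assumptions of mine and not the ones from the demo:

```python
from collections import defaultdict

# Hypothetical invoices: (amount, activities in order of occurrence).
invoices = [
    (120.0, ("Receive", "Pay")),
    (300.0, ("Receive", "Approve", "Pay")),
    (450.0, ("Receive", "Reject", "Receive", "Pay")),
    (2500.0, ("Receive", "Approve", "Pay")),
    (7800.0, ("Receive", "Approve", "Pay")),
]


def bucket_label(amount):
    """Assign an invoice amount to an (assumed) bucket."""
    if amount < 500:
        return "0-0.5K"
    if amount < 5000:
        return "0.5K-5K"
    return "5K+"


def variants_per_bucket(invoices):
    """Count the distinct activity sequences per amount bucket."""
    variants = defaultdict(set)
    for amount, trace in invoices:
        variants[bucket_label(amount)].add(trace)
    return {label: len(found) for label, found in variants.items()}


print(variants_per_bucket(invoices))
```

A bucket with many variants relative to its number of cases is a candidate for the kind of simplification mentioned above.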

Chocolates are a man's best friend ;-)

Of course the Technionista ladies spoiled me with some nice chocolates! Naturally I had to join the 'beers (wines) and bitterballen' afterwards, and I was impressed by the cooperative vibe. The women were really supportive of each other, even when some of them were competing for the same job! I think the companies that are able to hire them are really lucky to have Techionista help them find these talented women!