'Camunda: Receive multiple, different messages at once
I am currently developing a kinda complex workflow with camunda. The goal of this workflow is to orchestrate the execution of different external business processes. Which includes start, overwatch and synchronize these workflows. Everything besides the synchronization works as expected.
Example:
My example has one main workflow which starts multiple sub workflows. The main workflow has to be aware when all sub workflows are finished. Every sub workflow is triggered by a message and sends a message back to the main workflow at the end of execution. Therefore, all sub workflows should be synchronized in the main workflow.
Xml can be accessed on this site: https://pastebin.com/2aj4z0zU
Unfortunately, this leads to numerous message correlation exceptions at the choke point in the main workflow (1st lane, after the first parallel gateway). I am using the following code to correlate the messages:
this.runtimeService.createMessageCorrelation(messageName)
.processInstanceId(processInstanceId)
.setVariables(payload)
.correlate();
The whole workflow is executable and runs without errors, but only if one example_workflow at a time is executed. Starting multiple example_workflows quickly one after another results in this type of exception randomly for every message type:
ENGINE-16004 Exception while closing command context: Cannot correlate message 'PROCESS_B_FINISHED': No process definition or execution matches the parameters org.camunda.bpm.engine.MismatchingMessageCorrelationException: Cannot correlate message 'PROCESS_B_FINISHED': No process definition or execution matches the parameters
at org.camunda.bpm.engine.impl.cmd.CorrelateMessageCmd.execute(CorrelateMessageCmd.java:88) ~[camunda-engine-7.14.0.jar!/:7.14.0]
Currently, the correlation exceptions occur if a postgresql database is used. The same workflow runs much better, but not perfect, when we use a h2 file-based database. All receive tasks are not configured asynchronously, only send tasks are (async before + exclusive).
Questions:
Is this already the best practice to synchronize multiple messages in one workflow?
What could be the reason for the correlation exceptions while using a postgresql database?
Used software:
- spring boot application [Version:2.3.4]
- camunda [Version:7.14.0]
- h2 [Version:1.4.200]
- postgresql [Version:42.2.22]
Solution 1:[1]
- the process model seems to contain sequences where it can run into a deadlock (What if blue is followed directly by green? Or yellow?) or where you have race conditions. If the process has not reached a state where it is in a receiving state for the message, then the message delivery will fail (as indicated in the error message you shared)
(The reason you are observing the CorellationException more frequently on postgresql if the race condition. With this external database some operations take slightly more time, increasing the chance of the race condition occurring).
- The process engine needs to be able to match a message to a unique receiver. If there are multiple potential receivers for the same message name, and no other correlation criteria creating a unique match is provided, then the delivery will also fail. You either need to use unique message names per instance or better use a businessKey or a process data which is unique per instance as additional correlation criteria. This is why it does not work when you run multiple process instances.
Solution 2:[2]
Modelling a workflow with this parallel message bottleneck leads to a race condition, as mentioned by @rob2universe's post.
To solve this problem, I had firstly to correlate the messages directly. I did this by adding a unique identifier to every message, which was not a big deal due to the fact that an item ID was defined within the payload of every message. Secondly, I had to remove all asynchronous and exclusive markers for every receive task and connected gateways. And thirdly, I had to reset the job executor properties to default values. Limiting the pool size and jobs per acquisition did not benefit the workflow execution.
After all these changes, my workflow now runs as expected with no errors. Unfortunately, due to the described bottleneck optimistic logging exceptions are common, but the workflow engine handles these exceptions without further errors.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | rob2universe |
Solution 2 | M4Z3 |