Best Practice Design - Process - Business Object Access patterns

webcyberrob

Jan 15, 2014, 12:36:47 AM

to camunda-...@googlegroups.com

Let's say I have a hypothetical process which contains just a customer record key as a process variable. The customer detail resides in a CRM system. Let's assume I now need to send some customer details to two separate systems/locations, i.e. I have tasks [Send To A] -> [Send To B]. (Alternatively, the tasks could be 'do activity A with customer record', 'do activity B with customer record', etc.)

Hence I have a question regarding design patterns for accessing/managing the business object:

Option 1:

Each task uses a Data Access Object (or similar mechanism) to load the customer detail into the task handler, where the task logic then processes it. This is a 'clean' implementation in terms of minimizing process variables, i.e. I only need the customer key. However, it has the consequence of accessing the business object many times in quick succession, which is potentially an inefficient use of access resources. Relative to the CRM system, the process looks 'chatty'.

Option 2:

Alternatively, I could access the business object once and just copy the relevant attributes into process variables. Really, this just shifts the access burden to the engine rather than the business object system.

Option 3:

I could change the granularity of the task to a single task called 'Send to A and B'. However, with this approach I could end up with a process containing a single task called 'do everything', or I ultimately get back to the original design problem. (Perhaps in most cases coalescing discrete tasks based on a common business object works in real-world processes?)

Option 4:

In task A I could load the business object attributes into process variables and then remove the process variables at the end of task B. However, this tightly couples the process to a fixed sequence (i.e. a brittle process), and it may also shift the access burden to the engine.

Option 5:

Use option 1 as the pattern and augment the system with a business object cache such that the access overhead becomes smaller, or just ignore the overhead as computing resources are cheap.

Option 6:

In practice, we tend to find that the objects most accessed by a process (e.g. sales order) are best realised using access pattern 1 or 5, while business objects where I may need only a subset of attributes for the process duration (e.g. customer name) are loaded as process variables.

Other? Could it depend on whether I am accessing the business object via, say, a 'local' JPA implementation versus a remote web service call (e.g. if the CRM were in Salesforce or similar)?

I'm not advocating a single pattern, but I'm interested in others' views and experiences. At the moment, I typically use a blend of option 6 and option 5 via JPA.
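To make the difference between Option 1 and Option 5 concrete, here is a minimal sketch in plain Java. It is not engine code: `CustomerDao`, `CachingCustomerDao` and `CountingCustomerDao` are hypothetical names, the customer record is reduced to a String, and the invalidation policy is deliberately left open.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical DAO for the CRM system; the record is a plain String
// here to keep the sketch short.
interface CustomerDao {
    String findById(String customerId);
}

// Option 5: a caching decorator in front of the DAO. Task handlers keep
// calling findById(key) exactly as they would under Option 1.
class CachingCustomerDao implements CustomerDao {
    private final CustomerDao delegate;
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    CachingCustomerDao(CustomerDao delegate) {
        this.delegate = delegate;
    }

    @Override
    public String findById(String customerId) {
        // Only the first lookup per key reaches the CRM system.
        return cache.computeIfAbsent(customerId, delegate::findById);
    }

    // When to evict is an open design choice; call this once the
    // cached record may be out of date.
    void evict(String customerId) {
        cache.remove(customerId);
    }
}

// Test double that counts backend hits so the effect is visible.
class CountingCustomerDao implements CustomerDao {
    int calls = 0;

    @Override
    public String findById(String customerId) {
        calls++;
        return "customer:" + customerId;
    }
}
```

With the decorator in place, [Send To A] and [Send To B] can both ask for the same key while the CRM system sees a single request. The process itself still carries only the customer key.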

regards

Rob

Daniel Meyer

Jan 15, 2014, 7:26:14 AM

to camunda-...@googlegroups.com

Hi Rob,

Wow, great and concise summary of such common usage patterns.

We also had some community discussion about this, orchestrated by Bernd Rücker. The outcome was this wiki page:

https://app.camunda.com/confluence/display/BestPractices/Data+in+Processes

I know that camunda consulting also wants to do some work in that direction; they have set up this page on GitHub:

https://github.com/camunda/camunda-consulting/tree/master/patterns/data-in-processes

Cheers,

Daniel

Jan Galinski

Jan 15, 2014, 4:56:04 PM

to camunda-...@googlegroups.com

Hi Rob, Daniel,

this is an interesting topic, since sooner or later (hopefully sooner) you will have to face it in every BPM project.

We just had this discussion last week, here is what we came up with:

1.) We need our domain data separated from the process data. Your business objects will last longer than your processes, and you will need to access them after the instances have finished. So although other scenarios are possible, we have a strict "ids-only" policy when it comes to process payload.

2.) The user always wants (and sometimes really needs) to use the most recent values of the business objects, so we have to retrieve them multiple times during a long-running process. Caching values for a short time is possible, though.

3.) Our processes orchestrate our business logic. We need to stay flexible and refactor processes, so we cannot rely on the order of service tasks, and thus not on the order in which JavaDelegates execute. That makes the whole "who saves a cached value and who cleans it up again" debate obsolete.

4.) When we look at the history of process instances, we want to see clean process variables that allow us to easily understand what happened. We don't want any cached or redundant values that might already be outdated.

So basically, that leaves us with the two options you preferred as well: access the remote system/database once in every user and/or service task or provide some kind of caching.

Here is what we do:

a) For user tasks, we load the recent data from the remote system once when the form is opened. Since a user might actually start to work on the task hours or days after the engine created it, we cannot assume that any information we stored in process variables is still up to date.

b) For service tasks and listeners, we use @RequestScoped caching. Since there are no transaction breaks until a user task or message/timer event is reached, the same request scope applies to all code executed in between. Once the state is persisted, the request scope ends and the cache is cleaned up automatically. Of course, we use CDI on JBoss7 for this.
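The lifecycle described in b) can be sketched without a container. In CDI the cache would be a request-scoped bean that the container creates and destroys; the hypothetical `UnitOfWorkCache` below only imitates that lifetime with try-with-resources, and `ServiceTaskChain` is a made-up stand-in for two service tasks that run in the same transaction.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Stand-in for a request-scoped bean: one instance per unit of work
// (everything executed between two wait states), discarded afterwards.
class UnitOfWorkCache implements AutoCloseable {
    private final Map<String, Object> entries = new HashMap<>();

    // Returns the cached object, or loads it once through the loader.
    @SuppressWarnings("unchecked")
    <T> T get(String key, Function<String, T> loader) {
        return (T) entries.computeIfAbsent(key, loader);
    }

    @Override
    public void close() {
        // A container would do this when the request scope ends.
        entries.clear();
    }
}

class ServiceTaskChain {
    static int remoteCalls = 0;

    // Pretend remote access to the business object.
    static String loadCustomer(String id) {
        remoteCalls++;
        return "customer:" + id;
    }

    // Two service tasks in the same transaction share one cache.
    static void runUnitOfWork(String customerId) {
        try (UnitOfWorkCache cache = new UnitOfWorkCache()) {
            cache.get(customerId, ServiceTaskChain::loadCustomer); // Send To A
            cache.get(customerId, ServiceTaskChain::loadCustomer); // Send To B
        }
    }
}
```

Each call to `runUnitOfWork` starts with an empty cache, so nothing survives a wait state and no task has to know whether it is the first or the last one to touch the customer.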

This leaves us with some blind spots, for example asynchronous execution, where we need to reload the complete business objects although only seconds might have passed. Overall, though, it reduces access to remote data sources to an acceptable level.

What didn't work for us (so don't try this at home) is using a ThreadLocal "cache". Since you might get the same thread from the thread pool again before it was cleaned, it gets really messy really quickly. And if you invest in safe ThreadLocal caching, you are back to "code defines process order" again.
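The thread reuse problem is easy to reproduce with plain JDK classes. In this small sketch the class and method names are hypothetical, and a one-thread pool stands in for the job executor's thread pool.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalCachePitfall {
    // Naive per-thread "cache" holding the customer of the current job.
    static final ThreadLocal<String> CACHED_CUSTOMER = new ThreadLocal<>();

    // Runs two jobs on a one-thread pool and returns what the second job
    // finds in the cache before it has loaded anything itself.
    static String seenBySecondJob(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable firstJob = () -> {
            try {
                CACHED_CUSTOMER.set("customer:A");
            } finally {
                // The housekeeping that is easy to forget.
                if (cleanUp) {
                    CACHED_CUSTOMER.remove();
                }
            }
        };
        // Same pooled thread, but logically a different process instance.
        Callable<String> secondJob = CACHED_CUSTOMER::get;
        try {
            pool.submit(firstJob).get();
            return pool.submit(secondJob).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Without the cleanup, the second job sees the first job's customer; with it, the slot is empty again. The catch is that the `remove()` has to run at the right point of every possible path through the process.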

Long story short (we should update the wiki):

1.) It is a good idea to keep only ids in payload and access the business object when needed.

2.) When running inside a CDI container, use @RequestScoped to cache business objects.

Thanks for staying with me

Jan

webcyberrob

Jan 15, 2014, 7:03:00 PM

to camunda-...@googlegroups.com

Jan,

Thanks for the great response. Our thinking seems to be completely aligned, so it is very encouraging that by consensus we seem to be doing the same things.

I'd also thought about ThreadLocal caching, but I didn't mention it as I thought it was getting too complex, particularly with the async job executor. I don't have the luxury of CDI at the moment; however, using JPA & Hibernate, I can at least leverage the Hibernate caching. The downside is this only works for relatively 'local' business objects...

I like and agree with the principles, so thanks for explicitly writing them up.

regards

Rob

Bernd Rücker (camunda)

Jan 17, 2014, 2:51:37 AM

to camunda-...@googlegroups.com

Hi Rob and Jan.

Excellent discussion – perfect list of options Rob. I totally agree with everything said here. So I would definitely try to go with option 5 and fall back to option 1 if 5 is impossible or too much effort. In case of JPA/Hibernate the EntityManager should actually do 5 automatically. A ThreadLocal cache would not be rocket science for other cases in my eyes though – but you have to do housekeeping right indeed.

I would only store local copies (e.g. as process variables, as described in option 2) if it is an explicit business requirement (e.g. access the rating you loaded at the time the customer registered, or work on a copy of the customer data until it is merged into the real system after approval).
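As a rough illustration of such a deliberate copy, here is a sketch in plain Java. Nothing in it is engine API: `CrmClient`, `InMemoryCrm`, `RegistrationSnapshot` and the variable name `ratingAtRegistration` are made up, and a plain Map stands in for the process variables.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical CRM access; only the rating matters for this sketch.
interface CrmClient {
    String ratingOf(String customerId);
}

// Toy CRM whose data can change while the process is running.
class InMemoryCrm implements CrmClient {
    private final Map<String, String> ratings = new HashMap<>();

    void setRating(String customerId, String rating) {
        ratings.put(customerId, rating);
    }

    @Override
    public String ratingOf(String customerId) {
        return ratings.get(customerId);
    }
}

class RegistrationSnapshot {
    // Copies the rating as it is right now into process variables.
    // From this point on the value is a copy and will not follow the CRM.
    static Map<String, Object> capture(CrmClient crm, String customerId) {
        Map<String, Object> variables = new HashMap<>();
        variables.put("customerId", customerId);
        variables.put("ratingAtRegistration", crm.ratingOf(customerId));
        return variables;
    }
}
```

If the rating later changes in the CRM, `ratingAtRegistration` keeps its old value. That is exactly the point when the business requirement asks for the value at registration time, and exactly the problem when it does not.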

I would never try to do option 3 or 4.

Option 6 is indeed interesting, but I see it as completely additional. You might want to store some key data duplicated as process variables for several reasons: quick access, the possibility to easily query for them in the process engine, and some other good example I forgot at the moment ;-) But be aware that these are copies, so it is best to use only immutable data (immutable in terms of business requirements).

I would love to put all this stuff into a well-written tutorial, as we already discussed it in detail in our working group, but we are all lacking the time to do so at the moment :-/ Volunteers for a first draft here on the list??

Cheers

Bernd

Consultant & Evangelist (www.camunda.org/community/team.html)

We are hiring: http://www.camunda.org/community/jobs.html
