Handling of SQL deadlocks for parallel messages returning to a saga in a fork and merge process


Simon Davis

July 11, 2016, 22:39:17
To: masstransit-discuss
I'm trying to handle multiple messages of the same type returning to my saga, which then need to update the saga state to track the number of the messages which have completed.

In my specific case I could trigger up to nine messages to be processed in parallel, which then need to be merged and update the saga state. However, I am currently getting SQL connection deadlocks when testing with just two messages.

I'm using the EntityFrameworkSagaRepository to manage the Entity Framework DbContext for the saga.

I'm going to experiment with adding a retry for the connection, but I don't see this as a very scalable solution when a larger number of messages return at the same time.

Has anyone had to solve a similar issue and have an idea how to implement this use case?

Ben Poelstra

July 12, 2016, 13:33:34
To: masstransit-discuss
I had this happen to me too. Here is the best solution that I have come up with. 

1. Create a table in which each row represents one forked message and stores the result of its processing. I created this table outside the SagaRepository, but it stores the saga correlation ID and the individual message correlation ID.
2. Insert the rows before the fork.
3. Update the row when the fork has completed processing in the consumer.
4. Publish a message for merging when each fork has finished.
5. Validate that all the forked messages are finished and then publish a transition message.

I had issues where it would publish more than one transition message because they were happening in parallel, so I had to mark the whole thing as complete on the saga instance and validate that it wasn't set before publishing the transition message.

Hope that helps.
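
The steps above, including the guard against publishing more than one transition message, can be sketched roughly as follows. This is only an in-memory stand-in for the tracking table, written under assumptions: `ForkTracker` and `CompleteFork` are hypothetical names that do not come from MassTransit or from the post, and a real version would keep the rows in a database and do the check-and-mark step inside a transaction.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for the fork tracking table.
public sealed class ForkTracker
{
    private readonly object _gate = new object();

    // Key: individual message correlation ID; value: whether that fork has finished.
    private readonly Dictionary<Guid, bool> _forks = new Dictionary<Guid, bool>();

    // Plays the role of the "whole thing is complete" marker on the saga instance.
    private bool _mergeCompleted;

    public Guid SagaCorrelationId { get; }

    public ForkTracker(Guid sagaCorrelationId, IEnumerable<Guid> forkIds)
    {
        SagaCorrelationId = sagaCorrelationId;
        foreach (var id in forkIds)
            _forks[id] = false; // rows are inserted before the fork
    }

    // Marks one fork as finished. Returns true for exactly one caller:
    // the one that sees every fork finished and flips the completion marker.
    // Only that caller should publish the transition message.
    public bool CompleteFork(Guid forkId)
    {
        lock (_gate)
        {
            if (!_forks.ContainsKey(forkId))
                throw new ArgumentException("Unknown fork id", nameof(forkId));

            _forks[forkId] = true;

            if (_mergeCompleted)
                return false; // someone already published the transition

            foreach (var finished in _forks.Values)
            {
                if (!finished)
                    return false; // still waiting for other forks
            }

            _mergeCompleted = true;
            return true;
        }
    }
}
```

A consumer would call `CompleteFork` when its fork finishes and publish the transition message only when the call returns true. The lock stands in for whatever atomic check the real data store provides.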

Alexey Zimarev

July 13, 2016, 13:32:50
To: masstransit-discuss
I have a similar issue using the RavenDb saga repository. It happens with just two messages and a CombinedEvent: the waiting bit was being overwritten in a race condition. When I turned on RavenDb optimistic concurrency, it started to fail on the wrong instance version. I do have the retry policy configured, but this seems to have no effect on the saga. At least for me it does not retry, it just crashes the whole thing. I do not have a solution for this yet; we are using the in-memory repository for now since we are not in production, but this is not acceptable.

My view on this is that messages related to one saga instance should not be processed in parallel, but rather put in a sequence or queue in memory and not ACKed until the saga has actually processed them. When saga state gets updated (this includes joins) and messages come fast, parallel processing causes version conflicts.
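
To make the idea of serializing messages per saga instance concrete, here is a minimal in-process sketch. It is an illustration only, not a MassTransit feature: `PerSagaSerializer` and `RunAsync` are hypothetical names, the gate only works inside a single process, and the per-saga semaphores are never cleaned up.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs handlers for the same correlation ID one at a time,
// while handlers for different correlation IDs can still run in parallel.
public sealed class PerSagaSerializer
{
    private readonly ConcurrentDictionary<Guid, SemaphoreSlim> _gates =
        new ConcurrentDictionary<Guid, SemaphoreSlim>();

    public async Task RunAsync(Guid correlationId, Func<Task> handle)
    {
        // One semaphore per saga instance, allowing a single handler at a time.
        var gate = _gates.GetOrAdd(correlationId, _ => new SemaphoreSlim(1, 1));

        await gate.WaitAsync();
        try
        {
            await handle();
        }
        finally
        {
            gate.Release();
        }
    }
}
```

With several service instances consuming from the same queue this gate is not enough on its own, so the data store still needs its own concurrency check.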

Chris Patterson

July 13, 2016, 13:35:49
To: masstrans...@googlegroups.com
You should be able to use the retry policy specifically for this situation. It's one of the key features of MT.

endpoint.UseRetry(Retry.Interval(...));
endpoint.StateMachineSaga(sagaRepository, stateMachine);

The order matters: the retry must come before the saga/consumer/etc.
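
Why the order matters can be illustrated with a toy pipeline. This is a simplified model under the assumption that each stage wraps only the stages registered after it; it is not MassTransit's actual implementation, and `ToyPipeline`, `UseRetry`, `Handler`, and `Build` here are hypothetical names for the sketch.

```csharp
using System;
using System.Collections.Generic;

// Toy model of a middleware pipeline built in registration order.
public sealed class ToyPipeline
{
    // Each stage receives the rest of the pipeline and decides how to invoke it.
    private readonly List<Func<Action, Action>> _stages = new List<Func<Action, Action>>();

    // Retries everything registered AFTER this stage, up to retryLimit extra attempts.
    public void UseRetry(int retryLimit) => _stages.Add(next => () =>
    {
        for (var attempt = 0; ; attempt++)
        {
            try
            {
                next();
                return;
            }
            catch (Exception) when (attempt < retryLimit)
            {
                // swallow the failure and try again
            }
        }
    });

    // Stands in for a consumer or saga: does its work, then calls the next stage.
    public void Handler(Action handle) => _stages.Add(next => () =>
    {
        handle();
        next();
    });

    public Action Build()
    {
        Action pipeline = () => { };
        // Compose from the last stage backwards so earlier stages wrap later ones.
        for (var i = _stages.Count - 1; i >= 0; i--)
            pipeline = _stages[i](pipeline);
        return pipeline;
    }
}
```

In this model, calling `UseRetry` before `Handler` means a failing handler is invoked again, while calling it after `Handler` leaves the retry stage wrapping nothing, so the first exception escapes.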


--
You received this message because you are subscribed to the Google Groups "masstransit-discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to masstransit-dis...@googlegroups.com.
To post to this group, send email to masstrans...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/masstransit-discuss/f4301245-3e36-4bf2-8dc8-e866083b04b9%40googlegroups.com.

For more options, visit https://groups.google.com/d/optout.

Alexey Zimarev

July 13, 2016, 13:41:07
To: masstrans...@googlegroups.com
I didn’t know about configuration order, will try that. I thought about retries after looking at MongoDbSagaRepository since it throws an exception when it finds an override.

Alexey Zimarev

July 13, 2016, 13:54:34
To: masstransit-discuss
Yes, we have it wrong:


            ec.LoadConsumers(context, contextName);
            ec.LoadStateMachineSagas(context, contextName);

            if (serviceBusSettings.Retries > 0)
            {
                ec.UseRetry(Retry.Exponential(
                    serviceBusSettings.Retries,
                    serviceBusSettings.RetryMinInterval,
                    serviceBusSettings.RetryMaxInterval,
                    serviceBusSettings.RetryIntervalDelta));
            }

Chris Patterson

July 13, 2016, 14:01:54
To: masstrans...@googlegroups.com
Yeah, that does essentially nothing, since the UseRetry() is after the consumers and state machines are configured on the pipeline.

Moving the UseRetry above the other lines will resolve your issue.
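
For reference, the reordered configuration might look like the fragment below. It is not self-contained: `ec`, `context`, `contextName`, `serviceBusSettings`, `LoadConsumers`, and `LoadStateMachineSagas` are taken as-is from the snippet posted earlier in the thread and are assumed to be defined by the surrounding application code.

```csharp
// Configure retry first so it applies to everything registered after it.
if (serviceBusSettings.Retries > 0)
{
    ec.UseRetry(Retry.Exponential(
        serviceBusSettings.Retries,
        serviceBusSettings.RetryMinInterval,
        serviceBusSettings.RetryMaxInterval,
        serviceBusSettings.RetryIntervalDelta));
}

// Consumers and state machine sagas are registered after the retry.
ec.LoadConsumers(context, contextName);
ec.LoadStateMachineSagas(context, contextName);
```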


Alexey Zimarev

July 15, 2016, 10:56:34
To: masstransit-discuss
I actually found what my problem is.

The composite event status is never updated because this struct has no setter, so RavenDb cannot deserialize it back.
Saving works, but reading from the database always returns the initial state.

How critical is it to change from no setter to a private setter? I can submit a PR.

Chris Patterson

July 15, 2016, 11:19:26
To: masstrans...@googlegroups.com
Just use an int; there is no reason to use that struct any longer. An int serializes much better into every data store known to man.
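
As a rough illustration of how a plain int can carry the status of several awaited events, here is a minimal bit-flag sketch. It makes no claim about how the library encodes its composite event status internally; `MergeState`, `CompositeStatus`, and `CompositeFlags` are hypothetical names, and the sketch assumes fewer than 31 events.

```csharp
using System;

// Hypothetical saga state; the property name is illustrative only.
public class MergeState
{
    // A plain int with a public setter, so a serializer can round-trip it.
    public int CompositeStatus { get; set; }
}

public static class CompositeFlags
{
    // Sets the bit for event number `index` (0-based) and returns the new status.
    public static int Raise(int status, int index) => status | (1 << index);

    // True once the bits for all `eventCount` events are set.
    public static bool IsComplete(int status, int eventCount)
    {
        var allBits = (1 << eventCount) - 1;
        return (status & allBits) == allBits;
    }
}
```

Because the status is an ordinary int property, it is stored and read back like any other number, which avoids the missing-setter problem described above.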



Alexey Zimarev

July 15, 2016, 11:45:29
To: masstrans...@googlegroups.com
OK, this makes sense. Will do that, thanks for the suggestion.
