RSB - Application Crash due to unhandled NHibernate Exception when setting ThreadCount > 1

Adam

Feb 1, 2011, 12:51:50 PM
to Rhino Tools Dev
Hi,

I've successfully implemented RSB+NHibernate for a project and
everything works great when I have the RSB thread count set to 1. If I
try to increase the thread count, I will eventually get an unhandled
exception from NHibernate causing the entire application to fail
(service to stop when running as a service). The unhandled exception
is logged in the Event Log, and in my log4Net log, I get 3 errors from
NHibernate at the same time talking about a deadlock.

I am fully expecting a deadlock to occur in a multi-threaded
environment, and sometimes the RSB application processes several
hundred messages before crashing, other times it's just a handful.
I've implemented a MessageModule to handle the NHibernate session as
shown in Ayende's MSDN article and also as shown in a couple of posts
on this group.

I'm going to dig into this a little more, as well as make sure I'm
running the latest versions of RSB and NHibernate, but thought I would
post now to see if anyone has an idea of where I could focus my effort
in resolving this issue.

Unhandled Exception:

Application: Rhino.ServiceBus.Host.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.InvalidOperationException
Stack:
at NHibernate.AdoNet.ConnectionManager.Disconnect()
at NHibernate.AdoNet.ConnectionManager.Close()
at NHibernate.Impl.SessionImpl.Close()
at NHibernate.Impl.SessionImpl.Dispose(Boolean)
at NHibernate.Impl.SessionImpl.CloseSessionFromDistributedTransaction()
at NHibernate.Transaction.AdoNetWithDistributedTransactionFactory+<>c__DisplayClass1.<EnlistInDistributedTransactionIfNeeded>b__0(System.Object, System.Transactions.TransactionEventArgs)
at System.Transactions.TransactionCompletedEventHandler.Invoke(System.Object, System.Transactions.TransactionEventArgs)
at System.Transactions.TransactionStatePromotedAborted.EnterState(System.Transactions.InternalTransaction)
at System.Transactions.InternalTransaction.DistributedTransactionOutcome(System.Transactions.InternalTransaction, System.Transactions.TransactionStatus)
at System.Transactions.Oletx.RealOletxTransaction.FireOutcome(System.Transactions.TransactionStatus)
at System.Transactions.Oletx.OutcomeEnlistment.InvokeOutcomeFunction(System.Transactions.TransactionStatus)
at System.Transactions.Oletx.OletxTransactionManager.ShimNotificationCallback(System.Object, Boolean)
at System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(System.Object, Boolean)



Errors in log4net Logging, which I expect, and also get sometimes w/o
the whole application crashing.


2011-02-01 11:28:20,499 [18] WARN
NHibernate.Util.ADOExceptionReporter [(null)] -
System.Data.SqlClient.SqlException (0x80131904): Transaction (Process
ID 59) was deadlocked on lock resources with another process and has
been chosen as the deadlock victim. Rerun the transaction.
at System.Data.SqlClient.SqlConnection.OnError(SqlException
exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning()
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior,
SqlCommand cmdHandler, SqlDataReader dataStream,
BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject
stateObj)
at
System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds,
RunBehavior runBehavior, String resetOptionsString)
at
System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior
cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean
async)
at
System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior
cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String
method, DbAsyncResult result)
at
System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(DbAsyncResult
result, String methodName, Boolean sendToPipe)
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at System.Data.SqlClient.SqlCommandSet.ExecuteNonQuery()
at NHibernate.AdoNet.SqlClientSqlCommandSet.ExecuteNonQuery() in d:
\CSharp\NH\nhibernate\src\NHibernate\AdoNet
\SqlClientSqlCommandSet.cs:line 117
at
NHibernate.AdoNet.SqlClientBatchingBatcher.DoExecuteBatch(IDbCommand
ps) in d:\CSharp\NH\nhibernate\src\NHibernate\AdoNet
\SqlClientBatchingBatcher.cs:line 91
2011-02-01 11:28:20,570 [18] ERROR
NHibernate.Util.ADOExceptionReporter [(null)] - Transaction (Process
ID 59) was deadlocked on lock resources with another process and has
been chosen as the deadlock victim. Rerun the transaction.
2011-02-01 11:28:20,574 [18] ERROR
NHibernate.Event.Default.AbstractFlushingEventListener [(null)] -
Could not synchronize database state with session
NHibernate.Exceptions.GenericADOException: could not execute batch
command.[SQL: SQL not available] --->
System.Data.SqlClient.SqlException: Transaction (Process ID 59) was
deadlocked on lock resources with another process and has been chosen
as the deadlock victim. Rerun the transaction.
at System.Data.SqlClient.SqlConnection.OnError(SqlException
exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning()
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior,
SqlCommand cmdHandler, SqlDataReader dataStream,
BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject
stateObj)
at
System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds,
RunBehavior runBehavior, String resetOptionsString)
at
System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior
cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean
async)
at
System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior
cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String
method, DbAsyncResult result)
at
System.Data.SqlClient.SqlCommand.InternalExecuteNonQuery(DbAsyncResult
result, String methodName, Boolean sendToPipe)
at System.Data.SqlClient.SqlCommand.ExecuteNonQuery()
at System.Data.SqlClient.SqlCommandSet.ExecuteNonQuery()
at NHibernate.AdoNet.SqlClientSqlCommandSet.ExecuteNonQuery() in d:
\CSharp\NH\nhibernate\src\NHibernate\AdoNet
\SqlClientSqlCommandSet.cs:line 117
at
NHibernate.AdoNet.SqlClientBatchingBatcher.DoExecuteBatch(IDbCommand
ps) in d:\CSharp\NH\nhibernate\src\NHibernate\AdoNet
\SqlClientBatchingBatcher.cs:line 91
--- End of inner exception stack trace ---
at
NHibernate.AdoNet.SqlClientBatchingBatcher.DoExecuteBatch(IDbCommand
ps) in d:\CSharp\NH\nhibernate\src\NHibernate\AdoNet
\SqlClientBatchingBatcher.cs:line 103
at
NHibernate.AdoNet.AbstractBatcher.ExecuteBatchWithTiming(IDbCommand
ps) in d:\CSharp\NH\nhibernate\src\NHibernate\AdoNet
\AbstractBatcher.cs:line 431
at NHibernate.AdoNet.AbstractBatcher.ExecuteBatch() in d:\CSharp\NH
\nhibernate\src\NHibernate\AdoNet\AbstractBatcher.cs:line 416
at NHibernate.AdoNet.AbstractBatcher.OnPreparedCommand() in d:
\CSharp\NH\nhibernate\src\NHibernate\AdoNet\AbstractBatcher.cs:line
168
at NHibernate.AdoNet.AbstractBatcher.PrepareCommand(CommandType
type, SqlString sql, SqlType[] parameterTypes) in d:\CSharp\NH
\nhibernate\src\NHibernate\AdoNet\AbstractBatcher.cs:line 155
at
NHibernate.Persister.Entity.AbstractEntityPersister.Update(Object id,
Object[] fields, Object[] oldFields, Object rowId, Boolean[]
includeProperty, Int32 j, Object oldVersion, Object obj,
SqlCommandInfo sql, ISessionImplementor session) in d:\CSharp\NH
\nhibernate\src\NHibernate\Persister\Entity
\AbstractEntityPersister.cs:line 2722
at
NHibernate.Persister.Entity.AbstractEntityPersister.UpdateOrInsert(Object
id, Object[] fields, Object[] oldFields, Object rowId, Boolean[]
includeProperty, Int32 j, Object oldVersion, Object obj,
SqlCommandInfo sql, ISessionImplementor session) in d:\CSharp\NH
\nhibernate\src\NHibernate\Persister\Entity
\AbstractEntityPersister.cs:line 2689
at
NHibernate.Persister.Entity.AbstractEntityPersister.Update(Object id,
Object[] fields, Int32[] dirtyFields, Boolean hasDirtyCollection,
Object[] oldFields, Object oldVersion, Object obj, Object rowId,
ISessionImplementor session) in d:\CSharp\NH\nhibernate\src\NHibernate
\Persister\Entity\AbstractEntityPersister.cs:line 2965
at NHibernate.Action.EntityUpdateAction.Execute() in d:\CSharp\NH
\nhibernate\src\NHibernate\Action\EntityUpdateAction.cs:line 79
at NHibernate.Engine.ActionQueue.Execute(IExecutable executable) in
d:\CSharp\NH\nhibernate\src\NHibernate\Engine\ActionQueue.cs:line 136
at NHibernate.Engine.ActionQueue.ExecuteActions(IList list) in d:
\CSharp\NH\nhibernate\src\NHibernate\Engine\ActionQueue.cs:line 126
at NHibernate.Engine.ActionQueue.ExecuteActions() in d:\CSharp\NH
\nhibernate\src\NHibernate\Engine\ActionQueue.cs:line 170
at
NHibernate.Event.Default.AbstractFlushingEventListener.PerformExecutions(IEventSource
session) in d:\CSharp\NH\nhibernate\src\NHibernate\Event\Default
\AbstractFlushingEventListener.cs:line 241

Corey Kaylor

Feb 1, 2011, 12:55:34 PM
to rhino-t...@googlegroups.com
Is your session thread-static and set up with an IMessageModule?

Adam

Feb 1, 2011, 2:30:35 PM
to Rhino Tools Dev
Yes. I followed the example in this MSDN article:
http://msdn.microsoft.com/en-us/magazine/ff796225.aspx.

Corey Kaylor

Feb 1, 2011, 2:41:47 PM
to rhino-t...@googlegroups.com
Sorry, I realized you had already said that after I sent the question. Are you running under the default isolation level, Serializable?

Adam

Feb 1, 2011, 2:53:33 PM
to Rhino Tools Dev
Yes.

Adam

Feb 1, 2011, 3:19:51 PM
to Rhino Tools Dev
I decided to try an isolation level of Read Committed. 90% of the way
through my load test I have seen no deadlocks in the service bus
application, and thus no application crashes. I'm going to review my
code/requirements to see if this isolation level is acceptable. I
would welcome any comments from people who have changed the default
isolation level from Serializable to something else. Why did you do
it? Any gotchas or non-obvious things to be aware of?
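
For anyone comparing the two settings, here is a minimal sketch of how an isolation level is requested on a System.Transactions scope. It is an illustration only: `ReadCommittedScope` and `Demo` are hypothetical names, not part of RSB or NHibernate. In a service bus the transport usually opens the ambient transaction before the consumer runs, so the level normally belongs in the bus configuration (a `queueIsolationLevel` option is mentioned later in this thread) rather than in consumer code. A nested scope that asks for a different level than the ambient transaction throws an ArgumentException.

```csharp
using System;
using System.Transactions;

// Hypothetical helper, for illustration only.
public static class ReadCommittedScope
{
    public static TransactionScope Create()
    {
        var options = new TransactionOptions
        {
            // TransactionScope uses Serializable when no level is specified.
            IsolationLevel = IsolationLevel.ReadCommitted,
            Timeout = TransactionManager.DefaultTimeout
        };

        // Required joins an ambient transaction if one exists, otherwise starts one.
        return new TransactionScope(TransactionScopeOption.Required, options);
    }
}

public static class Demo
{
    public static void Main()
    {
        using (var scope = ReadCommittedScope.Create())
        {
            // Inspect the level of the transaction that is now ambient.
            Console.WriteLine(Transaction.Current.IsolationLevel);
            scope.Complete();
        }
    }
}
```

The main trade-off of Read Committed is that non-repeatable reads and phantom rows become possible inside one transaction, so any handler that reads a row and later decides based on that read should be reviewed.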

Jason Meckley

Feb 1, 2011, 4:49:27 PM
to rhino-t...@googlegroups.com
I would profile your SQL queries to see whether they can be tuned: indexing, joins, etc.
Deadlocks shouldn't crash the service. They will fail the consumption of the message, but they shouldn't bring down the entire service.
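
A sketch of how a consumer could tell a deadlock apart from other failures, assuming the System.Data.SqlClient provider shown in the log above. `DeadlockDetection` is a hypothetical helper, not an RSB or NHibernate API. It relies on SQL Server reporting the "chosen as the deadlock victim" error as number 1205, and it walks the InnerException chain because NHibernate wraps the SqlException in a GenericADOException.

```csharp
using System;
using System.Data.SqlClient;

// Hypothetical helper, for illustration only.
public static class DeadlockDetection
{
    // SQL Server error number for "chosen as the deadlock victim".
    private const int DeadlockVictimErrorNumber = 1205;

    public static bool IsDeadlockVictim(Exception exception)
    {
        // Walk the wrapper chain until a SqlException is found or the chain ends.
        for (var current = exception; current != null; current = current.InnerException)
        {
            var sqlException = current as SqlException;
            if (sqlException != null && sqlException.Number == DeadlockVictimErrorNumber)
            {
                return true;
            }
        }

        return false;
    }
}
```

A consumer could use this to log deadlocks at a lower severity and rethrow, so the message still fails and is left for the bus to retry.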

Corey Kaylor

Feb 1, 2011, 4:57:03 PM
to rhino-t...@googlegroups.com
I do remember, though, that under certain circumstances the DTC callback can crash the application when it fires. In other words, I don't believe it's something we can catch. The same thing can happen with LINQ to SQL or Entity Framework.

The lines that are suspect...

João Bragança

Feb 3, 2011, 2:48:41 PM
to rhino-t...@googlegroups.com
Getting the same problem here. I think the best solution is to run the
bus single threaded, and then move slow message consumers to separate
processes. However, this seems like it would become difficult to
maintain. Is there any interest in a project that could ease this
configuration burden? I am thinking of something like

import MyConsumers

partition MyConsumers.OrderSaga:
    numberOfRetries 2
    queueIsolationLevel ReadCommitted
    // other familiar rhino esb facility config options here
partition MyConsumers.OrderView
partition MyConsumers.CustomerView:
    disabled

Every other message consumer in the indicated assembly not defined
here would get stuffed into a default endpoint. Then it would leverage
RemoteAppDomainHost for each partition. It would need to let you
define a convention to say 'this message is a command' so we can add
its default endpoint to all the configurations.


Corey Kaylor

Feb 3, 2011, 3:34:15 PM
to rhino-t...@googlegroups.com
I'm actually a fan of Udi's approach and isolating consumers / sagas into individual endpoints. It wouldn't be too hard to manage if using something like Topshelf that can spool up an appdomain for each folder under a /Services folder where an assembly implements something like IBootStrapper. I had plans to incorporate a hosting option that uses Topshelf for this reason.

By having all your messages in separate endpoints they will no longer compete with each other for things like this and will typically perform pretty well on one thread. Also when a subscriber goes down, it doesn't affect the SLA of other things trying to send messages to other subscribers unrelated to the one that went down. One message handler that takes 30 seconds doesn't affect another that is much faster and receives more messages. It also allows you to scale individual components that need to scale, not the entire thing just because a couple don't perform well.

Also once you've deployed this way performance counters could paint a very interesting picture of what's going on with your business, not just what's going on with EndpointA that handles 200 types of messages.

Adam

Feb 3, 2011, 4:42:39 PM
to Rhino Tools Dev
I agree with your assessment. While I wasn't able to fix the
specific issue, I was able to remove some functionality that was not
actually needed (updating rows in the DB), as well as move to Read
Committed for the isolation level. Now I'm running with 10 threads
with no problem.

On Feb 1, 3:57 pm, Corey Kaylor <co...@kaylors.net> wrote:
> I do remember under certain circumstances though that when the DTC callback
> occurs it can crash the application. In other words, I don't believe it's
> something we can catch. This same thing can happen with Linq to SQL or
> Entity Framework also.
>
> The lines that are suspect...
>
> System.Transactions.Oletx.OutcomeEnlistment.InvokeOutcomeFunction(System.Transactions.TransactionStatus)
>   at
> System.Transactions.Oletx.OletxTransactionManager.ShimNotificationCallback(System.Object,
> Boolean)
>   at
> System.Threading._ThreadPoolWaitOrTimerCallback.PerformWaitOrTimerCallback(System.Object,
> Boolean)
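
Because the frames quoted above run on a thread-pool callback, no try/catch inside a message consumer ever sees the exception. A minimal sketch, assuming all you want is a last-chance log entry before the process exits: `LastChanceLogging` is a hypothetical name, and the handler only records the failure. It does not keep the process alive.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class LastChanceLogging
{
    // Call once at startup, before the bus starts processing messages.
    public static void Register()
    {
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
    }

    private static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // ExceptionObject is typed as object; WriteLine formats it via ToString().
        Console.Error.WriteLine(
            "Unhandled exception (terminating: {0}): {1}",
            e.IsTerminating,
            e.ExceptionObject);
    }
}
```

In a real service the Console.Error call would be replaced by whatever logger is already configured, for example the log4net logger used elsewhere in this thread.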

Adam

Feb 3, 2011, 4:49:48 PM
to Rhino Tools Dev
While I'm not at a point that requires functionality like you are
describing, I can see where it would be useful. I also begin to see
similarities with the AppFabric add-in for IIS, at least from the
management point of view. Instead of hosting in a Windows service, you
would host it as a WCF app (with an MSMQ endpoint?). You get the nice
management console that AppFabric provides, and isolation between
requests. I have no idea if it would be practical to implement, just
that I see some similarities, and I hate re-inventing the wheel ;)

Adam

João Bragança

Feb 3, 2011, 4:52:41 PM
to rhino-t...@googlegroups.com
Definitely gonna take a look at Topshelf for this. Don't really want
to give each of my consumers a separate project like Udi suggests.
Looks like I might be able to make a subdirectory and drop a .config
inside for each message consumer with the ShelfConfiguration section.
Then come up with a convention to define all the default endpoints for
commands and programmatically configure each bus instance.

Corey Kaylor

Feb 3, 2011, 5:03:14 PM
to rhino-t...@googlegroups.com
I like the sounds of that.

Matt Burton

Feb 3, 2011, 5:23:40 PM
to rhino-t...@googlegroups.com
I'm using TopShelf right now with RSB, but be forewarned, the TopShelf
"shelving" support is not fully baked. When I last tried it (~1 month
ago - could have changed since, haven't been keeping track) the
auto-restart capabilities were not working as advertised yet. It's a
neat concept but just not there yet. I don't like a lot of projects
either, so what I'm doing is spinning up RemoteAppDomainHosts for
bootstrapper types that I find. Currently all of my endpoints are in
one assembly, so I'm just scanning the types in that assembly looking
for types derived from my bootstrapper base type (little special sauce
on top of the RSB AbstractBootstrapper) but could expand that to do
full assembly scanning within the app directory if needed. I then have
a config file per endpoint, and they get spun up in separate app
domains inside the main TopShelf-based service host executable like
so:

foreach (var bootstrapperServiceName in bootstrapperTypes.Keys)
{
    // Copy the loop variable so each lambda captures its own value.
    string serviceName = bootstrapperServiceName;

    x.ConfigureService<RemoteAppDomainHost>(s =>
    {
        var type = bootstrapperTypes[serviceName];

        s.Named(serviceName);
        s.HowToBuildService(name => new RemoteAppDomainHost(type)
            .Configuration("Config/" + serviceName + ".config"));
        s.WhenStarted(host => host.Start());
    });
}

I don't have auto-restart - don't know if I'll need it yet - but
beyond that it's working very smoothly. Installing the service is as
simple as:

HostName.exe install

Pretty happy for the time being.

Thanks,
Matt


Corey Kaylor

Feb 4, 2011, 12:01:23 PM
to rhino-t...@googlegroups.com
A conventional approach or DSL with MSMQ might not be too hard. Rhino Queues in its current form might not be as easy. I've felt for a while that Rhino Queues would be better deployed on its own, where processes on the machine can communicate with it through a channel of some sort. This would also make an administrative UI much easier to build in isolation.

João Bragança

Feb 25, 2011, 8:51:40 PM
to rhino-t...@googlegroups.com
Example project up at

https://github.com/JoaoBraganca/rhino-esb-topshelf

TODO:
Support rhino queues (hard to do since IIRC each queue needs to be on
its own port)
Build script
Readme.md

Corey Kaylor

Feb 25, 2011, 9:07:07 PM
to rhino-t...@googlegroups.com
Nice! 


João Bragança

Mar 3, 2011, 7:41:26 PM
to rhino-t...@googlegroups.com
Well I got it working more or less for msmq (needed lots of tweaking).
Can't seem to get this working for RQ because of the ports issue.
Coming up with a process level singleton that deterministically
assigns a port number to an endpoint is not as fun as it sounds.

I think I would need to implement Corey's idea of one RQ instance per
machine (just like msmq). Looking at the RQ code, it looks like
rhino.queues://localhost:2200/a and rhino.queues://localhost:2200/b
should work (c/d?). The only snag is that RhinoQueuesTransport creates
a new QueueManager when it starts.

Trying to think of what would be the simplest thing here... we could
throw IQueueManager behind wcf using a named pipe, and install as a
windows service. But I know that ayende has a special place in his
heart for WCF so that may not be the way to go. Thoughts?

Corey Kaylor

Mar 3, 2011, 8:19:21 PM
to rhino-t...@googlegroups.com
NamedPipeClientStream could be another option that doesn't require WCF. I haven't thought through the various implications of this, though (transactions, etc.). Sending may need a bit of tweaking to keep one queue's outgoing messages from degrading another's.
