From:
http://msdn.microsoft.com/en-us/library/system.data.linq.datacontext.aspx
Any public static (Shared in Visual Basic) members of this type
are thread safe. Any instance members are not guaranteed to be
thread safe.
Granted, this is for .NET's DataContext type, but DbLinq aims to
implement the same API, with the same constraints, as .NET's API.
In short, DataContext is NOT thread safe, never has been, likely never
will be, so your scenario is broken by design.
Sorry.
> For connection pooling in ADO.Net a new database connection should be
> created each time a databse connection object is needed and the
> connection pool takes care of making connection re-use efficient. That
> solves the problem for multi-threaded apps as long as they don't try
> and pass a database connection object around which they shouldn't.
This is likely a good idea anyway (and is now at the end of my rather
lengthy TODO list).
That said, following this behavior will not magically make DataContext
thread safe, and thus won't make DataContext suitable for your uses.
(For example, objects returned by queries are "tied" to the DataContext
instance that created them for change tracking purposes, and these
objects are stored in a collection, so your multi-threaded access of the
DataContext may very well corrupt this collection, which will cause no
end of debugging pain....)
- Jon
Currently dblinq seems to work in this way: new IDbConnection() reads
connection from static thread-safe ado .net pool (if ado provider implements
pooling).
IDbConnection.Dispose() issued by connection creator (dblinq or user who
passes connection) returns connection to pool.
So I don'nt undrestand what to do?
Andrus.
This is a known issue. (Known to me as of two days ago. ;-)
The larger problem is that the current mechanism interacts *horribly*
with transactions (i.e. it doesn't work with DataContext.Transaction AT
ALL). The result is that I've had to disable several unit tests which
require use of Transactions (see my prior message about "unit tests
which altered the database"; I'm using Transactions in some tests to
prevent the changes from being persisted).
(The problem is that DataContext.Transaction is currently ignored, and
my feeble attempts so far at making DataContext.SubmitChanges() support
DataContext.Transaction results in "transactions within transactions are
not supported" errors, because in various places DatabaseContext is
being used when executing the SQL commands, and may implicitly create a
new context. This is obviously BAD.)
I think the entire use of DbConnection/etc. needs to be reworked
(possibly by removing DatabaseContext?), but I'm not sure when I'll get
to it. :-/
- Jon
I'm still not seeing the benefit here. DataContext.Connection already
exposes the IDbConnection instance to use, and DataContext appears to be
accessible from nearly everywhere of consequence.
DatabaseConnection does have one benefit: it can centralize the logic
for when to close the connection, as required in
http://msdn.microsoft.com/en-us/library/bb292288.aspx:
A DataContext opens and closes a database connection as needed
if you provide a closed connection or a connection string. In
general, you should never have to call Dispose on a DataContext.
If you provide an open connection, the DataContext will not
close it. Therefore, do not instantiate a DataContext with an
open connection unless you have a good reason to do this. In a
System.Transactions transaction, a DataContext will not open or
close a connection to avoid promotion.
So DatabaseConnection handles the implicit opening and closing of the
IDbConnection when it's needed. That seems fairly useful to me.
(Though it should be enhanced to check for presence of a transaction and
not open if a transaction is in process, though this might not be
necessary -- shouldn't the connection already be open for it to be part
of a transaction?)
DatabaseContext, on the other hand, has methods that aren't used used
anywhere (as far as VS is concerned) such as DatabaseContext.Connect(),
or DatabaseContext(DbProviderFactory) (why should we care about
DbProviderFactory?).
It's apparently "utility for the sake of utility" rather than necessity,
and what's necessary is *working* Transaction support (e.g. get
Transactions.TransactionRollbackDelete() and
Insert_Update_Delete.LinqToSqlInsert06() working), along with supporting
the documented DataContext IDbConnection semantics (e.g. open the
connection for as short a period as possible, leave open if the
connection was originally open, wrap DataContext.SubmitChanges() within
a transaction UNLESS DataContext.Transaction isn't null, in which case
we reuse the existing transaction, etc.).
My suspicion is that just getting transactions working properly will
result in either simplifying or removing many of these types, so I would
suggest trying that first. :-)
> I don't mind having a crack at implementing this but would like to get
> an indication that the design at least sounds ok. Otherwise there
> could be an obvious critical thing I'm missing.
I don't see how your approach of using GetConnection() would work,
actually. DbLinq cannot require creation of new instances, as the
IDbConnection instance may be provided to the constructor, and
IDbConnection isn't ICloneable. How could we reliably create new
connections?
The only solution I see, baring glaring bits of idiocy on my part, is to
open and close the IDbConnection on-demand whenever we perform a SQL
command, with the proviso that we don't close the connection if it was
already open when the DataContext was created. DatabaseConnection
should help with this logic.
Thanks,
- Jon
This is also wrong. MSDN explicitly documents[0] that the IDbConnection
will be kept open for as short a period as possible, for each SQL
request sent to the DB. Consequently, assuming that the IDbConnection
is closed during DataContext construction, it will be .Open()ed
AND .Close()d for each SQL request, which could be more than once.
Specifically, at minimum every DataContext.SubmitChanges() call should
result in both opening and closing the IDbConnection (along with every
query traversal, etc.).
- Jon
[0] http://msdn.microsoft.com/en-us/library/bb292288.aspx
So in your experience, long lived IDbConnection instances are bad. What
is long lived -- seconds or minutes?
I'll go out on a limb and assume that seconds is not long-lived. :-)
That said, the guideline [0] seems to be that a DataContext should be
created as a method local variable, not as a class field. Thus the
DataContext lifetime should be fairly short (i.e. seconds, not minutes),
removing most issues with reusing IDbConnection instances.
For example, most [1] code doesn't provide an IDbConnection instance,
but instead uses the connection string. This will BY NECESSITY result
in the creation of a *new* IDbConnection instance every time a
DataContext is created. (Using the connection string has added benefits
such as allowing specification of the SQL vendor w/o using DbLinq
extensions such as the DataContext(IDbConnection,IVendor) constructor.)
In short, for apps following existing guidelines (short-lived
DataContexts and connection strings), I don't think reusing the
IDbConnection instance will cause any problems. For apps which do
explicitly provide an IDbConnection, then the onus is on them to
recreate the IDbConnection as appropriate, as DbLinq won't do so (nor
should it be expected to do so, given the design).
- Jon
[0] I'd swear I saw a better guideline elsewhere on MSDN, but this is
the best my Google-fu can currently find:
http://social.msdn.microsoft.com/Forums/en-US/linqprojectgeneral/thread/ea6fe19c-cd21-40d1-b005-750af6632bfe
[1] "most" is, of course, pulled OOMA, but this is true for most example
code I've seen on MSDN.
I would have agreed with you, prior to reading the documentation, but
then I started looking at the object tracking implementation. In order
to perform updates, DataContext needs to track *every* entity[0] object
it returns, and additionally track property changes on these objects.
If you have DataContext living longer than a method call, and reuse it,
it would not seem that unusual for it to be tracking a *huge* number of
objects. (Specifically, I wouldn't be surprised if it were tracking
millions of objects if it were used with a busy web service within a few
days time, if not sooner.) Needing to track large quantities of objects
will increase pressure on the GC, increase memory use, and slow down the
application (as more OS disk cache is used as memory use increases).
It only gets worse when you realize that there's no mechanism to clear
the tracked object cache, meaning you'd have to create a new DataContext
instance at some point anyway, just to clear out unused entities...
> It's arguable the
> identity management and query caching in DbLinq won't be of much
> benefit if the DataContext is only used for single units of work.
I'm not sure about identity management, but I'm dubious about query
tracking (as I've mentioned numerous times), as the only way to usefully
cache the query is to use a global cache, which could have similar
unbounded growth issues.
On the other hand, CompiledQuery doesn't suffer from these problems, as
it's held by the developer, doesn't refer to a DataContext (it needs to
be provided to the CompiledQuery on each delegate call), and isn't
"invisible global memory use." But this is a discussion for another
time....
- Jon
[0] "entity" defined as "MappingSource.GetModel(Type) returns non-null
for the entity's type."
It's arguable the
identity management and query caching in DbLinq won't be of much
benefit if the DataContext is only used for single units of work.