Support for Oracle 9 is a very minor thing in the backend code itself -- only a handful of lines -- but these lines execute a command on the backend to get the server version, at every connection, and our data indicates this command does affect performance. You might expect that it just looks up a constant, but in fact it is a stored procedure that looks in several places to calculate not just the software version, but also the interface compliance level.
In the backend, the version info is only used to tell if we're still on 9. If we drop 9 support, we don't need that anymore.
For the benefit of those who need to know the version (e.g. 3rd-party code which uses specific Oracle 11 features), I propose we change oracle_version from an attribute on the connection that is filled at connection time, to a property that proxies the cx-oracle connection property. That property is memoized (at the connection-object level) anyway, so for code that relies on it, the only difference would be in when the server statements are executed, while for the rest (most?) of us, the statements will just not be run.
So -- can we drop Oracle 9? Should I open a ticket for it?
> Support for Oracle 9 is a very minor thing in the backend code itself -- only > a handful of lines -- but these lines execute a command on the backend to get > the server version, at every connection, and our data indicates this command > does affect performance. You might expect that it just looks up a constant, > but in fact it is a stored procedure that looks in several places to calculate > not just the software version, but also the interface compliance level.
> In the backend, the version info is only used to tell if we're still on 9. If > we drop 9 support, we don't need that anymore.
> For the benefit of those who need to know the version (e.g. 3rd-party code > which uses specific Oracle 11 features), I propose we change oracle_version > from an attribute on the connection that is filled at connection time, to a > property that proxies the cx-oracle connection property. That property is > memoized (at the connection-object level) anyway, so for code that relies on > it, the only difference would be in when the server statements are executed, > while for the rest (most?) of us, the statements will just not be run.
> So -- can we drop Oracle 9? Should I open a ticket for it?
-0. This isn't really like deprecating old Python versions, which is necessary to keep the codebase unified while gaining access to new features. We have a certain subset of SQL that we need to be supported by a backend, and that subset doesn't grow very quickly. Django works on Oracle 9 with the one exception of regex query filters, and that status is unlikely to change with new Oracle versions. The main reason to drop Oracle 9 support would be to support more recent features, and I'm not aware of anything particularly compelling for use by Django.
I would also want to poll the community before dropping support, to verify that few users are still using Oracle 9 with Django. We still have some Oracle 9 databases where I work, although none running Django at this point. I've even (recently) heard of organizations still stuck on Oracle 8.
The version check that you point out does not really need to be done on every connection, I think. If we make a connection and find one version and then later make a new connection using the same DatabaseWrapper, we expect the database to still be the same version. I would propose that instead of dropping support, we make this check once per DatabaseWrapper per thread, in the same way that the LIKE operators are already configured.
On Nov 17, 3:00 am, Ian Kelly <ian.g.ke...@gmail.com> wrote:
> The version check that you point out does not really need to be done > on every connection, I think. If we make a connection and find one > version and then later make a new connection using the same > DatabaseWrapper, we expect the database to still be the same version. > I would propose that instead of dropping support, we make this check > once per DatabaseWrapper per thread, in the same way that the LIKE > operators are already configured.
+1. There is some other cleanup needed in there, too. In DatabaseOperations: """ def regex_lookup(self, lookup_type): # If regex_lookup is called before it's been initialized, then create # a cursor to initialize it and recur. self.connection.cursor() return self.connection.ops.regex_lookup(lookup_type) """ in DatabaseWrapper.cursor for new connections: """ try: self.oracle_version = int(self.connection.version.split('.')[0]) # There's no way for the DatabaseOperations class to know the # currently active Oracle version, so we do some setups here. # TODO: Multi-db support will need a better solution (a way to # communicate the current version). if self.oracle_version <= 9: self.ops.regex_lookup = self.ops.regex_lookup_9 else: self.ops.regex_lookup = self.ops.regex_lookup_10 except ValueError: pass """
1. It could be better if regex_lookup did the redirection to the correct regex_lookup implementation. As is, dbwrapper needs to know internals of ops, and ops rely on internals of dbwrapper. 2. What happens if that ValueError is hit and regex_lookup is called? I hope it isn't actually possible to hit the ValueError. 3. Is that TODO still accurate?
<anssi.kaariai...@thl.fi> wrote: > 1. It could be better if regex_lookup did the redirection to the > correct regex_lookup implementation. As is, dbwrapper needs to know > internals of ops, and ops rely on internals of dbwrapper.
Good call. There's really no reason for any of this to be in the wrapper. Moving it all into ops will result in cleaner code, and if the user is not using regex filters, there is no need to do a version check in the first place.
> 2. What happens if that ValueError is hit and regex_lookup is called? > I hope it isn't actually possible to hit the ValueError.
AFAIK it's not. The ValueError only guards against getting a weirdly formatted version string from the database. If it did get hit, I think that regex_lookup would result in an infinite recursion, which we should definitely fix.
> 3. Is that TODO still accurate?
No. I think it was based on an incorrect assumption about how multi-db would work.
> On Mon, Nov 21, 2011 at 3:53 PM, Anssi Kääriäinen
> <anssi.kaariai...@thl.fi> wrote: > > 1. It could be better if regex_lookup did the redirection to the > > correct regex_lookup implementation. As is, dbwrapper needs to know > > internals of ops, and ops rely on internals of dbwrapper.
> Good call. There's really no reason for any of this to be in the > wrapper. Moving it all into ops will result in cleaner code, and if > the user is not using regex filters, there is no need to do a version > check in the first place.
> > 2. What happens if that ValueError is hit and regex_lookup is called? > > I hope it isn't actually possible to hit the ValueError.
> AFAIK it's not. The ValueError only guards against getting a weirdly > formatted version string from the database. If it did get hit, I > think that regex_lookup would result in an infinite recursion, which > we should definitely fix.
> > 3. Is that TODO still accurate?
> No. I think it was based on an incorrect assumption about how > multi-db would work.
I can do a patch for this if needed. Are you (or somebody else) working on this?
> On Tuesday 22 November 2011, Anssi Kääriäinen wrote: > > I can do a patch for this if needed. Are you (or somebody else) > > working on this?
> While you're at it, [...]
While we're all at it, I would like to bring to your attention another suggested fix for Oracle. The patch attached here tells the backend that if a fetch operation retrieved less than the requested array size, then no further fetch attempts are necessary.
The fix is motivated by our DBA's finding, that many single-row-selects are followed by double fetches. When I checked the code, I realized that with the exception of aggregates, this is true for all single-row selects: whether it originates from qset.get(...) or from qset[j], the QuerySet code adjusts the SQL and then calls fetchmany in a loop until no results are returned. Since neither cx_oracle nor the Oracle client library take it upon themselves to stop such requests, they go all the way to the database.
The Python DB-API[0] says on fetchmany: """ The method should try to fetch as many rows as indicated by the size parameter. If this is not possible due to the specified number of rows not being available, fewer rows may be returned. """ I interpret that as "no rows should be fetched after a call failed to fill the buffer". Under a loose-enough transaction isolation level, rows may "come into being" in mid-transaction, but I don't think it is reasonable to expect them to do so in mid-statement.
That being said, the problem that this patch fixes is a little hard to test for; it essentially requires mocking the database library. Also, I'm not sure this is the right fix; I don't know if other backends have the same issue. If they do, it might be better to make qset[j] just use fetchone instead of fetchmany, like the aggregates do; and it might be worthwhile to also change the qset.get() logic (in a more subtle way -- we don't want to lose the check for multiple records returned).
As for the patch, it is only "tested" in the sense that one Django-1.2 based application continues to run correctly with (essentially) it; I don't(yet) have an Oracle+Trunk testing environment.
On Thu, Nov 24, 2011 at 3:00 PM, Shai Berger <s...@platonix.com> wrote: > While we're all at it, I would like to bring to your attention another > suggested fix for Oracle. The patch attached here tells the backend that if a > fetch operation retrieved less than the requested array size, then no further > fetch attempts are necessary.
> The fix is motivated by our DBA's finding, that many single-row-selects are > followed by double fetches. When I checked the code, I realized that with the > exception of aggregates, this is true for all single-row selects: whether it > originates from qset.get(...) or from qset[j], the QuerySet code adjusts the > SQL and then calls fetchmany in a loop until no results are returned. Since > neither cx_oracle nor the Oracle client library take it upon themselves to > stop such requests, they go all the way to the database.
There is nothing specific to Oracle about this. This optimization should be performed in the query engine, not in the backend.
> The Python DB-API[0] says on fetchmany: > """ > The method should try to fetch as many rows as indicated by the size > parameter. If this is not possible due to the specified number of rows not > being available, fewer rows may be returned. > """ > I interpret that as "no rows should be fetched after a call failed to fill the > buffer". Under a loose-enough transaction isolation level, rows may "come into > being" in mid-transaction, but I don't think it is reasonable to expect them > to do so in mid-statement.
> That being said, the problem that this patch fixes is a little hard to test > for; it essentially requires mocking the database library. Also, I'm not sure > this is the right fix; I don't know if other backends have the same issue. If > they do, it might be better to make qset[j] just use fetchone instead of > fetchmany, like the aggregates do; and it might be worthwhile to also change > the qset.get() logic (in a more subtle way -- we don't want to lose the check > for multiple records returned).