Problem with history view in admin page

85 views
Skip to first unread message

Rahul

unread,
Nov 23, 2009, 8:12:38 AM11/23/09
to Django developers, ope...@us.ibm.com
Hi All,

There is a problem reported for history view in admin page.
For history view there is a sql query generated, which trying to do
exact look-up on OBJECT_ID column (LogEntry model, DJANGO_ADMIN_LOG
table).

"TextField" is responsible for handling large text and in DB2
"TextField" is mapped to CLOB.
DB2 doesn't support exact look-up with CLOB datatype.

When I checked the values for this OBJECT_ID column in the database,
it only contains integer (1, 2, 3, etc.).

Now the question arises: Why "OBJECT_ID" defined as "TextField" type?
In most of the cases "XXX_ID" field is expected to be INTEGER.

Is there any specific reason why "OBJECT_ID" defined as "TextField" in
LogEntry model in "django\contrib\admin\models.py".

The original post of this problem is available on
http://groups.google.com/group/ibm_db/browse_thread/thread/368e5739eb358e20

Regards,
Rahul Priyadarshi

Russell Keith-Magee

unread,
Nov 23, 2009, 10:05:54 AM11/23/09
to django-d...@googlegroups.com
On Mon, Nov 23, 2009 at 9:12 PM, Rahul <rahul.pr...@in.ibm.com> wrote:
> Hi All,
>
> There is a problem reported for history view in admin page.
> For history view there is a sql query generated, which trying to do
> exact look-up on OBJECT_ID column (LogEntry model, DJANGO_ADMIN_LOG
> table).
>
> "TextField" is responsible for handling large text and in DB2
> "TextField" is mapped to CLOB.
> DB2 doesn't support exact look-up with CLOB datatype.
>
> When I checked the values for this OBJECT_ID column in the database,
> it only contains integer (1, 2, 3, etc.).
>
> Now the question arises: Why "OBJECT_ID" defined as "TextField" type?
> In most of the cases "XXX_ID" field is expected to be INTEGER.
>
> Is there any specific reason why "OBJECT_ID" defined as "TextField" in
> LogEntry model in "django\contrib\admin\models.py".

Yes.

The reason is that while *most* objects will use an integer as the
primary key, not *all* objects use an integer as a primary key. A
TextField is the most generic storage that can be provided to store
*any* kind of primary key - integer, char, or otherwise. What is
stored in this field is a string-serialized representation of the
primary key value.

LogEntry.get_edited_object then issues the query to find the actual
object that is associated with the id; this relies on the fact that
Django queries will coerce string input into an appropriate data type
for a primary key lookup.

Yours
Russ Magee %-)

Karen Tracey

unread,
Nov 23, 2009, 10:24:28 AM11/23/09
to django-d...@googlegroups.com
Note Oracle had the same problem:

http://code.djangoproject.com/ticket/5087

I can't actually read Oracle but I think the fix involves adding a cast that extracts the first 4000 chars of data from the field for comparison.  They override the field_cast_sql method in the Oracle DatabaseOperations to add the cast to any field type that ends with LOB.

Karen

Mario Briggs

unread,
Nov 24, 2009, 12:07:35 AM11/24/09
to Django developers
>>
What is stored in this field is a string-serialized representation of
the
primary key value.
<<

I agree that INTEGER is not the right choice, but then so too is CLOB.
How long is this string-serialized representation going to be? greater
than 4000 characters ? Varchar(X) where X is > 4000 or something is
then the right choice. This is validated by what Karen says is the
Oracle fix.

Note: the original workaround provided to Helgi was to use varchar
(3000) and if willing to be more adventurous then try INTEGER

thanks
Mario

Russell Keith-Magee

unread,
Nov 24, 2009, 1:00:50 AM11/24/09
to django-d...@googlegroups.com
On Tue, Nov 24, 2009 at 1:07 PM, Mario Briggs <mario....@in.ibm.com> wrote:
>>>
> What is stored in this field is a string-serialized representation of
> the
> primary key value.
> <<
>
> I agree that INTEGER is not the right choice, but then so too is CLOB.
> How long is this string-serialized representation going to be? greater
> than 4000 characters ? Varchar(X)  where X is > 4000 or something is
> then the right choice. This is validated by what Karen says is the
> Oracle fix.

Well, Django doesn't make the decision to use CLOB - that's in the
hands of your backend. In the same circumstances, SQLite and Postgres
use 'text'. MySQL uses 'longtext'. Oracle uses 'NCLOB'

In theory, the contents of the object_id field could be anything -
including a string of arbitrary length (i.e., a TextField). However,
in practice, I would be surprised to see 4000+ characters for
object_id - most primary keys are going to be integers, and the ones
that aren't are likely to be short strings or string-like datatypes.

I can't comment on what would be an appropriate datatype for DB2 in
this context.

Yours,
Russ Magee %-)

Ian Kelly

unread,
Nov 24, 2009, 12:25:32 PM11/24/09
to django-d...@googlegroups.com
On Mon, Nov 23, 2009 at 11:00 PM, Russell Keith-Magee
<freakb...@gmail.com> wrote:
> On Tue, Nov 24, 2009 at 1:07 PM, Mario Briggs <mario....@in.ibm.com> wrote:
>> I agree that INTEGER is not the right choice, but then so too is CLOB.
>> How long is this string-serialized representation going to be? greater
>> than 4000 characters ? Varchar(X)  where X is > 4000 or something is
>> then the right choice. This is validated by what Karen says is the
>> Oracle fix.
>
> Well, Django doesn't make the decision to use CLOB - that's in the
> hands of your backend. In the same circumstances, SQLite and Postgres
> use 'text'. MySQL uses 'longtext'. Oracle uses 'NCLOB'

The decision is based on the field type, not on the individual field.
If it were possible within the Oracle backend to override this
particular field to VARCHAR2, we would happily do it. But TextFields
in general need to be LOBs.

> In theory, the contents of the object_id field could be anything -
> including a string of arbitrary length (i.e., a TextField). However,
> in practice, I would be surprised to see 4000+ characters for
> object_id - most primary keys are going to be integers, and the ones
> that aren't are likely to be short strings or string-like datatypes.

Due to the hackish way the Oracle backend implements lookups on LOBs,
it will still fail if the object_id is actually longer than 4000
characters. It might even raise an error (I haven't tried it). So at
least in Oracle we gain nothing by making this a TextField rather than
a CharField(max_length=4000). Plus, if it were a CharField, we would
be able to index it.

Correct me if I'm wrong, but I think the only way to get an object_id
so large that it won't fit inside a backend-dependent CharField is if
the primary key of the model is itself a TextField. If you ask me,
anybody foolish enough to use a TextField as a primary key deserves
what they get. Oracle won't even allow it.

Regards,
Ian

Russell Keith-Magee

unread,
Nov 24, 2009, 7:09:47 PM11/24/09
to django-d...@googlegroups.com
Your comment about foolishness is definitely correct :-) As for the
CharField point - following this logic, we could define object_id as
a CharField(max_length=N), for some large N. However, in the general
case, we can't specify an N that is greater than all N's that
end-users might use. Therefore, we use a TextField, which is unbound.

This is a slightly moot point, though; Django defines the admin log
object_id using a TextField, and changing this would be a big
backwards incompatibility.

Yours
Russ Magee %-)

Karen Tracey

unread,
Nov 24, 2009, 7:28:40 PM11/24/09
to django-d...@googlegroups.com
On Tue, Nov 24, 2009 at 7:09 PM, Russell Keith-Magee <freakb...@gmail.com> wrote:
This is a slightly moot point, though; Django defines the admin log
object_id using a TextField, and changing this would be a big
backwards incompatibility.


Also I believe there's at least one other place in Django that does similar -- the object_pk in the base contrib.comments model is also a TextField.  I suspect any DB backend that  has trouble with the admin history view is also going to run into trouble with comments.

Karen

Mario Briggs

unread,
Nov 24, 2009, 11:54:56 PM11/24/09
to Django developers
Russell,

> >> Well, Django doesn't make the decision to use CLOB - that's in the
> >> hands of your backend. In the same circumstances, SQLite and Postgres
> >> use 'text'. MySQL uses 'longtext'. Oracle uses 'NCLOB'

I understand that the decision to what 'Text' should be mapped to is
the choice of the backend. Given the examples you gave what the DB2
adapter is doing is right.

IMHO during DB design, choosing a column' type as CLOB when it really
wont be varcharXXX doesnt cut it from multiple angles.

If the DB2 backend did the same hack as Karen mentioned, we would be
guilty of abetting bad database design. Therefore we will not do that.
Instead we will let users run into errors when they do a '=' compare
of a Django Text type, and then let them ponder whether they need that
column as a CLOB in the backend.

Mario

Richard Laager

unread,
Nov 25, 2009, 12:47:51 AM11/25/09
to django-d...@googlegroups.com
On Tue, 2009-11-24 at 20:54 -0800, Mario Briggs wrote:
> Instead we will let users run into errors when they do a '=' compare
> of a Django Text type, and then let them ponder whether they need that
> column as a CLOB in the backend.

Shouldn't something be changing that = into a LIKE (and escaping LIKE
metacharacters in the parameter)?

I don't know why a user should have <queryset>.filter(object_id='1')
fail, as that breaks the ORM abstraction. Maybe that's not what you're
suggesting?

Richard

signature.asc

Yuri Baburov

unread,
Nov 25, 2009, 3:24:48 AM11/25/09
to django-d...@googlegroups.com
Hi Russell,

is it possible to introduce some new field type
ShortTextField for that purpose, that will be by default
`varchar(4000)` on Oracle and DB2 who supports long varchars, and
`text` on other backends like it was before, excepting 'smalltext'
instead of 'longtext' for mysql?
I think this is a point where "TextField is accessed by value"
abstraction breaks, and better separation between long string and
referenced object should be introduced.
I assume we have not much Oracle & DB2 users yet, and nothing will
change for them unless they already suffer from this problem and they
will not anymore. Migration script is single "alter table" command,
and that needs to be documented.

On Wed, Nov 25, 2009 at 6:09 AM, Russell Keith-Magee
> --
>
> You received this message because you are subscribed to the Google Groups "Django developers" group.
> To post to this group, send email to django-d...@googlegroups.com.
> To unsubscribe from this group, send email to django-develop...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/django-developers?hl=en.
>
>
>



--
Best regards, Yuri V. Baburov, ICQ# 99934676, Skype: yuri.baburov,
MSN: bu...@live.com

Russell Keith-Magee

unread,
Nov 26, 2009, 2:47:29 AM11/26/09
to django-d...@googlegroups.com
On Wed, Nov 25, 2009 at 4:24 PM, Yuri Baburov <bur...@gmail.com> wrote:
> Hi Russell,
>
> is it possible to introduce some new field type
> ShortTextField for that purpose, that will be by default
> `varchar(4000)` on Oracle and DB2 who supports long varchars, and
> `text` on other backends like it was before, excepting 'smalltext'
> instead of 'longtext' for mysql?

I can see that this approach doesn't have any backward compatibility
issues. However, from a documentation point of view, explaining the
difference between a ShortTextField and a TextField requires a leaky
abstraction (i.e., you have to know about implementation details in
order for the explanation to make sense).

With a different coat of paint, it might be more palatable. A name
like ShortTextField presupposes the storage implementation, but tells
you nothing about the appropriate usage. However, a different name -
something like GenericKeyField tells you nothing about the storage,
but does tell you when it might be appropriate to use it.

Of course GenericKeyField has the problem of being easy to confuse
with GenericForeignKey. For that reason, I'm not sold on
GenericKeyField as a name - any suggestions in this general vein are
welcome.

> I think this is a point where "TextField is accessed by value"
> abstraction breaks, and better separation between long string and
> referenced object should be introduced.

Agreed.

> I assume we have not much Oracle & DB2 users yet, and nothing will
> change for them unless they already suffer from this problem and they
> will not anymore. Migration script is single "alter table" command,
> and that needs to be documented.

I'm always a little nervous about upgrading instructions, and doubly
so about those that include database migrations. History has shown us
that it doesn't matter how well we write upgrading notes - backwards
incompatible changes cause problems. Unless there is a particularly
compelling reason to change the existing Oracle implementation, I'd
rather not force that change.

Yours,
Russ Magee %-)

Yuri Baburov

unread,
Nov 27, 2009, 2:22:41 AM11/27/09
to django-d...@googlegroups.com
Hi Russell,

On Thu, Nov 26, 2009 at 1:47 PM, Russell Keith-Magee
<freakb...@gmail.com> wrote:
> On Wed, Nov 25, 2009 at 4:24 PM, Yuri Baburov <bur...@gmail.com> wrote:
>> Hi Russell,
>>
>> is it possible to introduce some new field type
>> ShortTextField for that purpose, that will be by default
>> `varchar(4000)` on Oracle and DB2 who supports long varchars, and
>> `text` on other backends like it was before, excepting 'smalltext'
>> instead of 'longtext' for mysql?
>
> I can see that this approach doesn't have any backward compatibility
> issues. However, from a documentation point of view, explaining the
> difference between a ShortTextField and a TextField requires a leaky
> abstraction (i.e., you have to know about implementation details in
> order for the explanation to make sense).
>
> With a different coat of paint, it might be more palatable. A name
> like ShortTextField presupposes the storage implementation, but tells
> you nothing about the appropriate usage. However, a different name -
> something like GenericKeyField tells you nothing about the storage,
> but does tell you  when it might be appropriate to use it.
>
> Of course GenericKeyField has the problem of being easy to confuse
> with GenericForeignKey. For that reason, I'm not sold on
> GenericKeyField as a name - any suggestions in this general vein are
> welcome.
First, since we have a lot of fields already like
PositiveIntegerField, there is no problem to add one more specific
field. It's not new kind of fields distinction, it's just one more
specie of them.

But then much more general problem arises: use of CLOB instead of
varchar2 where 255<length<=4000.
To support Oracle, all Django programmers should now be aware of this
Oracle-specific data type distinction, should know that SmallTextField
is better than TextField for some situations, etc.
That's awful.

So the enhanced idea appears:
TextField(max_length=x) where x<=4000 to be treated as varchar2(x) in Oracle.
TextField with greater length and TextField without length to be
treated as NCLOB.

LogEntry.object_id rewritten as TextField(max_length=4000).
Documented as a restriction by design, that prevents bad usages.

This should be documented in Oracle backend docs on data types in bold.
Am I correct that varchar2 is always preferred to NCLOB where appropriate?

>> I think this is a point where "TextField is accessed by value"
>> abstraction breaks, and better separation between long string and
>> referenced object should be introduced.
>
> Agreed.
>
>> I assume we have not much Oracle & DB2 users yet, and nothing will
>> change for them unless they already suffer from this problem and they
>> will not anymore. Migration script is single "alter table" command,
>> and that needs to be documented.
>
> I'm always a little nervous about upgrading instructions, and doubly
> so about those that include database migrations. History has shown us
> that it doesn't matter how well we write upgrading notes - backwards
> incompatible changes cause problems. Unless there is a particularly
> compelling reason to change the existing Oracle implementation, I'd
> rather not force that change.
Yes, they are always a pain, but in such situation I'm feeling like in
a surgeon.
I tend to prefer to do surgery once instead of living with constant pain.
For this problem, pain is not strong, but surgery is quick, and those who use
newer Django won't even know about that pain.
If we'll settle on a good enough solution, of course.

Mario Briggs

unread,
Nov 27, 2009, 3:21:16 AM11/27/09
to Django developers
Richard,

> I don't know why a user should have <queryset>.filter(object_id='1')
> fail, as that breaks the ORM abstraction. Maybe that's not what you're
> suggesting?

I am saying exactly what i am saying. So here's my example that does
the same thing in a java ORM

// Here's the model
@Entity
public class ClobTest {
@Id
@Type(type="text")
private String id ;

private String firstName ;
}

// here's the app
ClobTest c = EntityManager.find(hellojpa.ClobTest.class, "1");

// here's what happens
Exception in thread "main" javax.persistence.PersistenceException:
org.hibernate.exception.SQLGrammarException: could not load an entity:
[hellojpa.ClobTest#1]
at
org.hibernate.ejb.AbstractEntityManagerImpl.throwPersistenceException
(AbstractEntityManagerImpl.java:637)
at org.hibernate.ejb.AbstractEntityManagerImpl.find
(AbstractEntityManagerImpl.java:202)
Caused by: com.ibm.db2.jcc.b.nm: DB2 SQL Error: SQLCODE=-401,
SQLSTATE=42818,
SQL0401N - The data types of the operands for the operation
"<operator>" are not compatible.

I am certain i would get the same exception if i used Oracle as the
DB. So i cant see why the Django Oracle Adapter should cover this.
Russ,

> With a different coat of paint, it might be more palatable. A name
> like ShortTextField presupposes the storage implementation, but tells
> you nothing about the appropriate usage. However, a different name -
> something like GenericKeyField tells you nothing about the storage,
> but does tell you  when it might be appropriate to use it.
>

GenericKeyField - Its hard for me to see that a end-user would not be
interested in the specific type of his column at the database
especially if its a key field.

thanks
Mario

Russell Keith-Magee

unread,
Nov 28, 2009, 2:47:22 AM11/28/09
to django-d...@googlegroups.com
On Fri, Nov 27, 2009 at 4:21 PM, Mario Briggs <mario....@in.ibm.com> wrote:
> Russ,
>
>> With a different coat of paint, it might be more palatable. A name
>> like ShortTextField presupposes the storage implementation, but tells
>> you nothing about the appropriate usage. However, a different name -
>> something like GenericKeyField tells you nothing about the storage,
>> but does tell you  when it might be appropriate to use it.
>>
>
> GenericKeyField  - Its hard for me to see that a end-user would not be
> interested in the specific type of his column at the database
> especially if its a key field.

Depends on the focus of the developer. DB-focussed developers or
developers looking to optimize query performance might care about the
underlying storage. However, developers working at a higher level just
want their data to be stored. The exact database type is irrelevant as
long as queries using that field work in reasonable time.

Django currently uses a TextField is currently storing a
string-serialized form of the key of an arbitrary object. What is
being stored in any given row can be determined by looking at the
content_type of the row (or rather, the primary key type of the model
that the content type refers to). It is used to create log entries
that point at arbitrary objects in the database. From a pure DBA,
third-normal-form perspective, it isn't an especially efficient data
structure - but it *works* for the situation in which it used. At the
end of the day, that's what really matters.

Yours,
Russ Magee %-)

Russell Keith-Magee

unread,
Nov 28, 2009, 3:57:40 AM11/28/09
to django-d...@googlegroups.com
PostiveIntegerField isn't a good example here - in fact, it reinforces
my point. PositiveIntegerField tells you nothing concrete about the
underlying storage - all it describes is a validation condition.

> But then much more general problem arises: use of CLOB instead of
> varchar2 where 255<length<=4000.
> To support Oracle, all Django programmers should now be aware of this
> Oracle-specific data type distinction, should know that SmallTextField
> is better than TextField for some situations, etc.
> That's awful.

Agreed. That's why I don't like the idea of adding
(Short|Small)TextField. However, if you step away from calling it a
TextField, and give it a semantic purpose - storing a Generic Key -
then it isn't confusing at all. If you want to store a generic key,
use a GenericKeyField. If you want to store text, use a TextField.

> So the enhanced idea appears:
> TextField(max_length=x) where x<=4000 to be treated as varchar2(x) in Oracle.
> TextField with greater length and TextField without length to be
> treated as NCLOB.
>
> LogEntry.object_id rewritten as TextField(max_length=4000).
> Documented as a restriction by design, that prevents bad usages.
>
> This should be documented in Oracle backend docs on data types in bold.

I'm not wild about the idea of having underlying datatypes change
based on attributes in a field definition. To date, Django has
maintained a clear mapping between Field type and database type.

> Am I correct that varchar2 is always preferred to NCLOB where appropriate?

On this point I would need to defer to our friendly community Oracle experts.

>>> I think this is a point where "TextField is accessed by value"
>>> abstraction breaks, and better separation between long string and
>>> referenced object should be introduced.
>>
>> Agreed.
>>
>>> I assume we have not much Oracle & DB2 users yet, and nothing will
>>> change for them unless they already suffer from this problem and they
>>> will not anymore. Migration script is single "alter table" command,
>>> and that needs to be documented.
>>
>> I'm always a little nervous about upgrading instructions, and doubly
>> so about those that include database migrations. History has shown us
>> that it doesn't matter how well we write upgrading notes - backwards
>> incompatible changes cause problems. Unless there is a particularly
>> compelling reason to change the existing Oracle implementation, I'd
>> rather not force that change.
> Yes, they are always a pain, but in such situation I'm feeling like in
> a surgeon.
> I tend to prefer to do surgery once instead of living with constant pain.
> For this problem, pain is not strong, but surgery is quick, and those who use
> newer Django won't even know about that pain.

Django has a very firm backwards compatibility guarantee. We are very
careful that when we roll out a new version, it doesn't break any
existing installations because of changes in documented interfaces.
Even if you take one of the biggest changes in Django's post 1.0
history - CSRF support - it has been added in such a way that any
existing installation shouldn't be affected by upgrading from 1.1 to
trunk.

Unfortunately, absent of a schema migration system built into Django,
there simply isn't any way to handle changes to model definitions for
existing Django tables in a completely transparent way. Documenting
changes like this just doesn't cut it - the bad publicity generated by
"code broke when I upgraded" isn't mitigated by saying "but it says do
X in the documentation".

The "cure" administered by the surgeon may be simple, obvious, and
right; but if it isn't completely transparent, I'm afraid you're going
to have an uphill battle to convince me it's the right solution.

Yours,
Russ Magee %-)

Mario Briggs

unread,
Dec 2, 2009, 11:59:48 PM12/2/09
to Django developers
Russ,

I dont agree to the *it works* theory here - Ian rightly said 'If you
ask me,
anybody foolish enough to use a TextField as a primary key deserves
what they get' and you agreed 'Your comment about foolishness is
definitely correct '

Putting it in context, this is in the 'user control area' and the fact
that one can go ahead and change the type in the LogEntry definition
like Helgi did means it not a blockade (though it would be good if
Django got i right first time).

I have another question though but that is due to my lack of wide
usage of Django - so you mentioned that u were scared to make the
changes becuase of problems with migrations - once Helgi makes that
change and the next time he is upgrading to the a higher version of
Django, is there anything he needs to do ?

cheers
Mario

Russell Keith-Magee

unread,
Dec 3, 2009, 3:20:30 AM12/3/09
to django-d...@googlegroups.com
On Thu, Dec 3, 2009 at 12:59 PM, Mario Briggs <mario....@in.ibm.com> wrote:
> Russ,
>
> I dont agree to the *it works* theory here - Ian rightly said 'If you
> ask me,
> anybody foolish enough to use a TextField as a primary key deserves
> what they get' and you agreed  'Your comment about foolishness is
> definitely correct '

I can point at countless Django installations that have perfectly
functioning admin interfaces. I'm sorry to break this to you, but that
means that Django's admin works.

Django assumes that it is possible to do a lookup on a TextField. This
behavior is implemented and usable on SQLite, Postgres, MySQL and
Oracle. Oracle has a slight caveat (4k character limit) on this
capability, but that doesn't place any practical limitations on the
unconventional way that Django uses TextField in the admin.

If the DB2 backend doesn't support lookups on TextFields, then that is
a problem with the DB2 backend, not Django.

> Putting it in context, this is in the 'user control area' and the fact
> that one can go ahead and change the type in the LogEntry definition
> like Helgi did means it not a blockade

I'm not sure what you mean by "user control area", but if you want to
propose manual table modification as a solution to this problem,
that's between you and your userbase.

> (though it would be good if
> Django got i right first time).

Django *did* get it right first time. I've already explained why the
code is the way it is. *None* of the officially supported Django
backends have any problem with this usage. To my reading, this is
*entirely* a problem with the DB2 backend.

Now, if you want to propose a modification to Django that will make it
easier for you to implement the DB2 backend, I'm open to suggestions.
I've already made one such suggestion, which based on your reaction,
you don't like. Now it's your turn to make a counterproposal.

> I have another question though but that is due to my lack of wide
> usage of Django - so you mentioned that u were scared to make the
> changes becuase of problems with migrations - once Helgi makes that
> change and the next time he is upgrading to the a higher version of
> Django, is there anything he needs to do ?

Once Django has rolled out a table, it doesn't try to recreate or
validate that the table is the same at the database level. As long as
the database-level table definition is functionally compatible with
the model definition, you shouldn't have any problems.

Yours,
Russ Magee %-)

Mario Briggs

unread,
Dec 9, 2009, 7:17:00 AM12/9/09
to Django developers
I was too harsh on the GenericKeyField. How about GenericKeyField
(length=x). I think the reason i put length in there is obvious, but i
can explain if need be.

In post #14 on this thread you suggested - " I'm not wild about the
idea of having underlying datatypes change based on attributes in a
field definition. To date, Django has maintained a clear mapping
between Field type and database type."

I am not suggesting that backends change the type in the DB based on
the 'length'. They should use 1 fixed type for GenericKeyField

regards
Mario




Russell Keith-Magee

unread,
Dec 9, 2009, 8:09:09 AM12/9/09
to django-d...@googlegroups.com
On Wed, Dec 9, 2009 at 8:17 PM, Mario Briggs <mario....@in.ibm.com> wrote:
> I was too harsh on the GenericKeyField. How about GenericKeyField
> (length=x). I think the reason i put length in there is obvious, but i
> can explain if need be.

Well, you may have to, because the reason that it doesn't need a
length is also obvious, and I've already explained it. To clarify:

The *only* reason that we are even talking about adding this field is
because DB2 can't index on the datatype that is the natural match for
a length-unlimited TextField.

That means that the DB2 backend needs to be able to differentiate
between text fields that are actually storing length-unlimited text,
and text fields that are being used as a generic key store.

Now, a field is either a generic key field, or it isn't. If it is, it
needs to be able to store as generic a key as the backend will allow.
Ideally, that would be an length-unlimited text field, and on SQLite,
MySQL, Postgres and Oracle, it will be. However, this isn't possible
on DB2; but given that text fields aren't used as primary keys in
practice, a suitably long character limitation will suffice.

How long? Pick a number. It doesn't matter what number you pick - DB2
is the only database that will be affected, and whatever N you pick, I
will be able to produce a model with a CharField primary key with
length N+1 that will break it. You just need to pick an N such that
all practical use cases will be covered. Oracle's indexing imposes an
implicit limit of 4000 characters; that sounds to me like as good a
number as any.

As far as I can make out, there is absolutely no use case for
specifying a length in GenericKeyField. Feel free to let me know the
flaw in my reasoning.

Yours,
Russ Magee %-)

Mario Briggs

unread,
Dec 10, 2009, 3:03:09 AM12/10/09
to Django developers
> The *only* reason that we are even talking about adding this field is
> because DB2 can't index on the datatype that is the natural match for
> a length-unlimited TextField.
> Oracle's indexing imposes an
> implicit limit of 4000 characters; that sounds to me like as good a
> number as any.

Russ,
When i saw Karen's post about the cast to 4k in the adapter, i assumed
that it was not backed by an index in the database. Your above 'text'
made me go and search this in Oracle and i found that u can actually
index the first 4k characters in an Oracle NCLOB. The fact that one
can index it, changes things and let me relook at the needs here

regards
Mario

Mario Briggs

unread,
Jan 7, 2010, 1:02:26 AM1/7/10
to Django developers
Russ,

The indexing that Oracle is supporting is 'index-on-expression'. DB2
also supports that, but it isnt enabled for character long columns,
since they hadnt had a request for that. I talked to the DB2 server
folks about these use-cases and they have agreed to support this via
index-on-expression shortly. For the immediate needs, Rahul is going
to look at supporting this via 'hidden' columns - Whenever there is a
text field with an index specified, the backend will create a hidden
and generated column (poplated with first X chars of the text field).
The index will be created on this hidden column. When there is a
lookup by value, the backend will actually do the lookup on the hidden
indexed column. - you see any blockers to this ?


BTW, index is not turned ON on LogEntry.object_id. Do you feel this
use-case usage is low to warrant that ?

regards
Mario

Russell Keith-Magee

unread,
Jan 7, 2010, 1:24:02 AM1/7/10
to django-d...@googlegroups.com
On Thu, Jan 7, 2010 at 2:02 PM, Mario Briggs <mario....@in.ibm.com> wrote:
> Russ,
>
> The indexing that Oracle is supporting is 'index-on-expression'. DB2
> also supports that, but it isnt enabled for character long columns,
> since they hadnt had a request for that. I talked to the DB2 server
> folks about these use-cases and they have agreed to support this via
> index-on-expression shortly. For the immediate needs, Rahul is going
> to look at supporting this via 'hidden' columns -  Whenever there is a
> text field with an index specified, the backend will create a hidden
> and generated column (poplated with first X chars of the text field).
> The index will be created on this hidden column. When there is a
> lookup by value, the backend will actually do the lookup on the hidden
> indexed column. - you see any blockers to this ?

I'm not sure what you're asking me. "Blockers" of what?

I know almost nothing about DB2; I know only slightly more about
Oracle. I can only advise you on what Django does. In this case, the
code tells you everything you need to know: Django needs to be able to
issue the following query:

action_list = LogEntry.objects.filter(
object_id = object_id,
content_type__id__exact =
ContentType.objects.get_for_model(model).id
).select_related().order_by('action_time')

As long as you can satisfy that requirement, you've done everything
you need to do.

> BTW, index is not turned ON on LogEntry.object_id. Do you feel this
> use-case usage is low to warrant that ?

I can't say the lack of an index on that field has been a performance
problem on any site I've managed, and I don't recall seeing any
reports on the mailing list or Trac that indicate that it has caused a
problem for anyone else.

Yours,
Russ Magee %-)

Mario Briggs

unread,
Jan 7, 2010, 3:57:09 AM1/7/10
to Django developers

>
> I'm not sure what you're asking me. "Blockers" of what?
>

I meant, for sure we know that the backend can switch the lookup to
the hidden column (from the original column) by overriding -
'field_cast_sql(self, db_type):'

The backend also needs to execute the SQL to create the hidden column
during Django's index creation process. Rahul hasnt yet started
digging where to add that. The 'Blocker' question was if uphead anyone
knew that would be problematic for the backend to do ?

regards
Mario

Reply all
Reply to author
Forward
0 new messages