Unique CharField greater than 256 characters on MySQL Fails - what is the correct solution

1,083 views
Skip to first unread message

adamiis111

unread,
Oct 16, 2008, 6:28:19 PM10/16/08
to Django developers
I opened up a ticket regarding a bug in the django-app-plugin with
regards to the indexed CharField being too large
for MySQL 5.0:

http://code.google.com/p/django-app-plugins/issues/detail?id=11

I think that maybe Django's handling of this should be examined more
generally for one of the following options.

1) Django should disallow CharFields max-length>255.
2) Django should silently use TEXT as the datatype when CharField is
greater than 255.
3) Django should silently use TEXT as the datatype when CharField is
greater than 255 and max-length>255.
3) Django should disallow unique=true and max-length>255.

Other thoughts? I literally just moved over from PHP this week so I'm
probably not in a great position to be committing patches although I'm
fine with that. Otherwise, I would appreciate feedback and a
direction on this.

Thanks,
Adam

Malcolm Tredinnick

unread,
Oct 16, 2008, 7:09:13 PM10/16/08
to django-d...@googlegroups.com

On Thu, 2008-10-16 at 15:28 -0700, adamiis111 wrote:
> I opened up a ticket regarding a bug in the django-app-plugin with
> regards to the indexed CharField being too large
> for MySQL 5.0:
>
> http://code.google.com/p/django-app-plugins/issues/detail?id=11
>
> I think that maybe Django's handling of this should be examined more
> generally for one of the following options.
>
> 1) Django should disallow CharFields max-length>255.

For MySQL versions prior to 5.0.3, Django will raise a validation error
(when you run "manage.py validate") when you have a CharField with a
max_length > 255. That's been in place for about two years.

There's a problem with external apps when they want to use a field that
large. It means they're basically saying the requirement is a more
recent version of MySQL, which might well be something they're happy
with. Otherwise they have to know about that restriction and change to a
TextField. However Django isn't going to flat-out forbid such fields
just because older versions of one particular database backend have this
restriction. That would unfairly handicap everybody else because of a
problem in MySQL. Thus we make it a validation error only for particular
versions of MySQL.

And, by the way, if you're not running "manage.py validate" every now
and again to check for errors like that, you should start doing so. We
only check that stuff when you run validate to avoid unnecessary
overhead at runtime, since development sanity checking should be done at
development time, not runtime.

Regards,
Malcolm

Dj Gilcrease

unread,
Oct 16, 2008, 7:18:47 PM10/16/08
to django-d...@googlegroups.com
The max length of a varchar field is Database dependent

MySQL it is 255
MSSQL it is 2^31-1 (2gb)
PostgreSQL it is ~ 2^20-1 (1gb)

MySQL 4.1 and greater supposedly changes any varchar or char field
with a max length or greater to 255 to a TEXT field which can hold ~
1gb

Dj Gilcrease
OpenRPG Developer
~~http://www.openrpg.com

adamiis111

unread,
Oct 17, 2008, 9:25:37 AM10/17/08
to Django developers
Malcom,

Thanks for the well-explained response. It's a bit more complicated
than that though as this is only an issue if the field uses a UNIQUE
index. It is an issue up through the latest and greatest stable MySQL
(tested on 5.0.45 and no bug fixes at MySQL up through 5.0.67) and "./
manage.py validate" triggers 0 errors with a model using
CharText(255+) and UNIQUE and using the mysql database type.

As an aside, are most people using Postgres? Should I switch to that
for the best tested database for Django?

Thanks,
Adam

On Oct 16, 7:09 pm, Malcolm Tredinnick <malc...@pointy-stick.com>
wrote:

adamiis111

unread,
Oct 17, 2008, 10:01:25 AM10/17/08
to Django developers
How about we raise a validation error on UNIQUE CharField fields
greater than 255 characters for all MySQL 5.0 implementations.

I think the docs should be updated as well telling people that it's
not best practice to use CharField for fields that need more than 255
characters (or at least put a note in there so MySQL users no that
they can't mix 255+ character CharField and UNIQUE). Even from a
typical web usage standpoint, it seems that fields with more than 255
characters are better handled under the TextField model.

The only reason I'm trying to be a stickler about this is because this
code is ending up in what are meant to portable external libraries
(django-app-plugins) that are being redistributed into larger projects
(pinax).

I'll move to Postgres today :-)

Thanks,
Adam

Ian Holsman

unread,
Oct 19, 2008, 11:58:09 PM10/19/08
to django-d...@googlegroups.com
With MySQL it depends on the character set you are creating the table with.
for example with mysql 5.0.45:

mysql> create table foo ( x varchar(999), primary key (x)) character set
= 'latin1';
Query OK, 0 rows affected (0.00 sec)

mysql> create table foo2 ( x varchar(999), primary key (x)) character
set = 'utf8';
ERROR 1071 (42000): Specified key was too long; max key length is 999 bytes

mysql> create table foo2 ( x varchar(255), primary key (x)) character
set = 'utf8';
Query OK, 0 rows affected (0.01 sec)


So.. if you can live without the benefit of UTF8 you can achieve your
result.

personally I think you should look at your DB-design, as indexes on
large fields are expensive in terms of space used and time to retrieve
results. you could possibly index the MD5 of the field (32 bytes), and
then search for duplicates on that key. it might be faster and cheaper
on the disk.

regards
Ian

Adam Nelson

unread,
Oct 20, 2008, 9:47:23 AM10/20/08
to Django developers
The specific problem was already fixed at
http://code.google.com/p/django-app-plugins/issues/detail?id=11

I was speaking of the more general problem about what Pinax's stance
with regards to these type of limitations. From the limited feedback,
it sounds like our minimum standard will be UTF8 for all applications
- which means that apps/libs that can't work on the current stable
MySQL using the UTF8 character set would have to be patched in order
to be included.

As for the indexing of the hash, I think that's kind of overkill. It
would reduce the column (including index) size by about 3/4 but it
would add processing overhead . Anyway, this is only a table used to
hold no more than 100 records since it's just a glue table to connect
plugins to eachother.

Still, I don't own the app so I don't have write privileges to it.
It's bundled into Pinax :-)

Thanks,
Adam

Malcolm Tredinnick

unread,
Oct 22, 2008, 3:25:18 AM10/22/08
to django-d...@googlegroups.com

On Fri, 2008-10-17 at 06:25 -0700, adamiis111 wrote:
> Malcom,
>
> Thanks for the well-explained response. It's a bit more complicated
> than that though as this is only an issue if the field uses a UNIQUE
> index. It is an issue up through the latest and greatest stable MySQL
> (tested on 5.0.45 and no bug fixes at MySQL up through 5.0.67) and "./
> manage.py validate" triggers 0 errors with a model using
> CharText(255+) and UNIQUE and using the mysql database type.

Then you should open a ticket, even better if it includes a patch, so
that behaviour similar to that already in
django/db/backends/mysql/validation.py is done for those fields and
those MySQL versions as well.

Again, MySQL-specific validation is quite appropriate here.

>
> As an aside, are most people using Postgres?

Who knows? There's no way of answering a question like that reliably.
Also, raw numbers don't tell the whole story, since the usefulness of a
database backend varies depending on the domain it's being used in.

> Should I switch to that
> for the best tested database for Django?

*shrug* MySQL is a supported backend for Django. Bugs in Django's
support of MySQL should be fixed and if we aren't throwing and error for
the case you describe and if it is a real limitation of MySQL, we should
be doing so, as mentioned above.

Regards,
Malcolm


Reply all
Reply to author
Forward
0 new messages