db_index = True on BooleanField (MySQL)

546 views
Skip to first unread message

chefsmart

unread,
Nov 26, 2009, 9:35:29 AM11/26/09
to Django users
I had a fruitless discussion with a fellow coder today. Summarily, we
have a model for students that will keep growing with time. I mean to
say the number of records in the database will keep growing, and no
rows are going to be deleted.

Now, students fall in one category or the other, and this is marked by
a BooleanField.

We are using MySQL. Is it advisable to index that BooleanField (that
is db_index = True in Django)? All queries on the model will 'almost
always' include a check on that BooleanField, that is
Student.object.filter(categorized=False) or Student.object.filter
(categorized=True)

I have spent a lot of time on the MySQL forums to find out an answer
to this, but am as lost as before. Some even said an index on a
Boolean in such a case would degrade performance.

Tom Evans

unread,
Nov 26, 2009, 10:05:18 AM11/26/09
to django...@googlegroups.com
The answer is 'it depends'. Remember that MySQL will only use one index from each table in a query. and that a boolean field has a cardinality of 2, so it potentially doesn't reduce the number of rows to examine.

Furthermore, if (say) 95% of your Students are categorized, and you are looking at Student.objects.filter(categorized=True), then MySQL would ignore your index anyway, and do a full table scan. I'm unsure of exactly what percentage of rows before MySQL does this though..

The only correct answer is to fill your database with sample data, and analyze each of your queries in MySQL, and create appropriate composite keys that reduce cardinality appropriately for your queries.

Cheers

Tom
Reply all
Reply to author
Forward
0 new messages