I was having some performance issues and started looking into it more
closely and it seemed to be exponential time instead of something
relatively linear.
I made the query execute on each loop by printing the results and used
django-debug-toolbar to time each query.
Here is the important loop:
{{{#!python
text = 'Choose one, if there are ten or more fallen'
words = text.split(' ')
for word in words:
word_query = Q()
for search_field in search_fields:
word_query |= Q(**{search_field + '__icontains': word})
if search_field == 'name':
# Also check the alternative name
word_query |= Q(**{'name_without_punctuation__icontains':
word})
if exactness_option == CONS.TEXT_CONTAINS_ALL:
cards = cards.filter(word_query)
print(word_query)
print(cards)
}}}
The word_query object scome out as expected, essentially checking if each
word exists in any of the relationships, one word at a time
{{{
(OR: ('name__icontains', 'Choose'),
('name_without_punctuation__icontains', 'Choose'),
('name_without_punctuation__icontains', 'Choose'),
('ability_texts__text__icontains', 'Choose'), ('races__name__icontains',
'Choose'))
(OR: ('name__icontains', 'one,'), ('name_without_punctuation__icontains',
'one,'), ('name_without_punctuation__icontains', 'one,'),
('ability_texts__text__icontains', 'one,'), ('races__name__icontains',
'one,'))
(OR: ('name__icontains', 'if'), ('name_without_punctuation__icontains',
'if'), ('name_without_punctuation__icontains', 'if'),
('ability_texts__text__icontains', 'if'), ('races__name__icontains',
'if'))
(OR: ('name__icontains', 'there'), ('name_without_punctuation__icontains',
'there'), ('name_without_punctuation__icontains', 'there'),
('ability_texts__text__icontains', 'there'), ('races__name__icontains',
'there'))
(OR: ('name__icontains', 'are'), ('name_without_punctuation__icontains',
'are'), ('name_without_punctuation__icontains', 'are'),
('ability_texts__text__icontains', 'are'), ('races__name__icontains',
'are'))
(OR: ('name__icontains', 'ten'), ('name_without_punctuation__icontains',
'ten'), ('name_without_punctuation__icontains', 'ten'),
('ability_texts__text__icontains', 'ten'), ('races__name__icontains',
'ten'))
(OR: ('name__icontains', 'or'), ('name_without_punctuation__icontains',
'or'), ('name_without_punctuation__icontains', 'or'),
('ability_texts__text__icontains', 'or'), ('races__name__icontains',
'or'))
(OR: ('name__icontains', 'more'), ('name_without_punctuation__icontains',
'more'), ('name_without_punctuation__icontains', 'more'),
('ability_texts__text__icontains', 'more'), ('races__name__icontains',
'more'))
(OR: ('name__icontains', 'fallen'),
('name_without_punctuation__icontains', 'fallen'),
('name_without_punctuation__icontains', 'fallen'),
('ability_texts__text__icontains', 'fallen'), ('races__name__icontains',
'fallen'))
}}}
Using django-debug-toolbar to check the times for these queries however go
as follows:
{{{
0.47s
0.83s
1.04s
2.43s
4.74s
26.27s
60.18s
124.46s
175.75s
}}}
As far as I can see these values should be somewhat similar since each
query is checking the same 4 columns for a string of comparable length.
Instead, they are going somewhat exponential.
This is using PostgreSQL 13.4 with the engine
django.db.backends.postgresql, Django v3.2.6
Let me know if there's any other details I can provide to help with this.
--
Ticket URL: <https://code.djangoproject.com/ticket/33749>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
* cc: Enhaloed (added)
--
Ticket URL: <https://code.djangoproject.com/ticket/33749#comment:1>
* type: Uncategorized => Bug
--
Ticket URL: <https://code.djangoproject.com/ticket/33749#comment:2>
* status: new => closed
* resolution: => invalid
Comment:
Hi. There's not enough to demonstrate an issue in Django here.
TBH it looks more like a usage issue. (If `cards` is a QuerySet that never
gets reset you're asking it do ever more work in the loop. Either way, you
need to be using explain to see the work your DB is actually performing.)
Please see TicketClosingReasons/UseSupportChannels for appropriate places
to get help.
If you can identify a concrete issue with a reproduce in Django, then here
is the appropriate place for that.
Good luck.
--
Ticket URL: <https://code.djangoproject.com/ticket/33749#comment:3>