al...@hill.net.au
unread,Jun 4, 2013, 1:21:57 AM6/4/13You do not have permission to delete messages in this group
Either email addresses are anonymous for this group or you need the view member email addresses permission to view the original message
to
Hi,
I have a couple of FTS queries which I *think* are equivalent, but one is about 20 times slower than the other. The slower one has the to_tsquery call embedded in the WHERE clause, the quicker one has it in the FROM clause with an alias, and the WHERE clause rewritten to refer to the alias.
With the to_tsquery in the FROM, the query is about 20 times faster. It seems to be because of the seq scan on books_subjects. However I rephrase this query, using combinations of JOINs or WHERE INs, having the to_tsquery in the WHERE clause seems to yield this seq scan and slow it down dramatically.
Unfortunately, it's very hard to make Django's ORM generate the fast query...
Queries and EXECUTE ANALYZE output follow.
Thanks for your time.
Alex Hill
The first (fast) query:
EXPLAIN ANALYZE
SELECT "books".*
FROM to_tsquery('english', 'GFC') "query",
"books" INNER JOIN "books_subjects" ON "books"."APN" = "books_subjects"."book"
WHERE "books_subjects"."subject" IN ('BUSECOECO', 'BUSMANMAN')
AND "books"."text_vector" @@ "query"
Nested Loop (cost=17.54..845.40 rows=3 width=1340) (actual time=0.201..0.590 rows=3 loops=1)
-> Nested Loop (cost=17.54..711.41 rows=199 width=1340) (actual time=0.086..0.141 rows=10 loops=1)
-> Function Scan on query (cost=0.00..0.01 rows=1 width=32) (actual time=0.019..0.020 rows=1 loops=1)
-> Bitmap Heap Scan on books (cost=17.54..709.41 rows=199 width=1340) (actual time=0.055..0.083 rows=10 loops=1)
Recheck Cond: (text_vector @@ query.query)
-> Bitmap Index Scan on text_search_index (cost=0.00..17.49 rows=199 width=0) (actual time=0.043..0.043 rows=10 loops=1)
Index Cond: (text_vector @@ query.query)
-> Index Only Scan using books_subjects_pkey on books_subjects (cost=0.00..0.66 rows=1 width=13) (actual time=0.029..0.034 rows=0 loops=10)
Index Cond: ((book = books."APN") AND (subject = ANY ('{BUSECOECO,BUSMANMAN}'::text[])))
Heap Fetches: 3
Total runtime: 0.771 ms
The second, slower query:
EXPLAIN ANALYZE
SELECT "books".*
FROM "books"
INNER JOIN "books_subjects" ON "books"."APN" = "books_subjects"."book"
WHERE "books_subjects"."subject" IN ('BUSECOECO', 'BUSMANMAN')
AND "books"."text_vector" @@ to_tsquery('english', 'GFC')
yields
Hash Join (cost=711.90..1512.12 rows=3 width=1340) (actual time=6.847..14.639 rows=3 loops=1)
Hash Cond: (books_subjects.book = books."APN")
-> Seq Scan on books_subjects (cost=0.00..798.04 rows=573 width=13) (actual time=0.029..14.172 rows=564 loops=1)
Filter: (subject = ANY ('{BUSECOECO,BUSMANMAN}'::text[]))
Rows Removed by Filter: 39599
-> Hash (cost=709.41..709.41 rows=199 width=1340) (actual time=0.126..0.126 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 15kB
-> Bitmap Heap Scan on books (cost=17.54..709.41 rows=199 width=1340) (actual time=0.050..0.076 rows=10 loops=1)
Recheck Cond: (text_vector @@ '''gfc'''::tsquery)
-> Bitmap Index Scan on text_search_index (cost=0.00..17.49 rows=199 width=0) (actual time=0.041..0.041 rows=10 loops=1)
Index Cond: (text_vector @@ '''gfc'''::tsquery)
Total runtime: 14.833 ms