Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Query slower with to_tsquery in WHERE clause vs. in FROM clause

52 views
Skip to first unread message

al...@hill.net.au

unread,
Jun 4, 2013, 1:21:57 AM6/4/13
to
Hi,

I have a couple of FTS queries which I *think* are equivalent, but one is about 20 times slower than the other. The slower one has the to_tsquery call embedded in the WHERE clause, the quicker one has it in the FROM clause with an alias, and the WHERE clause rewritten to refer to the alias.

With the to_tsquery in the FROM, the query is about 20 times faster. It seems to be because of the seq scan on books_subjects. However I rephrase this query, using combinations of JOINs or WHERE INs, having the to_tsquery in the WHERE clause seems to yield this seq scan and slow it down dramatically.

Unfortunately, it's very hard to make Django's ORM generate the fast query...

Queries and EXECUTE ANALYZE output follow.

Thanks for your time.

Alex Hill



The first (fast) query:

EXPLAIN ANALYZE
SELECT "books".*
FROM to_tsquery('english', 'GFC') "query",
"books" INNER JOIN "books_subjects" ON "books"."APN" = "books_subjects"."book"
WHERE "books_subjects"."subject" IN ('BUSECOECO', 'BUSMANMAN')
AND "books"."text_vector" @@ "query"

Nested Loop (cost=17.54..845.40 rows=3 width=1340) (actual time=0.201..0.590 rows=3 loops=1)
-> Nested Loop (cost=17.54..711.41 rows=199 width=1340) (actual time=0.086..0.141 rows=10 loops=1)
-> Function Scan on query (cost=0.00..0.01 rows=1 width=32) (actual time=0.019..0.020 rows=1 loops=1)
-> Bitmap Heap Scan on books (cost=17.54..709.41 rows=199 width=1340) (actual time=0.055..0.083 rows=10 loops=1)
Recheck Cond: (text_vector @@ query.query)
-> Bitmap Index Scan on text_search_index (cost=0.00..17.49 rows=199 width=0) (actual time=0.043..0.043 rows=10 loops=1)
Index Cond: (text_vector @@ query.query)
-> Index Only Scan using books_subjects_pkey on books_subjects (cost=0.00..0.66 rows=1 width=13) (actual time=0.029..0.034 rows=0 loops=10)
Index Cond: ((book = books."APN") AND (subject = ANY ('{BUSECOECO,BUSMANMAN}'::text[])))
Heap Fetches: 3
Total runtime: 0.771 ms



The second, slower query:

EXPLAIN ANALYZE
SELECT "books".*
FROM "books"
INNER JOIN "books_subjects" ON "books"."APN" = "books_subjects"."book"
WHERE "books_subjects"."subject" IN ('BUSECOECO', 'BUSMANMAN')
AND "books"."text_vector" @@ to_tsquery('english', 'GFC')

yields

Hash Join (cost=711.90..1512.12 rows=3 width=1340) (actual time=6.847..14.639 rows=3 loops=1)
Hash Cond: (books_subjects.book = books."APN")
-> Seq Scan on books_subjects (cost=0.00..798.04 rows=573 width=13) (actual time=0.029..14.172 rows=564 loops=1)
Filter: (subject = ANY ('{BUSECOECO,BUSMANMAN}'::text[]))
Rows Removed by Filter: 39599
-> Hash (cost=709.41..709.41 rows=199 width=1340) (actual time=0.126..0.126 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 15kB
-> Bitmap Heap Scan on books (cost=17.54..709.41 rows=199 width=1340) (actual time=0.050..0.076 rows=10 loops=1)
Recheck Cond: (text_vector @@ '''gfc'''::tsquery)
-> Bitmap Index Scan on text_search_index (cost=0.00..17.49 rows=199 width=0) (actual time=0.041..0.041 rows=10 loops=1)
Index Cond: (text_vector @@ '''gfc'''::tsquery)
Total runtime: 14.833 ms
0 new messages