Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Re: [PERFORM] query using incorrect index

1 view
Skip to first unread message

Kevin Grittner

unread,
Aug 2, 2012, 4:13:01 PM8/2/12
to
Russell Keane <Russel...@inps.co.uk> wrote:

> Clearly the statistics are off somehow but I really don't know
> where to start.
>
> Any help you can give me would be very much appreciated.

It would help to know your more about your hardware and PostgreSQL
configuration. The latter can probably best be communicated by
copy/paste of the results of the query on this page:

http://wiki.postgresql.org/wiki/Server_Configuration

Can you also post the EXPLAIN ANALYZE output for the slow query with
both indexes present but without the LIMIT clause?

-Kevin

--
Sent via pgsql-performance mailing list (pgsql-pe...@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Robert Klemme

unread,
Aug 3, 2012, 5:18:20 AM8/3/12
to


On Thu, Aug 2, 2012 at 4:54 PM, Russell Keane <Russel...@inps.co.uk> wrote:

 

Using PG 9.0 and given 2 queries (messageq_current is a view on the messageq_table):

 

select entity_id from messageq_current

where entity_id = 123456;

 

select entity_id from messageq_current

where incoming = true

and inactive = false

and staff_ty = 2

and staff_id = 2

order by entity_id desc

limit 1;

 

and 2 indexes (there are 15 indexes in total but they are left out here for brevity):

 

messageq1:

CREATE INDEX messageq1

  ON messageq_table

  USING btree

  (entity_id);

 

And messageq4:

 

CREATE INDEX messageq4

  ON messageq_table

  USING btree

  (inactive, staff_ty, staff_id, incoming, tran_dt);


Of course a lot of detail is missing (full schema of table, all the other indexes) but with "inactive" a boolean column I suspect selectivity might not be too good here and so having it as a first column in a covering index is at least questionable.  If query 2 is frequent you might also want to consider creating a partial index only on (staff_ty, staff_id) with filtering criteria on incoming and active as present in query 2.

Btw, why don't you formulate query 2 as max query?

select max(entity_id) as entity_id

from messageq_current

where incoming = true

and inactive = false

and staff_ty = 2

and staff_id = 2;

 

 

With the messageq1 index present, query 1 is very quick (0.094ms) and query 2 is very slow (241.515ms).

If I remove messageq1 then query 2 uses messageq4 and is very quick (0.098ms) but then query 1 must use a different index and is therefore slower (> 5ms).

 

So, to the Query plans:


Of which query?  Shouldn't there be four plans in total?  I'd post plans here:
http://explain.depesz.com/
 

With messageq1:

"Limit  (cost=0.00..2670.50 rows=1 width=4) (actual time=241.481..241.481 rows=0 loops=1)"

"  Output: messageq_table.entity_id"

"  Buffers: shared hit=32 read=18870 written=12"

"  ->  Index Scan Backward using messageq1 on prac_live_10112.messageq_table  (cost=0.00..66762.53 rows=25 width=4) (actual time=241.479..241.479 rows=0 loops=1)"

"        Output: messageq_table.entity_id"

"        Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (aud_status_to_flag(messageq_table.aud_status) = 1))"

"        Buffers: shared hit=32 read=18870 written=12"

"Total runtime: 241.515 ms"

 

Without messageq1:

"Limit  (cost=12534.45..12534.45 rows=1 width=4) (actual time=0.055..0.055 rows=0 loops=1)"

"  Output: messageq_table.entity_id"

"  Buffers: shared read=3"

"  ->  Sort  (cost=12534.45..12534.51 rows=25 width=4) (actual time=0.054..0.054 rows=0 loops=1)"

"        Output: messageq_table.entity_id"

"        Sort Key: messageq_table.entity_id"

"        Sort Method:  quicksort  Memory: 17kB"

"        ->  Bitmap Heap Scan on prac_live_10112.messageq_table  (cost=174.09..12534.32 rows=25 width=4) (actual time=0.043..0.043 rows=0 loops=1)"

"              Output: messageq_table.entity_id"

"              Recheck Cond: ((messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2))"

"              Filter: (messageq_table.incoming AND (NOT messageq_table.inactive) AND (aud_status_to_flag(messageq_table.aud_status) = 1))"

"              Buffers: shared read=3"

"              ->  Bitmap Index Scan on messageq4  (cost=0.00..174.08 rows=4920 width=0) (actual time=0.040..0.040 rows=0 loops=1)"

"                    Index Cond: ((messageq_table.inactive = false) AND (messageq_table.staff_ty = 2) AND (messageq_table.staff_id = 2) AND (messageq_table.incoming = true))"

"                    Buffers: shared read=3"

"Total runtime: 0.098 ms"

 

Clearly the statistics are off somehow but I really don’t know where to start.

 

Any help you can give me would be very much appreciated.


Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Kevin Grittner

unread,
Aug 3, 2012, 10:33:31 AM8/3/12
to
Russell Keane <Russel...@inps.co.uk> wrote:

> "log_min_duration_statement";"1ms"

> "shared_buffers";"32MB"
> "work_mem";"1MB"

Those are pretty low values even for a 4GB machine. I suggest the
following changes and additions, based on the fact that you seem to
have the active portion of the database fully cached.

shared_buffers = '160MB'
work_mem = '8MB'
seq_page_cost = 0.1
random_page_cost = 0.1
cpu_tuple_cost = 0.03
effective_cache_size = '2GB'

> Explain analyse with both indexes present but without the limit
> (uses the correct index):

> "Total runtime: 0.092 ms"

Part of problem is that it thinks it will find a matching row fairly
quickly, and having done so using the index it chose will mean it is
the *right* row. The problem is that there are no matching rows, so
it has to scan the entire index. More fine-grained statistics
*might* help. If other techniques don't help, you can rewrite the
query slightly to create an optimization fence, but that should be a
last resort. I agree with Robert that if you have a lot of queries
that select on "incoming" and/or "inactive", a conditional index
(with a WHERE clause in its definition) is likely to be very
helpful.
0 new messages