Re: best practices with index on varchar column

Tom Lane

unread,

Mar 22, 2005, 1:08:14 PM3/22/05

to

PFC <li...@boutiquenumerique.com> writes:
>> Can I use an index on a varchar column to optimize the SELECT queries
>> that use " column LIKE 'header%' "?

> Yes

> Note that if you want case insensitive matching you need to make an index
> on lower(column) and SELECT WHERE lower(column) LIKE 'header%'

> Locales may bite you.

Yes. If your database locale is not "C" then the default btree index
behavior does not match up with what LIKE needs. In that case you need
a special index using the appropriate "pattern_ops" opclass, eg

CREATE INDEX test_index ON test_table (col varchar_pattern_ops);

or if you want case insensitive matching

CREATE INDEX test_index ON test_table (lower(col) varchar_pattern_ops);

and then write the queries with lower() as PFC illustrates. *Don't* use
ILIKE --- it basically can't use indexes at all.

For more info see
http://www.postgresql.org/docs/8.0/static/indexes-opclass.html

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Dawid Kuroczko

unread,

Mar 22, 2005, 4:49:25 AM3/22/05

to

On Tue, 22 Mar 2005 18:22:24 +0900, Layet Benjamin
<benj...@moonfactory.co.jp> wrote:
> Can I use an index on a varchar column to optimize the SELECT queries that
> use " column LIKE 'header%' "?

> If yes what is the best tree algotithm to use ?

Yes, that is the correct place. The best tree algorithm is B-Tree,
which is the default. So no need for giving 'USING ...' to CREATE INDEX.

The other types of indexes are either not trees (HASH), different
and more complex (GiST, RTREE) kinds of trees which are there
for different kinds of data (spatial, full text, etc).

Remember to VACUUM ANALYZE this table from time to time,
so the planner can judge efficiently whether to use this new
index or not.

Use EXPLAIN ANALYZE SELECT .... to see whether the index
is really used.

> I don't care about optimising INSERT, DELETE and UPDATE queries, as they are
> only done at night when the load is very low.
> Thank you very much for any help,

Oh, they can benefit from the index anyhow. :)

Regards,
Dawid

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to majo...@postgresql.org)

PFC

unread,

Mar 22, 2005, 5:49:36 AM3/22/05

to

> Can I use an index on a varchar column to optimize the SELECT queries
> that
> use " column LIKE 'header%' "?

Yes

> If yes what is the best tree algotithm to use ?

Btree

Note that if you want case insensitive matching you need to make an index
on lower(column) and SELECT WHERE lower(column) LIKE 'header%'

Locales may bite you.

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majo...@postgresql.org

Layet Benjamin

unread,

Mar 22, 2005, 4:22:24 AM3/22/05

to

Hi everyone,

I hope it is the correct newsletter for this question.

Can I use an index on a varchar column to optimize the SELECT queries that use " column LIKE 'header%' "?

If yes what is the best tree algotithm to use ?

I don't care about optimising INSERT, DELETE and UPDATE queries, as they are only done at night when the load is very low.

Thank you very much for any help,

Benjamin Layet

Michael Ryan S. Puncia

unread,

Mar 22, 2005, 11:11:56 PM3/22/05

to

I have an experience using LIKE in a VARCHAR column and select statement
suffers a lot so I decided to go back in CHAR

Note: my database has about 50 millions records a b tree index

Yes

Btree

Locales may bite you.

__________ NOD32 1.1023 (20050310) Information __________

This message was checked by NOD32 Antivirus System.
http://www.nod32.com

Dawid Kuroczko

unread,

Mar 23, 2005, 4:35:48 AM3/23/05

to

On Wed, 23 Mar 2005 12:11:56 +0800, Michael Ryan S. Puncia
<mpu...@census.gov.ph> wrote:
>
> I have an experience using LIKE in a VARCHAR column and select statement
> suffers a lot so I decided to go back in CHAR
>
> Note: my database has about 50 millions records a b tree index

Strange...

According to the PostgreSQL's documentation:

Tip: There are no performance differences between these three types,
apart from the increased storage size when using the blank-padded type.
While character(n) has performance advantages in some other database
systems, it has no such advantages in PostgreSQL. In most situations text
or character varying should be used instead.

To my best knowledge char and varchar are stored in a same way
(4-byte length plus textual value), so using char should make tables
bigger in your case. Then again, having each row exactly the same
size makes it easier to delete and then later insert a new row in
a same spot. Am I thinking correct? Is it a case where using char(n)
makes that table avoid hmm fragmentation of some sort?

Regards,
Dawid

Richard Huxton

unread,

Mar 23, 2005, 6:31:28 AM3/23/05

to

Dawid Kuroczko wrote:
> On Wed, 23 Mar 2005 12:11:56 +0800, Michael Ryan S. Puncia
> <mpu...@census.gov.ph> wrote:
>
>>I have an experience using LIKE in a VARCHAR column and select statement
>>suffers a lot so I decided to go back in CHAR

> According to the PostgreSQL's documentation:

>
> Tip: There are no performance differences between these three types,
> apart from the increased storage size when using the blank-padded type.
> While character(n) has performance advantages in some other database
> systems, it has no such advantages in PostgreSQL. In most situations text
> or character varying should be used instead.
>
>
> To my best knowledge char and varchar are stored in a same way
> (4-byte length plus textual value), so using char should make tables
> bigger in your case. Then again, having each row exactly the same
> size makes it easier to delete and then later insert a new row in
> a same spot. Am I thinking correct? Is it a case where using char(n)
> makes that table avoid hmm fragmentation of some sort?

There aren't any noticeable differences between char and varchar. MVCC
doesn't overwrite rows anyway, so static size is irrelevant. In any
case, PG's toast setup splits out large text fields and compresses them
- so it's not that simple.

--
Richard Huxton
Archonet Ltd

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings