Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

New approach to ye olde cross-datatype indexing problem

5 views

Skip to first unread message

Tom Lane

unread,

Nov 8, 2003, 10:52:16 AM11/8/03

Once again it's time to consider that old problem that indexes can't
handle cross-datatype comparisons, for example
SELECT ... WHERE int8col = 42
won't use an index on int8col because 42 is int4.

We have spent a huge amount of time trying to find ways to get the system
to assign the same type to the comparison value as the indexed column has.
So far every such proposal has failed, because of unwanted side-effects on
the semantics of operations unrelated to indexes. I've become more and
more convinced that that approach is a dead end.

What if we attack the problem head on --- make indexes able to handle
cross-datatype comparison operators directly? That would solve the
problem without *any* semantic side-effects. AFAIR we've never seriously
considered doing that; I guess anyone who thought of it dismissed it as
too hard. But I've been studying the idea for the last day or so and I'm
starting to think that it's eminently practical.

Here's what I'm thinking of: specify that the "input" datatype of an
operator class (pg_opclass.opcintype) is actually just the type of the
indexed column. Operators that are members of the opclass must take this
type as their left-hand input, but the right-hand input can be some other
type. pg_amop gets an additional column that is the right-hand data type
of the operator, and its primary key becomes (opclass, righthandtype,
strategy) rather than just (opclass, strategy). For example, the btree
opclass for int8 will still use strategy number 1 for 'less than', but
it would now contain entries for int84lt, int82lt, etc in addition to
int8lt. We can still maintain the secondary unique index on (opclass,
operator) --- that is, any particular operator still stands in one unique
relationship to an opclass. The support procedure catalog pg_amproc
likewise needs to get an additional column that is the "right hand side"
datatype. For example, the btree opclass for int8 formerly had one
support procedure btint8cmp(int8, int8) but it will now need additional
entries for btint84cmp(int8, int4), btint82cmp(int8, int2) and so on.

The cross-datatype operators we need for this largely already exist ---
in fact it's their presence in the system catalogs that creates the
problem in the first place. We'll need to write new functions for the
additional support procedures, but there aren't all that many of them.

The existing CREATE OPERATOR CLASS syntax already works for this approach,
since it's possible to specify the input datatypes of an operator. We
just need to modify the code to allow multiple instances of the same
operator or function number as long as the input datatypes are different.

Inside the system, the changes needed are really rather minimal. We will
need to add fields to the ScanKey data structure that specify the strategy
number of the operator being invoked and the righthand datatype (ie, the
actual datatype of the sk_argument datum). Given that information and
knowledge of the index column's opclass, the actual operator can be looked
up in the pg_amop syscache. The reason I want to add the strategy number
to ScanKey is that this is something the planner knows already (it finds
it out when it discovers that an operator can be used with the index in
the first place) and it seems silly to require the index access method to
re-discover the strategy number from the operator procedure OID. Passing
the strategy number directly will save cycles. Furthermore, it
essentially eliminates the need for the "StrategyMap" mechanism (istrat.c
and friends), which is a big chunk of vestigial code that I've been
wanting to get rid of for awhile. The only thing the StrategyMap routines
are really doing for us is providing a procedure-OID-to-strategy-number
reverse lookup mechanism, and we won't need that anymore. This is a good
thing since the existing StrategyMap data structure falls apart in the
presence of multiple operators with the same strategy number.

Relation cache entries for indexes will still include rd_operator and
rd_support fields indexed by strategy and procedure number. We will
decree that the cached entries are only for the operators and procedures
whose righthand datatype is the same as the indexed datatype. This is
reasonable since these will still be the most-used items (for example,
the index access method will need to use them to do internal comparisons
between index items).

The index access methods will need to be careful about whether they are
comparing two index entries (necessarily of the same datatype) or an index
entry against an externally-supplied scankey datum (possibly a different
datatype requiring a different operator). In the btree code the changes
needed are pretty minimal. I haven't looked at the other index types yet,
but actually I don't care much whether they can support cross-datatype
operators or not. If it turns out to be at all hard for them to do it,
we can just say that they don't support it. Fixing btree will solve 99.9%
of the problem anyway.

One way in which we will lose some flexibility is that this design nails
down forevermore the assumption that the indexed column is on the lefthand
side of any indexable clause. This has been a de facto restriction for a
long time, but it will become a permanent restriction embedded in the
system catalog design. I don't see that this is a problem though --- the
existing solution of letting the planner try to commute the clause to put
the indexkey on the left seems to work fine.

I'm planning to push forward with trying to implement this. Any comments,
ideas, objections?

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Peter Eisentraut

unread,

Nov 8, 2003, 11:38:40 AM11/8/03

Tom Lane writes:

> Here's what I'm thinking of: specify that the "input" datatype of an
> operator class (pg_opclass.opcintype) is actually just the type of the
> indexed column. Operators that are members of the opclass must take this
> type as their left-hand input, but the right-hand input can be some other
> type. pg_amop gets an additional column that is the right-hand data type
> of the operator, and its primary key becomes (opclass, righthandtype,
> strategy) rather than just (opclass, strategy).

Yes, that looks to be the right way.

> One way in which we will lose some flexibility is that this design nails
> down forevermore the assumption that the indexed column is on the lefthand
> side of any indexable clause.

I don't see this as a problem, but if it becomes one we can relabel "left
operand" as "indexed operand" and "right operand" as "variable operand",
and add a boolean flag telling which is right and left.

--
Peter Eisentraut pet...@gmx.net

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Tom Lane

unread,

Nov 8, 2003, 12:13:52 PM11/8/03

Peter Eisentraut <pet...@gmx.net> writes:

> Tom Lane writes:
>> One way in which we will lose some flexibility is that this design nails
>> down forevermore the assumption that the indexed column is on the lefthand
>> side of any indexable clause.

> I don't see this as a problem, but if it becomes one we can relabel "left
> operand" as "indexed operand" and "right operand" as "variable operand",
> and add a boolean flag telling which is right and left.

Okay. I won't put in the bool flag for the moment, but I'll use column
names along the lines of "indtype" and "othertype" so that we won't need
to rename the columns if we want to do that in the future.

BTW, I am not planning to remove the possibility of having unary
operators in an opclass. (We have talked about supporting indexing
of IS NULL/IS NOT NULL queries by treating IS(NOT)NULL as a unary
operator --- while I'm not planning to tackle that now, I don't want
to foreclose the possibility.) I think I will specify that "othertype"
for a unary operator should be set equal to the opclass's "indtype",
rather than the more-obvious zero. This will allow the unary operator
to be part of the cached set of operators (which are only going to be
the ones with othertype=indtype).

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majo...@postgresql.org

Tom Lane

unread,

Nov 12, 2003, 7:10:53 PM11/12/03

I wrote:
> What if we attack the problem head on --- make indexes able to handle
> cross-datatype comparison operators directly? That would solve the
> problem without *any* semantic side-effects. AFAIR we've never seriously
> considered doing that; I guess anyone who thought of it dismissed it as
> too hard. But I've been studying the idea for the last day or so and I'm
> starting to think that it's eminently practical.

In fact, it works --- as of CVS tip, you can do:

regression=# create table foo (f1 bigint primary key);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "foo_pkey" for table "foo"
CREATE TABLE

regression=# explain select * from foo where f1 = 44;
QUERY PLAN
--------------------------------------------------------------------
Index Scan using foo_pkey on foo (cost=0.00..4.82 rows=1 width=8)
Index Cond: (f1 = 44)
(2 rows)

or even

regression=# explain select * from foo where f1 = 44::int2;
QUERY PLAN
--------------------------------------------------------------------
Index Scan using foo_pkey on foo (cost=0.00..4.82 rows=1 width=8)
Index Cond: (f1 = 44::smallint)
(2 rows)

As seems to happen every time I touch the btree code, I learned a few
formerly-undocumented details about how it works :-(. It turns out that
allowing an operator class to include operators between the indexed
datatype and other datatypes isn't really sufficient for everything the
backend wants to do. In particular, there are places where the system
wants to determine which of two inequalities is a tighter constraint.
For example, consider a query like

SELECT ... WHERE x > 4 AND x >= 5;

The x > 4 condition is redundant and can be thrown away. The btree
subsystem actually has code to do this, and in fact was *depending on*
the assumption that it could do so, because there were bits of logic
that assumed they'd not have to deal with more than one inequality
constraint at a time. Now in order to do this, the code has to compare
the constants (4 and 5). This is no problem as long as an operator
class deals with just a single datatype --- you can pick the needed
comparison operator out of the opclass. However, in a cross-type
situation you could have something like

SELECT ... WHERE int8var > 4::int4 AND int8var >= 5::int2;

where the operators appearing in the query are int84lt and int82le.
To determine which condition is a tighter constraint, you'd need to
apply an int4-vs-int2 comparison --- and there are no such operators
available in the int8 opclass.

Since the whole point of an opclass is to encapsulate a particular
comparison semantics, it's not reasonable for the system to guess at
which comparison operator on the other type(s) is compatible with the
opclass it's working with. And coercing the comparison values to
the opclass' own datatype is no good (that opens up all the problems
we wanted to avoid, such as possible overflow).

For the moment, I've just punted this problem; the btree code will not
optimize away redundant cross-type comparisons. (I went and fixed the
places that failed to cope with extra conditions.) It would be nice
to do better, but it's surely a much lower priority than getting
cross-type comparisons to be indexed at all.

Looking to the long term of how we might deal with this if we wanted
to invest more effort, I can think of a couple of ways to proceed.

One is to allow an opclass to contain operators that aren't directly
usable with the index at all, but do apply to datatypes that are used
within the opclass --- for instance, we'd have to add all the
int4-vs-int4 operators, *and* the int4-vs-int2 and int2-vs-int2
operators, into the int8 opclass; likewise for the int2 and int4
opclasses. This seems like a recipe for bloating pg_amop quite a lot;
there'd be a lot of redundancy among these opclasses.

The other idea I had was to somehow create associations of opclasses,
so that you could say that "int2_ops, int4_ops, and int8_ops all sort
values compatibly, so it's okay to go find the operator you need in
int4_ops or int2_ops". I don't have any concrete idea about how to
represent that though. Any thoughts about that, or other ideas
altogether?

For the moment I'm satisfied with what we have; this is already a big
step forward on a problem that's been there forever. But there's always
something else that could be done ...

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Greg Stark

unread,

Nov 13, 2003, 10:36:50 AM11/13/03

Tom Lane <t...@sss.pgh.pa.us> writes:

> The other idea I had was to somehow create associations of opclasses,
> so that you could say that "int2_ops, int4_ops, and int8_ops all sort
> values compatibly, so it's okay to go find the operator you need in
> int4_ops or int2_ops". I don't have any concrete idea about how to
> represent that though. Any thoughts about that, or other ideas
> altogether?

At the risk of saying something obvious and not understanding the scope of the
problem, you add a column pg_amop.superclass that is a unique integer except
for groups of operators like this where they can all share the same value.

That's assuming you only need a transitive relationship, which seems like what
you need.

--
greg

0 new messages