Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

IN operator: maximum reasonable?

14 views
Skip to first unread message

Karsten Wutzke

unread,
Apr 5, 2010, 4:22:13 PM4/5/10
to
Hello,

I was wondering about using the IN operator with check constraints to
restrict the values of a column to a certain number of elements
(strings). Is there any reasonable maximum for the number of elements
in the set?

Given:

CREATE TABLE Languages
(
iso_code CHAR(2) NOT NULL PRIMARY KEY CHECK (iso_code IN ('en',
'de', 'fr', ...))
)

I might extend the list every now and then, e.g. when supporting a new
language. But where is the point the check constraint doesn't really
serve any more purpose?

I could remove the check and allow any combination, but I'm not sure
why I should do that.

Some people believe that using ENUMs, which are similar, only makes
sense for a rather low number of elements, but I'm not sure what
should be wrong with larger sets/lists.

Any comments on that appreciated.

Karsten

Andreas Kretschmer

unread,
Apr 7, 2010, 2:51:08 AM4/7/10
to
Karsten Wutzke <kwu...@web.de> wrote:
> Any comments on that appreciated.

Use a lookup-table instead. (FK-constraint)


Andreas
--
Andreas Kretschmer
Linux - weil ich es mir wert bin!
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net

Thomas Kellerer

unread,
Apr 7, 2010, 3:31:14 AM4/7/10
to
Andreas Kretschmer, 07.04.2010 08:51:

> Karsten Wutzke<kwu...@web.de> wrote:
>> Any comments on that appreciated.
>
> Use a lookup-table instead. (FK-constraint)
>
>
> Andreas

I was about to answer the same, until I realized his languages table *is* the lookup table and he wants to limit the values in there.

Thomas

Andreas Kretschmer

unread,
Apr 7, 2010, 12:46:22 PM4/7/10
to

That make no sense. Create a table with a CHECK-constraint and alter
this constraint every time for a new record?


Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°

Karsten Wutzke

unread,
Apr 7, 2010, 6:48:05 PM4/7/10
to
On 7 Apr., 18:46, Andreas Kretschmer <akretsch...@spamfence.net>
wrote:

> Thomas Kellerer <OTPXDAJCS...@spammotel.com> wrote:
> > Andreas Kretschmer, 07.04.2010 08:51:
> >> Karsten Wutzke<kwut...@web.de>  wrote:

> >>> Any comments on that appreciated.
>
> >> Use a lookup-table instead. (FK-constraint)
>
> >> Andreas
>
> > I was about to answer the same, until I realized his languages table *is* the lookup table and he wants to limit the values in there.
>
> That make no sense. Create a table with a CHECK-constraint and alter
> this constraint every time for a new record?
>
> Andreas

Yes. I know it sounds contradictory at first, but if I don't use a
lookup table, I have to adjust the CHECK for each new entry, too.
Lookup tables are for situations where you want/need to *reuse* a
check constraint, right? Using this kind of "double-check" clearly
flags the table as a lookup table and not a regular one (even though
the check is redundant). I mean, if it's still somewhat static, I do
not change it often - if at all - anyway. It's truly another issue I'm
very undecided about.

But the question was rather: is there any limit on the IN check
itself, I mean the list, if that makes sense?

Example: country codes.

http://www.iso.org/iso/iso-3166-1_decoding_table

While they are CHAR(2), there are 26*26 = 676 combinations, of which
not every one is valid. So, I might want to create a CHECK (cc IN
('AD', ..., 'ZW')) or so, with several hundred entries. Does it make
sense to create such a big IN check? Is there a reasonable limit to
the number of elements in the list? If so, how many? 7? 10? 14? 23?
62? 89? 200?

Karsten

Andreas Kretschmer

unread,
Apr 8, 2010, 3:45:08 AM4/8/10
to
Karsten Wutzke <kwu...@web.de> wrote:
> While they are CHAR(2), there are 26*26 = 676 combinations, of which
> not every one is valid. So, I might want to create a CHECK (cc IN
> ('AD', ..., 'ZW')) or so, with several hundred entries. Does it make
> sense to create such a big IN check? Is there a reasonable limit to
> the number of elements in the list? If so, how many? 7? 10? 14? 23?
> 62? 89? 200?

Well, haven't tried that, but i think up to several thousand rows it is
possible, so i don't see a problem in this case. Try it, and tell us
about the result ;-)

Marco Mariani

unread,
Apr 8, 2010, 4:59:12 AM4/8/10
to
On 04/08/2010 12:48 AM, Karsten Wutzke wrote:

> While they are CHAR(2), there are 26*26 = 676 combinations, of which
> not every one is valid. So, I might want to create a CHECK (cc IN
> ('AD', ..., 'ZW')) or so, with several hundred entries. Does it make
> sense to create such a big IN check? Is there a reasonable limit to
> the number of elements in the list? If so, how many? 7? 10? 14? 23?
> 62? 89? 200?

I have happily used IN() with 30k elements, but the right way to solve
your issue is to have another single-column table with the full iso-3166
combinations, and make languages.iso_code both a primary and foreign key.

Jasen Betts

unread,
Apr 8, 2010, 5:19:58 AM4/8/10
to

definately use a lookup table. having constraints that need to be
updated every time a country changes name they time release a new
version seems like a mug's game
(see all those entries marked transitional...)

> While they are CHAR(2), there are 26*26 = 676 combinations, of which
> not every one is valid. So, I might want to create a CHECK (cc IN
> ('AD', ..., 'ZW')) or so, with several hundred entries. Does it make
> sense to create such a big IN check?

not when the rules will potentially change every few years.

> Is there a reasonable limit to
> the number of elements in the list? If so, how many? 7? 10? 14? 23?
> 62? 89? 200?

somewhere Out past 10000 it could start getting more expensive than
the foreign key check. somewhere before that maintenance cost will
get excessive.


--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

Karsten Wutzke

unread,
Apr 8, 2010, 10:04:04 AM4/8/10
to
>
> > While they are CHAR(2), there are 26*26 = 676 combinations, of which
> > not every one is valid. So, I might want to create a CHECK (cc IN
> > ('AD', ..., 'ZW')) or so, with several hundred entries. Does it make
> > sense to create such a big IN check?
>
> not when the rules will potentially change every few years.
>

Years? That sounds ok for me...

> > Is there a reasonable limit to
> > the number of elements in the list? If so, how many? 7? 10? 14? 23?
> > 62? 89? 200?
>
> somewhere Out past 10000 it could start getting more expensive than
> the foreign key check.  somewhere before that maintenance cost will
> get excessive.
>

So using a small list does not or only have marginal performance
differences? Why is that so? Because the DBMS does the majority of
integrity checks for the values on ALTER TABLE rather than on row
inserts?

Karsten

Jasen Betts

unread,
Apr 9, 2010, 9:37:02 AM4/9/10
to
On 2010-04-08, Karsten Wutzke <kwu...@web.de> wrote:
>>
>> > While they are CHAR(2), there are 26*26 = 676 combinations, of which
>> > not every one is valid. So, I might want to create a CHECK (cc IN
>> > ('AD', ..., 'ZW')) or so, with several hundred entries. Does it make
>> > sense to create such a big IN check?
>>
>> not when the rules will potentially change every few years.
>
> Years? That sounds ok for me...

how soon Y2K is forgotten!

having to dig into the logic of the application to make changes that
could be anticipated seems like a bad plan.

>> > Is there a reasonable limit to
>> > the number of elements in the list? If so, how many? 7? 10? 14? 23?
>> > 62? 89? 200?
>>
>> somewhere Out past 10000 it could start getting more expensive than
>> the foreign key check.  somewhere before that maintenance cost will
>> get excessive.
>>
>
> So using a small list does not or only have marginal performance
> differences?

If it's a frequently performed test count on the relevant table or index
already being in memory....

> Why is that so? Because the DBMS does the majority of
> integrity checks for the values on ALTER TABLE rather than on row
> inserts?

if the check rule is long enough there is more cost loading the rule than
examinining the parts of the index needed for the for the FK check.

It's a database, it writes stuff to disk and reads it back. that is
always going to be slow. Sql is compiled only to bytecode (the
planner does this), that bytecode is going to be slow too.

Mladen Gogala

unread,
Apr 9, 2010, 12:23:49 PM4/9/10
to

From the programming point of view, I believe it would be better to wrap
it up in a Perl function. That way, the table definition never changes
and check constraints would look something like this
CHECK (IS_LEGIT_LANG(LANG) == 1);

The function would search through a hash defined like this:

my %LANG= ( 'en' => 1,
'fr' => 1,
'pt' => 1);

That would probably be faster than using
CHECK( LANG IN ('en','fr','pt'...));

As an added benefit you not only get a better performance but much better
manageability, too.


--
http://mgogala.byethost5.com

Jasen Betts

unread,
Apr 11, 2010, 5:17:03 AM4/11/10
to

having not examined the source I can't categorically state that
postgres doesn't itself use a hash to resolve the IN(list) operation.

> As an added benefit you not only get a better performance but much better
> manageability, too.

you could more easily(*) use an SQL function.

(* no need to install plperl)

Mladen Gogala

unread,
Apr 12, 2010, 12:36:01 PM4/12/10
to
On Sun, 11 Apr 2010 09:17:03 +0000, Jasen Betts wrote:

> you could more easily(*) use an SQL function.
>
> (* no need to install plperl)

I love Perl. It's my favorite tool. I am biased.

--
http://mgogala.byethost5.com

Karsten Wutzke

unread,
Apr 12, 2010, 1:43:51 PM4/12/10
to
On 12 Apr., 18:36, Mladen Gogala <n...@email.here.invalid> wrote:
> On Sun, 11 Apr 2010 09:17:03 +0000, Jasen Betts wrote:
> > you could more easily(*) use an SQL function.
>
> > (* no need to install plperl)
>
> I love Perl. It's my favorite tool. I am biased.
>

Remember I posted my question into a DB/Standard SQL group on purpose.
I need the constraint on the DB. I also don't want to synchronize
application code with DB constraints.

Karsten

Lennart Jonsson

unread,
Apr 12, 2010, 2:03:02 PM4/12/10
to
On 2010-04-05 22:22, Karsten Wutzke wrote:
[...]

> CREATE TABLE Languages
> (
> iso_code CHAR(2) NOT NULL PRIMARY KEY CHECK (iso_code IN ('en',
> 'de', 'fr', ...))
> )
>
> I might extend the list every now and then, e.g. when supporting a new
> language. But where is the point the check constraint doesn't really
> serve any more purpose?
>
> I could remove the check and allow any combination, but I'm not sure
> why I should do that.
>

I know you are more interested in discussing the size of the check
constraint, but if you don't mind I'd like to reflect on the purpose of
the check constraint. Couldn't you achieve what you want to do by
revoking delete/insert/update on the table from certain users?

/Lennart

Thomas Kellerer

unread,
Apr 12, 2010, 5:31:32 PM4/12/10
to
Karsten Wutzke wrote on 12.04.2010 19:43:
>
> Remember I posted my question into a DB/Standard SQL group on purpose.

No, you posted to comp.databases.postgresql which is DB-specific and not a "Standard SQL group"

Robert Klemme

unread,
Apr 14, 2010, 3:07:50 PM4/14/10
to
On 04/08/2010 12:48 AM, Karsten Wutzke wrote:
> On 7 Apr., 18:46, Andreas Kretschmer <akretsch...@spamfence.net>
> wrote:
>> Thomas Kellerer <OTPXDAJCS...@spammotel.com> wrote:
>>> Andreas Kretschmer, 07.04.2010 08:51:
>>>> Karsten Wutzke<kwut...@web.de> wrote:
>>>>> Any comments on that appreciated.
>>>> Use a lookup-table instead. (FK-constraint)
>>>> Andreas
>>> I was about to answer the same, until I realized his languages table *is* the lookup table and he wants to limit the values in there.
>> That make no sense. Create a table with a CHECK-constraint and alter
>> this constraint every time for a new record?
>>
>> Andreas
>
> Yes. I know it sounds contradictory at first, but if I don't use a
> lookup table, I have to adjust the CHECK for each new entry, too.

Yeah, but now you have to update the constraint AND insert a record for
each new entry. IMHO the check constraint is bad, you only want the
data in the table and a FK in other tables that need such an ISO code.

> Lookup tables are for situations where you want/need to *reuse* a
> check constraint, right?

Not sure I could agree here. Basically you create such a table in order
to be able to efficiently check values against it. I believe a CHECK
constraint on the other tables has these drawbacks:

- Might be slower from a certain number of entries on.
- Redundant: *all* tables that need to check this code need the same
check constraint that you need to update whenever you add a new legal code.

> Using this kind of "double-check" clearly
> flags the table as a lookup table and not a regular one (even though
> the check is redundant). I mean, if it's still somewhat static, I do
> not change it often - if at all - anyway. It's truly another issue I'm
> very undecided about.

I am not sure I follow you here with the "flagging through double
checking". In any case, if you want to limit the number of persons that
can alter the constraint then you could handle that with proper
permissions on the table.

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

0 new messages