Translating constraints to RM Terms

Kenneth Downs

unread,

Jun 5, 2005, 12:51:55 PM6/5/05

to

I'm wondering if some RM theorist might express something for me.

Consider table X with columns A and B. There is a constraint that A must be
less than B. While we discuss such constraints every day, I realize I do
not know how to express such constraints in relational terms.

For instance, A is actually a domain, as is B, but the constraint A < B must
in fact be part of the very definition of the domain A, and here we are
defining one domain in terms of another. I realized I have not seen this
disccussed in the year or so I have been a regular here. Is defining one
domain in terms of another allowed and considered trivial, no big deal? Is
it actually not allowed and this is some SQL alteration of true RM?

--
Kenneth Downs
Secure Data Software, Inc.
(Ken)nneth@(Sec)ure(Dat)a(.com)

lauri.pi...@atbusiness.com

unread,

Jun 5, 2005, 3:16:14 PM6/5/05

to

I don't think it is neither necessary nor allowed. The column (or
attribute) "A" is a different entity from the domain "A". The column
"A" draws it's values from the domain "A" (ditto for B). The
contstraint could be classified as a table (or relational) constraint
and it is easy to express in SQL.

Best Regards
Lauri Pietarinen

Jonathan Leffler

unread,

Jun 6, 2005, 12:36:52 AM6/6/05

to

Kenneth Downs wrote:
> I'm wondering if some RM theorist might express something for me.
>
> Consider table X with columns A and B. There is a constraint that A must be
> less than B. While we discuss such constraints every day, I realize I do
> not know how to express such constraints in relational terms.
>
> For instance, A is actually a domain, as is B, but the constraint A in fact be part of the very definition of the domain A, and here we are
> defining one domain in terms of another.

A is a 'column' or attribute; it is defined on a domain, but (except in
Codd's 1970 paper), columns are different from domains. A domain is the
data type.

If A and B are attributes that must be compared (constrained) as you
suggest, they would normally be the same type (defined on the same domain).

Don't forget that a domain defines the complete set of values that can
be stored in an attribute defined on that domain. If you have a domain
of negative integers - call it A - and a domain of positive integers -
call it B - then you can infer that all values in domain A are smaller
than all values in domain B.

So, I reject the contention that 'A < B' must be part of the definition
of domain A, and I'm far from convinced that we are defining one domain
in terms of another.

> I realized I have not seen this
> disccussed in the year or so I have been a regular here. Is defining one
> domain in terms of another allowed and considered trivial, no big deal? Is
> it actually not allowed and this is some SQL alteration of true RM?

It depends in part to whom you speak. To compare two values, there must
be a suitable comparator. The simplest cases expect to compare two
values of the same type. In practice, SQL DBMS usually provide numerous
implicit conversions such that you can 'compare' two values of different
types - because the DBMS actually converts one or both values so that
they can be compared.

--
Jonathan Leffler #include <disclaimer.h>
Email: jlef...@earthlink.net, jlef...@us.ibm.com
Guardian of DBD::Informix v2005.01 -- http://dbi.perl.org/

Marshall Spight

unread,

Jun 6, 2005, 3:27:48 AM6/6/05

to

> For instance, A is actually a domain, as is B, but the constraint A in fact be part of the very definition of the domain A, and here we are
> defining one domain in terms of another. I realized I have not seen this
> disccussed in the year or so I have been a regular here.

These kind of questions are fairly interesting, but the data management
field isn't exactly bristling with activity around type systems. For
that,
you have to go to the type theorists. They are quite an interesting
lot.
Much of what they say goes over my head, but much is also
comprehensible.

But I think I must take exception to your claim that we are defining
one domain in terms of another. And this rather gets to the heart
of a question that's been buzzing around the back of my head
for a year or two, namely: just what is the relationship between
types and constraints?

And I in fact have a tentative answer, which is that they are,
respectively, the compile time and run time behavior of the
variables. (Alternatively, the static and dynamic behavior
of the variables.)

Consider: the types of the columns of a table are statically
identifiable, and for any typed tuple variable whatsoever, we can
determine at compile time whether the variable is, say,
insert-compatible with the table. (Or whatever other analyses
we care to run.)

But what of constraints? Let us consider a foreign key constraint.
What can we say at compile time about whether a given insert
will be compatible with a foreign key constaint? Exactly nothing,
because the success or failure depends on the value of the variables
at the time the insert is attempted. We can't know until it's time
to try it.

I believe the same thing applies to your a<b constraint. It is not
in fact a part of the type at all; it is a constraint on the table
variable, which is a distinct thing. If we have a tuple variable
and we wish to insert its value into the table, we have to check
that a<b at the time of insert. It cannot be done statically. But
wait! you say. You could put such a constraint on the tuple
variable, and then you could verify it statically. Right!
The problem, though, is that you've just pushed back the
question one level. Can you push it back all the way? No,
because you might be getting values from an external source.
If you wanted to read two integers from standard input, you
couldn't statically tell that the first was less than the second,
so *there has to be a runtime check somewhere.* You might
as well put it right where it matters and not try to back-propogate
it through the code, as this will only complicate things unnecessarily.

Marshall

Alfredo Novoa

unread,

Jun 6, 2005, 4:36:34 AM6/6/05

to

Kenneth Downs ha escrito:

> Consider table X with columns A and B. There is a constraint that A must be
> less than B. While we discuss such constraints every day, I realize I do
> not know how to express such constraints in relational terms.

This is one of the easiest constraints we can imagine:

constraint X IsEmpty(T where B >= A);

> For instance, A is actually a domain, as is B, but the constraint A in fact be part of the very definition of the domain A, and here we are
> defining one domain in terms of another.

No, you are confusing everything. Domains and columns are not the same.

> I realized I have not seen this
> disccussed in the year or so I have been a regular here. Is defining one
> domain in terms of another allowed and considered trivial, no big deal?

It is allowed but I would not say that it is trivial. I suggest you to
read Date's Intro to Database Systems and The Third Manifesto. Although
it has nothing to do with your constraint, of course.

Regards

John

unread,

Jun 6, 2005, 6:22:14 AM6/6/05

to

Kenneth Downs wrote:

This is straightforward, but we must use the proper terminology.

Relation X has Attributes A and B.

For every Tuple t in the body of X, the attribute value for A must be
"<" the attribute value for B.

In X, the Attributes A and B are both of (in) some domain T.

You have supplied insufficient information to define T, but it is worth
noting that the domain T could be anything you like as long as the
comparison operator "<" is well defined with respect to that domain.

-----

Informally then, you have tried to apply a constraint to the domains,
which isn't what you wanted to do. The constraint is applied to the
tuples in the relation, NOT to the domain(s).

John

Paul

unread,

Jun 6, 2005, 6:41:32 AM6/6/05

to

Marshall Spight wrote:
> But I think I must take exception to your claim that we are defining
> one domain in terms of another. And this rather gets to the heart
> of a question that's been buzzing around the back of my head
> for a year or two, namely: just what is the relationship between
> types and constraints?
>
> And I in fact have a tentative answer, which is that they are,
> respectively, the compile time and run time behavior of the
> variables. (Alternatively, the static and dynamic behavior
> of the variables.)

Here's an example to think about: I create a domain/type called "Number"
which contains real numbers (maybe chopped off at some min/max values
and some level of decimal places or significant figures). I make a
column with this as the type. Then I add a column constraint like
"round(col) = col" i.e. the value must be an integer.

Is this functionally identical to creating an "integer" type and using that?

To go further, imagine a universal type that was a union of the set of
every possible type you could want. You set every column to this type
and then have column constraints to restrict each column to the values
you want.

Maybe this would link in with the thread on NULLs in that your universal
type would only need a single universal NULL?

Note: I'm not seriously advocating this, just arguing a logical point to
see if it reaches an absurdity.

Paul.

Alfredo Novoa

unread,

Jun 6, 2005, 7:09:18 AM6/6/05

to

On Mon, 06 Jun 2005 11:41:32 +0100, Paul <pa...@test.com> wrote:

>Here's an example to think about: I create a domain/type called "Number"
>which contains real numbers (maybe chopped off at some min/max values
>and some level of decimal places or significant figures). I make a
>column with this as the type. Then I add a column constraint like
>"round(col) = col" i.e. the value must be an integer.
>
>Is this functionally identical to creating an "integer" type and using that?

No, you would not have integer division, for instance.

>To go further, imagine a universal type that was a union of the set of
>every possible type you could want. You set every column to this type
>and then have column constraints to restrict each column to the values
>you want.

Then you would not have any operation other than equality. You would
have to "cast" the values all the time.

In Tutorial D there is a type that is the set of all scalar values:
Alpha

We can do things like this:

var a relation { a Alpha };

a := relation { tuple { a 1 }, tuple { a 'shoe' } };

Although this does not work with Rel AFAIK.

>Maybe this would link in with the thread on NULLs in that your universal
>type would only need a single universal NULL?

Nulls are not values, they are an ad hoc cheat.

>Note: I'm not seriously advocating this, just arguing a logical point to
>see if it reaches an absurdity.

Good attitude.

Regards

Kenneth Downs

unread,

Jun 6, 2005, 8:05:36 AM6/6/05

to

Jonathan Leffler wrote:

> Kenneth Downs wrote:
>> I'm wondering if some RM theorist might express something for me.
>>
>> Consider table X with columns A and B. There is a constraint that A must
>> be
>> less than B. While we discuss such constraints every day, I realize I do
>> not know how to express such constraints in relational terms.
>>
>> For instance, A is actually a domain, as is B, but the constraint A > must in fact be part of the very definition of the domain A, and here we
>> are defining one domain in terms of another.
>
> A is a 'column' or attribute; it is defined on a domain, but (except in
> Codd's 1970 paper), columns are different from domains. A domain is the
> data type.

OK.

>
> If A and B are attributes that must be compared (constrained) as you
> suggest, they would normally be the same type (defined on the same
> domain).

OK.

>
> Don't forget that a domain defines the complete set of values that can
> be stored in an attribute defined on that domain. If you have a domain
> of negative integers - call it A - and a domain of positive integers -
> call it B - then you can infer that all values in domain A are smaller
> than all values in domain B.
>
> So, I reject the contention that 'A < B' must be part of the definition
> of domain A, and I'm far from convinced that we are defining one domain
> in terms of another.

OK.

>
>> I realized I have not seen this
>> disccussed in the year or so I have been a regular here. Is defining one
>> domain in terms of another allowed and considered trivial, no big deal?
>> Is it actually not allowed and this is some SQL alteration of true RM?
>
> It depends in part to whom you speak. To compare two values, there must
> be a suitable comparator. The simplest cases expect to compare two
> values of the same type. In practice, SQL DBMS usually provide numerous
> implicit conversions such that you can 'compare' two values of different
> types - because the DBMS actually converts one or both values so that
> they can be compared.
>

Then I guess my question is restated, does the RM have anything to say about
defining one column in terms of another? While the constraint is generally
expressed and accepted as existing as a property of the *table*, is it not
in fact a fundamental property of COLUMN A?

Kenneth Downs

unread,

Jun 6, 2005, 8:14:12 AM6/6/05

to

Marshall Spight wrote:

>> For instance, A is actually a domain, as is B, but the constraint A > must in fact be part of the very definition of the domain A, and here we
>> are
>> defining one domain in terms of another. I realized I have not seen this
>> disccussed in the year or so I have been a regular here.
>
> These kind of questions are fairly interesting, but the data management
> field isn't exactly bristling with activity around type systems. For
> that,
> you have to go to the type theorists. They are quite an interesting
> lot.
> Much of what they say goes over my head, but much is also
> comprehensible.
>
> But I think I must take exception to your claim that we are defining
> one domain in terms of another. And this rather gets to the heart
> of a question that's been buzzing around the back of my head
> for a year or two, namely: just what is the relationship between
> types and constraints?

Jonathan gave a pretty clear answer on type <> domain, so I am with you so
far. Most importantly, the restatement of the question as relationship
twixt types and constraints is on the money. In my own mind I am also
asking if the constraint is a property of the column or the table.

>
> And I in fact have a tentative answer, which is that they are,
> respectively, the compile time and run time behavior of the
> variables. (Alternatively, the static and dynamic behavior
> of the variables.)

I didn't know where you were going with this at first, but it is quite neat.

But when you say compile time, are you speaking allegorically of the CREATE
TABLE command?

>
> Consider: the types of the columns of a table are statically
> identifiable, and for any typed tuple variable whatsoever, we can
> determine at compile time whether the variable is, say,
> insert-compatible with the table. (Or whatever other analyses
> we care to run.)

OK.

>
> But what of constraints? Let us consider a foreign key constraint.
> What can we say at compile time about whether a given insert
> will be compatible with a foreign key constaint? Exactly nothing,
> because the success or failure depends on the value of the variables
> at the time the insert is attempted. We can't know until it's time
> to try it.

Right.

>
> I believe the same thing applies to your a in fact a part of the type at all; it is a constraint on the table
> variable, which is a distinct thing. If we have a tuple variable
> and we wish to insert its value into the table, we have to check
> that a wait! you say. You could put such a constraint on the tuple
> variable, and then you could verify it statically. Right!
> The problem, though, is that you've just pushed back the
> question one level. Can you push it back all the way? No,
> because you might be getting values from an external source.
> If you wanted to read two integers from standard input, you
> couldn't statically tell that the first was less than the second,
> so *there has to be a runtime check somewhere.* You might
> as well put it right where it matters and not try to back-propogate
> it through the code, as this will only complicate things unnecessarily.
>

What's really bugging me though is the question of how the constraint is
defined.

In our db-builder system, our first-pass constraint system did no more than
allow you a list of expressions to be evaluated at the table level. When
we refined it, we defined the constraints in terms of allowed values for
individual columns, which was far easier to deal with. In other words, the
dictionary table holding constraints had as its foreign key a table and
column, not a table. This was much easier to work with.

When things get much easier and work much better, it is usually a clue you
have found the "true nature" of something. This is what got me wondering
if the RM has anything to say about stating the allowed values of one
column in terms of another, since that seems the most natural way to do it
in practice.

Kenneth Downs

unread,

Jun 6, 2005, 9:40:55 AM6/6/05

to

John wrote:

Thank you for the straightforward reply.

Alfredo Novoa

unread,

Jun 6, 2005, 10:18:43 AM6/6/05

to

On Mon, 06 Jun 2005 08:05:36 -0400, Kenneth Downs
<knode.wa...@see.sigblock> wrote:

>Then I guess my question is restated, does the RM have anything to say about
>defining one column in terms of another?

Queries.

> While the constraint is generally
>expressed and accepted as existing as a property of the *table*, is it not
>in fact a fundamental property of COLUMN A?

It is not.

Regards

Alfredo Novoa

unread,

Jun 6, 2005, 10:17:54 AM6/6/05

to

On Mon, 06 Jun 2005 08:14:12 -0400, Kenneth Downs
<knode.wa...@see.sigblock> wrote:

>Jonathan gave a pretty clear answer on type <> domain,

Where?

He said: column <> domain.

Type and domain are almost always used as synonyms.

> so I am with you so
>far. Most importantly, the restatement of the question as relationship
>twixt types and constraints is on the money. In my own mind I am also
>asking if the constraint is a property of the column or the table.

Your constraint is a property of the table because it is not a column
type constraint.

Regards

Paul

unread,

Jun 6, 2005, 11:13:07 AM6/6/05

to

Kenneth Downs wrote:
> Then I guess my question is restated, does the RM have anything to say about
> defining one column in terms of another?

you mean like a calculated column? Or a view?

> While the constraint is generally expressed and accepted as existing
> as a property of the *table*, is it not in fact a fundamental
> property of COLUMN A?

No, because if you drop column B, it become invalid. Maybe you could say
it's a property of the set of columns {A, B}, but it seems to me cleaner
and more useful to just say it's a table property.

Paul.

Kenneth Downs

unread,

Jun 6, 2005, 12:57:06 PM6/6/05

to

Alfredo Novoa wrote:

> On Mon, 06 Jun 2005 08:14:12 -0400, Kenneth Downs
> <knode.wa...@see.sigblock> wrote:
>
>>Jonathan gave a pretty clear answer on type <> domain,
>
> Where?
>
> He said: column <> domain.

my bad, I meant column <> domain

>
> Type and domain are almost always used as synonyms.
>
>> so I am with you so
>>far. Most importantly, the restatement of the question as relationship
>>twixt types and constraints is on the money. In my own mind I am also
>>asking if the constraint is a property of the column or the table.
>
> Your constraint is a property of the table because it is not a column
> type constraint.
>
>
> Regards

--

mAsterdam

unread,

Jun 6, 2005, 2:22:29 PM6/6/05

to

Kenneth Downs wrote:

The constraint 'A < B' suggests to me that there is
'something' of A in B or the other way round. In this
case it is easy to engage some isolation.
The base table (say W) should just contain A and C, C being a positive
number. X becomes a view with A = W.A and B = W.A + W.C

That this isolation is possible in this example
should not be taken to mean that it is in all cases
you have in mind, though.

My 2 Eurocents.

Gene Wirchenko

unread,

Jun 6, 2005, 7:47:29 PM6/6/05

to

On Mon, 06 Jun 2005 11:41:32 +0100, Paul <pa...@test.com> wrote:

[snip]

>To go further, imagine a universal type that was a union of the set of
>every possible type you could want. You set every column to this type
>and then have column constraints to restrict each column to the values
>you want.

I could want a type that is a local phone number, a string with
picture "999-9999". I could want a type that is a student number for
Thompson Rivers University, a string with picture "999-9999".
Complications may ensue.

>Maybe this would link in with the thread on NULLs in that your universal
>type would only need a single universal NULL?
>
>Note: I'm not seriously advocating this, just arguing a logical point to
>see if it reaches an absurdity.

Already there, already there.

Sincerely,

Gene Wirchenko

Marshall Spight

unread,

Jun 7, 2005, 1:04:15 AM6/7/05

to

>> And I in fact have a tentative answer, which is that they are,
>> respectively, the compile time and run time behavior of the
>> variables. (Alternatively, the static and dynamic behavior
>> of the variables.)
>
> I didn't know where you were going with this at first, but it is quite neat.
>
> But when you say compile time, are you speaking allegorically of the
> CREATE TABLE command?

Kinda. I can do it by looking at the create table command, because
that has all the static data I need. (It contains the type data.)

In a language like, say C++, there is a clear and obvious distinction
between compile time and runtime. In SQL, it is not at all obvious,
for two reasons: you have the interactive use of the language, and
you *untypechecked* use embedded in code. The distinction
between compile time and runtime is obscured, because the two
most common ways you use SQL have compile time and runtime
always paired.

But consider: any *type constraint* can be checked without
actually executing the query. Any non-type constraint cannot.
If you give me a SQL insert, and the table definitions of the
associated tables, I can tell you if the query is well-typed or
not, but I cannot tell you if the constraints are met.

So if you have a table A with a single int column a, and
you hand me "insert into A values ('hello, world')" I can
reject this as mistyped immediately, *without reference to
the current value of A.* This query is *always* wrong,
because it is mistyped.

In contrast, if there's a foreign key constraint, and you hand
me "insert into A values (1)" I cannot know whether this
will pass constraints without examining the database.
If your constraint involves invoking functions, I cannot know
whether it will pass without actually executing the function.

Marshall

Kenneth Downs

unread,

Jun 7, 2005, 10:28:22 AM6/7/05

to

mAsterdam wrote:

So are you suggesting the possibility that if A is contrained by B, then it
is possible that B can be determined by A or by other column values? This
would be exciting if it could be demonstrated in the general case.

But it does not seem to apply generally. Consider the case of credit limit
and customer exposure. Credit limit is a normal column in the customer
table. Exposure equals sum of open orders and invoices, and exposure may
not exceed credit limit. Notwithstanding whether or not we materialize the
column "exposure", it is defined as an attribute of the customer.