
What to call this operator?


Marshall Spight

Jun 26, 2005, 11:27:15 PM
In chapter 4 of The Third Manifesto, D&D define a "new relational
algebra."
This algebra includes two operations named "<AND>" and "<OR>". (They
use some weird triangle characters which I'm approximating with <>.)

Given relations S and T, having sets of attributes a (only in S),
b (in both S and T) and c (only in T), they define:

<AND> as { (a, b, c) | (a, b) in S, (b, c) in T }

<OR> as { (a, b, c) | (a, b) in S, c unconstrained UNION
(a, b, c) | (b, c) in T, a unconstrained }

"unconstrained" means that all values from the domain are present.

They go on to point out that <AND> is the natural join, but they
don't give a name to <OR>.
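These definitions can be made concrete over a small finite domain. The following Python sketch is mine, not D&D's: the relation encoding (sets of frozen attribute/value pairs), the attribute names, and the two-element domain are all illustrative assumptions.

```python
from itertools import product

DOMAIN = {1, 2}  # illustrative two-element domain for every attribute

def rel_and(s, t):
    # <AND>: natural join -- merge tuples that agree on all shared attributes
    out = set()
    for a in s:
        for b in t:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

def rel_or(s, t, s_heading, t_heading):
    # <OR>: every tuple over the combined heading whose projection onto
    # S's heading is in S, or whose projection onto T's heading is in T;
    # attributes outside a heading are "unconstrained" (range over DOMAIN)
    heading = sorted(s_heading | t_heading)
    out = set()
    for values in product(sorted(DOMAIN), repeat=len(heading)):
        tup = dict(zip(heading, values))
        if (frozenset((k, tup[k]) for k in s_heading) in s
                or frozenset((k, tup[k]) for k in t_heading) in t):
            out.add(frozenset(tup.items()))
    return out

# S(a, b) = {(1, 1)}, T(b, c) = {(1, 2)}
S = {frozenset({'a': 1, 'b': 1}.items())}
T = {frozenset({'b': 1, 'c': 2}.items())}

print(len(rel_and(S, T)))                          # 1: just (a=1, b=1, c=2)
print(len(rel_or(S, T, {'a', 'b'}, {'b', 'c'})))   # 3: c (resp. a) ranges freely
```

Note that <OR> has to enumerate the whole domain for the missing attributes, which is exactly why it blows up (or becomes infinite) in general.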

Does anyone have a good idea for what it should be called?
I don't like "or" because it's ambiguous with the boolean
operator. "<OR>" isn't great for syntactic reasons. "Disjunction"
is cumbersome. I'd like to hear something analogous to "join."
What about "meet", does that work? It's the usual counterpart to
"join" but I don't know enough math to decide if it's appropriate.

Anyone have any other ideas?


Marshall

paul c

Jun 27, 2005, 9:18:25 AM
Marshall Spight wrote:
> In chapter 4 of The Third Manifesto, D&D define a "new relational
> algebra."
> This algebra includes two operations named "<AND>" and "<OR>". (They
> use some weird triangle characters which I'm approximating with <>.)

on pg 44 they call those characters arrowheads. i like the association
with anthropology.

>
> Given relations S and T, having sets of attributes a (only in S),
> b (in both S and T) and c (only in T), they define:
>
> <AND> as { (a, b, c) | (a, b) in S, (b, c) in T }
>
> <OR> as { (a, b, c) | (a, b) in S, c unconstrained UNION
> (a, b, c) | (b, c) in T, a unconstrained }
>
> "unconstrained" means that all values from the domain are present.
>
> They go on to point out that <AND> is the natural join, but they
> don't give a name to <OR>.

in the 'parallel' paragraphs on pg 56 they call it 'union'. they also
suggest 'conjoin' and 'disjoin'.


>
> Does anyone have a good idea for what it should be called?
> I don't like "or" because it's ambiguous with the boolean
> operator. "<OR>" isn't great for syntactic reasons. "Disjunction"
> is cumbersome. I'd like to hear something analogous to "join."
> What about "meet", does that work? It's the usual counterpart to
> "join" but I don't know enough math to decide if it's appropriate.

this field is full of people calling different things by the same names.
personally, i'm not bothered by 'and' and 'or' or 'relational and',
'relational or'. it's all a force-fit anyway, trying to obtain a single
relation result.

>
> Anyone have any other ideas?

multiple relation results!

p

David Cressey

Jun 27, 2005, 12:01:44 PM

"Marshall Spight" <marshal...@gmail.com> wrote in message
news:1119842835.4...@f14g2000cwb.googlegroups.com...

With apologies to Umberto Eco, you could call it a "rose".

But seriously, I'm unable to figure out from the formal description what it
is. More importantly, I'm unable to figure out what it is FOR. Not that
there's anything wrong with your formal description. It's just my own
unfamiliarity with it that causes problems.

How does <OR> differ from "full outer join"?

Is the "unconstrained" construct just another way of introducing NULLs (for
unknown values), without using that dreaded term?

What is the value of <OR> MINUS <AND> ???

What would <OR> be useful for?

Sorry to answer a question with so many questions, but it's the best I can
do, for now.


Mikito Harakiri

Jun 27, 2005, 12:43:41 PM

The outer union operator dates back to Codd,
"Extending the Database Relational Model to Capture More Meaning"

See also Galindo-Legaria "Outer joins as disjunctions"

In my opinion this definition of union is less interesting than
http://arxiv.org/pdf/cs.DB/0501053

Marshall Spight

Jun 27, 2005, 4:22:05 PM

I skimmed that paper and it looks quite interesting. (I'll try to
give it a deeper read tonight.) I'm unfamiliar with the tensor
product, and Googling only gave me vector definitions which I
couldn't map into relations.

As a math paper it makes sense, but I wouldn't expect to be
able to use those concepts unmodified in trying to build
a computer system because of the use of infinite relations.
It might be possible to specify these intensionally but
of course not extensionally. So for example in sect 3.2,
selection is defined in terms of join + an infinite relation.
I don't know that it's all that much of a difference to
supply the relation intensionally, aka with a selection
predicate. Of course, this means adding back in selection
as a primitive. Hmmm; maybe you could extend join so it
works with explicitly intensional relations. Hmmm.

My intuition is that there might be some advantage to
building systems with as few primitives as possible
because it would simplify the optimizer. But that's
not immediately clear.

Very interesting.


Marshall

Jon Heggland

Jun 28, 2005, 5:13:23 AM
In article <IpVve.12426$pa3....@newsread2.news.atl.earthlink.net>,
david....@earthlink.net says...

> But seriously, I'm unable to figure out from the formal description what it
> is. more importantly, I'm unable to figure out what it is FOR. Not that
> there's anything wrong with your formal description. It's just my own
> unfamiliarity with it that causes problems.

It is part of a minimalistic approach to relational algebra that is
geared more towards logic than set theory. <OR> is a generalisation of
union. If the relations have the same heading, the result is a union.

I think the point is to have a counterpart to <AND> (which covers join,
product, intersection, selection and extension) that covers union, but
places no restrictions on the types of the operands, and has simple
logic-based semantics.

> How does <OR> differ from "full outer join"?

It is more like outer union than outer join. It has clearly defined
semantics, and there are no nulls. It is, however, possibly infinite.
--
Jon

David Cressey

Jun 28, 2005, 9:24:34 AM

"Jon Heggland" <hegg...@idi.ntnu.no> wrote in message
news:MPG.1d2b3321b...@news.ntnu.no...

Thanks Jon. I'm starting to get the picture, however dimly. <AND> and <OR>
are both subsets of the cartesian product of a, b, and c. Because of the
way they are both defined, the linkage domain, b, follows rules for a
"natural join", at my level of abstraction (which is less than this
formalism).

And now I'm starting to fill in the gaps. If A is in a, B is in b, and C is
in c, then

A, B, C is in S <AND> T iff A,B is in S and B,C is in T.
A, B, C is in S <OR> T iff A,B is in S or B,C is in T.

Is this right? Is it complete?

Mikito Harakiri

Jun 28, 2005, 12:37:17 PM
Jon Heggland wrote:
> In article <IpVve.12426$pa3....@newsread2.news.atl.earthlink.net>,
> david....@earthlink.net says...
> > But seriously, I'm unable to figure out from the formal description what it
> > is. more importantly, I'm unable to figure out what it is FOR. Not that
> > there's anything wrong with your formal description. It's just my own
> > unfamiliarity with it that causes problems.
>
> It is part of a minimalistic approach to relational algebra that is more
> geared towards logic instead of set theory. <OR> is a generalisation of
> union. If the relations have the same heading, the result is a union.
>
> I think the point is to have a counterpart to <AND> (which covers join,
> product, intersection, selection and extension) that covers union, but
> places no restrictions on the types of the operands, and has simple
> logic-based semantics.

I wonder how far this algebra is developed. Both binary operations
<AND> and <OR> are idempotent, commutative and associative. These three
properties are enough to define a semilattice. That is, we have 2
semilattices:
1. the upper semilattice for <OR>
2. the lower semilattice for <AND>
Each semilattice defines a partial order relation, one for the upper
lattice
x<=y iff y=x<OR>y
and one for the lower one
x<=y iff x=x<AND>y
Even though I sloppily used the same symbol "<=", these two partial
orders are incompatible unless the algebra meets the *Absorption Law*.
The absorption law merges the two semilattices into a single lattice.

Unfortunately, D&D algebra doesn't meet the absorption law. The Lattice
algebra that I mentioned in the other thread does. The partial order
there is a generalization of the "is subset of" relation applied to
any pair of database relations, even those with different headings.

Neither of those algebras (D&D, nor Lattice) is boolean. D&D has a nice
distributivity property, although without absorption it doesn't buy us
much. Lattice algebra doesn't have distributivity, which looks like a
serious obstacle when transforming expressions.
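The absorption failure can be seen concretely over a finite domain. This Python check is my own sketch (the finite-domain version of <OR>, the relation encoding, and the data are illustrative assumptions, since D&D's <OR> is defined over whole domains):

```python
from itertools import product

DOMAIN = {1, 2}  # illustrative finite domain

def rel_and(r, s):
    # D&D <AND> (natural join) on relations encoded as sets of
    # frozenset(dict.items())
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

def rel_or(r, s, rh, sh):
    # D&D <OR> over the finite DOMAIN: tuples on the combined heading
    # whose projection onto rh lies in r, or onto sh lies in s
    heading = sorted(rh | sh)
    out = set()
    for vals in product(sorted(DOMAIN), repeat=len(heading)):
        t = dict(zip(heading, vals))
        if (frozenset((k, t[k]) for k in rh) in r
                or frozenset((k, t[k]) for k in sh) in s):
            out.add(frozenset(t.items()))
    return out

A = {frozenset({'x': 1}.items())}   # heading {x}
B = {frozenset({'y': 1}.items())}   # heading {y}

# Absorption would require A <AND> (A <OR> B) == A, but the result
# keeps attribute y, so it is not even a relation over A's heading.
absorbed = rel_and(A, rel_or(A, B, {'x'}, {'y'}))
print(absorbed == A)  # False
```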


Mikito Harakiri

Jun 28, 2005, 1:04:31 PM
As Jon correctly noticed, I goofed with outer union:-)

Marshall Spight wrote:
> > http://arxiv.org/pdf/cs.DB/0501053
>
> I skimmed that paper and it looks quite interesting. (I'll try to
> give it a deeper read tonight.) I'm unfamiliar with the tensor
> product, and Googling only gave me vector definitions which I
> couldn't map into relations.

Now that neither Cartesian product nor tensor product is considered
primitive, they are somewhat less interesting. It was called "Tensor"
because it behaves multiplicatively on columns and additively on rows
(ahem, tuples). More pressing questions are how to express difference
and aggregation. Without answering these there is little practicality.

> As a math paper it makes sense, but I wouldn't expect to be
> able to use those concepts unmodified in trying to build
> a computer system because of the use of infinite relations.

Well, in D&D algebra an infinite relation can be the result of a primitive
operation (<OR>) applied to finite relations! Lattice algebra produces
finite relations whenever the arguments are finite, although there is
little comfort in that.

> My intuition is that there might be some advantage to
> building systems with as few primitives as possible
> because it would simplify the optimizer. But that's
> not immediately clear.

Simplifying query transformations was indeed one of the goals. With only 2
operations one can hope to make query rewrite formal and mechanical.
The major stumbling block, however, is non-distributivity.

paul c

Jun 28, 2005, 4:12:55 PM
Mikito Harakiri wrote:
> As Jon correctly noticed, I goofed with outer union:-)
>
> Marshall Spight wrote:
>
>> ...

>
>>My intuition is that there might be some advantage to
>>building systems with as few primitives as possible
>>because it would simplify the optimizer. But that's
>>not immediately clear.
>
>
> Simplifying query transformations was indeed one the goals. With only 2
> operations one can hope to make query rewrite formal and mechanical.
> The major stumbling block, however, is non-distributivity.
>

D&D chapter 4 intrigues me for a similar reason - it seems more directly
programmable suggesting a more elemental (and i would hope, smaller)
implementation without the confusion that i think the sql products have
produced with their attention to user artifacts such as files and
'tables' which seem to have led to all kinds of detours and dead ends
over the last 30 years. just my intuition too.

i don't know enough theory to talk much about optimization, but the
optimizations that would intrigue me would be ones that let results,
intermediate or 'final' be expressed as multiple relations. maybe my
ignorance is also showing when i say that i wouldn't mind having to
operate on a negated table.

the place of <OR> in the world puzzles me too. for one thing it appears
to me that it produces the same result as <AND> when there are no
attributes in common, ie. cartesian product. am i wrong?

if it does produce the cartesian product, is this somehow contrary to
orthogonality?

so far, the only use i can see for <OR> is as a separate version to
double-check the results of <AND> and <NOT>. or maybe as an
optimization on occasion.

p

Mikito Harakiri

Jun 28, 2005, 4:48:41 PM
paul c wrote:
> D&D chapter 4 intrigues me for a similar reason - it seems more directly
> programmable suggesting a more elemental (and i would hope, smaller)
> implementation without the confusion that i think the sql products have
> produced with their attention to user artifacts such as files and
> 'tables' which seem to have led to all kinds of detours and dead ends
> over the last 30 years. just my intuition too.

SQL products aside, 5 basic classic relational operators (+renaming) are
just too many for an algebra to bear. And then, when we consider view
equations, an algebra with complex operators simply defies developing any
expression rewriting technique.

> the place of <OR> in the world puzzles me too. for one thing it appears
> to me that it produces the same result of <AND> when there are no
> attributes in common, ie. cartesian product. am i wrong?

If relations have disjoint headers, the result would be an infinite
relation, not the Cartesian product.

> if it does product the cartesian product, is this somehow contrary to
> orthogonality?
>
> so far, the only use i can see for <OR> is as a separate version to
> double-check the results of <AND> and <NOT>. or maybe as an
> optimization on occasion.

Once again, there are at least three versions of union definition to
consider:
1. D&D
2. outer union
3. Lattice

I don't quite see, though, how options #1 and #2 help reduce the number
of primitive operations. How does D&D represent renaming and
projection, for example?

paul c

Jun 28, 2005, 5:45:06 PM
Mikito Harakiri wrote:
> paul c wrote:
>
>>...

>>the place of <OR> in the world puzzles me too. for one thing it appears
>>to me that it produces the same result of <AND> when there are no
>>attributes in common, ie. cartesian product. am i wrong?
>
>
> If relations have disjoint headers, the result would be an infinite
> relation, not the Cartesian product.
>

i gather that you are saying that the definition means i must infer all
possible tuples given whatever domains their attributes specify rather
than all possible tuples based only on the contents of the original
relations.

just to see if i've got this straight i wonder if you'd check this:

if i may leave infinity out of it and if i may assume a common finite
domain with symbols for its values of 1 and 2 and if i may order the
attributes just for this purpose and i have r1=<1> and r2 = <1> but no
common attribute names (ie. disjoint headers) then r1 <or> r2 would be
<1,1>,<1,2>,<2,1>, whereas the cartesian product would be <1,1>.

if that's right, then i am a bit happier about <or> in that it doesn't
seem to give the same result as <and> (although i can't think of a good
reason right now for objecting to that).
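This worked example can be checked mechanically. A minimal Python sketch, assuming the two-value domain from the example (the tuple encoding is mine):

```python
from itertools import product

DOMAIN = {1, 2}  # the common two-value domain from the example

# r1 has one attribute with the single tuple <1>; r2 likewise, but over
# a disjoint attribute
r1 = {(1,)}
r2 = {(1,)}

# cartesian product: pair up only the tuples actually present
cart = {(x, y) for (x,) in r1 for (y,) in r2}

# D&D <or> over disjoint headers: every pair whose first component is a
# tuple of r1, OR whose second component is a tuple of r2
d_or = {(x, y) for x, y in product(sorted(DOMAIN), repeat=2)
        if (x,) in r1 or (y,) in r2}

print(sorted(cart))  # [(1, 1)]
print(sorted(d_or))  # [(1, 1), (1, 2), (2, 1)]
```

The two results match the values claimed above: <or> keeps <1,2> and <2,1> because one side of each tuple appears in one of the operands, while the cartesian product does not.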

>
>>if it does product the cartesian product, is this somehow contrary to
>>orthogonality?
>>
>>so far, the only use i can see for <OR> is as a separate version to
>>double-check the results of <AND> and <NOT>. or maybe as an
>>optimization on occasion.
>
>
> Once again, there are at least three versions of union definition to
> consider:
> 1. D&D
> 2. outer union
> 3. Lattice
>
> I don't quite see though how options #1 and #2 help reducing the number
> of primitive operations. How does D&D represents renaming and
> projection, for example?

i gather you're not pleased with D&D's idea of 'treating operators as
relations'. must admit i don't think i understand it entirely. will
have to ponder it.

many thanks,
p

Mikito Harakiri

Jun 28, 2005, 5:59:49 PM
paul c wrote:
> if that's right, then i am a bit happier about <or> in that it doesn't
> seem to give the same result of <and> (although i can't think of a good
> reason right now for objecting to that).

Your intuition is quite right: it would be very suspicious if they
produced the same result. <OR> and <AND> are dual in the "normal"
boolean algebra, so the only time they produce the same
result is when the first argument is the same as the second one. Once
again, relational algebra is not boolean. The notion of duality,
however, still has a well-defined interpretation in the Lattice model.

paul c

Jun 28, 2005, 6:07:32 PM
paul c wrote:
> Mikito Harakiri wrote:
>
>> paul c wrote:
>>
>>> ...
>>> the place of <OR> in the world puzzles me too. for one thing it appears
>>> to me that it produces the same result of <AND> when there are no
>>> attributes in common, ie. cartesian product. am i wrong?
>>
>>
>>
>> If relations have disjoint headers, the result would be an infinite
>> relation, not the Cartesian product.
>>
>
>
> i gather that you are saying that the definition means i must infer all
> possible tuples given whatever domains their attributes specify rather
> than all possible tuples based only on the contents of the original
> relations.
>
> just to see if i've got this straight i wonder if you'd check this:
>
> if may leave infinity out of it and if i may assume a common finite
> domain with symbols for its values of 1 and 2 and if i may order the
> attributes just for this purpose and i have r1=<1> and r2 = <1> but no
> common attribute names (ie. disjoint headers) then r1 <or> r2 would be
> <1,1>,<1,2>,<2,1>, whereas the cartesian product would be <1,1>.
>
> ...

to put it another way, if the result of <or> happens to be <a,b> i can
interpret the result as:

a is true and b is true
or (in the more common sense, not the D&D <or>)
a is true and b is false
or
a is false and b is true

but not "a is false and b is false".

i think now that is the way i should have approached the definition in
the first place as it reads more like the usual definition of logical
'or'.

still, i can't get it through my head why it is important to allow
infinite domains. granted that results for finite domains could still
be very large, but other times they could be very small!

is the reason, say for integers, simply that integers in math are
considered infinite? if so, there's some historical evidence that
limiting them won't cause problems for many applications.

or is the thinking that the number of times results using infinite
domains are small is no fewer than if the domains were finite?


thanks again,
p

Jon Heggland

Jun 29, 2005, 2:59:47 AM
In article <1119991721....@o13g2000cwo.googlegroups.com>,
mikharaki...@yahoo.com says...

> > the place of <OR> in the world puzzles me too. for one thing it appears
> > to me that it produces the same result of <AND> when there are no
> > attributes in common, ie. cartesian product. am i wrong?
>
> If relations have disjoint headers, the result would be an infinite
> relation, not the Cartesian product.

Minor nit: It would not be infinite if all the domains involved were
finite.

> I don't quite see though how options #1 and #2 help reducing the number
> of primitive operations. How does D&D represents renaming and
> projection, for example?

As fundamental operators. Or to be more precise: projection is called
<REMOVE>, specifies the attributes to remove instead of retain, and
corresponds to the existential quantifier. I think D&D's <RENAME> could
be dispensed with using Tropashko's technique, though.
--
Jon

Jon Heggland

Jun 29, 2005, 3:12:17 AM
In article <ESjwe.1820496$Xk.1729912@pd7tw3no>, toledob...@oohay.ac
says...

> to put it another way, if the result of <or> happens to be <a,b> i can
> interpret the result as:
>
> a is true and b is true
> or (in the more common sense, not the D&D <or>)
> a is true and b is false
> or
> a is false and b is true
>
> but not "a is false and b is false".

Or, to put it more simply, "a is true or b is true". Whereas the result
of an <and> operation is interpreted as "a is true and b is true".

> still, i can't get it through my head why it is important to allow
> infinite domains. granted that results for finite domains could still
> be very large, but othertimes they could be very small!

In theory, domains can be infinite, so the theory has to take that into
account. In an implementation, domains are always finite, of course---
though they are most likely large enough that it is impractical to
materialise the results of such <or> invocations (or <not>s, of course).
I don't really understand your objection, though.
--
Jon

Jon Heggland

Jun 29, 2005, 4:11:09 AM
In article <1119976637....@g47g2000cwa.googlegroups.com>,
mikharaki...@yahoo.com says...

> I wonder how far this algebra is developed. Both binary operations
> <AND> and <OR> are idempotent, commutative and associative. These three
> properties are enough to define a semilattice. That is, we have 2
> semilattices:
> 1. the upper semilattice for <OR>
> 2. the lower semilattice for <AND>
> Each semilattice defines a partial order relation, one for the upper
> lattice
> x<=y iff y=x<OR>y
> and one for the lower one
> x<=y iff x=x<AND>y
> Even though I sloppily used the same symbol "<=", these two partial
> orders are incompatible unless the algebra meets the *Absorption Law*.
> The absorption law merges the two semilattices into a single lattice.

This is very interesting. Would you mind elaborating? Can you formalise
the absorption law? How did you arrive at these partial order
definitions?

> Unfortunately, D&D algebra doesn't meet the absorption law. The Lattice
> algebra that I mentioned in the other thread does. The partial order
> there is a generalization of the "is subset of" relation applied to
> any pair of database relations, even those with different headings.

I quite liked the symmetry of the join and union of Tropashko's lattice
algebra---that the attributes of a join result are the union of the
attributes of its operands, and the attributes of a union result are the
intersection of the attributes of its operands.

However, the symmetry/duality is lost in the semantics of the
operations---the relation predicate of the result. If relvars A and B
have predicates PA and PB, respectively, the expression A JOIN/<AND> B
has the predicate PA AND PB (logical and) in both D&D and Tropashko.
However, D&D's A <OR> B has the predicate PA OR PB (logical or), while
Tropashko's A UNION B has a rather more complicated and less intuitive
predicate (at first glance).

> Neither of those algebras (D&D,nor Lattice) is boolean. D&D has nice
> distributivity property, although without absorption it doesn't buy us
> much.

Why is this absorption so important?
--
Jon

Jon Heggland

Jun 29, 2005, 4:13:09 AM
In article <mccwe.363$aY6...@newsread1.news.atl.earthlink.net>,
david....@earthlink.net says...

> Thanks Jon. I'm starting to get the picture, however dimly. <AND> and <OR>
> are both subsets of the cartesian product of a, b, and c.

No. See the other thread branch.
--
Jon

Mikito Harakiri

Jun 29, 2005, 12:19:01 PM

Jon Heggland wrote:
> In article <1119976637....@g47g2000cwa.googlegroups.com>,
> mikharaki...@yahoo.com says...
> > I wonder how far this algebra is developed. Both binary operations
> > <AND> and <OR> are idempotent, commutative and associative. These three
> > properties are enough to define a semilattice. That is, we have 2
> > semilattices:
> > 1. the upper semilattice for <OR>
> > 2. the lower semilattice for <AND>
> > Each semilattice defines a partial order relation, one for the upper
> > lattice
> > x<=y iff y=x<OR>y
> > and one for the lower one
> > x<=y iff x=x<AND>y
> > Even though I sloppily used the same symbol "<=", these two partial
> > orders are incompatible unless the algebra meets the *Absorption Law*.
> > The absorption law merges the two semilattices into a single lattice.
>
> This is very interesting. Would you mind elaborating? Can you formalise
> the absorption law? How did you arrive at these partial order
> definitions?

This was nothing more than a rehash of the first 2 pages of

http://math.berkeley.edu/~gbergman/245/Ch.5.ps

In fact, Bergman calls the absorption law "compatibility":
x /\ (x\/y) = x

> > Unfortunately, D&D algebra doesn't meet the absorption law. The Lattice
> > algebra that I mentioned in the other thread does. The partial order
> > there is a generalization of the "is subset of" relation applied to
> > any pair of database relations, even those with different headings.
>
> I quite liked the symmetry of the join and union of Tropashko's lattice
> algebra---that the attributes of a join result are the union of the
> attributes of its operands, and the attributes of a union result are the
> intersection of the attributes of its operands.
>
> However, the symmetry/duality is lost in the semantics of the
> operations---the relation predicate of the result. If relvars A and B
> have predicates PA and PB, respectively, the expression A JOIN/<AND> B
> has the predicate PA AND PB (logical and) in both D&D and Tropashko.
> However, D&D's A <OR> B has the predicate PA OR PB (logical or), while
> Tropashko's A UNION B has a rather more complicated and less intuitive
> predicate (on first glance).

I'm not qualified to comment on the calculus perspective.

> > Neither of those algebras (D&D,nor Lattice) is boolean. D&D has nice
> > distributivity property, although without absorption it doesn't buy us
> > much.
>
> Why is this absorption so important?

Again, see compatibility in Bergman's notes. Other than that I can't do
better than hand waving. With the lattice order everything comes together:
projection and renaming are magically expressed in meet and join
terms, and "subset of" is generalized to be applicable to any pair of
database relations. Distributivity is less important because one can
still transform expressions without it.

Marshall Spight

Jun 29, 2005, 2:24:26 PM
Okay, here's what I came up with. The goal is to take this algebra
and produce something that can be implemented. This means
that avoiding infinite relations is desirable.

Base operations: natural join and generalized union, henceforth
referred to as "join" and "meet." Both work on any stored
relation values.

With these operators, we can implement join, union, (duh) and
projection.

In addition, we allow ourselves to construct arbitrary functions
with (domain, range) both being tuples of named fields.

The join operator is extended to work with a relation value
and a function value, such that the fields of the function
domain are a subset of the fields of the relation. This is
easy to implement. With this operator we can obviously
implement extend. With extend and project, we can implement
rename.

If we use join-func with a boolean function and join with
{ (result=true) } we can implement difference and restrict.

That pretty much wraps it up.

One final note: reducing everything down to these 2.5
operations has another advantage, which is that it makes
key inference a lot simpler.

With join, all keys are preserved.
With meet, all the keys that were completely contained
in the remaining columns remain. If there are none, then
you get a single key that covers all the remaining columns.

Did I get that right?


Marshall

Mikito Harakiri

Jun 29, 2005, 2:08:48 PM
Jon Heggland wrote:
> As fundamental operators. Or to be more precise: projection is called
> <REMOVE>, specifies the attributes to remove instead of retain, and
> corresponds to the existential quantifier. I think D&D's <RENAME> could
> be dispensed with using Tropashko's technique, though.

Ah, yes, renaming A(x,y)->A(w,z) is a composition of join and
projection

project_w,z( A(x,y) |><| (x=w) |><| (y=z) )
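Over a finite domain this composition can be checked directly. A minimal Python sketch, assuming an illustrative three-value domain and a toy relation A (the encoding, names, and data are mine):

```python
DOMAIN = {1, 2, 3}  # illustrative finite domain

def natural_join(r, s):
    # join relations encoded as sets of frozenset(dict.items())
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

def project(r, attrs):
    # keep only the named attributes of each tuple
    return {frozenset((k, v) for k, v in t if k in attrs) for t in r}

def eq(a1, a2):
    # the equality relation {(a1=v, a2=v) | v in DOMAIN}
    return {frozenset({a1: v, a2: v}.items()) for v in DOMAIN}

# rename A(x,y) -> (w,z) as project_{w,z}( A |><| (x=w) |><| (y=z) )
A = {frozenset({'x': 1, 'y': 2}.items())}
renamed = project(natural_join(natural_join(A, eq('x', 'w')),
                               eq('y', 'z')),
                  {'w', 'z'})
print(renamed == {frozenset({'w': 1, 'z': 2}.items())})  # True
```

Here the equality relations (x=w) and (y=z) are materialised over the finite domain; in general they would be infinite, which is exactly the point made elsewhere in the thread.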

paul c

Jun 29, 2005, 1:37:37 PM

i probably don't understand it either. my objection might be merely
psychological, being used to programs that finish sooner or later.

another thing that intrigues me about D&D <or> (now that Mikito has
corrected my interpretation of it) is that maybe it is a way to expose
(perhaps i should say materialize?) a domain.

p

Marshall Spight

Jun 29, 2005, 1:36:41 PM
Jon Heggland wrote:
>
> I quite liked the symmetry of the join and union of Tropashko's lattice
> algebra---that the attributes of a join result are the union of the
> attributes of its operands, and the attributes of a union result are the
> intersection of the attributes of its operands.
>
> However, the symmetry/duality is lost in the semantics of the
> operations---the relation predicate of the result. If relvars A and B
> have predicates PA and PB, respectively, the expression A JOIN/<AND> B
> has the predicate PA AND PB (logical and) in both D&D and Tropashko.
> However, D&D's A <OR> B has the predicate PA OR PB (logical or), while
> Tropashko's A UNION B has a rather more complicated and less intuitive
> predicate (on first glance).

I look at it this way:

(I'll call Tropashko's operator "generalized union.")

A natural join B: the union of the columns and the intersection of the
rows

A generalized union B: the intersection of the columns and union of the
rows.

Quite symmetric, as I see it.


Marshall
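This symmetry can be sketched in Python. The encoding below is my own toy representation; the relation data and attribute names are illustrative:

```python
def natural_join(r, s):
    # union of the columns; keep tuples that agree on shared columns
    out = set()
    for a in r:
        for b in s:
            da, db = dict(a), dict(b)
            if all(da[k] == db[k] for k in da.keys() & db.keys()):
                out.add(frozenset({**da, **db}.items()))
    return out

def gen_union(r, s):
    # intersection of the columns; union of the projected rows
    rh = set(dict(next(iter(r)))) if r else set()
    sh = set(dict(next(iter(s)))) if s else set()
    h = rh & sh
    proj = lambda rel: {frozenset((k, v) for k, v in t if k in h)
                        for t in rel}
    return proj(r) | proj(s)

A = {frozenset({'x': 1, 'y': 1}.items()),
     frozenset({'x': 2, 'y': 2}.items())}
B = {frozenset({'y': 1, 'z': 5}.items())}

print(len(natural_join(A, B)))  # 1 tuple: (x=1, y=1, z=5)
print(len(gen_union(A, B)))     # 2 rows over heading {y}: (y=1), (y=2)
```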

Vadim Tropashko

Jun 30, 2005, 12:53:50 PM
Mikito Harakiri wrote:
> "subset of" is generalized to be applicable to any pair of
> database relations.

There are a couple of minor but cute comments about the "is subset of"
relation.

Let's use capital letters for (database) relation variables. Normally we
write A(x,y) to denote a relation A with attributes x and y. For empty
relations A(x), A(y), B(z) let's just write them as x, y, z. Let "<="
denote "is subset of". Then

x|><|y <= A(x,y)

or more succinctly

x|><|y <= A

As in math notation, let's drop the join symbol, assuming that in
the absence of other operators it's implicit between any 2 relation
variables. Then we have

x y <= A

That's right: instead of the bracket notation A(x,y) saying that relation A
has attributes x and y, we can just write "A >= x y", implying that A is
a superset of the join of the empty relations x and y.

Vadim Tropashko

Jun 30, 2005, 1:31:41 PM
Marshall Spight wrote:
> Okay, here's what I came up with. The goal is to take this algebra
> and produce something that can be implemented. This means
> avoiding infinite relations is a desirable goal.

Avoiding infinite relations is one possibility. Transforming an
expression to a form where the order of operations guarantees that
there is no infinite intermediate result is another (how?). Pipelining
the execution (assuming enumerable domains) is yet another.

> Base operations: natural join and generalized union, henceforth
> referred to as "join" and "meet." Both work on any stored
> relation values.
>
> With these operators, we can implement join, union, (duh) and
> projection.
>
> In addition, we allow ourselves to construct arbitrary functions
> with (domain, range) both being tuples of named fields.
>
> The join operator is extended to work with a relation value
> and a function value, such that the fields of the function
> domain are a subset of the fields of the relation. This is
> easy to implement. With this operator we can obviously
> implement extend. With extend and project, we can implement
> rename.
>
> If we use join-func with a boolean function and join with
> { (result=true) } we can implement difference and restrict.
>
> That pretty much wraps it up.

This indeed might be simpler from an implementational than a theoretical
perspective -- difficult to tell from a brief outline of your idea.

Regarding the difference operator, the paper is left hanging with a
flaw, as difference can't be expressed directly via join and meet. It
can be expressed as a set of equations, though. Formally,

Z union (X join Y) = X union Y
Z join X join Y = {}

defines symmetric difference Z for any X and Y. (The ordinary
difference, then, is a join of the symmetric difference Z with relation
X.)

The situation is somewhat similar to arithmetic, where we define minus
via the solution of the equation

x + y = z

With this approach, existence and uniqueness of the solution is a
theorem to prove :-)
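For the special case of relations over identical attributes (where join degenerates to set intersection and generalized union to set union), the pair of equations above can be sanity-checked in a few lines. This is my own sketch, not from the thread:

```python
def check_symmetric_difference(X, Y):
    # For relations over the SAME attributes, join = set intersection
    # and generalized union = set union.
    Z = X ^ Y  # candidate solution: the symmetric difference
    eq1 = (Z | (X & Y)) == (X | Y)  # Z union (X join Y) = X union Y
    eq2 = (Z & X & Y) == set()      # Z join X join Y = {}
    # The ordinary difference X minus Y is then the join of Z with X:
    eq3 = (Z & X) == (X - Y)
    return eq1 and eq2 and eq3

X = {(1, 'a'), (2, 'b')}
Y = {(2, 'b'), (3, 'c')}
print(check_symmetric_difference(X, Y))  # True
```

This only checks that the symmetric difference satisfies the equations for same-schema operands; the uniqueness claim for general X and Y remains the theorem to prove.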

Mikito Harakiri
Jun 30, 2005, 3:02:09 PM
Vadim Tropashko wrote:
> Let use capital letters for (database) relation variables. Normally we
> write A(x,y) to denote a relation A with attributes x and y. For empty
> relations A(x), A(y), B(z) let just write them as x,y,z. Let also "<="
> to denote "is subset of". Then
>
> x|><|y <= A(x,y)
>
> or more succinctly
>
> x|><|y <= A
>
> Similar to math nothation let's drop the join symbol, assuming that in
> the absence of the other operators it's implicit between any 2 relation
> variables. There we have
>
> x y <= A
>
> That's right, instead of bracket notation A(x,y) saying that relation A
> has attributes x and y, we can just write "A >= x y" implying that A is
> a superset of join of empty relations x and y.

On a symmetrical note, let's use capital letters X, Y etc. to denote
infinite relations, each of which is a full domain. Then

A <= X Y

Jon Heggland
Jul 1, 2005, 5:10:36 AM
In article <1120069466.3...@o13g2000cwo.googlegroups.com>,
marshal...@gmail.com says...

> With join, all keys are preserved.

Umm... they are? What do you mean by "preserved"? Perhaps I
misunderstand you, but a key of one of the operands is not necessarily a
key of the result.
--
Jon

Jon Heggland
Jul 1, 2005, 5:11:07 AM
In article <1120156799.1...@o13g2000cwo.googlegroups.com>,
mikharaki...@yahoo.com says...

> > x y <= A
> >
> > That's right, instead of bracket notation A(x,y) saying that relation A
> > has attributes x and y, we can just write "A >= x y" implying that A is
> > a superset of join of empty relations x and y.
>
> On a symmetrical note, lets use capital letters X, Y etc to denote an
> infinite relation each of which is a full domain. Then
>
> A <= X Y

Nice, but is it not also so that A >= x y z and A <= X Y Z for A(x,y)?
Which makes that notation less useful....
--
Jon

Jon Heggland
Jul 1, 2005, 5:11:57 AM
In article <1120066601....@g47g2000cwa.googlegroups.com>,
marshal...@gmail.com says...

> > However, the symmetry/duality is lost in the semantics of the
> > operations---the relation predicate of the result. If relvars A and B
> > have predicates PA and PB, respectively, the expression A JOIN/<AND> B
> > has the predicate PA AND PB (logical and) in both D&D and Tropashko.
> > However, D&D's A <OR> B has the predicate PA OR PB (logical or), while
> > Tropashko's A UNION B has a rather more complicated and less intuitive
> > predicate (on first glance).
>
> I look at it this way:
>
> (I'll call Tropashko's operator "generalized union.")
>
> A natural join B: the union of the columns and the intersection of the
> rows
>
> A generalized union B: the intersection of the columns and union of the
> rows.
>
> Quite symmetric, as I see it.

Yes, but the relation predicate is not. Given relations A(x,y) and B
(y,z) with predicates PA(x,y) "employee x works in department y" and PB
(y,z) "department y is led by manager z".

The predicate PJAB(x,y,z) of A join B is "employee x works in department
y, and department y is led by manager z" (or more simply, "employee x
works in department y which is led by manager z").

The predicate PUAB(y) of A generalized union B is "there exists an
employee x such that employee x works in department y, or there exists a
manager z such that department y is led by manager z"

Whereas the predicate POAB(x,y,z) of A <OR> B is "employee x works in
department y, or department y is led by manager z".

Questions of usefulness may of course be raised; I am merely pointing
out that there is a lack of symmetry from this viewpoint.
--
Jon

Marshall Spight
Jul 1, 2005, 11:09:24 AM

Okay. What rule would you propose?

(In fact, since I proposed that rule in the above-referenced message,
I did think of a counterexample which would make some relations
unjoinable, which isn't good.)


Marshall

Mikito Harakiri
Jul 1, 2005, 1:13:01 PM

No, A <= X Y Z is false:

A X Y Z != A
A X Y Z != X Y Z

and symmetrically

A union X Y Z != A
A union X Y Z != X Y Z

(In the above formulas we assume that join takes precedence over union,
which allows some bracket economy).

However, A <= X and A <= Y.

The formula A >= x y z can be interpreted as saying that the A(x,y)
predicate can in fact be written as A(x,y,z)!

Vadim Tropashko
Jul 1, 2005, 1:32:09 PM

A >= x y z transitively follows from

A >= x y and x y >= x y z

However A <= X Y Z doesn't hold. The ASCII diagram for the lattice A(x,y) =
{(1,a)} with domains X = {1,2} and Y = {a,b} illustrates this asymmetry:
.....xy
.../.|.\
../..|..\
.x...A...y
.|../|\..|
.|./.|.\.|
.|/..|..\|
{1}..XY.{a}
.|../.\..|
.|./...\.|
.|/.....\|
.X.......Y
..\...../
...\.../
....\./
....{}

Mikito Harakiri
Jul 1, 2005, 2:08:22 PM
Jon Heggland wrote:
> Yes, but the relation predicate is not. Given relations A(x,y) and B
> (y,z) with predicates PA(x,y) "employee x works in department y" and PB
> (y,z) "department y is led by manager z".

Although you made an excellent point later, here I'm confused by the
lack of quantifiers in the above sentences. Wouldn't it be more
precise to say

"there exists employee x and department y such that x works in y"
"there exists department y and manager z such that y is led by z"

Predicate logic sentences without quantifiers are assumed to have an
implicit universal quantifier, right? Let's manipulate closed sentences
only from now on.

> The predicate PJAB(x,y,z) of A join B is "employee x works in department
> y, and department y is led by manager z" (or more simply, "employee x
> works in department y which is led by manager z").

"there exists employee x and department y such that x works in y
AND
there exists department y and manager z such that y is led by z"

which can be reduced to

"there exists employee x, department y, and manager z
such that x works in y and y is led by z"

> The predicate PUAB(y) of A generalized union B is "there exists an
> employee x such that employee x works in department y, or there exists a
> manager z such that department y is led by manager z"

"there exists department y such that
...there exists employee x such that x works in y
OR
...there exists manager z such that y is led by z"

Isn't this logically equivalent to

> Whereas the predicate POAB(x,y,z) of A <OR> B is "employee x works in
> department y, or department y is led by manager z".

"there exists employee x and department y such that x works in y
OR
there exists department y and manager z such that y is led by z"

It looks like the way we choose free variables is an interpretation made
outside of logic, and both sentences are logically equivalent.

Vadim Tropashko
Jul 1, 2005, 2:18:42 PM
Mikito Harakiri wrote:
> "there exists employee x and department y such that x works in y
> AND
> there exists department y and manager z such that y is led by z"
>
> which can be reduced to
>
> "there exists employee x, department y, and manager z
> such that x works in y and y is led by z"

Quantifiers are subtle entities. I don't think

"there exists employee x and department y such that x works in y
AND

there exists department v and manager w such that v is led by w"

is logically equivalent to

"there exists employee x, department y, and manager z
such that x works in y and y is led by z"

I renamed the bound variables in the second half of the first sentence
to make this obvious.
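The renaming point can be checked mechanically over a tiny model. In this sketch (mine, not from the thread) P plays "employee x works in department y" and Q plays "department y is led by manager z"; the department names deliberately don't line up:

```python
# A model where the two sentences disagree.
P = {('alice', 'sales')}      # "employee x works in department y"
Q = {('support', 'bob')}      # "department y is led by manager z"

# "exists x,y: P(x,y)  AND  exists v,w: Q(v,w)" -- independent existentials
s1 = bool(P) and bool(Q)

# "exists x,y,z: P(x,y) and Q(y,z)" -- the y must be shared
s2 = any(y == y2 for (x, y) in P for (y2, z) in Q)

print(s1, s2)  # True False -- so the two sentences are not equivalent
```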

Vadim Tropashko
Jul 1, 2005, 2:56:52 PM
Mikito Harakiri wrote:
> However, A <= X and A <= Y.

To clarify even more: A not <= Z

Jon Heggland
Jul 2, 2005, 4:51:13 AM
In article <1120230564.5...@g43g2000cwa.googlegroups.com>,
marshal...@gmail.com says...
> > > With join, all keys are preserved.
> >
> > Umm... they are? What do you mean by "preserved"? Perhaps I
> > misunderstand you, but a key of one of the operands is not necessarily a
> > key of the result.
>
> Okay. What rule would you propose?

I'm not sure I understand you. Surely the keys of a join result are
determined by logic, not by rules one might propose? Date and Darwen(?)
have explored this, in connection with their work on view updating. Off
the top of my head, it depends on whether keys of the operands are
subsets of the join attributes.
--
Jon

Jon Heggland
Jul 2, 2005, 5:03:23 AM
In article <1120237981....@g44g2000cwa.googlegroups.com>,
mikharaki...@yahoo.com says...

> > Nice, but is it not also so that A >= x y z and A <= X Y Z for A(x,y)?
> > Which makes that notation less useful....
>
> No, A <= X Y Z is false:

Ah, yes of course. I was confused by the somewhat counterintuitive fact
that A <= B implies that the attributes of B are a subset of the
attributes of A. :) (I hope I got that right.)

But my point still has a little validity: it is not enough to know that
A >= x y to know that the attributes of A are x and y; we also need to
know that A <= X Y.

> A >= x y z formula can be interpreted that the A(x,y) predicate can be
> in fact be written as A(x,y,z)!

This does not make sense to me.
--
Jon

Jon Heggland
Jul 2, 2005, 5:19:33 AM
In article <1120241302.9...@z14g2000cwz.googlegroups.com>,
mikharaki...@yahoo.com says...

> Jon Heggland wrote:
> > Yes, but the relation predicate is not. Given relations A(x,y) and B
> > (y,z) with predicates PA(x,y) "employee x works in department y" and PB
> > (y,z) "department y is led by manager z".
>
> Although you made an excellent point later, here I'm confused by the
> lack of quantifiers in the above sentences. Wouldn't it be more
> precisely to say
>
> "there exists employee x and department y such that x works in y"
> "there exists department y and manager z such that y is led by z"
>
> Predicate logic sentences without quantifiers are assumed to have
> implicit universal quantifier, right?

Perhaps in general, but not when used as relation/relvar predicates.
Then the point is that they are open, and that each variable corresponds
to a relation attribute. The tuples in the relation then represent the
variable bindings that make the predicate evaluate to true.

> > The predicate PJAB(x,y,z) of A join B is "employee x works in department
> > y, and department y is led by manager z" (or more simply, "employee x
> > works in department y which is led by manager z").
>
> "there exists employee x and department y such that x works in y
> AND
> there exists department y and manager z such that y is led by z"

No, because this predicate has no free variables, while the relation
PJAB(x,y,z) has three attributes. Your predicate corresponds to the
projection of PJAB on no attributes ((A JOIN B) {} in Tutorial D
syntax), which will evaluate to either TABLE_DEE (the relation with no
attributes and one tuple) if your proposition is true, or TABLE_DUM
(the relation with no attributes and no tuples) if it is false.
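The nullary projection can be illustrated with a small sketch (mine, not from the thread), again modelling a tuple as a frozenset of (attribute, value) pairs:

```python
def project(rel, attrs):
    # rel: set of frozensets of (attribute, value) pairs
    return {frozenset((a, v) for a, v in t if a in attrs) for t in rel}

TABLE_DEE = {frozenset()}  # no attributes, one (empty) tuple: "true"
TABLE_DUM = set()          # no attributes, no tuples: "false"

A = {frozenset({('x', 1), ('y', 'a')})}
print(project(A, set()) == TABLE_DEE)      # True: A is non-empty
print(project(set(), set()) == TABLE_DUM)  # True: an empty relation gives DUM
```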

> It looks like the way we choose free variable is an interpretation made
> outside of logic, and both sentences are logically equivalent.

No, each attribute of the result must correspond to a free variable.

But of course, PUAB is logically equivalent to POAB { y } (i.e. the
projection of POAB over y).
--
Jon

Marshall Spight
Jul 2, 2005, 2:31:31 PM
Jon Heggland wrote:
> In article <1120230564.5...@g43g2000cwa.googlegroups.com>,
> marshal...@gmail.com says...
> > > > With join, all keys are preserved.
> > >
> > > Umm... they are? What do you mean by "preserved"? Perhaps I
> > > misunderstand you, but a key of one of the operands is not necessarily a
> > > key of the result.
> >
> > Okay. What rule would you propose?
>
> I'm not sure I understand you. Surely the keys of a join result are
> determined by logic, not by rules one might propose?

You're quibbling over terminology. What rule, derived via
logic, would you propose? What's the right answer? Phrase
it any way you like; I just want to know what the correct
answer is.


Marshall

Jan Hidders
Jul 2, 2005, 3:45:47 PM

The rule is that if we take the natural join of R and S then we can
derive a candidate key K for the result if K is a candidate key of both
R and S. Is that what you wanted to hear?

-- Jan Hidders

Marshall Spight
Jul 2, 2005, 4:14:24 PM
Jan Hidders wrote:

> Marshall Spight wrote:
>
> The rule is that if we take the natural join of R and S then we can
> derive a candidate key K for the result if K is a candidate key of both
> R and S. Is that what you wanted to hear?

That seems correct but incomplete. If R(a, b) and S(b, c) and b
is non-empty, and (a) is a candidate key of R, then it seems
that (R join S) will also have (a) as a candidate key. Wouldn't
it? It will certainly be unique. Is saying that a set of columns
in the result will be unique over the result relation the same
as saying that it is a candidate key?

We also need some kind of fallback rule, such that if we cannot
derive any keys for the relation, then the union of all columns
is a key.


Marshall

Mikito Harakiri
Jul 2, 2005, 4:35:37 PM
Jon Heggland wrote:
> > A >= x y z formula can be interpreted that the A(x,y) predicate can be
> > in fact be written as A(x,y,z)!
>
> This does not make sense to me.

Why? Consider functions of multiple variables. The function f(x)=sin(x) is
equivalent to the function f(x,y)=sin(x)*1(y), where 1(y) is the constant
function evaluating to 1.

Jon Heggland
Jul 3, 2005, 8:29:07 AM
In article <1120336537.0...@o13g2000cwo.googlegroups.com>,
mikharaki...@yahoo.com says...

> Why? Consider functions of multiple variables. Function f(x)=sin(x) is
> equivalent to function f(x,y)=sin(x)*1(y), where 1(y) is constant
> function evaluating to 1.

Well, yes. What I meant to question was the sense in doing so (i.e.
expressing a function in terms of irrelevant arguments). I would also
question the definition of this equivalence, since one function has two
arguments and the other has one. But never mind.
--
Jon

Jan Hidders
Jul 3, 2005, 9:22:09 AM
Marshall Spight wrote:
> Jan Hidders wrote:
>>Marshall Spight wrote:
>>
>>The rule is that if we take the natural join of R and S then we can
>>derive a candidate key K for the result if K is a candidate key of both
>>R and S. Is that what you wanted to hear?
>
> That seems correct but incomplete. If R(a, b) and S(b, c) and b
> is non-empty, and (a) is a candidate key of R, then it seems
> that (R join S) will also have (a) as a candidate key. Wouldn't
> it?

Nope. Consider: R(a, b) = { (1, 2) } and S(b, c) = { (2, 3), (2, 4) }.

> We also need some kind of fallback rule, such that if we cannot
> derive any keys for the relation, then the union of all columns
> is a key.

Indeed. Of course the situation gets much more interesting if you also
take other constraints into account, such as functional dependencies.

-- Jan Hidders
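Jan's counterexample can be spelled out concretely (my own sketch): joining R and S on b produces two result tuples with the same a-value, so (a) is not unique in the result, hence not a key.

```python
R = {(1, 2)}             # R(a, b), with (a) a candidate key
S = {(2, 3), (2, 4)}     # S(b, c)

# natural join of R and S on b
J = {(a, b, c) for (a, b) in R for (b2, c) in S if b == b2}
a_values = [a for (a, b, c) in J]

print(sorted(J))                            # [(1, 2, 3), (1, 2, 4)]
print(len(set(a_values)) == len(a_values))  # False: a = 1 repeats
```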

Marshall Spight
Jul 3, 2005, 1:10:22 PM
Jan Hidders wrote:
> Marshall Spight wrote:
> > Jan Hidders wrote:
> >>Marshall Spight wrote:
> >>
> >>The rule is that if we take the natural join of R and S then we can
> >>derive a candidate key K for the result if K is a candidate key of both
> >>R and S. Is that what you wanted to hear?
> >
> > That seems correct but incomplete. [...]
> Nope. [...]

D'oh!


> > We also need some kind of fallback rule, such that if we cannot
> > derive any keys for the relation, then the union of all columns
> > is a key.
>
> Indeed. Of course the situation gets much interesting if you also take
> other constraints into account such as functional dependencies.

Are you aware of a comprehensive treatment of this issue? I am
quite anxious to learn how it's done, and not the least bit confident
in my ability to derive the answers from first principles.


Marshall

Mikito Harakiri
Jul 3, 2005, 1:49:14 PM

I see no difference. When we say R is a predicate of variables x,y,z,
then formally the predicate can depend on x and y only, and be
trivially false or true for all z. In the informal interpretation, however,
x,y,z is the minimal set of such arguments, so that the predicate depends
on all of them. The informal interpretation, however, could just be viewed
as sloppy language ;-)

Anyway, the point is that in lattice terms we don't even have to refer
to predicate arguments explicitly anymore. Predicate arguments are
defined by the predicate's position in the Hasse diagram.

Mikito Harakiri
Jul 3, 2005, 2:05:38 PM

I have trouble understanding Marshall's idea, but the functional
dependency formula for

R(x,y,z), x->y

in lattice terms is

R <= X,
R <= Y,
R (y=z) \/ xz R (y!=z) = xyz

with the following conventions:
* predicate names are single letters,
* predicates with small letters are empty relations,
* predicates with capital letters X,Y,Z are full-domain relations,
* '=' and '!=' with no spaces around them denote the binary equality and
inequality relations, respectively,
* \/ is union,
* in the absence of another infix operator between two relations, a
join is assumed,
* operations associate to the left.

Jon Heggland
Jul 4, 2005, 2:03:27 AM
In article <1120329091....@g14g2000cwa.googlegroups.com>,
marshal...@gmail.com says...

> > I'm not sure I understand you. Surely the keys of a join result are
> > determined by logic, not by rules one might propose?
>
> You're quibbling over terminology.

Sorry, but I think it is important to distinguish between *discovering*
and *proposing* a rule. Especially if you take the stance that the rule
then determines what joins are "allowed".

> What rule, derived via
> logic, would you propose? What's the right answer? Phrase
> it any way you like; I just want to know what the correct
> answer is.

I can't find the article or passage I was thinking about, but I believe
this is correct; it is based on "Updating Joins and Other Views" by Date
and McGoveran (1993):

Given two relations A and B, let the columns of A be partitioned into
two disjoint groups X and Y, and the columns of B similarly into Y and
Z, so that the set of columns Y is "common" to the two relations, i.e.
the join attributes.

If Y includes (is a superset of) a key of both A and B, the join is
"one-to-one", and every key of both A and B is a key for the result.

If Y includes a key of A, but not of B, the join is "one-to-many", and
every key of B is a key for the result. This is of course symmetrical.

If Y does not include any key of either A or B, the join is "many-to-
many", and each union of any key of A and any key of B is a key of the
result.

Deriving functional dependencies is not much harder, I think. If A and B
are properly normalised, all their functional dependencies are implied
by their keys. If the join is one-to-one, this is the case for the
result as well. If it is a "many" join, there might be FDs in the result
that are not implied by keys, but they are the same as those in A or B.
--
Jon
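The three cases above can be sketched as a small function. This is my own rendering of the rules, not code from Date and McGoveran; candidate keys and the join attributes are modelled as frozensets of attribute names:

```python
def join_result_keys(keys_a, keys_b, join_attrs):
    # keys_a, keys_b: candidate keys of the operands (sets of frozensets)
    # join_attrs: the common columns Y
    y_covers_a = any(k <= join_attrs for k in keys_a)
    y_covers_b = any(k <= join_attrs for k in keys_b)
    if y_covers_a and y_covers_b:        # one-to-one
        return keys_a | keys_b
    if y_covers_a:                       # one-to-many: B's keys survive
        return keys_b
    if y_covers_b:                       # many-to-one: A's keys survive
        return keys_a
    # many-to-many: the union of each pair of keys
    return {ka | kb for ka in keys_a for kb in keys_b}

# A(x, y) with key {x}; B(y, z) with key {y}; join on {y}.
# Y covers B's key but not A's, so A's key {x} survives.
print(join_result_keys({frozenset({'x'})}, {frozenset({'y'})}, frozenset({'y'})))
```

Note that the many-to-many branch also gives the "union of all columns" fallback Marshall asked about, whenever the only keys known for the operands are their full headings.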
