no names allowed, we serve types only

Keith H Duggar

unread,

Feb 13, 2010, 3:53:56 PM2/13/10

to

Of course the conventional definition of a relation's header
is a set of ordered pairs of "attributes" of the form <A, T>
where A is the "name" of the attribute (which must be unique
within the header) and T is a type.

I'm wondering, do we really need A? Can we not simplify this
header notion to just a set of types? All we need do then is
supply operators to conveniently "copy" types if or when one
needs multiples attributes of the "same" type. Often we don't
even require this. Here are some examples:

no type copy necessary if we already have unique types

DateTime = {Date, Time}
Inventory = {Supplier, Part, Integer}

type copy used to generate unique types

Point2D = {copy Integer X, copy Integer Y}

So aren't the "ordered pair" and "attribute names" a perhaps
sometimes convenient yet always unnecessary complication? We
can do all we need to solely with types and sufficiently rich
type systems. One nice result is that now a header is just a
plain simple set without ordered pairs demanding an auxiliary
uniqueness constraint on names across all the pairs. What are
your thoughts?

KHD

Ben Finney

unread,

Feb 14, 2010, 12:41:29 AM2/14/10

to

Keith H Duggar <dug...@alum.mit.edu> writes:

> Of course the conventional definition of a relation's header is a set
> of ordered pairs of "attributes" of the form <A, T> where A is the
> "name" of the attribute (which must be unique within the header) and T
> is a type.

The use of “set of ordered pairs” is telling. Note that the *pairs* are
ordered, while the *set* is not; the attributes are unordered within the
header.

> I'm wondering, do we really need A? Can we not simplify this header
> notion to just a set of types?

The point of a unique name is to be able to address the attribute.

Without it, there would be no way to distinguish between attributes of
identical types in the header.

There would also be no way to change the type of an attribute without
also changing all the references to that attribute, even ones where the
type is irrelevant.

> So aren't the "ordered pair" and "attribute names" a perhaps sometimes
> convenient yet always unnecessary complication? We can do all we need
> to solely with types and sufficiently rich type systems.

You think we don't need to uniquely address attributes within a header?

--
\ “Alternative explanations are always welcome in science, if |
`\ they are better and explain more. Alternative explanations that |
_o__) explain nothing are not welcome.” —Victor J. Stenger, 2001-11-05 |
Ben Finney

Keith H Duggar

unread,

Feb 14, 2010, 10:59:15 AM2/14/10

to

On Feb 14, 12:41 am, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:

> Keith H Duggar <dug...@alum.mit.edu> writes:
>
> > Of course the conventional definition of a relation's header is a set
> > of ordered pairs of "attributes" of the form <A, T> where A is the
> > "name" of the attribute (which must be unique within the header) and T
> > is a type.
>

> The use of ‚Äúset of ordered pairs‚Äù is telling. Note that the *pairs* are

> ordered, while the *set* is not; the attributes are unordered within the
> header.

Dude, I think we all know what "set" and "ordered pair" means.

> > I'm wondering, do we really need A? Can we not simplify this header
> > notion to just a set of types?
>
> The point of a unique name is to be able to address the attribute.

And they are redundant if the types are unique.

> Without it, there would be no way to distinguish between attributes of
> identical types in the header.

Which is why I said you could "copy" identical types to yield unique
but related types.

> There would also be no way to change the type of an attribute without
> also changing all the references to that attribute, even ones where the
> type is irrelevant.

Finally we arrive at some interesting and useful point. I have two
points in response. First it is wrong that you cannot change the type
without also referring to attribute references and least in many
cases.
Take my point example:

Point2D = {copy Integer X, copy Integer Y}

The attributes are referred to by the semantic types X and Y. So if
I want to change the representation type I simply change the header
declaration to say Real

Point2D = {copy Real X, copy Real Y}

but that's it! Because all the other references are to X and Y not
Integer.

Second, as I said in my original post

> > So aren't the "ordered pair" and "attribute names" a perhaps
> > sometimes convenient yet always unnecessary complication?

so while I agree they can "sometimes be convenient" I'm saying
they are not necessary. Also I think we probably overestimate their
convenience because we are used to dealing with systems that have
painfully limited type systems (domain support) and IDE support.

For example your objection above implies that "changing all the
references" is somehow difficult. That is not the case if you
have a suitably powerful development environment. I change names
and types often in my coding work yet it is rarely painful due
to careful programming and a powerful development toolset.

> > So aren't the "ordered pair" and "attribute names" a perhaps sometimes
> > convenient yet always unnecessary complication? We can do all we need
> > to solely with types and sufficiently rich type systems.
>
> You think we don't need to uniquely address attributes within a header?

You think my proposal does not allow you to uniquely address header
attributes? I think you need to try again, perhaps with less of a
focus on "telling us" what sets and ordered pairs are.

KHD

paul c

unread,

Feb 14, 2010, 1:40:34 PM2/14/10

to

Heh, good excuse to mention one of my favourite topics - transitive
relations. I seem to remember that in the 1969 paper, Codd just used
types without attribute names (though I think he called them domains
just as Bertie Russell had). Then in the 1970 version he introduced
attribute names. I thought his reason was to allow closures. I'm not
quite clear how this "copy" enables closures. A transitive relation
seems to need two sets of names or attributes, eg., in a family tree, an
attribute that sometimes stands for "parent" or "child" at other times
stands for "ancestor" or "descendant". I believe it's the case that the
transitive relation implies the closure relation and vice versa. To put
it another way, my guess is that attribute names were Codd's way of
being able to depict both relations in one, avoiding recursive
definitions. I realize this isn't a direct answer to the question.

Reinier Post

unread,

Feb 14, 2010, 2:38:39 PM2/14/10

to

Keith H Duggar wrote:

>You think my proposal does not allow you to uniquely address header
>attributes? I think you need to try again, perhaps with less of a
>focus on "telling us" what sets and ordered pairs are.

Perhaps it would help if you explain what a 'copied type' is
and how 'copying types' is more convenient than ordering
or naming different attributes with the same type.

>KHD

--
Reinier

Ben Finney

unread,

Feb 14, 2010, 4:45:50 PM2/14/10

to

Keith H Duggar <dug...@alum.mit.edu> writes:

> On Feb 14, 12:41 am, Ben Finney <bignose+hates-s...@benfinney.id.au>
> wrote:
> > Keith H Duggar <dug...@alum.mit.edu> writes:
> > > I'm wondering, do we really need A? Can we not simplify this
> > > header notion to just a set of types?
> >
> > The point of a unique name is to be able to address the attribute.

> > Without it, there would be no way to distinguish between attributes
> > of identical types in the header.
>
> Which is why I said you could "copy" identical types to yield unique
> but related types.

You didn't say that, but thanks for the explanation. Okay, so the “copy
type” operation yields a type that is identical to the original except
for its name. What problem is being solved by this?

--
\ “Apologize, v. To lay the foundation for a future offense.” |
`\ —Ambrose Bierce, _The Devil's Dictionary_, 1906 |
_o__) |
Ben Finney

Nilone

unread,

Feb 14, 2010, 5:22:27 PM2/14/10

to

On Feb 13, 10:53 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
> I'm wondering, do we really need A? Can we not simplify this
> header notion to just a set of types? All we need do then is
> supply operators to conveniently "copy" types if or when one
> needs multiples attributes of the "same" type.

I'm happy to see some activity in here again.

Your suggestion could help to force developers to derive uniquely
named types for their attributes - Name instead of String, Age instead
of Int, and so on. Otherwise, type names aren't descriptive enough
and will result in the annotation of schemata with comments, with all
the associated problems. It'll happen anyway, of course - if
something is optional, people will skip it.

I also see some problems with type checking. Consider the query
"SELECT * FROM Point2D WHERE X = Y". Comparison of the attributes
require that they be the same type (or at least, one should be a
subtype of the other). Distinguishing unnamed attributes require that
they be unique types, which is the reason for copying them, but that
contradicts the comparison requirement.

Keith H Duggar

unread,

Feb 15, 2010, 1:14:03 AM2/15/10

to

On Feb 14, 5:22 pm, Nilone <rea...@gmail.com> wrote:
> On Feb 13, 10:53 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
>
> > I'm wondering, do we really need A? Can we not simplify this
> > header notion to just a set of types? All we need do then is
> > supply operators to conveniently "copy" types if or when one
> > needs multiples attributes of the "same" type.
>
> I'm happy to see some activity in here again.

Indeed. Thanks for joining ;-)

> Your suggestion could help to force developers to derive uniquely
> named types for their attributes - Name instead of String, Age instead
> of Int, and so on. Otherwise, type names aren't descriptive enough
> and will result in the annotation of schemata with comments, with all
> the associated problems. It'll happen anyway, of course - if
> something is optional, people will skip it.

Agreed. This is one way to "encourage" more sensible typing. Of
course it requires good type support which many current vendors
certainly do not provide.

> I also see some problems with type checking. Consider the query
> "SELECT * FROM Point2D WHERE X = Y". Comparison of the attributes
> require that they be the same type (or at least, one should be a
> subtype of the other). Distinguishing unnamed attributes require that
> they be unique types, which is the reason for copying them, but that
> contradicts the comparison requirement.

As copies of Integer, both X and Y can be coerced to Integer. So
we can allow either implicit or explicit type coercion. So either
we can allow

WHERE X = Y

by default or require explicit coercion

WHERE Integer(X) = Integer(Y)

by default. And which is allowed or required should again be left
to the type system. So when I declare

copy Integer X

it may by default add both explicit and implicit coercion operators

() : X -> Integer
() : X -> Integer
Integer() : X -> Integer
Integer() : X -> Integer

or perhaps the following equality operators

= : Integer , X -> bool
= : X , Integer -> bool

which would allow the simple WHERE X = Y but one should be able
to disable that for example by "deleting" those operators

delete () : X -> Integer
delete () : X -> Integer

or

delete = : Integer , X -> bool
delete = : X , Integer -> bool

In which case WHERE X = Y is now illegal and one must write the
coercions explicitly such as WHERE Integer(X) = Integer(Y).

KHD

Keith H Duggar

unread,

Feb 15, 2010, 1:21:38 AM2/15/10

to

Oops! Duplicated X should have been Y.

> () : X -> Integer
> () : X -> Integer
> Integer() : X -> Integer
> Integer() : X -> Integer

...

> delete () : X -> Integer
> delete () : X -> Integer

should have been

> () : X -> Integer
> () : Y -> Integer
> Integer() : X -> Integer
> Integer() : Y -> Integer
...

> delete () : X -> Integer

> delete () : Y -> Integer

KHD

Keith H Duggar

unread,

Feb 15, 2010, 1:52:11 AM2/15/10

to

On Feb 14, 2:38 pm, r...@raampje.lan (Reinier Post) wrote:
> Keith H Duggar wrote:
> >You think my proposal does not allow you to uniquely address header
> >attributes? I think you need to try again, perhaps with less of a
> >focus on "telling us" what sets and ordered pairs are.
>
> Perhaps it would help if you explain what a 'copied type' is

I was hoping I could get away with just a loose intuitive notion for
what it is ;-) If X is a typecopy of Y then X and Y are distinct types
that are mutually coercible and whose extensions are equal. If need be
we can go into much more detail. However, if possible I'd like to
focus on what usefulness names add other than to handle attributes
with the same type?

For example, just for the sake of argument, pretend we are working
in a universe where every attribute of a relation always has a unique
type in that relation. Ie just pretend that there is never a case
where
we want to have two or more attributes in the same relation having the
same type. Now, in this world do we gain anything any usefulness at
all by having attribute names?

> and how 'copying types' is more convenient than ordering
> or naming different attributes with the same type.

I don't know whether it necessarily more convenient but I do believe
it simplifies the relational model. For one, it is often the case that
one already has unique types and thus can already construct a header
without the additional "attribute names". Second, requiring unique
types and reducing the header to a simple set eliminates the need to
define an additional uniqueness constraint across <name,type> pairs.

KHD

Keith H Duggar

unread,

Feb 15, 2010, 2:01:52 AM2/15/10

to

On Feb 14, 4:45 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>

wrote:
> Keith H Duggar <dug...@alum.mit.edu> writes:
>
> > On Feb 14, 12:41 am, Ben Finney <bignose+hates-s...@benfinney.id.au>
> > wrote:
> > > Keith H Duggar <dug...@alum.mit.edu> writes:
> > > > I'm wondering, do we really need A? Can we not simplify this
> > > > header notion to just a set of types?
>
> > > The point of a unique name is to be able to address the attribute.
> > > Without it, there would be no way to distinguish between attributes
> > > of identical types in the header.
>
> > Which is why I said you could "copy" identical types to yield unique
> > but related types.
>

> You didn't say that, but thanks for the explanation. Okay, so the ‚Äúcopy
> type‚Äù operation yields a type that is identical to the original except

> for its name. What problem is being solved by this?

It is an alternate solution to the problem of identifying attributes
that we want to have the "same type". One solution is to introduce
"attribute names". Another, the one I'm proposing here, is to enrich
the type system to provide these "type copies" which are distinct but
mutually coercible types. I believe this solution is simpler in some
ways and that often one is already in possession of all unique types
in which case attribute names just "get in the way" to some extent.

Of course if you are dealing with crippled domain support then names
are essential. But envision a system with rich type support. I'm
asking
if, in that world, we even need to bother with attribute names at all?

KHD

Ben Finney

unread,

Feb 15, 2010, 2:24:53 AM2/15/10

to

Keith H Duggar <dug...@alum.mit.edu> writes:

> On Feb 14, 4:45 pm, Ben Finney <bignose+hates-s...@benfinney.id.au>
> wrote:
> > Okay, so the ‚Äúcopy type‚Äù operation yields a type that is
> > identical to the original except for its name. What problem is being
> > solved by this?
>
> It is an alternate solution to the problem of identifying attributes
> that we want to have the "same type". One solution is to introduce
> "attribute names". Another, the one I'm proposing here, is to enrich
> the type system to provide these "type copies" which are distinct but
> mutually coercible types. I believe this solution is simpler in some
> ways and that often one is already in possession of all unique types
> in which case attribute names just "get in the way" to some extent.

I don't see how it makes a useful improvement.

There needs to be a way to define attribute behaviour, and also to
define them in such a way that attributes can be uniquely addressed
after definition.

The current method to do this is to assign the attribute a type to
implement its behaviour, and a unique name to address the attribute.
Programming languages support this, to the extent that it even matches
the underlying type metaphor of the programming languages themselves.

Your proposal, if I understand correctly, is that attributes would have
a type that must be unique, and the unique type name would be how to
address the attribute. Uniquely-named types could be identical in their
behaviour. Yet there would be no widespread language support for this,
so it would need to have a pretty significant *benefit* over the
existing system to be worth considering.

So far, all I've seen is that it could be made equivalent to the current
metaphor. You claim less complexity, but I don't see that; all you've
done is shift the complexity around. I also don't see where you've made
explicit what the significant benefit of this change would be.

--
\ “Any intelligent fool can make things bigger and more complex… |
`\ It takes a touch of genius – and a lot of courage – to move in |
_o__) the opposite direction.” —Albert Einstein |
Ben Finney

Bob Badour

unread,

Feb 15, 2010, 9:30:48 AM2/15/10

to

Keith H Duggar wrote:

> On Feb 14, 2:38 pm, r...@raampje.lan (Reinier Post) wrote:
>
>>Keith H Duggar wrote:
>>
>>>You think my proposal does not allow you to uniquely address header
>>>attributes? I think you need to try again, perhaps with less of a
>>>focus on "telling us" what sets and ordered pairs are.
>>
>>Perhaps it would help if you explain what a 'copied type' is
>
> I was hoping I could get away with just a loose intuitive notion for
> what it is ;-) If X is a typecopy of Y then X and Y are distinct types
> that are mutually coercible and whose extensions are equal. If need be
> we can go into much more detail. However, if possible I'd like to
> focus on what usefulness names add other than to handle attributes
> with the same type?
>
> For example, just for the sake of argument, pretend we are working
> in a universe where every attribute of a relation always has a unique
> type in that relation. Ie just pretend that there is never a case
> where
> we want to have two or more attributes in the same relation having the
> same type. Now, in this world do we gain anything any usefulness at
> all by having attribute names?

Yes. We gain control over natural join.

>>and how 'copying types' is more convenient than ordering
>>or naming different attributes with the same type.
>
> I don't know whether it necessarily more convenient but I do believe
> it simplifies the relational model. For one, it is often the case that
> one already has unique types and thus can already construct a header
> without the additional "attribute names". Second, requiring unique
> types and reducing the header to a simple set eliminates the need to
> define an additional uniqueness constraint across <name,type> pairs.

But you haven't really done that. Your "copy" declaration must still
track the underlying type.

Bob Badour

unread,

Feb 15, 2010, 10:17:03 AM2/15/10

to

Keith H Duggar wrote:

> Of course the conventional definition of a relation's header
> is a set of ordered pairs of "attributes" of the form <A, T>
> where A is the "name" of the attribute (which must be unique
> within the header) and T is a type.
>
> I'm wondering, do we really need A? Can we not simplify this
> header notion to just a set of types?

Remember the adage: "As simple as possible, and no simpler."

> All we need do then is
> supply operators to conveniently "copy" types if or when one
> needs multiples attributes of the "same" type. Often we don't
> even require this. Here are some examples:
>
> no type copy necessary if we already have unique types
>
> DateTime = {Date, Time}
> Inventory = {Supplier, Part, Integer}
>
> type copy used to generate unique types
>
> Point2D = {copy Integer X, copy Integer Y}

Your later clarifications reveal that as well as naming the attributes,
your "copy" keyword creates new types. This means one will have to
coerce these back to integer to do anything with them that one
ordinarily does with integers.

Pretending that the name affects the type turns "(X + Y)" into
"(integer(X) + integer(Y))".

Doing so certainly makes queries verbose, and I don't see any benefit to
compensate for the increased verbosity. In fact, your idea seems to
offer a neglibible one-time benefit to the dbms implementer at the cost
of burdening the least sophisticated users in their most common tasks.

> So aren't the "ordered pair" and "attribute names" a perhaps
> sometimes convenient yet always unnecessary complication?

No. You have to have a name to address the attribute. All you have done
is conflate the name with the type. Have you really thought this through?

Point2D = {copy Integer X, copy Integer Y}

Point3D = {copy Rational X, copy Rational Y, copy Rational Z}

Is the above allowed? If it is, what type does X identify? If not, how
can you fully support join?

e.g.

Dept = {DeptID, EmployeeID, Name}
Employee = {EmployeeID, DeptID, Name}

To query the dbms for the names of employees in their respective
department names, one will need to use your "copy" keyword in the query
somehow:

Employee(copy Name EEName) join Dept(copy EmployeeID ManagerID, copy
Name DeptName)

But if type names must be unique, what if someone runs another query and
wants to use EEName or DeptName or ManagerID to identify a different type?

> We
> can do all we need to solely with types and sufficiently rich
> type systems.

I disagree. We need to make relations easy to use. We need to make
relations easy to implement correctly not just easy to implement.

> One nice result is that now a header is just a
> plain simple set without ordered pairs demanding an auxiliary
> uniqueness constraint on names across all the pairs. What are
> your thoughts?

I think we have used names separate from types when using relations for
a very long time:

Cone: {X in Re, Y in Re, R in Re | X*X + Y*Y = R*R }

Tegiri Nenashi

unread,

Feb 15, 2010, 1:07:43 PM2/15/10

to

On Feb 13, 12:53 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
> Of course the conventional definition of a relation's header
> is a set of ordered pairs of "attributes" of the form <A, T>
> where A is the "name" of the attribute (which must be unique
> within the header) and T is a type.

The more I study relational model, the less I appreciate the concept
of type (domain). This is consistent with dbms vendors failed to
deliver genuine rdbms extensibility via user defined types: when did
you last time program a new type? It looks like the only important
operation on any domain is equality, and the other ones are just
predicates in disguise.

paul c

unread,

Feb 15, 2010, 9:05:59 PM2/15/10

to

Tegiri Nenashi wrote:
> On Feb 13, 12:53 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
>> Of course the conventional definition of a relation's header
>> is a set of ordered pairs of "attributes" of the form <A, T>
>> where A is the "name" of the attribute (which must be unique
>> within the header) and T is a type.

> ...

And users must be aware of types so that when a 'double' in one header
matches the name but not the type of a double in another header, they
will understand why join might produce an empty relation. (I may have
misunderstood Keith D's idea, but I have the impression that it is just
a substitute for '<rename>'. I think one could just ask what would
happen if all 'doubles' had the same type - some kind of <rename> would
still be needed.)

> The more I study relational model, the less I appreciate the concept
> of type (domain). This is consistent with dbms vendors failed to
> deliver genuine rdbms extensibility via user defined types: when did
> you last time program a new type? It looks like the only important
> operation on any domain is equality, and the other ones are just
> predicates in disguise.

In the old mainframe days, say up to about 1985, the OS'es and add-on
products usually provided tons of 'exits'. (I suppose the modern term
might be 'driver' points.) Usually they needed to be coded in some
low-level language or assembler. Customers soon learned not to let any
old programmer code them, but often they saved tremendous amounts of
appl'n developer work-arounds (sometimes they amounted to just 'jobs for
the boys'.)

Builtin types such as 'integer' and 'char' and even 'utf8' seem rather
puerile to me, or at least beside the point as far as application
relations are concerned, being related more to physical machines than to
what Codd called 'domains'. It is a bloody shame that the mainstream
so-called-relation products haven't done this, by now there might be a
sizeable body of types that had been well shaken down. I imagine some
of the current product developers are aware of D&D's work which could
give them some fairly safe theory for guidelines to customers, indeed a
whole sub-industry would be possible. I knew at least one product
developer who put lots of secret exits in his code. When he was fired,
he made a great living for several years, travelling the world and
exploiting those secret exits (this wasn't a 'relational' product,
rather one that could be called a fairly advanced 4GL, since programmers
had to code their own joins, nevertheless it was good enough to beat the
mainstream sql products in many purchase decisions for mainframe, unix
and windows platforms).

Maybe I'm wrong, but I get the impression that many pretty smart people
mistakenly think that 'user-defined' types means defineable by any old
user and that product developers think this poses huge
compiler/interpreter problems and big optimization problems. Although
I've never seen them refer to this, I have the impression that D&D
tacitly encourage that misperception, but I doubt if they really intend
it, since they were both present during the 'old days'. I believe their
motive is to establish principles that dictate implementation results,
not ones that dictate implementation techniques.

Nilone

unread,

Feb 16, 2010, 4:47:42 AM2/16/10

to

On Feb 15, 8:07 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:
>
> The more I study relational model, the less I appreciate the concept
> of type (domain). This is consistent with dbms vendors failed to
> deliver genuine rdbms extensibility via user defined types: when did
> you last time program a new type? It looks like the only important
> operation on any domain is equality, and the other ones are just
> predicates in disguise.

It seems to me operators only apply to types of entities in a defined
algebraic structure like a field (I hope that's right, I'm no
mathematician). Still, they're absolutely necessary for those cases -
how else could we do things like number arithmetic, string slicing,
point translation, shape scaling and rotation, line intersection,
color mixing, matrix multiplication, polynomial addition, regexp
concatenation, and so on in a database? Granted, we can't do most of
those things now because of lack of vendor domain support, but I'm not
ready to give up on types yet.

I agree with your point about predicates, though. Most entities
require no operators except equality (and perhaps constructors). I
just wish those teaching programming and databases knew this. The
database design module I completed last year was pure ER diagrams,
everything-is-an-attribute-of-an-object and not a single mention of
predicates. So too the OO design patterns module, and the OOAD
modules the year before.

David BL

unread,

Feb 17, 2010, 6:29:41 AM2/17/10

to

On Feb 14, 4:53 am, Keith H Duggar <dug...@alum.mit.edu> wrote:

> I'm wondering, do we really need A? Can we not simplify this
> header notion to just a set of types?

I think the RM can be (conceptually) simplified, but you're throwing
out the wrong thing! A formulation of the RM could keep attribute
names, but drop the whole idea of the type system.

Values can be formalised without types. Just look at ZFC. Note that
the equality operator is fundamental to set theory. Values can be
compared even though they have no types.

Type theory involves the idea that values have a well defined type (or
rather a set of types, one of which is the Most Specific Type). That
concept is alien to set theory and most of mathematics. One problem
I have with type systems is their adhoc nature. The MST of a value
depends on the types that have been declared to the database system.
If you instead wanted to be fair and avoid discrimination by allowing
every set containing some value x to be considered one of the possible
types of x then you would find that the MST of x is {x} and the class
of all types of x isn't actually a set (the latter can be proven using
the Russell paradox trick).

Operators can be formalised without a type system too. Simply
formalise an operator as a function defined on some domain, where a
domain is merely a set (not a "type").

The equivalent of the type system (including all conceivable database
constraints) can be expressed using normal set theory. Just use a
single set to describe the possible values of the (entire) database.
What could be simpler?

An un-typed RM is easy to formalise because all the RM operators are
*already* defined without reference to type systems. E.g. join only
depends on attribute names and the ability to test for equality of
attribute values. The attribute domains don't come into it at all.
Similarly for the other RM operators such as union, intersection and
difference.

I wonder whether there are practical economies for avoiding formalised
types in a database schema language. A database schema is no longer
a type definition but instead just a set definition. D&Ds concept of
subtyping by constaint maps to nothing more revolutionary than
subsetting through set comprehension. A relation value can represent
the equivalent of a tuple type and a power set over a relation value
(i.e. a set of tuples) provides the equivalent of a relation type (as
a set of relations). Set comprehension allows arbitrary database
constraints to be imposed.

David

Nilone

unread,

Feb 17, 2010, 8:15:28 AM2/17/10

to

On Feb 17, 1:29 pm, David BL <davi...@iinet.net.au> wrote:
> Operators can be formalised without a type system too. Simply
> formalise an operator as a function defined on some domain, where a
> domain is merely a set (not a "type").

Thanks for the introduction, I haven't seen the typeless model
before. I don't see how such a system would handle arithmetic
operators (e.g. + and <) and string operators like concatenation and
search - could you perhaps give an example?

David BL

unread,

Feb 17, 2010, 1:14:38 PM2/17/10

to

In a typeless system a unary function could for example be formalised
as a triple (D,C,G) where D is the domain, C is the co-domain and G is
the graph of the function (a subset of DxC). This is typeless in the
sense that a function value doesn't formally have any concept of a
defined type. Rather the domain and co-domain are formally part of
the function's value as a triple (D,C,G). For example two functions
can have the same domain and graph but different co-domains. That
makes them distinct. This is actually conventional, as when one
determines whether a given function is surjective (i.e. its range
equals its co-domain). It wouldn't make sense to ask whether a
function is surjective if its co-domain isn't part of its value.

Alternatively a typeless system could instead formalise a unary
function as a set of pairs (i.e. what we above called its graph). In
that case the domain and range is determined from the graph using
projection, but there is no concept of a co-domain.

Similarly a typeless system could formalise a relation in two
different ways. One allows for attributes to have domains specified
independently of the graph (or "body") of the relation, and these
domains represent part of the relation's value. That means that two
distinct relations can have exactly the same graph.

Alternatively a relation can be identified with its graph. In that
case the domains are determined as the projection of each attribute.

In a typeless formalism one is very clear about what it means for two
functions or two relations to be equal. It seems to require more
effort to understand what equality means in a typed formalism.

In a D&D type system, a value has a MST, but this actually depends on
the prevailing type system. E.g. two different databases could use
different type systems. Putting it another way, the MST of a value
depends on who you ask :-).

D&D want to ensure that equality of values is independent of declared
types. That's why they say that a selector for an ellipse value that
happens to specify equal width and height actually returns a value
which has an MST of circle. It's like a "magic downcast". They point
out that OO systems don't normally work that way. E.g. an OO
constructor for ellipse never returns a circle.

I think D&D end up treating relations with the same body and attribute
names as equal. i.e. in essence the declared attribute domains are
not part of the relation's value. I think they define subtyping of
relation types accordingly.

It seems to me that D&D spend a lot of effort discussing ideas that
are either eliminated or trivialised in a typeless formalism of the
RM.

Tegiri Nenashi

unread,

Feb 17, 2010, 1:51:14 PM2/17/10

to

Formalization is less of an issue here. I interpret the question as
how to make a working system operating predicates such as Plus(x,y,z)
and Concat(x,y,z). Logical programming provides sort of an answer.

Nilone

unread,

Feb 17, 2010, 2:36:49 PM2/17/10

to

You're right. I'm a programmer rather than a mathematician. As such,
infinite sets can only be approximated and every value has a cost in
space and time. So I'm interested in how operators would be
effectively (and hopefully, efficiently) defined in a software version
of such a model.

The operators in a typed system are based on inspecting and
manipulation the representation of values. I don't see how anything
similar is possible in an untyped relational model. There's
exhaustively generating all operands and results, which is
impractical. With a successor operator defined (again,
exhaustively?), we can define plus inductively, which would be highly
inefficient. Is there a way to define these operators without
resorting to hidden types or an actor-like model of delegating the
work to the operand?

Tegiri Nenashi

unread,

Feb 17, 2010, 3:00:45 PM2/17/10

to

I'm not sure what hidden types or actor-like model of delegating the
work to the operand, and why it is undesirable. Predicates such as
"Plus" do have a set of functional dependencies, so why not to allow
these dependencies be implemented in procedural language? These could
be considered implementation details, pretty much as indexes belonging
to physical layer of relational model. This idea is explored in

http://vadimtropashko.wordpress.com/relational-programming-with-qbql/manipulating-strings-in-qbql/

Jan Hidders

unread,

Feb 17, 2010, 5:01:04 PM2/17/10

to

> http://vadimtropashko.wordpress.com/relational-programming-with-qbql/...

You seem to be linking being untyped and representing functions/
operations with predicates. Can I ask why? Would you not agree that
one can have one without the other and vice versa?

-- Jan Hidders

Tegiri Nenashi

unread,

Feb 17, 2010, 6:46:04 PM2/17/10

to

No rationale (other than sharing intuition with more elaborate David
BL explanation). What are types? Is there a foundation that doesn't
involve heavy machinery of category theory?

The thread started as Keith's attempt to demote attribute names in
favor of types, and was vehemently objected to. From my angle (that
would be relational lattice:-) Relational Model is a theory of
Relations with Named Attributes. It is difficult to see unnamed
perspective (with positional attributes) as contender anymore. This is
especially evident through the prism of set intersection join (aka
composition) operation...

Tegiri Nenashi

unread,

Feb 17, 2010, 7:21:20 PM2/17/10

to

Let me clarify these ideas. In unnamed perspective attributes are
numbers indicating _relative_ position of each attribute in each
individual relation. One have to devise some drudgery conventions to
be able to compare, say, attribute #5 in the relation Emp with
attribute #2 in the relation Dept. In unnamed perspective Cartesian
Product takes the prime stage with all those mathematical snags around
it (non commutativity, non associativity).

With named attributes we operate relation attributes as ordinary sets.
Unfortunately, classic Relational Algebra doesn't make this point
obvious. Natural join unions the attributes, but about the other
Relational Algebra operations? Compare it to relational lattice, where
- generalized union (denoted "v") intersects attribute sets
- ditto inner join (denoted "*")
- outer union ("+", aka D&D <OR>) unions attributes
- unary inversion operation ("`") builds a relation with
"complementary" set of attributes.
These are not some exotic operations, one can suggest a list of very
practical queries involving them:
http://vadimtropashko.wordpress.com/relational-programming-with-qbql/qbql-for-sql-people/

What about other set operations? For example, symmetric difference is
an operation with some notable algebraic properties; do we have an
binary operation operation over relations that takes a symmetric
difference of attributes? Yes, in fact there are several of them
http://vadimtropashko.wordpress.com/relational-programming-with-qbql/aggregation-and-set-joins/
the most distinguished among these being set intersection join (aka
composition). I wonder if the whole theory shouldn't be recast in
semiring form of natural join and composition...

Keith H Duggar

unread,

Feb 18, 2010, 1:08:55 AM2/18/10

to

On Feb 17, 7:21 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:
> On Feb 17, 3:46 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:
> > The thread started as Keith's attempt to demote attribute names in
> > favor of types,

Eliminate not just demote.

> > and was vehemently objected to. From my angle (that
> > would be relational lattice:-) Relational Model is a theory of
> > Relations with Named Attributes. It is difficult to see unnamed
> > perspective (with positional attributes) as contender anymore. This is
> > especially evident through the prism of set intersection join (aka
> > composition) operation...

Except that I'm not proposing "positional attributes" so I fail
to see the relevance of your point? First, I'm asking a simple
question: suppose we have already have unique types for every
header then what theoretical capability do the names add? (Bob
argues that they are necessary for "controlling" natural join.
I disagree that they are /necessary/ for this; but my complete
response to that will have to wait a few days.)

Second, I'm pointing out that if the answer to that question is
"none", then a set of unique types is sufficient to logically
identify attributes and names are not needed. Third, operating
on types for such things as "rename" etc is really not any more
difficult than for attribute names (given the proper tools).

> Let me clarify these ideas. In unnamed perspective attributes are
> numbers indicating _relative_ position of each attribute in each
> individual relation. One have to devise some drudgery conventions to

Not in what I'm proposing. There is no "positional" anything.
Instead I simply claim, you need either unique names or unique
types. You do not need both. Having unique names instead of
unique types MIGHT BE convenient especially giving the paucity
of domain support these days, but it is not, in principle,
necessary.

Your "positional" points seem like a distraction or red herring.

KHD

Nilone

unread,

Feb 18, 2010, 1:23:42 AM2/18/10

to

> http://vadimtropashko.wordpress.com/relational-programming-with-qbql/...

Thanks for the reference.

I don't have a problem with such a procedural implementation, but
hiding the types you used to implement the system prevents the user
from extending the model for their needs, unless they can drop down to
the source code and recompile it, but then they can also break it.
The ideal would be a system which is both adaptable and safe.

The actor-like model of delegating to the operand is bad because
operands can override and implement behaviour as they choose. For
example, you won't be able to guarantee that x + y = y + x, since y
may be a user-created value in the domain of Plus which (improperly)
overrides the logic for addition to concatenate the digits.

Nilone

unread,

Feb 18, 2010, 1:31:15 AM2/18/10

to

On Feb 18, 8:08 am, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Feb 17, 7:21 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:
>
> > On Feb 17, 3:46 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:
> > > The thread started as Keith's attempt to demote attribute names in
> > > favor of types,
>
> Eliminate not just demote.
>
> > > and was vehemently objected to. From my angle (that
> > > would be relational lattice:-) Relational Model is a theory of
> > > Relations with Named Attributes. It is difficult to see unnamed
> > > perspective (with positional attributes) as contender anymore. This is
> > > especially evident through the prism of set intersection join (aka
> > > composition) operation...
>
> Except that I'm not proposing "positional attributes" so I fail
> to see the relevance of your point? First, I'm asking a simple
> question: suppose we have already have unique types for every
> header then what theoretical capability do the names add? (Bob
> argues that they are necessary for "controlling" natural join.
> I disagree that they are /necessary/ for this; but my complete
> response to that will have to wait a few days.)

Your idea doesn't eliminate attribute names, it just delegates it to
the type namespace. I think your copy method should rather be called
aliasing. The whole model still works as we are familiar with it,
except you can't have Point = {int X, int Y} and PointF = {float X,
float Y}. You would either have to make the types the same or choose
different names for the conflicting attributes.

Ben Finney

unread,

Feb 18, 2010, 2:51:05 AM2/18/10

to

Nilone <rea...@gmail.com> writes:

> Your idea doesn't eliminate attribute names, it just delegates it to
> the type namespace.

Exactly. Given that, I don't see what is gained.

--
\ “Intellectual property is to the 21st century what the slave |
`\ trade was to the 16th.” —David Mertz |
_o__) |
Ben Finney

David BL

unread,

Feb 18, 2010, 3:36:10 AM2/18/10

to

On Feb 18, 3:36 am, Nilone <rea...@gmail.com> wrote:

> You're right. I'm a programmer rather than a mathematician. As such,
> infinite sets can only be approximated and every value has a cost in
> space and time. So I'm interested in how operators would be
> effectively (and hopefully, efficiently) defined in a software version
> of such a model.

A typeless language can simply reference built-in or predefined sets
and operators by name. The crucial point is that all sets and
operators (whether built-in or user defined) don't have any concept of
a type.

E.g. there could be a built-in set named Z3 that's implicitly defined
as follows:

Z3 = { 0,1,2 }

There could be a built-in operator named + implicitly defined as the
triple:

( (Z3xZ3), Z3, { ((0,0),0), ((0,1),1), ..., ((2,2),1) }

A language could allow a user-defined function named F:

let F = (Z3, Z3, (lambda x.x+x))

F has domain Z3, co-domain Z3 and graph {(0,0),(1,2),(2,1)}. F itself
doesn't have a type.

Since this language has gone to the trouble to represent domains and
co-domains for the operators it is very similar to a typed language.
A compiler can validate the code or make use of optimal hardware
encodings of elements of the built-in sets.

There is no problem referring to built-in sets like the integers or
the set of all finite strings. These are countably infinite so it is
possible to encode any given element in finite memory space.

David BL

unread,

Feb 18, 2010, 4:00:38 AM2/18/10

to

On Feb 17, 7:29 pm, David BL <davi...@iinet.net.au> wrote:
>
> I wonder whether there are practical economies for avoiding formalised
> types in a database schema language. A database schema is no longer
> a type definition but instead just a set definition. D&Ds concept of
> subtyping by constaint maps to nothing more revolutionary than
> subsetting through set comprehension. A relation value can represent
> the equivalent of a tuple type and a power set over a relation value
> (i.e. a set of tuples) provides the equivalent of a relation type (as
> a set of relations). Set comprehension allows arbitrary database
> constraints to be imposed.

I've jumped in and had a go at defining a typeless language that can
specify a database schema in a set theoretic way...

Sets
----

Let

union(S)

denote the union over each element of S (S is a set of sets)

Let

powerset(X)

denote the set of all subsets of X.

Function graphs
---------------

Let

functiongraphs(D,C)

denote the set of function graphs with domain D and codomain C. i.e.

{ G | G is a subset of D x C where
for each x in D exists unique y in C
such that (x,y) in G
}

Let G be in functiongraphs(D,C). Then

dom(G) denotes D
range(G) denotes { y in C | (x,y) in G }

If (x,y) in G then G(x) denotes y.

Note that given G there are many codomains C satisfying G in
functiongraphs(D,C). Therefore we cannot define codomain(G).

If D,C are sets then functiongraphs(D,C) is necessarily a set
according to the axioms of ZFC. We cannot avoid codomains by defining
functionsgraphs(D) over all possible codomains because that is not a
set under ZFC.

Function graph projection
-------------------------

Let G be a function graph with domain D and let D' be a subset of D.
Then

proj(G,D')

denotes the projection of G onto domain D'. i.e.

proj(G,D') =
{ (x,y) in G | x in D' }

Functions
---------

Let

functions(D,C)

denote the set of functions with domain D and codomain C. i.e.

functions(D,C) = { (D,C,G) | G in functiongraphs(D,C) }

Let f = (D,C,G) and (x,y) in G. Then

dom(f) denotes D
codom(f) denotes C
range(f) denotes range(G)
graph(f) denotes G and
f(x) denotes y.

Function projection
-------------------

Let f = (D,C,G) be a function and D' be a subset of D. Then

proj(f,D')

denotes the projection of f onto domain D', keeping the same co-
domain. i.e.

proj(f,D') = (D', C, proj(G,D'))

Tuples
------

Let A be a functiongraph representing a set of attributes, where for
each (N,D) in A, N represents an attribute name and D represents an
attribute domain (a set). Note that the attribute names are
necessarily distinct by definition of a functiongraph.

Then

tuples(A)

denotes the set of tuples where each tuple is formally a function
graph that maps attribute names into elements of their respective
domains i.e.

tuples(A)
=
{ t in functiongraphs(dom(A),union(range(A))) |
for each (N,D) in A, t(N) in D }

Note that tuples are function graphs not functions because we don't
want to say two tuples are distinct just because of differing
attribute domains.

Relations
---------

Let

relations(A)

denote

powerset(tuples(A))

Key constraints
---------------

Let A be a given functiongraph representing a set of attributes. Let
K be a given subset of powerset(dom(A)) (i.e. sets of sets of
attribute names). Then

relations(A,K)

denotes

{ r in relations(A) |
for all k in K,
for all t1, t2 in r,
proj(t1,k) = proj(t2,k) => t1 = t2
}

Example
-------

A database value is a tuple that maps relation name to relation value.

mydatabaseschema =
tuples
(
{
(
employee,
relations
(
{
(name,string),
(birthdate,date)
},
{
{name}
}
)
}
)
)

Nilone

unread,

Feb 18, 2010, 5:12:58 AM2/18/10

to

On Feb 18, 9:51 am, Ben Finney <bignose+hates-s...@benfinney.id.au>
wrote:

> Nilone <rea...@gmail.com> writes:
> > Your idea doesn't eliminate attribute names, it just delegates it to
> > the type namespace.
>
> Exactly. Given that, I don't see what is gained.

Nor do I, but I'll accept any definition which doesn't fundamentally
break the model. I'm thinking "someone might implement it this way" -
being flexible about the requirements could make it happen sooner.

Tegiri Nenashi

unread,

Feb 18, 2010, 11:47:15 AM2/18/10

to

On Feb 17, 10:08 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
> On Feb 17, 7:21 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:
>
> > On Feb 17, 3:46 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:
> > > The thread started as Keith's attempt to demote attribute names in
> > > favor of types,
>
> Eliminate not just demote.
>
> > > and was vehemently objected to. From my angle (that
> > > would be relational lattice:-) Relational Model is a theory of
> > > Relations with Named Attributes. It is difficult to see unnamed
> > > perspective (with positional attributes) as contender anymore. This is
> > > especially evident through the prism of set intersection join (aka
> > > composition) operation...
>
> Except that I'm not proposing "positional attributes" so I fail
> to see the relevance of your point?

I was addressing Named perspective vs. Positional perspective p.31-33
in the Alice book.

> First, I'm asking a simple
> question: suppose we have already have unique types for every
> header then what theoretical capability do the names add? (Bob
> argues that they are necessary for "controlling" natural join.
> I disagree that they are /necessary/ for this; but my complete
> response to that will have to wait a few days.)

And, adding to Bob's point, there are many more [non-classic]
relational operators that depend on the attribute naming.

The issue is more complex, though. The algebra with unnamed attributes
(that is Algebra of Binary Relations) is arguably much more
established construction in math. The parallels between it and
Relational Algebra (in the lattice form) are puzzling. In Algebra of
Binary Relations they have two versions of each operator: logical and
relative. Composition is relative analog of conjunction, converse is
relative analog of negation and so on. In Relational lattice there are
two versions of conjunction, two versions of negation and so on. It is
natural to extend your question and wonder if types are of any use for
Algebra of Binary Relations?

Cimode

unread,

Feb 18, 2010, 4:43:22 PM2/18/10

to

Snipped

> The thread started as Keith's attempt to demote attribute names in
> favor of types, and was vehemently objected to. From my angle (that
> would be relational lattice:-) Relational Model is a theory of
> Relations with Named Attributes. It is difficult to see unnamed
> perspective (with positional attributes) as contender anymore.

Who said that an un-named attribute theory of relations imposes a
position on attributes.

>This is
> especially evident through the prism of set intersection join (aka
> composition) operation...

How?

David BL

unread,

Feb 18, 2010, 10:44:45 PM2/18/10

to

On Feb 15, 3:01 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
>
> Of course if you are dealing with crippled domain support then names
> are essential. But envision a system with rich type support. I'm
> asking
> if, in that world, we even need to bother with attribute names at all?

I think a rich type system is desirable. I suggest there should be as
many types as necessary according to exactly where it is considered
useful to test for equality. No more and no less.

For example, it isn't particularly useful to compare a person's weight
with a person's height, so it would be reasonable to have specialised
types like MASS_IN_KG and LENGTH_IN_METRES. These are like copies of
NUMBER, but the idea here would be for them to really be distinct
types (i.e. no implicit conversions). I think they should be strict
subtypes of NUMBER. They would inherit arithmetic operators from
NUMBER, such as the signature

NUMBER <-- NUMBER + NUMBER

which appears to defeat the purpose somewhat. E.g. it is possible to
add a MASS_IN_KG to a LENGTH_IN_METRES. However this is safe because
it returns NUMBER which cannot be implicit downcast to a more
specialised type.

Ideally the type system would add all the appropriate specialisations
of the operators (which for example are covariant on return type).
E.g. a plus operator that allows two masses to be added to give a
mass. Strictly speaking this is a distinct operator to the plus
operator inherited from NUMBER, so operator overloading would be
necessary to make this practical.

A curiosity is that the super-type NUMBER doesn't play the role of a
scalar (where by scalar I mean a dimensionless quantity). For
example, if one tried to define a scalar multiplication like this

MASS_IN_KG <-- NUMBER * MASS_IN_KG

one would find that any specialisation of NUMBER can serve as the
scalar. Instead one seems to need to define a common sub-type of all
the specialisations of NUMBER, to serve the role of a dimensionless
number in the type system.

BTW it seems suspicious to me to ever perform equijoins on attributes
on a type that approximates a dense set of numbers.

I also think specialisations of STRING are quite useful, and happily
the problem is much simpler. E.g. we could define strict subtypes of
STRING such as CITYNAME and SUPPLIERNAME. This assumes we aren't
interested in cases where a supplier's name happens to match a city's
name. i.e. we aren't using the database to answer trivia questions!

These rich types will help protect the user from silly mistakes when
constructing queries. The user can still bypass the type checking by
using explicit coercions, but it is expected that these are almost
never required in practical examples.

I think another important use of rich data types is to allow the DBMS
to determine what joins are likely between attributes, and this would
seem important for example in the TransRelational Model by Tarin.

Ok, now back to the subject of the thread. Despite using a rich set
of types I believe it is still necessary to use attribute names which
serve as role names. For example consider the external predicate

Supplier S is located in city L and dispatches products to city D.

Consider furthermore that in that application it's considered useful
to join on L=D. In that case the attribute names L and D need to have
the same type (e.g. CITYNAME).

Introducing (alias) types named SUPPLIERLOCATIONCITY and
SUPPLIERDISPATCHCITY doesn't seem a good idea to me. I like to think
of these as roles not types.

Keith's suggestion seems similar to the Universal Relation idea. In
that case it is required that all roles across all relations be
globally unique. The goal is for logical independence - a user can
write a query without needing to specify the access path in the sense
of the explicit joins amongst the underlying relations. That's a big
advantage!

However W.Kent in "Consequences of Assuming a Universal Relation",
claims (amongst other things) that attribute names quickly proliferate
and obfuscate (his words). E.g. DATE-PROJECT-ASSIGNED-DEPT.

Tegiri Nenashi

unread,

Feb 19, 2010, 12:33:47 AM2/19/10

to

Types are entirely misguided approach to physical units. Much more
elegant way is to operate units as if they are numbers, for example
the expression 10 * kg / (10 * sec^2) is multiplication/division of 4
number-like entities. One can use the laws of associativity/
commutativity and reduce it to 1 * kg/sec^2.

David BL

unread,

Feb 19, 2010, 3:35:37 AM2/19/10

to

On Feb 19, 1:33 pm, Tegiri Nenashi <tegirinena...@gmail.com> wrote:

> Types are entirely misguided approach to physical units. Much more
> elegant way is to operate units as if they are numbers, for example
> the expression 10 * kg / (10 * sec^2) is multiplication/division of 4
> number-like entities. One can use the laws of associativity/
> commutativity and reduce it to 1 * kg/sec^2.

Interesting and that does look elegant. Are you suggesting the
computer manipulates these symbolic expressions so as to rearrange
them into a canonical form

value * unit

(i.e. units are moved to the right hand side of a multiplicative
expression).

How does validation fit in, if it's not part of the type system? I'm
guessing it's based on unification of symbolic expressions. Is that
right?

I believe the design/implementation of a DBMS is normally broken up
into rather distinct "layers", one of which deals with domain support
(both built-in and user-defined types) and another that deals with the
RM itself and treats domains as generic types.

I'm struggling to understand how a domain datatype could support units
in the way you describe without the generic RM layer ending up
physically recording a unit against every recorded numerical value.
Obviously that's not practical.

If a numerical data type supports units then it must record the unit
as part of an encoded value. If it doesn't then the unit must be
associated statically with the data type itself, contrary to your
claim that types are a misguided approach to physical units. Is it
possible to evade that conundrum?

Kevin Kirkpatrick

unread,

Feb 19, 2010, 11:36:58 AM2/19/10

to

The more I read, the more I find myself warming up to this idea.

Dave, could your SUPPLIERLOCATIONCITY vs SUPPLIERDISPATCHCITY (and
other such objections) be resolved by having two kinds of subtyping -
one allowing implicit coersion and the other not - e.g.

TYPE CITYNAME SUBTYPE STRING EXPLICIT COERSION
TYPE SUPPLIERLOCATIONCITY SUBTYPE CITYNAME IMPLICIT COERSION
TYPE SUPPLIERDISPATCHCITY SUBTYPE CITYNAME IMPLICIT COERSION

Which might be a way of allowing various types of cities to be joined,
while still flagging direct comparisons of SUPPLIERLOCATIONCITY to,
say, COUNTRYNAME?

Admittedly, SUPPLIERDISPATCHCITY seems quite verbose for a type name -
but perhaps that's just based on us having put up with (attribute
name, simple type) pairings for so long.

In fact, it's a bit surprising that Date wouldn't have pushed more in
this direction himself - in his "Database in Depth (pg 24)", one
begins to notice a distinct lack of elegance as he uses rich typing:
PARTS = {PNO PNO, PNAME PNAME, COLOR COLOR, WEIGHT WEIGHT, CITY CHAR}

(one wonders why he chose type CHAR for CITY, when CITY should only
naturally, without explicit coersion, be comparable to other CITY
values, not any arbitrary strings of characters)

Anyway, very intriguing idea Keith.

Tegiri Nenashi

unread,

Feb 19, 2010, 12:00:07 PM2/19/10

to

My message was an echo of 4 year old discussion
http://groups.google.com/group/comp.databases.theory/browse_frm/thread/958d15dad2fa1db6/f4e7dbff4b004268?hl=en&q=units+database+group:comp.databases.theory#f4e7dbff4b004268

Kevin Kirkpatrick

unread,

Feb 19, 2010, 2:42:11 PM2/19/10

to

Looks like MS has recently worked this into their F# language, by
allowing units as parameterizations of types.

Interesting series of blogs that delve into it in great depth:
http://blogs.msdn.com/andrewkennedy/archive/2008/08/20/units-of-measure-in-f-part-one-introducing-units.aspx

Keith H Duggar

unread,

Feb 19, 2010, 4:30:07 PM2/19/10

to

On Feb 19, 11:36 am, Kevin Kirkpatrick <kvnkrkpt...@gmail.com> wrote:
> On Feb 18, 9:44 pm, David BL <davi...@iinet.net.au> wrote:
> > On Feb 15, 3:01 pm, Keith H Duggar <dug...@alum.mit.edu> wrote:
>
> > > Of course if you are dealing with crippled domain support then names
> > > are essential. But envision a system with rich type support. I'm
> > > asking
> > > if, in that world, we even need to bother with attribute names at all?
>
> > I think a rich type system is desirable. I suggest there should be as
> > many types as necessary according to exactly where it is considered
> > useful to test for equality. No more and no less.

[snip David's excellent exposition]

Thanks David for putting meat on the bones! Indeed this is
exactly the direction I was thinking of. I'd never run into
the "Universal Relation" idea and need to look into that.
Do you happen to have any of the better references you
found handy?

> > I also think specialisations of STRING are quite useful, and happily
> > the problem is much simpler. E.g. we could define strict subtypes of
> > STRING such as CITYNAME and SUPPLIERNAME. This assumes we aren't
> > interested in cases where a supplier's name happens to match a city's
> > name. i.e. we aren't using the database to answer trivia questions!

> ...

> > Ok, now back to the subject of the thread. Despite using a rich set
> > of types I believe it is still necessary to use attribute names which
> > serve as role names. For example consider the external predicate
>
> > Supplier S is located in city L and dispatches products to city D.
>
> > Consider furthermore that in that application it's considered useful
> > to join on L=D. In that case the attribute names L and D need to have
> > the same type (e.g. CITYNAME).
>
> > Introducing (alias) types named SUPPLIERLOCATIONCITY and
> > SUPPLIERDISPATCHCITY doesn't seem a good idea to me. I like to think
> > of these as roles not types.
>

> The more I read, the more I find myself warming up to this idea.
>
> Dave, could your SUPPLIERLOCATIONCITY vs SUPPLIERDISPATCHCITY (and
> other such objections) be resolved by having two kinds of subtyping -
> one allowing implicit coersion and the other not - e.g.
>
> TYPE CITYNAME SUBTYPE STRING EXPLICIT COERSION
> TYPE SUPPLIERLOCATIONCITY SUBTYPE CITYNAME IMPLICIT COERSION
> TYPE SUPPLIERDISPATCHCITY SUBTYPE CITYNAME IMPLICIT COERSION
>
> Which might be a way of allowing various types of cities to be joined,
> while still flagging direct comparisons of SUPPLIERLOCATIONCITY to,
> say, COUNTRYNAME?

Yes precisely. A rich type system can provide conveniences such
as that. In addition, though the equality operator is the primary
concern above, a type system can provide fine grained control of
explicit/implicit coersion by allowing one to add and remove
individual function signatures from the explicit/implicit set.

We can also use inline explicit casts just as one would might
use rename in a join for attribute names. So for the L=D above
in they are declared as explicit coersion then one could write
CITYNAME(L)=CITYNAME(D) which is very similar to some of Date's
casting examples here and there.

> Admittedly, SUPPLIERDISPATCHCITY seems quite verbose for a type name -
> but perhaps that's just based on us having put up with (attribute
> name, simple type) pairings for so long.

Right, it's probably not any more verbose than the pair taken
as whole.

> In fact, it's a bit surprising that Date wouldn't have pushed more in
> this direction himself - in his "Database in Depth (pg 24)", one
> begins to notice a distinct lack of elegance as he uses rich typing:
> PARTS = {PNO PNO, PNAME PNAME, COLOR COLOR, WEIGHT WEIGHT, CITY CHAR}
>
> (one wonders why he chose type CHAR for CITY, when CITY should only
> naturally, without explicit coersion, be comparable to other CITY
> values, not any arbitrary strings of characters)

It's funny you mention that example as it is precisely one of
the examples (along with other examples in Date's books) that
started to give me a nagging feeling there was some redundancy
to be simplified in the classical (name, type) attribute pair.

> Anyway, very intriguing idea Keith.

Thanks! I'm happy it's spawning some lively discussion here at
least. Personally I'm still trying to digest David's points re
a typeless system. I think there is some insight to be gained
from it but I still haven't had the time to finish the meal ;-)

KHD

David BL

unread,

Feb 19, 2010, 9:40:57 PM2/19/10

to

On Feb 20, 12:36 am, Kevin Kirkpatrick <kvnkrkpt...@gmail.com> wrote:
>
> The more I read, the more I find myself warming up to this idea.
>
> Dave, could your SUPPLIERLOCATIONCITY vs SUPPLIERDISPATCHCITY (and
> other such objections) be resolved by having two kinds of subtyping -
> one allowing implicit coersion and the other not - e.g.
>
> TYPE CITYNAME SUBTYPE STRING EXPLICIT COERSION
> TYPE SUPPLIERLOCATIONCITY SUBTYPE CITYNAME IMPLICIT COERSION
> TYPE SUPPLIERDISPATCHCITY SUBTYPE CITYNAME IMPLICIT COERSION
>
> Which might be a way of allowing various types of cities to be joined,
> while still flagging direct comparisons of SUPPLIERLOCATIONCITY to,
> say, COUNTRYNAME?

I find it strange that there can be two distinct types T1, T2, and yet
implicit conversions are allowed in both directions, as if they
subtype each other.

Since type systems are the source of so much controversy I would
rather not add fuel!

> Admittedly, SUPPLIERDISPATCHCITY seems quite verbose for a type name -
> but perhaps that's just based on us having put up with (attribute
> name, simple type) pairings for so long.
>
> In fact, it's a bit surprising that Date wouldn't have pushed more in
> this direction himself - in his "Database in Depth (pg 24)", one
> begins to notice a distinct lack of elegance as he uses rich typing:
> PARTS = {PNO PNO, PNAME PNAME, COLOR COLOR, WEIGHT WEIGHT, CITY CHAR}
>
> (one wonders why he chose type CHAR for CITY, when CITY should only
> naturally, without explicit coersion, be comparable to other CITY
> values, not any arbitrary strings of characters)

I agree this provides some motivation, but it doesn't seem sufficient
justification to "conflate" concerns (as Bob says). The crucial point
is this: The attribute domains are part of the relation-type, whereas
the attribute names are part of the relation-value. What happens with
your suggestion?

It's very important to keep a clear separation between a value and its
declared type, in order to understand equality. Your suggestion ends
up conflating relation-type and relation-value.

How do you support sub-typing of relation-types according to sub-
typing of the attribute domains, and still allow for addressing the
attributes by type name? It seems you are forced to support multiple
typenames for an attribute. E.g. a relation has an attribute
containing circles and you must allow it to be addressed using either
circle or ellipse.

David BL

unread,

Feb 19, 2010, 10:15:00 PM2/19/10

to

On Feb 20, 10:40 am, David BL <davi...@iinet.net.au> wrote:
>
> I agree this provides some motivation, but it doesn't seem sufficient
> justification to "conflate" concerns (as Bob says). The crucial point
> is this: The attribute domains are part of the relation-type, whereas
> the attribute names are part of the relation-value. What happens with
> your suggestion?
>
> It's very important to keep a clear separation between a value and its
> declared type, in order to understand equality. Your suggestion ends
> up conflating relation-type and relation-value.
>
> How do you support sub-typing of relation-types according to sub-
> typing of the attribute domains, and still allow for addressing the
> attributes by type name? It seems you are forced to support multiple
> typenames for an attribute. E.g. a relation has an attribute
> containing circles and you must allow it to be addressed using either
> circle or ellipse.

Oops, I got confused saying "your suggestion". Of course I mean
"Keith's suggestion"

David BL

unread,

Feb 19, 2010, 10:45:42 PM2/19/10

to

On Feb 20, 5:30 am, Keith H Duggar <dug...@alum.mit.edu> wrote:

> Thanks David for putting meat on the bones! Indeed this is
> exactly the direction I was thinking of. I'd never run into
> the "Universal Relation" idea and need to look into that.
> Do you happen to have any of the better references you
> found handy?

Start with a thread "Navigation vs Relational operators" on cdt back
in May 2004. D Guntermann provided some links. Jan's link to an on-
line presentation still works.

> Yes precisely. A rich type system can provide conveniences such
> as that. In addition, though the equality operator is the primary
> concern above, a type system can provide fine grained control of
> explicit/implicit coersion by allowing one to add and remove
> individual function signatures from the explicit/implicit set.
>
> We can also use inline explicit casts just as one would might
> use rename in a join for attribute names. So for the L=D above
> in they are declared as explicit coersion then one could write
> CITYNAME(L)=CITYNAME(D) which is very similar to some of Date's
> casting examples here and there.
>
> > Admittedly, SUPPLIERDISPATCHCITY seems quite verbose for a type name -
> > but perhaps that's just based on us having put up with (attribute
> > name, simple type) pairings for so long.
>
> Right, it's probably not any more verbose than the pair taken
> as whole.

Is that true in general? I think there will be a tendency for both
the relation name (e.g. "SUPPLIER") and the attribute name (e.g.
"LOCATION") to be merged with the attribute type (e.g. "CITY").

Jan Hidders

unread,

Feb 21, 2010, 6:05:03 AM2/21/10

to

That's actually no problem at all. The formal definition of a relation
type / relation header in Keith's approach would be a set of types
(*). For those types you can define a subtyping relation as follows: a
relation header H1 is said to be a subtype of H2 if for every type in
R2 there is a subtype in R1. It's even possible to have a matching
semantics with that that follows the subtype=subset principle. So no
notion of coercion is really needed here.

(*) Note there would / should be an additional requirement that a
header can only contain incomparable types, i.e., it cannot contain t1
and t2 such that t1 is a proper subtype of t2.

> It seems you are forced to support multiple
> typenames for an attribute. E.g. a relation has an attribute
> containing circles and you must allow it to be addressed using either
> circle or ellipse.

Indeed. But the header would contain only Ellipse, and all subtypes,
including Circle, would be implied. On a related note: relation
projection could be slightly generalized such that you could not only
project on types in the header but also on subtypes of them.

Btw. was the typing rule for equality already discussed? That's not as
straightforward as some might think. One proposal could be that t1 =
t2 is allowed if t1 and t2 have at least one common supertype. Or
should that be subtype? Or both? :-) The right answer depends on
several things such as the chosen semantics of your types; two types
would be semantically incomparable if the intersection of their
semantics is empty. I can easily imagine that the schema designer
should be able to indicate explicitly that two types are incomparable,
even though they would be semantically comparable.

-- Jan Hidders

Keith H Duggar

unread,

Feb 21, 2010, 11:30:28 AM2/21/10

to

On Feb 21, 6:05 am, Jan Hidders <hidd...@gmail.com> wrote:
> Btw. was the typing rule for equality already discussed? That's not as
> straightforward as some might think. One proposal could be that t1 =
> t2 is allowed if t1 and t2 have at least one common supertype. Or
> should that be subtype? Or both? :-) The right answer depends on
> several things such as the chosen semantics of your types; two types
> would be semantically incomparable if the intersection of their
> semantics is empty. I can easily imagine that the schema designer
> should be able to indicate explicitly that two types are incomparable,
> even though they would be semantically comparable.

Unless I've lost track, it was only briefly discussed

http://groups.google.com/group/comp.databases.theory/msg/81584d495812974f

in the context of operator specifications in general. So a type
system could allow fine grained control over the signatures that
are implicity/explicity allowed with/without coersion. In the
example below I'm going to continue using "copy" to denote the
declaration of a "strong" type alias ie it creates a distinct
type not just another name for same type. So let's suppose we
define X and Y coordinates type as copies of an Integer type

copy Integer X
copy Integer Y

and that by default the type system does not supply implicit
coersion nor the multi-sorted equality operators. Then we would
explicity add those multi-sorted signatures

= : X, Y -> bool
= : Y, X -> bool

to the type system. This would allow X and Y equality comparison.
Now suppose elsewhere we have defined a part count

copy Integer PartCount

but do not add the multi-sorted (with X and Y) equality operators.
Then PartCount will not by default be equality comparable to X and
Y. This allows one to control the semantics.

One can also easily create various groups of semantically related
types to allow convenient shortcuts. For example, if we have more
than just two coordinates say we add Z, W, T it would be annoying
to explicity define the equality signatures for all permutations
of those types even though we want them to be comparable. So we
first create a Coordinate type then copy it

copy Integer Coordinate
copy Coordinate X
copy Coordinate Y
copy Coordinate Z
copy Coordinate W
copy Coordinate T

and now explicity add implicit coersion signatures

() : X -> Coordinate
() : Y -> Coordinate
() : Z -> Coordinate
() : W -> Coordinate
() : T -> Coordinate

which now allows implicit mutual equality comparison (and any other
operators) between X, Y, Z, W, T and Coordinate since they are now
implicitly coerced to Coordinate. Here we only had to explicitly add
only a linear number of implicit coersion operators instead of a
combinatorial number of equality signatures.

KHD

Tegiri Nenashi

unread,

Feb 21, 2010, 2:33:19 PM2/21/10

to

On Feb 21, 3:05 am, Jan Hidders <hidd...@gmail.com> wrote:
> On 20 feb, 03:40, David BL <davi...@iinet.net.au> wrote:

> ... E.g. a relation has an attribute

> > containing circles and you must allow it to be addressed using either
> > circle or ellipse.
>
> Indeed. But the header would contain only Ellipse, and all subtypes,

> including Circle, would be implied. ...

Ellipse-Circle example is unconvincing. Both are conic sections and it
is natural to suggest that the design would greatly benefit from
introducing a single class instead of many. The only objection is that
certain methods being constrained to subtypes (such as Circle) might
greatly benefit in performance. However, this is rarely a concern in
practice with so called "object-oriented design" methodology, where
not much thought is put into creating a wealth of new classes.

The situation is mirrored for physical units. Here in the US the
debate is still imperial vs. metric, where "more educated" crowd
points out that metric is certainly superior because scientists are
using it. Which scientists? It is as early as at physics undergraduate
level that one learns that SI is not used in physics anymore and a
system with 3 basic units (cm-gm-sec) is certainly superior. Later on,
on theoretical physics level, this system is dumped in favor of
dimensionless units where all fundamental constants are set equal to
1.

Therefore, both types and units are seems to be artifacts of our
limited perspective. As soon as we get better knowledge we get rid of
them.

Math is different story. Types were created as a vehicle to avoid
certain paradoxes. A typical view of somebody working in applied
sciences is that borrowing a tool designed to avoid some obscure
theoretical constructions is ridiculous. To cite E.T.Jaynes
http://omega.albany.edu:8008/JaynesBook.html, Appendix B
Formalities and Mathematical Style who quotes Henri Poincare (1909):

""In the old days when people invented a new function they had some
useful purpose in mind: now they invent them deliberately just to
invalidate our ancestors' reasoning, and that is all they are ever
going to get out of them."

Indeed, this fad of artificially contrived mathematical pathology
seems nearly to have run its course, and for just the reason that
Poincare foresaw; nothing useful can be done with it."

A similar situation happened in mathematical foundation area where a
wealth of paradoxes were created, and, unlike analysis, these
pathologies were instrumental for axiomatizing set theory. It is
remarkable that a construction, which was created in such peculiar
circumstances, is one of the most profound ideas in CS.

Bob Badour

unread,

Feb 21, 2010, 6:40:58 PM2/21/10

to

Tegiri Nenashi wrote:

> On Feb 21, 3:05 am, Jan Hidders <hidd...@gmail.com> wrote:
>
>>On 20 feb, 03:40, David BL <davi...@iinet.net.au> wrote:
>>... E.g. a relation has an attribute
>>
>>>containing circles and you must allow it to be addressed using either
>>>circle or ellipse.
>>
>>Indeed. But the header would contain only Ellipse, and all subtypes,
>>including Circle, would be implied. ...
>
> Ellipse-Circle example is unconvincing. Both are conic sections and it
> is natural to suggest that the design would greatly benefit from
> introducing a single class instead of many. The only objection is that
> certain methods being constrained to subtypes (such as Circle) might
> greatly benefit in performance.

What's the radius of an ellipse?

The differences are conceptual and logical not just physical.

Keith H Duggar

unread,

Feb 21, 2010, 8:07:32 PM2/21/10

to

There is just to much vagary above for me to comprehend salient
concrete points. For example "greatly benefit", "performance",
"certainly superior", "limited perspective", "better knowledge"?

Out of curiosity, do you have any practical experience programming
with strongly typed languages? Can you not think of even a single
useful aspect of types in THAT context? You mention performance yet
you fail to mention some of the other big (arguably more important)
benefits such as type safety (entire categories of bugs eliminated
from possibility), semantic expressiveness, etc.

> Math is different story. Types were created as a vehicle to avoid
> certain paradoxes. A typical view of somebody working in applied
> sciences is that borrowing a tool designed to avoid some obscure
> theoretical constructions is ridiculous.

Sorry but the only thing ridiculous is your assertion that "types
were created ...". "type" is just a word and the concept of types
dates back to antiquity. For example the notions of "kinds" aka
"types" of numbers such as "integer", "rational", "real" predates
Russell's particular use of the word by thousands of years. Other
more closely related (to the concept of type were are discussing
here) work such as group theory and abstract algebra also have
roots centuries older. Finally, humans have had the concept of
"kinds" of things probably from the moment a human first realized
their hand had a different purpose than their foot and that beef
doesn't taste like chicken.

So the main source of this ranting above seems to be a confusion
that when a computer scientists thinks of data types that they
have anything at all in their mind related to resolving Russell's
paradox worries.

That both Russell and computer scientists find the same tool, types,
useful for similar purposes, constraining language syntax (Russell
for the languages of logic, a computer scientist for programming
languages) is of superficial if any importance.

> To cite E.T.Jayneshttp://omega.albany.edu:8008/JaynesBook.html, Appendix B

> Formalities and Mathematical Style who quotes Henri Poincare (1909):
>
> ""In the old days when people invented a new function they had some
> useful purpose in mind: now they invent them deliberately just to
> invalidate our ancestors' reasoning, and that is all they are ever
> going to get out of them."
>
> Indeed, this fad of artificially contrived mathematical pathology
> seems nearly to have run its course, and for just the reason that
> Poincare foresaw; nothing useful can be done with it."

Sadly (for the purpose of the rant) types have proven to be very
useful for programmers. Though perhaps it is hard to appreciate
this if you are a mathematician whose only experience with types
are Russell's "levels" rather than say a working programmer with
practical experience with strongly typed programming languages.

> A similar situation happened in mathematical foundation area where a
> wealth of paradoxes were created, and, unlike analysis, these
> pathologies were instrumental for axiomatizing set theory. It is
> remarkable that a construction, which was created in such peculiar

You are confused. The notion of type was not created by Russell
and the notion of type employed in programming languages bear
only superficial similarity to Russell's use of types.

> circumstances, is one of the most profound ideas in CS.

Yes, tudes are always amusing but are without doubt something
that "nothing useful can be done with".

KHD

paul c

unread,

Feb 21, 2010, 8:10:41 PM2/21/10

to

Must confess that I probably have a big mental block about another
aspect of type theory, eg., the very mention of it usually makes me ask
myself how to record the value of one-third of a meter (which might be
why I usually try to keep out of typing discussions). Probably my own
inadequacy, not having grown up in a metric country it never occurred to
me to wonder what one-fifth of a foot is. International agreement about
25.4 mm's being an inch doesn't help. Seems to me that the application
requirement always trumps questions of precision and any practical dbms
needs to require/allow decimal places for numeric types. While I
sympathize with Keith D's wish to eliminate the 'name,type' double, I
think that practical issues require not only so-called 'possreps' but
also what I think of as a 'name,type,use' triples. What a mess!

As for having just types, I still think that Codd introduced his
attribute names because of relations that can have two 'columns' of the
same type. I think Keith D is arguing against this and if I'm right
about that, I'd like him to deal with that case, not to say it can't be
done, just that I'd like to see whether his "copy" in a language is a
better tool.

David BL

unread,

Feb 21, 2010, 9:40:36 PM2/21/10

to

On Feb 21, 7:05 pm, Jan Hidders <hidd...@gmail.com> wrote:
> On 20 feb, 03:40, David BL <davi...@iinet.net.au> wrote:
>
> > I agree this provides some motivation, but it doesn't seem sufficient
> > justification to "conflate" concerns (as Bob says). The crucial point
> > is this: The attribute domains are part of the relation-type, whereas
> > the attribute names are part of the relation-value. What happens with
> > your suggestion?

I got that wrong. I should have said:

The crucial point is this: Whereas both the declared attribute names
and attribute domains are part of the relation-type, only the

attribute names are part of the relation-value.

> > It's very important to keep a clear separation between a value and its
> > declared type, in order to understand equality. Your suggestion ends
> > up conflating relation-type and relation-value.
>
> > How do you support sub-typing of relation-types according to sub-
> > typing of the attribute domains, and still allow for addressing the
> > attributes by type name?
>
> That's actually no problem at all. The formal definition of a relation
> type / relation header in Keith's approach would be a set of types
> (*). For those types you can define a subtyping relation as follows: a
> relation header H1 is said to be a subtype of H2 if for every type in
> R2 there is a subtype in R1. It's even possible to have a matching
> semantics with that that follows the subtype=subset principle. So no
> notion of coercion is really needed here.
>
> (*) Note there would / should be an additional requirement that a
> header can only contain incomparable types, i.e., it cannot contain t1
> and t2 such that t1 is a proper subtype of t2.

There is a problem.

Let's write t1 <= t2 if t1 is a subtype of t2 (let this include the
case t1 = t2).

Definition: header H is valid if for all t1,t2 in H,

t1 <= t2 implies t1 = t2.

(this is your "additional requirement")

Definition: H1 <= H2 if
exists bijection f from H1 to H2
such that for each t in H1, t <= f(t).

Unfortunately even assuming H1,H2 are valid doesn't imply that a
bijection is unique (if it exists).

Here is a counter example: H1 = {t1,t2}, H2 = {t3,t4} where
t1,t2,t3,t4 are all distinct types where:

not t1 <= t2
not t2 <= t1
not t3 <= t4
not t4 <= t3
t1 <= t3
t1 <= t4
t2 <= t3
t2 <= t4

In this example, H1, H2 are both valid and there are two distinct
bijections available. Ambiguity in the bijection is incompatible with
the subtype = subset principle. So multiple inheritance throws a
spanner in the works. The only way to salvage the situation is to
require the bijection be unique (so in the above example it is deemed
that H1 is not a subtype of H2). However I find this gratuitous at
best.

Keith H Duggar

unread,

Feb 21, 2010, 10:41:26 PM2/21/10

to

On Feb 21, 8:10 pm, paul c <toledobythe...@oohay.ac> wrote:
> As for having just types, I still think that Codd introduced his
> attribute names because of relations that can have two 'columns' of the
> same type. I think Keith D is arguing against this and if I'm right
> about that, I'd like him to deal with that case, not to say it can't be
> done, just that I'd like to see whether his "copy" in a language is a
> better tool.

How about we do some concrete exercises then? Please provide your
favorite example of some relations with attributes having the same
type along with some associated code operating on those relations.
Then I will convert the {(name,type)} system example into a {type}
only system so we can compare.

KHD

Gene Wirchenko

unread,

Feb 21, 2010, 11:27:26 PM2/21/10

to

On Fri, 19 Feb 2010 08:36:58 -0800 (PST), Kevin Kirkpatrick
<kvnkr...@gmail.com> wrote:

[snip]

>Dave, could your SUPPLIERLOCATIONCITY vs SUPPLIERDISPATCHCITY (and
>other such objections) be resolved by having two kinds of subtyping -
>one allowing implicit coersion and the other not - e.g.
>
>TYPE CITYNAME SUBTYPE STRING EXPLICIT COERSION
>TYPE SUPPLIERLOCATIONCITY SUBTYPE CITYNAME IMPLICIT COERSION
>TYPE SUPPLIERDISPATCHCITY SUBTYPE CITYNAME IMPLICIT COERSION

I think that this level of typing is silly. *Every* column has
its own type? Is there really any difference between a
SUPPLIERLOCATIONCITY and a SUPPLIERDISPATCHCITY? (If they have the
same operations, what is the difference?)

I think this would be more useful:

TYPE CITYNAME SUBTYPE STRING EXPLICIT COERSION

column SUPPLIERLOCATIONCITY isa CITYNAME
column SUPPLIERDISPATCHCITY isa CITYNAME

[snip]

Sincerely,

Gene Wirchenko

paul c

unread,

Feb 21, 2010, 11:30:01 PM2/21/10

to

Below is a cut and paste, as best as I can manage, of what Codd said in
1970. As some here probably recognize, it has bugged me for years that
arcane ops such as 'tclose' are required in the D&D way of thinking.
Maybe I'm unfair to use such an adjective as arcane but my reason is
that the explanations I've seen for it are intricate and elaborate. In
Codd's approach, relations with two 'columns' having the same 'domain'
don't seem to be able to be closed into relations that transform the
domains, eg., from parent and child to ancestor and decendant. D&D seem
to allow it by use of <RENAME>. One way I'd compare alternative
approaches is to ask 1) how many new operators are needed to substitute
for the above? and 2) how easy are they to 'optimize' for minimum
machine time/cost?. It seems to me that for practical purposes, one
must admit relations that may have two 'columns' having the same 'type'.

(I read something by Bertrand Russell where he talked of domains and I
got the impression that his relations could define the different between
'parent' and 'ancestor', but he wrote that long before people had to
write computer programs and I'm guessing what he wrote is circular in
today's context.)

quote (I hope):
... As the example in Figure 2 shows, two columns
may have identical headings (indicating identical
domains) but possess distinct meanings with respect to the
relation. The relation depicted is called component. It is a
ternary relation, whose first two domains are called part
and third domain is called quantity. The meaning of component
(2, y, z) is that part x is an immediate component
(or subassembly) of part y, and z units of part 5 are needed
to assemble one unit of part y. It is a relation which plays
a critical role in the parts explosion problem.

component
(part part quantity)
1 5 9
2 5 7
3 5 2
2 6 12
3 6 3
4 7 1
6 7 1
FIG. 2.

A relation with-two identical domains

David BL

unread,

Feb 22, 2010, 12:20:12 AM2/22/10

to

On Feb 22, 9:10 am, paul c <toledobythe...@oohay.ac> wrote:

> While I
> sympathize with Keith D's wish to eliminate the 'name,type' double, I
> think that practical issues require not only so-called 'possreps' but
> also what I think of as a 'name,type,use' triples. What a mess!

Regarding possreps, I think they can be understood very clearly in a
typeless setting.

Consider the following from Darwen

(see http://www.dcs.warwick.ac.uk/~hugh/TTM/taamoti.slides.pdf):

TYPE ELLIPSE
IS PLANE_FIGURE
POSSREP { A LENGTH, B LENGTH, CTR POINT
CONSTRAINT A >= B AND B > LENGTH ( 0.0 ) } ;

TYPE CIRCLE
IS ELLIPSE
CONSTRAINT THE_A ( ELLIPSE ) = THE_B ( ELLIPSE )
POSSREP { R = THE_A ( ELLIPSE ) ,
CTR = THE_CTR ( ELLIPSE ) } ;

In the following R means the reals.

Let Tellipse be a set of tuples as follows:

Tellipse =
{ (a,b,ctr) in RxRx(RxR) |
a >= b and b > 0 }

Let Pellipse be a function that maps each tuple t in Tellipse to a
real living ,breathing, ellipse value (i.e. an infinite set of points
in RxR):

for all t in Tellipse,
Pellipse(t) =
{ (x,y) in RxR |
((x-t.ctr.x)/t.a)^2 + ((y-t.ctr.y)/t.b)^2 = 1
}

Similarly let

Tcircle = { (r,ctr) in Rx(RxR) | r > 0 }

for all t in Tcircle,
Pcircle(t) =
{ (x,y) in RxR |
(x-t.ctr.x)^2 + (y-t.ctr.y)^2 = t.r^2
}

Note that range(Pcircle) is a subset of range(Pellipse). This
embodies D&D's assertion that type CIRCLE is a subtype of type
ELLIPSE.

Database systems don't tend to formalise ellipses and circles.
Instead they simply parameterise their values using elements of
Tellipse and Tcircle and leave the mappings Pellipse, Pcircle
unspecified.

In my opinion, we should regard a possrep as formally this
(unspecified) mapping such as Pcircle on a given domain Tcircle.
Usually we would require such a mapping to be onto (so that all
possible circles can be represented) and 1-1 (so that when two tuples
are distinct the DBMS can deduce that the circles that they represent
are distinct as well).

Note that Tcircle is uncountably infinite so we cannot even define an
encoding that can represent each element of Tcircle on a computer. In
practise it means we simply deal with some countable subset of Tcircle
(and therefore limit ourselves to a corresponding subset of all
circles.

The relationship between the two possreps is embodied in the
following:

CtoE = { (t1,t2) in Tcircle x Tellipse |
Pcircle(t1) = Pellipse(t2) }

Since every circle is an ellipse, CtoE is the graph of a function that
maps every tuple in Tcircle to a corresponding tuple in Tellipse that
represents the same circle/ellipse value.

Since Pcircle, Pellipse aren't specified to the DBMS, we instead need
to express CtoE directly. Of course that is very easy

(ctr,r) |---> (ctr,r,r)

Expressing this using D&D type selectors we could express this as:

CIRCLE(ctr,r) is-a ELLIPSE(ctr,r,r)

IMO this is cleaner (and indeed more practical) than the D&D syntax
which seems to get it arse about. D&D syntax specifies how to map a
subset of the ellipse values to matching circle values, by specifying
a constraint and by expressing the possrep of CIRCLE in terms of the
possrep of ELLIPSE.

Jan Hidders

unread,

Feb 22, 2010, 9:39:16 AM2/22/10

to

That was not my definition. It would be:

Definition: H1 <= H2 if

exists a function f : H2 -> H1
such that for each t in H2, f(t) <= t.

Note the direction of f. But also this function is not unique, and
yes, I do think that this is a problem.

> [...] Ambiguity in the bijection is incompatible with

> the subtype = subset principle. So multiple inheritance throws a
> spanner in the works. The only way to salvage the situation is to
> require the bijection be unique (so in the above example it is deemed
> that H1 is not a subtype of H2). However I find this gratuitous at
> best.

There is indeed a problem with the subtype = subset principle, or at
least the one that I had in mind:

- if t1 <= t2 then [[t1]] is a subset of [[t2]]

where [[t]] denotes the set of all values that are of type t. Sorry
for being unclear about that.

The restriction that f in the definition should be uniquely defined is
implied by this principle, which IMNSHO makes it not ad-hoc or
gratuitous but rather well-founded. The proof is left to the reader as
an exercise. ;-)

-- Jan Hidders

Bob Badour

unread,

Feb 22, 2010, 10:09:53 AM2/22/10

to

Keith H Duggar wrote:

> On Feb 21, 8:10 pm, paul c <toledobythe...@oohay.ac> wrote:
>
>>As for having just types, I still think that Codd introduced his
>>attribute names because of relations that can have two 'columns' of the
>>same type. I think Keith D is arguing against this and if I'm right
>>about that, I'd like him to deal with that case, not to say it can't be
>>done, just that I'd like to see whether his "copy" in a language is a
>>better tool.
>
> How about we do some concrete exercises then? Please provide your
> favorite example of some relations with attributes having the same
> type along with some associated code operating on those relations.

With all due respect, when proposing an innovation the onus lies on you
to establish a need and/or benefit. How does discarding names benefit
predicate calculus? What flaw does it address?

Jan Hidders

unread,

Feb 22, 2010, 11:49:31 AM2/22/10

to

PS. Note that the corrected definition can be compactly formulated:

Definition: H1 <= H2 if

for each type t2 in H2
there is exactly one type t1 in H1
such that t1 <= t2.

-- Jan Hidders

David BL

unread,

Feb 22, 2010, 7:33:04 PM2/22/10

to

My problem is that there may be existing types t1,t2,t3,t4 and an
existing declared relation-type with H2 = {t3,t4}. Someone wants to
create a new relation-type with H1 = {t1,t2} that subtypes H2, perhaps
where it is required that t1,t3 represent one role and t2,t4 represent
another. However it happens that both t1 and t2 subtype both t3 and
t4 so therefore it cannot be done because of ambiguity. That seems
unreasonable.

> PS. Note that the corrected definition can be compactly formulated:
>
> Definition: H1 <= H2 if
> for each type t2 in H2
> there is exactly one type t1 in H1
> such that t1 <= t2.

That doesn't seem right to me - by that definition H1 might have more
attributes than H2 and yet be considered a subtype.

Jan Hidders

unread,

Feb 23, 2010, 4:28:41 AM2/23/10

to

On 23 feb, 01:33, David BL <davi...@iinet.net.au> wrote:
> On Feb 23, 12:49 am, Jan Hidders <hidd...@gmail.com> wrote:
>
> > On 22 feb, 15:39, Jan Hidders <hidd...@gmail.com> wrote:
> > > There is indeed a problem with the subtype = subset principle, or at
> > > least the one that I had in mind:
>
> > > - if t1 <= t2 then [[t1]] is a subset of [[t2]]
>
> > > where [[t]] denotes the set of all values that are of type t. Sorry
> > > for being unclear about that.
>
> > > The restriction that f in the definition should be uniquely defined is
> > > implied by this principle, which IMNSHO makes it not ad-hoc or
> > > gratuitous but rather well-founded. The proof is left to the reader as
> > > an exercise. ;-)
>
> My problem is that there may be existing types t1,t2,t3,t4 and an
> existing declared relation-type with H2 = {t3,t4}. Someone wants to
> create a new relation-type with H1 = {t1,t2} that subtypes H2, perhaps
> where it is required that t1,t3 represent one role and t2,t4 represent
> another. However it happens that both t1 and t2 subtype both t3 and
> t4 so therefore it cannot be done because of ambiguity. That seems
> unreasonable.

Mathematics can sometimes be very unreasonable. :-) But I think this
makes sense. If you give me a tuple (row/record) of H2 and would ask
for the t1-value in that tuple, then this is ambiguous, since both the
t3 and t4 value are also t1 values. It seems reasonable to assume that
for tuples of type H1 we require that they contain a single value of
type t1 and a single value of type t2. But the tuple of type H2 would
actually in some sense contain two values of type t1 (and two values
of type t2). So semantically speaking it is indeed not really a
subtype.

You could try to fix this by taking another semantics of the types,
but I don't see a reasonable alternative at the moment. In fact, my
current intuition is that this is a symptom of the fact that we are
trying to shoehorn the atrribute-type relationship into the subtype
relationship, where it doesn't really fit comfortably.

> > PS. Note that the corrected definition can be compactly formulated:
>
> > Definition: H1 <= H2 if
> > for each type t2 in H2
> > there is exactly one type t1 in H1
> > such that t1 <= t2.
>
> That doesn't seem right to me - by that definition H1 might have more
> attributes than H2 and yet be considered a subtype.

Of course. For the usual record types it holds that <a:t1, b:t2> is a
supertype of <a:t1, b:t2, c:t3>. You can use values of the latter
where values of the first are expected, not the other way around.

-- Jan HIdders

David BL

unread,

Feb 23, 2010, 12:08:24 PM2/23/10

to

I agree with all that, and add that it is noteworthy that the
ambiguity problem disappears with named attributes.

Anyway, I don't believe our analysis is complete, because our
definitions don't specify what happens exactly with Keith's "copy"
types (which must be used where there is a subtype relationship
between two of the attributes in a header).

> > > PS. Note that the corrected definition can be compactly formulated:
>
> > > Definition: H1 <= H2 if
> > > for each type t2 in H2
> > > there is exactly one type t1 in H1
> > > such that t1 <= t2.
>
> > That doesn't seem right to me - by that definition H1 might have more
> > attributes than H2 and yet be considered a subtype.
>
> Of course. For the usual record types it holds that <a:t1, b:t2> is a
> supertype of <a:t1, b:t2, c:t3>. You can use values of the latter
> where values of the first are expected, not the other way around.

That view of subtyping concerns a possible interpretation of
substitutability (of values), but I believe it is better to adhere to
the subtype = subset principle for data types. It cannot be said that
tuples of the form <a:t1, b:t2, c:t3> represent a subset of tuples of
the form <a:t1, b:t2>.

C.Date presents this argument very well in section 20.9 of an
Introduction to Database Systems where he claims that a coloured
circle is not a subtype of circle (or vice versa).

I suggest implicit conversions must only be an artefact of the type
system (i.e. implicit conversions cannot actually change a value by
adding or removing information), and in fact no concept of implicit
conversions is required in a typeless formalism.

Jan Hidders

unread,

Feb 24, 2010, 4:22:13 AM2/24/10

to

If he allows copies of copies or subtypes of copies he will run into
the same type of problems. If he doesn't, then he apparently
distinguishes copies (which cannot have copies and subtypes) and
normal types (which can). But that would be IMHO somewhat ad-hoc and a
sign that these copies are not really types at all.

> > > > PS. Note that the corrected definition can be compactly formulated:
>
> > > > Definition: H1 <= H2 if
> > > > for each type t2 in H2
> > > > there is exactly one type t1 in H1
> > > > such that t1 <= t2.
>
> > > That doesn't seem right to me - by that definition H1 might have more
> > > attributes than H2 and yet be considered a subtype.
>
> > Of course. For the usual record types it holds that <a:t1, b:t2> is a
> > supertype of <a:t1, b:t2, c:t3>. You can use values of the latter
> > where values of the first are expected, not the other way around.
>
> That view of subtyping concerns a possible interpretation of
> substitutability (of values), but I believe it is better to adhere to
> the subtype = subset principle for data types. It cannot be said that
> tuples of the form <a:t1, b:t2, c:t3> represent a subset of tuples of
> the form <a:t1, b:t2>.

Depends on what you mean by "of the form". In type theory for
programming languages and databases, if they consider subtyping, the
usual definition of a tuple being of the type <a:t1, b:t2> is that it
is a tuple that *at least* contains the fields a and b with values of
the required types. Under that definition the subtype = subset
principle is adhered to.

In case you are interested:

L Cardelli (1984), 'A semantics of multiple inheritance', in:
Semantics of Data
Types, LNCS 173, eds. Kahn, MacQueen and Plotkin, Springer Verlag,
51-68.

L Cardelli and P Wegner (1985), 'On understanding types, data
abstraction and
polymorphism', ACM Computing Surveys, 17(4), 471-521.

> C.Date presents this argument very well in section 20.9 of an
> Introduction to Database Systems where he claims that a coloured
> circle is not a subtype of circle (or vice versa).

The tuple that represents the circle is not the same thing as the
circle itself. I find Date's argument rather unconvincing, to put it
very mildly. He is by no means an authority in this area, and those
that are mostly disagree with this position.

> I suggest implicit conversions must only be an artefact of the type
> system (i.e. implicit conversions cannot actually change a value by
> adding or removing information), and in fact no concept of implicit
> conversions is required in a typeless formalism.

I would even add to that "or in a typed formalism". If in the
semantics you need implicit conversion you probably need to rethink
the semantics of your types.

-- Jan Hidders

Bob Badour

unread,

Feb 24, 2010, 9:08:15 AM2/24/10

to

Jan Hidders wrote:

Since when do you find argumentum ad verecundiam convincing? Hmmm?
[peers over rim of eyeglasses]

Jan Hidders

unread,

Feb 24, 2010, 10:02:04 AM2/24/10

to

I don't, nor do I think it is without any meaning whatsoever.

-- Jan Hidders

Bob Badour

unread,

Feb 24, 2010, 10:36:41 AM2/24/10

to

Jan Hidders wrote:

What happens when one disagrees on what makes authority. For example, I
consider Date an authority in this area--your ad hominem
notwithstanding--because he put considerable tuition into the subject
over a period of a decade or more. He did so with full knowledge of what
others before him had to say including Cardelli. Further, he did so with
an eye to obviating inconsistencies and flaws in those earlier works.

More to the point, I find his arguments convincing.

If you cannot offer a convincing reply to them, can you at least direct
me to someone who has replied convincingly?

Jan Hidders

unread,

Feb 24, 2010, 2:22:42 PM2/24/10

to

If you are not convinced that's not my problem. I've already hinted at
what my counterargument for the specific issue under discussion would
be. If you want to debate that, don't let me stop you.

-- Jan Hidders

paul c

unread,

Feb 24, 2010, 4:34:24 PM2/24/10

to

Bob Badour wrote:
> Jan Hidders wrote:
> ...
...

>>>>> C.Date presents this argument very well in section 20.9 of an
>>>>> Introduction to Database Systems where he claims that a coloured
>>>>> circle is not a subtype of circle (or vice versa).
>>>
>>>> The tuple that represents the circle is not the same thing as the
>>>> circle itself. I find Date's argument rather unconvincing, to put it
>>>> very mildly. He is by no means an authority in this area, and those
>>>> that are mostly disagree with this position.
>>>
>>> Since when do you find argumentum ad verecundiam convincing? Hmmm?
>>> [peers over rim of eyeglasses]
>>
>> I don't, nor do I think it is without any meaning whatsoever.
>
> What happens when one disagrees on what makes authority. For example, I
> consider Date an authority in this area--your ad hominem
> notwithstanding--because he put considerable tuition into the subject
> over a period of a decade or more. He did so with full knowledge of what
> others before him had to say including Cardelli. Further, he did so with
> an eye to obviating inconsistencies and flaws in those earlier works.
>
> More to the point, I find his arguments convincing.
>
> If you cannot offer a convincing reply to them, can you at least direct
> me to someone who has replied convincingly?

Has Date ever said that a tuple is the same as a circle? That would
surprise me.

Personally, I have a great deal of trust in what he writes. In large
part my reason is that he is pretty thorough about examining the
statements of other authors (not to say 'author' means authority, nor
'academic' for that matter), whereas he's pointed out that a number of
them have declined to correspond with him. His body of writing is
pretty large and I would say it deserves respect and public examination
by all serious academics. I don't always agree or understand what he
says but it is usually thought-provoking. Actually his dogmatic
positions seem rather few, for example I don't think he advocates that
every relational technique must be founded on mathematical principles
that weren't necessarily originated for the purpose of information
theory or a practical dbms.

Many of the people who have publicly argued with him struck me as
hand-wavers.

Can't comment much on type theory (in fact I don't even understand how
this thread crossed into that territory). This is a young field. It
might be useful to remember that many medical 'authorities' of the
18th-century are not remembered that way today.

Gene Wirchenko

unread,

Feb 24, 2010, 5:51:56 PM2/24/10

to

Oooh! You gave a hint! How thoughtful. </sarcasm>

How about making the argument? (If you can not, just say so.)

[peers over rim of eyeglasses] (In my case, this is literal.)
(Bob, do you wear glasses?)

Sincerely,

Gene Wirchenko

Bob Badour

unread,

Feb 24, 2010, 7:47:50 PM2/24/10

to

paul c wrote:

He hasn't. His discussion of circle and ellipse relate to data types not
compositions.

> Personally, I have a great deal of trust in what he writes. In large
> part my reason is that he is pretty thorough about examining the
> statements of other authors (not to say 'author' means authority, nor
> 'academic' for that matter), whereas he's pointed out that a number of
> them have declined to correspond with him. His body of writing is
> pretty large and I would say it deserves respect and public examination
> by all serious academics. I don't always agree or understand what he
> says but it is usually thought-provoking. Actually his dogmatic
> positions seem rather few, for example I don't think he advocates that
> every relational technique must be founded on mathematical principles
> that weren't necessarily originated for the purpose of information
> theory or a practical dbms.
>
> Many of the people who have publicly argued with him struck me as
> hand-wavers.

Indeed.

> Can't comment much on type theory (in fact I don't even understand how
> this thread crossed into that territory). This is a young field. It
> might be useful to remember that many medical 'authorities' of the
> 18th-century are not remembered that way today.

It crossed into the territory when David BL (you inexplicably cut out
the attribution above) introduced the discussion of circles and colored
circles.

paul c

unread,

Feb 24, 2010, 8:09:56 PM2/24/10

to

Bob Badour wrote:
...

> It crossed into the territory when David BL (you inexplicably cut out
> the attribution above) introduced the discussion of circles and colored
> circles.

Sorry, it wasn't on purpose, maybe a little too obsessive about my weak
powers of concentration. I've never been able to handle more than one
or two levels of chevrons. Same goes for parentheses. I think I'm the
only person I ever knew who preferred left-associative grammars and back
when everybody had a hex calculator, nobody who borrowed it ever failed
to return my rpn one, usually it came back immediately.

David BL

unread,

Feb 24, 2010, 8:39:10 PM2/24/10

to

On Feb 24, 5:22 pm, Jan Hidders <hidd...@gmail.com> wrote:
> On 23 feb, 18:08, David BL <davi...@iinet.net.au> wrote:

> > Anyway, I don't believe our analysis is complete, because our
> > definitions don't specify what happens exactly with Keith's "copy"
> > types (which must be used where there is a subtype relationship
> > between two of the attributes in a header).
>
> If he allows copies of copies or subtypes of copies he will run into
> the same type of problems. If he doesn't, then he apparently
> distinguishes copies (which cannot have copies and subtypes) and
> normal types (which can). But that would be IMHO somewhat ad-hoc and a
> sign that these copies are not really types at all.

I wonder whether a subtype of a copy is a subtype of the original.

> > That view of subtyping concerns a possible interpretation of
> > substitutability (of values), but I believe it is better to adhere to
> > the subtype = subset principle for data types. It cannot be said that
> > tuples of the form <a:t1, b:t2, c:t3> represent a subset of tuples of
> > the form <a:t1, b:t2>.
>
> Depends on what you mean by "of the form". In type theory for
> programming languages and databases, if they consider subtyping, the
> usual definition of a tuple being of the type <a:t1, b:t2> is that it
> is a tuple that *at least* contains the fields a and b with values of
> the required types. Under that definition the subtype = subset
> principle is adhered to.

Consider a formalism where

1. a tuple-type T(A) is parameterised by a finite set A of (attribute
name, attribute type) pairs where the names are distinct; and

2. a tuple-value is an element of [[T(A)]] (using your notation) if
it equals a set of (name,value) pairs representing the graph of a
function from the attribute names of A to values of the respective
attribute types.

If the subtype = subset principle is adhered to, then the question of
whether T(A1) <= T(A2) is uniquely determined.

T(A1) <= T(A2) if and only if
exists bijection f : A1 --> A2
such that f(N1,D1) = (N2,D2) implies
N1=N2 and D1 <= D2.

I have doubts whether it is possible under ZFC (which disallows
universal sets such as a set of all sets) to formalise a tuple in such
a way that subtypes occur in the manner you suggest.

If a type <a:t1, b:t2> represents all tuples that at least contains
the fields a and b with values of the required types, then this
involves a universal and won't represent a set under ZFC.

> In case you are interested:
>
> L Cardelli (1984), 'A semantics of multiple inheritance', in:
> Semantics of Data
> Types, LNCS 173, eds. Kahn, MacQueen and Plotkin, Springer Verlag,
> 51-68.
>
> L Cardelli and P Wegner (1985), 'On understanding types, data
> abstraction and
> polymorphism', ACM Computing Surveys, 17(4), 471-521.
>
> > C.Date presents this argument very well in section 20.9 of an
> > Introduction to Database Systems where he claims that a coloured
> > circle is not a subtype of circle (or vice versa).
>
> The tuple that represents the circle is not the same thing as the
> circle itself.

Agreed, but I think the same (informal) reasoning applies to tuples.

> I find Date's argument rather unconvincing, to put it
> very mildly. He is by no means an authority in this area, and those
> that are mostly disagree with this position.

> > I suggest implicit conversions must only be an artefact of the type
> > system (i.e. implicit conversions cannot actually change a value by
> > adding or removing information), and in fact no concept of implicit
> > conversions is required in a typeless formalism.
>
> I would even add to that "or in a typed formalism". If in the
> semantics you need implicit conversion you probably need to rethink
> the semantics of your types.

I was thinking for example that implicit conversion is involved here:

Ellipse e;
e = Circle(1, Point(0,0)); // assignment

Implicit conversion doesn't mean a value is being converted. Instead
it relates to the implicit upcast on the static type of the right hand
side expression. Here, implicit conversion is a concept used in
static type analysis.

David BL

unread,

Feb 24, 2010, 11:26:19 PM2/24/10

to

On Feb 25, 9:39 am, David BL <davi...@iinet.net.au> wrote:
> On Feb 24, 5:22 pm, Jan Hidders <hidd...@gmail.com> wrote:
> > On 23 feb, 18:08, David BL <davi...@iinet.net.au> wrote:

> > > C.Date presents this argument very well in section 20.9 of an
> > > Introduction to Database Systems where he claims that a coloured
> > > circle is not a subtype of circle (or vice versa).
>
> > The tuple that represents the circle is not the same thing as the
> > circle itself.
>
> Agreed, but I think the same (informal) reasoning applies to tuples.

I'll clarify that. Consider the following principle:

Let T1, T2 be two given data types satisfying T1 <= T2.
Let there be a given encoding able to represent each
value of type T2 on a computer. Then that same
encoding is necessarily also able to represent each value
of type T1.

I believe this principle implies both:

not coloured_circle <= circle

and not <a:t1,b:t2,c:t3> <= <a:t1,b:t2>

David BL

unread,

Feb 25, 2010, 1:06:14 AM2/25/10

to

On Feb 25, 9:39 am, David BL <davi...@iinet.net.au> wrote:

> I have doubts whether it is possible under ZFC (which disallows
> universal sets such as a set of all sets) to formalise a tuple in such
> a way that subtypes occur in the manner you suggest.
>
> If a type <a:t1, b:t2> represents all tuples that at least contains
> the fields a and b with values of the required types, then this
> involves a universal and won't represent a set under ZFC.

I'll try to provide more justification of that claim.

It is well known that the class of all sets isn't a set under ZFC.
For a given x, what about the class of all sets that contain x? It
turns out that this is not a set either. Here is a proof:

Suppose exists y such that for all z, x in z --> z in y.

Let R = y union {x}. This must be a set by the axiom of union.

Now x is an element of R, so R must be an element of y (because for
all z, x in z --> z in y).

From R = y union {x} it follows that y is a subset of R.

R is an element of y and y is a subset of R, so R is an element of
R.

R is an element of R contradicts the axiom of regularity (also called
axiom of foundation).

Brian

unread,

Feb 27, 2010, 10:47:48 PM2/27/10

to

Blasphemy! Everything that Date and Darwen write is Gospel, didn't
you know that?

> > I suggest implicit conversions must only be an artefact of the type
> > system (i.e. implicit conversions cannot actually change a value by
> > adding or removing information), and in fact no concept of implicit
> > conversions is required in a typeless formalism.
>
> I would even add to that "or in a typed formalism". If in the
> semantics you need implicit conversion you probably need to rethink
> the semantics of your types.
>

> -- Jan Hidders- Hide quoted text -
>
> - Show quoted text -- Hide quoted text -
>
> - Show quoted text -

David BL

unread,

Mar 1, 2010, 10:50:47 PM3/1/10

to

On Feb 25, 6:51 am, Gene Wirchenko <ge...@ocis.net> wrote:

> >If you are not convinced that's not my problem. I've already hinted at
> >what my counterargument for the specific issue under discussion would
> >be. If you want to debate that, don't let me stop you.
>
> Oooh! You gave a hint! How thoughtful. </sarcasm>

Brevity's hardly a sin so what's really behind the </sarcasm>? To be
frank I'm finding it hard to think of anything that does you credit.

Gene Wirchenko

unread,

Mar 2, 2010, 6:36:32 PM3/2/10

to

Note how I did not explicitly state what was behind my post. You
did not like it, did you? Well, I feel the same about your hint.

Please state your argument instead of making cute remarks about
having given a hint. This is a newsgroup where we discuss database
theory, not a crime scene investigation.

Sincerely,

Gene Wirchenko

David BL

unread,

Mar 2, 2010, 9:03:08 PM3/2/10

to

On Mar 3, 7:36 am, Gene Wirchenko <ge...@ocis.net> wrote:
> On Mon, 1 Mar 2010 19:50:47 -0800 (PST), David BL
>

Firstly you have me confused with someone else. Secondly this is a
news group after all and it's very common for posters to be terse -
e.g. to reference an idea or point of view in the community or
described in the literature. Thirdly Jan provided references to
papers. Fourthly I thought Jan's hint made it pretty clear what he
meant anyway.

It's quite reasonable to ask a poster to elaborate. What I don't
understand is your sarcasm, and I was asking you to elaborate on that
(that's fair isn't it?).

I would have thought that if you were actually interested in more
detail you would choose a different tact than to be offensive (which
invariably has the opposite outcome). It suggests your motives are
far from sincere.

Gene Wirchenko

unread,

Mar 3, 2010, 4:12:24 PM3/3/10

to

On Tue, 2 Mar 2010 18:03:08 -0800 (PST), David BL
<dav...@iinet.net.au> wrote:

[snip]

>I would have thought that if you were actually interested in more
>detail you would choose a different tact than to be offensive (which
>invariably has the opposite outcome). It suggests your motives are
>far from sincere.

Wow! That is two posts in a row where you are claiming my
motives are disreputable. Is this your example of the politeness that
you claim I am not showing? Are you going to go for three?

Sincerely,

Gene Wirchenko