Three Kinds of Logical Trees

Marshall Spight

Jul 15, 2005, 4:17:16 AM

I've been thinking about trees in the abstract lately, and
trying to classify them. I am not talking about trees as
a physical data structure, such as BTrees or Red-Black, but
rather trees as logical data structures. In other words,
the *interface* to a tree.

I've identified three distinct kinds.
"homogeneous tree"
All nodes are the same type, tree has varying structure

"dynamic heterogeneous"
Nodes are varying types, tree has varying structure

"static heterogeneous"
Nodes are varying types, tree has fixed structure.

Examples

homogeneous: org chart. Every node is a person record, but the
structure of the organization may be of whatever form.

dynamic heterogeneous: parse tree. There are specific kinds of
nodes, but the structure of the tree is relatively unconstrained.

static heterogeneous: customer/account/invoice/line item. Each
level in the tree is fixed, with fixed relationships.

Here "dynamic" and "static" refer to the structure of the tree.
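
A minimal sketch of the three kinds as Python type definitions, offered as one possible reading of the classification above. Every class and field name here is an illustrative assumption, not something from the thread.

```python
from dataclasses import dataclass, field
from typing import List, Union


# Homogeneous: a single node type, arbitrary shape (org chart).
@dataclass
class Person:
    name: str
    reports: List["Person"] = field(default_factory=list)


# Dynamic heterogeneous: several node types, arbitrary shape (parse tree).
@dataclass
class Num:
    value: int


@dataclass
class Add:
    left: "Expr"
    right: "Expr"


Expr = Union[Num, Add]


# Static heterogeneous: several node types, one per fixed level.
@dataclass
class LineItem:
    sku: str
    quantity: int


@dataclass
class Invoice:
    number: str
    items: List[LineItem] = field(default_factory=list)


@dataclass
class Account:
    code: str
    invoices: List[Invoice] = field(default_factory=list)


@dataclass
class Customer:
    name: str
    accounts: List[Account] = field(default_factory=list)


def depth(person: Person) -> int:
    """Depth of a homogeneous tree; unbounded because the shape is free."""
    return 1 + max((depth(r) for r in person.reports), default=0)
```

In this reading, only the static heterogeneous types fix the depth in the type definitions themselves: a Customer can never sit below an Invoice, whereas a Person or an Add node can nest to any depth.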

SQL handles the third kind of tree extremely well. The first
two kinds, not so much. In particular, note that the transitive
closure problem applies to the first two kinds of trees, but
not to the third.

The functional programming people have lots of examples in their
books of dynamic heterogeneous trees, and the mixture of union
types and pattern matching as language features seems to handle
these structures quite well. You could probably use the same
techniques to handle homogeneous trees equally well.

Interestingly, I note that static tree nodes have a reference
(fk) to their parent, while the homogeneous and dynamic hetero
types are done with a node having references to its children.
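
The two reference directions carry the same information, which a small sketch can show. The sample data and the helper names `to_children` and `to_parents` are assumptions for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Optional

# Parent-reference form: each entry names its parent, like a foreign key.
parent_of: Dict[str, Optional[str]] = {
    "ceo": None,
    "vp_eng": "ceo",
    "vp_sales": "ceo",
    "dev": "vp_eng",
}


def to_children(parents: Dict[str, Optional[str]]) -> Dict[str, List[str]]:
    """Invert parent references into a node -> children mapping."""
    children: Dict[str, List[str]] = defaultdict(list)
    for node, parent in parents.items():
        children[node]  # touch the key so leaves appear with an empty list
        if parent is not None:
            children[parent].append(node)
    return dict(children)


def to_parents(children: Dict[str, List[str]]) -> Dict[str, Optional[str]]:
    """Invert a node -> children mapping back into parent references."""
    parents: Dict[str, Optional[str]] = {node: None for node in children}
    for node, kids in children.items():
        for kid in kids:
            parents[kid] = node
    return parents


children_of = to_children(parent_of)
```

Converting in either direction is a single pass, so the choice between the two forms is about which access pattern is convenient rather than about what can be represented.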

Comments? Has anyone else written about this sort of thing?

Marshall

len...@kommunicera.umea.se

Jul 16, 2005, 4:24:17 AM

Marshall Spight wrote:

[...]

> Examples
>
> homogeneous: org chart. Every node is a person record, but the
> structure of the organization may be of whatever form.
>

A bit off topic, but IMO an organisation consists of different types of
nodes, say for simplicity Company, Department and Employee. Boss is a
role in the organisation, and the person who has this role is a leaf
in the tree, just like every other person. Given your classification I
would say the org chart belongs to "dynamic heterogeneous"
rather than "homogeneous".

[...]

/Lennart

-CELKO-

Jul 17, 2005, 6:23:30 AM

In Trees and Hierarchies I broke them into fast/slow changing nodes and
fast/slow changing structures. An org chart has slow structural
changes, and higher node (personnel) changes. A message board has fast
structural changes (postings), and very slow node (message) changes.
Etc.

Marshall Spight

Jul 17, 2005, 10:40:33 AM

Interesting. This kind of analysis would be for performance
concerns, yes? And in words two through four of your post,
you're referring to ISBN: B0002Z31P4, right? I've gotta
read that.

Marshall

dawn

Jul 18, 2005, 8:57:38 AM

Marshall Spight wrote:
> I've been thinking about trees in the abstract lately, and
> trying to classify them. I am not talking about trees as
> a physical data structure, such as BTrees or Red-Black, but
> rather trees as logical data structures. In other words,
> the *interface* to a tree.
>
> I've identified three distinct kinds.
> "homogeneous tree"
> All nodes are the same type, tree has varying structure
>
> "dynamic heterogeneous"
> Nodes are varying types, tree has varying structure
>
> "static heterogeneous"
> Nodes are varying types, tree has fixed structure.
>
> Examples
>
> homogeneous: org chart. Every node is a person record, but the
> structure of the organization may be of whatever form.
>
> dynamic heterogeneous: parse tree. There are specific kinds of
> nodes, but the structure of the tree is relatively unconstrained.

Or "often constrained by the grammar of a language"?

> static heterogeneous: customer/account/invoice/line item. Each
> level in the tree is fixed, with fixed relationships.
>
> Here "dynamic" and "static" refer to the structure of the tree.
>
> SQL handles the third kind of tree extremely well.

If you are looking at these trees having metadata as nodes down to values
as leaf nodes, then I would say that SQL handles some subset of such
trees. If a name on a node refers to a multipart value, or a value
that is a tuple of dimension > 1, so that it has two or more non-leaf
nodes as children, then SQL would require that name to refer to a
relation. Also, if a name on a node refers to multiple values, that
name must refer to a relation (not a list, for example).

It might be the case, however, that trees of this type can be converted
into a tree structure that SQL can handle where only metadata is lost
in the conversion. For example, if "Organization" has a child node of
"address" which has child nodes that include "city" and "postcode" then
SQL isn't going to do well with references made to "address". Or am I
misunderstanding?

> The first
> two kinds, not so much. In particular, note that the transitive
> closure problem applies to the first two kinds of trees, but
> not to the third.
>
> The functional programming people have lots of examples in their
> books of dynamic heterogeneous trees, and the mixture of union
> types and pattern matching as language features seems to handle
> these structures quite well. You could probably use the same
> techniques to handle homogeneous trees equally well.
>
> Interestingly, I note that static tree nodes have a reference
> (fk) to their parent, while the homogeneous and dynamic hetero
> types are done with a node having references to its children.

I work with a model that I think fits into your static heterogeneous
category, but where an fk can go either direction (using multivalues).

> Comments? Has anyone else written about this sort of thing?

I find it somewhat interesting and worth pursuing, but this
partitioning doesn't resonate with me. cheers --dawn

>
> Marshall

-CELKO-

Jul 18, 2005, 10:05:21 AM

>> in words two through four of your post.. <<

Yes, it is the book. Sorry my typing is bad right now. I separated
two fighting dogs and my fingers are in band-aids.

dawn

Jul 16, 2005, 11:25:15 AM

Or a simpler example of a tree with an integer on each node.

> dynamic heterogeneous: parse tree. There are specific kinds of
> nodes, but the structure of the tree is relatively unconstrained.

There are people here much more knowledgeable about trees than I (but
that has not stopped me before, eh?) but even with the loose definition,
I wouldn't say "relatively unconstrained" but perhaps "either
unconstrained or constrained by a grammar." The grammar constraint
might not be fully encoded into the tree, due to the complexity of the
grammar.

I'm not so sure. If your tree extends from metadata down to data
values, you might have to add in that less-than-satisfying term "scalar
value" for your leaf nodes. Otherwise, if you have such a structure
and you have a value that is either multipart or multivalued within
this static structure, you still don't have SQL-92 support and most
SQL-DBMS's don't make it easy to have such structures, even if they are
supported in some way.

Interesting topic, but I'm not quite sure about this partitioning as yet.
Are you trying to carve out how the relational model fits within a
tree model by giving a term to such trees? If so and if I understand
your terms, then there are at least NF2 (non-first normal form) models
that would fit your "static" heterogeneous term.
Cheers! --dawn

Marshall Spight

Jul 21, 2005, 11:01:43 AM

dawn wrote:
> Marshall Spight wrote:
> > I've been thinking about trees in the abstract lately, and
> > trying to classify them. I am not talking about trees as
> > a physical data structure, such as BTrees or Red-Black, but
> > rather trees as logical data structures. In other words,
> > the *interface* to a tree.
> >
> > I've identified three distinct kinds.
> > "homogeneous tree"
> > All nodes are the same type, tree has varying structure
> >
> > "dynamic heterogeneous"
> > Nodes are varying types, tree has varying structure
> >
> > "static heterogeneous"
> > Nodes are varying types, tree has fixed structure.
> >
> > Examples
> >
> > homogeneous: org chart. Every node is a person record, but the
> > structure of the organization may be of whatever form.
>
> Or a simpler example of a tree with an integer on each node.

Yes. But I'm not sure how realistic an example that is. Note
that I'm trying to classify *logical* trees, so the structure
of the tree needs to have some meaning apart from just being
a data structure. Usually when we talk about a tree of ints,
we've got some data structure going because we want O(log n) lookup.
That I would call a set, even if the underlying physical structure
is a tree.

> > dynamic heterogeneous: parse tree. There are specific kinds of
> > nodes, but the structure of the tree is relatively unconstrained.
>
> There are people here much more knowledgeable about trees than I (but
> that has not stopped me before, eh?) but even with the loose definition,
> I wouldn't say "relatively unconstrained" but perhaps "either
> unconstrained or constrained by a grammar." The grammar constraint
> might not be fully encoded into the tree, due to the complexity of the
> grammar.

Yes, exactly. In other words, one has, say, 20 different kinds of
nodes; each node type may have a fixed or variable number of children;
each specific child may be constrained to be of a particular type:
a parse tree. I'm looking for a term that captures the fact that
a node has a specific structure. Each node has fixed *local* structure,
but the tree's *overall* structure is not fixed. (Hence "dynamic
heterogeneous.")

> > static heterogeneous: customer/account/invoice/line item. Each
> > level in the tree is fixed, with fixed relationships.
> >
> > Here "dynamic" and "static" refer to the structure of the tree.
> >
> > SQL handles the third kind of tree extremely well.
>
> I'm not so sure. If your tree extends from metadata down to data
> values, you might have to add in that less-than-satisfying term "scalar
> value" for your leaf nodes.

(I am unclear why you use the term "metadata" here. To me that means
the system catalog; one doesn't usually join from the catalog to
user-defined tables, yes?)

I agree "scalar" is an unsatisfying term. I have in mind a better one,
but it requires a type system that's a lot different from SQL's.
(Which I assert is an "opportunity." :-)

> Otherwise, if you have such a structure
> and you have a value that is either multipart or multivalued within
> this static structure, you still don't have SQL-92 support and most
> SQL-DBMS's don't make it easy to have such structures, even if they are
> supported in some way.

I'm not saying that this is necessarily the best way to go, but you
can certainly handle this case in a 1NF way.

I think this kind of structure is what one often ends up with in SQL.
And I do think it works really well. Consider the customer/account/
invoice/invoice line item sort of case. The structure is rigid, but
any of the queries you want to ask against this sort of structure
are quite easy.
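
To make the fixed-depth case concrete, here is a small sketch using Python's built-in `sqlite3` module. The table layout, column names and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE account   (id INTEGER PRIMARY KEY,
                            customer_id INTEGER REFERENCES customer(id));
    CREATE TABLE invoice   (id INTEGER PRIMARY KEY,
                            account_id INTEGER REFERENCES account(id));
    CREATE TABLE line_item (id INTEGER PRIMARY KEY,
                            invoice_id INTEGER REFERENCES invoice(id),
                            amount INTEGER);

    INSERT INTO customer  VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO account   VALUES (10, 1), (20, 2);
    INSERT INTO invoice   VALUES (100, 10), (101, 10), (200, 20);
    INSERT INTO line_item VALUES (1, 100, 50), (2, 100, 25),
                                 (3, 101, 10), (4, 200, 7);
""")

# The depth is known in advance, so three joins reach every level.
totals = conn.execute("""
    SELECT c.name, SUM(li.amount)
    FROM customer c
    JOIN account   a  ON a.customer_id = c.id
    JOIN invoice   i  ON i.account_id  = a.id
    JOIN line_item li ON li.invoice_id = i.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
```

Because each level is its own table, the number of joins is fixed by the schema and no recursion is needed to get from a customer to its line items.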

> Interesting topic, but I not quite sure about this partitioning as yet.
> Are you trying to carve out how the relational model fits within a
> tree model by giving a term to such trees?

Yes!

SQL handles the static heterogeneous case with ease, but really
chokes on the other two cases. The big difference seems to be
the fixed depth vs. variable depth issue. It's impossible to
handle variable depth without something more powerful than the
basic RM; you need something at least as powerful as transitive
closure; recursive queries or (ugh) iteration will also work.
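
For the variable-depth case, a sketch of both options named above: a recursive query (a `WITH RECURSIVE` common table expression, which SQLite supports) and plain iteration from the host language. The `employee` table, its rows and both function names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        manager_id INTEGER REFERENCES employee(id)
    );
    INSERT INTO employee VALUES
        (1, 'ceo', NULL), (2, 'vp', 1), (3, 'lead', 2),
        (4, 'dev', 3), (5, 'cfo', 1);
""")


def reports_of(conn, manager_id):
    """All direct and indirect reports, via a recursive query."""
    rows = conn.execute("""
        WITH RECURSIVE subordinate(id, name) AS (
            SELECT id, name FROM employee WHERE manager_id = ?
            UNION ALL
            SELECT e.id, e.name
            FROM employee e
            JOIN subordinate s ON e.manager_id = s.id
        )
        SELECT name FROM subordinate ORDER BY name
    """, (manager_id,)).fetchall()
    return [name for (name,) in rows]


def reports_of_iterative(conn, manager_id):
    """Same result, one query per level of the tree."""
    found, frontier = [], [manager_id]
    while frontier:
        marks = ",".join("?" * len(frontier))
        rows = conn.execute(
            f"SELECT id, name FROM employee WHERE manager_id IN ({marks})",
            frontier,
        ).fetchall()
        found.extend(name for _, name in rows)
        frontier = [emp_id for emp_id, _ in rows]
    return sorted(found)
```

The iterative version issues one query per level, so the number of round trips depends on the depth of the data; the recursive query leaves that loop to the database.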

So the area I'm exploring is, what kinds of operations do we want
to do on the dynamic tree types, and what smallest bit of power
do we have to add to the RM to handle that well?

> If so and if I understand
> your terms, then there are at least NF2 (non-first normal form) models
> that would fit your "static" heterogeneous term.

The issue for me is not the model so much, as it is what operators
do we need to work on them. Modeling structure is relatively easy;
the hard part is querying, updating, and constraining.

Marshall

dawn

Jul 22, 2005, 10:41:14 AM

hmmm. I'll process, but I figured that putting a person record on each
node when trying to classify logical trees was muddying the example. I
don't think I'm fully tapped into what you are thinking yet, but this
gives me more clues.

>
> > > dynamic heterogeneous: parse tree. There are specific kinds of
> > > nodes, but the structure of the tree is relatively unconstrained.
> >
> > There are people here much more knowledgeable about trees than I (but
> > that has not stopped me before, eh?) but even with the lose definition,
> > I wouldn't say "relatively unconstrained" but perhaps "either
> > unconstrained or constrained by a grammar." The grammar constraint
> > might not be fully encoded into the tree, due to the complexity of the
> > grammar.
>
> Yes, exactly. In other words, one has, say, 20 different kinds of
> nodes; each node type may have a fixed or variable number of children;
> each specific child may be constrained to be of a particular type:
> a parse tree. I'm looking for a term that captures the fact that
> a node has a specific structure. Each node has fixed *local* structure,
> but the tree's *overall* structure is not fixed. (Hence "dynamic
> heterogeneous.")

OK, so the tree structure is not the only variable here -- you are also
looking at the structure within a node. Got it.

>
> > > static heterogeneous: customer/account/invoice/line item. Each
> > > level in the tree is fixed, with fixed relationships.
> > >
> > > Here "dynamic" and "static" refer to the structure of the tree.
> > >
> > > SQL handles the third kind of tree extremely well.
> >
> > I'm not so sure. If your tree extends from metadata down to data
> > values, you might have to add in that less-than-satisfying term "scalar
> > value" for your leaf nodes.
>
> (I am unclear why you use the term "metadata" here.

If you look at a DOM tree for an XML document (or just zero in on an
XHTML document if easier), you see that you move from tag to tag in a
path until you get to a value that has no children. So, you step down
the tree from <html> to <body> to <div> to <p> to a value that is the
text in the paragraph. The leaf nodes have values and others have
metadata.
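
A short sketch of that walk, using Python's `xml.etree.ElementTree` as a stand-in for a DOM API (an assumption; it is not a W3C DOM implementation). The sample document and the helper `leaf_paths` are made up for the example.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<html><body><div><p>Hello</p><p>World</p></div></body></html>"
)


def leaf_paths(element, prefix=""):
    """Yield (path, text) for every element that has no child elements."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        yield path, (element.text or "").strip()
    for child in children:
        yield from leaf_paths(child, path)


paths = list(leaf_paths(doc))
```

Each result pairs the chain of tag names (the "metadata" in the sense used here) with the value found at the end of the path.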

> To me that means
> the system catalog; one doesn't usually join from the catalog to
> user-define tables, yes?)

With the topic of trees, I flipped out of "relational". So, I would
say that it is the case when I work with a tree or graph data structure
(which for me are typically old data structures rather than new) I
might step from metadata, such as the name space (by whatever name --
this might be a data source or schema), down the tree to a file (for
example), to a record based on a key value, to a field, to a data
value. If that data value is the value of a key in another file, then
I might step to a record in another file, then to a field, and then
finally to the target data value. Everything above the leaf value is
really just data about the data I'm after -- it is metadata, although
only the names might be considered such -- the name of the name space,
of the two files and of the two fields.

But the point is that in my mind the data trees do have metadata and
paths through the tree lead to data values.

> I agree "scalar" is an unsatisfying term. I have in mind a better one,
> but it requires a type system that's a lot different from SQL's.
> (Which I assert is an "opportunity." :-)

I'm planning to start, uh, blogging, this Fall and I decided to test
out various tools this past spring, so I have a first entry (which
likely looks like I abandoned the cause) at my not-exactly-perfect web
site (they say that women apologize more than men) about the types I
have at the top of my types tree (under the "type" type) in my ideal
LOGICAL system.

http://tincatgroup.com/mewsings

I'm guessing yours are different?

>
> > Otherwise, if you have such a structure
> > and you have a value that is either multipart or multivalued within
> > this static structure, you still don't have SQL-92 support and most
> > SQL-DBMS's don't make it easy to have such structures, even if they are
> > supported in some way.
>
> I'm not saying that this is necessarily the best way to go, but you
> can certainly handle this case in a 1NF way.

Sure, but that doesn't mean that if you have a tree that matches your
definition then it IS in a SQL-DBMS structure. If so, then, yes,
obviously SQL handles that type of tree well, but if you permit other
instances of such a type then, no, SQL doesn't handle all trees of this
type well, so this partitioning of types of trees did not isolate
information about SQL and trees. Did that make sense?

> I think this kind of structure is what one often ends up with in SQL.
> And I do think it works really well. Consider the customer/account/
> invoice/invoice line item sort of case. The structure is rigid, but
> any of the queries you want to ask against this sort of structure
> are quite easy.

If you are saying that you can put data structures that could be
implemented in a SQL-DBMS into such a structure, OK, but if you are
saying that SQL handles such tree structures well, then NO -- it only
handles the SQL-like flavors well.

>
> > Interesting topic, but I not quite sure about this partitioning as yet.
> > Are you trying to carve out how the relational model fits within a
> > tree model by giving a term to such trees?
>
> Yes!

You have possibly made it to a category that is a superset. It might
be necessary to use such a structure (I'm not quite fully tapped into
it, so I'll hedge on that), but it is not sufficient unless you add a
translator to take a tree with multipart values or multivalue and
explode it (that really is formally the term we use in my neck of the
woods) into a tree that SQL does like.
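
One way such a translator might look, as a rough sketch: multipart values (nested records) become prefixed scalar columns, and a multivalued field is exploded into one row per value. The function names, the `_` separator and the sample record are all assumptions for illustration.

```python
from typing import Any, Dict, List


def flatten_multipart(record: Dict[str, Any], sep: str = "_") -> Dict[str, Any]:
    """Replace nested dicts (multipart values) with prefixed scalar columns."""
    flat: Dict[str, Any] = {}
    for key, value in record.items():
        if isinstance(value, dict):
            for sub_key, sub_value in flatten_multipart(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


def explode(record: Dict[str, Any], field: str) -> List[Dict[str, Any]]:
    """Turn one record with a list-valued field into one row per value."""
    values = record.get(field) or [None]
    return [{**record, field: v} for v in values]


org = {
    "name": "Acme",
    "address": {"city": "Springfield", "postcode": "12345"},
    "phones": ["111", "222"],
}
rows = explode(flatten_multipart(org), "phones")
```

After flattening there is no longer a single thing called "address" to refer to, only `address_city` and `address_postcode`, which is the kind of metadata loss described above.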

> SQL handles the static heterogeneous case with ease,

am I correct that this is only tree for a subset of this type of tree
by your definitions?

> but really
> chokes on the other two cases. The big difference seems to be
> the fixed depth vs. variable depth issue.

Yes, that is an issue, but only one.

> It's impossible to
> handle variable depth without something more powerful than the
> basic RM; you need something at least as powerful as transitive
> closure; recursive queries or (ugh) iteration will also work.

yes & yes and iteration is precisely how I drive from my house to my
parents' house, among other things. How do you think relational
operators are implemented? ;-)

> So the area I'm exploring is, what kinds of operations do we want
> to do on the dynamic tree types, and what smallest bit of power
> do we have to add to the RM to handle that well?

That is precisely the wrong question ;-) if you are talking about the
logical data model IMO. If a human is interacting with the data model,
then that person can apply either the metaphor of a graph (tree or
otherwise) or of sets and benefit from the use of both metaphors,
rather than being restricted by one. If we can model the idea of
"travel" by showing a picture of a person on a bicycle or by showing an
airplane, then how can we extend the picture of the bicycle minimally
so that it gets at the idea of being able to go over water? That
question is as relevant in my mind as the one you asked (OK, that is
almost the case -- I used a metaphor, so it is necessarily flawed and
also limited).

>
> > If so and if I understand
> > your terms, then there are at least NF2 (non-first normal form) models
> > that would fit your "static" heterogeneous term.
>
> The issue for me is not the model so much, as it is what operators
> do we need to work on them.

Give freedom to the functions -- they can be relational operators or
functions to move down a path or whatever is useful.

> Modeling structure is relatively easy;
> the hard part is querying, updating, and constraining.

Yup. I check in on XQuery on occasion just for the entertainment.
smiles. --dawn

>
> Marshall

Marshall Spight

Jul 24, 2005, 5:06:24 PM

dawn wrote:
> Marshall Spight wrote:
>
> hmmm. I'll process, but I figured that putting a person record on each
> node when trying to classify logical trees was muddying the example. I
> don't think I'm fully tapped into what you are thinking yet, but this
> gives me more clues.

I just picked org chart because it shows an example of a complex
structure in which the structure of each node is the same.

> > > > dynamic heterogeneous: parse tree. There are specific kinds of
> > > > nodes, but the structure of the tree is relatively unconstrained.
> > >
> > > There are people here much more knowledgeable about trees than I (but
> > > that has not stopped me before, eh?) but even with the lose definition,
> > > I wouldn't say "relatively unconstrained" but perhaps "either
> > > unconstrained or constrained by a grammar." The grammar constraint
> > > might not be fully encoded into the tree, due to the complexity of the
> > > grammar.
> >
> > Yes, exactly. In other words, one has, say, 20 different kinds of
> > nodes; each node type may have a fixed or variable number of children;
> > each specific child may be constrained to be of a particular type:
> > a parse tree. I'm looking for a term that captures the fact that
> > a node has a specific structure. Each node has fixed *local* structure,
> > but the tree's *overall* structure is not fixed. (Hence "dynamic
> > heterogeneous.")
>
> OK, so the tree structure is not the only variable here -- you are also
> looking at the structure within a node. Got it.

Well, yes, but only insofar as the node structure influences the
overall structure. In other words, a node for + would have two
children, a node for ! would have only one.
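
A minimal sketch of that idea: each node kind fixes its own child count, while the overall shape stays free. The `ARITY` table, the `Node` class and `well_formed` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed operator table: '+' is binary, '!' is unary, 'num' is a leaf.
ARITY = {"+": 2, "!": 1, "num": 0}


@dataclass
class Node:
    kind: str
    children: List["Node"] = field(default_factory=list)


def well_formed(node: Node) -> bool:
    """True when every node has the child count its kind requires."""
    if ARITY.get(node.kind) != len(node.children):
        return False
    return all(well_formed(child) for child in node.children)


good = Node("+", [Node("num"), Node("!", [Node("num")])])
bad = Node("!", [Node("num"), Node("num")])
```

The check is purely local, one node at a time, yet it constrains every tree that can be built, whatever its depth.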

> > (I am unclear why you use the term "metadata" here.
>
> If you look at a DOM tree for an XML document (or just zero in on an
> XHTML document if easier), you see that you move from tag to tag in a
> path until you get to a value that has no children. So, you step down
> the tree from <html> to <body> to <div> to <p> to a value that is the
> text in the paragraph. The leaf nodes have values and others have
> metadata.

That strikes me as a nonstandard definition of the use of metadata,
but no matter.

> > I agree "scalar" is an unsatisfying term. I have in mind a better one,
> > but it requires a type system that's a lot different from SQL's.
> > (Which I assert is an "opportunity." :-)
>
> I'm planning to start, uh, blogging, this Fall and I decided to test
> out various tools this past spring, so I have a first entry (which
> likely looks like I abandoned the cause) at my not-exactly-perfect web
> site (they say that women apologize more than men) about the types I
> have at the top of my types tree (under the "type" type) in my ideal
> LOGICAL system.
>
> http://tincatgroup.com/mewsings

The pun in the URL is awful. I like it!

> I'm guessing yours are different?

Well, that column seemed to be mostly about document management
and not data management. I'm really starting to see them as two
entirely different disciplines, and I believe that most of what's
going on in XML is about document management and not data
management.

I admit that text documents are an important data type, but
it's *just one* datatype among millions. Limiting ourselves
to looking at only a single datatype doesn't put us in a good
position to think about datatypes in general.

> > > Otherwise, if you have such a structure
> > > and you have a value that is either multipart or multivalued within
> > > this static structure, you still don't have SQL-92 support and most
> > > SQL-DBMS's don't make it easy to have such structures, even if they are
> > > supported in some way.
> >
> > I'm not saying that this is necessarily the best way to go, but you
> > can certainly handle this case in a 1NF way.
>
> Sure, but that doesn't mean that if you have a tree that matches your
> definition then it IS in a SQL-DBMS structure. If so, then, yes,
> obviously SQL handles that type of tree well, but if you permit other
> instances of such a type then, no, SQL doesn't handle all trees of this
> type well, so this partitioning of types of trees did not isolate
> information about SQL and trees. Did that make sense?

Sorry, but it didn't. At least not enough to tell you whether I agree
or not.

I'll try rephrasing: If you have data that conceptually matches
what I'm calling "static heterogeneous", (specifically, the data
hierarchy is fixed, as in Customer/Account/Invoice/InvoiceLineItem)
then you will be able to model, query, update, and constrain this
data very well in SQL. In contrast, if you have an org chart,
in which the structure is not fixed, then you will have a harder
time, particularly with querying and constraining, and maybe also
updating.

> If you are saying that you can put data structures that could be
> implemented in a SQL-DBMS into such a structure, OK, but if you are
> saying that SQL handles such tree structures well, then NO -- it only
> handles the SQL-like flavors well.

I don't really get it. *Any* data structure can be implemented in SQL.
I'm trying to get at the question of what specific ones are a good
match for SQL, and I'm saying static heterogeneous is, and the others
aren't.

> > > Interesting topic, but I not quite sure about this partitioning as yet.
> > > Are you trying to carve out how the relational model fits within a
> > > tree model by giving a term to such trees?
> >
> > Yes!
>
> You have possibly made it to a category that is a superset. It might
> be necessary to use such a structure (I'm not quite fully tapped into
> it, so I'll hedge on that), but it is not sufficient unless you add a
> translator to take a tree with multipart values or multivalue and
> explode it (that really is formally the term we use in my neck of the
> woods) into a tree that SQL does like.
>
> > SQL handles the static heterogeneous case with ease,
>
> am I correct that this is only tree for a subset of this type of tree
> by your definitions?

I'm going to assume you mean "true" when you first said "tree", because
otherwise I can't make the sentence parse. If so, then *no*, I'm
saying it's true for all tree structures that match my definition
of static heterogeneous. (And, explicitly, *not* true for the other
two kinds. They're hard to work with in SQL.)

> > but really
> > chokes on the other two cases. The big difference seems to be
> > the fixed depth vs. variable depth issue.
>
> Yes, that is an issue, but only one.

Okay. What are the others, as you see it?

> > It's impossible to
> > handle variable depth without something more powerful than the
> > basic RM; you need something at least as powerful as transitive
> > closure; recursive queries or (ugh) iteration will also work.
>
> yes & yes and iteration is precisely how I drive from my house to my
> parents' house, among other things.

I do not agree. It is just as accurate to say that in going from
point A to B to C to D, you invoke a pure function to transform your
position A to B, which then invokes a function to transform your
position from B to C, which then invokes a function to transform
your position from C to D. If you want to prove to me that the
universe is inherently iterative, you're going to have to point
out to me a loop counter somewhere in the real world. Under a rock,
say.

The fact that so many programmers favor iteration over recursion
is something I consider odd, given that iteration is strictly
less powerful than recursion. It is possible to transform any
iterative algorithm into a recursive one; the reverse is not true.

> How do you think relational operators are implemented? ;-)

Alas, this is irrelevant. We build software in layers, and each
layer is implemented in whatever paradigm it chooses, but this
does not constrain the choice of paradigm of higher layers.

> > So the area I'm exploring is, what kinds of operations do we want
> > to do on the dynamic tree types, and what smallest bit of power
> > do we have to add to the RM to handle that well?
>
> That is precisely the wrong question ;-) if you are talking about the
> logical data model IMO. If a human is interacting with the data model,
> then that person can apply either the metaphor of a graph (tree or
> otherwise) or of sets and benefit from the use of both metaphors,
> rather than being restricted by one.

I would be interested to hear what you consider to be the right
question.

You regularly mention graphs. Do you have some demonstration of
why you consider them a good choice? Maybe a description of some
minimal set of operators and what you can do with them? I am quite
pleased with the fact that the RM has its minimal set of operations
and its correspondence with first order logic; can you describe
anything comparable?

> If we can model the idea of
> "travel" by showing a picture of a person on a bicycle or by showing an
> airplane, then how can we extend the picture of the bicycle minimally
> so that it gets at the idea of being able to go over water? That
> question is as relevant in my mind as the one you asked (OK, that is
> almost the case -- I used a metaphor, so it is necessarily flawed and
> also limited).

Alas, I did not follow this metaphor at all. :-( Could you try to
rephrase your point, perhaps without using a metaphor?

> > > If so and if I understand
> > > your terms, then there are at least NF2 (non-first normal form) models
> > > that would fit your "static" heterogeneous term.
> >
> > The issue for me is not the model so much, as it is what operators
> > do we need to work on them.
>
> Give freedom to the functions -- they can be relational operators or
> functions to move down a path or whatever is useful.
>
> > Modeling structure is relatively easy;
> > the hard part is querying, updating, and constraining.
>
> Yup. I check in on XQuery on occasion just for the entertainment.
> smiles. --dawn

I hear XMLSchema is a laugh riot.

Marshall

dawn

Jul 24, 2005, 7:01:47 PM

Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:

> > > (I am unclear why you use the term "metadata" here.
> >
> > If you look at a DOM tree for an XML document (or just zero in on an
> > XHTML document if easier), you see that you move from tag to tag in a
> > path until you get to a value that has no children. So, you step down
> > the tree from <html> to <body> to <div> to <p> to a value that is the
> > text in the paragraph. The leaf nodes have values and others have
> > metadata.
>
> That strikes me as a nonstandard definition of the use of metadata,
> but no matter.

I used html tags, but these could have been metadata from any
structure, so a path could be <Person> then <lastName> then a value. I
haven't done a lot of work with XML compared to others, but unless I am
misunderstanding something, this would be a common way of shaping
metadata & data from an RDBMS into an xml dom tree.

>
> > > I agree "scalar" is an unsatisfying term. I have in mind a better one,
> > > but it requires a type system that's a lot different from SQL's.
> > > (Which I assert is an "opportunity." :-)
> >
> > I'm planning to start, uh, blogging, this Fall and I decided to test
> > out various tools this past spring, so I have a first entry (which
> > likely looks like I abandoned the cause) at my not-exactly-perfect web
> > site (they say that women apologize more than men) about the types I
> > have at the top of my types tree (under the "type" type) in my ideal
> > LOGICAL system.
> >
> > http://tincatgroup.com/mewsings
>
> The pun in the URL is awful. I like it!

Yes it is and thanks.

>
> > I'm guessing yours are different?
>
> Well, that column seemed to be mostly about document management
> and not data management.

but if you knew me, you would know that I don't talk about document
management, I talk about data management. But at the logical level, we
only care about a value such as 4 if we have some context, semantics.
We model propositions and retrieve the same. I used to say that I just
need "String" and "Binary" as my data types, with other types
inheriting from those. When I type in a 13 as a number it is a string
of two integers, the same as if I type it in as a string. The computer
can do what it wants to store it in a smaller number of 1's & 0's, I
just want to enter strings and get them back. I used the term
"Document" because I am not interfacing with the devices via voice, so
it seems like I want to enter data from a document and get documents
back. I would be OK with the term "Word" or "Sentence Fragment" or
even "String" or even "Page" as this type, but I thought Document fit
well with how a person thinks. Am I working with a computer related to
music files, video files, picture files, document files?

> I'm really starting to see them as two
> entirely different disciplines,

just when they are coming so much closer together. Maybe it is easier
for you to see a comment attribute as a component in a document than
another tag and related value, but at least you can think of data
processed reports as documents, right? Can you then make the leap to
see the input forms as documents too? And what is the interface -- the
input to and the output from, right?

> and I believe that most of what's
> going on in XML is about document management and not data
> management.

I tell people who either a) fear xml or b) worship xml that it is
simply a minor advancement over comma-quote format. If you buy that,
then you should be able to see that xml is all about data. So I
suspect your concerns are with the "management" word. I will grant
that managing data using xml and related documents such as dtd or xsd
is not the same as managing data with an RDBMS. However, from a model
perspective, managing data as trees (and using xlink as other graph
structures) is an alternative to managing data as relations. Then some
of the same data management features as a typical DBMS can be added to
this model.

> I admit that text documents are an important data type, but
> it's *just one* datatype among millions.

It is the type of data that I am entering when I stream in numbers from
a temperature device or type in my name and address again. If you want
to think of it as String instead of Document, feel free. The average
person might not feel as in touch with Strings as with Documents,
however.

> Limiting ourselves
> to looking at only a single datatype doesn't put us in a good
> position to think about datatypes in general.

It is not a limit, but the next level down from "Type" or "Object" as
the type from which all types flow. So, Document is-a Type and Mime
is-a Type. Date is a Document, HTML is-a Document, Integer is-a
Document (replace Document with String and it will sound better to
you). It is only when you look at the interface between the software
and the machine that you would want to think of a number as not being a
type of string. If you look at the input to and the output from the
database, you will see it is all strings of data. Mime types are also
strings, but they differ in that they do not build up to documents, but
to songs or videos or object code or ...

So, all non-binary data that I use when working with a computer are in
terms of strings and I could call all such forms for getting these
strings in and out of the computer "documents".

>
> > > > Otherwise, if you have such a structure
> > > > and you have a value that is either multipart or multivalued within
> > > > this static structure, you still don't have SQL-92 support and most
> > > > SQL-DBMS's don't make it easy to have such structures, even if they are
> > > > supported in some way.
> > >
> > > I'm not saying that this is necessarily the best way to go, but you
> > > can certainly handle this case in a 1NF way.
> >
> > Sure, but that doesn't mean that if you have a tree that matches your
> > definition then it IS in a SQL-DBMS structure. If so, then, yes,
> > obviously SQL handles that type of tree well, but if you permit other
> > instances of such a type then, no, SQL doesn't handle all trees of this
> > type well, so this partitioning of types of trees did not isolate
> > information about SQL and trees. Did that make sense?
>
> Sorry, but it didn't. At least not enough to tell you whether I agree
> or not.

I disagree with your statement that SQL handles a particular type of
tree well because there are trees that match your def of that type that
SQL does not handle well. They could be reshaped into trees that SQL
does handle, but I did not think that was your point.

> I'll try rephrasing: If you have data that conceptually matches
> what I'm calling "static heterogeneous", (specifically, the data
> hierarchy is fixed, as in Customer/Account/Invoice/InvoiceLineItem)
> then you will be able to model, query, update, and constrain this
> data very well in SQL.

I suspect that I disagree, but I might not be understanding what your
tree looks like.

For example, if InvoiceLineItem has an attribute "discount" which has a
value that is a list, such as

10.0
20.0

would your tree look the same as if this attribute were constrained to
be a single value? If the two trees would look the same, then SQL can
handle one and not the other (without first changing it to a different
tree). So, I think you have to get into restricting data types before
you can say that SQL will handle a tree in this shape.

I will grant that I might not be seeing the same tree that you are
seeing. Mine would have a path like this:

namespace:datasource/schema identifier(s) such as a URI or
host:port:datasourcename
|
Customer customerID=12345
|
Account accountNbr=12345-01
|
Invoice invoiceNbr=3828193
|
InvoiceLineItem lineNbr=3
/
InvoiceLineItem discount=10.0
\
and a second child node of the InvoiceLineItem lineNbr=3 would be
discount=20.0

a) Does this meet your criteria for what type of tree it is?
b) Does your tree look similar?
c) Do you agree that SQL doesn't query this tree very well as it gets
hung up on having two values for the discount attribute of the
InvoiceLineItem at lineNbr=3?

> In contrast, if you have an org chart,
> in which the structure is not fixed, then you will have a harder
> time, particularly with querying and constraining, and maybe also
> updating.
>
>
> > If you are saying that you can put data structures that could be
> > implemented in a SQL-DBMS into such a structure, OK, but if you are
> > saying that SQL handles such tree structures well, then NO -- it only
> > handles the SQL-like flavors well.
>
> I don't really get it. *Any* data structure can be implemented in SQL.
> I'm trying to get at the question of what specific ones are a good
> match for sql, and I'm saying static heterogeneous is, and the others
> aren't.

It might simply be that I'm not catching on to your terms here and if
so, I apologize. If I am understanding, then I can agree that you can
say that the trees that SQL handles well are static heterogeneous, but
you cannot say that SQL handles all such trees well.

>
> > > > Interesting topic, but I not quite sure about this partitioning as yet.
> > > > Are you trying to carve out how the relational model fits within a
> > > > tree model by giving a term to such trees?
> > >
> > > Yes!
> >
> > You have possibly made it to a category that is a superset. It might
> > be necessary to use such a structure (I'm not quite fully tapped into
> > it, so I'll hedge on that), but it is not sufficient unless you add a
> > translator to take a tree with multipart values or multivalue and
> > explode it (that really is formally the term we use in my neck of the
> > woods) into a tree that SQL does like.
> >
> > > SQL handles the static heterogeneous case with ease,
> >
> > am I correct that this is only tree for a subset of this type of tree
> > by your definitions?
>
> I'm going to assume you mean "true" when you first said "tree", because
> otherwise I can't make the sentence parse.

tree (I'm using it again as a synonym for "true" -- do you have a
problem with that? I mean, oops, sorry).

> If so, then *no*, I'm
> saying it's true for all tree structures that match my definition
> of static heterogeneous. (And, explicitly, *not* true for the other
> two kinds. They're hard to work with in SQL.)

Do you understand the tree that I am handing you that I think is a
counter-example? Is it one?

>
> > > but really
> > > chokes on the other two cases. The big difference seems to be
> > > the fixed depth vs. variable depth issue.
> >
> > Yes, that is an issue, but only one.
>
> Okay. What are the others, as you see it?

My counter-example of a complex data type (a list in this case), would
be one.

>
> > > It's impossible to
> > > handle variable depth without something more powerful than the
> > > basic RM; you need something at least as powerful as transitive
> > > closure; recursive queries or (ugh) iteration will also work.
> >
> > yes & yes and iteration is precisely how I drive from my house to my
> > parents' house, among other things.
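
A minimal sketch of the variable-depth point above, using Python's built-in sqlite3 module. The `employee` table, its columns, and the sample names are assumptions made up for illustration; they do not come from the thread. A plain join can only reach a fixed number of levels, so collecting every direct and indirect report in an org chart needs a recursive query (one way of getting transitive closure):

```python
import sqlite3

# Hypothetical org chart: every node is the same type (a homogeneous tree)
# and the depth is not fixed. Table and column names are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, boss_id INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Bob", 1), (3, "Cyd", 2), (4, "Dee", 2), (5, "Eve", 1)],
)


def reports_of(conn, boss_id):
    """Return the names of all direct and indirect reports of boss_id."""
    rows = conn.execute(
        """
        WITH RECURSIVE sub(id, name) AS (
            -- Base case: direct reports.
            SELECT id, name FROM employee WHERE boss_id = ?
            UNION ALL
            -- Recursive case: reports of anyone already found.
            SELECT e.id, e.name FROM employee e JOIN sub s ON e.boss_id = s.id
        )
        SELECT name FROM sub ORDER BY name
        """,
        (boss_id,),
    ).fetchall()
    return [name for (name,) in rows]


all_reports = reports_of(conn, 1)   # everyone under Ann, at any depth
bob_reports = reports_of(conn, 2)   # everyone under Bob
```

The recursion terminates here only because the sample data is a tree; a cycle in `boss_id` would need `UNION` instead of `UNION ALL`, or an explicit depth limit.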
>
> I do not agree. It is just as accurate to say that in going from
> point A to B to C to D, you invoke a pure function to transform your
> position A to B, which then invokes a function to transform your
> position from B to C, which then invokes a function to transform
> your position from C to D. If you want to prove to me that the
> universe is inherently iterative, you're going to have to point
> out to me a loop counter somewhere in the real world. Under a rock,
> say.

I have no reason to push that one further. I'd prefer a function that
beams me up to one that requires me to get 1 tank of gas and then
another (2) tank of gas and so on until I reach the destination.

> The fact that so many programmers favor iteration over recursion
> is something I consider odd, given that iteration is strictly
> less powerful than recursion.

Yes, but if you can iterate instead of using recursion, then you don't
set yourself up for a stack overflow, for example. With recursion, you
push the stack with each iteration and don't free up that memory until
you are done with the entire process. If an iterative algorithm is
possible, then I would only use recursion if that algorithm gives
improved performance or increased maintainability (at least that is all
I can think of now).

> It is possible to transform any
> iterative algorithm into a recursive one; the reverse is not true.

Yes indeed.

> > How do you think relational operators are implemented? ;-)
>
> Alas, this is irrelevant. We build software in layers, and each
> layer is implemented in whatever paradigm it chooses, but this
> does not constrain the choice of paradigm of higher layers.

OK. So, some functions can be defined using iteration, but you want to
be done with the iteration in the underlying layers and not in any
software you or I would write. As long as it remains an option, should
I choose to use it instead of playing any games to get around using it,
then I'm OK with that approach.

>
> > > So the area I'm exploring is, what kinds of operations do we want
> > > to do on the dynamic tree types, and what smallest bit of power
> > > do we have to add to the RM to handle that well?
> >
> > That is precisely the wrong question ;-) if you are talking about the
> > logical data model IMO. If a human is interacting with the data model,
> > then that person can apply either the metaphor of a graph (tree or
> > otherwise) or of sets and benefit from the use of both metaphors,
> > rather than being restricted by one.
>
> I would be interested to hear what you consider to be the right
> question.

How about this one --
How can we have an API to data that permits us to view it in the
variety of ways that would be helpful? The answer might include
operators for joining relations as well as for navigating from one row
of data in one relation to another relation, using a foreign key.

> You regularly mention graphs. Do you have some demonstration of
> why you consider them a good choice?

I would love to have some empirical data, but I don't. I do have
anecdotal data and first-hand experience of software development teams
being much more productive and/or requiring many fewer developers when
using a DBMS that employs a tree/graph model along with a set approach,
instead of a SQL-DBMS. I have no data that proves that this is the
case in general, nor that shows that the data model is key to the
productivity gains. I have asked before in this forum what kind of an
experiment could be set up that would demonstrate that. I cannot think
of anything I could set up as an experiment for a small cost where the
outcome could sway anyone to anything related to the choice of a data
model.

I have only my intuition for that, which runs at about .72 (and no, I
have no intuition for whether to play red or black in vegas, so don't
think this implies that I'm right 72% of the time, just 72% of the time
that I am running with an intuition).

> Maybe a description of some
> minimal set of operators and what you can do with them? I am quite
> pleased with the fact that the RM has its minimal set of operations
> and its correspondence with first order logic; can you describe
> anything comparable?

There are some papers along this line that Jan has pointed me to before
(so they are on my stolen computer). Perhaps googling for
functional-data-model or xml-data-model (I know, I know) would yield
results. I'll check at some point, but let me know if you find
something.

>
> > If we can model the idea of
> > "travel" by showing a picture of a person on a bicycle or by showing an
> > airplane, then how can we extend the picture of the bicycle minimally
> > so that it gets at the idea of being able to go over water? That
> > question is as relevant in my mind as the one you asked (OK, that is
> > almost the case -- I used a metaphor, so it is necessarily flawed and
> > also limited).
>
> Alas, I did not follow this metaphor at all. :-( Could you try to
> rephrase your point, perhaps without using a metaphor?

My point is that each metaphor is some partial version of the whole. I
just listened to a CD of a theologian talking about how Jesus used
parables and he said something like "a parable is a metaphor and a
metaphor is a lie". If you know the poem about the blind men and the
elephant, that makes a similar point to mine. We gain something by
working with propositions viewed as relations AND as "webs" (do you
prefer that to "graphs"?) rather than restricting ourselves to one of
these.

>
> > > > If so and if I understand
> > > > your terms, then there are at least NF2 (non-first normal form) models
> > > > that would fit your "static" heterogeneous term.
> > >
> > > The issue for me is not the model so much, as it is what operators
> > > do we need to work on them.
> >
> > Give freedom to the functions -- they can be relational operators or
> > functions to move down a path or whatever is useful.
> >
> > > Modeling structure is relatively easy;
> > > the hard part is querying, updating, and constraining.
> >
> > Yup. I check in on XQuery on occasion just for the entertainment.
> > smiles. --dawn
>
> I hear XMLSchema is a laugh riot.

I'll check it out. smiles. --dawn

Marshall Spight

Jul 24, 2005, 11:06:10 PM7/24/05

dawn wrote:
> Marshall Spight wrote:
> >
> > That strikes me as a nonstandard definition of the use of metadata,
> > but no matter.
>
> I used html tags, but these could have been metadata from any
> structure, so a path could be <Person> then <lastName> then a value. I
> haven't done a lot of work with XML compared to others, but unless I am
> misunderstanding something, this would be a common way of shaping
> metadata & data from an RDBMS into an xml dom tree.

Again, this does not appear to be standard terminology to me.
As I am used to understanding the term, "metadata" means higher-order
data, not simply higher level data. It would not properly be used
to describe data in an enclosing scope. In an xhtml example, for some
text inside <b> tags, metadata would not be anything about the
enclosing <p>, but rather the metadata would be the xhtml dtd.

Metadata is schema information, or type information.

> > > I'm guessing yours are different?
> >
> > Well, that column seemed to be mostly about document management
> > and not data management.
>
> but if you knew me, you would know that I don't talk about document
> management, I talk about data management.

I'm not sure I agree.

> But at the logical level, we
> only care about a value such as 4 if we have some context, semantics.
> We model propositions and retrieve the same. I used to say that I just
> need "String" and "Binary" as my data types, with other types
> inheriting from those. When I type in a 13 as a number it is a string
> of two integers, the same as if I type it in as a string.

Okay, you're headed down a dead end road here. Watch out!

When you type, you are typing characters. There is no way to type
in 13 as a number, not even with ^M. You can only type characters.
When you type '1' and then '3', you have typed two characters.
You don't have a number yet.

> The computer
> can do what it wants to store it in a smaller number of 1's & 0's, I
> just want to enter strings and get them back.

I would be surprised if all you really care about is strings.
If all you have is strings, you can't add, for example. The
fact that some programming languages will implicitly convert
a string to a number in some contexts and add the resulting
numbers is simply a distraction; it does not mean that strings
and numbers are the same thing.
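
A small illustration of the distinction, in Python, which is chosen here only because it does not convert strings to numbers implicitly. The variable names are made up for the example:

```python
# Typing '1' and then '3' yields a two-character string, not a number.
typed = "1" + "3"

# On strings, + means concatenation, and the order of operands matters.
concat_ab = "13" + "4"
concat_ba = "4" + "13"

# Arithmetic needs an explicit conversion to a numeric type first.
sum_ab = int("13") + int("4")
sum_ba = int("4") + int("13")

# Concatenation is not commutative; integer addition is.
concat_commutes = concat_ab == concat_ba
sum_commutes = sum_ab == sum_ba
```

The same `+` symbol names two different operations with different algebraic properties, which is one concrete sense in which a string and a number are different types.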

> I used the term
> "Document" because I am not interfacing with the devices via voice, so
> it seems like I want to enter data from a document and get documents
> back. I would be OK with the term "Word" or "Sentence Fragment" or
> even "String" or even "Page" as this type, but I thought Document fit
> well with how a person thinks. Am I working with a computer related to
> music files, video files, picture files, document files?
>
> > I'm really starting to see them as two
> > entirely different disciplines,
>
> just when they are coming so much closer together.

They aren't, though. It's just that some people with a document
management background are claiming their tools do everything
the data management tools do. But their claims are false.

> Maybe it is easier
> for you to see a comment attribute as a component in a document than
> another tag and related value, but at least you can think of data
> processed reports as documents, right?

Not so much. Mostly I think of reports as result sets. You might
format them as an html table, pretty them up, and embed it in
a page, though. But that's mere presentation.

> Can you then make the leap to
> see the input forms as documents too? And what is the interface -- the
> input to and the output from, right?

The machine interface is, we type characters on the keyboard, and
we see pixels on the screen. (We also specify x,y coordinates and
have a few more buttons, non-character this time, on the mouse.)

Honestly, I consider the html way of talking about the world,
with input forms, hypertext, etc. as a decidedly inferior model
of human-computer interaction than the one available 20 years
ago. I am fond of saying that Tim Berners-Lee got one important
thing right and every other important thing wrong, and in so doing
set software back 20 years. The two bloodiest casualties have
been UI and data management.

> > and I believe that most of what's
> > going on in XML is about document management and not data
> > management.
>
> I tell people who either a) fear xml or b) worship xml that it is
> simply a minor advancement over comma-quote format. If you buy that,
> then you should be able to see that xml is all about data.

XML is all about strings.

> So I
> suspect your concerns are with the "management" word. I will grant
> that managing data using xml and related documents such as dtd or xsd
> is not the same as managing data with an RDBMS. However, from a model
> perspective, managing data as trees (and using xlink as other graph
> structures) is an alternative to managing data as relations. Then some
> of the same data management features as a typical DBMS can be added to
> this model.

Having the tree as the only possible structure is worse than having
the relation as the only possible structure, and I agree with you
that the latter is too limiting. I would also propose that having
the object as the only possible structure is a lose. Further, I
think the random mishmash of stuff that appears in most programming
languages is also not the way to go.

I would propose that the solution would support at least relations,
lists, tuples, and enum types. In fact, that might be a complete set.

Also: having strings as the only datatype is inferior to having
a variety of different types. The perl/tcl/html approach, where
everything is a string and you have a bajillion kinds of implicit
conversions might be fine if your goal is fast-and-loose, but
if you want any kind of discipline, which you certainly do if
you want to manage data, then it's out the door.

> > I admit that text documents are an important data type, but
> > it's *just one* datatype among millions.
>
> It is the type of data that I am entering when I stream in numbers from
> a temperature device or type in my name and address again. If you want
> to think of it as String instead of Document, feel free. The average
> person might not feel as in touch with Strings as with Documents,
> however.

By limiting yourself to how Joe Sixpack thinks about these things
today, you may gain some usability benefit in applications aimed
at the lowest common denominator, but you're not going to discover
anything profound that way. And, If Everyone Did It, (as Mom would
say) the forward progress of mankind would halt.

So I'm not much interested in how the average person might think
about their data.

> > Limiting ourselves
> > to looking at only a single datatype doesn't put us in a good
> > position to think about datatypes in general.
>
> It is not a limit, but the next level down from "Type" or "Object" as
> the type from which all types flow. So, Document is-a Type and Mime
> is-a Type. Date is a Document, HTML is-a Document, Integer is-a
> Document (replace Document with String and it will sound better to
> you).

Integer is most certainly not a document.

Question: if you add two documents together, what is the result?
Is addition of documents commutative?

> It is only when you look at the interface between the software
> and the machine that you would want to think of a number as not being a
> type of string.

Ask a mathematician who has never touched a computer (well, they used
to exist) if a number is a kind of English text.

> If you look at the input to and the output from the
> database, you will see it is all strings of data.

You are confusing things again. I could just as well say it's
all pixels, because we use graphic displays these days. Would
you agree that the job of a dbms is to manage pixels? When you
click the left mouse button, which character is that?

> Mime types are also
> strings, but they differ in that they do not build up to documents, but
> to songs or videos or object code or ...

I don't see how you can consider a video and a text document to be the
same thing.

> I disagree with your statement that SQL handles a particular type of
> tree well because there are trees that match your def of that type that
> SQL does not handle well. They could be reshaped into trees that SQL
> does handle, but I did not think that was your point.
>
> > I'll try rephrasing: If you have data that conceptually matches
> > what I'm calling "static heterogeneous", (specifically, the data
> > hierarchy is fixed, as in Customer/Account/Invoice/InvoiceLineItem)
> > then you will be able to model, query, update, and constrain this
> > data very well in SQL.
>
> I suspect that I disagree, but I might not be understanding what your
> tree looks like.
>
> For example, if InvoiceLineItem has an attribute "discount" which has a
> value that is a list, such as

Ah, now I think I see what you're getting at.

First of all, I completely agree that SQL handles lists poorly.
Ordered data, when the order is not derived from some ordering
function on the data but is implicit, is a weak point for SQL.

However, I was attempting to isolate the question of tree structure;
list data, whether embedded in a tree of whatever kind or just on
its own, is a real issue, but one that I see as being orthogonal
to the tree-structure taxonomy I'm working on.

And I'm sure we can both agree that SQL doesn't do nested relations
or nested anything, really. I am fully signed up to the value of
nested structure, and to the value of lists. So you don't have
to try to convince me on either point. :-)

> a) Does this meet your criteria for what type of tree it is?
> b) Does your tree look similar?
> c) Do you agree that SQL doesn't query this tree very well as it gets
> hung up on having two values for the discount attribute of the
> InvoiceLineItem at lineNbr=3?

Yes, yes, and yes.

The part that SQL handles well, though, is queries across the node
types (or levels), which is the part that is important (at least to
me) in classifying these kinds of trees. SQL has a hard time with
list data or multivalue data, whether it's in a tree or not, so
again I consider it an orthogonal issue.
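
A rough sketch of both halves of that claim, using Python's built-in sqlite3 module. The key values are the ones from the example tree quoted earlier in the thread; the table names, column names, and the `seq` column are assumptions introduced for illustration. The fixed Customer/Account/Invoice/InvoiceLineItem levels map to one table each and are queried with ordinary joins. The two-valued discount does not fit in a single column, so it is first "exploded" into a child table with an explicit position:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- One table per fixed level; each child holds a key to its parent.
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE account (
        account_nbr TEXT PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (customer_id)
    );
    CREATE TABLE invoice (
        invoice_nbr INTEGER PRIMARY KEY,
        account_nbr TEXT REFERENCES account (account_nbr)
    );
    CREATE TABLE line_item (
        invoice_nbr INTEGER REFERENCES invoice (invoice_nbr),
        line_nbr INTEGER,
        PRIMARY KEY (invoice_nbr, line_nbr)
    );
    -- The multivalued discount, moved into its own 1NF table.
    -- seq (an assumed column) stores the list order explicitly.
    CREATE TABLE line_item_discount (
        invoice_nbr INTEGER,
        line_nbr INTEGER,
        seq INTEGER,
        discount REAL,
        PRIMARY KEY (invoice_nbr, line_nbr, seq),
        FOREIGN KEY (invoice_nbr, line_nbr)
            REFERENCES line_item (invoice_nbr, line_nbr)
    );
    INSERT INTO customer VALUES (12345);
    INSERT INTO account VALUES ('12345-01', 12345);
    INSERT INTO invoice VALUES (3828193, '12345-01');
    INSERT INTO line_item VALUES (3828193, 3);
    INSERT INTO line_item_discount VALUES (3828193, 3, 1, 10.0);
    INSERT INTO line_item_discount VALUES (3828193, 3, 2, 20.0);
    """
)

# A query across all levels of the hierarchy: plain joins, no recursion.
rows = conn.execute(
    """
    SELECT c.customer_id, li.line_nbr, d.discount
    FROM customer c
    JOIN account a ON a.customer_id = c.customer_id
    JOIN invoice i ON i.account_nbr = a.account_nbr
    JOIN line_item li ON li.invoice_nbr = i.invoice_nbr
    JOIN line_item_discount d
      ON d.invoice_nbr = li.invoice_nbr AND d.line_nbr = li.line_nbr
    WHERE c.customer_id = ?
    ORDER BY d.seq
    """,
    (12345,),
).fetchall()
```

The joins across levels are routine. The list needed a reshaped tree and a hand-maintained ordering column, which is the part both posters agree SQL handles poorly.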

> It might simply be that I'm not catching on to your terms here and if
> so, I apologize. If I am understanding, then I can agree that you can
> say that the trees that SQL handles well are static heterogeneous, but
> you cannot say that SQL handles all such trees well.

I think we're in agreement at this point.

> > The fact that so many programmers favor iteration over recursion
> > is something I consider odd, given that iteration is strictly
> > less powerful than recursion.
>
> Yes, but if you can iterate instead of using recursion, then you don't
> set yourself up for a stack overflow, for example.

Tail call optimization can make most or all of this issue go away.
(This raises a question for me, which is, is it the case that TCO
can convert any iterative construct into a recursive one that uses
only constant stack space? I think the answer might be yes, because
I think I can see how to write a tail-recursive 'while', and I think
all iterative constructs are just syntax on while.)
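
One way to picture the tail-recursive 'while', sketched in Python. All function names here are hypothetical. CPython does not perform tail call optimization, so the direct version still grows the stack; the trampoline below is a stand-in for what a TCO-capable compiler would do, not a claim about any particular language implementation:

```python
def while_rec(cond, body, state):
    """'while' as recursion: the recursive call is the last action taken."""
    if not cond(state):
        return state
    return while_rec(cond, body, body(state))


def while_step(cond, body, state):
    """Same logic, but return a thunk for the next step instead of calling it."""
    if not cond(state):
        return ("done", state)
    return ("call", lambda: while_step(cond, body, body(state)))


def trampoline(result):
    """Run thunks one at a time, so the stack depth stays constant.

    The driver loop plays the role of the jump that TCO would emit.
    """
    tag, value = result
    while tag == "call":
        tag, value = value()
    return value


def sum_to(n, run):
    """Sum 1..n using a supplied 'while' runner; state is (i, total)."""
    cond = lambda s: s[0] <= n
    body = lambda s: (s[0] + 1, s[1] + s[0])
    return run(cond, body, (1, 0))[1]


# Shallow loop: direct recursion is fine.
small = sum_to(10, while_rec)

# Deep loop: this would exceed CPython's default recursion limit with
# while_rec, but the trampolined version uses constant stack space.
large = sum_to(100_000, lambda c, b, s: trampoline(while_step(c, b, s)))
```

Since every call in `while_rec` is in tail position, nothing from the current frame is needed after the call, which is why constant stack space is possible in principle.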

> With recursion, you
> push the stack with each iteration and don't free up that memory until
> you are done with the entire process. If an iterative algorithm is
> possible, then I would only use recursion if that algorithm gives
> improved performance or increased maintainability (at least that is all
> I can think of now).

If programmers put as much effort into recursion as they put into
iteration, I assert it would provide increased maintainability.

> > It is possible to transform any
> > iterative algorithm into a recursive one; the reverse is not true.
>
> Yes indeed.
>
> > > How do you think relational operators are implemented? ;-)
> >
> > Alas, this is irrelevant. We build software in layers, and each
> > layer is implemented in whatever paradigm it chooses, but this
> > does not constrain the choice of paradigm of higher layers.
>
> OK. So, some functions can be defined using iteration, but you want to
> be done with the iteration in the underlying layers and not in any
> software you or I would write. As long as it remains an option, should
> I choose to use it instead of playing any games to get around using it,
> then I'm OK with that approach.

Ha ha! I think I agree.

> > > > So the area I'm exploring is, what kinds of operations do we want
> > > > to do on the dynamic tree types, and what smallest bit of power
> > > > do we have to add to the RM to handle that well?
> > >

> > > That is precisely the wrong question ;-) [...]

> >
> > I would be interested to hear what you consider to be the right
> > question.
>
> How about this one --
> How can we have an API to data that permits us to view it in the
> variety of ways that would be helpful? The answer might include
> operators for joining relations as well as for navigating from one row
> of data in one relation to another relation, using a foreign key.

Okay, sure. But trying to ask this question for all possible data
structures is too big for me to go after all at once. So right now
I'm focusing just on trees: trying to classify them, and trying
to figure out what kinds of operations I might want to do with them.

The question in the large is: what are all the different logical
data structures, how might we query them, and how might we update
them and constrain them? I think mankind will be working on this
for some time.

> > You regularly mention graphs. Do you have some demonstration of
> > why you consider them a good choice?
>

> I would love to have some empirical data, but I don't. [...]

Whoops! You answered a question that wasn't quite what I meant
to ask. What I meant to ask was, can you show me a simple graph(s)
and some simple operations on them. Do you have a framework for
saying that some particular set of operations on these graphs is
complete, or even just some framework for saying what the possible
graph operators are?

I care about empirical data, too, but when I'm *here* I'm thinking
about theory.

> > Maybe a description of some
> > minimal set of operators and what you can do with them? I am quite
> > pleased with the fact that the RM has its minimal set of operations
> > and its correspondence with first order logic; can you describe
> > anything comparable?
>
> There are some papers along this line that Jan has pointed me to before
> (so they are on my stolen computer). Perhaps googling for
> functional-data-model or xml-data-model (I know, I know) would yield
> results. I'll check at some point, but let me know if you find
> something.

I would suggest that knowing what the graph operators are is something
you ought to have a good grip on if you're going to be advocating for
graphs.

> My point is that each metaphor is some partial version of the whole. I
> just listened to a CD of a theologian talking about how Jesus used
> parables and he said something like "a parable is a metaphor and a
> metaphor is a lie". If you know the poem about the blind men and the
> elephant, that makes a similar point to mine. We gain something by
> working with propositions viewed as relations AND as "webs"
> (do you prefer that to "graphs"?)

I prefer "graphs."

> rather than restricting ourselves to one of these.

But one can also *gain* from restrictions, as well. This point
is often missed. It's why constraints are valuable, and it's
why a minimal formalism is valuable.

Marshall

dawn

Jul 25, 2005, 1:49:09 AM7/25/05

Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:
> > >
> > > That strikes me as a nonstandard definition of the use of metadata,
> > > but no matter.
> >
> > I used html tags, but these could have been metadata from any
> > structure, so a path could be <Person> then <lastName> then a value. I
> > haven't done a lot of work with XML compared to others, but unless I am
> > misunderstanding something, this would be a common way of shaping
> > metadata & data from an RDBMS into an xml dom tree.
>
> Again, this does not appear to be standard terminology to me.
> As I am used to understanding the term, "metadata" means higher-order
> data, not simply higher level data.

If lastName is an attribute (in RM terminology) of Person, then the
names "lastName" and "Person" are both metadata for the Person
relation, right?

> It would not properly be used
> to describe data in an enclosing scope.

I think this is just a representation issue. Since you are talking
about representing data in a tree model, I referred to an xml
representation of the metadata and data rather than a table
representation.

> In an xhtml example, for some
> text inside <b> tags, metadata would not be anything about the
> enclosing <p>, but rather the metadata would be the xhtml dtd.
>
> Metadata is schema information, or type information.

Yes, indeed. Surely it includes the names of relations and header
information such as column names, right?

> > > > I'm guessing yours are different?
> > >
> > > Well, that column seemed to be mostly about document management
> > > and not data management.
> >
> > but if you knew me, you would know that I don't talk about document
> > management, I talk about data management.
>
> I'm not sure I agree.

I guess I can only state it as my intent. I don't mind chatting about
.mp3 data or .rm or .mov or .class data, but I'm particularly
interested in data that comes from language modeled for software
development purposes. This would be Words, or Sentence Fragments, or
Strings, etc, and "Documents" might be too large a concept, but it at
least gives the hint that there would be operators that can parse
individual data values of this type and not only set operators.

>
> > But at the logical level, we
> > only care about a value such as 4 if we have some context, semantics.
> > We model propositions and retrieve the same. I used to say that I just
> > need "String" and "Binary" as my data types, with other types
> > inheriting from those. When I type in a 13 as a number it is a string
> > of two integers, the same as if I type it in as a string.
>
> Okay, you're headed down a dead end road here. Watch out!

I don't think so, but I'll bring my mace just in case.

> When you type, you are typing characters.

Precisely. And without loss of generality we can consider my
"Document" type to be that which can be typed in on a keyboard and read
out loud in a language. So, under Type in my type hierarchy, I have
Strings/Documents and Binary/MIME types. All other types are in the
type tree below these.

> There is no way to type
> in 13 as a number, not even with ^M. You can only type characters.

Yes, yes -- you can only type
characters/strings/words/sentences/language = Documents.

> When you type '1' and then '3', you have typed two characters.
> You don't have a number yet.

I do have a number, but I am only treating it as a string (supertype). A
number is a string with some additional functions. So, once I realize
this string is a number, I can apply numeric functions. But for any
data of type String or Document, I can treat it as a String.

>
> > The computer
> > can do what it wants to store it in a smaller number of 1's & 0's, I
> > just want to enter strings and get them back.
>
> I would be surprised if all you really care about is strings.
> If all you have is strings, you can't add, for example.

13 is in the Integer subclass of String/Document, in my type hierarchy.

> The
> fact that some programming languages will implicitly convert
> a string to a number in some contexts and add the resulting
> numbers is simply a distraction; it does not mean that strings
> and numbers are the same thing.

The representation of a number, like the representation of a word, is a
String, and that is what we are working with in our software
applications. An Integer is-a String.

>
> > I used the term
> > "Document" because I am not interfacing with the devices via voice, so
> > it seems like I want to enter data from a document and get documents
> > back. I would be OK with the term "Word" or "Sentence Fragment" or
> > even "String" or even "Page" as this type, but I thought Document fit
> > well with how a person thinks. Am I working with a computer related to
> > music files, video files, picture files, document files?
> >
> > > I'm really starting to see them as two
> > > entirely different disciplines,
> >
> > just when they are coming so much closer together.
>
> They aren't, though. It's just that some people with a document
> management background are claiming their tools do everything
> the data management tools do. But their claims are false.

I come from the data side of the house, but respect the fact that if
you start marking up propositions in a consistent way, you can have a
representation of structured data that moves the document in the
direction of a database.

>
> > Maybe it is easier
> > for you to see a comment attribute as a component in a document than
> > another tag and related value, but at least you can think of data
> > processed reports as documents, right?
>
> Not so much. Mostly I think of reports as result sets.

If I ask you to e-mail me one of those result sets, never mind, I know
you can see it as a document if you want to.

> You might
> format them as an html table, pretty them up, and embed it in
> a page, though. But that's mere presentation.

Exactly. It is not the data, but it is how we can perceive the data --
one possible representation and the one that we use as humans both for
entering data directly and for getting it back from the computer.

>
> > Can you then make the leap to
> > see the input forms as documents too? And what is the interface -- the
> > input to and the output from, right?
>
> The machine interface is, we type characters on the keyboard, and
> we see pixels on the screen. (We also specify x,y coordinates and
> have a few more buttons, non-character this time, on the mouse.)
>
> Honestly, I consider the html way of talking about the world,
> with input forms, hypertext, etc. as a decidedly inferior model
> of human-computer interaction than the one available 20 years
> ago.

Along with way-cool, hip languages and tools, I also work with database
products that date back 40 years (I have a paper from 1965 I want to
scan in). Every piece of data is a string, but you can also define the
string as a number and apply numeric functions if you want to. I
mention xml instead only because more people know that lingo. The
model is very similar between the 40-year-old system (definitely about
data) as it evolved and the document model from the xml folks. It is
as if flared hip-hugger jeans were back in style. And if the document
folks want to take the credit for this new hip idea, it's ok by me.

> I am fond of saying that Tim Berners-Lee got one important
> thing right and every other important thing wrong, and in so doing
> set software back 20 years. The two bloodiest casualties have
> been UI and data management.

Codd got some things right, Berners-Lee got some right, and I'm with
those who are suggesting we put the chocolate and the peanut butter
together.

>
> > > and I believe that most of what's
> > > going on in XML is about document management and not data
> > > management.
> >
> > I tell people who either a) fear xml or b) worship xml that it is
> > simply a minor advancement over comma-quote format. If you buy that,
> > then you should be able to see that xml is all about data.
>
> XML is all about strings.

now we are getting somewhere. Some of those strings are numbers, some
are dates, right?

>
<snip>

> I would propose that the solution would support at least relations,
> lists, tuples, and enum types. In fact, that might be a complete set.
>
> Also: having strings as the only

not only!! It is the ANCESTOR of all non-binary, language data.

> datatype is inferior to having
> a variety of different types. The perl/tcl/html approach, where
> everything is a string and you have a bajillion kinds of implicit
> conversions might be fine if your goal is fast-and-loose,

I've never been called that! But maybe we can get bigger bang for the
buck s/w development with what you termed fast-and-loose.

> but
> if you want any kind of discipline,

and I definitely do

> which you certainly do if
> you want to manage data, then it's out the door.

My type hierarchy has booleans, ints, etc, but these have an ancestor
of String.

>
> > > I admit that text documents are an important data type, but
> > > it's *just one* datatype among millions.
> >
> > It is the type of data that I am entering when I stream in numbers from
> > a temperature device or type in my name and address again. If you want
> > to think of it as String instead of Document, feel free. The average
> > person might not feel as in touch with Strings as with Documents,
> > however.
>
> By limiting yourself to how Joe Sixpack thinks about these things
> today, you may gain some usability benefit in applications aimed
> at the lowest common denominator,

I'm aiming for a data model that works for humans and related software
that does likewise.

> but you're not going to discover
> anything profound that way.

Profound is not my goal - I'm looking for ease and low cost in software
development and maintenance.

> And, If Everyone Did It, (as Mom would
> say) the forward progress of mankind would halt.

A logical data model is about the interface between humans and the
machine, even if those humans are s/w developers. It would be progress
to have the computer meet the human closer to where the human lives.

> So I'm not much interested in how the average person might think
> about their data.

I'm not (currently) aiming for a non-professional to prepare a data
model or write software here. But the average programmer is an average
person. I want to let the computer do the work of shaping data the way
a computer needs to see it and make the API between person and
representation of a data model more like the human intuitively
perceives it. This will (does, from my experience) improve the ability
of the s/w developer to write and maintain big-bang-for-the-buck
software.

> > > Limiting ourselves
> > > to looking at only a single datatype doesn't put us in a good
> > > position to think about datatypes in general.
> >
> > It is not a limit, but the next level down from "Type" or "Object" as
> > the type from which all types flow. So, Document is-a Type and Mime
> > is-a Type. Date is a Document, HTML is-a Document, Integer is-a
> > Document (replace Document with String and it will sound better to
> > you).
>
> Integer is most certainly not a document.

in the interface between me and the computer, I only pass integers as
strings, in document formats, typically with some metadata about the
integer visible somewhere. But if you prefer, Integer is-a String.

> Question: if you add two documents together, what is the result?
> Is addition of documents commutative?

Replace "document" with "string" and "add" with "concatenate". Addition
is not a function of a String. It shows up down the type hierarchy for
types that are numeric.
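
A minimal sketch of the hierarchy described here, written in Python only for illustration. The class and method names (`Str`, `Integer`, `concat`, `add`) are made up for this example and do not come from any existing system; the only point is that concatenation lives at the top and addition appears further down.

```python
class Str:
    """Root of the language-data types: anything that can be typed in."""

    def __init__(self, text):
        self.text = text

    def concat(self, other):
        # Concatenation is available on every Str, numeric or not.
        return Str(self.text + other.text)


class Integer(Str):
    """A Str whose text happens to spell a whole number."""

    def __init__(self, text):
        int(text)  # raises ValueError if the text is not an integer literal
        super().__init__(text)

    def add(self, other):
        # Addition only appears at this level of the hierarchy.
        return Integer(str(int(self.text) + int(other.text)))
```

With these definitions, `Integer("13").add(Integer("4")).text` is `"17"`, `Integer("13").concat(Integer("4")).text` is `"134"`, and a plain `Str("brown")` has no `add` at all.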

> > It is only when you look at the interface between the software
> > and the machine that you would want to think of a number as not being a
> > type of string.
>
> Ask a mathematician who has never touched a computer (well, they used
> to exist) if a number is a kind of English text.

For that mathematician, we could say that the number, such as 1, is
something they can hear as a word.

>
> > If you look at the input to and the output from the
> > database, you will see it is all strings of data.
>
> You are confusing things again. I could just as well say it's
> all pixels, because we use graphic displays these days. Would
> you agree that the job of a dbms is to manage pixels?

Nope. That is the representation of my document, words, sentences,
strings or language in the computer. I am not using the term document
as a single representation.

> When you
> click the left mouse button, which character is that?

We definitely aren't hitting each other right on this one.

>
> > Mime types are also
> > strings, but they differ in that they do not build up to documents, but
> > to songs or videos or object code or ...
>
> I don't see how you can consider a video and a text document to be the
> same thing.

I don't, I don't!! That is why the type hierarchy has two types:
Binary, whose subtypes I'd like to all be Mime types (otherwise I could
just call them all Binary), and the String language types, which I'm
calling Documents.

Unless I am misunderstanding, a counter-example to that statement would
be a tree where instead of a list type, we have a name for a
two-attribute value in our tree. So, in our relation, we have
attribute A and attribute B and in our tree, we have names for A, B and
also the name C for A and B together (a COBOL FD for a VSAM file just
popped into my brain as clear as day, yikes). This tree with C having
children of A and B doesn't look like a SQL-happy tree.

> SQL has a hard time with
> list data or multivalue data, whether it's in a tree or not, so
> again I consider it an orthogonal issue.
>
>
> > It might simply be that I'm not catching on to your terms here and if
> > so, I apologize. If I am understanding, then I can agree that you can
> > say that the trees that SQL handles well are static heterogeneous, but
> > you cannot say that SQL handles all such trees well.
>
> I think we're in agreement at this point.
>
>
> > > The fact that so many programmers favor iteration over recursion
> > > is something I consider odd, given that iteration is strictly
> > > less powerful than recursion.
> >
> > Yes, but if you can iterate instead of using recursion, then you don't
> > set yourself up for a stack overflow, for example.
>
> Tail call optimization can make most or all of this issue go away.
> (This raises a question for me, which is, is it the case that TCO
> can convert any iterative construct into a recursive one that uses
> only constant stack space? I think the answer might be yes, because
> I think I can see how to write a tail-recursive 'while', and I think
> all iterative constructs are just syntax on while.)

Over my head on that one -- TCO to me is only total-cost-of-ownership.
Are you saying that if I write a recursive method in Java then the
compiler has or might have a feature that mitigates this or that the
run-time environment would have this feature? Have I unnecessarily
dragged along a concern that I could have dropped long ago? Did I miss
a memo that everyone else got that said not to worry about recursion
eating memory?

>
> > With recursion, you
> > push the stack with each iteration and don't free up that memory until
> > you are done with the entire process. If an iterative algorithm is
> > possible, then I would only use recursion if that algorithm gives
> > improved performance or increased maintainability (at least that is all
> > I can think of now).
>
> If programmers put as much effort into recursion as they put into
> iteration, I assert it would provide increased maintainability.

I'll keep that in mind. If you have any "instead of doing this common
iteration, try it as this recursion" examples, pass them along, even if
OT.

I think it is done. They named it XQuery.

But seriously, while it isn't perfect by any stretch (I believe I
called it "dog ugly" in this forum), XQuery (with the update
capabilities too) does give me some hope. Unlike SQL, I think I could
build the api I would want to have on top of it.

<snip>

> > > Maybe a description of some
> > > minimal set of operators and what you can do with them? I am quite
> > > pleased with the fact that the RM has its minimal set of operations
> > > and its correspondence with first order logic; can you describe
> > > anything comparable?
> >
> > There are some papers along this line that Jan has pointed me to before
> > (so they are on my stolen computer). Perhaps googling for
> > functional-data-model or xml-data-model (I know, I know) would yield
> > results. I'll check at some point, but let me know if you find
> > something.
>
> I would suggest that knowing what the graph operators are is something
> you ought to have a good grip on if you're going to be advocating for
> graphs.

I'm just a practitioner dabbling in theory (and, worse yet, maybe just
a s/w dev manager dabbling in practice) but if I were told I had to
enumerate the graph functions I use, I would look up the xpath
functions on w3.org and start with those.

>
> > My point is that each metaphor is some partial version of the whole. I
> > just listened to a CD of a theologian talking about how Jesus used
> > parables and he said something like "a parable is a metaphor and a
> > metaphor is a lie". If you know the poem about the blind men and the
> > elephant, that makes a similar point to mine. We gain something by
> > working with propositions viewed as relations AND as "webs"
> > (do you prefer that to "graphs"?)
>
> I prefer "graphs."

of course I knew that ;-)

> > rather than restricting ourselves to one of these.
>
> But one can also *gain* from restrictions, as well. This point
> is often missed. It's why constraints are valuable, and it's
> why a minimal formalism is valuable.

I agree that such are valuable. I do have a big problem with the way
we handle constraints, however, as I've mentioned in the past. The
minimal formalism is good for theory and for a maintainable
implementation under the covers, but not necessarily the best api.

cheers! --dawn

Marshall Spight

Jul 25, 2005, 11:01:11 AM

dawn wrote:
> Marshall Spight wrote:
> > It would not properly be used
> > to describe data in an enclosing scope.
>
> I think this is just a representation issue. Since you are talking
> about representing data in a tree model, I referred to an xml
> representation of the metadata and data rather than a table
> representation.

It seems to me that what's happening here is that because xml
spews attribute names all through its file format, you're
considering this as normal. But that pretty much only happens
with xml.

> > Metadata is schema information, or type information.
>
> Yes, indeed. Surely it includes the names of relations and header
> information such as column names, right?

Yes.

> > > But at the logical level, we
> > > only care about a value such as 4 if we have some context, semantics.
> > > We model propositions and retrieve the same. I used to say that I just
> > > need "String" and "Binary" as my data types, with other types
> > > inheriting from those. When I type in a 13 as a number it is a string
> > > of two integers, the same as if I type it in as a string.
> >
> > Okay, you're headed down a dead end road here. Watch out!
>
> I don't think so, but I'll bring my mace just in case.

What, are you hawkgirl now? :-)

> > When you type '1' and then '3', you have typed two characters.
> > You don't have a number yet.
>
> I do have a number, but I am only treating it as a string (supertype).

You're conflating lexical and semantic issues. This is that dead
end I warned you about, and I fear you've driven into it at 60 mph.

> A number is a string with some additional functions. So, once I realize
> this string is a number, I can apply numeric functions. But for any
> data of type String or Document, I can treat it as a String.

It's certainly possible to build a system like this, but I wouldn't
want to use it. For one thing, it throws static typing out the window.

> > The
> > fact that some programming languages will implicitly convert
> > a string to a number in some contexts and add the resulting
> > numbers is simply a distraction; it does not mean that strings
> > and numbers are the same thing.
>
> The representation of a number, like the representation of a word, is a
> String, and that is what we are working with in our software
> applications. An Integer is-a String.

Again the pushing together of lexical and semantic issues.
The software that I write, in statically typed languages,
does not consider integer to be a subtype of string. Your
source code isn't your program, any more than the word "water"
is wet.

> I come from the data side of the house, but respect the fact that if
> you start marking up propositions in a consistent way, you can have a
> representation of structured data that moves the document in the
> direction of a database.

These all-string representations are intended for reading, not for
processing. Human reading is an important application of code and
data, but it's certainly not the only one.

> > I am fond of saying that Tim Berners-Lee got one important
> > thing right and every other important thing wrong, and in so doing
> > set software back 20 years. The two bloodiest casualties have
> > been UI and data management.
>
> Codd got some things right, Berners-Lee got some right, and I'm with
> those who are suggesting we put the chocolate and the peanut butter
> together.

Except Berners-Lee got so many things wrong, and incompatibly wrong
with so many right things. In essence, he did one tiny valuable
thing, which is put a GUI on a slightly updated FTP.

> > XML is all about strings.
>
> now we are getting somewhere. Some of those strings are numbers, some
> are dates, right?

No. Lexically, all my source code is a string; semantically, it
is many different types. A Java source file is one big string,
but there are ints and classes and so forth in there that, when
you execute the program, are not strings at all. You could make
a system where they were strings at runtime as well, but such
a system would lose many valuable properties that Java has,
such as the ability to do static analysis, both by the human
and by the computer, for both correctness and performance reasons.
This price is too high for me; static analysis is one of the
most powerful tools available.
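
To make the lexical/semantic split concrete, here is a small sketch in Python (chosen only for brevity; `parse_quantity` is a hypothetical name). The typed text and the integer it denotes support different operations, and two different strings can denote the same integer.

```python
def parse_quantity(typed: str) -> int:
    """Cross the boundary once: from what was typed to what was meant."""
    return int(typed)


typed = "13"                      # lexical: two characters
quantity = parse_quantity(typed)  # semantic: one integer

assert typed + typed == "1313"               # string semantics: concatenation
assert quantity + quantity == 26             # integer semantics: addition
assert "013" != typed                        # two distinct strings ...
assert parse_quantity("013") == quantity     # ... one and the same integer
```

Python checks these things at runtime rather than statically, so this only illustrates the distinction between representation and value, not the static analysis itself.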

> > datatype is inferior to having
> > a variety of different types. The perl/tcl/html approach, where
> > everything is a string and you have a bajillion kinds of implicit
> > conversions might be fine if your goal is fast-and-loose,
>
> I've never been called that! But maybe we can get bigger bang for the
> buck s/w development with what you termed fast-and-loose.

Sure; for prototyping and other small-scale applications. But not
for data management applications in which the cost of corruption
is high.

> > but if you want any kind of discipline,
>
> and I definitely do
>
> > which you certainly do if
> > you want to manage data, then it's out the door.
>
> My type hierarchy has booleans, ints, etc, but these have an ancestor
> of String.

So, how do two strings sort: lexicographically or numerically? I guess
it depends on whether they are also numbers, right? When you sort a
list of strings, some of which are numbers, which compare function do
you use? Or does it vary depending on which strings you're looking at?
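
A quick sketch of why the question matters, in Python for concreteness. The sample data and the `mixed_key` policy are invented for this example; the policy shown (numbers first, in numeric order, then everything else) is just one arbitrary choice among several.

```python
values = ["10", "9", "2", "apple", "100"]

# Plain string comparison: character by character.
lexicographic = sorted(values)

# Numeric comparison, which only makes sense for the strings that spell numbers.
numeric_only = sorted((v for v in values if v.isdecimal()), key=int)


def mixed_key(v):
    # One possible policy for a mixed list: numbers first (by value),
    # then non-numbers (by string comparison).
    return (0, int(v), "") if v.isdecimal() else (1, 0, v)


mixed = sorted(values, key=mixed_key)

assert lexicographic == ["10", "100", "2", "9", "apple"]
assert numeric_only == ["2", "9", "10", "100"]
assert mixed == ["2", "9", "10", "100", "apple"]
```

The same five strings come out in different orders depending on which comparison is applied, and the mixed case needs a rule that neither plain strings nor plain numbers supply.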

Since int <: string, (<: means "is a subtype of") then I presume that
int has all the string methods available? So I can have a variable
of type int, with an int value in it (which is also a string) and
invoke a method to prepend a ~ character to the string, right? Now
my variable declared to be int does not contain an int value anymore.
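
Here is roughly what that looks like if you actually build int <: string, sketched in Python. `IntString`, `plus` and `prepend_tilde` are hypothetical names for this example. Python strings are immutable, so instead of a variable changing underneath you, the inherited string operation hands back a value that has silently left the subtype.

```python
class IntString(str):
    """Hypothetical 'int <: string': a str that must spell an integer."""

    def __new__(cls, text):
        int(text)  # reject anything that is not an integer literal
        return super().__new__(cls, text)

    def plus(self, other):
        # Numeric operation available only on the subtype.
        return IntString(str(int(self) + int(other)))


def prepend_tilde(s: str) -> str:
    # An ordinary string operation; it accepts any str, including IntString.
    return "~" + s


n = IntString("13")
assert n.plus(IntString("4")) == "17"

t = prepend_tilde(n)       # allowed, because n is-a str
assert t == "~13"
assert type(t) is str      # no longer an IntString
assert not hasattr(t, "plus")
```

Every inherited string operation is a potential exit from the integer subtype, so either each one has to be re-checked or the "is-a" relationship does not really hold.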

> > By limiting yourself to how Joe Sixpack thinks about these things
> > today, you may gain some usability benefit in applications aimed
> > at the lowest common denominator,
>
> I'm aiming for a data model that works for humans and related software
> that does likewise.
>
> > but you're not going to discover
> > anything profound that way.
>
> Profound is not my goal - I'm looking for ease and low cost in software
> development and maintenance.

Yes, but you're trying to do it by dumbing things down. I don't think
that's going to work. I think what's needed is to smarten things up.
Not complicate them, mind you; make them smart and simple.

> > And, If Everyone Did It, (as Mom would
> > say) the forward progress of mankind would halt.
>
> A logical data model is about the interface between humans and the
> machine, even if those humans are s/w developers. It would be progress
> to have the computer meet the human closer to where the human lives.

> [...] I want to let the computer do the work of shaping data the way
> a computer needs to see it and make the API between person and
> representation of a data model more like the human intuitively
> perceives it.

I'm imagining you and some mathematician about a millennium ago, sitting
in an ivory tower. The guy comes to you and says, "I'm thinking about
this idea for a new number, which I call 'zee-row.' It represents the
absence of a number. You could use it as the result of some operations
that are currently considered illegal, like subtracting X from X."
And you'd say, "But that's not how the average person intuitively
perceives subtraction. Let's not pursue that approach; let's do
something more user-friendly."

> > Integer is most certainly not a document.
>
> in the interface between me and the computer, I only pass integers as
> strings, in document formats, typically with some metadata about the
> integer visible somewhere. But if you prefer, Integer is-a String.

And once you type the integer in, you can just forget about it?
No; the human is also in charge of the integer as it moves around
the computer, across function calls, across the network, into the
database, etc. And to manage this process effectively, he needs
a strong suite of tools, chief among them a type system, static
analysis, and a way to structure and constrain data.

It's not the case that the only thing that matters to the human
is the point of interface between the computer and the human.

> > > It is only when you look at the interface between the software
> > > and the machine that you would want to think of a number as not being a
> > > type of string.

Looking at this paragraph again, this is exactly what I disagree with.

> > > If you look at the input to and the output from the
> > > database, you will see it is all strings of data.
> >
> > You are confusing things again. I could just as well say it's
> > all pixels, because we use graphic displays these days. Would
> > you agree that the job of a dbms is to manage pixels?
>
> Nope. That is the representation of my document, words, sentences,
> strings or language in the computer. I am not using the term document
> as a single representation.

Correct. Pixels are a mere representation of strings. And in *exactly
the same way* strings are a mere representation of your integers.

Okay, I thought I said it pretty well the first time, but I'll try
again: in thinking about trees, I'm trying *just* to think about
trees. Anything that's also a problem when you have exactly one
level will of course also be a problem when you have a multi-level
tree. Solve the one-level case and you've solved the multi-level
case, assuming you handle multi-level data. So I don't consider
SQL's null problems to be a tree issue, even though those problems
*also* show up when you're thinking about trees.

I also said:

> > SQL has a hard time with
> > list data or multivalue data, whether it's in a tree or not, so
> > again I consider it an orthogonal issue.

which puts it fairly well, I think.

> > > > The fact that so many programmers favor iteration over recursion
> > > > is something I consider odd, given that iteration is strictly
> > > > less powerful than recursion.
> > >
> > > Yes, but if you can iterate instead of using recursion, then you don't
> > > set yourself up for a stack overflow, for example.
> >
> > Tail call optimization can make most or all of this issue go away.
> > (This raises a question for me, which is, is it the case that TCO
> > can convert any iterative construct into a recursive one that uses
> > only constant stack space? I think the answer might be yes, because
> > I think I can see how to write a tail-recursive 'while', and I think
> > all iterative constructs are just syntax on while.)
>
> Over my head on that one -- TCO to me is only total-cost-of-ownership.

My fault; I went from the term ("tail call optimization") to the
abbreviation ("TCO") too abruptly.

> Are you saying that if I write a recursive method in Java then the
> compiler has or might have a feature that mitigates this or that the
> run-time environment would have this feature?

The Java compiler probably won't, but it might. The Scheme compiler
is required to. Since Java is chock-full of iterative constructs,
it's not much of an issue; no one uses recursion much.

> Have I unnecessarily
> dragged along a concern that I could have dropped long ago?

Uh, yes.

> Did I miss
> a memo that everyone else got that said not to worry about recursion
> eating memory?

Well, you still have to worry if your language's implementation doesn't
have TCO. But it's not a *fundamental* problem. You also have to worry
if your recursive method isn't tail-recursive, but I'm proposing that
a recursive translation of an iterative algorithm can be necessarily
tail recursive. I'll still have to check on that.
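
A sketch of the tail-recursive 'while' idea, written in Python only because it is easy to read. `while_loop` and `sum_to` are names made up for this example. Note the caveat: CPython does not eliminate tail calls, so this shows the shape of the translation, not the constant-stack behaviour you would get from an implementation with TCO such as Scheme.

```python
def while_loop(cond, body, state):
    """Tail-recursive 'while': loop state is passed along as an argument."""
    if not cond(state):
        return state
    # The recursive call is the very last thing done (a tail call),
    # so nothing in this frame is needed once it starts.
    return while_loop(cond, body, body(state))


def sum_to(n):
    """Sum 1..n, written as a 'while' over the state (i, total)."""
    final = while_loop(
        lambda s: s[0] <= n,                # while i <= n:
        lambda s: (s[0] + 1, s[1] + s[0]),  #     total += i; i += 1
        (1, 0),
    )
    return final[1]


assert sum_to(10) == 55
```

Everything the loop would have mutated becomes part of `state`, which is what makes the call a tail call. In CPython a large enough `n` still hits the recursion limit, which is exactly the implementation issue rather than a fundamental one.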

> > If programmers put as much effort into recursion as they put into
> > iteration, I assert it would provide increased maintainability.
>
> I'll keep that in mind. If you have any "instead of doing this common
> iteration, try it as this recursion" examples, pass them along, even if
> OT.

Of course, my background is about 98% iteration. I work for a living
which means I've had to use C++ or Java for most of the time. (Before
that it was C and Fortran. :-)

As for cool examples, check out quicksort in Haskell:
http://www.haskell.org/aboutHaskell.html

Blew my mind the first time I saw it.
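
For anyone without the page handy, the usual functional formulation of quicksort looks roughly like this. It is rendered in Python here, and it is a sketch of the general shape, not a copy of the code at the link; `qsort` is just a name chosen for the example.

```python
def qsort(xs):
    """Quicksort as a recursive definition: no loops, no in-place swaps."""
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    smaller = [x for x in rest if x < pivot]
    larger_or_equal = [x for x in rest if x >= pivot]
    return qsort(smaller) + [pivot] + qsort(larger_or_equal)


assert qsort([3, 1, 4, 1, 5, 9, 2, 6]) == [1, 1, 2, 3, 4, 5, 6, 9]
```

The definition reads almost like the specification of the algorithm. Note that the two recursive calls here are not tail calls, so this is an example of recursion for clarity, not of constant stack usage.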

> > The question in the large is: what are all the different logical
> > data structures, how might we query them, and how might we update
> > them and constrain them? I think mankind will be working on this
> > for some time.
>
> I think it is done. They named it XQuery.

Uh, does it have natural join?

> I'm just a practitioner dabbling in theory (and, worse yet, maybe just
> a s/w dev manager dabbling in practice) but if I were told I had to
> enumerate the graph functions I use, I would look up the xpath
> functions on w3.org and start with those.

I have no reason to do this. I need some tiny glimmer of a reason
to suspect there's something interesting there before I look. Nothing
so far.

> > But one can also *gain* from restrictions, as well. This point
> > is often missed. It's why constraints are valuable, and it's
> > why a minimal formalism is valuable.
>
> I agree that such are valuable. I do have a big problem with the way
> we handle constraints, however, as I've mentioned in the past. The
> minimal formalism is good for theory and for a maintainable
> implementation under the covers, but not necessarily the best api.

Sure sure; we've had that conversation to death. I believe we entirely
agree on the analysis of the problem (for once :-) and I think we
mostly agree on the characteristics of the solution.

Marshall

PS. Good grief we are both long-winded, eh?

dawn

Jul 25, 2005, 4:22:15 PM

Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:

<snip>

> > > > But at the logical level, we
> > > > only care about a value such as 4 if we have some context, semantics.
> > > > We model propositions and retrieve the same. I used to say that I just
> > > > need "String" and "Binary" as my data types, with other types
> > > > inheriting from those. When I type in a 13 as a number it is a string
> > > > of two integers, the same as if I type it in as a string.
> > >
> > > Okay, you're headed down a dead end road here. Watch out!
> >
> > I don't think so, but I'll bring my mace just in case.
>
> What, are you hawkgirl now? :-)

We look alike and both carry mace, but otherwise no.

> > > When you type '1' and then '3', you have typed two characters.
> > > You don't have a number yet.
> >
> > I do have a number, but I am only treating it as a string (supertype).
>
> You're conflating lexical and semantic issues. This is that dead
> end I warned you about, and I fear you've driven into it at 60 mph.

I'll step on the gas then.

Let's take petQty=1 and hairColor=brown
The "1" is no more a number than "brown" is a color. They are
morphemes. They are character/string/keyboard representations related
to oneness and brownness.

So, my Type hierarchy is different from others in that I recognize up
top that what I'm working with are representations -- that's the type
of stuff I've got for the computer to work with.

Next level, adding in semantics for more precision
I can further refine the types of 1 and brown by designating the 1 as a
string that represents an Integer and the brown as a string that
represents a Color type, both of which can be descendants of the
String/Document/Words/Sentences/Language type. They are not strings
that represent songs or videos in the computer, after all; they are the
content of documents. Adding in the semantics does not take the string
"brown" and turn it into brown, it simply recognizes that beyond the
fact that I have a string, it is a string that represents a Color. So,
not only can I extract a character from it, I can also find shoes to
match. Then if I further identify that this is not just any color, it
is a HairColor, I can apply more functions, such as determining what
would need to be added to the hair color to turn it strawberry blond.

I'm guessing you think I'm switching levels here between the character
string "1" and what it represents, but I am always talking about the
character string and functions on that string. I am using semantics
for the design and interpretation of the software but the computer
never has to comprehend the meaning.

>
> > A number is a string with some additional functions. So, once I realize
> > this string is a number, I can apply numeric functions. But for any
> > data of type String or Document, I can treat it as a String.
>
> It's certainly possible to build a system like this, but I wouldn't
> want to use it. For one thing, it throws static typing out the window.

It changes it, but definitely does not toss it out the window. It
becomes more flexible.

>
> > > The
> > > fact that some programming languages will implicitly convert
> > > a string to a number in some contexts and add the resulting
> > > numbers is simply a distraction; it does not mean that strings
> > > and numbers are the same thing.
> >
> > The representation of a number, like the representation of a word, are
> > Strings and that is what we are working with in our software
> > applications. An Integer is-a String.
>
> Again the pushing together of lexical and semantic issues.
> The software that I write, in statically typed languages,
> does not consider integer to be a subtype of string. Your
> source code isn't your program, any more than the word "water"
> is wet.

I think I am more consistent in not pretending that the word "brown"
really is a Color nor that "1" really is a number.

>
> > I come from the data side of the house, but respect the fact that if
> > you start marking up propositions in a consistent way, you can have a
> > representation of structured data that moves the document in the
> > direction of a database.
>
> These all-string representations are intended for reading, not for
> processing. Human reading is an important application of code and
> data, but it's certainly not the only one.

I'm thinking of the API between human and computer related to the data
model to be the software API for the data as well. Software works with
data models all the time and I want it to be easier and more
consistent, independent of whether data are to be stored on disk or
not. When it comes to computations and processing, the functions for
the more specific types can be applied as appropriate. If my "1" is-a
Integer, I can add 2 to it to get another representation -- "3"
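
A minimal sketch of the "Integer is-a String" idea in Python, chosen only
for illustration. The class name IntStr and its plus method are
hypothetical and not taken from any existing system; the sketch shows a
value that keeps every string function and gains a numeric one.

```python
class IntStr(str):
    """A string known to represent an integer (hypothetical type)."""

    def __new__(cls, text):
        int(text)  # raises ValueError if text is not an integer representation
        return super().__new__(cls, text)

    def plus(self, other):
        """Numeric addition; the result is again a string representation."""
        return IntStr(str(int(self) + int(other)))


one = IntStr("1")
three = one.plus(2)
print(three)           # 3
print(three.zfill(3))  # 003 (ordinary string functions still apply)
```

Note that this runs in a dynamically typed language, so it says nothing
either way about whether the same design can be checked statically.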

<snip>

> Except Berners-Lee got so many things wrong, and incompatibly wrong
> with so many right things. In essence, he did one tiny valuable
> thing, which is put a GUI on a slightly updated FTP.

and it spread like wildfire.

>
> > > XML is all about strings.
> >
> > now we are getting somewhere. Some of those strings are numbers, some
> > are dates, right?
>
> No. Lexically, all my source code is a string; semantically, it
> is many different types. A Java source file is one big string,
> but there are int and classes and so forth in there that, when
> you execute the program, are not strings at all. You could make
> a system where they were strings at runtime as well, but such
> a system would lose many valuable properties that Java has,
> such as the ability to do static analysis, both by the human
> and by the computer, for both correctness and performance reasons.
> This price is too high for me; static analysis is one of the
> most powerful tools available.

You don't lose that as completely as you are suggesting.

> > > datatype is inferior to having
> > > a variety of different types. The perl/tcl/html approach, where
> > > everything is a string and you have a bajillion kinds of implicit
> > > conversions might be fine if your goal is fast-and-loose,
> >
> > I've never been called that! But maybe we can get bigger bang for the
> > buck s/w development with what you termed fast-and-loose.
>
> Sure; for prototyping and other small-scale applications. But not
> for data management applications in which the cost of corruption
> is high.

I'm definitely aiming for highly scalable apps and quality data. Make
it painful (even if not technically hard) to alter a data name or type
when requirements change and you will get bad data and work-arounds.
I'll cut the rest of this PA, 'cause this thread is getting long.

> So, how do two strings sort: lexicographically or numerically?

if you are treating values as strings, then lex... and if they are both
of the subtype number, then the sorting function there overrides the
string sort.

> I guess it depends on whether they are also numbers, right? When you
> sort a list of strings, some of which are numbers, which compare
> function do you use? Or does it vary depending on which strings
> you're looking at?

You cannot sort a set of Colors and Integers together unless you bump
up in the type hierarchy until you are seeing them both as Strings or
something with the same ordering.

> Since int <: string, (<: means "is a subtype of") then I presume that
> int has all the string methods available?

yes

> So I can have a variable
> of type int, with an int value in it (which is also a string) and
> invoke a method to prepend a ~ character to the string, right?

Yes and that would not violate that this was a string. I do realize
this introduces back in some of the problems that the DBMS was built to
eliminate. Different tools are then needed to do something similar to
what the dbms does now. The biggest problems I see with this are in
cases where there is a DBMS that is maintained directly through the
dbms's api with applications from different top level owners where there
is no ability to have tools that inspect source code. Since I want
that source code all persisted with any databases it updates, each
database would have all the data and functions it needs to address
inconsistencies.

This is not unlike what mountain man was interested in doing, but he
was taking everything out of other s/w apps and putting it in the dbms
as code, while I'm taking it out of the dbms as code and giving it back
to the dbms as data (some of which is code). And, granted, I have a
concept but the devil is in the details. Until perfection is reached,
I would have a different set of risks and flexibility than with a
current sql-dbms.

<snip>

> but you're trying to do it by dumbing things down. I don't think
> that's going to work. I think what's needed is to smarten things up.

I smartened them up. The software now knows that I really can put a
tilde in front of a string even though it used to be a number and that
I just stopped it from being viewed as a number. It was smart enough
to accommodate this change to data values without me having to do
anything other than code the application differently and address any
warnings my tools give me.

> Not complicate them, mind you; make them smart and simple.

precisely.

<snip>

> I'm imagining you and some mathematician about a millennium ago, sitting
> in an ivory tower. The guy comes to you and says, "I'm thinking about
> this idea for a new number, which I call 'zee-row.' It represents the
> absence of a number. You could use it as the result of some operations
> that are currently considered illegal today, like subtracting X from
> X."
> And you'd say, "But that's not how the average person intuitively
> perceives subtraction. Let's not pursue that approach; let's do
> something more user-friendly."

humorous, but not accurate nor to the point IMO. I wrote a paragraph
on the flaws in this analogy, but it was boring even if true, so I'll
spare you.

>
> > > Integer is most certainly not a document.
> >
> > in the interface between me and the computer, I only pass integers as
> > strings, in document formats, typically with some metadata about the
> > integer visible somewhere. But if you prefer, Integer is-a String.
>
> And once you type the integer in, you can just forget about it?
> No; the human is also in charge of the integer as it moves around
> the computer, across function calls, across the network, into the
> database, etc. And to manage this process effectively, he needs
> a strong suite of tools,

Yes, she does.

> chief among them a type system, static
> analysis, and a way to structure and constrain data.

I don't disagree, just have a more flexible way of doing that IMO.

<snip>

>
> > > > It is only when you look at the interface between the software
> > > > and the machine that you would want to think of a number as not being a
> > > > type of string.
>
> Looking at this paragraph again, this is exactly what I disagree with.

In order for me to agree with it, I have to add to the start "Within
the software, ...". The computer (behind the scenes software) might
want to persist integers differently in memory or on disk than if they
were strings of numeric characters. It can do that behind the scenes.
Otherwise it simply needs to know what functions to apply and how to
apply them for all subtypes of strings.

Then outside of the computer, the s/w developer needs to know semantics
in order to develop the software properly, defining and applying
functions appropriate to the types of variables, for example.

Basically, I'm taking the schema and constraints out of the dbms tool
and putting it with all of the rest of the code, so it is handled just
like other data models used in the code, such as the UI data model.
The software applications should be able to execute a function on a
model of some data that gets the output to a browser and another that
gets the output to a database for storage on disk. It should be able to
execute a function on a model of data that pulls in values from a
browser or from a web service, xml document, or database.

<snip>

> > Unless I am misunderstanding, a counter-example to that statement would
> > be a tree where instead of a list type, we have a name for a
> > two-attribute value in our tree. So, in our relation, we have
> > attribute A and attribute B and in our tree, we have names for A, B and
> > also the name C for A and B together (a COBOL FD for a VSAM file just
> > popped into my brain as clear as day, yikes). This tree with C having
> > children of A and B doesn't look like a SQL-happy tree.
>
> Okay, I thought I said it pretty well the first time, but I'll try
> again: in thinking about trees, I'm trying *just* to think about
> trees.

Trees with nodes that could have random values, completely independent
of anything else? Then how do you get SQL into this picture? You are
right, I'm confused.

> Anything that's also a problem when you have exactly one
> level will of course also be a problem when you have a multi-level
> tree. Solve the one-level case and you've solved the multi-level
> case, assuming you handle multi-level data. So I don't consider
> SQL's null problems to be a tree issue, even though those problems
> *also* show up when you're thinking about trees.
>

You don't have to try to get me to understand, but I really am confused
about the trees you are looking at and if you give my brain (I swear it
used to be a whole lot better) another chance, I will try again. You
are saying that all trees that have a certain form are easy for SQL.
But without something on those nodes, and only a general shape for the
tree, I'm just not getting it.

>
> I also said:
>
> > > SQL has a hard time with
> > > list data or multivalue data, whether it's in a tree or not, so
> > > again I consider it an orthogonal issue.
>
> which puts it fairly well, I think.

Then what does SQL have to do with your tree? What does your tree look
like and how can SQL work with it?

>
> > > > > The fact that so many programmers favor iteration over recursion
> > > > > is something I consider odd, given that iteration is strictly
> > > > > less powerful than recursion.
> > > >
> > > > Yes, but if you can iterate instead of using recursion, then you don't
> > > > set yourself up for a stack overflow, for example.
> > >
> > > Tail call optimization can make most or all of this issue go away.
> > > (This raises a question for me, which is, is it the case that TCO
> > > can convert any iterative construct into a recursive one that uses
> > > only constant stack space? I think the answer might be yes, because
> > > I think I can see how to write a tail-recursive 'while', and I think
> > > all iterative constructs are just syntax on while.)
> >
> > Over my head on that one -- TCO to me is only total-cost-of-ownership.
>
> My fault; I went from the term ("tail call optimization") to the
> abbreviation ("TCO") too abruptly.

No, I caught that you were using TCO for tail call optimization -- I
just hadn't heard it before.

> > Are you saying that if I write a recursive method in Java then the
> > compiler has or might have a feature that mitigates this or that the
> > run-time environment would have this feature?
>
> The Java compiler probably won't, but it might. The Scheme compiler
> is required to. Since Java is chock-full of iterative constructs,
> it's not much of an issue; no one uses recursion much.
>
> > Have I unnecessarily
> > dragged along a concern that I could have dropped long ago?
>
> Uh, yes.

OK, I can still be taught new tricks.

> > Did I miss
> > a memo that everyone else got that said not to worry about recursion
> > eating memory?
>
> Well, you still have to worry if your language's implementation doesn't
> have TCO.

OK, so I won't do a major shift right now then.

> But it's not a *fundamental* problem. You also have to worry
> if your recursive method isn't tail-recursive, but I'm proposing that
> a recursive translation of an iterative algorithm can be necessarily
> tail recursive.

this is outside of anything I know about

> I'll still have to check on that.

While optimizations are taking place, perhaps it could rewrite the code
to show recursion instead of iteration so I don't have to change? :-)

> > > If programmers put as much effort into recursion as they put into
> > > iteration, I assert it would provide increased maintainability.
> >
> > I'll keep that in mind. If you have any "instead of doing this common
> > iteration, try it as this recursion" examples, pass them along, even if
> > OT.
>
> Of course, my background is about 98% iteration. I work for a living
> which means I've had to use C++ or Java for most of the time. (Before
> that it was C and Fortran. :-)
>
> As for cool examples, check out quicksort in Haskell:
> http://www.haskell.org/aboutHaskell.html
>
> Blew my mind the first time I saw it.

Will do.

>
> > > The question in the large is: what are all the different logical
> > > data structures, how might we query them, and how might we update
> > > them and constrain them? I think mankind will be working on this
> > > for some time.
> >
> > I think it is done. They named it XQuery.
>
> Uh, does it have natural join?

I'm guessing you know the answer, eh?

>
> > I'm just a practitioner dabbling in theory (and, worse yet, maybe just
> > a s/w dev manager dabbling in practice) but if I were told I had to
> > enumerate the graph functions I use, I would look up the xpath
> > functions on w3.org and start with those.
>
> I have no reason to do this. I need some tiny glimmer of a reason
> to suspect there's something interesting there before I look. Nothing
> so far.

fair enough.

> > > But one can also *gain* from restrictions, as well. This point
> > > is often missed. It's why constraints are valuable, and it's
> > > why a minimal formalism is valuable.
> >
> > I agree that such are valuable. I do have a big problem with the way
> > we handle constraints, however, as I've mentioned in the past. The
> > minimal formalism is good for theory and for a maintainable
> > implementation under the covers, but not necessarily the best api.
>
> Sure sure; we've had that conversation to death. I believe we entirely
> agree on the analysis of the problem (for once, :-) and I think we
> mostly agree on the characteristics of the solution.
>
> Marshall
>
> PS. Good grief we are both long-winded, eh?

Yup, let's just hope no one else is attempting to follow this one.
smiles
--dawn

Marshall Spight

Jul 26, 2005, 1:06:50 AM

dawn wrote:
> Marshall Spight wrote:
> > > > Okay, you're headed down a dead end road here. Watch out!
> > >
> > > I don't think so, but I'll bring my mace just in case.
> >
> > What, are you hawkgirl now? :-)
>
> We look alike and both carry mace, but otherwise no.

Ha ha! Hawkgirl is my second favorite member of the JLA.

> > You're conflating lexical and semantic issues. This is that dead
> > end I warned you about, and I fear you've driven into it at 60 mph.
>
> I'll step on the gas then.

I admire your spirit!

> > > A number is a string with some additional functions. So, once I realize
> > > this string is a number, I can apply numeric functions. But for any
> > > data of type String or Document, I can treat it as a String.
> >
> > It's certainly possible to build a system like this, but I wouldn't
> > want to use it. For one thing, it throws static typing out the window.
>
> It changes it, but definitely does not toss it out the window. It
> becomes more flexible.

I don't believe it's possible to have a static type system where
you can update variables in such a way that they become of a more
general type. Once you do that, the type of the variable has to
change at runtime, and if that happens, you do not have a *static*
type system by definition.

I can see how to make your idea work with a dynamically typed
language, but not with a statically typed one.

> The representation of a number, like the representation of a word, are
> Strings and that is what we are working with in our software
> applications. An Integer is-a String.

I do not agree that what our software works on is simply the string
representation of our data. (It is in TCL, and some other systems,
but not generally.)

> If my "1" is-a
> Integer, I can add 2 to it to get another representation -- "3"

You sure can, but only in a dynamically typed language with implicit
coercions. These are certainly workable, but I don't consider them
a good choice for data management.

> > No. Lexically, all my source code is a string; semantically, it
> > is many different types. A Java source file is one big string,
> > but there are int and classes and so forth in there that, when
> > you execute the program, are not strings at all. You could make
> > a system where they were strings at runtime as well, but such
> > a system would lose many valuable properties that Java has,
> > such as the ability to do static analysis, both by the human
> > and by the computer, for both correctness and performance reasons.
> > This price is too high for me; static analysis is one of the
> > most powerful tools available.
>
> You don't lose that as completely as you are suggesting.

So you say, but you don't explain how to get around the problem
with my below "prepend the tilde" example.

> > So, how do two strings sort: lexicographically or numerically?
>
> if you are treating values as strings, then lex... and if they are both
> of the subtype number, then the sorting function there overrides the
> string sort.

And does this determination happen at runtime or at compile time?
If the answer is "at runtime" then you've precluded static typing.
Which is not necessarily disastrous, but it's not a choice I'd make.

> You cannot sort a set of Colors and Integers together unless you bump
> up in the type hierarchy until you are seeing them both as Strings or
> something with the same ordering.

But this should be automatic, right? I mean, that's the definition of
subtyping: being able to use a more specific type in place of a more
general supertype. In this case, since strings can be sorted, and
you've stated that everything is a subtype of string, then everything
can be sorted. So you necessarily can sort colors and ints together.
The question is, what sort function is used?

Or are you doing away with substitution?
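
To make the sorting question concrete, here is a small Python sketch
(Python is just the illustration language here, and the sample values are
made up). The same list of strings comes out in two different orders
depending on which compare function is applied, and a mixed list only has
the string ordering available.

```python
values = ["10", "9", "2", "100"]

# Treated purely as strings: character-by-character (lexicographic) order.
as_strings = sorted(values)

# Treated as integers: the same strings, ordered by the numbers they represent.
as_integers = sorted(values, key=int)

print(as_strings)   # ['10', '100', '2', '9']
print(as_integers)  # ['2', '9', '10', '100']

# A list mixing integer-like and color-like strings can only fall back
# to the string ordering; key=int would raise ValueError on "brown".
mixed = ["10", "brown", "9"]
print(sorted(mixed))  # ['10', '9', 'brown']
```

Which of the two orders a caller gets is exactly the choice that has to be
made either at compile time or at runtime.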

> > So I can have a variable
> > of type int, with an int value in it (which is also a string) and
> > invoke a method to prepend a ~ character to the string, right?
>
> Yes and that would not violate that this was a string.

But it *would* violate that this was an int, and so the *variable*
would either have to change its type, or not have one in the first
place (dynamic typing) or else your system will allow variables
to contain values of a different type than the variable is declared
as. Or you just don't allow update operators.

Or you just don't have variables. But then it's hard to manage
updatable data.
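
A short Python sketch of the "prepend the tilde" problem. The function
name prepend_tilde is hypothetical, and pet_qty borrows the petQty example
from earlier in the thread. The string operation is perfectly legal, but
afterwards the value can no longer be read as an integer, and in a
dynamically typed language that only shows up at runtime.

```python
def prepend_tilde(s: str) -> str:
    """An ordinary string operation, valid for any str or subtype of str."""
    return "~" + s


pet_qty = "13"            # a string that happens to represent an integer
print(int(pet_qty) + 1)   # 14

pet_qty = prepend_tilde(pet_qty)  # allowed: the result is still a string
try:
    int(pet_qty)
    still_an_integer = True
except ValueError:
    still_an_integer = False

print(pet_qty, still_an_integer)  # ~13 False
```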

On a related note, I really think that if you sat down and,
without thinking about representation, wrote down all the operators
that apply to string and all the operators that apply to int, you
wouldn't see much overlap. You certainly don't in most popular
languages. Now, it's certainly possible to treat strings as if
they were integers via a partial mapping. It's also possible to
have a (partial or total) mapping from integers into strings.
This doesn't mean they are the same thing, though, any more
than a mapping from the even numbers to the odd numbers means
that even and odd numbers are the same. You can *represent*
every even number with an odd number, you know.

> Since I want
> that source code all persisted with any databases it updates, each
> database would have all the data and functions it needs to address
> inconsistencies.

This seems like it would really increase the coupling, which I don't
think you'd want to do. Wouldn't it be better to have the system
such that the applications were *less* coupled to the dbms rather
than *more* coupled?

> > I'm imagining you and some mathematician about a millennium ago, sitting
> > in an ivory tower. The guy comes to you and says, "I'm thinking about
> > this idea for a new number, which I call 'zee-row.' It represents the
> > absence of a number. You could use it as the result of some operations
> > that are currently considered illegal today, like subtracting X from
> > X."
> > And you'd say, "But that's not how the average person intuitively
> > perceives subtraction. Let's not pursue that approach; let's do
> > something more user-friendly."
>
> humorous, but not accurate nor to the point IMO. I wrote a paragraph
> on the flaws in this analogy, but it was boring even if true, so I'll
> spare you.

Well, I didn't particularly intend it to be humorous, although
perhaps I am hilarious just out of habit. Ha ha, I'm funny!

My point was that I don't think it's a good idea to use "how
people think about things today" as a hard design constraint,
because it precludes any possibility of coming up with *a better
way to think about things*, which is where the *real* progress
is made.

> > And once you type the integer in, you can just forget about it?
> > No; the human is also in charge of the integer as it moves around
> > the computer, across function calls, across the network, into the
> > database, etc. And to manage this process effectively, he needs
> > a strong suite of tools,
>
> Yes, she does.

[Let me just state for the record that my singular indefinite "he"
is not gender specific. Rather it is a consequence of the lack of
a gender-nonspecific pronoun in the English language, coupled with
a wish to avoid the difficulties of speaking of indefinite people
in the plural. Void where prohibited. Driver carries no change.]

> > chief among them a type system, static
> > analysis, and a way to structure and constrain data.
>
> I don't disagree, just have a more flexible way of doing that IMO.

I don't think you can do any static analysis with your approach.
You can still do structure and constraints, though.

> In order for me to agree with it, I have to add to the start "Within
> the software, ...". The computer (behind the scenes software) might
> want to persist integers differently in memory or on disk than if they
> were strings of numeric characters. It can do that behind the scenes.

It cannot do these things without a static type system...

> Otherwise it simply needs to know what functions to apply and how to
> apply them for all subtypes of strings.

... But this part does not require static typing.

> Then outside of the computer, the s/w developer needs to know semantics
> in order to develop the software properly, defining and applying
> functions appropriate to the types of variables, for example.

This also does not require static typing. (Although I and others
claim this is easier for the developer to do when static typing
is available-- but that claim is merely anecdotal.)

> Basically, I'm taking the schema and constraints out of the dbms tool
> and putting it with all of the rest of the code, so it is handled just
> like other data models used in the code, such as the UI data model.
> The software applications should be able to execute a function on a
> model of some data that gets the output to a browser and another that
> gets the output to a database for storage on disk. It should be able to
> execute a function on a model of data that pulls in values from a
> browser or from a web service, xml document, or database.

Have you worked much with multi-application databases? Because this
seems hard to do in that situation. The UI for a particular
program is not shared among different applications; the schema
and constraints are.

> > Okay, I thought I said it pretty well the first time, but I'll try
> > again: in thinking about trees, I'm trying *just* to think about
> > trees.
>
> Trees with nodes that could have random values, completely independent
> of anything else? Then how do you get SQL into this picture? You are
> right, I'm confused.

It's not that they are random, nor that they are independent of
anything else. It's simply that I'm trying to *isolate* those
properties specific to trees.

Java does not have a way to declare that a reference type must
not be null. (Other languages, including Nice, and SQL, do.)
Let's say I was looking at tree handling in Java. I can handle
dynamic trees nicely in Java. But HA! you point out:
Java doesn't have a way to specify that a reference type within
that tree must be non-null. So Java doesn't handle trees that
well. I say, yes, that *is* a flaw with Java, but it's not a
flaw with how Java handles *trees*, it's a flaw with how Java
handles reference types. It's not a tree issue at all; it's
a reference type issue. If you fix this issue, as has been
done in Nice, it doesn't mean you can handle trees any better;
it means you can handle reference types better, whether they
occur in trees, or in lists, or in objects, or by themselves.

Likewise, SQL's limitations regarding multivalue attributes,
or ordered data, are real, but you run into them all over
the place; they don't have anything to do with trees *per se.*
If you added a generic list type to SQL, it wouldn't make it
any better or any worse at handling static heterogeneous trees.

> You don't have to try to get me to understand, but I really am confused
> about the trees you are looking at and if you give my brain (I swear it
> used to be a whole lot better) another chance, I will try again. You
> are saying that all trees that have a certain form are easy for SQL.
> But without something on those nodes, and only a general shape for the
> tree, I'm just not getting it.

I hope the above helps. Again, it's not that the nodes don't have
anything in them, it's just that, if I'm thinking specifically about
trees, I don't want to consider *at the same time* issues that SQL
has whether my data is tree-structured or not.

> > > Did I miss
> > > a memo that everyone else got that said not to worry about recursion
> > > eating memory?
> >
> > Well, you still have to worry if your language's implementation doesn't
> > have TCO.
>
> OK, so I won't do a major shift right now then.

If you're thinking about *language design*, you should probably
incorporate this information right away. I believe that it is
a better design choice for a language to include recursion along
with a "marketing technique" for conveniently expressing iterative
algorithms that is implemented in terms of recursion, for reeling
in the C++/Java people. I base this on the fact that recursion is
strictly more powerful than iteration. Any time I see two language
features, and I have to have both of them, and one is a superset
of the other, I figure it makes sense to put the larger one in,
and define the smaller one in terms of the larger. Roughly speaking.
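
As a sketch of "define the smaller one in terms of the larger", here is a
'while' written as a tail-recursive function in Python. The name
while_loop and the tuple used for the loop state are assumptions made for
this illustration. One caveat: CPython does not eliminate tail calls, so
this version still uses one stack frame per iteration. In an
implementation with TCO, such as a conforming Scheme, the same shape runs
in constant stack space.

```python
def while_loop(cond, body, state):
    """Tail-recursive 'while': the recursive call is the last thing done."""
    if not cond(state):
        return state
    return while_loop(cond, body, body(state))


# Sum 1..10 with an ordinary iterative loop ...
total, i = 0, 1
while i <= 10:
    total, i = total + i, i + 1

# ... and the same loop via while_loop, with the loop variables as state.
result = while_loop(lambda s: s[1] <= 10,
                    lambda s: (s[0] + s[1], s[1] + 1),
                    (0, 1))

print(total, result[0])  # 55 55
```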

> > As for cool examples, check out quicksort in Haskell:
> > http://www.haskell.org/aboutHaskell.html
> >
> > Blew my mind the first time I saw it.
> Will do.

I can't recommend looking at lots of different languages enough.
If all you've ever encountered is the Algol-family, like I had
when I started this whole thing, you've encountered only a very
narrow slice of what's possible.
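
For readers who don't follow the link, here is a rough Python rendering of
that recursive, list-comprehension style of quicksort. It is not the
Haskell code from the page, just a sketch of the same idea, and it is not
tail recursive.

```python
def quicksort(xs):
    """Sort a list by recursively partitioning around the first element."""
    if not xs:
        return []
    pivot, rest = xs[0], xs[1:]
    smaller = [y for y in rest if y < pivot]
    larger = [y for y in rest if y >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)


print(quicksort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```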

> > > > The question in the large is: what are all the different logical
> > > > data structures, how might we query them, and how might we update
> > > > them and constrain them? I think mankind will be working on this
> > > > for some time.
> > >
> > > I think it is done. They named it XQuery.
> >
> > Uh, does it have natural join?
>
> I'm guessing you know the answer, eh?

I'm guessing it's "no." Having used join a lot and been wildly
impressed by its expressive power, I now consider it a must-have
language feature.
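
For anyone unfamiliar with natural join, a minimal Python sketch follows.
Relations are modeled as lists of dicts, and the function name
natural_join and the sample customer and invoice rows are invented for the
example. It assumes every row in a relation has the same attribute names.
Rows are matched on all attribute names the two relations share, with no
join condition spelled out.

```python
def natural_join(left, right):
    """Join two relations (lists of dicts) on every attribute name they share."""
    if not left or not right:
        return []
    common = set(left[0]) & set(right[0])
    joined = []
    for l_row in left:
        for r_row in right:
            if all(l_row[a] == r_row[a] for a in common):
                joined.append({**l_row, **r_row})
    return joined


customers = [{"cust_id": 1, "name": "Ann"}, {"cust_id": 2, "name": "Bob"}]
invoices = [{"inv_id": 10, "cust_id": 1},
            {"inv_id": 11, "cust_id": 1},
            {"inv_id": 12, "cust_id": 2}]

for row in natural_join(customers, invoices):
    print(row)
```

If the two relations share no attribute names, the result degenerates to
the cross product, which matches the usual definition.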

I have to say, I don't think the process is done yet.

> > PS. Good grief we are both long-winded, eh?
>
> Yup, let's just hope no one else is attempting to follow this one.

On the one hand, one imagines that there are a thousand lurkers
for every poster. On the other hand, this thread feels like just
you + me + crickets. I think I hear them chirping now.

Marshall

dawn

Jul 26, 2005, 10:18:38 AM

Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:

I'll try to snip mercilessly and attempt short responses

> I don't believe it's possible to have a static type system where
> you can update variables in such a way that they become of a more
> general type.

...

> Once you do that, the type of the variable has to
> change at runtime,

A value can be seen as being of a more general type with a different
variable. I might not have my terms correct, but I think of it as
static typing if each variable itself doesn't change type. There is no
reason to require exactly one name/type per data attribute value. That
is what I propose, recognizing there are associated risks. A compiler
can only find inconsistencies within a compiled unit, but a database
that includes all code that uses the dbms api would have all such units
available.
...

>
> > > So I can have a variable
> > > of type int, with an int value in it (which is also a string) and
> > > invoke a method to prepend a ~ character to the string, right?
> >
> > Yes and that would not violate that this was a string.
>
> But it *would* violate that this was an int, and so the *variable*
> would either have to change its type,

The key to this is that you also have another variable referencing the
very same value, but of a different type. You asked if you could have
a variable of type int, with an int value and invoke a method ... and
you can, but you would not stick the tilde into that value using your
int variable. If there is a chance for your int variable to do
something with such a value after you do that, that would give a
compiler error.
...

> On a related note, I really don't think that if you sat down and,
> without thinking about representation, wrote down all the operators
> that apply to string and all the operators that apply to int, I don't
> think you'd see much overlap.

No, but look at your data and see how often a value intended to be an
int needs to be treated as a string (e.g. UI I-O)

> > Since I want
> > that source code all persisted with any databases it updates, each
> > database would have all the data and functions it needs to address
> > inconsistencies.
>
> This seems like it would really increase the coupling,

Where are schema stored today? Do you have this same concern with
mountain man's approach of putting everything in the dbms?

> which I don't
> think you'd want to do. Wouldn't it be better to have the system
> such that the applications were *less* coupled to the dbms rather
> than *more* coupled?

Software apps could be seen as metadata, including validations,
constraints, etc. Lots more could be said on this one.
...

> > Basically, I'm taking the schema and constraints out of the dbms tool
> > and putting it with all of the rest of the code, so it is handled just
> > like other data models used in the code, such as the UI data model.
> > The software applications should be able to execute a function on a
> > model of some data that gets the output to a browser and another that
> > gets the output to a database for storage on disk. It should be able to
> > execute a function on a model of data that pulls in values from a
> > browser or from a web service, xml document, or database.
>
> Have you worked much with multi-application databases?

Yes, but not with databases where more than one company could use the
database api directly. I'll grant that in this case (do you know any
such cases, I'm sure there are some?), my thinking is flawed (and it is
likely flawed elsewhere too).

> Because this
> seems hard to do in that situation. The UI for a particular
> program is not shared among different applications; the schema
> and constraints are.

Don't forget that I'm requiring all code that uses the dbms api to be
available to the dbms.
...

> I hope the above helps. Again, it's not that the nodes don't have
> anything in them, it's just that, if I'm thinking specifically about
> trees, I don't want to consider *at the same time* issues that SQL
> has whether my data is tree-structured or not.

That didn't clarify it for me. I think at this point I would need your
definition of this type of tree restated, with an example of such a
tree and your claim about how SQL has any relationship to this type of
tree. However, if you are sure that what you are thinking is accurate,
you don't need to try to bring me along for the ride. Sometimes my
brain just doesn't work.
...

>
> If you're thinking about *language design*, you should probably
> incorporate this information right away.

I have no plans to design a language, but I'm happy to learn anyway
...

> Any time I see two language
> features, and I have to have both of them, and one is a superset
> of the other, I figure it makes sense to put the larger one in,
> and define the smaller one in terms of the larger. Roughly speaking.

Funny, but that is precisely why I advocate for a graph model over a
strictly relational model. There is no practical problem of which I am
aware in including set operations along with a graph model.
...

> I can't recommend looking at lots of different languages enough.
> If all you've ever encountered is the Algol-family, like I had
> when I started this whole thing, you've encountered only a very
> narrow slice of what's possible.

I'm at least passable in the following general purpose languages: Java,
COBOL/CICS, Fortran & BASIC (all of which I have taught to college
students at one time or another) and dabbled with several others
(Pascal, C, C++, RPG(!), ...). I did read a bit about Haskell and
other functional languages, but haven't played with them. Of course,
this is in addition to several non-general purpose languages such as
SQL, HTML (is it a language or a parm file?), maybe even JavaScript.

> > > > ... XQuery.
> > >
> > > Uh, does it have natural join?
> >
> > I'm guessing you know the answer, eh?
>
> I'm guessing it's "no." Having used join a lot and been wildly
> impressed by its expressive power, I now consider it a must-have
> language feature.

I don't actually know the answer, I just figured you did, given the
question. Perhaps an investigation is in order?

cheers! --dawn
==========OT below =============================

> > > And once you type the integer in, you can just forget about it?
> > > No; the human is also in charge of the integer as it moves around
> > > the computer, across function calls, across the network, into the
> > > database, etc. And to manage this process effectively, he needs
> > > a strong suite of tools,
> >
> > Yes, she does.
>
> [Let me just state for the record that my singular indefinite "he"
> is not gender specific. Rather it is a consequence of the lack of
> a gender inspecific pronoun in the English language, coupled with
> a wish to avoid the difficulties of speaking of indefinite people
> in the plural. Void where prohibited. Driver carries no change.]

I was hoping you would jump on that one. I just had a break-through in
this ongoing discussion with my husband. I proofread something he
wrote (that might sound ridiculous for anyone reading me here as I
don't proofread these and sometimes ramble on and on like this). I
told him that I really disliked the alternating he/she pronouns and
much preferred the plural pronoun to match a singular noun.

I used this specific document to show him that when the author is
intending to make the reader relate to a scenario but then tosses in a
pronoun that is not a match for the reader, the reader adapts by
thinking of some other person of the matching gender. They can still
understand the sentence and might not mind it, but they do not relate
to the statement in the same way. (See how easy it was to read "they"
for the reader -- it will get even easier to roll with it over time).
The scenario is then about someone else and not about the reader.

Given that we already use "you" for both singular and plural, what is
the harm in adapting our language to use "they" and "their" for both as
well? Let's just do it and be done with it. He agreed with the
argument this time (perhaps I've just worn him down) but changed the
wording to avoid the problem.

Marshall Spight

Jul 26, 2005, 10:49:04 AM

dawn wrote:
> Marshall Spight wrote:
> > > > So I can have a variable
> > > > of type int, with an int value in it (which is also a string) and
> > > > invoke a method to prepend a ~ character to the string, right?
> > >
> > > Yes and that would not violate that this was a string.
> >
> > But it *would* violate that this was an int, and so the *variable*
> > would either have to change its type,
>
> The key to this is that you also have another variable referencing the
> very same value, but of a different type. You asked if you could have
> a variable of type int, with an int value and invoke a method ... and
> you can, but you would not stick the tilde into that value using your
> int variable. If there is a chance for your int variable to do
> something with such a value after you do that, that would give a
> compiler error.

Okay, but that means you can't in general substitute a subtype for
a supertype. That means your language doesn't have subtyping, only
inheritance. Of the two, subtyping is the more valuable, so I don't
think this is a good design choice.
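
The objection can be illustrated with a small Python sketch. It assumes a mutable-object reading of the "int that is also a string" idea; the classes `Text` and `IntText` and the function `mark` are hypothetical. Python checks none of this at compile time, so the sketch only shows what a compiler would have to forbid. It is not an implementation of the scheme dawn proposes.

```python
class Text:
    """A mutable cell holding a string (plays the role of the supertype)."""

    def __init__(self, value: str):
        self.value = value

    def prepend(self, prefix: str) -> None:
        self.value = prefix + self.value


class IntText(Text):
    """Hypothetical 'int that is also a string': value should always parse as an int."""

    def as_int(self) -> int:
        return int(self.value)


def mark(cell: Text) -> None:
    # Written against Text only; legal for every Text.
    cell.prepend("~")


n = IntText("42")
assert n.as_int() == 42
mark(n)  # accepted, because IntText inherits from Text

try:
    n.as_int()
    still_an_int = True
except ValueError:  # int("~42") cannot be parsed
    still_an_int = False

print(n.value, still_an_int)
```

`IntText` inherits everything `Text` has, yet code written for `Text` can break it. Either `mark(n)` has to be rejected, which gives up substitutability, or the int guarantee has to go.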

> ...
> > On a related note, I really don't think that if you sat down and,
> > without thinking about representation, wrote down all the operators
> > that apply to string and all the operators that apply to int, I don't
> > think you'd see much overlap.
>
> No, but look at your data and see how often a value intended to be an
> int needs to be treated as a string (e.g. UI I-O)

Yes, that's common. But sharing common functionality isn't how we
normally think of subtyping. Substitutability is.

> > > Since I want
> > > that source code all persisted with any databases it updates, each
> > > database would have all the data and functions it needs to address
> > > inconsistencies.
> >
> > This seems like it would really increase the coupling,
>
> Where are schema stored today? Do you have this same concern with
> mountain man's approach of putting everything in the dbms?

I'm not sure I understand mountain man's ideas. But I certainly
think hard coupling applications to schema is going to reduce
maintainability. Look at how bad this issue is today. (Hey,
how do you do ad-hoc queries if the database has to know
everything ahead of time?) I think the better approach is to
figure out how to have applications that are able to adjust to
schema dynamically.

> > which I don't
> > think you'd want to do. Wouldn't it be better to have the system
> > such that the applications were *less* coupled to the dbms rather
> > than *more* coupled?
>
> Software apps could be seen as metadata, including validations,
> constraints, etc. Lots more could be said on this one.

I don't see it. Type information, schema, and data values need
integrity management. What do apps need managed?

> > If you're thinking about *language design*, you should probably
> > incorporate this information right away.
>
> I have no plans to design a language, but I'm happy to learn anyway

I think a lot of the issues you are talking about above are
language design issues.

> ...
> > Any time I see two language
> > features, and I have to have both of them, and one is a superset
> > of the other, I figure it makes sense to put the larger one in,
> > and define the smaller one in terms of the larger. Roughly speaking.
>
> Funny, but that is precisely why I advocate for a graph model over a
> strictly relational model. There is no practical problem of which I am
> aware in including set operations along with a graph model.

But since the reverse is also true, and the relational model is
simpler, that tends in favor of the relational model as the
design choice.
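
Going in that direction, from relations to graphs, can be sketched as follows: store the edges as a binary relation and get reachability by joining that relation with itself until nothing new appears. This is a minimal Python sketch; the name `transitive_closure` and the org-chart data are assumptions for illustration.

```python
def transitive_closure(edges):
    """edges: set of (parent, child) pairs. Returns every (ancestor, descendant) pair."""
    closure = set(edges)
    while True:
        # Join closure with edges where closure's child equals edges' parent.
        step = {(a, c) for (a, b) in closure for (b2, c) in edges if b == b2}
        new = step - closure
        if not new:          # fixpoint reached: no new pairs were derived
            return closure
        closure |= new


# Hypothetical org chart stored as a parent/child relation.
reports = {("ceo", "vp"), ("vp", "dev"), ("dev", "intern")}

for pair in sorted(transitive_closure(reports)):
    print(pair)
```

The loop is the catch. One fixed join expression covers only a fixed depth, which is the transitive closure problem raised at the start of the thread, so "simpler" here comes with an iteration or recursion operator attached.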

> I'm at least passable in the following general purpose languages: Java,
> COBOL/CICS, Fortran & BASIC (all of which I have taught to college
> students at one time or another) and dabbled with several others
> (Pascal, C, C++, RPG(!), ...) [...] Of course,
> this is in addition to several non-general purpose languages such as
> SQL, HTML (is it a language or a parm file?), maybe even JavaScript.

Mostly, these languages are all of the same family. SQL and JavaScript
are the exceptions. We all know about SQL, ha ha. With JavaScript
you might not have run into the differences because JavaScript often
is used in a very dynamically-typed-Java style.

> ==========OT below =============================
> > > > And once you type the integer in, you can just forget about it?
> > > > No; the human is also in charge of the integer as it moves around
> > > > the computer, across function calls, across the network, into the
> > > > database, etc. And to manage this process effectively, he needs
> > > > a strong suite of tools,
> > >
> > > Yes, she does.
> >
> > [Let me just state for the record that [...]
>
> [...]

> Given that we already use "you" for both singular and plural, what is
> the harm in adapting our language to use "they" and "their" for both as
> well? Let's just do it and be done with it. He agreed with the
> argument this time (perhaps I've just worn him down) but changed the
> wording to avoid the problem.

Yeah, but then everything else has to become plural, and that's often
awkward.

"And once they type the integer in, can they just forget about it?
No; the humans are also in charge of the integer as it moves around
the computer, across function calls, across the network, into the
database, etc. And to manage this process effectively, they need
a strong suite of tools,"

Marshall

dawn

Jul 26, 2005, 11:47:07 AM

Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:

only time for one of the topics

> > ...
> > > Any time I see two language
> > > features, and I have to have both of them, and one is a superset
> > > of the other, I figure it makes sense to put the larger one in,
> > > and define the smaller one in terms of the larger. Roughly speaking.
> >
> > Funny, but that is precisely why I advocate for a graph model over a
> > strictly relational model. There is no practical problem of which I am
> > aware in including set operations along with a graph model.
>
> But since the reverse is also true,

but didn't you start out trying to figure out how to minimally extend
the relational model to handle graphs or something like that?

> and the relational model is simpler,

Perhaps it is theoretically simpler, which translates to simplicity for
the lower level software, but I have never seen it be simpler in
practice for building applications. "Simpler" for whom?

> that tends in favor of the relational model as the
> design choice.

My experience of my own and my teams' productivity with (flawed)
implementations of each tells me otherwise (by orders of magnitude in
dollars). Admittedly, there are trade-offs and there were more
differences than relational vs graph models, such as 3VL vs 2VL, strong
vs. weak typing, etc.

--dawn

Gene Wirchenko

Jul 26, 2005, 1:01:02 PM

On 25 Jul 2005 22:06:50 -0700, "Marshall Spight"
<marshal...@gmail.com> wrote:

>dawn wrote:
>> Marshall Spight wrote:

[snip]

>On a related note, I really don't think that if you sat down and,
>without thinking about representation, wrote down all the operators

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I think that this is the important part.

>that apply to string and all the operators that apply to int, I don't
>think you'd see much overlap. You certainly don't in most popular
>languages. Now, it's certainly possible to treat strings as if

[snip]

>> > I'm imagining you and some mathematician about a millennium ago, sitting
>> > in an ivory tower. The guy comes to you and says, "I'm thinking about
>> > this idea for a new number, which I call 'zee-row.' It represents the
>> > absence of a number. You could use it as the result of some operations
>> > that are currently considered illegal today, like subtracting X from
>> > X."
>> > And you'd say, "But that's not how the average person intuitively
>> > perceives subtraction. Let's not pursue that approach; let's do
>> > something more user-friendly."
>>
>> humorous, but not accurate nor to the point IMO. I wrote a paragraph
>> on the flaws in this analogy, but it was boring even if true, so I'll
>> spare you.
>
>Well, I didn't particularly intend it to be humorous, although
>perhaps I am hilarious just out of habit. Ha ha, I'm funny!
>
>My point was that I don't think it's a good idea to use "how
>people think about things today" as a hard design constraint,
>because it precludes any possibility of coming up with *a better
>way to think about things*, which is where the *real* progress
>is made.

And even so, it can take a long time to get it into use.

[snip]

>[Let me just state for the record that my singular indefinite "he"
>is not gender specific. Rather it is a consequence of the lack of
>a gender inspecific pronoun in the English language, coupled with
>a wish to avoid the difficulties of speaking of indefinite people
>in the plural. Void where prohibited. Driver carries no change.]

There are two. One is "he" which has a gender-neutral meaning
when gender is unknown. The other is "it".

[snip]

>> > As for cool examples, check out quicksort in Haskell:
>> > http://www.haskell.org/aboutHaskell.html
>> >
>> > Blew my mind the first time I saw it.
>> Will do.
>
>I can't recommend looking at lots of different languages enough.
>If all you've ever encountered is the Algol-family, like I had
>when I started this whole thing, you've encountered only a very
>narrow slice of what's possible.

That was interesting code. I am interested in how a language
works in general, since it is not enough just to code the part that
language is good for.

[snip]

>> Yup, let's just hope no one else is attempting to follow this one.

Bzzzt!

>On the one hand, one imagines that there are a thousand lurkers
>for every poster. On the other hand, this thread feels like just
>you + me + crickets. I think I hear them chirping now.

I do *not* chirp while reading.

Sincerely,

Gene Wirchenko

paul c

Jul 26, 2005, 2:11:07 PM

Marshall Spight wrote:
> ...
>>>database, etc. And to manage this process effectively, he needs
>>>a strong suite of tools,
>>
>>Yes, she does.
>
>
> [Let me just state for the record that my singular indefinite "he"
> is not gender specific. Rather it is a consequence of the lack of
> a gender inspecific pronoun in the English language,
> ...

notwithstanding its idiosyncratic grammar, part of the economy of
written English is due to the fact that 'he' was regarded for centuries
as standing for anybody, ie. it was understood, at least in print, to be
gender non-specific. literal revisionists have created just as many
problems based on imaginary slurs that don't really need to be solved in
the social sciences as they have in the computer sciences. i'm
surprised that they haven't tried to replace the royal 'we' that so many
writers still use.

--
Apologies for my broken keyboard. I'm using the keyboard combination
'kw' to substitute for the broken key that stands for the letter that
falls between 'p' and 'r' in this alphabet.

dawn

Jul 26, 2005, 3:56:26 PM

Gene Wirchenko wrote:
> On 25 Jul 2005 22:06:50 -0700, "Marshall Spight"
> <marshal...@gmail.com> wrote:
>
> >dawn wrote:
> >> Marshall Spight wrote:
>
> [snip]
>
> >On a related note, I really don't think that if you sat down and,
> >without thinking about representation, wrote down all the operators
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> I think that this is the important part.

definitely important for the conceptual and logical data models

> >[Let me just state for the record that my singular indefinite "he"
> >is not gender specific. Rather it is a consequence of the lack of
> >a gender inspecific pronoun in the English language, coupled with
> >a wish to avoid the difficulties of speaking of indefinite people
> >in the plural. Void where prohibited. Driver carries no change.]
>
> There are two. One is "he" which has a gender-neutral meaning
> when gender is unknown.

but does not (typically) render a gender-neutral picture in people's
minds (at least not in my mind). Men might view themselves as a
possible "he" while women will tend to picture someone apart from
themselves filling that role.

I have no idea to what extent such sloppiness in the language from the
start has led to incorrect perceptions, but I'm quite sure the set
thereof is not null.

> The other is "it".

I'd prefer to be called an "it" than a "he". In the 80's most of my IT
mailings were addressed to Donald Wolthuis. So, some people even
ensure that the proper nouns sound male. Enough already!

> [snip]
>
> >> > As for cool examples, check out quicksort in Haskell:
> >> > http://www.haskell.org/aboutHaskell.html
> >> >
> >> > Blew my mind the first time I saw it.
> >> Will do.
> >
> >I can't recommend looking at lots of different languages enough.
> >If all you've ever encountered is the Algol-family, like I had
> >when I started this whole thing, you've encountered only a very
> >narrow slice of what's possible.
>
> That was interesting code. I am interested in how a language
> works in general, since it is not enough just to code the part that
> language is good for.
>
> [snip]
>
> >> Yup, let's just hope no one else is attempting to follow this one.
>
> Bzzzt!

Rats -- you found us. And, as with any of my postings, if I wrote
anything that wasn't exactly brilliant, please disregard and don't
count it against me. Are you left with a null? smiles. --dawn

Gene Wirchenko

Jul 26, 2005, 6:21:21 PM

On 26 Jul 2005 12:56:26 -0700, "dawn" <dawnwo...@gmail.com> wrote:

>Gene Wirchenko wrote:
>> On 25 Jul 2005 22:06:50 -0700, "Marshall Spight"
>> <marshal...@gmail.com> wrote:

[snip]

>> >[Let me just state for the record that my singular indefinite "he"
>> >is not gender specific. Rather it is a consequence of the lack of
>> >a gender inspecific pronoun in the English language, coupled with
>> >a wish to avoid the difficulties of speaking of indefinite people
>> >in the plural. Void where prohibited. Driver carries no change.]
>>
>> There are two. One is "he" which has a gender-neutral meaning
>> when gender is unknown.
>
>but does not (typically) render a gender-neutral picture in people's
>minds (at least not in my mind). Men might view themselves as a
>possible "he" while women will tend to picture someone apart from
>themselves filling that role.
>
>I have no idea to what extent such sloppiness in the language from the
>start has led to incorrect perceptions, but I'm quite sure the set
>thereof is not null.

Nor I.

>> The other is "it".
>
>I'd prefer to be called an "it" than a "he". In the 80's most of my IT
>mailings were addressed to Donald Wolthuis. So, some people even
>ensure that the proper nouns sound male. Enough already!

I, OTOH, have had my name corrected to "Jen" in one mailing where
it seems it was supposed that only women would be interested. I am
interested in how women there are in business.

[snip]

>> >> Yup, let's just hope no one else is attempting to follow this one.
>>
>> Bzzzt!
>
>Rats -- you found us. And, as with any of my postings, if I wrote
>anything that wasn't exactly brilliant, please disregard and don't
>count it against me. Are you left with a null? smiles. --dawn

Zero, ah, zee-row is not the same as null.

Sincerely,

Gene Wirchenko

dawn

Jul 27, 2005, 12:22:29 AM

Gene Wirchenko wrote:
> On 26 Jul 2005 12:56:26 -0700, "dawn" <dawnwo...@gmail.com> wrote:
>> Are you left with a null?
>
> Zero, ah, zee-row is not the same as null.

I missed the word "set" before the ? I use a 2VL where a null is like
a null set. cheers --dawn

Marshall Spight

Jul 27, 2005, 11:44:08 AM

dawn wrote:
>
> I missed the word "set" before the ? I use a 2VL where a null is like
> a null set.

What is a 'null set'? Is that like the empty set?

Marshall

dawn

Jul 27, 2005, 12:34:50 PM

yes

http://www.swif.uniba.it/lei/foldop/foldoc.cgi?null+set
http://en.wikipedia.org/wiki/Null_set

--dawn

Gene Wirchenko

Jul 27, 2005, 12:45:24 PM

What are the two values of your logic? I use true and false
myself.

Sincerely,

Gene Wirchenko

dawn

Jul 27, 2005, 2:12:04 PM

Gene Wirchenko wrote:
> On 26 Jul 2005 21:22:29 -0700, "dawn" <dawnwo...@gmail.com> wrote:
>
> >Gene Wirchenko wrote:
> >> On 26 Jul 2005 12:56:26 -0700, "dawn" <dawnwo...@gmail.com> wrote:
> >>> Are you left with a null?
> >>
> >> Zero, ah, zee-row is not the same as null.
> >
> >I missed the word "set" before the ? I use a 2VL where a null is like
> >a null set. cheers --dawn
>
> What are the two values of your logic? I use true

for the physical switch being set to the up position, right?

> and false

and down position. Yes, Gene, those are the two I choose as well,
ignoring the shades of gray. --dawn

Marshall Spight

Jul 27, 2005, 2:47:09 PM

Wow. I hadn't heard that term before. Given how much confusion
there is around the semantics of null in SQL, I think I'm going
to steer clear of it, especially in this field. I can easily
see it causing confusion where the more popular "empty set" term
wouldn't.
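
One way to see why the word "null" invites confusion: an empty set is an ordinary value under two-valued logic, whereas SQL's NULL brings a third truth value with it. This is a minimal Python sketch, assuming `None` stands in for SQL's NULL/UNKNOWN; `eq3` and `and3` are hypothetical helper names, not any library's API.

```python
def eq3(a, b):
    """SQL-style equality: comparing with NULL yields UNKNOWN (None here)."""
    if a is None or b is None:
        return None
    return a == b


def and3(a, b):
    """Three-valued AND, with None standing for UNKNOWN."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


empty = set()

print(empty == set())       # the empty set equals itself: plain two-valued logic
print(len(empty))           # and it has a definite size
print(eq3(None, None))      # NULL = NULL is UNKNOWN, not TRUE
print(and3(True, None))     # UNKNOWN propagates through AND
print(and3(False, None))    # unless the other operand is already FALSE
```

The empty set answers every question put to it with true or false. NULL does not, which is the difference the "null set" wording tends to blur.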

Marshall

dawn

Jul 27, 2005, 3:10:13 PM

Marshall Spight wrote:
> dawn wrote:
> > Marshall Spight wrote:
> > > dawn wrote:
> > > >
> > > > I missed the word "set" before the ? I use a 2VL where a null is like
> > > > a null set.
> > >
> > > What is a 'null set'? Is that like the empty set?
> > >
> > yes
> >
> > http://www.swif.uniba.it/lei/foldop/foldoc.cgi?null+set
> > http://en.wikipedia.org/wiki/Null_set
>
> Wow. I hadn't heard that term before.

I'm really surprised by that. I wonder if/when that term lost favor.
I'll make a mental note.

> Given how much confusion
> there is around the semantics of null in SQL, I think I'm going
> to steer clear of it, especially in this field.

Hmmm. It seems an interesting sport to derive a term from one
discipline, alter the meaning, and then stop using it in the discipline
it was taken from so as not to confuse things. Language is so fluid.

> I can easily
> see it causing confusion where the more popular

I didn't realize it was. I'm sure I've spoken the words "null set"
more than "empty set" in my life. We both learned somethin' there.
Thanks. --dawn

Marshall Spight

Jul 27, 2005, 10:50:19 PM

dawn wrote:
> Marshall Spight wrote:
> > > >
> > > > What is a 'null set'? Is that like the empty set?
> > > >
> > > yes
> > >
> > Wow. I hadn't heard that term before.
>
> I'm really surprised by that. I wonder if/when that term lost favor.
> [...] We both learned somethin' there.
> Thanks. --dawn

If my ignorance can help just one person, it will have been worth it.

Marshall

Jonathan Leffler

Jul 28, 2005, 1:47:24 AM

dawn wrote:

> Gene Wirchenko wrote:
>> What are the two values of your logic? I use true
>
> for the physical switch being set to the up position, right?
>
>>and false
>
> and down position. Yes, Gene, those are the two I choose as well,
> ignoring the shades of gray. --dawn

These comments suggest an American (USA) perspective on the polarity of
switches. In the UK, a (light) switch that is up is normally off and
one that's down is normally on - at least, for most switches. It still
catches me out.

(I hope I have the attributions right - apologies if not.)

--
Jonathan Leffler #include <disclaimer.h>
Email: jlef...@earthlink.net, jlef...@us.ibm.com
Guardian of DBD::Informix v2005.01 -- http://dbi.perl.org/

dawn

Jul 28, 2005, 11:00:38 AM

Jonathan Leffler wrote:
> dawn wrote:
> > Gene Wirchenko wrote:
> >> What are the two values of your logic? I use true
> >
> > for the physical switch being set to the up position, right?
> >
> >>and false
> >
> > and down position. Yes, Gene, those are the two I choose as well,
> > ignoring the shades of gray. --dawn
>
> These comments suggest an American (USA) perspective on the polarity of
> switches. In the UK, a (light) switch that is up is normally off and
> one that's down is normally on - at least, for most switches. It still
> catches me out.

I have never thought about whether hardware switches had this same
pattern. I wasn't referring to just any switches, but those found on
the front panel of a computer way back when. I hit the tail-end of the
hardware switch era, but I do recall setting physical switches in the
late 70's. This was then imitated with parameter settings. Well into
the 80's I recall variables named with -sw or -switch if they held
values of 1 or 0.

So, I wonder if computers in the UK had hardware switches reversed from
US hardware. Were there computers where up was 0 and down was 1?
--dawn
