Not much, except that XQuery is more designed as an ad-hoc query
language than a general programming language. That means that queries
that are thought to be typical XML queries can be more succinctly
formulated in a more readable way, and the language is designed such
that typical database-style query optimization becomes easier than it
would be for LISP.
-- Jan Hidders
What would be interesting to me would be to understand more about the
various techniques for tree traversal. The one I find most aesthetically
appealing is ML-style pattern matching, but this has the disadvantage
vs. XQuery of requiring the full tree structure to be specified (via
patterns) even when you only want to query subtrees or particular kinds
of nodes or what have you.
I have no candidate in mind for a framework for comparing the expressive
power of tree querying or tree rewriting languages. Sigh. The situation
is so much clearer and cleaner with relations.
I vaguely wonder whether it would be possible to use, or adapt, relational
techniques for querying nested relations as an alternative to either
XQuery or pattern matching.
Marshall
Does it? Or would the type polymorphism not allow you to leave a fair
bit unspecified, and, if you never access the unspecified parts, let
them fall away, irrelevant?
> I have no candidate in mind for a framework for comparing the
> expressive power of tree querying or tree rewriting
> languages. Sigh. The situation is so much clearer and cleaner with
> relations.
The trouble is that relations require taking something of a "set"
perspective, and if facts aren't being expressed that way naturally,
well, down that road lies some ghastliness :-).
> I vaguely wonder whether it would be possible to use, or adapt,
> relational techniques for querying nested relations as an
> alternative to either XQuery or pattern matching.
Well, it sorta looks like sets versus, um, continuations...
--
(format nil "~S@~S" "cbbrowne" "gmail.com")
http://linuxdatabases.info/info/slony.html
"Lumping configuration data, security data, kernel tuning parameters,
etc. into one monstrous fragile binary data structure is really dumb."
- David F. Skoll
Given that over the last century virtually all of mathematics
(mereology aside) has been recontextualised in terms of the "set"
perspective, I'd be mighty impressed if you could find a fact that
doesn't do so naturally too.
[Note that the above quote attributed to the estimable Jan Hidders was
actually said by me.]
This is a powerful and important point, and deserves much weight.
But at the same time, I don't think sets + the relational algebra are
enough to keep me happy. I want lists, too.
I might also note that over the last half century, virtually all of
programming has been done with ordered data, not unordered data. Lisp and
Fortran, C++ and Java, etc., all have arrays but not sets built in. Note
that I'm not saying that I think we want lists instead of sets-- we want
both.
Marshall
In elementary probability theory bags are fundamental units, not sets.
I think the answer is, vaguely, yes. :-) You can think of the tree as a
binary relation, and if you want access to all descendants the system
could conceptually provide the transitive closure to you (so for the
binary table R you could for example use R* as the notation for its
transitive closure) and from there you could specify your queries as
usual. Most XPath queries would then translate to simple
SELECT-FROM-WHERE queries.
-- Jan Hidders
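A minimal sketch of the idea in the post above, assuming a small hypothetical tree stored as a set of (parent, child) pairs. The names `edges` and `transitive_closure` are illustrative and do not come from the thread, and the function computes the non-reflexive closure (a node is not its own descendant), which is one possible reading of R*.

```python
def transitive_closure(edges):
    """Return the transitive closure of a binary relation given as (parent, child) pairs."""
    closure = set(edges)
    while True:
        # Join the closure with the base relation: (a, b) and (b, c) give (a, c).
        new_pairs = {(a, c) for (a, b) in closure for (b2, c) in edges if b == b2}
        if new_pairs <= closure:
            return closure
        closure |= new_pairs

# Hypothetical document tree: book -> ch1 -> sec1 -> para1, and book -> ch2.
edges = {("book", "ch1"), ("book", "ch2"), ("ch1", "sec1"), ("sec1", "para1")}
closure = transitive_closure(edges)

# Rough analogue of an XPath descendant step, written as a
# SELECT-FROM-WHERE style filter over the closure relation.
descendants_of_book = sorted(c for (a, c) in closure if a == "book")
print(descendants_of_book)  # ['ch1', 'ch2', 'para1', 'sec1']
```

Once the closure is available as an ordinary relation, the descendant query is a plain selection, which is the point being made about XPath queries.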
Doesn't Tutorial-D include this capability?
> I think the answer is, vaguely, yes. :-) You can think of the tree as a
> binary relation, and if you want access to all descendants the system
> could conceptually provide the transitive closure to you (so for the
> binary table R you could for example use R* as the notation for its
> transitive closure) and from there you could specify your queries as
> usual. Most XPath queries would then translate to simple
> SELECT-FROM-WHERE queries.
If they are not nested relations but nested ordered lists, as possible
with XML documents, then you would lose information this way, right?
I'll admit this is flying over my head so I might not be understanding
your response. Cheers! --dawn
Right. But you could add an extra binary relation that defines the order.
-- Jan Hidders
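One way the extra order relation might look, as a sketch only. The relation names `child` and `precedes` and the sample data are hypothetical, and the sketch assumes `precedes` is a total order over each group of siblings (every earlier sibling is listed explicitly).

```python
# Hypothetical data: an unordered child relation plus a separate
# "precedes" relation that records sibling order.
child = {("pizza", "crust"), ("pizza", "meats"), ("pizza", "veggies")}
precedes = {("crust", "meats"), ("crust", "veggies"), ("meats", "veggies")}

def ordered_children(parent):
    """Recover the ordered child list by joining `child` with `precedes`."""
    kids = {c for (p, c) in child if p == parent}
    # A child's position is the number of its siblings that precede it.
    rank = {k: sum(1 for (a, b) in precedes if b == k and a in kids) for k in kids}
    return sorted(kids, key=rank.get)

print(ordered_children("pizza"))  # ['crust', 'meats', 'veggies']
```

The order survives, but only because the query performs the join and the sort explicitly.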
Right-o, and then the user (aka developer) must be sure to do the join
and the order-by as they will not be handled by the tools.
I suspect that multiple multivalues will still not be pretty if using
an extended SQL on nested relations. I keep meaning to look at XQuery
to see if it handles output as in the old pizza example with multiple
multivalues, such as the following output (sorry for dots in effort to
get good enough spacing)
PizzaName.....Crust.........Meats...............Veggies.........Cheeses
================================================
OurFamous.....DeepDish...Pepperoni.........Mushrooms....Parmesan
.......................................Sausage................................Moz
.......................................Ham
Everything.......Thick.........Pepperoni.........Olives...........Moz
.......................................Ham.................Onions..........Provolone
...............................................................Peppers
Thanks. --dawn
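For concreteness, here is a sketch of the output shape being asked about, written in Python rather than XQuery, so it says nothing about what XQuery itself can do. It reads the dotted layout as independent, top-aligned lists per column, which is an assumption, and the field names and the helper `rows` are illustrative.

```python
from itertools import zip_longest

# Data transcribed from the example above ("Olives" spelled out).
pizzas = [
    {"PizzaName": "OurFamous", "Crust": "DeepDish",
     "Meats": ["Pepperoni", "Sausage", "Ham"],
     "Veggies": ["Mushrooms"],
     "Cheeses": ["Parmesan", "Moz"]},
    {"PizzaName": "Everything", "Crust": "Thick",
     "Meats": ["Pepperoni", "Ham"],
     "Veggies": ["Olives", "Onions", "Peppers"],
     "Cheeses": ["Moz", "Provolone"]},
]

def rows(pizza):
    """Yield one output row per list position, padding shorter lists with blanks."""
    multi = zip_longest(pizza["Meats"], pizza["Veggies"], pizza["Cheeses"], fillvalue="")
    for i, (meat, veg, cheese) in enumerate(multi):
        # The single-valued columns appear only on the first row of each pizza.
        name, crust = (pizza["PizzaName"], pizza["Crust"]) if i == 0 else ("", "")
        yield (name, crust, meat, veg, cheese)

header = ("PizzaName", "Crust", "Meats", "Veggies", "Cheeses")
table = [header] + [row for pizza in pizzas for row in rows(pizza)]
for row in table:
    print("".join(cell.ljust(12) for cell in row).rstrip())
```

The three multivalued columns are zipped side by side without implying any relationship between, say, the second meat and the second cheese.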
Cool!
I note your use of the word "most"....
Marshall
> an extended SQL on nested relations. I keep meaning to look at XQuery
> to see if it handles output as in the old pizza example with multiple
> multivalues, such as the following output (sorry for dots in effort to
> get good enough spacing)
In certain other contexts, I have found the HTML tag "code" (matched by
"/code") to be useful.
As an experiment, I'm going to put these tags in your example, following,
and then look at the result via Google groups.
<code>
PizzaName.....Crust.........Meats...............Veggies.........Cheeses
================================================
OurFamous.....DeepDish...Pepperoni.........Mushrooms....Parmesan
.......................................Sausage................................Moz
.......................................Ham
Everything.......Thick.........Pepperoni.........Olives...........Moz
.......................................Ham.................Onions..........Provolone
...............................................................Peppers
</code>
That depends of course upon the tools, and note we were talking about a
hypothetical extension of relational tools.
> I keep meaning to look at XQuery
> to see if it handles output as in the old pizza example with multiple
> multivalues, such as the following output (sorry for dots in effort to
> get good enough spacing)
Please use a fixed-width font if you want dependable spacing.
> PizzaName.....Crust.........Meats...............Veggies.........Cheeses
> ================================================
> OurFamous.....DeepDish...Pepperoni.........Mushrooms....Parmesan
> .......................................Sausage................................Moz
> .......................................Ham
>
> Everything.......Thick.........Pepperoni.........Olives...........Moz
> .......................................Ham.................Onions..........Provolone
> ...............................................................Peppers
Provided you reformulate it such that the output is in XML the answer is
"that is what it was designed to do".
-- Jan Hidders
Let me ask a more concrete question. Suppose I want to represent a tree
where all nodes are of the same type, say Person. A link from person A to
person B exists, say, if A is a parent of B. (I could've just called it a
family tree I guess).
It is reasonably obvious how to store this information persistently
using both XML and a simple binary relation. Asking the question "is
person X an ancestor of person Y?" is rather straightforward in XQuery.
How would I formulate an equivalent query in SQL?
I strongly suspect it is the query language that separates XML and
Relational schemes and not the representation. Comments?
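One possible answer, sketched with Python's built-in `sqlite3` module: where the SQL product supports recursive common table expressions (`WITH RECURSIVE`, introduced in SQL:1999), the ancestor test can be written directly over the binary relation. The table name `parent_of`, the sample people, and the helper `is_ancestor` are all hypothetical, and support for this syntax varies between products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent_of (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO parent_of VALUES (?, ?)",
    [("Ann", "Bob"), ("Bob", "Cid"), ("Cid", "Dee"), ("Ann", "Eve")],
)

# Walk upwards from :y, collecting parents, grandparents, and so on,
# then check whether :x is among them.
IS_ANCESTOR = """
WITH RECURSIVE ancestors(name) AS (
    SELECT parent FROM parent_of WHERE child = :y
    UNION
    SELECT p.parent FROM parent_of AS p JOIN ancestors AS a ON p.child = a.name
)
SELECT EXISTS (SELECT 1 FROM ancestors WHERE name = :x)
"""

def is_ancestor(x, y):
    """Return True if x is an ancestor of y in the parent_of relation."""
    (found,) = conn.execute(IS_ANCESTOR, {"x": x, "y": y}).fetchone()
    return bool(found)

print(is_ancestor("Ann", "Dee"))  # True
print(is_ancestor("Eve", "Dee"))  # False
```

The representation is the same binary relation either way; what changes is whether the query language offers a recursion or closure operator.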
Why don't you google "trees sql"?
The problem is not a particular syntax. You can't design a query
language by randomly throwing in some ad hoc language elements, e.g.
take some ideas from regular expressions, pattern matching, spice it
with loop constructions from procedural languages, etc. This ad hoc
thingy might work, but it would never really get off the ground. You
have to have some theoretical insight. And the area of graphs, partial
order relations, and combinatorics is difficult from a theoretical
perspective -- harder than first order logic.
>The trouble is that relations require taking something of a "set"
>perspective, and if facts aren't being expressed that way naturally,
>well, down that road lies some ghastlyness :-).
The thing that has bothered me is that those promoting the relational
model typically provide worthless examples, typically employee and
department, and seem to almost religiously avoid non-trivial working
examples, particularly those that might seem problematic for the
scheme.
For example, what is a type? What goes in the 'set', as you express
it, never mind any particular domain? Why is a book a 'type', when
there are various sorts of books? Why is a chapter a 'type', when the
chapters in the same book might be of a very different sort? Is the
appendix which is more an index a type to itself? And so on. Are the
paragraphs a 'type', and is the paragraph in chapter 4 different than
the paragraph entities in relation chapter 5? To normalize things does
it require literally hundreds and hundreds of 'relations'/tables to
represent the structure? Just think of the 'joins'. And wasn't the RM
intended to free people from the 'tyranny of structure'?
It's the same reason that babies get fed milk first, mashed food later
and finally solids! You start simple to maximise the likelihood that the
reader will "get it" and hopefully be able to employ the techniques to
the complex problems they face. Pretty simple IMO.
> For example, what is a type? What goes in the 'set', as you express
> it, never mind any particular domain? Why is a book a 'type', when
> there are various sorts of books? Why is a chapter a 'type', when the
> chapters in the same book might be of a very different sort? Is the
> appendix which is more an index a type to itself? And so on. Are the
> paragraphs a 'type', and is the paragraph in chapter 4 different than
> the paragraph entities in relation chapter 5? To normalize things does
> it require literally hundreds and hundreds of 'relations'/tables to
> represent the structure?
Have you seen "hundreds and hundreds" first hand? What were they
responsible for expressing?
> Just think of the 'joins'. And wasn't the RM
> intended to free people from the 'tyranny of structure'?
Nope - where did you get an idea like that? The aim is to "improve" the
structure and still make sure that physical aspects don't (irrationally)
impinge on the logical interests.
Cheers, Frank.
>Mark Johnson wrote:
>> The thing that has bothered me is that those promoting the relational
>> model typically provide worthless examples, typically employee and
>> department, and seem to almost religiously avoid non-trivial working
>> examples, particularly those that might seem problematic for the
>> scheme.
>It's the same reason that babies get fed milk first, mashed food later
>and finally solids!
But those aren't babies reading those texts and articles. Babies don't
read all that well to begin with. You seem to have avoided the issue,
here. And there's that 'particularly . . . problematic' bit. That
sort of explained it, for me. I don't know if you differ.
>> For example, what is a type? What goes in the 'set', as you express
>> it, never mind any particular domain? Why is a book a 'type', when
>> there are various sorts of books? Why is a chapter a 'type', when the
>> chapters in the same book might be of a very different sort? Is the
>> appendix which is more an index a type to itself? And so on. Are the
>> paragraphs a 'type', and is the paragraph in chapter 4 different than
>> the paragraph entities in relation chapter 5? To normalize things does
>> it require literally hundreds and hundreds of 'relations'/tables to
>> represent the structure?
>Have you seen "hundreds and hundreds" first hand? What were they
>responsible for expressing?
Structure. Pascal says much the same. More normalized, more tables. And
what about all those joins?
>> Just think of the 'joins'. And wasn't the RM
>> intended to free people from the 'tyranny of structure'?
>Nope - where you get an idea like that?
Because that was the idea behind it. But the difference was that, if I
recall, the structure and programming went together in the 50s and
early 60s. They weren't separated out. Just generally, a lot of
development has been had because parts of processes have been
separated in a 'workflow' scheme. For a simple job, it seems
excessive. For anything else, it simplifies, compartmentalizes and
reduces error. And there have been a lot of decades between then and
now. And the RM has been implemented to certain degrees. Now if you
put such a hierarchy on an existing RM or pseudo-such, it seems as if
it would be a different matter - even while violating the basic tenets
of the RM to do so. But it's not necessarily tied to a structure, as
the RM is so tied by joins and the relations themselves. And it makes
structural modification difficult.
Remember what I said: For example, what is a type? What goes in the
'set', as you express it, never mind any particular domain? Why is a
book a 'type', when there are various sorts of books? Why is a chapter
a 'type', when the chapters in the same book might be of a very
different sort? Is the appendix which is more an index a type to
itself? And so on. Are the paragraphs a 'type', and is the paragraph
in chapter 4 different than the paragraph entities in relation chapter
5?
>The aim is to "improve" the
>structure and still make sure that physical aspects don't (irrationally)
>impinge on the logical interests.
I'm not sure that means anything. Improved compared to what,
precisely? And why irrationally in parentheses, etc? Could you
elaborate?
Errr - it's allegory, at least it was intended to be such!
> And there's that "particularly . . . problematic', bit. That
> sort of explained it, for me. I don't know if you differ.
Neither do I. Can you point one of these "problematic" working examples
out to me?
>>>For example, what is a type? What goes in the 'set', as you express
>>>it, never mind any particular domain? Why is a book a 'type', when
>>>there are various sorts of books? Why is a chapter a 'type', when the
>>>chapters in the same book might be of a very different sort? Is the
>>>appendix which is more an index a type to itself? And so on. Are the
>>>paragraphs a 'type', and is the paragraph in chapter 4 different than
>>>the paragraph entities in relation chapter 5? To normalize things does
>>>it require literally hundreds and hundreds of 'relations'/tables to
>>>represent the structure?
>
>>Have you seen "hundreds and hundreds" first hand? What were they
>>responsible for expressing?
>
> Structure. Pascal says much the same. More normalized, more tables. And
> what about all those joins?
I'm on record as having big doubts about the D proposal. Joins are not a
drama (to moi at least), but the TTM is another kettle of fish.
>>>Just think of the 'joins'. And wasn't the RM
>>>intended to free people from the 'tyranny of structure'?
>
>>Nope - where you get an idea like that?
>
> Because that was the idea behind it.
Gee, and all this time I thought it was about common sense. Structure
does _not_ imply or mandate tyranny even though they can co-exist as
several sad periods in human history confirm.
> But the difference was that, if I
> recall, the structure and programming went together in the 50s and
> early 60s. They weren't separated out. Just generally, a lot of
> development has been had because parts of processes have been
> separated in a 'workflow' scheme. For a simple job, it seems
> excessive. For anything else, it simplifies, compartmentalizes and
> reduces error. And there have been a lot of decades between then and
> now. And the RM has been implemented to certain degrees. Now if you
> put such a hierarchy on an existing RM or pseudo-such, it seems as if
> it would be a different matter - even while violating the basic tenets
> of the RM to do so. But it's not necessarily tied to a structure, as
> the RM is so tied by joins and the relations themselves. And it makes
> structural modification difficult.
Does it? I must be dreaming then - silly me!
> Remember what I said: For example, what is a type? What goes in the
> 'set', as you express it, never mind any particular domain? Why is a
> book a 'type', when there are various sorts of books? Why is a chapter
> a 'type', when the chapters in the same book might be of a very
> different sort? Is the appendix which is more an index a type to
> itself? And so on. Are the paragraphs a 'type', and is the paragraph
> in chapter 4 different than the paragraph entities in relation chapter
> 5?
Choices, simply choices, make them, or don't - what is the fuss all about?
>>The aim is to "improve" the
>>structure and still make sure that physical aspects don't (irrationally)
>>impinge on the logical interests.
>
> I'm not sure that means anything. Improved compared to what,
> precisely? And why irrationally in parentheses, etc? Could you
> elaborate?
Sure. Improved compared to the mess of enmeshed programmes and data that
arose during the 60's and 70's. Why did I use irrationally - because
there are physical aspects that will impinge - such as the available
storage on a system might impact quite reasonably on how you build a
huge solution, but whether a system is big or little endian is, and
should be, of no interest to a business analyst wanting to record text
for your books or chapters. Capiche?
Cheers, Frank.
Employees and departments are tables that will exist in a
(probably relational) database in the HR department of
every corporation in the world. It is about as real-world
an example as you can get. It may be the case that
you don't write applications that do that sort of thing,
but lots of people do.
When you're trying to communicate, producing examples that are
illustrative while also as simple as possible is the best thing you
can do. Many relational writers are very good at this.
If one belongs to the crowd that says, "I want a J2EE book, and
the more it weighs the better it must be," one probably does
not appreciate this perspective.
> For example, what is a type? What goes in the 'set', as you express
> it, never mind any particular domain? Why is a book a 'type', when
> there are various sorts of books? Why is a chapter a 'type', when the
> chapters in the same book might be of a very different sort? Is the
> appendix which is more an index a type to itself? And so on. Are the
> paragraphs a 'type', and is the paragraph in chapter 4 different than
> the paragraph entities in relation chapter 5?
I don't understand your point here. It seems to be written to
critique a set of types to describe a book. What types and
what book? Or are you saying that type theory has nothing
useful to say about books?
> To normalize things does
> it require literally hundreds and hundreds of 'relations'/tables to
> represent the structure?
How many classes does a Java program have? How many
people does a company need?
> Just think of the 'joins'.
Joins are wonderful; I love thinking about them. I wish the
programming languages I use had anything as powerful.
> And wasn't the RM
> intended to free people from the 'tyranny of structure'?
No.
And anyway, you can't not have structure.
Marshall
>Mark Johnson wrote:
>> Christopher Browne <cbbr...@acm.org> wrote:
>> The thing that has bothered me is that those promoting the relational
>> model typically provide worthless examples, typically employee and
>> department, and seem to almost religiously avoid non-trivial working
>> examples, particularly those that might seem problematic for the
>> scheme.
>Employees and departments are tables that will exist in a
>(probably relational) database in the HR department of
>every corporation in the world. It is about as real-world
>an example as you can get. It may be the case that
>you don't write applications that do that sort of thing,
>but lots of people do.
No they don't. Such examples that are shown are trivial. You don't pay
someone to do something that easy, that simple. A real world
requirement isn't anything like that. The other guy, in fact, agreed
with me, but defended the practice. So you disagree with him, as well.
And I still think that you particularly don't see certain examples
because they might seem to be counter-examples of the entire scheme.
>If one belongs to the crowd that says, "I want a J2EE book, and
>the more it weighs the better it must be," one probably does
>not appreciate this perspective.
And I'm sure it rains on Mondays in Sweden, depending. Which has what
to do with what?
>> For example, what is a type? What goes in the 'set', as you express
>> it, never mind any particular domain? Why is a book a 'type', when
>> there are various sorts of books? Why is a chapter a 'type', when the
>> chapters in the same book might be of a very different sort? Is the
>> appendix which is more an index a type to itself? And so on. Are the
>> paragraphs a 'type', and is the paragraph in chapter 4 different than
>> the paragraph entities in relation chapter 5?
>I don't understand your point here. It seems to be written to
>critique a set of types to describe a book. What types and
>what book? Or are you saying that type theory has nothing
>useful to say about books?
I'm just wondering what it says about them, given that was the
example.
>> To normalize things does
>> it require literally hundreds and hundreds of 'relations'/tables to
>> represent the structure?
>How many classes does a Java program have? How many
>people does a company need?
Do you disagree with that idea that a more normalized scheme tends to
increase the number of tables? I don't know the formula. But are you
saying such a formula could not even exist?
>> Just think of the 'joins'.
>Joins are wonderful;
Most anything to excess ceases to be wonderful, unless we're literally
talking the beatific vision (which isn't something you or I could live
through, by the way).
>> And wasn't the RM
>> intended to free people from the 'tyranny of structure'?
>No.
>And anyway, you can't not have structure.
No, of course not. Structure. It's transcendent. It transcends any
scheme. Any scheme has to represent that structure, that order.
Obviously. Or else it's representing something else.
But wasn't the idea to separate out any fixed structure by reducing a
very well-defined scheme to a series of connected relations? Wasn't
the idea that while, again obviously, the RM itself describes a fairly
rigid structure, it didn't face the problems of an early generation of
COBOL, for example?
If I misunderstand, then so be it. If that wasn't a typical problem,
then what did Codd set out to ameliorate?
>Mark Johnson wrote:
>> Frank Hamersley <terabit...@bigpond.com> wrote:
>>>Mark Johnson wrote:
>> And there's that "particularly . . . problematic', bit. That
>> sort of explained it, for me. I don't know if you differ.
>Neither do I. Can you point one of these "problematic" working examples
>out to me?
I mentioned, already. And you defended the use of such with your -
allegory. Remember?
>>>>Just think of the 'joins'. And wasn't the RM
>>>>intended to free people from the 'tyranny of structure'?
>>>Nope - where you get an idea like that?
>> Because that was the idea behind it.
>Gee, and all this time I thought it was about common sense. Structure
>does _not_ imply or mandate tyranny even though they can co-exist as
>several sad periods in human history confirm.
"Human history"? You lost me, here. What do you mean? The pyramids?
Perhaps I misunderstand what I thought Codd thought was an advantage
of his scheme, decades ago. That could be. What about his rule #9?
>> But the difference was that, if I
>> recall, the structure and programming went together in the 50s and
>> early 60s. They weren't separated out. Just generally, a lot of
>> development has been had because parts of processes have been
>> separated in a 'workflow' scheme. For a simple job, it seems
>> excessive. For anything else, it simplifies, compartmentalizes and
>> reduces error. And there have been a lot of decades between then and
>> now. And the RM has been implemented to certain degrees. Now if you
>> put such a hierarchy on an existing RM or pseudo-such, it seems as if
>> it would be a different matter - even while violating the basic tenets
>> of the RM to do so. But it's not necessarily tied to a structure, as
>> the RM is so tied by joins and the relations themselves. And it makes
>> structural modification difficult.
>Does it?
I don't know. Look at his rules, again.
>> Remember what I said: For example, what is a type? What goes in the
>> 'set', as you express it, never mind any particular domain? Why is a
>> book a 'type', when there are various sorts of books? Why is a chapter
>> a 'type', when the chapters in the same book might be of a very
>> different sort? Is the appendix which is more an index a type to
>> itself? And so on. Are the paragraphs a 'type', and is the paragraph
>> in chapter 4 different than the paragraph entities in relation chapter
>> 5?
>Choices, simply choices, make them, or don't - what is the fuss all about?
In other words, the idea of this 'type' is so vague that it simply
doesn't matter? See below.
>>>The aim is to "improve" the
>>>structure and still make sure that physical aspects don't (irrationally)
>>>impinge on the logical interests.
>> I'm not sure that means anything. Improved compared to what,
>> precisely? And why irrationally in parentheses, etc? Could you
>> elaborate?
>Sure. Improved compared to the mess of enmeshed programmes and data that
>arose during the 60's and 70's.
That's how _I_ understood it. That's why I mentioned this idea that
breaking out parts of processes has sort of defined developments over
the decades. Again, such might seem overkill for a small project. But
. . . etc., as I wrote, before.
>Why did I use irrationally - because
>there are physical aspects that will impinge - such as the available
>storage on a system might impact quite reasonably on how you build a
>huge solution, but whether a system is big or little endian is of and
>should be of no interest to a business analyst wanting to record text
>for your books or chapters. Capiche?
Work with what you have, in other words, and still produce a
theoretically sound solution? Actually, my question went more to what
perhaps you regard as a sort of pointless definitional confusion over
just how things are grouped, labelled, typed. I thought much had been
made about rule #2, and the lack of ambiguity, of like 'things' in
relations. But what is a like thing? That's all.
HR databases certainly do have tables for things like
employees and departments, so it certainly is the
case that lots of people do that sort of thing.
> Such examples that are shown are trivial. You don't pay
> someone to do something that easy, that simple.
> A real world requirement isn't anything like that.
You mean, you're reading books that aren't showing you
full blown HR systems? Such as would be hundreds of
thousands of lines of code long? Of *course* they don't.
They don't in any other field, either.
> And I still think that you particularly don't see certain examples
> because they might seem to be counter-examples of entire scheme.
Of course. When you're writing a book about a technique, you
show what it's good for, not what it's bad for. You don't see
books on Perl focusing on object-oriented programming;
you don't see books on SQL focusing on String processing, and
you don't see books on auto maintenance showing you how
to use a socket set to drive nails.
There are some things that SQL does extraordinarily well.
Handling data in static hierarchies such as employee/department
or customer/invoice/line item is something SQL does better
than anything else. Handling dynamic hierarchies, like a
parse tree, it has a harder time with. Should we reject a
tool because it doesn't do everything well, or should we
instead just use it for what it's good at?
> >If one belongs to the crowd that says, "I want a J2EE book, and
> >the more it weighs the better it must be," one probably does
> >not appreciate this perspective.
>
> And I'm sure it rains on Mondays in Sweden, depending. Which
> has what to do with what?
A good demonstration of something is a minimal demonstration
of something. When I want an example of how to do something,
I want it small; that will make it easier to understand. This
is especially true when teaching new concepts. That's why
SQL books show the employee table as having 5 employees
in it; if it had a hundred, the concept would actually be *less*
clear because it would be obscured under all the data.
If one (metaphorically speaking) supersizes one's mental fries,
one won't appreciate this.
Furthermore,
If one belongs to the crowd that says, "I want a J2EE book, and
the more it weighs the better it must be," one probably does
not appreciate this perspective.
> >> To normalize things does
> >> it require literally hundreds and hundreds of 'relations'/tables to
> >> represent the structure?
>
> >How many classes does a Java program have? How many
> >people does a company need?
>
> Do you disagree with that idea that a more normalized scheme tends to
> increase the number of tables?
Not at all. For any schema, there is a schema with the equivalent
information that contains only a single relation. Such a schema
will maximize the potential for update anomalies.
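A small illustration of the update anomaly mentioned above, using hypothetical employee and department data. The field names and phone numbers are invented for the sketch, and plain Python structures stand in for relations.

```python
# Hypothetical single-relation design: the department's phone number is
# repeated on every employee row.
employees = [
    {"emp": "Alice", "dept": "Sales", "dept_phone": "555-0100"},
    {"emp": "Bob", "dept": "Sales", "dept_phone": "555-0100"},
]

# Updating the phone through one row leaves the other row stale.
employees[0]["dept_phone"] = "555-0199"
phones_for_sales = {row["dept_phone"] for row in employees if row["dept"] == "Sales"}
print(phones_for_sales)  # two conflicting numbers for one department

# Normalized design: the department fact is stored once, so a single
# update cannot leave it disagreeing with itself.
departments = {"Sales": {"phone": "555-0100"}}
works_in = [("Alice", "Sales"), ("Bob", "Sales")]
departments["Sales"]["phone"] = "555-0199"
phones_after = {departments[d]["phone"] for (_, d) in works_in if d == "Sales"}
print(phones_after)  # {'555-0199'}
```

The normalized version has one more "table", and reading an employee's department phone now takes a join, but the inconsistent state is no longer representable.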
> >> Just think of the 'joins'.
>
> >Joins are wonderful;
>
> Most anything to excess ceases to be wonderful,
This statement relies on the word "excess." The real
question is, what is the *right* number of tables? And
the answer is, the number of tables that produces
the fully normalized schema.
Tables and joins aren't an expense to be minimized.
Columns aren't something you want to be sure to have
the fewest of, no more so than rows.
One runs into this same idea in the Java forums sometimes.
Someone has a problem with a class; someone else shows
them how their problem may be eliminated by decomposing
the class into two classes; the OP rejects that solution
because, you know, "too many classes."
It's just a mental fixation, and not an actual problem.
"Just think of the joins" is the same non-problem as
"too many classes."
> >> And wasn't the RM
> >> intended to free people from the 'tyranny of structure'?
>
> >No.
> >And anyway, you can't not have structure.
>
> No, of course not. Structure. It's transcendent. It transcends any
> scheme. Any scheme has to represent that structure, that order.
> Obviously. Or else it's representing something else.
I had a hard time with this paragraph. I'm not at all sure
I understand what it means. As near as I can tell, what
it's trying to say is "nyaa nyaa nyaa, nyaa nyaa." But
I could be wrong. Could you perhaps rephrase this?
Or else perhaps if you disagree with the necessity of
structure, could you perhaps show me some example
of some data that you could meaningfully use that
didn't have any structure.
> But wasn't the idea to separate out any fixed structure by reducing a
> very well-defined scheme to a series of connected relations? Wasn't
> the idea that while, again obviously, the RM itself describes a fairly
> rigid structure, it didn't face the problems of an early generation of
> COBOL, for example?
>
> If I misunderstand, then so be it. If that wasn't a typical problem,
> then what did Codd set out to ameliorate?
I have no information about the problems of early generations
of COBOL, nor do I know anything about Codd's intentions.
Sorry.
Marshall
>Mark Johnson wrote:
>> "Marshall Spight" <marshal...@gmail.com> wrote:
>> >> The thing that has bothered me is that those promoting the relational
>> >> model typically provide worthless examples, typically employee and
>> >> department, and seem to almost religiously avoid non-trivial working
>> >> examples, particularly those that might seem problematic for the
>> >> scheme.
>> >Employees and departments are tables that will exist in a
>> >(probably relational) database in the HR department of
>> >every corporation in the world. It is about as real-world
>> >an example as you can get. It may be the case that
>> >you don't write applications that do that sort of thing,
>> >but lots of people do.
>> No they don't.
>HR
I don't care - what. You took me out of context. This is what I wrote,
instead:
>> Such examples that are shown are trivial. You don't pay
>> someone to do something that easy, that simple.
>> A real world requirement isn't anything like that.
>You mean, you're reading books that aren't showing you
>full blown HR systems?
And that's a straw man. That's not what I said. But at least we agree
that the examples typically shown are pointlessly trivial. Actually,
it's a common complaint against academics.
And:
>> And I still think that you particularly don't see certain examples
>> because they might seem to be counter-examples of entire scheme.
>Of course. When you're writing a book about a technique, you
>show what it's good for, not what it's bad for.
Sounds more like promotion or salesmanship. You only hear about what
was wrong with last year's model when it comes time to sell next
year's model.
>There are some things that SQL does extraordinarily well.
In fact, wasn't it one of Codd's rules that SQL be the means of
manipulation, rather than some other sort of code, in addition?
>I want it small; that will make it easier to understand.
But there's a problem with that. One of those is - scale. How does it
scale? Does it scale? The subject, here, is technical papers and
presentations. And in addition, do the trivial academic examples
conceal a fundamental problem with the approach? Does it work only
narrowly and in the most simply contrived situation, even regardless
of scale?
>> Do you disagree with that idea that a more normalized scheme tends to
>> increase the number of tables?
>Not at all. For any schema, there is a schema with the equivalent
>information that contains only a single relation. Such a schema
>will maximize the potential for update anomalies.
I was simply saying that relations tend to multiply, and so joins.
>> >> Just think of the 'joins'.
>> >Joins are wonderful;
>> Most anything to excess ceases to be wonderful,
>This statement relies on the word "excess." The real
>question is, what is the *right* number of tables?
That IS a good question. But it goes to what is a - type, I think.
You may think it's a sort of pointless question. I don't know. Just
throw in whatever 'sounds good', and call it an entity?
>Tables and joins aren't an expense to be minimized.
>Columns aren't something you want to be sure to have
>the fewest of, no more so than rows.
In other words, what if an attribute doesn't belong?
>One runs into this same idea in the Java forums sometimes.
>Someone has a problem with a class; someone else shows
>them how their problem may be eliminated by decomposing
>the class into two classes; the OP rejects that solution
>because, you know, "too many classes."
Alright. Classes/parameters/processes/etc.
An 'instantiated' class is created in one or more processes, before
anything else, let's say. Following the book question, what's the
scheme for that? A process might apply to the class, but also to
another class. But is the process properly part of the class
relations, or is a linked set of relations specifying class more
properly a thing that belongs with the process relations? Is there
some convention that's been used, for any particular reason? Maybe
this question has been trivially answered, and years ago. It's not
that one can't view class in terms of process, and vice versa, from the
same data. And I understand it, that supposedly is a benefit of RM.
But why does what go where in the underlying relations? Does it
matter?
>"Just think of the joins" is the same non-problem as
>"too many classes."
They're not comparable. And are you suggesting that a SQL statement
consisting of many joins would not seem rather cumbersome, and even
prone to error for that? But with more discrete relations, isn't such
an excessive string a possibility?
>> >> And wasn't the RM
>> >> intended to free people from the 'tyranny of structure'?
>> >No.
>> >And anyway, you can't not have structure.
>> No, of course not. Structure. It's transcendent. It transcends any
>> scheme. Any scheme has to represent that structure, that order.
>> Obviously. Or else it's representing something else.
>I had a hard time with this paragraph. I'm not at all sure
>I understand what it means. As near as I can tell, what
>it's trying to say is "nyaa nyaa nyaa, nyaa nyaa."
Gloating or boasting - about what? How could it be, even as an outside
possibility? Maybe you're thinking of some other message or thread?
Rather, I was merely agreeing with you. Structure is transcendent of
any scheme. The scheme must represent that structure, or it represents
something else. I can't state it more clearly than that.
>> But wasn't the idea to separate out any fixed structure by reducing a
>> very well-defined scheme to a series of connected relations? Wasn't
>> the idea that while, again obviously, the RM itself describes a fairly
>> rigid structure, it didn't face the problems of an early generation of
>> COBOL, for example?
>> If I misunderstand, then so be it. If that wasn't a typical problem,
>> then what did Codd set out to ameliorate?
>I have no information about the problems of early generations
>of COBOL, nor do I know anything about Codd's intentions.
I was referring to his rules, and his purpose, and what he thought
might be better because of them.
In that case I have no idea what you are objecting to.
I understood you to be objecting to simple examples,
and I assert that simple examples are the best way
to illustrate concepts. Certainly they do not make
real systems; their purpose is didactic, not practical.
That does not mean the concepts so illustrated are
not practical, though.
Consider the original K&R "C Programming Language."
All the programs in it were quite small and limited.
A simple hash table, for example. Yet the book is
a classic.
> But at least we agree
> that the examples typically shown are pointlessly trivial.
Since I went on at length about what I thought was valuable
(which is to say, the opposite of pointless) about simple
examples, I don't see how you could reach that conclusion.
In any event, let me be explicit in saying that I do not
consider the typical employee/department examples used
in SQL books to be pointless; on the contrary they are quite
valuable.
> And:
>
> >> And I still think that you particularly don't see certain examples
> >> because they might seem to be counter-examples of entire scheme.
>
> >Of course. When you're writing a book about a technique, you
> >show what it's good for, not what it's bad for.
>
> Sounds more like promotion or salesmanship.
What's wrong with that? If you have a useful product, you
still need to promote it. No one's going to find out about
it by reading your mind. You promote something by showing
what it's good for; any other choice is ineffective.
> >There are some things that SQL does extraordinarily well.
>
> In fact, wasn't it one of Codd's rules that SQL be the means of
> manipulation, rather than some other sort of code, in addition?
Why did you excerpt my "There are some things" sentence
and follow it with what you did? It seems a complete non
sequitur.
And as I said, I have no particular desire (nor ability) to discuss
Codd.
> >I want it small; that will make it easier to understand.
>
> But there's a problem with that. One of those is - scale. How does it
> scale? Does it scale?
It turns out that SQL scales quite well, both in the complexity
dimension and in the data size dimension.
> The subject, here, is technical papers and presentations.
It is? When did it become so, just with this sentence?
> And in addition, do the trivial academic examples
> conceal a fundamental problem with the approach? Does it work only
> narrowly and in the most simply contrived situation, even regardless
> of scale?
I don't understand the purpose of these questions. Are they rhetorical?
It seems fairly clear that the answer to each of these questions is "no."
Do you have any experience with SQL?
> >> Do you disagree with that idea that a more normalized scheme tends to
> >> increase the number of tables?
>
> >Not at all. For any schema, there is a schema with the equivalent
> >information that contains only a single relation. Such a schema
> >will maximize the potential for update anomalies.
>
> I was simply saying that relations tend to multiply, and so joins.
Yes. Software tends to grow in complexity as we add new features
and functionality, and capture more data.
If we program in an OOPL, the number of classes goes up over time.
If we program in an FPL, the number of functions goes up over time.
This is a good thing.
> >> >> Just think of the 'joins'.
>
> >> >Joins are wonderful;
>
> >> Most anything to excess ceases to be wonderful,
>
> >This statement relies on the word "excess." The real
> >question is, what is the *right* number of tables?
>
> That IS a good question. But it goes to what is a - type, I think.
What is a minus type? I don't understand the purpose of the
dash in the above sentence. Are you just asking "What is a type?"
If so, please allow me to refer you to Benjamin Pierce,
"Types and Programming Languages."
> You may think it's a sort of pointless question. I don't know. Just
> throw in whatever 'sounds good', and call it an entity?
Could it be that when you ask "what is a type" what
you're really asking is, how do we define tables? (I say this
because you mentioned the word "entity".)
> >Tables and joins aren't an expense to be minimized.
> >Columns aren't something you want to be sure to have
> >the fewest of, no more so than rows.
>
> In other words, what if an attribute doesn't belong?
Then you put it where it does belong.
Am I to take it that you haven't done much data modelling?
> An 'instantiated' class is created in one or more processes, before
> anything else, let's say. Following the book question, what's the
> scheme for that? A process might apply to the class, but also to
> another class. But is the process properly part of the class
> relations, or is a linked set of relations specifying class more
> properly a thing that belongs with the process relations? Is there
> some convention that's been used, for any particular reason? Maybe
> this question has been trivially answered, and years ago. It's not
> that one can't view class in terms of process, and vice versa, from the
> same data. And I understand it, that supposedly is a benefit of RM.
> But why does what go where in the underlying relations? Does it
> matter?
I really had a hard time understanding this paragraph. It sounds like
you're asking how table design is done, but I could be completely
wrong.
> >"Just think of the joins" is the same non-problem as
> >"too many classes."
>
> They're not comparable.
I can't think of a difference. Can you? Both are aesthetic
responses, not technical ones.
> And are you suggesting that a SQL statement
> consisting of many joins would not seem rather cumbersome, and even
> prone to error for that?
For me, calculus seems cumbersome and prone to error; is this
a flaw of calculus, or a limitation of mine?
Let us imagine a SQL select that joins ten tables. We must ask
why does it do that? If it is because it answers a complex,
difficult question that involves quite a lot of different sources
of information, then it is hard to see what the objection is.
In essence, any objection to the SELECT would really be
an objection to the question being asked. We have a choice:
we can answer the question, complexity and all, or we can
say, "that question is too hard" and then we don't have to
deal with the ten way join.
And of all the ways I know of to answer a question that might
be answered with a ten-way join, the ten-way join seems
the least error prone among them.
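A small sketch of that comparison, scaled down to a three-way join so it fits in a post. It uses Python's sqlite3 module; the tables (customer, orders, order_line) and their rows are hypothetical. The same question is answered once declaratively and once with hand-written nested loops, which is one of the "other ways" a join replaces.

```python
import sqlite3

# Hypothetical sample data.
CUSTOMERS = [(1, "Ann"), (2, "Bob")]                          # (customer_id, name)
ORDERS = [(100, 1), (101, 1), (102, 2)]                       # (order_id, customer_id)
LINES = [(100, "pen", 2), (101, "ink", 1), (102, "pen", 5)]   # (order_id, item, qty)


def pens_per_customer_via_join():
    """Answer 'how many pens did each customer order?' with one SELECT."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER)")
    conn.execute("CREATE TABLE order_line (order_id INTEGER, item TEXT, qty INTEGER)")
    conn.executemany("INSERT INTO customer VALUES (?, ?)", CUSTOMERS)
    conn.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)
    conn.executemany("INSERT INTO order_line VALUES (?, ?, ?)", LINES)
    rows = conn.execute(
        """
        SELECT c.name, SUM(l.qty)
        FROM customer c
        JOIN orders o ON o.customer_id = c.customer_id
        JOIN order_line l ON l.order_id = o.order_id
        WHERE l.item = 'pen'
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


def pens_per_customer_via_loops():
    """The same question answered by hand: every join condition becomes an 'if'."""
    totals = {}
    for customer_id, name in CUSTOMERS:
        for order_id, order_customer_id in ORDERS:
            if order_customer_id != customer_id:
                continue
            for line_order_id, item, qty in LINES:
                if line_order_id == order_id and item == "pen":
                    totals[name] = totals.get(name, 0) + qty
    return sorted(totals.items())
```

Each additional table adds one JOIN line to the query but another nesting level and another hand-written match condition to the loop version, which is where I would expect most of the mistakes to creep in.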
> But with more discrete relations, isn't such
> an excessive string a possibility?
Again, it's not "excessive" if that's what's necessary to get
the job done. Complicated problems often have complicated
solutions.
> >> No, of course not. Structure. It's transcendent. It transcends any
> >> scheme. Any scheme has to represent that structure, that order.
> >> Obviously. Or else it's representing something else.
>
> >I had a hard time with this paragraph. I'm not at all sure
> >I understand what it means. As near as I can tell, what
> >it's trying to say is "nyaa nyaa nyaa, nyaa nyaa."
>
> Gloating or boasting - about what? How could it be, even as an outside
> possibility? Maybe you're thinking of some other message or thread?
It's clear to me that we use quite different communication styles.
I sometimes have no idea what you're saying, even when you think
you're being very clear, and you sometimes don't follow what I'm
saying, even when I think I'm being very clear.
> Rather, I was merely agreeing with you. Structure is transcendent of
> any scheme. The scheme must represent that structure, or it represents
> something else. I can't state it more clearly than that.
Okay.
> >I have no information about the problems of early generations
> >of COBOL, nor do I know anything about Codd's intentions.
>
> I was referring to his rules, and his purpose, and what he thought
> might be better because of them.
Sure. I'm just saying I don't have much to contribute to that
discussion.
Marshall
> And that's a straw man. That's not what I said. But at least we agree
> that the examples typically shown are pointlessly trivial. Actually,
> it's a common complaint against academics.
In reality, it's a common complaint against introductory material in almost
any sphere of endeavor.
They introduce the alphabet and the numbers from one to ten on Sesame
Street. The examples they use are "pointlessly trivial", when viewed from
the perspective of a CPA who reads Shakespeare for pleasure. But they
aren't pointlessly trivial from the perspective of the kids who are watching
and learning.
I would expect a good schema for introducing relational concepts to consist of
fewer than 20 tables. I would expect it to be somewhat simpler than
something people get paid to do. I'm surprised that you expect otherwise.
You mean this ...
"For example, what is a type? What goes in the 'set', as you express
it, never mind any particular domain? Why is a book a 'type', when
there are various sorts of books? Why is a chapter a 'type', when the
chapters in the same book might be of a very different sort? Is the
appendix which is more an index a type to itself? And so on. Are the
paragraphs a 'type', and is the paragraph in chapter 4 different than
the paragraph entities in relation chapter 5?"
I really haven't time to iron the creases out of this spray. I'm not
even sure what it is supposed to illustrate about the RM.
> And you defended the use of such with your -
> allegory. Remember?
I did indeed! Do you agree or disagree with the point?
>>>>>Just think of the 'joins'. And wasn't the RM
>>>>>intended to free people from the 'tyranny of structure'?
>
>>>>Nope - where you get an idea like that?
>
>>>Because that was the idea behind it.
>
>>Gee, and all this time I thought it was about common sense. Structure
>>does _not_ imply or mandate tyranny even though they can co-exist as
>>several sad periods in human history confirm.
>
> "Human history"? You lost me, here. What do you mean? The pyramids?
Pyramids, structures, CDT, very strange - I guess you have led a
protected life then?
I assume you haven't heard about the Nazis using Hollerith punch
card technology from IBM to provide structure to the Holocaust? Whilst
the Jews bore the worst there were other groups targeted as well and the
punch cards made sure they were sent to the "right" camps. Then there
was the Killing Fields where the victims were often photographed and
catalogued. Enough.
> Perhaps I misunderstand what I thought Codd thought was an advantage
> of his scheme, decades ago. That could be. What about his rule #9?
What about it?
>>>But the difference was that, if I
>>>recall, the structure and programming went together in the 50s and
>>>early 60s. They weren't separated out. Just generally, a lot of
>>>development has been had because parts of processes have been
>>>separated in a 'workflow' scheme. For a simple job, it seems
>>>excessive. For anything else, it simplifies, compartmentalizes and
>>>reduces error. And there have been a lot of decades between then and
>>>now. And the RM has been implemented to certain degrees. Now if you
>>>put such a hierarchy on an existing RM or pseudo-such, it seems as if
>>>it would be a different matter - even while violating the basic tenets
>>>of the RM to do so. But it's not necessarily tied to a structure, as
>>>the RM is so tied by joins and the relations themselves. And it makes
>>>structural modification difficult.
>
>>Does it?
>
> I don't know. Look at his rules, again.
Why? The rules do not make the task of structural modification
difficult for the RM. It isn't even difficult to do anyway!
>>>Remember what I said: For example, what is a type? What goes in the
>>>'set', as you express it, never mind any particular domain? Why is a
>>>book a 'type', when there are various sorts of books? Why is a chapter
>>>a 'type', when the chapters in the same book might be of a very
>>>different sort? Is the appendix which is more an index a type to
>>>itself? And so on. Are the paragraphs a 'type', and is the paragraph
>>>in chapter 4 different than the paragraph entities in relation chapter
>>>5?
What's this fixation with the 'type' word? Is your normal head space OO?
>>Choices, simply choices, make them, or don't - what is the fuss all about?
>
> In other words, the idea of this 'type' is so vague that it simply
> doesn't matter? See below.
I'm looking, I'm looking...
>>>>The aim is to "improve" the
>>>>structure and still make sure that physical aspects don't (irrationally)
>>>>impinge on the logical interests.
>
>>>I'm not sure that means anything. Improved compared to what,
>>>precisely? And why irrationally in parentheses, etc? Could you
>>>elaborate?
>
>>Sure. Improved compared to the mess of enmeshed programmes and data that
>>arose during the 60's and 70's.
>
> That's how _I_ understood it. That's why I mentioned this idea that
> breaking out parts of processes has sort of defined developments over
> the decades. Again, such might seem overkill for a small project. But
> .. . . etc., as I wrote, before.
You mention "small projects" again - I thought you were interested in
the hard ones?
>>Why did I use irrationally - because
>>there are physical aspects that will impinge - such as the available
>>storage on a system might impact quite reasonably on how you build a
>>huge solution, but whether a system is big or little endian is of and
>>should be of no interest to a business analyst wanting to record text
>>for your books or chapters. Capiche?
>
> Work with what you have, in other words, and still produce a
> theoretically sound solution?
No - just work period and produce a practically sound solution.
> Actually, my question went more to what
> perhaps you regard as a sort of pointless definitional confusion over
> just how things are grouped, labelled, typed. I thought much had been
> made about rule #2, and the lack of ambiguity, of like 'things' in
> relations. But what is a like thing? That's all.
Codd was talking about scalar values in #2 - not arbitrary objects.
...but I don't see!
Cheers, Frank.
Are you referring to me as the "other guy"?
If so I don't "agree" with you at all - I was pointing out that simple
examples are most useful for teaching basic knowledge.
[..]
Cheers, Frank.
>"Mark Johnson" <1023...@compuserve.com> wrote in
>They introduce the alphabet and the numbers from one to ten on Sesame
>Street.
But that's a very narrow discipline, that might yield some contract as
a 'product tester', or maybe for the few a participant, producer,
actor, etc. But it's something entirely different. Again, it's a
common complaint with academic papers that the examples are
pointlessly trivial, as if to disguise some problem with the approach,
or for other reasons.
>Mark Johnson wrote:
>> "Marshall Spight" <marshal...@gmail.com> wrote:
>>>Mark Johnson wrote:
>>>>Christopher Browne <cbbr...@acm.org> wrote:
>Are you referring to me as the "other guy"?
You were, at that point.
>If so I don't "agree" with you at all - I was pointing out that simple
>examples are most useful for teaching basic knowledge.
You were trying to defend such. I said that. You also admitted such
was the case. And that's all I was saying.
>Mark Johnson wrote:
>> FrankHamersley <FrankHam...@hotmail.com> wrote:
>>>Mark Johnson wrote:
>>>>Frank Hamersley <terabit...@bigpond.com> wrote:
>>>>>Mark Johnson wrote:
>"For example, what is a type? What goes in the 'set', as you express
>it, never mind any particular domain? Why is a book a 'type', when
>there are various sorts of books? Why is a chapter a 'type', when the
>chapters in the same book might be of a very different sort? Is the
>appendix which is more an index a type to itself? And so on. Are the
>paragraphs a 'type', and is the paragraph in chapter 4 different than
>the paragraph entities in relation chapter 5?"
>I really haven't time to iron the creases out of this spray.
You think that perhaps such things don't matter? Such questions seem
perhaps pointlessly trivial?
>> And you defended the use of such with your -
>> allegory. Remember?
>I did indeed! Do you agree or disagree with the point?
I've repeatedly said, to a couple of people now, that I disagree with
any attempt to justify that approach. I simply pointed out that it's a
common complaint with academic papers.
>> "Human history"? You lost me, here. What do you mean? The pyramids?
>I assume you haven't heard about the Nazis using Hollerith punch
>card technology from IBM to provide structure to the Holocaust? Whilst
>the Jews bore the worst there were other groups targeted as well and the
>punch cards made sure they were sent to the "right" camps.
It wasn't just about camps. Millions were simply gunned down on the
spot, or in fields being made to walk out to mass graves. The entire
Nazi regime was the quick and steady corruption of a German society
reeling from the straits into which the victors of the Great War
forced them, as sort of a punishment. It was a society so organized,
structured, advanced, in the center of European history before it was
even thought of as, Europe, that the Manhattan Project was essentially
justified as a race to get the bomb - a race with Heisenberg and
Germany's effort (which was a dead duck given Heisenberg's fortunate
miscalculations). Hitler's Final Solution to the Jewish Problem, as
they termed it, his secret police and assault upon suspected traitors
to his regime, his eyes and ears, his rumormongers, his round-up of
the feeble and infirm, of separating the 'races' even by something
like phrenology, as insane as it was, the whole thing was very
well-documented and methodical, though they also were quite methodical
in destroying documents and physical evidence (thus the contradictions
found, and exploited, by 'Holocaust deniers'). It wasn't merely punch
cards. It was also that the trains ran on time. But that's not
comparable to this, and any mention of Codd.
>> Perhaps I misunderstand what I thought Codd thought was an advantage
>> of his scheme, decades ago. That could be. What about his rule #9?
>What about it?
Wasn't this supposed to be a key advance over what he believed had
been the case?
>The rules do not make the task of structural modification
>difficult for the RM. It isn't even difficult to do anyway!
Well, take the example:
>>>>Remember what I said: For example, what is a type? What goes in the
>>>>'set', as you express it, never mind any particular domain? Why is a
>>>>book a 'type', when there are various sorts of books? Why is a chapter
>>>>a 'type', when the chapters in the same book might be of a very
>>>>different sort? Is the appendix which is more an index a type to
>>>>itself? And so on. Are the paragraphs a 'type', and is the paragraph
>>>>in chapter 4 different than the paragraph entities in relation chapter
>>>>5?
>What's this fixation with the 'type' word?
Thing, then, if you prefer that. What's the structure, here? And then
imagine I find a second book. Does it also fit? You can put a binding
on many things, speaking of which. Can the structure be changed? How
difficult is it "to do anyway"? But you have to have an idea of the
first, before attempting to change it.
>>>Choices, simply choices, make them, or don't - what is the fuss all about?
>> In other words, the idea of this 'type' is so vague that it simply
>> doesn't matter? See below.
>I'm looking, I'm looking...
Perhaps it just seems a rather pointlessly trivial question? I don't
know. Just slap some stuff together, call it a 'relation'?
>> That's how _I_ understood it. That's why I mentioned this idea that
>> breaking out parts of processes has sort of defined developments over
>> the decades. Again, such might seem overkill for a small project. But
>> .. . . etc., as I wrote, before.
>You mention "small projects" again - I thought you were interested in
>the hard ones?
I mentioned both, and the difference between the two. It might seem
overkill - it thankfully might not. That's both.
>>>Why did I use irrationally - because
>>>there are physical aspects that will impinge - such as the available
>>>storage on a system might impact quite reasonably on how you build a
>>>huge solution, but whether a system is big or little endian is of and
>>>should be of no interest to a business analyst wanting to record text
>>>for your books or chapters. Capiche?
>> Work with what you have, in other words, and still produce a
>> theoretically sound solution?
>No - just work period and produce a practically sound solution.
I'm sure you still aim at some theoretical goal in producing whatever
temporized "solution". Where the theory presents some insight into or
just distillation of practice, a practitioner can benefit from applying
theory, just as any theorist can often greatly benefit from being a
practitioner.
>> Actually, my question went more to what
>> perhaps you regard as a sort of pointless definitional confusion over
>> just how things are grouped, labelled, typed. I thought much had been
>> made about rule #2, and the lack of ambiguity, of like 'things' in
>> relations. But what is a like thing? That's all.
>Codd was talking about scalar values in #2 - not arbitrary objects.
Date's rewrite, perhaps.
"To access any data-item you specify which column within which table
it exists, there is no reading of characters 10 to 20 of a 255 byte
string."
Punch card - not punch card. But anything can go in 10-20. So again, I
just get the sense that you think it sort of pointless to consider
differences and what might appropriately fit one set of relations, and
not another?
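For what the quoted description of rule #2 means mechanically, a minimal sketch in Python follows. The fixed-width record layout and the book table are hypothetical, invented only for this illustration. It contrasts reading "characters 10 to 20" of a string with naming a column in a table; it says nothing about which attributes belong in which relation.

```python
import sqlite3

# Hypothetical fixed-width layout: chars 0-9 id, chars 10-19 author, rest title.
RECORD = "0000000042" + "Dickens".ljust(10) + "Bleak House"
# The same data after someone widens the id field to 12 characters.
WIDENED = "000000000042" + "Dickens".ljust(10) + "Bleak House"


def author_by_position(record):
    """Positional access: the reader must know the byte layout."""
    return record[10:20].rstrip()


def author_by_column():
    """Access by table, column and key value: no layout knowledge needed."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE book (book_id INTEGER PRIMARY KEY, author TEXT, title TEXT)")
    conn.execute("INSERT INTO book VALUES (42, 'Dickens', 'Bleak House')")
    (author,) = conn.execute("SELECT author FROM book WHERE book_id = 42").fetchone()
    conn.close()
    return author
```

With the original layout both routes give the same author. With the widened record the slice still returns something, just not the author, and nothing flags the error; the column reference has no equivalent failure because it never depended on offsets.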
>Mark Johnson wrote:
>> "Marshall Spight" <marshal...@gmail.com> wrote:
>> >> Such examples that are shown are trivial. You don't pay
>> >> someone to do something that easy, that simple.
>> >> A real world requirement isn't anything like that.
>I understood you to be objecting to simple examples,
>and I assert that simple examples are the best way
>to illustrate concepts. Certainly they do not make
>real systems; their purpose is didactic, not practical.
Which is a basic principle of any apprenticeship. Unreal examples to
prepare for productive application. I understand that. But it's also
the common complaint with academics. There could be many reasons why
the impractical never is made practical. But it is a complaint.
>That does not mean the concepts so illustrated are
>not practical, though.
I agree. Or it could even be the case that they know the utility, but
for proprietary reasons, refuse to share that knowledge in their
papers. It might be deduced by someone with background or some
gamester's clue, if you will. But the simplest, accurate explanation
must also allow for the possibility that such is presented because
they did not pursue the matter, and that they leave any potential
utility as an 'exercise for the reader'. Had they pursued it, perhaps
they wouldn't have published, in short. There are too many papers out
there, as it is.
>Consider the original K&R "C Programming Language."
>All the programs in it were quite small and limited.
>A simple hash table, for example. Yet the book is
>a classic.
I believe the Kernighan book was a text, in that era, as well, because
it was considered the definitive text on the subject.
>In any event, let me be explicit in saying that I do not
>consider the typical employee/department examples used
>in SQL books to be pointless; on the contrary they are quite
>valuable.
I'd have to disagree. They represent that common complaint with
academic papers. But you to your opinion. And me to mine.
>> >Of course. When you're writing a book about a technique, you
>> >show what it's good for, not what it's bad for.
>> Sounds more like promotion or salesmanship.
>What's wrong with that?
I said what was wrong with that in the very next sentence.
You only hear about what was wrong with last year's model when it
comes time to sell next year's model.
>it by reading your mind. You promote something by showing
>what it's good for; any other choice is ineffective.
But it might be more responsible, and your civic duty.
>> >There are some things that SQL does extraordinarily well.
>> In fact, wasn't it one of Codd's rules that SQL be the means of
>> manipulation, rather than some other sort of code, in addition?
>Why did you excerpt my "There are some things" sentence
>and follow it with what you did? It seems a complete non
>sequitur.
Because:
>And as I said, I have no particular desire (nor ability) to discuss
>Codd.
Okay.
>> >I want it small; that will make it easier to understand.
>> But there's a problem with that. One of those is - scale. How does it
>> scale? Does it scale?
>It turns out that SQL scales quite well
But what about such a scheme as I suggested, which was the point?
------------------
>I want it small; that will make it easier to understand.
But there's a problem with that. One of those is - scale. How does it
scale? Does it scale? The subject, here, is technical papers and
presentations. And in addition, do the trivial academic examples
conceal a fundamental problem with the approach? Does it work only
narrowly and in the most simply contrived situation, even regardless
of scale?
-----------------
>> The subject, here, is technical papers and presentations.
>It is? When did it become so, just with this sentence?
. . .
>> And in addition, do the trivial academic examples
>> conceal a fundamental problem with the approach? Does it work only
>> narrowly and in the most simply contrived situation, even regardless
>> of scale?
>I don't understand the purpose of these questions. Are they rhetorical?
>It seems fairly clear that the answer to each of these questions is "no."
Well perhaps the answer is - yes - they do conceal a basic problem,
they do not scale, they cannot handle another common set of data, etc.
>Do you have any experience with SQL?
Yes I do. I don't know how you imagined I was discussing SQL in
referring to "trivial academic examples". I was instead referring to
"trivial academic examples".
>> I was simply saying that relations tend to multiply, and so joins.
>Yes. Software tends to grow in complexity as we add new features
>and functionality, and capture more data.
>If we program in an OOPL, the number of classes goes up over time.
>If we program in an FPL, the number of functions goes up over time.
>This is a good thing.
Unless it simply piles confusion upon confusion, and does not allow
for a simple cover and vector redirection to an improved recipe that
incorporates a clearer understanding of the previous technology. It's
not really comparable to this idea of normalization, itself, however.
>> You may think it's a sort of pointless question. I don't know. Just
>> throw in whatever 'sounds good', and call it an entity?
>Could it be that when you ask "what is a type" what
>you're really asking is, how do we define tables? (I say this
>because you mentioned the word "entity".)
I'm glad I did mention the word "entity", even if not parenthetically.
Yes, I am referring to relations, and what goes in one, and not the
other.
>> >Tables and joins aren't an expense to be minimized.
>> >Columns aren't something you want to be sure to have
>> >the fewest of, no more so than rows.
>> In other words, what if an attribute doesn't belong?
>Then you put it where it does belong.
>Am I to take it that you haven't done much data modelling?
How does data modelling answer:
>> An 'instantiated' class is created in one or more processes, before
>> anything else, let's say. Following the book question, what's the
>> scheme for that? A process might apply to the class, but also to
>> another class. But is the process properly part of the class
>> relations, or is a linked set of relations specifying class more
>> properly a thing that belongs with the process relations? Is there
>> some convention that's been used, for any particular reason? Maybe
>> this question has been trivially answered, and years ago. It's not
>> that one can't view class in terms of process, and vice versa, from the
>> same data. And I understand it, that supposedly is a benefit of RM.
>> But why does what go where in the underlying relations? Does it
>> matter?
>I really had a hard time understanding this paragraph. It sounds like
>you're asking how table design is done, but I could be completely
>wrong.
I'm simply asking what goes where, why, and for any particular cause.
Does it matter? Just slap together some 'things', and call it an,
entity? And then what happens when the next book comes along, or the
new technology that is being similarly represented as relations?
I just don't know. Maybe it seems a really pointless question?
>> >"Just think of the joins" is the same non-problem as
>> >"too many classes."
>> They're not comparable.
>I can't think of a difference. Can you?
I said they weren't comparable. That means I can't see any similarity.
I can definitely see a problem with having to run too many joins in a
SQL query.
>> And are you suggesting that a SQL statement
>> consisting of many joins would not seem rather cumbersome, and even
>> prone to error for that?
>For me, calculus seems cumbersome and prone to error; is this
>a flaw of calculus, or a limitation of mine?
You will notice in some that scale, a measuring scale, is important in
certain placeholders. Quantity of some type, some sort. But if you
mean just attempting to represent a SQL statement mathematically,
rather than mathematics in general, wouldn't that also be error prone
the more verbose the formula becomes? You would be attempting not to
block a few elements, and then a few more, as any coder, or
mathematician, would do. You would be faced with the 'reams of paper'
problem, instead.
>Again, it's not "excessive" if that's what's necessary to get
>the job done. Complicated problems often have complicated
>solutions.
Simple problems also often tend to consume far more time than
imagined. That's because people don't foresee the difficulty in what
they imagined was a simple problem. Sometimes, what seems complicated,
by the mechanisms used, by the scheme in place, requires changing a
single pointer, or something. I guess there's a difference between
anticipatory classification or qualification, and the same in
hindsight.
>> Rather, I was merely agreeing with you. Structure is transcendent of
>> any scheme. The scheme must represent that structure, or it represents
>> something else. I can't state it more clearly than that.
>Okay.
And then the question becomes, what IS that structure?
Yes and No. On the first point you need to convince me of the rationale
to actually perform an analysis. On the second we agree, it is not a
trivial problem, but it is quite soluble. What remains important is to
appreciate the requirement so the design is appropriate.
>>>And you defended the use of such with your -
>>>allegory. Remember?
>
>>I did indeed! Do you agree or disagree with the point?
>
> I've repeatedly said, to a couple of people now, that I disagree with
> any attempt to justify that approach. I simply pointed out that it's a
> common complaint with academic papers.
Text books using simple examples - yes, papers no - most of the ones I
have read presume substantial prior knowledge and deal with non trivial
questions.
>>>"Human history"? You lost me, here. What do you mean? The pyramids?
>
>>I assume you haven't heard about the Nazis using Hollerith punch
>>card technology from IBM to provide structure to the Holocaust? Whilst
>>the Jews bore the worst there were other groups targeted as well and the
>>punch cards made sure they were sent to the "right" camps.
>
> It wasn't just about camps. Millions were simply gunned down on the
> spot, or in fields being made to walk out to mass graves. The entire
> Nazi regime was the quick and steady corruption of a German society
> reeling from the straits into which the victors of the Great War
> forced them, as sort of a punishment. It was a society so organized,
> structured, advanced, in the center of European history before it was
> even thought of as, Europe, that the Manhattan Project was essentially
> justified as a race to get the bomb - a race with Heisenberg and
> Germany's effort (which was a dead duck given Heisenberg's fortunate
> miscalculations). Hitler's Final Solution to the Jewish Problem, as
> they termed it, his secret police and assault upon suspected traitors
> to his regime, his eyes and ears, his rumormongers, his round-up of
> the feeble and infirm, of separating the 'races' even by something
> like phrenology, as insane as it was, the whole thing was very
> well-documented and methodical, though they also were quite methodical
> in destroying documents and physical evidence (thus the contradictions
> found, and exploited, by 'Holocaust deniers'). It wasn't merely punch
> cards. It was also that the trains ran on time. But that's not
> comparable to this, and any mention of Codd.
Simply illustrating my point that tyranny and structure can co-exist.
What is more interesting to appreciate is that structure does not
mandate tyranny especially as the term was used in respect of the RM in
a previous post.
>>>Perhaps I misunderstand what I thought Codd thought was an advantage
>>>of his scheme, decades ago. That could be. What about his rule #9?
>
>>What about it?
>
> Wasn't this supposed to be a key advance over what he believed had
> been the case?
All 12 rules were and remain a key advance.
>>The rules do not make the task of structural modification
>>difficult for the RM. It isn't even difficult to do anyway!
>
> Well, take the example:
>
>>>>>Remember what I said: For example, what is a type? What goes in the
>>>>>'set', as you express it, never mind any particular domain? Why is a
>>>>>book a 'type', when there are various sorts of books? Why is a chapter
>>>>>a 'type', when the chapters in the same book might be of a very
>>>>>different sort? Is the appendix which is more an index a type to
>>>>>itself? And so on. Are the paragraphs a 'type', and is the paragraph
>>>>>in chapter 4 different than the paragraph entities in relation chapter
>>>>>5?
Let's get to first base first and come up with the inaugural design
before we start modifying it - as you mention yourself below.
>>What's this fixation with the 'type' word?
>
> Thing, then, if you prefer that. What's the structure, here? And then
> imagine I find a second book. Does it also fit? You can put a binding
> on many things, speaking of which. Can the structure be changed? How
> difficult is it "to do anyway"? But you have to have an idea of the
> first, before attempting to change it.
OK - but remember Codd limited the scope to scalar types - arrangements
of which can describe the "things" you mention.
>>>>Choices, simply choices, make them, or don't - what is the fuss all about?
>
>>>In other words, the idea of this 'type' is so vague that it simply
>>>doesn't matter? See below.
>
>>I'm looking, I'm looking...
>
> Perhaps it just seems a rather pointlessly trivial question? I don't
> know. Just slap some stuff together, call it a 'relation'?
There is no slapping of any stuff together - it is a much more organised
and repeatable process.
>>>That's how _I_ understood it. That's why I mentioned this idea that
>>>breaking out parts of processes has sort of defined developments over
>>>the decades. Again, such might seem overkill for a small project. But
>>>.. . . etc., as I wrote, before.
>
>>You mention "small projects" again - I thought you were interested in
>>the hard ones?
>
> I mentioned both, and the difference between the two. It might seem
> overkill - it thankfully might not. That's both.
>
>>>>Why did I use irrationally - because
>>>>there are physical aspects that will impinge - such as the available
>>>>storage on a system might impact quite reasonably on how you build a
>>>>huge solution, but whether a system is big or little endian is of and
>>>>should be of no interest to a business analyst wanting to record text
>>>>for your books or chapters. Capiche?
>>>
>>>Work with what you have, in other words, and still produce a
>>>theoretically sound solution?
>>
>>No - just work period and produce a practically sound solution.
>
> I'm sure you still aim at some theoretical goal in producing whatever
> temporized "solution". Where the theory presents some insight into or
> just distillation of practice, a practioner can benefit from applying
> theory, just as any theorist can often greatly benefit from being a
> practioner.
That's true at a simple macro level - at the coal face the mixture of
practice and theory has to be carefully managed.
>>>Actually, my question went more to what
>>>perhaps you regard as a sort of pointless definitional confusion over
>>>just how things are grouped, labelled, typed. I thought much had been
>>>made about rule #2, and the lack of ambiguity, of like 'things' in
>>>relations. But what is a like thing? That's all.
>
>>Codd was talking about scalar values in #2 - not arbitrary objects.
>
> Date's rewrite, perhaps.
Perhaps - I was depending on Wikipedia - is it not reliable?
> "To access any data-item you specify which column within which table
> it exists, there is no reading of characters 10 to 20 of a 255 byte
> string."
>
> Punch card - not punch card. But anything can go in 10-20. So again, I
> just get the sense that you think it sort of pointless to consider
> differences and what might appropriately fit one set of relations, and
> not another?
No - it depends what the requirements are. However it is unlikely I
would retain a design like your 10-20 one unless there was a compelling
reason.
Cheers, Frank.
I'm not that now?
>>If so I don't "agree" with you at all - I was pointing out that simple
>>examples are most useful for teaching basic knowledge.
>
> You were trying to defend such. I said that. You also admitted such
> was the case. And that's all I was saying.
Firstly I think I succeeded! Secondly there is no admission on my part -
it is a simple statement of practicalities.
None of which supports your underlying perspective which I would state
as "the RM is inadequate because nobody ever explains how it would model
really difficult problems".
Cheers, Frank.
>Mark Johnson wrote:
>> Frank Hamersley <terabit...@bigpond.com> wrote:
>>>Mark Johnson wrote:
>>>>"Marshall Spight" <marshal...@gmail.com> wrote:
>>>>>Mark Johnson wrote:
>>>>>>Christopher Browne <cbbr...@acm.org> wrote:
>>>Are you referring to me as the "other guy"?
>> You were, at that point.
>I'm not that now?
Were, though.
>>>If so I don't "agree" with you at all - I was pointing out that simple
>>>examples are most useful for teaching basic knowledge.
>> You were trying to defend such. I said that. You also admitted such
>> was the case. And that's all I was saying.
>Firstly I think I succeeded! Secondly there is no admission on my part -
>it is a simple statement of practicalities.
I think you succeeded in agreeing with my complaint, but did not at
all agree that it should be offered as a complaint.
>None of which supports your underlying perspective which I would state
>as "the RM is inadequate
I don't think it's at all inadequate.
>Mark Johnson wrote:
>> Frank Hamersley <terabit...@bigpond.com> wrote:
>>>Mark Johnson wrote:
>>>>FrankHamersley <FrankHam...@hotmail.com> wrote:
>>>>>Mark Johnson wrote:
>>>>>>Frank Hamersley <terabit...@bigpond.com> wrote:
>>>>>>>Mark Johnson wrote:
>> You think that perhaps such things don't matter? Such questions seems
>> perhaps pointlessly trivial?
>Yes and No. On the first point you need to convince me of the rationale
>to actually perform an analysis. On the second we agree, it is not a
>trivial problem, but it is quite soluble.
okay
>> It wasn't just about camps. Millions were simply gunned down on the
>> spot, or in fields being made to walk out to mass graves. The entire
>> Nazi regime was the quick and steady corruption of a German society
>> reeling from the straits into which the victors of the Great War
>> forced them, as sort of a punishment. It was a society so organized,
>> structured, advanced, in the center of European history before it was
>> even thought of as, Europe, that the Manhattan Project was essentially
>> justified as a race to get the bomb - a race with Heisenberg and
>> Germany's effort (which was a dead duck given Heisenberg's fortunate
>> miscalculations). Hitler's Final Solution to the Jewish Problem, as
>> they termed it, his secret police and assault upon suspected traitors
>> to his regime, his eyes and ears, his rumormongers, his round-up of
>> the feeble and infirm, of separating the 'races' even by something
>> like phrenology, as insane as it was, the whole thing was very
>> well-documented and methodical, though they also were quite methodical
>> in destroying documents and physical evidence (thus the contradictions
>> found, and exploited, by 'Holocaust deniers'). It wasn't merely punch
>> cards. It was also that the trains ran on time. But that's not
>> comparable to this, and any mention of Codd.
>Simply illustrating my point that tyranny and structure can co-exist.
>What is more interesting to appreciate is that structure does not
>mandate tyranny especially as the term was used in respect of the RM in
>a previous post.
. . .
>> I'm sure you still aim at some theoretical goal in producing whatever
>> temporized "solution". Where the theory presents some insight into or
>> just distillation of practice, a practioner can benefit from applying
>> theory, just as any theorist can often greatly benefit from being a
>> practioner.
>That's true at a simple macro level - at the coal face the mixture of
>practice and theory has to be carefully managed.
And, practitioner. My typo, twice.
>Perhaps - I was depending on Wikipedia - is it not reliable?
Serious question? Wikipedia? You know how stuff gets into wikipedia?
> None of which supports your underlying perspective which I would state
> as "the RM is inadequate because nobody ever explains how it would model
> really difficult problems".
Well, I don't know anyone who fully understands how to use RM to solve really
difficult problems.
I think Codd knew, but ...
Would this make RM adequate ?
>
> >Perhaps - I was depending on Wikipedia - is it not reliable?
> Serious question? Wikipedia? You know how stuff gets into wikipedia?
You know who puts stuff into wikipedia ?
I think it's Jan Hidders.
Marshall
>"Mark Johnson" <1023...@compuserve.com> wrote in
>
>> And that's a straw man. That's not what I said. But at least we agree
>> that the examples typically shown are pointlessly trivial. Actually,
>> it's a common complaint against academics.
>
>In reality, it's a common complaint against introductory material in almost
>any sphere of endeavor.
True. Now, where is the next level? Often, it is absent.
A while back, I was reading a book on TCP/IP. A program had a
comment that error checking was omitted for clarity. Fine. I ran the
program, and it happened to work. I was aware though that errors can
occur, and my question was then what errors could occur and how to
handle them. The book did not have a version with error handling.
Dropped ball.
>They introduce the alphabet and the numbers from one to ten on Sesame
>Street. The examples they use are "pointlessly trivial", when viewed from
>the perspective of a CPA who reads Shakespeare for pleasure. But they
>aren't pointlessly trivial from the perspective of the kids who are watching
>and learning.
True, but what if combining letters into words was never covered?
>I would expect a good schema to introduce relational concepts to consist of
>less than 20 tables. I would expect it to be somewhat simpler than
>something people get paid to do. I'm surprised that you expect otherwise.
I would hope so. I would also like to see a more complex, more
real version for the intermediate level, say: "Here is a simple
invoicing system used by a client of mine, a small manufacturer's rep.
The introductory database is a simplified version. Let us examine the
differences between the two and why they are there. . . ."
Sincerely,
Gene Wirchenko
> >In reality, it's a common complaint against introductory material in almost
> >any sphere of endeavor.
>
> True. Now, where is the next level? Often, it is absent.
>
> A while back, I was reading a book on TCP/IP. A program had a
> comment that error checking was omitted for clarity. Fine. I ran the
> program, and it happened to work. I was aware though that errors can
> occur, and my question was then what errors could occur and how to
> handle them. The book did not have a version with error handling.
> Dropped ball.
How much are you prepared to pay for such a book ?
>> handle them. The book did not have a version with error handling.
>> Dropped ball.
>How much are you prepared to pay for such a book ?
But isn't that an admission of what you seemed to want to deny, that
texts typically include rather trivial and meaningless examples, for
whatever reasons? You seem to suggest that useful insight and examples
must necessarily cost much more to purchase? Or are you simply
complaining of the present costs now associated with short run
professional texts, which have suffered an inflationary run of perhaps
an order of magnitude in the last quarter century or more?
I always wondered if I was bipolar, now I know! :-)
>>>>If so I don't "agree" with you at all - I was pointing out that simple
>>>>examples are most useful for teaching basic knowledge.
>>>
>>>You were trying to defend such. I said that. You also admitted such
>>>was the case. And that's all I was saying.
>>
>>Firstly I think I succeeded! Secondly there is no admission on my part -
>>it is a simple statement of practicalities.
>
> I think you succeeded in agreeing with my complaint, but did not at
> all agree that it should be offered as a complaint.
Yep.
>>None of which supports your underlying perspective which I would state
>>as "the RM is inadequate
>
> I don't think it's at all inadequate.
OK - I had formed a view based on your previous posts that in your
opinion the RM was too cumbersome for small jobs, and secondly
because nobody documents example designs for difficult problems that it
may also be less than useful for those as well.
Cheers, Frank.
Yep - sure do! Much more importantly is that specific entry bent out of
shape? If so can you point to an alternate source that has a truer form
of the 12 rules?
Cheers, Frank.
You should get out more :-).
Seriously, there prolly isn't that many who would convince me they knew
the subject, but there are some nonetheless.
Cheers, Frank.
> You should get out more :-).
Thanks for the tip. :-)
Can you recommend some places ?
> Seriously, there prolly isn't that many who would convince me they knew
> the subject, but there are some nonetheless.
They hide well. :-)
> >How much are you prepared to pay for such a book ?
> But isn't that an admission of what you seemed to want to deny, that
> texts typically include rather trivial and meaningless examples, for
> whatever reasons?
The examples are not trivial and meaningless.
The explanation is hard to understand just by reading the examples.
> You seem to suggest that useful insight and examples
> must necessarily cost much more to purchase?
Not necessarily.
I am wondering what amount of money would persuade someone who really knows
what he is talking about to share his knowledge and if he would be able to
sell any book.
>Or are you simply
> complaining of the present costs now associated with short run
> professional texts, which have suffered an inflationary run of perhaps
> an order of magnitude in the last quarter century or more?
If you want to collect all those pieces how much it would cost you ?
Worth the moneys ?
Agreed.
This has reached the point of being completely off-topic for
this newsgroup. I don't know of a good newsgroup for
this sort of discussion, but if you wish to continue this
line of conversation, I would ask you to take it somewhere
else where it will be on-topic.
Marshall
>"Mark Johnson" <1023...@compuserve.com> wrote in message
>news:m3rlv19bu5u5jaqiv...@4ax.com...
>> "x" <x...@not-exists.org> wrote:
>>
>> >> handle them. The book did not have a version with error handling.
>> >> Dropped ball.
>
>> >How much are you prepared to pay for such a book ?
A fair amount. It has gotten to the point where error messages
omitted for "clarity" is a red flag.
>> But isn't that an admission of what you seemed to want to deny, that
>> texts typically include rather trivial and meaningless examples, for
>> whatever reasons?
>
>The examples are not trivial and meaningless.
>The explanation is hard to understand just by reading the examples.
Then, the examples are not that good and are trivial. In the
case I gave, it was impossible to determine which errors could occur
and how to handle them. The text was allegedly on TCP/IP.
>> You seem to suggest that useful insight and examples
>> must necessarily cost much more to purchase?
>
>Not necessarily.
>I am wondering what amount of money would persuade someone who really knows
>what he is talking about to share his knowledge and if he would be able to
>sell any book.
>
>>Or are you simply
>> complaining of the present costs now associated with short run
>> professional texts, which have suffered an inflationary run of perhaps
>> an order of magnitude in the last quarter century or more?
>
>If you want to collect all those pieces how much it would cost you ?
>Worth the moneys ?
How much does it cost to not have the pieces? I am having
trouble at work trying to work out how to implement a multi-user
database application with predicate locking. I have never seen
anything on how to implement it.
Sincerely,
Gene Wirchenko
How theory is taught is relevant. Given the number of arguments
over terminology, that should be obvious.
Sincerely,
Gene Wirchenko
You might want to reconsider your design for at least two reasons:
1. In general, predicate locking conflict detection is an NP-hard
problem, as has been known since 1976 (System R), so you may not want to
go this way.
2. A reasonable subset of predicate locking, known as "next key
locking", has already been implemented in major databases such as DB2
and SQL Server (see Mohan's ARIES protocol, 1981), so your database might
already have it. Even MySQL (InnoDB) has it.
In any case, there is a vast body of literature on predicate locking
that should be satisfactory at any level of familiarity with the
subject.
>
> Sincerely,
>
> Gene Wirchenko
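To make point 2 a little more concrete, here is a rough sketch of the idea behind next key locking. It is only a toy, single-threaded simulation and not Mohan's actual protocol: the class NextKeyIndex, its methods, and the transaction names are all made up for illustration, and real systems also deal with lock modes, lock durations, deadlocks and so on. The point it shows is that a range scan locks the keys it reads plus the first key above the range, and an insert has to check the lock on the key that would follow the new value.

```python
import bisect

INFINITY = float("inf")  # sentinel "key" standing for the end of the index


class NextKeyIndex:
    """Toy sorted index with next key locking (hypothetical, single-threaded)."""

    def __init__(self, keys):
        self.keys = sorted(keys)
        self.locks = {}  # key -> id of the transaction holding the lock

    def _next_key(self, value):
        """Smallest existing key strictly greater than value, or INFINITY."""
        i = bisect.bisect_right(self.keys, value)
        return self.keys[i] if i < len(self.keys) else INFINITY

    def range_scan(self, txn, low, high):
        """Read keys in [low, high]; lock them plus the next key above high."""
        lo = bisect.bisect_left(self.keys, low)
        hi = bisect.bisect_right(self.keys, high)
        found = self.keys[lo:hi]
        for key in found + [self._next_key(high)]:
            holder = self.locks.setdefault(key, txn)
            if holder != txn:
                raise RuntimeError(f"txn {txn} blocked on key {key}")
        return found

    def insert(self, txn, value):
        """Insert value unless another transaction holds the next key's lock."""
        holder = self.locks.get(self._next_key(value))
        if holder is not None and holder != txn:
            return False  # the insert could create a phantom in a scanned range
        bisect.insort(self.keys, value)
        return True


idx = NextKeyIndex([10, 20, 30, 40])
scanned = idx.range_scan("T1", 15, 25)   # T1 now holds locks on 20 and 30
blocked = idx.insert("T2", 22)           # next key is 30, locked by T1
allowed = idx.insert("T2", 35)           # next key is 40, unlocked
```

Note that the scheme is conservative: an insert of 12 by T2 would also be refused, because the lock on key 20 covers the whole gap below it, even though 12 lies outside the scanned range. That is the price of locking keys instead of evaluating predicates.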
No aspect of this conversation has anything specifically to do
with database theory. Hence it is offtopic here, obviousness
notwithstanding.
Marshall
So a statement that is broadly applicable (i.e. to more than just
database theory) is automatically off-topic?
Ha!
Sincerely,
Gene Wirchenko
No. But this one is.
I'm concerned about how rising gas prices are going to affect
database theorists and their ability to drive to work. Discuss.
:-)
> Ha!
Double "ha!" Or "ha ha" as they sometimes say.
Marshall
Has anybody simply reprinted whatever was Codd's last version, in his
own words?
>Mark Johnson wrote:
>> Frank Hamersley <terabit...@bigpond.com> wrote:
>>>Mark Johnson wrote:
>>>>Frank Hamersley <terabit...@bigpond.com> wrote:
>>>>>Mark Johnson wrote:
>>>>>>"Marshall Spight" <marshal...@gmail.com> wrote:
>>>>>>>Mark Johnson wrote:
>>>>>>>>Christopher Browne <cbbr...@acm.org> wrote:
>>>>>
>>>>>Are you referring to me as the "other guy"?
>>>>
>>>>You were, at that point.
>>>
>>>I'm not that now?
>>
>> Were, though.
>
>I always wondered if I was bipolar
Temporized, as the term would be understood in a temporal db.
>> I think you succeeded in agreeing with my complaint, but did not at
>> all agree that it should be offered as a complaint.
>Yep.
At least we agree on as much as that.
>>>None of which supports your underlying perspective which I would state
>>>as "the RM is inadequate
>> I don't think it's at all inadequate.
>OK - I had formed a view based on your previous posts that in your
>opinion the RM was too cumbersome for small jobs
No, I referred, generally, to a general creation of workflow and
processes design meant to simplify and reduce error and confusion in
reasonably large projects, but which might seem overkill in simple
ones. That's all.
>I'm concerned about how rising gas prices are going to affect
>database theorists and their ability to drive to work. Discuss.
They pool or take public transit, in any case. But it might cause some
to fret exceedingly, to the point that their papers might not go into
sufficient detail, or provide sufficiently detailed examples, of their
work and proposals. It's really an old complaint with academic
publications, to be fair.
Yep - but they are _most_ likely a long way from where you are!
>>Seriously, there prolly isn't that many who would convince me they knew
>>the subject, but there are some nonetheless.
>
> They hide well. :-)
... and management is often unable to identify them even in their midst!
Cheers, Frank.
>>OK - I had formed a view based on your previous posts that in your
>>opinion the RM was too cumbersome for small jobs
>
> No, I referred, generally, to a general creation of workflow and
> processes design meant to simplify and reduce error and confusion in
> reasonably large projects, but which might seem overkill in simple
> ones. That's all.
Fair enough - there is a base cost to using any of the big project
methodologies regardless of how small the project is. So you are correct
to infer there is a point at which they should not be deployed.
In small projects most of the coordination is face to face and informal
and remains effective because of the small number of players and limited
scope.
Cheers, Frank.
> > They hide well. :-)
>
> ... and management is often unable to identify them even in their midst!
Good managers hide well also :-)
Developers are often unable to identify them !
>In small projects most of the coordination is face to face and informal
>and remains effective because of the small number of players and limited
>scope.
And I was also suggesting that when the small project grows, such
informality can prove to be burdensome to further development? But
yes, the proper methods might seem greatly overkill at the beginning -
far too general and complicated. They exist, as Skerrit said in the
film, for your benefit and safety.
In that continued success is far less likely, yes. Picking the time to
transition is the art, but is often clouded with other factors like the
"it only took a small effort to develop the prototype, so why does the
enterprise solution cost so much?" mentat.
> But
> yes, the proper methods might seem greatly overkill at the beginning -
> far too general and complicated. They exist, as Skerrit said in the
> film, for your benefit and safety.
Yep - but as ever each succeeding generation will question the need.
That behaviour arises from a mixture of invincibility and reactionary
fervour!
Cheers, Frank.
Googling for 'Codd 12 Rules' finds lots of stuff - and you quickly get
close to what was in the original articles. The 'last version' might be
Codd's RM/V2 book - I don't know of anything later. Whether that would
be better than the original is open to debate. I do have a list of
URL's stashed in my browser's bookmarks if you really need them; they
were Googled several years ago, so they'd need revalidating.
--
Jonathan Leffler #include <disclaimer.h>
Email: jlef...@earthlink.net, jlef...@us.ibm.com
Guardian of DBD::Informix v2005.02 -- http://dbi.perl.org/
>Mark Johnson wrote:
>> Frank Hamersley <terabit...@bigpond.com> wrote:
>>> Yep - sure do! Much more importantly is that specific entry bent out of
>>> shape? If so can you point to an alternate source that has a truer form
>>> of the 12 rules?
>> Has anybody simply reprinted whatever was Codd's last version, in his
>> own words?
>Googling for 'Codd 12 Rules' finds lots of stuff - and you quickly get
>close to what was in the original articles. The 'last version' might be
>Codd's RM/V2 book - I don't know of anything later. Whether that would
>be better than the original is open to debate.
Well, let me amend that, then. His views apparently changed. It's not
that one can't get the complete texts. For some reason, I'm not sure
that they are entirely available online, but only partially. But
anyone could go to the local university library and get every last
article in an hour or so. One just has to do it. If one is affiliated,
you can probably get machine-readable copies downloaded right to your
computer, or at worst facsimile images. His 12 Rules were initially
sent to a computer rag of the time, if I recall. But that shouldn't be
a problem.
But, like anyone, I'm sure certain of his views changed, or were
clarified by 'anomalies' and things he hadn't previously considered.
And it does seem that those trying to carry on for him even might
disagree at points, and have further modified his opinions by their
own. So when they say - Codd - maybe they mean more - themselves.
I thought 'next key locking' referred to a specific key-value locking
protocol that is described in Mohan's 1990 paper.
http://www.vldb.org/conf/1990/P392.PDF
I knew that the DBMSs you mention use a locking protocol from the ARIES
family, but I thought it was one of the more general ones, so did you
mean to say that they use this specific protocol?
-- Jan Hidders
What 'more general' protocol do you have in mind exactly?
The basic ARIES protocol, as he describes it on his VLDB 1999 sheets
on page 4:
http://www.almaden.ibm.com/u/mohan/vldb99_aries_slides.pdf
( For those who are interested, there is a very extensive description
in TODS: http://portal.acm.org/citation.cfm?id=128770 )
He mentions the more specific ones on page 5. I'm still curious which
of those are actually used in which DBMS.
-- Jan Hidders
Ah, ok, the ARIES family consists roughly of two parts: 1. WAL
(write-ahead transaction logging) and 2. key-range locking. (1) is
vital, (2) is nice to have. All the major commercial databases have
some form of (1). Oracle, for example, has (1) but does not use (2), at
the expense of not having 'true' SERIALIZABLE isolation.
>
> -- Jan Hidders
Right, I always wondered why that was (and a bit ashamed for not
knowing). Thanks.
-- Jan Hidders
regards,
Peter
Peter Liedermann wrote:
> but bags can be represented by means of set theory, can't they?
There is an interesting mathematics behind both concepts. Let's shift
perspective from set-theoretic to algebraic and focus upon
multivariate polynomials. Consider a relation
{(x=1,y=1),(x=2,y=1)}
This set can be considered as a set of roots of some system of
polynomial equations. Can we write those equations explicitly? Sure:
(x-1)*(x-2)=0
(y-1)=0
The term for such an object in algebraic geometry is an "affine
variety". (Finite) relations essentially are 0-dimensional affine
varieties. (BTW, this is the ultimate answer to the ignorant who claim
that relations are 2-dimensional :-)
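The correspondence between the relation and its defining equations can be checked mechanically. What follows is only a minimal sketch: it assumes a brute-force search over a small integer grid stands in for real root finding, and the helper name `zeros` is made up for illustration.

```python
from itertools import product

def zeros(equations, grid):
    """Return all (x, y) on grid x grid where every polynomial evaluates to 0."""
    return {(x, y) for x, y in product(grid, repeat=2)
            if all(eq(x, y) == 0 for eq in equations)}

# The two defining equations from the post, as functions of (x, y).
system = [lambda x, y: (x - 1) * (x - 2),
          lambda x, y: y - 1]

# Brute force over integers -5..5 (an assumption; the variety lives over a field).
relation = zeros(system, range(-5, 6))
print(sorted(relation))
```

On this grid the only common zeros are the two tuples of the relation.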
Let's give some example of 1-dimensional affine variety:
x-y=0
This is a familiar x=y predicate. (It is 1-dimensional because we can
parameterize the set of roots of the equation x-y=0 with a single
parameter).
Next, consider a set which is not a variety:
(x-1)*(x-2)<=0
This is an interval {x in [1,2]}, and I speculate that the challenge of
modelling temporal and spatial domains is anchored to the fact that we
have to give up some nice properties of algebraic varieties.
Next, there are set theoretic operations upon varieties. Intersection
of varieties is just combining their defining equations:
(x-1)*(x-2)=0
(y-1)=0
intersect
x-y=0
is
(x-1)*(x-2)=0
(y-1)=0
x-y=0
which reduces to
(x-1)=0
(y-1)=0
Intersection of varieties corresponds in relational language to join.
In our example we joined finite relation {(x=1,y=1),(x=2,y=1)} with
predicate {x=y} which in RA terms is the selection.
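The claim that intersection is just concatenation of the defining equations can be sketched as follows. This is an illustration under assumptions, not a proof: zeros are searched by brute force on a small integer grid, and `zeros` is a hypothetical helper.

```python
from itertools import product

def zeros(equations, grid):
    """Common zeros of the polynomials on a finite integer grid."""
    return {(x, y) for x, y in product(grid, repeat=2)
            if all(eq(x, y) == 0 for eq in equations)}

grid = range(-5, 6)
relation = [lambda x, y: (x - 1) * (x - 2), lambda x, y: y - 1]
diagonal = [lambda x, y: x - y]                      # the predicate x = y

joined = relation + diagonal                         # intersection: concatenate systems
reduced = [lambda x, y: x - 1, lambda x, y: y - 1]   # the reduced form from the post

same_as_set_intersection = zeros(joined, grid) == (zeros(relation, grid) & zeros(diagonal, grid))
same_as_reduced = zeros(joined, grid) == zeros(reduced, grid)
print(same_as_set_intersection, same_as_reduced, zeros(joined, grid))
```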
Next, a set of equations for a union of varieties A and B is built by
coupling each equation from A with that of B. For example, a union of
(x-1)*(x-2)=0
(y-1)=0
and
(y-2)=0
is
(x-1)*(x-2)*(y-2)=0
(y-1)*(y-2)=0
Even though union operands are 0-dimensional varieties, the result is
not a 0-dimensional variety, which translates into the corresponding
relation being no longer finite. The result is finite when both
varieties are defined in the same space (that is the same set of
variables). This is a familiar D&D union operator.
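The pairwise-product construction of a union can be sketched in the same brute-force style. The helper names `zeros` and `union_system` are hypothetical, and the finite grid is an assumption that hides the fact that the real result is infinite (every x is allowed when y=2).

```python
from itertools import product

def zeros(equations, grid):
    """Common zeros of the polynomials on a finite integer grid."""
    return {(x, y) for x, y in product(grid, repeat=2)
            if all(eq(x, y) == 0 for eq in equations)}

def union_system(a, b):
    """Couple each equation of a with each equation of b by multiplying them."""
    return [lambda x, y, f=f, g=g: f(x, y) * g(x, y) for f in a for g in b]

grid = range(-5, 6)
A = [lambda x, y: (x - 1) * (x - 2), lambda x, y: y - 1]
B = [lambda x, y: y - 2]

union = zeros(union_system(A, B), grid)
print(union == (zeros(A, grid) | zeros(B, grid)), len(union))
```

On an 11-point grid the union holds the two tuples of A plus one point per grid value of x on the line y=2, which illustrates why the result is no longer a finite relation.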
Since we are talking about roots of equations, it is natural to ask
what about roots of multiplicity greater than one? For instance,
x^2 = 0
has root 0 of multiplicity 2. This would be a naive attempt to
introduce bags.
Unfortunately, the equation
x^2 = 0
defines the same variety as
x = 0
therefore, varieties are genuine relations.
The mathematical object that corresponds to a bag is a polynomial
ideal. An ideal is a set of polynomials closed over addition and
multiplication. The most celebrated mathematical result of the 19th
century is Hilbert's basis theorem, which says that every ideal has a
finite basis.
By finding the right mathematical counterpart of a bag we can hope
to give a consistent definition of bag operations.
Ideals can be added, multiplied and intersected. The union of ideals
usually is not an ideal since it may not be closed under addition. From
the perspective of algebraic geometry, ideals and varieties are
intimately related: the addition of ideals corresponds to the
intersection of varieties, and the intersection of ideals corresponds
to the union of varieties. Also, the multiplication of ideals
corresponds to the union of varieties. Let's go through the examples.
Multiplication of
<x^2>
by
<x^3>
produces
<x^5>. In bag language this corresponds to union of {0,0} with {0,0,0}
producing
{0,0,0,0,0}.
Intersection of
<x^2>
with
<x^3>
produces
<x^3>. In bag language this corresponds to set union of {0,0} with
{0,0,0} producing
{0,0,0}.
Addition of
<x^2>
with
<x^3>
produces
<x^2>. In bag language this corresponds to set intersection of {0,0}
with {0,0,0} producing {0,0}.
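For one variable, the three examples above line up with the standard multiset operators. The sketch below assumes a principal ideal <x^n> is represented simply by the bag holding the root 0 exactly n times; `bag_of` is a made-up helper, and Python's `collections.Counter` supplies the bag arithmetic.

```python
from collections import Counter

def bag_of(exponent):
    """Bag of roots of <x^exponent>: the root 0 with multiplicity `exponent`."""
    return Counter({0: exponent})

a, b = bag_of(2), bag_of(3)

product_bag = a + b        # <x^2> * <x^3> = <x^5>: multiplicities add (bag union)
intersection_bag = a | b   # <x^2> intersect <x^3> = <x^3>: maximum (set-style union)
sum_bag = a & b            # <x^2> + <x^3> = <x^2>: minimum (set-style intersection)

print(list(product_bag.elements()))
print(list(intersection_bag.elements()))
print(list(sum_bag.elements()))
```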
The analogy shines in the case of one variable. It breaks down in the
case of many.
Consider the addition of
<x^2>
with
<y^3>
which is
<x^2,y^3>. In language of bags the corresponding operation is cartesian
product of {x=0,x=0} with {y=0,y=0,y=0}. The bag
{(x=0,y=0),(x=0,y=0),(x=0,y=0),(x=0,y=0), (x=0,y=0),(x=0,y=0)} is the
expected result, but does it really correspond to the ideal <x^2,y^3>?
There are many reasons why not.
In classic bag theory if we project
{(x=0,y=0),(x=0,y=0),(x=0,y=0),(x=0,y=0),(x=0,y=0),(x=0,y=0)} into x we
won't get the original relation {x=0,x=0} back (yet another snag that
challenges the usefulness of classic bag theory). In ideal theory,
calculating the elimination ideal is analogous to the projection
operation on bags. The elimination ideal for <x^2,y^3> is <x^2> -- the
original ideal.
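The projection snag in classic bag theory is easy to reproduce. A minimal sketch, assuming bags are modelled with `collections.Counter` and projection simply sums the multiplicities of tuples that agree on x (the variable names are illustrative):

```python
from collections import Counter
from itertools import product

x_bag = [0, 0]        # the bag {x=0, x=0}
y_bag = [0, 0, 0]     # the bag {y=0, y=0, y=0}

# Cartesian product of bags: every pairing is kept, so multiplicities multiply.
joined = Counter(product(x_bag, y_bag))

# Classic bag projection onto x: add up multiplicities of matching tuples.
projected = Counter()
for (x, _y), count in joined.items():
    projected[x] += count

print(joined, projected, Counter(x_bag))
```

The projection carries multiplicity 6 for x=0, not the original 2, which is the mismatch the post contrasts with the elimination ideal.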
This is as much info as internet posting can hold. A more detailed
paper is due.
That does not look right. Take different values for 'y'.
> The term for such an object in algebraic geometry is an "affine
> variety".
Well, a variety is a set of points, not a set of equations.
'{(x=1,y=1),(x=2,y=1)}'
is a variety all right.
See above. Perhaps you are confusing ideals that are generated by a
set of polynomials with varieties that are just a set of points where
such ideals 'vanish' ?
>
> Intersection of varieties corresponds in relational language to join.
> In our example we joined finite relation {(x=1,y=1),(x=2,y=1)} with
> predicate {x=y} which in RA terms is the selection.
>
> Next, a set of equations for a union of varieties A and B is built by
> coupling each equation from A with that of B. For example, a union of
>
> (x-1)*(x-2)=0
> (y-1)=0
>
> and
>
> (y-2)=0
>
> is
>
> (x-1)*(x-2)*(y-2)=0
> (y-1)*(y-2)=0
Let's fix the terminology first before discussing the above.
>
> Even though union operands are 0-dimensional varieties, the result is
> not a 0-dimensional variety, which translates into the corresponding
> relation being no longer finite. The result is finite when both
> varieties are defined in the same space (that is the same set of
> variables). This is a familiar D&D union operator.
>
> Since we are talking about roots of equations, it is natural to ask
> what about roots of multiplicity greater than one? For instance,
>
> x^2 = 0
>
> has root 0 of multiplicity 2. This would be a naive attempt to
> introduce bags.
>
> Unfortunately, the equation
>
> x^2 = 0
>
> defines the same variety as
>
> x = 0
>
> therefore, varieties are genuine relations.
>
> The mathematical object that corresponds to a bag is a polynomial
> ideal. An ideal is a set of polynomials closed over addition and
> multiplication.
Well, no. The structure you have in mind is called a ring. The ideal
is a special kind of subring closed with respect to 'external'
multiplication.
While there are some similarities, not surprisingly, between ideal
operations and RA, it's unclear what advantage, if any, the
variety/ideal lingo may have. For example, what field did you have
in mind when you talked about polynomials?
OK, let's add a tuple {(x=2,y=2)} so that the relation becomes
{(x=1,y=1),(x=2,y=1),(x=2,y=2)}
The equations for the tuple are
x-2 = 0
y-2 = 0
We have to multiply the LHS of every equation for the original relation
with the LHS of every equation in the above set:
(x-1)*(x-2)*(x-2)=0
(x-1)*(x-2)*(y-2)=0
(y-1)*(x-2)=0
(y-1)*(y-2)=0
which reduces to
(x-1)*(x-2)=0
(x-2)*(y-1)=0
(y-1)*(y-2)=0
> > The term for such an object in algebraic geometry is an "affine
> > variety".
>
> Well, a variety is a set of points, not a set of equations.
> '{(x=1,y=1),(x=2,y=1)}'
> is a variety all right.
A variety is the set of zeros of a system of polynomial equations.
<snipped>
> See above. Perhaps you are confusing ideals that are generated by a
> set of polynomials with varieties that are just a set of points where
> such ideals 'vanish' ?
...
> Let's fix the terminology first before discussing the above.
Well, if you want to skip this naive introduction of operations upon
varieties and jump to ideals, I don't object.
> > Since we are talking about roots of equations, it is natural to ask
> > what about roots of multiplicity greater than one? For instance,
> >
> > x^2 = 0
> >
> > has root 0 of multiplicity 2. This would be a naive attempt to
> > introduce bags.
> >
> > Unfortunately, the equation
> >
> > x^2 = 0
> >
> > defines the same variety as
> >
> > x = 0
> >
> > therefore, varieties are genuine relations.
> >
> > The mathematical object that corresponds to a bag is a polynomial
> > ideal. An ideal is a set of polynomials closed over addition and
> > multiplication.
>
> Well, no. The structure you have in mind is called a ring. The ideal
> is a special kind of subring closed with respect to 'external'
> multiplication.
I don't understand this snippet. Can you please be more specific?
I hope to extend relational algebra by introducing clean bag
semantics and aggregation.
> For example, what field did you have
> in mind when you talked about polynomials ?
C ? This matter doesn't seem important to me. In the relational
applications ideals are manufactured from 0-dimensional varieties, not
the other way around. Therefore, we don't need fancy algebraic
properties, like demanding the field to be algebraically closed.
No, let's just consider {(1,2), (3,4)}. You'd see that there would be
4 defining equations instead of just two, as was the case for {(1,1),
(2,1)}. My point is that you want to explain to those interested how
you recover the polynomials from the variety and what computational
complexity it would entail, not just say "Can we write those equations
explicitly? Sure".
[...]
> > While there are some similarities, not surprisingly, between ideal
> > operations and RA, it's unclear what advantage if any the
> > variety/ideal lingo may have.
>
> I hope to extend relational algebra with introduction of clean bags
> semantics and aggregation.
>
> > For example, what field did you have
> > in mind when you talked about polynomials ?
>
> C ? This matter doesn't seem important to me.
Then, how do you intend to handle non-numeric domains like strings of
characters, or user defined finite domains ?
There are 4 equations indeed
(x-1)(x-3)=0
(x-1)(y-4)=0
(y-2)(x-3)=0
(y-2)(y-4)=0
However, take your favorite computer algebra system, compute the Groebner
basis, and witness that the system reduces to just 2 equations:
(y-2)(y-4)=0
x-y+1=0
It is easy to see that we need only 2 equations. One defines the domain
y in {2,4}. The other one is the linear interpolation polynomial y=x+1.
There would be 2 equations for any binary relation which has a
functional dependency!
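Without a computer algebra system at hand, one can at least confirm that the four product equations and the two reduced equations have the same zeros. The sketch below does not compute a Groebner basis; it only compares zero sets by brute force on an integer grid, which is an assumption, and `zeros` is a hypothetical helper.

```python
from itertools import product

def zeros(equations, grid):
    """Common zeros of the polynomials on a finite integer grid."""
    return {(x, y) for x, y in product(grid, repeat=2)
            if all(eq(x, y) == 0 for eq in equations)}

grid = range(-10, 11)

four = [lambda x, y: (x - 1) * (x - 3),
        lambda x, y: (x - 1) * (y - 4),
        lambda x, y: (y - 2) * (x - 3),
        lambda x, y: (y - 2) * (y - 4)]

two = [lambda x, y: (y - 2) * (y - 4),   # domain of y
       lambda x, y: x - y + 1]           # the interpolation line y = x + 1

print(zeros(four, grid), zeros(two, grid))
```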
> My point is that you want to explain to those interested how
> you recover polynomials from the variety and what computational complexity
> it would entail, not just say "Can we write those equations explicitly?
> Sure".
It could be constructed by adding tuples one by one, as in the example
in my previous message. It might be beneficial to apply the Buchberger
algorithm at each step in order to collapse the system.
I haven't researched complexity issues. The Buchberger algorithm is
known to be inefficient in theory, while it behaves well in practice.
> > > For example, what field did you have
> > > in mind when you talked about polynomials ?
> >
> > C ? This matter doesn't seem important to me.
>
> Then, how do you intend to handle non-numeric domains like strings of
> characters, or user defined finite domains ?
I expected this question:-) Map them to numbers somehow?
Right, but imagine a real-life relation with on average 10 attributes
and, say, a cardinality of 50,000. Naively, one would end up with
10^50,000-1 equations, or perform 50,000 Buchberger reductions to ten
equations at each step.
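The blow-up of the unreduced tuple-by-tuple construction can be seen on a toy relation. This is a sketch under assumptions: each polynomial is stored as a list of (attribute index, value) factors standing for a product of terms (attribute - value), no reduction step is applied, and the names `naive_system` and `vanishes` are invented for the illustration.

```python
from itertools import product

def naive_system(tuples):
    """One product polynomial per way of picking one attribute from each tuple."""
    return list(product(*[list(enumerate(t)) for t in tuples]))

def vanishes(poly, point):
    """A product of factors (attr - value) is zero iff some factor is zero."""
    return any(point[i] == v for i, v in poly)

tuples = [(1, 1, 1), (3, 2, 1), (1, 2, 2), (5, 5, 5)]   # 3 attributes, 4 tuples
system = naive_system(tuples)

# Brute-force the common zeros on the integer grid 0..6 in three variables.
grid_zeros = {p for p in product(range(7), repeat=3)
              if all(vanishes(q, p) for q in system)}

print(len(system), grid_zeros == set(tuples))
```

With k attributes and n tuples the unreduced system has k^n polynomials (here 3^4), which is the growth the objection is about.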
>
> > My point is that you want to explain to those interested how
> > you recover polynomials from the variety and what computational complexity
> > it would entail, not just say "Can we write those equations explicitly?
> > Sure".
>
> It could be constructed by adding tuples one by one like in the example
> in my previous message. It might be beneficial to apply Buchberger
> algorithm at each step in order to collapse the system.
>
> I haven't researched complexity issues. The Buchberger algorithm is
> known to be inefficient in theory, while it behaves well in practice.
It would be interesting to get some data on how realistic it might be
(see above).
>
> > > > For example, what field did you have
> > > > in mind when you talked about polynomials ?
> > >
> > > C ? This matter doesn't seem important to me.
> >
> > Then, how do you intend to handle non-numeric domains like strings of
> > characters, or user defined finite domains ?
>
> I expected this question:-) Map them to numbers somehow?
What kind of mapping do you have in mind?
More importantly, I am not convinced that ideals are a good tool to
represent bags: killing a fly with a gun, unfamiliar to most
mumbo-jumbo, etc. Could you please list some advantages as compared
to the "old way"?
Relations should always be reduced, as 10^50,000 is ridiculously large.
I'm not able to get any estimate on the polynomial system size however.
Here is a radical ideal for the relation with 10 tuples in 3 variables:
{x = 5, y = 5, z = 5}, {x = 1, y = 1, z = 1}, {y = 2, x = 3, z = 1},
{y = 4, x = 3, z = 1}, {z = 2, x = 1, y = 1},
{z = 2, x = 1, y = 2}, {y = 1, z = 4, x = 3},
{x = 5, y = 2, z = 4}, {y = 4, z = 4, x = 3},
{y = 2, z = 3, x = 3}
<pre>
S := [z^5 - 15 z^4 + 85 z^3 - 225 z^2 + 274 z - 120,
2 z^3 y - 14 z^2 y + 28 y z - 16 y - 3 z^4 + 26 z^3 - 77 z^2 + 94 z - 40,
z^2 y^2 - 5 z y^2 + 4 y^2 - 3 z^2 y + 15 y z - 12 y - 2 z^4 + 20 z^3 - 68 z^2 + 90 z - 40,
2 y^3 - 14 y^2 + 28 y - z^4 + 10 z^3 - 35 z^2 + 50 z - 40,
9 x + z y^2 + 5 y^2 - 9 z^2 y + 42 y z - 69 y + z^4 - 4 z^3 - z^2 + 25
]
</pre>
-- 5 equations only, so the polynomial representation does not appear
to grow significantly.
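The basis can be cross-checked against the ten tuples without a computer algebra system. In the sketch below the exponents are an assumption, read off the Maple pretty-printed output; the check is a brute-force evaluation on the integer grid 0..6, not a Groebner computation.

```python
from itertools import product

# The ten tuples of the relation, written as (x, y, z).
tuples = {(5, 5, 5), (1, 1, 1), (3, 2, 1), (3, 4, 1), (1, 1, 2),
          (1, 2, 2), (3, 1, 4), (5, 2, 4), (3, 4, 4), (3, 2, 3)}

# Basis polynomials; exponents as read from the Maple output (assumption).
basis = [
    lambda x, y, z: z**5 - 15*z**4 + 85*z**3 - 225*z**2 + 274*z - 120,
    lambda x, y, z: (2*z**3*y - 14*z**2*y + 28*y*z - 16*y
                     - 3*z**4 + 26*z**3 - 77*z**2 + 94*z - 40),
    lambda x, y, z: (z**2*y**2 - 5*z*y**2 + 4*y**2 - 3*z**2*y + 15*y*z - 12*y
                     - 2*z**4 + 20*z**3 - 68*z**2 + 90*z - 40),
    lambda x, y, z: (2*y**3 - 14*y**2 + 28*y
                     - z**4 + 10*z**3 - 35*z**2 + 50*z - 40),
    lambda x, y, z: (9*x + z*y**2 + 5*y**2 - 9*z**2*y + 42*y*z - 69*y
                     + z**4 - 4*z**3 - z**2 + 25),
]

# Every basis polynomial should vanish on every tuple ...
vanish_on_tuples = all(g(*t) == 0 for g in basis for t in tuples)

# ... and no other point of the grid should be a common zero.
grid_zeros = {p for p in product(range(7), repeat=3)
              if all(g(*p) == 0 for g in basis)}

print(vanish_on_tuples, grid_zeros == tuples)
```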
Here are a couple of amusing manipulations:
1. select all the tuples with x=1:
> gbasis(convert(S, set) union {x-1}, plex(x,y,z)); # polynomial representation
> solve(convert(S, set) union {x-1}, {x,y,z}); # relational representation
[z^2 - 3 z + 2, y z - z - 2 y + 2, y^2 - 3 y + 2, x - 1]
{x = 1, y = 1, z = 1}, {z = 2, x = 1, y = 1}, {z = 2, x = 1, y = 2}
2. Project the previous result to {y,z}:
i) Eliminate the variable x by computing a Groebner basis in a proper
monomial ordering. In our case we can take a shortcut and notice that we
can just throw away "x-1" from
[z^2 - 3 z + 2, y z - z - 2 y + 2, y^2 - 3 y + 2, x - 1]
ii) > solve({z^2-3*z+2, y*z-z-2*y+2, y^2-3*y+2},{y,z});
{z = 2, y = 2}, {y = 1, z = 1}, {z = 2, y = 1}
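The selection and the projection can be replayed by brute force. This is a sketch only: the exponents in the reduced basis are an assumption read from the Maple output, zeros are searched on the integer grid 0..6, and the variable names are illustrative.

```python
from itertools import product

grid = range(7)

# Reduced basis after selecting x = 1 (exponents as read from the output).
selected = [lambda x, y, z: z**2 - 3*z + 2,
            lambda x, y, z: y*z - z - 2*y + 2,
            lambda x, y, z: y**2 - 3*y + 2,
            lambda x, y, z: x - 1]

selection = {p for p in product(grid, repeat=3)
             if all(g(*p) == 0 for g in selected)}

# Projection shortcut: drop "x - 1" and solve the rest over (y, z) only.
remaining = selected[:3]
projection = {(y, z) for y, z in product(grid, repeat=2)
              if all(g(0, y, z) == 0 for g in remaining)}  # x is unused here

print(sorted(selection), sorted(projection))
```

The projection agrees with simply dropping the x component from each selected tuple, as expected for a set-valued relation.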
Life is not as simple for bags. If we find the Groebner basis, that
would be a radical ideal, and the duplicate information would be lost.
> More importantly, I am not convinced that ideals are a good tool to
> represent bags: killing a fly with a gun, unfamiliar to most
> mumbo-jumbo, etc. Could you please list some advantages as compared
> to the "old way" ?
I don't have a satisfactory answer yet.