Love or hate, or? domains with cardinality two

Nicola

unread,

Nov 2, 2015, 12:29:56 PM11/2/15

to

Not long ago, I had a quick chat with a logician, where we talked about
boolean domains (domain with two elements) in connection to logic and
databases. All started from an instance of a relational schema:

Employee(id, dept, mgr?)

where mgr? is boolean. A tuple (x,y,true) means that employee x belongs
to department y and x is a manager of y. A tuple (x,y,false) means that
employee x belongs to department y but x is not a manager of y.

I was arguing about how schemas like that are commonly found in
practice, and about how I feel uncomfortable with them, because, let
apart the obvious difficulties with enforcing some integrity constraints
(e.g., a department should have one and only one manager at a time) in
current DBMSs, truth of a predicate is coded as a value of a domain.
Suprisingly (for me), she did not dismiss the approach at once, but
pointed out that you are trading a predicate (Manager(id)) for a couple
of new constant symbols and the two things should be essentially
interchangeable (from a logical point of view).

Not long after that, I stumbled across one of those lengthy arguments by
Chris Date, in which he critiqued a schema like:

Loves(x,y) ("x loves y")
Hates(x,y) ("x hates y")

on the ground of a principle by which each tuple should be insertable
only in one schema of a database (in the example above, ('Romeo',
'Juliet') may be inserted in both), and he proposes to use a single
schema R(x, z, y) where z ranges over {'loves','hates'}.

I am still uncomfortable with boolean domains, but I can't pinpoint the
exact reasons. In the Loves/Hates example I don't see a particular
advantage in using a single schema, because in either case you must
impose similar kinds of constraints (if x loves y then x doesn't hate
y...); the only difference is that in the first case those are
interrelational constraints, and you might say that they are not as
"easy" to enforce as intrarelational constraints (is it true?).

In the Employee example, boolean values may lead to redundancy, but of a
kind that is not captured by the usual normal forms. For example, if
each department may only have one manager, then in this instance:

id dept mgr?
--------------
1 Math true
2 Math ?
3 Math ?

all the values denoted by ? must necessarily be false. This schema is in
5NF, however (it is not in DKNF because it has an insertion anomaly,
e.g, inserting (4, 'Math', true) leads to an inconsistent instance). Of
course, this argument doesn't hold if a department may have many
managers.

But maybe my intuition is wrong. What do you think? Are there compelling
reasons to avoid boolean domains in logical database design?

Nicola

--- news://freenews.netfront.net/ - complaints: ne...@netfront.net ---

Nicola

unread,

Nov 2, 2015, 4:55:20 PM11/2/15

to

I think that part of my discomfort has to do with the Closed
World Assumption. This is better explained with a contrived example:
instead of a predicate Cat(x) that holds when x is a cat, let us
consider a binary predicate P(x,y) with y boolean, such that
P(x,true) means "x is a cat" and P(x,false) means "x is not a cat".
Now, what does it mean for a tuple (x,true) *not* to be in P? By the
CWA, it means that 'not P(x,true)' must hold. That is, 'it is not true
that x is a cat'. But then, (x,false) must be in P. So, for every
element of the (infinite) domain of x there must be a tuple in P.

Now, in the Employee example things are a bit different.
Employee(x,y,true) is a compound predicate: 'employee x belongs to
dept y and x is a manager of y'. The negation is a disjunction, which
is consistent with (x,y,false) being or not being in Employee (I leave
the details to you). If the semantics were 'x is a manager of dept y',
then there would be trouble.

The Loves/Hates example is another story. The starting schema has two
predicates Loves(x,y) and Hates(x,y). Of course, if a tuple (a,b) is
in neither, it means that 'a does not love b' and 'a does not hate b',
which must not be a contradiction - if it were, then the two relations
would need to be infinite if the domains are infinite (each pair
should occur in one of the two relations). So, love/hate is not really
a boolean attribute.

Tegiri Nenashi

unread,

Nov 3, 2015, 3:56:28 PM11/3/15

to

Are you comfortable with nested relations? If you do, then boolean domains should be allowed, because booleans are propositional values, while propositions are 0-ary predicates.

James K. Lowden

unread,

Nov 3, 2015, 5:13:21 PM11/3/15

to

On Mon, 2 Nov 2015 18:29:54 +0100
Nicola <nvitac...@gmail.com> wrote:

> Employee(id, dept, mgr?)
>
> where mgr? is boolean. A tuple (x,y,true) means that employee x
> belongs to department y and x is a manager of y. A tuple (x,y,false)
> means that employee x belongs to department y but x is not a manager
> of y.
>
> I was arguing about how schemas like that are commonly found in
> practice, and about how I feel uncomfortable with them, because, let
> apart the obvious difficulties with enforcing some integrity
> constraints (e.g., a department should have one and only one manager
> at a time) in current DBMSs, truth of a predicate is coded as a value
> of a domain. Suprisingly (for me), she did not dismiss the approach
> at once, but pointed out that you are trading a predicate (Manager
> (id)) for a couple of new constant symbols and the two things should
> be essentially interchangeable (from a logical point of view).

...

> But maybe my intuition is wrong. What do you think? Are there
> compelling reasons to avoid boolean domains in logical database
> design?

At a logical level, there's nothing to argue about, right? As your
logician friend said, whether "true" is represented symbolically or
derived from other facts doesn't much matter.

I think your intuition is nevertheless correct for two practical
reasons.

1. Because SQL has no operator for relational division, FORALL tests
are hard to express, and therefore to enforce. That leads people to
choose boolean columns over convoluted quantification queries. But
unless the test is somehow otherwise implemented, the database is
subject to becoming inconsistent. If you skip the boolean, you avoid
that potential inconsistency.

2. Very often, "2 is the wrong answer". What starts out as seemingly
simple -- is manager or not, or loves/hates (terrible contrived example)
-- turn out to have more attributes or states (perhaps the manager's
title or managing relationship, or how every New Yorker I know feels
about the city). Better to describe facts and infer truth, because
that way as the facts become more complex the derivation of "is
manager" can change with them.

My problem with the love/hates example (other than its artificiality) is
that it seems a poor model for any domain of discourse I can imagine.
In what realm do we want to know who are the lovers and haters? Are not
the actors people, and their connections what is interesting?

I would tell the designer that he's likely elevated data to metadata,
because to model new emotions requires a schema change. That's not
*wrong*, because as you say the two models are logically
interchangeable. It will prove inconvenient, though, if a dynamic
aspect of the domain of discourse has been modelled as DDL.

--jkl

-CELKO-

unread,

Nov 3, 2015, 5:17:42 PM11/3/15

to

I did a piece on this topic:

https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=celko%20bit%20of%20a%20problem

I am against it for reasons given there.

Nicola

unread,

Nov 5, 2015, 6:08:19 AM11/5/15

to

Good point, but then the question becomes: are there good arguments
against (or in favor of) 0-ary predicates in database design? In a
nested relational database design I'd tend to prefer factoring out
0-ary predicates into a separate 1NF schema. If you ask me why, well, I
posted exactly because I'm trying to find compelling arguments :)

I'd like to point out that I'm not against boolean domains per se,
e.g., in SQL it is perfectly fine to ask something like:

select exists (select 1 from T);

or

select location, temperature < 10 from Weather;

Note also that my problem is not related to the cardinality of the
domain per se - domains of cardinality two do not have anything special
(*) -, but to a specific interpretation of the two values of the domain
that is constrained so as mimic the true and false truth values.

So, it is the use of such boolean values in base relations that leaves
me a bit perplexed. In a relation with a boolean attribute you are
effectively using a trick to encode both positive and negative
information (true and false propositions), while typically in a
relation schema you just represent only one kind of information. This
leads to situations in which you must guarantee some coherence (at
least one of (x,true) or (x,false) must be present in a valid
instance), as I have noticed in a previous post. On the other hand, in
a case like Employee(id, dept, mgr?) such coherence is guaranteed "by
construction" (unless you allow null values in mgr?...).

Nicola

(*) It occurred to me that in Fagin's "A Normal Form for Relational
Databases That Is Based on Domains and Keys" he considers the
"combinatorial consequences of bounded domain sizes". It was something
about some logical equivalences not holding when domains of nonprime
attributes are "too small". But I don't recall all the details. Maybe,
rereading that paper will give me some insight.

Nicola

unread,

Nov 5, 2015, 6:14:51 AM11/5/15

to

On 2015-11-05 11:08:15 +0000, Nicola said:

> In a relation with a boolean attribute you are effectively using a
> trick to encode both positive and negative information (true and false

> propositions), while typically in a relation schema...

that should be "in a relation instance", of course.

Nicola

Message has been deleted

com...@hotmail.com

unread,

Nov 6, 2015, 5:40:54 AM11/6/15

to

On Thursday, November 5, 2015 at 3:14:51 AM UTC-8, Nicola wrote:
> On 2015-11-05 11:08:15 +0000, Nicola said:
>
> > In a relation with a boolean attribute you are effectively using a
> > trick to encode both positive and negative information (true and false
> > propositions), while typically in a relation schema...

There isn't anything different about boolean attributes than any other attributes. Your problem is you are not writing your predicates correctly.

For EM(x,b) for your first example the predicate is
x is an employee and b and x is a manager
or x is an employee and not b and x is not a manager
which is
x is an employee and it is b that x is a manager

The fundamental theorem of the relational model:
If every base holds the rows that make its predicate true
then every expression holds the rows that make its predicate true.
(Where the predicate of a JOIN is the AND of its arguments' predicates, of a UNION, OR, of a MINUS, AND NOT, of PROJECT all-but-X, EXISTS X, etc.)

So present tuples assert their propositions and absent tuples assert NOT their
propositions. (Because of the second conjuct we say we are making the CWA.)

So the proposition for <fred,true> is "fred is an employee and fred is a manager". When in the relation it asserts that and when not it asserts NOT that.

The more "relational" design is
E(x) x is an employee
M(x) x is a manager

Each base asserts its proposition per its present and absent tuples.

If you write these predicates in terms of the above predicates you get the predciate for the corresponding view relation expression. Which also maps values and constraints from the EM schema to the other.

In your Schrodinger P(x,b) example you suggested that the predicate was:
b and x is a cat
or not b and x is not a cat
But that does not hold the rows you think it would. <c1,true> is in P iff c1 is a cat. Etc for c2,... . But <c1,false> is in P iff c1 is a cat, ie iff <c1,true> is not in P. Etc for c2,... . Ie every c has exactly one row in P.

This is in response to your first post. See what you can make of your others now. See also comp.databases.theory posts signed by me re the relational model & predicates.

PS
Why do you even mention DKNY? It holds when a table can hold any value per its heading. That's the case or it isn't. It doesn't have anything to do with the sequence of normal forms implied by PJ/NF per se.

philip

com...@hotmail.com

unread,

Nov 6, 2015, 5:46:47 AM11/6/15

to

On Friday, November 6, 2015 at 2:40:54 AM UTC-8, com...@hotmail.com wrote:
> On Thursday, November 5, 2015 at 3:14:51 AM UTC-8, Nicola wrote:

Correction:

> In your Schrodinger P(x,b) example you suggested that the predicate was:
> b and x is a cat
> or not b and x is not a cat
> But that does not hold the rows you think it would. <c1,true> is in P iff c1 is a cat. Etc for c2,... . But <c1,false> is in P iff c1 is

not

> a cat, ie iff <c1,true> is not in P. Etc for c2,... . Ie every c has exactly one row in P.

philip

Nicola

unread,

Nov 6, 2015, 12:45:53 PM11/6/15

to

On 2015-11-03 22:13:18 +0000, James K. Lowden said:

> On Mon, 2 Nov 2015 18:29:54 +0100
> Nicola <nvitac...@gmail.com> wrote:
>
>> Employee(id, dept, mgr?)
>>
>> where mgr? is boolean. A tuple (x,y,true) means that employee x
>> belongs to department y and x is a manager of y. A tuple (x,y,false)
>> means that employee x belongs to department y but x is not a manager
>> of y.
>>
>> I was arguing about how schemas like that are commonly found in
>> practice, and about how I feel uncomfortable with them, because, let
>> apart the obvious difficulties with enforcing some integrity
>> constraints (e.g., a department should have one and only one manager
>> at a time) in current DBMSs, truth of a predicate is coded as a value
>> of a domain. Suprisingly (for me), she did not dismiss the approach
>> at once, but pointed out that you are trading a predicate (Manager
>> (id)) for a couple of new constant symbols and the two things should
>> be essentially interchangeable (from a logical point of view).
> ...
>> But maybe my intuition is wrong. What do you think? Are there
>> compelling reasons to avoid boolean domains in logical database
>> design?
>
> At a logical level, there's nothing to argue about, right?

Right. I hoped there was something (related to CWA vs OWA, 3VL, logical
negation, or something else), but apparently there is not.

> I think your intuition is nevertheless correct for two practical

> reasons. [...]

I agree with those. It seems that arguments in favor or against boolean
attributes are eventually just pragmatic.

Nicola

unread,

Nov 6, 2015, 3:20:12 PM11/6/15

to

The product release example is very similar to the Employee example.
Yes, enforcing the relevant constraints on the master_release_flg is a
pain in the neck: I wouldn't know how to do it declaratively, unless
there is support for subselects in constraint-check clauses (which is
not widespread, as far as I know), and even then I would not rule out
triggers because they might perform better. Partial unique indexes might
help, too, if available.

However, defining a master release table would be better, in my opinion,
than your proposed solution based on the minimum release number, because
the meaning would be clearer than by using an implicit rule, which
amounts to not enforcing the constraint (is the product with the minimum
release number really the master release, or has someone deleted the
master release by mistake?):

create table Master_release (
product_id char(15),
product_release_nbr integer,
primary key (product_id),
foreign key (product_id, product_release_nbr) references Products
on update cascade on delete restrict
);

In this case, the only non-trivial constraint is to ensure that each
product (with at least two releases) has a corresponding master release.
Unless I am missing something, I believe that this solution would be
easier to implement than the one based on a master_release_flg flag, and
not much more complicated than your proposed solution (with the
advantage that the view with the master release of the products becomes
trivial).

Your last example about the blood factor has made me realize that there
is a semantic nuance between a proposition and its negation, and two
(positive) propositions that are mutually exclusive. Even if the only
values for the blood factor are Rh+ and Rh-, certainly I would want to
assert both facts: I wouldn't read a tuple (1,'Rh-') as "Person 1 does
not have blood factor Rh+", nor would I design a database with schemas
Person(id) and Rh_Positive(id) and retrieve people with Rh- blood factor
using "Person minus Rh_Positive"...

Nicola

unread,

Nov 6, 2015, 3:56:20 PM11/6/15

to

On 2015-11-06 10:40:53 +0000, com...@hotmail.com said:

> On Thursday, November 5, 2015 at 3:14:51 AM UTC-8, Nicola wrote:
>> On 2015-11-05 11:08:15 +0000, Nicola said:
>>
>>> In a relation with a boolean attribute you are effectively using a
>>> trick to encode both positive and negative information (true and false
>>> propositions), while typically in a relation schema...
>
> There isn't anything different about boolean attributes than any other
> attributes.

Except for the fact that, by definition, the interpretation of the
predicates they
appear in is constrained, i.e., it cannot be the case that P(c,true)
and P(c,false)
are both true or both false.

> Your problem is you are not writing your predicates correctly.

Which predicates did I not specify correctly?

> For EM(x,b) for your first example the predicate is
> x is an employee and b and x is a manager
> or x is an employee and not b and x is not a manager
> which is
> x is an employee and it is b that x is a manager
>

> [...]

> So present tuples assert their propositions and absent tuples assert NOT their
> propositions. (Because of the second conjuct we say we are making the CWA.)
>
> So the proposition for <fred,true> is "fred is an employee and fred is
> a manager". When in the relation it asserts that and when not it
> asserts NOT that.

Ok, that's fine.

> Why do you even mention DKNY?

DKNY?

Nicola

unread,

Nov 6, 2015, 4:08:18 PM11/6/15

to

(Rule number 0: never write on usenet on Friday night.)

philip: forget my previous post, I've figured it out.

Gene Wirchenko

unread,

Nov 9, 2015, 6:33:55 PM11/9/15

to

On Mon, 2 Nov 2015 18:29:54 +0100, Nicola <nvitac...@gmail.com>
wrote:

>Not long ago, I had a quick chat with a logician, where we talked about
>boolean domains (domain with two elements) in connection to logic and
>databases. All started from an instance of a relational schema:
>
>Employee(id, dept, mgr?)
>
>where mgr? is boolean. A tuple (x,y,true) means that employee x belongs
>to department y and x is a manager of y. A tuple (x,y,false) means that
>employee x belongs to department y but x is not a manager of y.
>
>I was arguing about how schemas like that are commonly found in
>practice, and about how I feel uncomfortable with them, because, let
>apart the obvious difficulties with enforcing some integrity constraints
>(e.g., a department should have one and only one manager at a time) in

Invalid assumption. I do some inventory counting and have seen
references in stores to co-managers. The store has more than one
store manager (not just managers of different ranks).

And what obvious difficulties?

[snip]

Sincerely,

Gene Wirchenko

Nicola

unread,

Nov 10, 2015, 1:01:09 PM11/10/15

to

On 2015-11-09 23:33:52 +0000, Gene Wirchenko said:

I was arguing about how schemas like that are commonly found in

practice, and about how I feel uncomfortable with them, because, let

apart the obvious difficulties with enforcing some integrity constraints

(e.g., a department should have one and only one manager at a time) in

Invalid assumption. I do some inventory counting and have seen

references in stores to co-managers. The store has more than one

store manager (not just managers of different ranks).

And I have seen departments where the assumption (at least, the "only one"

part) is valid. Anyway, the point is not whether the assumption is realistic

or not, but what the implications are.

And what obvious difficulties?

If you think there aren't any, show me a straightforward solution (of

course, what is difficult and what is not is largely subjective!).

Nicola

Gene Wirchenko

unread,

Nov 10, 2015, 1:21:55 PM11/10/15

to

On Tue, 10 Nov 2015 19:01:06 +0100, Nicola <nvitac...@gmail.com>
wrote:

>On 2015-11-09 23:33:52 +0000, Gene Wirchenko said:
>
>>> I was arguing about how schemas like that are commonly found in
>>> practice, and about how I feel uncomfortable with them, because, let
>>> apart the obvious difficulties with enforcing some integrity constraints
>>> (e.g., a department should have one and only one manager at a time) in
>>
>> Invalid assumption. I do some inventory counting and have seen
>> references in stores to co-managers. The store has more than one
>> store manager (not just managers of different ranks).
>
>And I have seen departments where the assumption (at least, the "only one"
>part) is valid. Anyway, the point is not whether the assumption is realistic
>or not, but what the implications are.

It is important. If you use the wrong rules, you get a mess.

>> And what obvious difficulties?
>
>If you think there aren't any, show me a straightforward solution (of
>course, what is difficult and what is not is largely subjective!).

I have not claimed that there were none.

You are the one who made the claim that they were obvious. Prove
your claim.

Sincerely,

Gene Wirchenko
o make the claim

Tegiri Nenashi

unread,

Nov 10, 2015, 2:14:57 PM11/10/15

to

On Thursday, November 5, 2015 at 3:08:19 AM UTC-8, Nicola wrote:
> ... : are there good arguments

> against (or in favor of) 0-ary predicates in database design? In a
> nested relational database design I'd tend to prefer factoring out
> 0-ary predicates into a separate 1NF schema. If you ask me why, well, I
> posted exactly because I'm trying to find compelling arguments :)

In every single database schema I have seen the design with boolean datatype was inferior to "normal" domain:
- isManager in your example is clearly less informative than employees hierarchy
- The design with isCreditWorthy boolean attribute literally begs for credit score
https://vadimtropashko.wordpress.com/2010/09/16/on-boolean-datatype-in-sql-and-beyond/

It is up to proponents of Boolean datatype to exhibit schema where their schema design is superior.

> ... In a relation with a boolean attribute you are

> effectively using a trick to encode both positive and negative
> information (true and false propositions), while typically in a
> relation schema you just represent only one kind of information. This
> leads to situations in which you must guarantee some coherence (at
> least one of (x,true) or (x,false) must be present in a valid
> instance), as I have noticed in a previous post.

Is it really that different from constraints & normal forms? If you impose FD id->isManager then you guarantee no contradiction. If you demand PK(id), then you add more consistency.

> ... (*) It occurred to me that in Fagin's "A Normal Form for Relational

> Databases That Is Based on Domains and Keys" he considers the
> "combinatorial consequences of bounded domain sizes". It was something
> about some logical equivalences not holding when domains of nonprime
> attributes are "too small". But I don't recall all the details. Maybe,
> rereading that paper will give me some insight.

I noticed nothing special there: in conclusion to that paper the author stipulates that domain cardinality is greater or equal to 2.

Nicola

unread,

Nov 11, 2015, 1:36:37 PM11/11/15

to

On 2015-11-10 18:21:54 +0000, Gene Wirchenko said:
>
>>> And what obvious difficulties?
>>
>> If you think there aren't any, show me a straightforward solution (of
>> course, what is difficult and what is not is largely subjective!).
>
> I have not claimed that there were none.
>
> You are the one who made the claim that they were obvious. Prove
> your claim.

On the Employee(id, dept, mgr : boolean) schema, the constraint by which
at any time each department has one and only one manager may be formalized
as follows:

forall x (Department(x) -> (exists y (Employee(x, y, true)
and
forall w,z (Employee(x, w, z) and w <> y -> z = false))))

This is not the most straightforward constraint (especially if you have
to implement it in SQL), and other solutions may result in constraints
that are simpler to enforce in practice (e.g., using a different schema
a property equivalent to the second conjunct may be enforced through a
primary key constraint).

Of course, if you reject the constraint as invalid to begin with, this
whole point is void.

Nicola

unread,

Nov 11, 2015, 2:04:45 PM11/11/15

to

>> ... (*) It occurred to me that in Fagin's "A Normal Form for Relational
>> Databases That Is Based on Domains and Keys" he considers the
>> "combinatorial consequences of bounded domain sizes". It was something
>> about some logical equivalences not holding when domains of nonprime
>> attributes are "too small". But I don't recall all the details. Maybe,
>> rereading that paper will give me some insight.
>
> I noticed nothing special there: in conclusion to that paper the author
> stipulates that domain cardinality is greater or equal to 2.

Yes, a very weak condition. More interesting is the situation for prime
attributes and join dependencies: there, cardinality constraints are
less trivial:

- "too small" domains play a role on PJ/NF (Th. 6.10 of Fagin's paper);
- they cause an explosion in computational complexity at least for the
problem of deriving join dependencies [1].

Nicola

[1] Kanellakis, 1980 (cited by Fagin)

Tegiri Nenashi

unread,

Nov 12, 2015, 2:29:04 PM11/12/15

to

Auction example is a very good one. Auction is just an example of an operation performed by a system, with a result being something more than just a Boolean. Virtually every software system provides integer codes, or exceptions as information about operation success or failure. The asymmetry is very puzzling: if it is a success we don't want to know more details, whereas exception or error code is an information indispensable for troubleshooting.

Erwin

unread,

Nov 16, 2015, 3:13:04 PM11/16/15

to

Op dinsdag 3 november 2015 21:56:28 UTC+1 schreef Tegiri Nenashi:

> Are you comfortable with nested relations? If you do, then boolean domains should be allowed, because booleans are propositional values, while propositions are 0-ary predicates.

I don't understand the connection. Care to explain ?

Erwin

unread,

Nov 16, 2015, 3:25:14 PM11/16/15

to

Op dinsdag 10 november 2015 20:14:57 UTC+1 schreef Tegiri Nenashi:

> On Thursday, November 5, 2015 at 3:08:19 AM UTC-8, Nicola wrote:
> > ... : are there good arguments
> > against (or in favor of) 0-ary predicates in database design? In a
> > nested relational database design I'd tend to prefer factoring out
> > 0-ary predicates into a separate 1NF schema. If you ask me why, well, I
> > posted exactly because I'm trying to find compelling arguments :)
>
> In every single database schema I have seen the design with boolean datatype was inferior to "normal" domain:
> - isManager in your example is clearly less informative than employees hierarchy
> - The design with isCreditWorthy boolean attribute literally begs for credit score

Curiously, way before I heard of the existence of the relational model, I often tossed around "codes are poor, entities are rich". Codes, and especially the Y/N kind, will tell you only Y/N. Entities can tell you so much more. Also, entities, and the SQL tables that implement them, are fairly easily extensible using standard data[base] management facilities. Extending Y/N with a "reason" or some such, which then only applies to the 'Y' cases, does not fit that description (well, if relational fidelity and null avoidance is worth anything).

> https://vadimtropashko.wordpress.com/2010/09/16/on-boolean-datatype-in-sql-and-beyond/
>
> It is up to proponents of Boolean datatype to exhibit schema where their schema design is superior.

I am not a fan of boolean attributes in relations.

http://shark.armchair.mb.ca/~erwin/booleandomains_0106.html

Erwin

unread,

Nov 16, 2015, 3:34:29 PM11/16/15

to

Op woensdag 11 november 2015 19:36:37 UTC+1 schreef Nicola:

> On 2015-11-10 18:21:54 +0000, Gene Wirchenko said:
> >
> >>> And what obvious difficulties?
> >>
> >> If you think there aren't any, show me a straightforward solution (of
> >> course, what is difficult and what is not is largely subjective!).
> >
> > I have not claimed that there were none.
> >
> > You are the one who made the claim that they were obvious. Prove
> > your claim.
>
> On the Employee(id, dept, mgr : boolean) schema, the constraint by which
> at any time each department has one and only one manager may be formalized
> as follows:
>
> forall x (Department(x) -> (exists y (Employee(x, y, true)
> and
> forall w,z (Employee(x, w, z) and w <> y -> z = false))))
>
> This is not the most straightforward constraint

???

(DEPARTMENT NOTMATCHING EMPLOYEE) SUBSET-OF FI

plus a key declaration {dept} on the view EMPLOYEE WHERE mgr

Nicola

unread,

Nov 16, 2015, 4:09:14 PM11/16/15

to

On 2015-11-16 20:34:27 +0000, Erwin said:

> Op woensdag 11 november 2015 19:36:37 UTC+1 schreef Nicola:
>> On 2015-11-10 18:21:54 +0000, Gene Wirchenko said:
>>>
>>>>> And what obvious difficulties?
>>>>
>>>> If you think there aren't any, show me a straightforward solution (of
>>>> course, what is difficult and what is not is largely subjective!).
>>>
>>> I have not claimed that there were none.
>>>
>>> You are the one who made the claim that they were obvious. Prove
>>> your claim.
>>
>> On the Employee(id, dept, mgr : boolean) schema, the constraint by which
>> at any time each department has one and only one manager may be formalized
>> as follows:
>>
>> forall x (Department(x) -> (exists y (Employee(x, y, true)
>> and
>> forall w,z (Employee(x, w, z) and w <> y -> z = false))))
>>
>> This is not the most straightforward constraint
>
> ???
>
>
>
> (DEPARTMENT NOTMATCHING EMPLOYEE) SUBSET-OF FI
>
> plus a key declaration {dept} on the view EMPLOYEE WHERE mgr

Somehow I expected that :) What is FI?

Erwin

unread,

Nov 17, 2015, 1:36:39 PM11/17/15

to

Op maandag 16 november 2015 22:09:14 UTC+1 schreef Nicola:
> ...

> >
> > ???
> >
> >
> >
> > (DEPARTMENT NOTMATCHING EMPLOYEE) SUBSET-OF FI
> >
> > plus a key declaration {dept} on the view EMPLOYEE WHERE mgr
>
> Somehow I expected that :) What is FI?
>
> Nicola
>

Empty relation.

Erwin

unread,

Nov 17, 2015, 1:38:57 PM11/17/15

to

Op dinsdag 17 november 2015 19:36:39 UTC+1 schreef Erwin:

(Too lazy to look up the proper symbol(s) in the UCS.)

Erwin

unread,

Nov 17, 2015, 2:45:59 PM11/17/15

to

Op maandag 16 november 2015 22:09:14 UTC+1 schreef Nicola:
> ...

> > ???
> >
> >
> >
> > (DEPARTMENT NOTMATCHING EMPLOYEE) SUBSET-OF FI
> >
> > plus a key declaration {dept} on the view EMPLOYEE WHERE mgr
>
> Somehow I expected that :) What is FI?
>
> Nicola

What I actually should have replied :

Well to me it's clearly not "not straightforward" and if you expected it, why claim it is [not straightforward] ?

Nicola

unread,

Nov 17, 2015, 3:36:54 PM11/17/15

to

If I interpret correctly your constraints, you've got them wrong, because
they allow a department without a manager.

Erwin

unread,

Nov 17, 2015, 5:57:01 PM11/17/15

to

Op dinsdag 17 november 2015 21:36:54 UTC+1 schreef Nicola:

> On 2015-11-17 19:45:57 +0000, Erwin said:
>
> > Op maandag 16 november 2015 22:09:14 UTC+1 schreef Nicola:
> >> ...
> >>> ???
> >>>
> >>>
> >>>
> >>> (DEPARTMENT NOTMATCHING EMPLOYEE) SUBSET-OF FI
> >>>
> >>> plus a key declaration {dept} on the view EMPLOYEE WHERE mgr
> >>
> >> Somehow I expected that :) What is FI?
> >>
> >> Nicola
> >
> > What I actually should have replied :
> >
> > Well to me it's clearly not "not straightforward" and if you expected
> > it, why claim it is [not straightforward] ?
>
> If I interpret correctly your constraints, you've got them wrong, because
> they allow a department without a manager.
>
> Nicola

Okay so that was too hasty of me.

(DEPARTMENT NOTMATCHING (EMPLOYEE WHERE mgr)) SUBSET-OF FI

addresses it, methinks. Still nowhere near "not straightforward" in my book ...

Nicola

unread,

Nov 18, 2015, 4:10:04 AM11/18/15

to

Point taken.

This highlights the importance of the succintness of the language, not only the
expressiveness. Constraints expressed algebraically are in many cases
more succint
than their counterpart in first-order logic (think of division).

SQL is nowhere near that, and when I think of something not being
straightforward,
my opinion is biased by the thought that it needs to be eventually
implemented in
SQL.

That said, my argument is still valid: using a different schema design
would allow
you to simplify the above constraint even further, so it is correct
that the above
is not the *most* straightforward constraint ;)

Erwin

unread,

Nov 19, 2015, 1:46:33 PM11/19/15

to

Op woensdag 18 november 2015 10:10:04 UTC+1 schreef Nicola:
> ...

> >
> > Okay so that was too hasty of me.
> >
> > (DEPARTMENT NOTMATCHING (EMPLOYEE WHERE mgr)) SUBSET-OF FI
> >
> > addresses it, methinks. Still nowhere near "not straightforward" in my
> > book ...
>
> Point taken.
>
> This highlights the importance of the succintness of the language, not only the
> expressiveness. Constraints expressed algebraically are in many cases
> more succint
> than their counterpart in first-order logic (think of division).

I have been "hatching" quite a bit on a [polite] way to express that your "non-straigthforwardness" was actually not in the constraint, but in the language used by formal logic to express it.

It is easy to deduce from "all departments must have exactly one manager" that "departments at fault are those that have no manager at all, or more than one". And while you _NEED_ logic to underpin that deduction, the language of logic itself is actually actively alienating/obstructive to making it. At least to an engineer whose profile perhaps doesn't fit the targeted audience of this group. This group being the theory group.

In my personal opinion, which I am exceptionally going to label explicitly as "humble", that might very well be the reason why Datalog is not [and never will be] the solution to the "problem" of relational databases. (I know full well what amount of flak I'll get over such pronouncements in academic circles.)

> SQL is nowhere near that,

I would hope that anyone around here knows that. (Hmmmmmmmm one particularly vociferous exception already comes to mind.)

> and when I think of something not being
> straightforward,
> my opinion is biased by the thought that it needs to be eventually
> implemented in SQL.

I have put myself without an income for 6/7 years to show that it is not the case that "eventually it must be implemented in SQL".

> That said, my argument is still valid: using a different schema design
> would allow
> you to simplify the above constraint even further, so it is correct
> that the above
> is not the *most* straightforward constraint ;)

My mind is insufficiently clear on this subject (it was crystal clear as fresh water prior to those 6/7 years). Tailoring designs so constraints become expressible declaratively, does not seem like a very good idea, unless you are constrained by the limitations of SQL implementations that do not have CREATE ASSERTION.

But if you have CREATE ASSERTION (and knowing that CREATE ASSERTION is all you need to keep all necesssary "control" over any form of "redundancy"), what motivation remains to stick with normalization theory as being relevant ?

If you have CREATE ASSERTION, and knowing that it adds an extra twist to the notion of "simplicity" as advocated by the normal forms (because it then becomes possible to have a 1NF db structure that still gives all the integrity guarantees provided by the equivalent BCNF structure), what is the notion of "simplest design" to pursue ?

Erwin

unread,

Nov 19, 2015, 1:51:49 PM11/19/15

to

On Thursday, 19 November 2015 19:46:33 UTC+1, Erwin wrote:
> ...

> I have been "hatching" quite a bit on a [polite] way to express that your "non-straigthforwardness" was actually not in the constraint, but in the language used by formal logic to express it.

To which I should have added "but you nailed it for me".

Tegiri Nenashi

unread,

Nov 20, 2015, 1:44:34 PM11/20/15

to

Boolean logic operates with variables and constants which are propositions. Predicate calculus operates with predicates which are more general objects (because propositions are predicates of arity zero).

Note that in Predicate Calculus domain variables are not Boolean. If domain values were allowed to be propositions, then why not admit predicates themselves as well?

Jan Hidders

unread,

Dec 2, 2015, 12:00:03 PM12/2/15

to

Op donderdag 19 november 2015 19:46:33 UTC+1 schreef Erwin:

> Op woensdag 18 november 2015 10:10:04 UTC+1 schreef Nicola:
> > ...
> > >
> > > Okay so that was too hasty of me.
> > >
> > > (DEPARTMENT NOTMATCHING (EMPLOYEE WHERE mgr)) SUBSET-OF FI
> > >
> > > addresses it, methinks. Still nowhere near "not straightforward" in my
> > > book ...
> >
> > Point taken.
> >
> > This highlights the importance of the succintness of the language, not only the
> > expressiveness. Constraints expressed algebraically are in many cases
> > more succint
> > than their counterpart in first-order logic (think of division).
>
> I have been "hatching" quite a bit on a [polite] way to express that your "non-straigthforwardness" was actually not in the constraint, but in the language used by formal logic to express it.
>
> It is easy to deduce from "all departments must have exactly one manager" that "departments at fault are those that have no manager at all, or more than one". And while you _NEED_ logic to underpin that deduction, the language of logic itself is actually actively alienating/obstructive to making it. At least to an engineer whose profile perhaps doesn't fit the targeted audience of this group. This group being the theory group.
>
> In my personal opinion, which I am exceptionally going to label explicitly as "humble", that might very well be the reason why Datalog is not [and never will be] the solution to the "problem" of relational databases. (I know full well what amount of flak I'll get over such pronouncements in academic circles.)

I sincerely doubt you will need your flak jacket for that. :-) I personally would argue that datalog (and its variants, and extensions) is in some respects one of the most elegant ways of expressing a certain class of queries. But that does not mean I think it is "the solution" for expressing all relational queries ever. Never mind constraints, it was not designed for that.

I'm not even sure I would agree with the assumption you seem to be making that there is such a thing as "the solution". Why not different languages for different purposes and users? Just as long as it has a clear semantics and a mapping to a well understood logic.

Jan Hidders

Erwin

unread,

Jul 25, 2016, 3:21:06 PM7/25/16

to

Op woensdag 2 december 2015 18:00:03 UTC+1 schreef Jan Hidders:

There appear to be academic circles who regard "relational databases" as a "solved problem" with "the solution" being Datalog.

Efforts such as Rel tend to piss off [the people in] those circles quite seriously, I'm told, because "it distracts attention from where the real problem is and therefore holds back true progress".

Not my words and some exaggeration might have slipped in during paraphrasal, but the gist of the message was preserved.

Your remark re. "not designed for constraints" puzzles me because all in all, a constraint [declaration] is nothing more than a predicate. Hard to imagine it being claimed that "Datalog is not designed for handling predicates" !!! (Have you seen Martinenghi's work ?)