Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Coul constraint *enforcement be done at compile time vs runtime ?

1 view
Skip to first unread message

Cimode

unread,
Jun 16, 2009, 3:52:33 AM6/16/09
to
As the subject says...

The traditional direct image system approach consists of having
relations constraints implemented at the time when the relation is
updated meaning during UPDATE/DELETE/INSERT operations. As a part of
a research effort to understand better what may be a relational system
implementation, I have discovered that it is actually possible to
perform constraint enforcement at constraint declaration time.

Since I have not read anything on the subject, I was wandering and
hoping for additional references on that specific aspect. I also woul
dlike to know if previous attempts at building such systems have been
done in the past.

Brian Selzer

unread,
Jun 16, 2009, 8:47:58 AM6/16/09
to

"Cimode" <cim...@hotmail.com> wrote in message
news:d156099b-b43c-47ae...@e21g2000yqb.googlegroups.com...

> As the subject says...
>
> The traditional direct image system approach consists of having
> relations constraints implemented at the time when the relation is
> updated meaning during UPDATE/DELETE/INSERT operations. As a part of
> a research effort to understand better what may be a relational system
> implementation, I have discovered that it is actually possible to
> perform constraint enforcement at constraint declaration time.

Can you be more specific? Are you saying that /all/ constraints can be
enforced at compile time? When a user submits an illogical update, for
example, a delete that targets a tuple that is referenced in an inclusion
dependency, what do you call the process of determining that it is an
illogical update, if not constraint enforcement? Moreover, while domain
definitions partition the set of all particulars in the Universe into a set
of domains and while state constraints shape the Universe into a set of
possible worlds, transition constraints specify which possible worlds are
directly accessible from the distinguished actual world, which is
represented by the current database value (the redundancy in the terms,
'the' and 'current'.and in the terms 'database' and 'value' being
intentional for emphasis). As a consequence, since the enforcement of
transition constraints depend on the current database value, how can one
determine at constraint declaration time whether a particular update
violates a transition constraint?

Walter Mitty

unread,
Jun 16, 2009, 10:20:10 AM6/16/09
to

"Cimode" <cim...@hotmail.com> wrote in message
news:d156099b-b43c-47ae...@e21g2000yqb.googlegroups.com...

I don't know whether this applies or not, but you can decide.

In addition to constraints that are checked for compliance (meaning
"enforced") at UPDATE/INSERT/DELETE time there is yet another time that a
SQL DBMS might check constraints. It's at COMMIT time. It is possible to
come up with collections of constraints such that it is impossible to get
from one legitimate state to another legitimate state without temporarily
putting the database in a non compliant state. By deferring some
constraints to COMMIT time, it becomes possible to get through the non
compliant state without being blocked by a constraint enforcement. I can
give an example if that helps. A good DBMS will make it possible to defer
constraint checking when appropriate.

I am not familiar with Tutorial D. But some of the previous discussion in
this newsgroup leads me to believe that the above commentary would not apply
to Tutorial D. In particular, the transformations associated with a single
transaction in SQL can require a series of actions, where each UPDATE,
INSERT, or DELETE is a separate action. In Tutorial D, if I understand it
correctly, the transformations that need to be considered atomic can all be
expressed as one action, in such a way that the programmer need never be
aware of actions and transactions being different units of work. I am
hoping that someone who does know D will clarify this point.

I do not know whether your proposed implementation resmbles SQL or D in this
regard. Hope this helps.

Brian Selzer

unread,
Jun 16, 2009, 10:46:00 AM6/16/09
to

"Walter Mitty" <wam...@verizon.net> wrote in message
news:uSNZl.1309$P5....@nwrddc02.gnilink.net...

D requires that constraints be checked at the end of each statement, not at
the end of each transaction, but D provides support for multiple assignment,
in which a series (not a set) of assignments are submitted en mass as a
single statement.

Cimode

unread,
Jun 17, 2009, 7:54:52 AM6/17/09
to
I realize I have not been explicit enough by what I mean by
*enforcement*. I apologize but it is difficult for me to summarize a
decade effort of design into a single thread.

To answer your question, let me use a significant example that will
underline the difference in approach between the way a direct image
system handles an INSERT constraint check and the way a system
implementing independence between the logical and physical layer would
work. I won't get into too much detail since that would require very
lengthy explanations onto how the relation representation scheme works
but I will attempt to provide some explanations to clarify my point.

In a typical direct image system such as ORACLE, DB2 or SQL Server,
the typical INSERT constraint checking is handled the following
fashion (there may be some variations but the general algorhythmics
remains the same):

Considering the 3 following relations SCHOOL, STUDENT where one SCHOOL
may have 0 or more STUDENTS. Let's suppose I want to INSERT 3 new
STUDENTS while checking that they do belong to a SCHOOL that belongs
to SCHOOL body. The traditional direct image system works in the
following fashion

--> A transaction context is created for the INSERT when the
instruction is submitted to the engine.
--> The *to be inserted* new STUDENT tuples in are matched against
SCHOOL to enforce constraint checking. The operation involves an *in
between* (before COMMIT) issue/execution of SELECT statements that
check the physical existence of matching SCHOOLS in SCHOOL relation.
--> If SOME or ALL tuples are matched, a COMMIT OK flag is issued to
allow the transaction to continue. Oppositely, when SOME or ALL of
the tuples are NOT matched in SCHOOL, a COMMIT KO is issued to
indicate to the system that there would be a constraint violation if
the current operation would actually be completed. The COMMIT KO, an
internal data sublanguage instruction, then usually ends up with a
ROLLBACK external language instruction.

By the above mechanism, a traditional direct image system, the system
analyzes the *intent* at updating the body of relation and expresses
internally whether or not that intent would actually result in a
constraint violation. Among the many drawbacks that come from such
approach:
--> The necessary *analysis of intent* requires multiple SELECT
statements to be issued. These SELECT statement will occur everytime
a set of tuples are to update on a specific relation and are in fact
orthogonal overhead to the INSERT transaction context. For instance,
that even if one issues an INSERT statement for a similar value to be
inserted, SELECT statements on that value will occur as many times as
the INSERT statement is issued.
--> By binding the analysis of intent solely to the transactional
context of a single INSERT transaction, the system will actually lock
any other transactions from updating the content of both the STUDENT
and the SCHOOL transaction. STUDENT will be locked for allowing the
INSERT and SCHOOL will be locked to allow the SELECT statement (there
are wayss around that but they require additional ressources to be
consumed; I am thinking about ORACLE Read consistency and SQL Server
SNAPSHOT mode). All this comes at a great cost as far as concurrency
is concerned and resoource consumption.

There's in fact a fundamental logical problem hidding behind the above
physical problem. The non distinguishability between the two
contexts, the context of the operation and the context of analysis of
intent is a direct consequence onto how the internal relation is to be
operated. In enforcing constraints and performing a UNION, direct
image systems *internally* assume the following elementary relational
premises:

--> Premice 1: An user INSERT/UNION user operation between STUDENTS
and NEW_STUDENTS is in fact 2 relational internal operations PROJECTION
(SELECT for analysis of intent)-UNION(COMMIT INSERT). That is a
direct consequence of the fact that when operated the two relations
are not of the same type so the analysis of intent imposes a necessary
*conversion* process to allow the external UNION to be performed.
--> Premice2: The UNION operation occurs according to the following
algorhthmics:

STUDENTS UNION NEW_STUDENTS = STUDENTS
<--> NEW_STUDENTS = EMPTY no matter what!!!

Premice 1 is a consequence of the fact that NEW_STUDENTS internal
representation has no mechanism that would represent it as STUDENT to
allow the analysis of intent to be implicit. Premice 2 OTOH is a
direct consequence of flawed relational theory.
In fact, a more reasonnable approach is to state that

STUDENTS UNION NEW_STUDENTS = STUDENTS2 where STUDENTS, NEW_STUDENTS,,
and STUDENTS2 are relations sharing the same header. As a
consequence, STUDENTS2 in fact equates to STUDENTS *if and only if*
NEW_STUDENTS is empty OR if NEW_STUDENTS would provoque a constraint
violation. This is in fact a fundamental difference from traditional
relational approach with important consequences:

--> The argument that indicate that a relational operation output such
as a UNION *always* ends up with one of the relations involved in the
operation is a fallacy. That fact is a direct consequence of
underestimating the relation *naming* process to the profit of headers
and bodies.
--> A truly relational system can *only* perform operations between
relations of the same type. A system that does not guarantee
representations can not be perform relational operations neither it
can do analysis of intent efficiently.
--> A UNION operation between two relation always produces a third
relation.

From the above premices, I can present how a non direct image system
actually performs the UNION constraint checking :


--> To operate relations, a relational operation engine handler must
have an *isotyping* mechanism (same type) that would necessarily
represent the relation variables as being of the same type. The
isotyping mechanism is directly bound to the operation.
--> The isotyping mechanism would allow to internally solve

STUDENTS UNION NEW_STUDENTS = STUDENTS2 and consider a constraint
violation *if and only if* STUDENTS2 = STUDENTS. I can not present
how my engine does it in detail, because it would impose an entire
book to write, but what I can say is that the relation representation
scheme used allows a relation level projection *not* a tuple level.
The representation scheme allows for instance to equate two un-ary
relations without resorting to elementary tuple based projections.
--> Since the isotyping mechanism is bound to the operation, and
inherently allows the analysis of intent, each time a relational
operation is compiled, the analysis of intent is performed, leaving
only the UNION operation to be performed at run time

What I can say is that this engine is currently working (still fixing
issues though especially with virtual relation types) but it performs
basic relational operations according to the above principles. The
advantages are numerous:

--> A system that operates relations not values
--> A more economic system since no overhead is done in constraint
checking. Only the demanded operation is actually executed.
--> The system implements the same mechanism between any pair of N-ary
relations. Since these relations can share the same header it greatly
simplifies for instance dupplicate checking.
--> I am still working on the system language compiler and fixing some
issues and hope to have a prototype in the next 2 years...

I hope this clarified...

Cimode

unread,
Jun 17, 2009, 8:00:33 AM6/17/09
to
Thank you for this input.

> I am not familiar with Tutorial D.  But some of the previous discussion in
> this newsgroup leads me to believe that the above commentary would not apply
> to Tutorial D.  In particular, the transformations associated with a single
> transaction in SQL can require a series of actions, where each UPDATE,
> INSERT, or DELETE is a separate action.  In Tutorial D, if I understand it
> correctly,  the transformations that need to be considered atomic can all be
> expressed as one action, in such a way that the programmer need never be
> aware of actions and transactions being different units of work.  I am
> hoping that someone who does know D will clarify this point.

> I do not know whether your proposed implementation resmbles SQL or D in this
> regard.  Hope this helps.

Thank you...I have posted a few examples of syntax on this ng and one
may see that the language in fact ressembles neither. The language I
am developping has some common traits but is inherently different.
The two above languages are based on flawed computing models for
representing relations and operate these relations. But again we are
not in the layer of RM but rather in the layer of computing models to
represent RM the way it is supposed to be represented.

Regards...

Brian Selzer

unread,
Jun 17, 2009, 8:56:27 PM6/17/09
to
I just want you to know that I am not ignoring your post. I am unclear
about a few things, and find that I have to read and re-read what you wrote.
I don't think it's anyone's fault--just a natural consequence of native
language differences.

Brian Selzer

unread,
Jun 20, 2009, 3:30:12 PM6/20/09
to

"Cimode" <cim...@hotmail.com> wrote in message
news:b9f4d22f-1bfb-4e51...@c9g2000yqm.googlegroups.com...

OK, but my questions apply to all implementations, not just direct image
implementations.

> In a typical direct image system such as ORACLE, DB2 or SQL Server,
> the typical INSERT constraint checking is handled the following
> fashion (there may be some variations but the general algorhythmics
> remains the same):
>
> Considering the 3 following relations SCHOOL, STUDENT where one SCHOOL
> may have 0 or more STUDENTS. Let's suppose I want to INSERT 3 new
> STUDENTS while checking that they do belong to a SCHOOL that belongs
> to SCHOOL body. The traditional direct image system works in the
> following fashion
>
> --> A transaction context is created for the INSERT when the
> instruction is submitted to the engine.

yes.

> --> The *to be inserted* new STUDENT tuples in are matched against
> SCHOOL to enforce constraint checking.

True, but wouldn't the constraint check have to be done regardless of the
implementation.

> The operation involves an *in
> between* (before COMMIT) issue/execution of SELECT statements that
> check the physical existence of matching SCHOOLS in SCHOOL relation.
> --> If SOME or ALL tuples are matched, a COMMIT OK flag is issued to
> allow the transaction to continue. Oppositely, when SOME or ALL of
> the tuples are NOT matched in SCHOOL, a COMMIT KO is issued to
> indicate to the system that there would be a constraint violation if
> the current operation would actually be completed. The COMMIT KO, an
> internal data sublanguage instruction, then usually ends up with a
> ROLLBACK external language instruction.

And this is exactly what an inclusion dependency requires. Is the SCHOOL
that is referenced in the insert already defined? This is a logical
requirement: I don't see how the type of implementation can alter the need
for this question to be answered whenever an insert of a student of a school
is submitted.

> By the above mechanism, a traditional direct image system, the system
> analyzes the *intent* at updating the body of relation and expresses
> internally whether or not that intent would actually result in a
> constraint violation. Among the many drawbacks that come from such
> approach:
> --> The necessary *analysis of intent* requires multiple SELECT
> statements to be issued. These SELECT statement will occur everytime
> a set of tuples are to update on a specific relation and are in fact
> orthogonal overhead to the INSERT transaction context. For instance,
> that even if one issues an INSERT statement for a similar value to be
> inserted, SELECT statements on that value will occur as many times as
> the INSERT statement is issued.
> --> By binding the analysis of intent solely to the transactional
> context of a single INSERT transaction, the system will actually lock
> any other transactions from updating the content of both the STUDENT
> and the SCHOOL transaction. STUDENT will be locked for allowing the
> INSERT and SCHOOL will be locked to allow the SELECT statement (there
> are wayss around that but they require additional ressources to be
> consumed; I am thinking about ORACLE Read consistency and SQL Server
> SNAPSHOT mode). All this comes at a great cost as far as concurrency
> is concerned and resoource consumption.

Yes, locking is a necessary evil whenever an update is submitted. Even with
SNAPSHOT isolation, locks are applied when performing updates. There are
only three alternatives to locking that I can think of:

(1) poll each connection to see if it has an update to submit;
(2) pass a token amongst the connections and agree that only the connection
with the token can submit an update;
(3) allow each connection to just stuff information in, but devise a
mechanism for detecting collisions and reverting the database to the state
it was in at the start of the first of the colliding updates.

All of these alternatives have their ups and downs. Even by implementing
some kind of priority queuing mechanism, alternatives (1) and (2) would
incur overhead for connections that don't need anthing done. Alternative
(3) may appear useful, but it introduces other problems that are beyond the
scope of this post.

SNAPSHOT isolation limits the need for locking to just those transactions
that submit updates, and that is in my opinion the least intrusive of the
concurrency control mechanisms. I don't see how any system, whether direct
image or not, would benefit more by implementing any of the alternatives.

> There's in fact a fundamental logical problem hidding behind the above
> physical problem. The non distinguishability between the two
> contexts, the context of the operation and the context of analysis of
> intent is a direct consequence onto how the internal relation is to be
> operated. In enforcing constraints and performing a UNION, direct
> image systems *internally* assume the following elementary relational
> premises:
>
> --> Premice 1: An user INSERT/UNION user operation between STUDENTS
> and NEW_STUDENTS is in fact 2 relational internal operations PROJECTION
> (SELECT for analysis of intent)-UNION(COMMIT INSERT). That is a
> direct consequence of the fact that when operated the two relations
> are not of the same type so the analysis of intent imposes a necessary
> *conversion* process to allow the external UNION to be performed.
> --> Premice2: The UNION operation occurs according to the following
> algorhthmics:
>
> STUDENTS UNION NEW_STUDENTS = STUDENTS
> <--> NEW_STUDENTS = EMPTY no matter what!!!
>
> Premice 1 is a consequence of the fact that NEW_STUDENTS internal
> representation has no mechanism that would represent it as STUDENT to
> allow the analysis of intent to be implicit. Premice 2 OTOH is a
> direct consequence of flawed relational theory.
> In fact, a more reasonnable approach is to state that

I don't understand what you're driving at here. The projection and join
(which for 3 rows would probably translate into a three index look-ups, one
for each SCHOOL value) required to enforce the constraint, do not alter
operands of the join, just the result, which should be empty. STUDENTS and
NEW_STUDENTS are of the same type, and therefore should remain UNION
compatible regardless of the evaluation of relational expressions required
to enforce constraints.

> STUDENTS UNION NEW_STUDENTS = STUDENTS2 where STUDENTS, NEW_STUDENTS,,
> and STUDENTS2 are relations sharing the same header. As a
> consequence, STUDENTS2 in fact equates to STUDENTS *if and only if*
> NEW_STUDENTS is empty OR if NEW_STUDENTS would provoque a constraint
> violation. This is in fact a fundamental difference from traditional
> relational approach with important consequences:
>
> --> The argument that indicate that a relational operation output such
> as a UNION *always* ends up with one of the relations involved in the
> operation is a fallacy. That fact is a direct consequence of
> underestimating the relation *naming* process to the profit of headers
> and bodies.

I agree that a UNION does not necessarily end up with one of the relations
involved in the UNION, but can you be a little more clear here? What is
this *naming* process and how can underestimating it profit headers and
bodies?

> --> A truly relational system can *only* perform operations between
> relations of the same type. A system that does not guarantee
> representations can not be perform relational operations neither it
> can do analysis of intent efficiently.

I don't understand what you mean here by operations. Can't a join can occur
between relations with even completely different headings?

> --> A UNION operation between two relation always produces a third
> relation.
>

Absolutely.

> From the above premices, I can present how a non direct image system
> actually performs the UNION constraint checking :
>

I can't really comment further until I have some of the above questions
answered.

0 new messages