Joe Celko Where are you?

19 views
Skip to first unread message

Andrew J. Kelly

unread,
Apr 5, 2001, 1:47:32 PM4/5/01
to
Joe,

We have been having some discussions in another thread about scoping and how
the ANSI spec refers to this scenario:


CREATE TABLE [dbo].[TableA] (
[PK_ID] [int] NOT NULL ,
[Email] [varchar] (80) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[Other_1] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[Other_2] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
[Blah_Blah_Blah] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT
NULL
) ON [PRIMARY]
GO

CREATE TABLE [dbo].[TableB] (
[Sub_ID] [int] NOT NULL
) ON [PRIMARY]
GO


SELECT Email FROM TableA WHERE Email IN (SELECT Email FROM TableB)


I would expect an Error since there is no column called Email in TableB and
execution would stop. But it sounds like the Engine determines that since
the column is not in TableB but is in TableA it will just use that and
return all the rows instead. I disagree with that logic since I feel that
the FROM clause should have made it clear that I wanted a column from
TableB. But what I am really looking for is how does the ANSI spec deal with
this and where can I find the explaination so even though I may not like
that logic I will know how it works.

--
Andrew J. Kelly
Targitmail.com


Joe Celko

unread,
Apr 5, 2001, 3:20:25 PM4/5/01
to
>> I would expect an Error since there is no column called Email in TableB and execution would stop. But it sounds like the Engine determines that since the column is not in TableB but is in TableA it will just use that and return all the rows instead. I disagree with that logic since I feel that the FROM clause should have made it clear that I wanted a column from TableB. But what I am really looking for is how does the ANSI spec deal with this and where can I find the explaination so even though I may not like that logic I will know how it works.

SELECT email
FROM TableA
WHERE email
IN (SELECT email
FROM TableB); <<

The rules are pretty much like any other block structured language. The WHERE clause references the nearest containing FROM clause, but cannot yet see any contained FROM clause. The SELECT clause gets its data from the result of the FROM.. WHERE results, does any other calculations on that data and sends it to the host language.

Would the reference to a containing query bother you in these cases?

SELECT email
FROM TableA
WHERE email
IN (SELECT 'joe....@trilogy.com'
FROM TableB
WHERE TableA.email = TableB.email); <== error!

SELECT email
FROM TableA
WHERE email
IN (SELECT 'joe....@trilogy.com'
FROM TableB
WHERE TableA.email = TableA.email); <== okay

SELECT email
FROM TableA
WHERE email
IN (SELECT 'joe....@trilogy.com'
FROM TableB
WHERE TableA.email = TableB.sub_id); <== okay

Would it bother you if you were in an Algol, Pascal, C, etc. language and had this code?

B1: BEGIN email VARCHAR(80),
other_1 VARCHAR(50),
other_2 VARCHAR(50),
blah_blah_blah VARCHAR(50);
...
B2: BEGIN sub_id INTEGER;
PRINT (email);
...
END;
...
END;

Block B2 is not going to blow up over the variable "email", is it?

The scoping rules get a bit fancy when you start using the AS <table name> (<column list>) options with the ANSI JOIN syntax.

(SELECT r, s, t FROM Foo
INNER JOIN
SELECT x, y, z FROM Bar
ON Foo.keycol = Bar.keycol)

You can reference Foo and Bar in the containing queries and use their column names.

(SELECT r, s, t FROM Foo
INNER JOIN
SELECT x, y, z FROM Bar
ON Foo.keycol = Bar.keycol) AS X

You cannot reference Foo and Bar in the containing queries; you must use X, but you get the original column names.

(SELECT r, s, t FROM Foo
INNER JOIN
SELECT x, y, z FROM Bar
ON Foo.keycol = Bar.keycol) AS X(a,b,c)

You cannot reference Foo and Bar in the containing queries; you must use X and the new columns names. I think that you can see how the most local correlation name has to be used in these cases:

(SELECT * FROM Foo AS F1
INNER JOIN
SELECT * FROM Bar
ON F1.keycol = Bar.keycol)

(SELECT * FROM Foo AS F1
INNER JOIN
SELECT * FROM Bar AS B1
ON F1.keycol = B1.keycol)

(SELECT * FROM Foo AS F1
INNER JOIN
SELECT * FROM Bar AS B1
ON F1.keycol = B1.keycol) AS X

The idea is that the nesting has a name that is presetned to the containing levels and the containing levels cannot break into their contained blocks.

--CELKO--

SQL guru at Trilogy
===========================
Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, datatypes, etc. in your schema are.

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

Steve Dassin

unread,
Apr 5, 2001, 3:31:25 PM4/5/01
to
Andrew,

Just how much do you want to get divorced from reality?

:)

stevie


a29379 LastName

unread,
Apr 5, 2001, 4:20:34 PM4/5/01
to

Well Joe, can't you please be specifc.

In this case:

SELECT *
FROM TTEST1
WHERE POSTNR NOT IN
(SELECT POSTNR FROM TTEST2);

If column POSTNR is not a column in table TTEST2
should the rdbms return
1. an error
or
2. a result based om POSTNR as a column in TTEST1

????

/a

Andrew J. Kelly

unread,
Apr 5, 2001, 4:56:42 PM4/5/01
to
Steve,

Not sure which direction you are going with that but usually I try and stay
in the general vacinity<g>. I was just searching for how it works in the
long run. If its the way that it actually seems to work I may not - No I
definitely don't agree with the logic but I can live with it. I just have to
readjust how I have always thought of a subquery to be. I always approached
the Subquery (NOT Correlated Subquery) as it was a totally seperate piece
from the rest of the statement and it was only the results of that subquery
that actually mattered to the outer query. So as long as I didn't reference
another table inside ( ), to me it was self contained. When you call a
method in a DLL you don't expect the DLL to know anything about the object
that called it, usually it just returns a result set that the calling object
acts upon. So if I run the statement SELECT Email FROM TableB by itself
in QA and Email isn't a column of that table it will error. I expected it to
behave the same way as a SubQuery (again non correlated) and return an error
vs going off and searching for other tables to find a matching column in.

I suppose this has gone on long enough<bg>


--
Andrew J. Kelly
Targitmail.com


"Steve Dassin" <try...@aol.com> wrote in message
news:#RETtdgvAHA.1400@tkmsftngp05...

Andrew J. Kelly

unread,
Apr 5, 2001, 5:05:28 PM4/5/01
to
>>>>>>
Would the reference to a containing query bother you in these cases?

SELECT email
FROM TableA
WHERE email
IN (SELECT 'joe....@trilogy.com'
FROM TableB
WHERE TableA.email = TableB.email); <== error!

SELECT email
FROM TableA
WHERE email
IN (SELECT 'joe....@trilogy.com'
FROM TableB
WHERE TableA.email = TableA.email); <== okay

SELECT email
FROM TableA
WHERE email
IN (SELECT 'joe....@trilogy.com'
FROM TableB
WHERE TableA.email = TableB.sub_id); <== okay
<<<<<

In each of these examples you reference another table in the WHERE clause
which to me makes it a Correlated Subquery and I would expect it to work
differently than if the only table specified in the Subquery was the one in
the FROM clause. To me, the Scope does change when you add a reference to
another table but should stay Local otherwise. Local meaning strictly within
the ( ) or subquery itself.


>>>>>>>>>
Would it bother you if you were in an Algol, Pascal, C, etc. language and
had this code?

B1: BEGIN email VARCHAR(80),
other_1 VARCHAR(50),
other_2 VARCHAR(50),
blah_blah_blah VARCHAR(50);
...
B2: BEGIN sub_id INTEGER;
PRINT (email);
...
END;

<<<<<<<<<<

To me this is not a valid analogy. This is just a 2 blocks of code within
the same method or event and would obviously have the same scope. I look at
the Subquery as a method call that completely changes the scope between the
outer and inner queries since there was no specific reference to the outer
table(s) from the inner query.

Steve Dassin

unread,
Apr 5, 2001, 5:40:09 PM4/5/01
to

"Andrew J. Kelly" <ake...@targitmail.com> wrote in message
news:#cxILLhvAHA.856@tkmsftngp03...

> Steve,
>
> Not sure which direction you are going with that but usually I try and
stay
> in the general vacinity<g>.
Touche!.You gotta have a sense of humor about some of this stuff:).
And maybe someone at MS development team has a big sense of
humor too!.After all its 'software' and software is always full of
surprises.
The are many 'surprises',I'm sure,in store for all users:).Don't think
Oracle,
IBM and Sybase don't have surprises for their users too.But you do know
how to avoid the problem of 'seek and ye shall find'.And now you may
want to go back and possible modify your zillion queries and procedures:).
So everything is a correlated query except for a constant.Life will go on:).
But I'll give MS this,their consistent as the same behavior exists in
Access!.
Somewhere in server license agreement I think they say there's no
free lunches:)

Your contributions to the newsgroup are much appreciated!.

Regards,
stevie


BP Margolin

unread,
Apr 5, 2001, 5:51:29 PM4/5/01
to
Andrew,

> So as long as I didn't reference
> another table inside ( ), to me it was self contained.

But you ARE referencing another table, because the column does NOT exist in
the table mentioned in the subquery.
Remember that aliases are necessary ONLY to resolve ambiguity. If you do not
use aliases, then you are telling SQL: "go find this column in **any** table
referenced in the query ... and that is exactly what SQL is doing.

------------------------------------
BP Margolin
Please reply only to the newsgroups.
When posting, inclusion of SQL (CREATE TABLE ..., INSERT ..., etc.) which
can be cut and pasted into Query Analyzer is appreciated.

"Andrew J. Kelly" <ake...@targitmail.com> wrote in message
news:#cxILLhvAHA.856@tkmsftngp03...

Joe Celko

unread,
Apr 5, 2001, 6:11:18 PM4/5/01
to
>> To me this is not a valid analogy. This is just a 2 blocks of code within the same method or event and would obviously have the same scope. I look at the Subquery as a method call that completely changes the scope between the outer and inner queries since there was no specific reference to the outer table(s) from the inner query. <<

To me, two table expressions (subqueries) within the same containing query obviously have the same scope. Remember the "S" in "SQL" used to mean "Structured" in the sense of nested blocks.

If you really believe that the subquery is like a method call that completely changes the scope between the outer and inner queries, then correlated subqueries would not work for you. They would behave like stand-alone queries and thus always fail with an error message about an invalid column name. Ever work with Fortran or early versions of BASIC? That is pretty much their SUBROUTINE model.

There is a specific reference to the outer table(s) from the inner query in this case -- it is in the SELECT list. I gather you want a rule that says they have to be in the WHERE clause or maybe in the WHERE and ON clauses before they are within scope?

Certain people hate it when I do this, but here are the rules, right out of the Standard:

3.3.4.2 Syntactic containment

In a Format, a syntactic element <A> is said to immediately contain a syntactic element <B> if <B> appears on the right-hand side of the BNF production rule for <A>. A syntactic element <A> is said to contain or specify a syntactic element <C> if <A> immediately contains <C> or if <A> immediately contains a syntactic element <B> that contains <C>.

In SQL language, an instance A1 of <A> is said to imediately contain an instance B1 of <B> if <A> immediately contains <B> and the text of B1 is part of the text of A1. An instance A1 of <A> is said to contain or specify an instance C1 of <C> if A1 immediately contains C1 or if A1 immediately contains an instance B1 of <B> that contains C1.

An instance A1 of <A> is said to contain an instance B1 of <B> with an intervening <C> if A1 contains B1 and A1 contains an instance C1 of <C> that contains B1. An instance A1 of <A> is said to contain an instance B1 of <B> without an intervening <C> if A1 contains B1 and A1 does not contain an instance C1 of <C> that contains B1.

An instance A1 of <A> simply contains an instance B1 of <B> if A1 contains B1 without an intervening instance A2 of <A> or an intervening instance B2 of <B>.

If <A> contains <B>, then <B> is said to be contained in <A> and <A> is said to be a containing production symbol for <B>. If <A> simply contains <B>, then <B> is said to be simply contained in <A> and <A> is said to be a simply containing production symbol for <B>.

Let A1 be an instance of <A> and let B1 be an instance of <B>. If <A> contains <B>, then A1 is said to contain B1 and B1 is said to be contained in A1. If <A> simply contains <B>, then A1 is said to simply contain B1 and B1 is said to be simply contained in A1.

An instance A1 of <A> is the innermost <A> satisfying a condition C if A1 satisfies C and A1 does not contain an instance A2 of <A> that satisfies C. An instance A1 of <A> is the outermost <A> satisfying a condition C if A1 satisfies C and A1 is not contained in an instance A2 of <A> that satisfies C.

If <A> contains a <table name> that identifies a view that is defined by a <view definition> V, then <A> is said to generally contain the <query expression> contained in V. If <A> contains <B>, then <A> generally contains <B>. If <A> generally contains <B> and <B> generally contains <C>, then <A> generally contains <C>.

An instance A1 of <A> directly contains an instance B1 of <B> if A1 contains B1 without an intervening <set function specification> or <subquery>.


--CELKO--

SQL guru at Trilogy
===========================
Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, datatypes, etc. in your schema are.

*** Sent via Developersdex http://www.developersdex.com ***

Joe Celko

unread,
Apr 5, 2001, 6:23:15 PM4/5/01
to
>> To me this is not a valid analogy. This is just a 2 blocks of code within the same method or event and would obviously have the same scope. I look at the Subquery as a method call that completely changes the scope between the
outer and inner queries since there was no specific reference to the outer table(s) from the inner query. <<

But there is a specific outer reference -- it is in the SELECT clause list. The SELECT clause is just as much a part of the subquery as the other clauses.

There were some attempt in language design in the 1970's to require that a block also declare its outer references as well as its local variables, but they never really left the laboratory.

My practice, to keep from shooting myself in the foot, is to use correlation names that begin with a letter (maybe two letters) that abbreviates the base table name and which end with an integer that tells me which copy of the base table this one is. The idea is I can follow my nesting levels and that I will always qualify column names because these table names are so short.

Some people hate it when I do this, but here is the rules right out of the Standard

3.3.4.2 Syntactic containment

In a Format, a syntactic element <A> is said to immediately contain a syntactic element <B> if <B> appears on the right-hand side of the BNF production rule for <A>. A syntactic element <A> is said to contain or specify a syntactic element <C> if <A> immediately contains <C> or if <A> immediately contains a syntactic element <B> that contains <C>.

In SQL language, an instance A1 of <A> is said to immediately contain an instance B1 of <B> if <A> immediately contains <B> and the text of B1 is part of the text of A1. An instance A1 of <A> is said to contain or specify an instance C1 of <C> if A1 immediately contains C1 or if A1 immediately contains an instance B1 of <B> that contains C1.

Joe Celko

unread,
Apr 5, 2001, 6:26:16 PM4/5/01
to
>> Well Joe, can't you please be specifc.

In this case:

SELECT *
FROM TTEST1
WHERE POSTNR NOT IN
(SELECT POSTNR FROM TTEST2);

If column POSTNR is not a column in table TTEST2
should the rdbms return
1. an error
or
2. a result based om POSTNR as a column in TTEST1 <<

2, of course. This will make it more specific:

DROP TABLE Foobar, Foo;

CREATE TABLE Foobar
(alpha CHAR(1) NOT NULL PRIMARY KEY,
thing1 INTEGER NOT NULL);

INSERT INTO Foobar VALUES ('a', 1);
INSERT INTO Foobar VALUES ('b', 2);
INSERT INTO Foobar VALUES ('c', 3);
INSERT INTO Foobar VALUES ('d', 4);
SELECT * FROM Foobar;

CREATE TABLE Foo
(thing2 INTEGER NOT NULL);

INSERT INTO Foo(thing2) VALUES(1);
INSERT INTO Foo(thing2) VALUES(2);

SELECT * FROM Foobar;
SELECT * FROM Foo;

SELECT *
FROM Foobar
WHERE thing1
IN (SELECT Foobar.thing1
FROM Foo);

alpha thing1
----- -----------
a 1
b 2
c 3
d 4

(4 row(s) affected)

--CELKO--

SQL guru at Trilogy
===========================
Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, datatypes, etc. in your schema are.

*** Sent via Developersdex http://www.developersdex.com ***

Steve Dassin

unread,
Apr 5, 2001, 6:37:06 PM4/5/01
to

"BP Margolin" <bpm...@attglobal.net> wrote in message
news:Ozke8nhvAHA.1928@tkmsftngp05...
> Andrew,

Hey where you come from?.
Join the party:).
How ya doing.

stevie


Andrew J. Kelly

unread,
Apr 6, 2001, 7:54:01 AM4/6/01
to
BP,

> But you ARE referencing another table, because the column does NOT exist
in
> the table mentioned in the subquery.<<

Yea I see it that way now. I had always thought that the table had to be
actually specified in the same FROM clause in order for SQL to pay attention
to it. I just never realized that sql would start wondering off looking for
lost columns<bg>.


--
Andrew J. Kelly
Targitmail.com

"BP Margolin" <bpm...@attglobal.net> wrote in message
news:Ozke8nhvAHA.1928@tkmsftngp05...

Tibor Karaszi

unread,
Apr 6, 2001, 7:58:36 AM4/6/01
to
> I just never realized that sql would start wondering off looking for
> lost columns<bg>.

Now, that is a nice friendly way of phrasing it:-) I wonder what we would find if we
took a snapshot of the memory in the machine at that point in time.

"Hmm... nope - not here. I wonder where it can be? Perhaps I left it at..."

--
Tibor Karaszi, SQL Server MVP
FAQ from Neil at: http://www.sqlserverfaq.com
Please reply to the newsgroup only, not by email.


"Andrew J. Kelly" <ake...@targitmail.com> wrote in message

news:#TBilApvAHA.2012@tkmsftngp05...

Andrew J. Kelly

unread,
Apr 6, 2001, 8:01:04 AM4/6/01
to
>>If you really believe that the subquery is like a method call that
completely changes the scope between the outer and inner queries, then
correlated >>subqueries would not work for you. <<
Thats why I have been trying to clarify this mess as only when NOT doing a
correlated subquery. In a correlated subquery I would specifically reference
a table from the outer query and this would not be an issue.


>> They would behave like stand-alone queries and thus always fail with an
error message about an invalid column >>name. Ever work with Fortran or
early versions of BASIC? That is pretty much their SUBROUTINE model. <<

Exactly my point.

>>Certain people hate it when I do this, but here are the rules, right out
of the Standard:<<

No, actually that was what I was looking for all along. I never doubted that
you (or anyone else that gave feedback) was wrong, personally I just don't
agree with the logic and was curious how the spec worded it.

Thanks

--
Andrew J. Kelly
Targitmail.com


"Joe Celko" <joe....@trilogy.com> wrote in message
news:#Kt0X1hvAHA.1880@tkmsftngp03...

Paul Thornett

unread,
Apr 6, 2001, 7:36:08 PM4/6/01
to
"Tibor Karaszi"
<tibor.please_reply_to...@cornerstone.se> wrote in
message news:OSQdgDpvAHA.1400@tkmsftngp05...

> > I just never realized that sql would start wondering off looking
for
> > lost columns<bg>.
>
> Now, that is a nice friendly way of phrasing it:-) I wonder what we
would find if we
> took a snapshot of the memory in the machine at that point in time.
>
> "Hmm... nope - not here. I wonder where it can be? Perhaps I left it
at..."

Well, it seems to me this is how to solve cross-database queries.
There's clearly no need to specify the names of the other databases,
as Sql Server is obviously smart enough to figure it out by itself.
Mind you, it might take a while.........<g>!

--
Paul Thornett

Help fight Alzheimer's, Cystic Fibrosis and Mad Cow Disease
http://www.stanford.edu/group/pandegroup/Cosm/


Reply all
Reply to author
Forward
0 new messages