<helmi.a...@marketmax.com> wrote in message
news:cd9afc$e88$1...@license1.unx.sas.com...
> This T_SQL (Update.. from) works .. it should not .. any explanation ??
> create table a (a int, b int, c varchar(10))
> create table b (a int, b int, c varchar(10))
>
> insert into a values(1,1,'a')
> insert into a values(1,2,'x')
>
>
> insert into b values(1,1,'g')
> insert into b values(1,1,'h')
> insert into b values(1,1,'u')
> insert into b values(1,2,'w')
> insert into b values(1,2,'z')
>
> begin tran
> select * from a
> select * from b
> update a set c = b.c from b where a.a = b.a and a.b = b.b
> select * from a
> rollback tran
>
>
Yep, that's what it does and yes it is *very* bad. For this reason I usually
recommend avoiding Microsoft's proprietary UPDATE FROM syntax altogether. On
the other hand, some people would counter that it's up to the developer to
ensure his joins are valid - the UPDATE FROM syntax does work properly and
safely provided the join results in no more than one row on the right side
of the assignment.
--
David Portas
SQL Server MVP
--
CREATE TABLE Countries (countryname VARCHAR(20) PRIMARY KEY, capitalcity
VARCHAR(20) NULL)
CREATE TABLE Cities (cityname VARCHAR(20) NOT NULL, countryname VARCHAR(20)
NOT NULL REFERENCES Countries (countryname), CONSTRAINT PK_Cities PRIMARY
KEY (cityname, countryname))
INSERT INTO Countries (countryname) VALUES ('USA')
INSERT INTO Countries (countryname) VALUES ('UK')
INSERT INTO Cities VALUES ('Washington','USA')
INSERT INTO Cities VALUES ('London','UK')
INSERT INTO Cities VALUES ('Manchester','UK')
The MS-syntax makes it all too easy for the developer to slip-up by writing
ambiguous UPDATE...FROM statements where the JOIN criteria is not unique on
the right side of the join. Try these two identical UPDATE statements with a
small change to the primary key in between.
UPDATE Countries
SET capitalcity = cityname
FROM Countries JOIN Cities /* evil UPDATE... FROM syntax */
ON Countries.countryname = Cities.countryname
SELECT * FROM Countries
ALTER TABLE Cities DROP CONSTRAINT PK_Cities
ALTER TABLE Cities ADD CONSTRAINT PK_Cities PRIMARY KEY (countryname,
cityname)
UPDATE Countries
SET capitalcity = cityname
FROM Countries JOIN Cities /* don't do this! */
ON Countries.countryname = Cities.countryname
SELECT * FROM Countries
I get this from the first SELECT statement:
countryname capitalcity
-------------------- --------------------
UK London
USA Washington
and this from the second:
countryname capitalcity
-------------------- --------------------
UK Manchester
USA Washington
(though these results aren't guaranteed - that's part of the problem).
Why did the result change? The physical implementation has affected the
meaning of the code, with serious, potentially disastrous consequences. How
can you even test your code if its results are subject to change due to the
vagaries of storage, indexing and cacheing?
With the ANSI syntax there is no ambiguity. The UPDATE statement compels the
programmer to design an unambiguous assignment subquery that returns no more
than a single value.
UPDATE Countries
SET capitalcity =
(SELECT MIN(cityname)
FROM Cities
WHERE Countries.countryname = Cities.countryname)
At the very least this forces the developer to reconsider whether the UPDATE
statement makes logical sense.
Ironically, you've come very close to demonstrating how useFUL the
UPDATE .. FROM can be. Your final (ANSI) query, while fine for the data
you give, is what I've seen suggested all too often as the "right" way
to rewrite an UPDATE .. FROM query. The problem, of course, is that
while neither the T-SQL nor ANSI query has an explicit WHERE clause, the
T-SQL UPDATE .. FROM query limits the update to target table rows that
appear in the implicit join. If that's not all of the rows, as is often
the case, the ANSI rewrite updates the correct rows and also NULLs out
information in every other row.
I recommend the ANSI syntax myself, but at the same time, I've
forgotten to put in a crucial WHERE clause on more than a few occasions
also. And there's another thing about the T-SQL syntax I don't like:
mentioning the target table in the FROM clause is optional. In other words,
update T set
columnA = U.columnA
from U
where U.key = T.key
has the same meaning as
update T set
columnA = U.columnA
from T, U
where U.key = T.key
Despite the convenience of making the update query look like a join, in
which you can change UPDATE to SELECT to see what will be updated, the
second one bugs me because of the automagical identification that takes
place between the two references to T. When you have a self-join, or
use table aliases in the FROM clause, it gets even mysteriouser.
Steve Kass
Drew University
It's true that in that sense the ANSI syntax is less concise. You have to
specify the "join" criteria twice to achieve the same functionality as
UPDATE FROM ... INNER JOIN. So the programmer can still screw up but at
least she will do so in manner that is deterministic!
> update T set
> columnA = U.columnA
> from T, U
> where U.key = T.key
The FROM syntax is truly a bad Friday afternoon piece of work. SQL Server,
Access and Visual Foxpro all have proprietary methods for multi-table UPDATE
statements but most frustrating of all, they are all different! (although
the VFP9 Beta at last supports ANSI UPDATEs :).
For me the worst is that UPDATEs with ill-formed JOIN criteria will work
with no indication that anything is wrong. You just know that not everyone
is savvy enough to pick the problem up in testing. At the very least it
should raise a warning condition - something that could surely be fixed
without breaking legacy code or incurring an unacceptable performance
overhead.
David Portas wrote:
>>while neither the T-SQL nor ANSI query has an explicit WHERE clause, the
>>T-SQL UPDATE .. FROM query limits the update to target table rows that
>>appear in the implicit join. If that's not all of the rows, as is often
>>the case, the ANSI rewrite updates the correct rows and also NULLs out
>>information in every other row.
>>
>>
>
>It's true that in that sense the ANSI syntax is less concise. You have to
>specify the "join" criteria twice to achieve the same functionality as
>UPDATE FROM ... INNER JOIN. So the programmer can still screw up but at
>least she will do so in manner that is deterministic!
>
>
>
>>update T set
>> columnA = U.columnA
>>from T, U
>>where U.key = T.key
>>
>>
>
>The FROM syntax is truly a bad Friday afternoon piece of work. SQL Server,
>Access and Visual Foxpro all have proprietary methods for multi-table UPDATE
>statements but most frustrating of all, they are all different! (although
>the VFP9 Beta at last supports ANSI UPDATEs :).
>
>
At least SQL Server doesn't let you update more than one table at a time.
>For me the worst is that UPDATEs with ill-formed JOIN criteria will work
>with no indication that anything is wrong. You just know that not everyone
>is savvy enough to pick the problem up in testing. At the very least it
>should raise a warning condition - something that could surely be fixed
>without breaking legacy code or incurring an unacceptable performance
>overhead.
>
>
>
I bet it would break some legacy code that works for its purpose, either
where the database
is not normalized and the multiple rows that satisfy the join criteria
all supply the same value
to the update, or where this syntax is used to have the effect of an
I_DONT_CARE_WHICH()
aggregate, as in a mess like this:
use Northwind
go
create table SloppyCustomer (
customerID char(5) primary key,
anyOrderID int null references Orders(OrderID)
)
go
-- assume no customer table
insert into SloppyCustomer
select distinct -- ick
CustomerID,
NULL
from Orders
go
update SloppyCustomer set
anyOrderID = OrderID
from Orders
where Orders.CustomerID = SloppyCustomer.CustomerID
go
select * from SloppyCustomer
go
drop table SloppyCustomer
By the way, I don't think any bits in the table are ever truly updated
multiple
times, even when the "logic" makes it appear otherwise. At least when I've
tried to exploit this funny syntax under that assumption, it's never worked.
SK