We're trying to resolve some performance issues and would be grateful for
your thoughts on the benefits of each of these methods to get the same
result.
Which is more performant?
SELECT Fields
FROM Table
WHERE FKeyField IN(
SELECT KeyField FROM OtherTable WHERE OtherValue = 'FilterValue'
)
or...
SELECT Fields
FROM Table
WHERE EXISTS(
SELECT Anything FROM OtherTable WHERE KeyField = Table.FKeyField AND
OtherValue = 'FilterValue'
)
Many thanks
Mike
SELECT * FROM tbl WHERE col NOT IN (SELECT col FROM anothertbl ...)
"Mike Wazowski" <bookham_meas...@yahoo.com> wrote in message
news:O$6jTDHK...@TK2MSFTNGP04.phx.gbl...
The meaning of IN/NOT IN is different to
EXISTS/NOT EXISTS ...
You will use IN when the SubQuery returns more than
one result. In fact, SQL Server is just joining the tables.
In you case you can also use "WHERE FKeyField = (SELECT ...)"
Regards
Frank
Note that SQL is a declarative rather than a procedural language. If the
queries are semantically the same, the SQL Server cost-based optimizer ought
to be able to come up with the same optimal execution plan in both cases.
Proper index use is the key to performance. I suggest you view the query
execution plan (in SQL Server Management Studio, select the query text and
press ctrl-L) to ensure that indexes are used as expected. I would expect a
composite one on KeyField and OtherValue would be optimal for the subquery
performance and perhaps one on KeyField too.
--
Hope this helps.
Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/
"Mike Wazowski" <bookham_meas...@yahoo.com> wrote in message
news:O$6jTDHK...@TK2MSFTNGP04.phx.gbl...
True but EXISTS and IN can often be expressed as a JOIN or correlated
subquery. I get identical execution plans with all of the methods below:
CREATE TABLE dbo.MyTable(
MyTableKey int NOT NULL
CONSTRAINT PK_MyTable PRIMARY KEY,
Fields int NOT NULL,
FKeyField int NOT NULL
);
CREATE TABLE dbo.OtherTable(
KeyField int NOT NULL
CONSTRAINT PK_OtherTable PRIMARY KEY,
OtherValue varchar(20) NOT NULL
);
CREATE INDEX index1 ON dbo.OtherTable(
OtherValue, KeyField);
ALTER TABLE dbo.MyTable
ADD CONSTRAINT FK_MyTable_OtherTable
FOREIGN KEY (FKeyField)
REFERENCES dbo.OtherTable(KeyField);
SELECT Fields
FROM dbo.MyTable
WHERE FKeyField IN(
SELECT OtherTable.KeyField
FROM dbo.OtherTable
WHERE OtherValue = 'FilterValue'
);
SELECT Fields
FROM dbo.MyTable
WHERE EXISTS(
SELECT *
FROM dbo.OtherTable
WHERE OtherTable.KeyField = MyTable.FKeyField AND
OtherTable.OtherValue = 'FilterValue'
);
SELECT Fields
FROM dbo.MyTable
JOIN dbo.OtherTable ON
OtherTable.KeyField = MyTable.FKeyField
AND OtherTable.OtherValue = 'FilterValue';
SELECT Fields
FROM dbo.MyTable
WHERE FKeyField =
(
SELECT KeyField
FROM dbo.OtherTable
WHERE
OtherTable.OtherValue = 'FilterValue'
AND OtherTable.KeyField = MyTable.FKeyField
);
--
Hope this helps.
Dan Guzman
SQL Server MVP
http://weblogs.sqlteam.com/dang/
"Frank Uray" <Fran...@discussions.microsoft.com> wrote in message
news:0C72B4A6-20B3-4E6E...@microsoft.com...
Do you mean compared to EXISTS, or compared to = ? You can also use EXISTS
regardless of the number of results returned by the subquery.
For the OP, I prefer EXISTS for a couple of reasons. One is the reason that
Uri pointed out; if your subquery can return NULLs, then IN/NOT IN will not
necessarily produce correct results. The other is that EXISTS has the
possibility of performing better; it can stop "looking" once it has found
its first result, and you can also set it to return a constant (e.g. SELECT
1) instead of a column... This makes it easier to spot as a
non-data-returning element in your query.
A
I have been advised that the first method, using IN(), results in one
subquery where the results are then used to filter the parent result set.
Allegedly using the second EXISTS method, a subquery is effectively
performed for each record in the parent result set, before any other filter
criteria is applied.
Just going to do some more research in Query Analyser. Thanks again.
Mike
"Mike Wazowski" <bookham_meas...@yahoo.com> wrote in message
news:O$6jTDHK...@TK2MSFTNGP04.phx.gbl...
It was my thoughts too that EXISTS does not need to retrieve or return
actual data, so would perform better.
"Aaron Bertrand [SQL Server MVP]" <ten...@dnartreb.noraa> wrote in message
news:C510D82A.13E59%ten...@dnartreb.noraa...
Thanks again for the comments of those following this thread.
I produced execution plans for both versions and both plans are the same! I
guess now I need to dig in to the statistics mentioned by Aaron to see if I
can get a definitive 'best method' for my DB.
Mike
"Mike Wazowski" <bookham_meas...@yahoo.com> wrote in message
news:O$6jTDHK...@TK2MSFTNGP04.phx.gbl...
>I produced execution plans for both versions and both plans are the same! I
>guess now I need to dig in to the statistics mentioned by Aaron to see if I
>can get a definitive 'best method' for my DB.
One more thought on the IN vs EXISTS question. IN only works with
single-column keys. EXISTS works with single or multiple column keys.
My tendency is to want to use a consistent style, and EXISTS lets me
use the same approach in both cases.
Roy Harvey
Beacon Falls, CT
--
Tibor Karaszi, SQL Server MVP
http://www.karaszi.com/sqlserver/default.asp
http://sqlblog.com/blogs/tibor_karaszi
"Aaron Bertrand [SQL Server MVP]" <ten...@dnartreb.noraa> wrote in message
news:C510DB1E.13E63%ten...@dnartreb.noraa...