SQL query problem: get all products with same application fields

Mikaël Gyselinck

unread,

Jun 9, 2002, 5:56:44 AM6/9/02

to

Hi all,

I'm trying to build a SQL query that return all products with the same
application fields

I have 3 tables:

1) tblProducts
2) tblApplicationFields
3) tblMatrix

tblProduct
------------------------------
IDproduct fldProductName
------------------------------
1 Prod01
2 Prod02
3 Prod03

tblApplicationField
------------------------------
IdAppliacation fldApplication
------------------------------
1 Application01
2 Application02
3 Application03

tblMatrix
------------------------------
FKproduct FKapplication
------------------------------
1 1
1 2
2 1
2 3
3 1
3 2
3 3

"tblMatrix" is the binding column between "tblApplicationField" and
tblProducts. This means that a product can have multiple applicationfields.

Example:
search for all products that belong to " Application01" AND to
application02 .

Result = Prod01 en Prod03

The following queries do not give the result i needed:

1)

SELECT DISTINCT fldProductName
FROM tblProduct, tblMatrix
WHERE FKproduct = IDprodcut
AND (FKapplication = 1 OR FKapplication = 2)

Result

Prod01
Prod02
Prod03

2)

SELECT DISTINCT fldProductName
FROM tblProduct, tblMatrix
WHERE FKproduct = IDprodcut
AND (FKapplication = 1 AND FKapplication = 2)

Result: none

It does work when i searh for products the have one application field in
commen:

SELECT DISTINCT fldProductName
FROM tblProduct, tblMatrix
WHERE FKproduct = IDprodcut
AND FKapplication = 2

Result

Prod2
Prod3

Help is greatly appreciated.

regards
Mikael

Niels Berglund

unread,

Jun 9, 2002, 6:38:25 AM6/9/02

to

Mikaël Gyselinck <mikael.g...@pandora.be> wrote in
news:B928F2FB.F1B%mikael.g...@pandora.be:

Various ways of doing this, here's one way:

<<<<<<<<<<<<<<<<
select distinct p.fldProductName
from tblProduct p
join tblMatrix pa
on p.idProduct = pa.FkProduct
join tblApplicationField a
on pa.FKApplication = a.idApplication
where a.idApplication = 1
or a.idApplication = 2
>>>>>>>>>>>>>>>>>>>>>>>>

here's another way
<<<<<<<<<<<<<<<<<<
select distinct p.fldProductName
from tblProduct p
join tblMatrix pa
on p.idProduct = pa.FkProduct
join tblApplicationField a
on pa.FKApplication = a.idApplication
where a.idApplication in (1, 2)
>>>>>>>>>>>>>>>>>>>>

Hope this helps

Niels

--
***************************************
* Niels Berglund
* Developmentor
* nie...@develop.com
* http://staff.develop.com/nielsb/
***************************************

Steve Kass

unread,

Jun 9, 2002, 7:06:46 AM6/9/02

to

Mikaël,

You can do this:

SELECT fldProductName

FROM tblProduct, tblMatrix
WHERE FKproduct = IDprodcut

AND FKapplication IN (1,2)
GROUP BY fldProductName
HAVING COUNT(FKapplication) = 2

To be sure this works, you should have a UNIQUE
constraint or primary key on tblMatrix(FKProduct,FKApplication),
so it won't think both applications are present when one of them is
listed twice.

Steve Kass
Drew University

David Brabant

unread,

Jun 9, 2002, 7:29:21 AM6/9/02

to

Mikaël Gyselinck wrote:

> [...]

Yet another solution:

-- All products belonging to application 1
select fldProductName from tblProducts P, tblMatrix M
where M.fkApplication = 1 and P.idProduct = M.fkProduct

-- All products belonging to application 2
select fldProductName from tblProducts P, tblMatrix M
where M.fkApplication = 2 and P.idProduct = M.fkProduct

-- One way to get the intersection:
select fldProductName from tblProducts P, tblMatrix M1
where M1.fkApplication = 1 and P.idProduct = M1.fkProduct and
exists(select 1 from tblMatrix M2 where M2.fkApplication = 2 and M1.fkProduct = M2.fkProduct)

--
David

Itzik Ben-Gan

unread,

Jun 9, 2002, 10:06:02 AM6/9/02

to

If the FKProduct, FKApplication combination does not logically have to be
unique, a distinct count solves the problem:

HAVING COUNT(DISTINCT FKapplication) = 2

--
BG

SQL Server MVP
Hi-Tech College, Israel
http://sql.hi-tech.co.il
http://www.tsqlsolutions.com

"Steve Kass" <sk...@drew.edu> wrote in message
news:3D0336C6...@drew.edu...

Itzik Ben-Gan

unread,

Jun 9, 2002, 10:11:10 AM6/9/02

to

Here's another option...

SELECT fldProductName
FROM tblProduct JOIN tblMatrix
ON FKproduct = IDprodcut
GROUP BY fldProductName
HAVING MAX(CASE WHEN FKapplication = 1 THEN 1 ELSE NULL END) = 1
AND MAX(CASE WHEN FKapplication = 2 THEN 1 ELSE NULL END) = 1

--
BG

SQL Server MVP
Hi-Tech College, Israel
http://sql.hi-tech.co.il
http://www.tsqlsolutions.com

"Mikaël Gyselinck" <mikael.g...@pandora.be> wrote in message
news:B928F2FB.F1B%mikael.g...@pandora.be...

Joe Celko

unread,

Jun 9, 2002, 1:01:29 PM6/9/02

to

Please post DDL, so that people do not have to guess what the keys,
constraints, Declarative Referential Integrity, datatypes, etc. in your
schema are.

Next, please read a book on data modeling. None of your data element
names make any sense. You are using prefixes that tell us HOW the data
is PHYSICALLY stored (tables are the ONLY data structure in SQL anyway)
and used by these particular applications. A data element name tells us
what the data element is lOGICALLY.

As one glaring example, just consider "FKproduct = IDproduct" -- Why is
the code used for a product (UPC, EAN, GTIN, or whatever industry
standards you follow) TOTALLY, LOGICALLY DIFFERENT when it is a foreign
key and when it is an identifier?? A product code is a product code is
a product code ... unless the schema is a total disaster.

The total disaster comes from having multiple names for the same data
element, so that you also get multiple declarations and cannot build a
data dictionary (what does the data dictionary look like for this
project? Do you even have one?)

A Matrix is a kind of data structure; it is not the name of a
relationship. The prefix "fld-" is not just redundant, it implies that
you do not know the difference between a field and a column. What we
wanted you to post might lookl like this:

CREATE TABLE Products
(upc DECIMAL(10,0) NOT NULL PRIMARY KEY,
prod_desc VARCHAR(50) NOT NULL);

CREATE TABLE Applications
(application_id INTEGER NOT NULL PRIMARY KEY,
application_desc VARCHAR(50) NOT NULL);

CREATE TABLE ProductApplications
(upc DECIMAL(10,0) NOT NULL
REFERENCES Products (upc)
ON UPDATE CASCADE
ON DELETE CASCADE,
application_id INTEGER NOT NULL
REFERENCES Applications (application_id)
ON UPDATE CASCADE
ON DELETE CASCADE,
PRIMARY KEY (upc, application_id));

Look at the work the DDL is doing for you automatically. Notice the use
of plural names -- tables are sets of things.

Now, after you fix the basic design problems here, what you are trying
to do is called a Relational Division. Relational division is one of
the eight basic operations in Codd's relational algebra. The idea is
that a divisor table is used to partition a dividend table and produce a
quotient or results table. The quotient table is made up of those
values of one column for which a second column had all of the values in
the divisor.

This is easier to explain with an example. We have a table of pilots
and the planes they can fly (dividend); we have a table of planes in the
hangar (divisor); we want the names of the pilots who can fly every
plane (quotient) in the hangar. To get this result, we divide the
PilotSkills table by the planes in the hangar.

CREATE TABLE PilotSkills
(pilot CHAR(15) NOT NULL,
plane CHAR(15) NOT NULL,
PRIMARY KEY (pilot, plane));

PilotSkills
pilot plane
=========================
'Celko' 'Piper Cub'
'Higgins' 'B-52 Bomber'
'Higgins' 'F-14 Fighter'
'Higgins' 'Piper Cub'
'Jones' 'B-52 Bomber'
'Jones' 'F-14 Fighter'
'Smith' 'B-1 Bomber'
'Smith' 'B-52 Bomber'
'Smith' 'F-14 Fighter'
'Wilson' 'B-1 Bomber'
'Wilson' 'B-52 Bomber'
'Wilson' 'F-14 Fighter'
'Wilson' 'F-17 Fighter'

CREATE TABLE Hangar
(plane CHAR(15) NOT NULL PRIMARY KEY);

Hangar
plane
=============
'B-1 Bomber'
'B-52 Bomber'
'F-14 Fighter'

PilotSkills DIVIDED BY Hangar
pilot
=============================
'Smith'
'Wilson'

In this example, Smith and Wilson are the two pilots who can fly
everything in the hangar. Notice that Higgins and Celko know how to fly
a Piper Cub, but we don't have one right now. In Codd's original
definition of relational division, having more rows than are called for
is not a problem.

The important characteristic of a relational division is that the CROSS
JOIN (Cartesian product) of the divisor and the quotient produces a
valid subset of rows from the dividend. This is where the name comes
from, since the CROSS JOIN acts like a multiplication operator.

Division with a Remainder

There are two kinds of relational division. Division with a remainder
allows the dividend table to have more values than the divisor, which
was Codd's original definition. For example, if a pilot can fly more
planes than just those we have in the hangar, this is fine with us. The
query can be written in SQL-89 as

SELECT DISTINCT pilot
FROM PilotSkills AS PS1
WHERE NOT EXISTS
(SELECT *
FROM Hangar
WHERE NOT EXISTS
(SELECT *
FROM PilotSkills AS PS2
WHERE (PS1.pilot = PS2.pilot)
AND (PS2.plane = Hangar.plane)));

The quickest way to explain what is happening in this query is to
imagine an old World War II movie where a cocky pilot has just walked
into the hangar, looked over the fleet, and announced, "There ain't no
plane in this hangar that I can't fly!" We are finding the pilots for
whom there does not exist a plane in the hangar for which they have no
skills. The use of the NOT EXISTS() predicates is for speed. Most SQL
systems will look up a value in an index rather than scan the whole
table. The SELECT * clause lets the query optimizer choose the column
to use when looking for the index.

This query for relational division was made popular by Chris Date in his
textbooks, but it is not the only method nor always the fastest.
Another version of the division can be written so as to avoid three
levels of nesting. While it is not original with me, I have made it
popular in my books.

SELECT PS1.pilot
FROM PilotSkills AS PS1, Hangar AS H1
WHERE PS1.plane = H1.plane
GROUP BY PS1.pilot
HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar);

There is a serious difference in the two methods. Burn down the hangar,
so that the divisor is empty. Because of the NOT EXISTS() predicates in
Date's query, all pilots are returned from a division by an empty set.
Because of the COUNT() functions in my query, no pilots are returned
from a division by an empty set.

In the sixth edition of his book, INTRODUCTION TO DATABASE SYSTEMS
(Addison-Wesley; 1995 ;ISBN 0-201-82458-2), Chris Date defined another
operator (DIVIDEBY ... PER) which produces the same results as my query,
but with more complexity.

Exact Division

The second kind of relational division is exact relational
division. The dividend table must match exactly to the values of
the divisor without any extra values.

SELECT PS1.pilot
FROM PilotSkills AS PS1
LEFT OUTER JOIN
Hangar AS H1
ON PS1.plane = H1.plane
GROUP BY PS1.pilot
HAVING COUNT(PS1.plane) = (SELECT COUNT(plane) FROM Hangar)
AND COUNT(H1.plane) = (SELECT COUNT(plane) FROM Hangar);

This says that a pilot must have the same number of certificates as
there planes in the hangar and these certificates all match to a plane
in the hangar, not something else. The "something else" is shown by a
created NULL from the LEFT OUTER JOIN.

Please do not make the mistake of trying to reduce the HAVING clause
with a little algebra to:

HAVING COUNT(PS1.plane) = COUNT(H1.plane)

because it does not work; it will tell you that the hangar has (n)
planes in it and the pilot is certified for (n) planes, but not that
those two sets of planes are equal to each other.

Note on Performance

The nested EXISTS() predicates version of relational division was made
popular by Chris Date's textbooks, while the author is associated with
popularizing the COUNT(*) version of relational division. The Winter
1996 edition of DB2 ON-LINE MAGAZINE (http://www.db2mag.com/96011ar:htm)
had an article entitled "Powerful SQL:Beyond the Basics" by Sheryl
Larsen which gave the results of testing both methods. Her conclusion
for DB2 was that the nested EXISTS() version is better when the quotient
has less than 25% of the dividend table's rows and the COUNT(*) version
is better when the quotient is more than 25% of the dividend table.

--CELKO--

*** Sent via Developersdex http://www.developersdex.com ***
Don't just participate in USENET...get rewarded for it!

Niels Berglund

unread,

Jun 9, 2002, 5:10:35 PM6/9/02

to

Joe Celko <71062...@compuserve.com> wrote in news:OOwLNd9DCHA.1696
@tkmsftngp05:

> Please post DDL, so that people do not have to guess what the keys,
> constraints, Declarative Referential Integrity, datatypes, etc. in your
> schema are.
>

[snip]
>
> --CELKO--
>

If you are in lecturing mode, can you please show some news group
etiquette and at least quote vital parts of the message you are replying
to.

Thanks