Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Help with Query

23 views
Skip to first unread message

Jeff B

unread,
Nov 25, 2007, 9:52:03 AM11/25/07
to
Hi everyone,

I have a book table and in that table it has the book tile, publisher, and
type of book it is. example mystery, scifi, etc...

I am trying to write a query that brings back a list of every pair of books
that have the same publisher and same book type. I have been able to get
the following code to work:

select publisher_code, type
from book
group by publisher_code, type
having count(*) > 1;

which returns the following results:

PU TYP
-- ---
JP MYS
LB FIC
PE FIC
PL FIC
ST SFI
VB FIC

I can not figure out how to get the book title and book code for the books
that this result list represents, everything i have tried throws out an
error.

Can someone help?

Thanks,

Jeff

Charles Hooper

unread,
Nov 25, 2007, 10:28:47 AM11/25/07
to

I see two possible methods:
1. Slide the SQL statement that you have written into an inline view,
join the inline view to your book table, and then use the
publisher_code, type columns to drive back into your book table. The
join syntax may look like one of the following: (publisher_code, type)
IN (SELECT...) or b.publisher_code=ib.publisher_code and
b.type=ib.type
2. Use analytical functions (COUNT() OVER...) to determine the number
of matches for the same publisher_code, type columns. Then slide this
SQL statement into an inline view to retrieve only those records with
the aliased COUNT() OVER greater than 1. This has the benefit of
retrieving the matching rows in a single pass.

You will likely find examples of the above approaches in this and the
comp.databases.oracle.server group.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

Jeff B

unread,
Nov 25, 2007, 10:59:09 AM11/25/07
to
"Charles Hooper" <hoope...@yahoo.com> wrote in message
news:4ed69188-f428-46dd...@n20g2000hsh.googlegroups.com...

Hi Charles,

Thanks for the response. I can not say that I understand everything that
you were saying. When you say inline view do you mean like this?

select title
from book
where publisher_code and type in
(select publisher_code, type


from book
group by publisher_code, type

having count(*) > 1);

this did not work it threw back the following error:

SQL> select title
2 from book
3 where type in
4 (select publisher_code, type
5 from book
6 group by publisher_code, type
7 having count(*) > 1);
(select publisher_code, type
*
ERROR at line 4:
ORA-00913: too many values


SQL> select title
2 from book
3 where publisher_code, type in
4 (select publisher_code, type
5 from book
6 group by publisher_code, type
7 having count(*) > 1);
where publisher_code, type in
*
ERROR at line 3:
ORA-00920: invalid relational operator


SQL> select title
2 from book
3 where publisher_code and type in
4 (select publisher_code, type
5 from book
6 group by publisher_code, type
7 having count(*) > 1);
where publisher_code and type in
*
ERROR at line 3:
ORA-00920: invalid relational operator

is this what you were referring too? still not being able to get it. I am
thinking that i have to be half way to the solution with the first part that
did bring back the six sets, just cannot figure out how to get the book code
and titles for each of those six sets?

Again thanks for the help

Jeff

Maxim Demenko

unread,
Nov 25, 2007, 11:13:51 AM11/25/07
to Jeff B
Jeff B schrieb:

You want get all pairs of books from your table - to get all
permutations usually cross join is used. If you consider (1,2) to be the
same as (2,1), then all permutations should be reduced about a half - to
achieve it, a simple condition like (j<k) - where j,k are your set items
- is enough. To get your selection restricted only on the same type
and publisher - simply put this restriction into the join condition.
Finally, you can get something like:

SQL> with books as (
2 select 'JP' publisher_code,'MYS' type,'The Jupiter Legacy' title
from dual union all
3 select 'ST','SFI','The Daleth Effect' from dual union all
4 select 'ST','SFI','A Stainless Steel Rat is Born' from dual
union all
5 select 'JP','MYS','Galactic Dreams' from dual union all
6 select 'VB','MYS','The Puppet Masters' from dual union all
7 select 'LB','FIC','The Stainless Steel Rat' from dual union all
8 select 'PE','FIC','One Step from Earth' from dual union all
9 select 'PL','FIC','Planet of No Return' from dual union all
10 select 'VB','FIC','The Technicolor Time Machine' from dual union all
11 select 'LB','FIC','A Rebel in Time' from dual union all
12 select 'PE','FIC','Skyfall' from dual union all
13 select 'PL','FIC','War With the Robots' from dual union all
14 select 'VB','FIC','Stainless Steel Visions' from dual union all
15 select 'ST','FIC','The Menace from Earth' from dual
16 )
17 -- End test data
18 select a.publisher_code,a.type,a.title,b.title
19 from books a,books b
20 where a.publisher_code=b.publisher_code
21 and a.type=b.type
22 and a.title<b.title
23 /

PU TYP TITLE TITLE
-- --- ----------------------------- -----------------------------
JP MYS Galactic Dreams The Jupiter Legacy
ST SFI A Stainless Steel Rat is Born The Daleth Effect
LB FIC A Rebel in Time The Stainless Steel Rat
VB FIC Stainless Steel Visions The Technicolor Time Machine
PE FIC One Step from Earth Skyfall
PL FIC Planet of No Return War With the Robots

6 rows selected.

If you like to have all your titles in one column, then it'll look like

select a.publisher_code,a.type,a.title
from books a,books b
where a.publisher_code=b.publisher_code
and a.type=b.type
-- and a.title<b.title

Best regards

Maxim

Jeff B

unread,
Nov 25, 2007, 12:08:04 PM11/25/07
to

"Maxim Demenko" <mdem...@gmail.com> wrote in message
news:47499F3F...@gmail.com...

Hi Maxim,

Thank you very much this is very close to what I am looking for. I was
thinking that I needed to do a self join on the table somehow just was not
sure how to do it. I like the way that your table was listed:

PU TYP TITLE TITLE
> -- --- ----------------------------- -----------------------------
> JP MYS Galactic Dreams The Jupiter Legacy
> ST SFI A Stainless Steel Rat is Born The Daleth Effect
> LB FIC A Rebel in Time The Stainless Steel Rat
> VB FIC Stainless Steel Visions The Technicolor Time Machine
> PE FIC One Step from Earth Skyfall
> PL FIC Planet of No Return War With the Robots

Mine did not come out listed that away? here is how mine came out what do I
need to do to get it to look like your table?

SQL> select a.publisher_code,a.type,a.title,b.title
2 from book a, book b
3 where a.publisher_code=b.publisher_code
4 and a.type=b.type
5 and a.title<b.title
6 order by a.type, a.publisher_code;

PU TYP TITLE
-- --- ----------------------------------------
TITLE
----------------------------------------
LB FIC Franny and Zooey
Nine Stories

LB FIC Franny and Zooey
The Catcher in the Rye

LB FIC Nine Stories
The Catcher in the Rye


PU TYP TITLE
-- --- ----------------------------------------
TITLE
----------------------------------------
PE FIC East of Eden
Of Mice and Men

PE FIC East of Eden
The Grapes of Wrath

PE FIC Of Mice and Men
The Grapes of Wrath


PU TYP TITLE
-- --- ----------------------------------------
TITLE
----------------------------------------
PL FIC Beloved
Jazz

PL FIC Jazz
Song of Solomon

PL FIC Beloved
Song of Solomon


PU TYP TITLE
-- --- ----------------------------------------
TITLE
----------------------------------------
VB FIC The Fall
The Stranger

JP MYS Slay Ride
The Edge

ST SFI Harry Potter and the Goblet of Fire
Harry Potter and the Prisoner of Azkaban


12 rows selected.

Thanks again for all the help from everyone.

Jeff

Maxim Demenko

unread,
Nov 25, 2007, 12:14:06 PM11/25/07
to Jeff B
Jeff B schrieb:

Note, please, the approach which Charles suggested, should work for you
too, you had only one small syntax mistake. Correct query would look like

select title
from book
where (publisher_code,type) in


(select publisher_code, type
from book
group by publisher_code, type
having count(*) > 1);

You have to choose, which sql is more suitable for your needs and
probably make some performance benchmarking.

Best regards

Maxim

Charles Hooper

unread,
Nov 25, 2007, 12:26:56 PM11/25/07
to
On Nov 25, 10:59 am, "Jeff B" <jef...@KnoSpam.tds.net> wrote:
> "Charles Hooper" <hooperc2...@yahoo.com> wrote in message

The above only tries to find a matching TYPE, yet the subquery is
returning PUBLISHER_CODE and TYPE - that is the result of this error
message. Close to what you need.

>
> SQL> select title
> 2 from book
> 3 where publisher_code, type in
> 4 (select publisher_code, type
> 5 from book
> 6 group by publisher_code, type
> 7 having count(*) > 1);
> where publisher_code, type in
> *
> ERROR at line 3:
> ORA-00920: invalid relational operator

Very close to what you need. However, Oracle expects the column names
to be wrapped in () ... like this: where (publisher_code, type) in

The above uses a subquery, which may perform slow on some Oracle
releases compared to the use of an inline view. Assume that I have a
table named PART, which has columns ID, DESCRIPITION, PRODUCT_CODE,
and COMMODITY_CODE, with ID as the primary key. I want to find ID,
DESCRIPTION, and COMMODITY_CODE for all parts with the same
DESCRIPTION and PRODUCT_CODE, where there are at least 3 matching
parts in the group:

The starting point, which looks similar to your initial query:
SELECT
DESCRIPTION,
PRODUCT_CODE,
COUNT(*) NUM_MATCHES
FROM
PART
GROUP BY
DESCRIPTION,
PRODUCT_CODE
HAVING
COUNT(*)>=3;

When the original query is slid into an inline view and joined to the
original table, it looks like this:
SELECT
P.ID,
P.DESCRIPTION,
P.COMMODITY_CODE
FROM
(SELECT
DESCRIPTION,
PRODUCT_CODE,
COUNT(*) NUM_MATCHES
FROM
PART
GROUP BY
DESCRIPTION,
PRODUCT_CODE
HAVING
COUNT(*)>=3) IP,
PART P
WHERE
IP.DESCRIPTION=P.DESCRIPTION
AND IP.PRODUCT_CODE=P.PRODUCT_CODE;

Here is the DBMS_XPLAN:
-------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time | Buffers | OMem | 1Mem | Used-Mem |
-------------------------------------------------------------------------------------------------------------------
|* 1 | HASH JOIN | | 1 | 1768 | 11525 |
00:00:00.21 | 2748 | 1048K| 1048K| 1293K (0)|
| 2 | VIEW | | 1 | 1768 | 1156 |
00:00:00.11 | 1319 | | | |
|* 3 | FILTER | | 1 | | 1156 |
00:00:00.11 | 1319 | | | |
| 4 | HASH GROUP BY | | 1 | 1768 | 23276 |
00:00:00.08 | 1319 | | | |
| 5 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
00:00:00.04 | 1319 | | | |
| 6 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
00:00:00.04 | 1429 | | | |
-------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("IP"."DESCRIPTION"="P"."DESCRIPTION" AND
"IP"."PRODUCT_CODE"="P"."PRODUCT_CODE")
3 - filter(COUNT(*)>=3)

The query format using the subquery looks like this:
SELECT
P.ID,
P.DESCRIPTION,
P.COMMODITY_CODE
FROM
PART P
WHERE
(DESCRIPTION,PRODUCT_CODE) IN
(SELECT
DESCRIPTION,
PRODUCT_CODE
FROM
PART
GROUP BY
DESCRIPTION,
PRODUCT_CODE
HAVING
COUNT(*)>=3);

The DBMS_XPLAN, note that Oracle 10.2.0.2 transformed the query above:
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows
| A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------
|* 1 | HASH JOIN RIGHT SEMI | | 1 | 1 | 11525 |
00:00:00.21 | 2748 | 1048K| 1048K| 1214K (0)|
| 2 | VIEW | VW_NSO_1 | 1 | 1768 | 1156 |
00:00:00.12 | 1319 | | | |
|* 3 | FILTER | | 1 | | 1156 |
00:00:00.12 | 1319 | | | |
| 4 | HASH GROUP BY | | 1 | 1768 | 23276 |
00:00:00.09 | 1319 | | | |
| 5 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
00:00:00.04 | 1319 | | | |
| 6 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
00:00:00.01 | 1429 | | | |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("DESCRIPTION"="$nso_col_1" AND
"PRODUCT_CODE"="$nso_col_2")
3 - filter(COUNT(*)>=3)

Without allowing the automatic transformations in Oracle 10.2.0.2, the
query takes _much_ longer than 0.21 seconds to complete.

The method using analytical functions starts like this:
SELECT
P.ID,
P.DESCRIPTION,
P.COMMODITY_CODE,
COUNT(*) OVER (PARTITION BY DESCRIPTION, PRODUCT_CODE) NUM_MATCHES
FROM
PART P;

Then, sliding the above into an inline view:
SELECT
ID,
DESCRIPTION,
COMMODITY_CODE
FROM
(SELECT
P.ID,
P.DESCRIPTION,
P.COMMODITY_CODE,
COUNT(*) OVER (PARTITION BY DESCRIPTION, PRODUCT_CODE) NUM_MATCHES
FROM
PART P)
WHERE
NUM_MATCHES>=3;

The DBMS_XPLAN for the above looks like this:
-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|* 1 | VIEW | | 1 | 35344 | 11525 |
00:00:00.31 | 1319 | | | |
| 2 | WINDOW SORT | | 1 | 35344 | 35344 |
00:00:00.27 | 1319 | 2533K| 726K| 2251K (0)|
| 3 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
00:00:00.04 | 1319 | | | |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NUM_MATCHES">=3)

Note that there is only one TABLE ACCESS FULL of the PART table in the
above. The execution time required 0.31 seconds to complete, which is
greater than the first two approaches, but that is because the
database server is concurrently still trying to resolve the query
method using the subquery with no permitted transformations (5+
minutes later).

> SQL> select title
> 2 from book
> 3 where publisher_code and type in
> 4 (select publisher_code, type
> 5 from book
> 6 group by publisher_code, type
> 7 having count(*) > 1);
> where publisher_code and type in
> *
> ERROR at line 3:
> ORA-00920: invalid relational operator

> is this what you were referring too? still not being able to get it. I am
> thinking that i have to be half way to the solution with the first part that
> did bring back the six sets, just cannot figure out how to get the book code
> and titles for each of those six sets?
>
> Again thanks for the help
>
> Jeff

Your first attempt was very close. Keep working at it. Maxim
Demenko, who has previously posted very clever solutions to other
problems, provides another method to find the solution.

Jeff B

unread,
Nov 25, 2007, 12:45:58 PM11/25/07
to

"Maxim Demenko" <mdem...@gmail.com> wrote in message
news:4749AD5E...@gmail.com...

LOL writing this while I slap myself upside the head. It is always the
small things that will get you. I could not find a good example of the IN
clause everyone i found only had one column listed before the IN statement
so I did not realize that I need the parenthsis around the 2 columns before
the IN clause. After making that correction here is how it now looks:

SQL> select book_code, title, publisher_code, type
2 from book
3 where (publisher_code,type) in


4 (select publisher_code, type
5 from book
6 group by publisher_code, type
7 having count(*) > 1);

BOOK TITLE PU TYP
---- ---------------------------------------- -- ---
9882 Slay Ride JP MYS
0808 The Edge JP MYS
9883 The Catcher in the Rye LB FIC
6908 Franny and Zooey LB FIC
3743 Nine Stories LB FIC
9701 The Grapes of Wrath PE FIC
7405 East of Eden PE FIC
2766 Of Mice and Men PE FIC
9627 Song of Solomon PL FIC
6128 Jazz PL FIC
138X Beloved PL FIC

BOOK TITLE PU TYP
---- ---------------------------------------- -- ---
7443 Harry Potter and the Goblet of Fire ST SFI
2226 Harry Potter and the Prisoner of Azkaban ST SFI
7559 The Fall VB FIC
0200 The Stranger VB FIC

15 rows selected.

so this brings back the same results as the code you also had :) although I
do think I like the way yours formatted with the PU TYPE the each title all
listed in one row .

PU TYP TITLE TITLE
-- --- ----------------------------- -----------------------------
JP MYS Galactic Dreams The Jupiter Legacy
ST SFI A Stainless Steel Rat is Born The Daleth Effect
LB FIC A Rebel in Time The Stainless Steel Rat
VB FIC Stainless Steel Visions The Technicolor Time Machine
PE FIC One Step from Earth Skyfall
PL FIC Planet of No Return War With the Robots

Again Thanks too both for the input I having been working on this for a few
hours now. plus also glad too see that atleast I was on the right track
myself just missing the correct syntax to make it work.

Jeff

Charles Hooper

unread,
Nov 25, 2007, 1:12:19 PM11/25/07
to
On Nov 25, 12:26 pm, Charles Hooper <hooperc2...@yahoo.com> wrote:
(snip)
> --------------------------------------------------------------------------------------------------------------------

> | Id | Operation | Name | Starts | E-Rows | A-Rows | A-
> Time | Buffers | OMem | 1Mem | Used-Mem |
> --------------------------------------------------------------------------------------------------------------------

> |* 1 | HASH JOIN | | 1 | 1768 | 11525 |
> 00:00:00.21 | 2748 | 1048K| 1048K| 1293K (0)|
> | 2 | VIEW | | 1 | 1768 | 1156 |
> 00:00:00.11 | 1319 | | | |
> |* 3 | FILTER | | 1 | | 1156 |
> 00:00:00.11 | 1319 | | | |
> | 4 | HASH GROUP BY | | 1 | 1768 | 23276 |
> 00:00:00.08 | 1319 | | | |
> | 5 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
> 00:00:00.04 | 1319 | | | |
> | 6 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
> 00:00:00.04 | 1429 | | | |
> --------------------------------------------------------------------------------------------------------------------
> ------------------------------------------------------------------------------------------------------------------------

> | Id | Operation | Name | Starts | E-Rows | A-Rows
> | A-Time | Buffers | OMem | 1Mem | Used-Mem |
> ------------------------------------------------------------------------------------------------------------------------

> |* 1 | HASH JOIN RIGHT SEMI | | 1 | 1 | 11525 |
> 00:00:00.21 | 2748 | 1048K| 1048K| 1214K (0)|
> | 2 | VIEW | VW_NSO_1 | 1 | 1768 | 1156 |
> 00:00:00.12 | 1319 | | | |
> |* 3 | FILTER | | 1 | | 1156 |
> 00:00:00.12 | 1319 | | | |
> | 4 | HASH GROUP BY | | 1 | 1768 | 23276 |
> 00:00:00.09 | 1319 | | | |
> | 5 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
> 00:00:00.04 | 1319 | | | |
> | 6 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
> 00:00:00.01 | 1429 | | | |
> ------------------------------------------------------------------------------------------------------------------------
> ------------------------------------------------------------------------------------------------------------------
> | Id | Operation | Name | Starts | E-Rows | A-Rows | A-
> Time | Buffers | OMem | 1Mem | Used-Mem |
> ------------------------------------------------------------------------------------------------------------------
> |* 1 | VIEW | | 1 | 35344 | 11525 |
> 00:00:00.31 | 1319 | | | |
> | 2 | WINDOW SORT | | 1 | 35344 | 35344 |
> 00:00:00.27 | 1319 | 2533K| 726K| 2251K (0)|
> | 3 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |
> 00:00:00.04 | 1319 | | | |
> ------------------------------------------------------------------------------------------------------------------

>
> Predicate Information (identified by operation id):
> ---------------------------------------------------
> 1 - filter("NUM_MATCHES">=3)
>
> Note that there is only one TABLE ACCESS FULL of the PART table in the
> above. The execution time required 0.31 seconds to complete, which is
> greater than the first two approaches, but that is because the
> database server is concurrently still trying to resolve the query
> method using the subquery with no permitted transformations (5+
> minutes later).
>

Just an update, the subquery version of the query without allowing
transformations just completed, requiring 46 minutes and 21 seconds,
and performed 29475 full table scans of the PART table. Here is the
DBMS_XPLAN:


---------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time | Buffers |

---------------------------------------------------------------------------------------
|* 1 | FILTER | | 1 | | 11525 |
00:46:21.46 | 38M|
| 2 | TABLE ACCESS FULL | PART | 1 | 35344 | 35344 |
00:00:00.25 | 1429 |
|* 3 | FILTER | | 29474 | | 6143 |
00:46:06.52 | 38M|
| 4 | HASH GROUP BY | | 29474 | 1 | 613M|
00:33:24.30 | 38M|
| 5 | TABLE ACCESS FULL| PART | 29474 | 35344 | 1041M|
00:00:02.54 | 38M|
---------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter( IS NOT NULL)
3 - filter(("DESCRIPTION"=:B1 AND "PRODUCT_CODE"=:B2 AND
COUNT(*)>=3))

Compare that execution time (46 minutes, 21.46 seconds) with the same
for the analytical version (0.19 seconds):


-----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-
Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------
|* 1 | VIEW | | 1 | 35344 | 11525 |

00:00:00.19 | 1319 | | | |


| 2 | WINDOW SORT | | 1 | 35344 | 35344 |

00:00:00.17 | 1319 | 2533K| 726K| 2251K (0)|


| 3 | TABLE ACCESS FULL| PART | 1 | 35344 | 35344 |

00:00:00.01 | 1319 | | | |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("NUM_MATCHES">=3)

As Maxim Demenko suggests, test the performance of the solution on the
expected Oracle database version before committing to one solution or
another. Make certain that the data size is reasonably large when
performing the performance test, otherwise the test results may not
predict actual performance once the solution is deployed.

Jeff B

unread,
Nov 25, 2007, 1:39:32 PM11/25/07
to

"Charles Hooper" <hoope...@yahoo.com> wrote in message
news:e0a2ef39-df04-406b...@e23g2000prf.googlegroups.com...

Thanks Charles, that is some very insightful information!! I am certainly
not that advanced (I understand what you are showing me, but no clue how to
get those results) but am hoping to get there, I am learning :)

Again thanks to you and Maxim for the help and information.

Jeff

ssan...@gmail.com

unread,
Dec 4, 2007, 6:01:51 PM12/4/07
to
On Nov 25, 9:52 am, "Jeff B" <jef...@KnoSpam.tds.net> wrote:

cheater

0 new messages