Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Can anyone suggest a better way to write this query?

0 views
Skip to first unread message

C M

unread,
Nov 19, 2009, 3:10:01 PM11/19/09
to
Table dir_size stores the mbytes of storage used in a given directory.
Table directories stores various directory names which may or may not
exist in table dir_size.
For every directory in table directories, report the cumulative
storage in that directory and all its subdirectories.
This solution uses a cartesian join. I imagine it will not scale
well.

Thanks in advance!

create table dir_size (
dir_name varchar2(40),
mbytes number
);

create table directories (
dir_name varchar2(40)
);

insert into dir_size values ('c:\aaa\bbb\ccc\ddd', 100);
insert into dir_size values ('c:\aaa\bbb\ccc', 100);
insert into dir_size values ('c:\aaa\bbb', 100);
insert into dir_size values ('c:\aaa', 100);
insert into dir_size values ('c:\', 100);

insert into directories values ('c:\aaa\bbb\ccc\ddd');
insert into directories values ('c:\aaa\bbb\ccc');
insert into directories values ('c:\aaa\bbb');
insert into directories values ('c:\aaa');
insert into directories values ('c:\');
insert into directories values ('c:\xxx\yyy\zzz');

commit;

select dir_name, sum(mbytes) from (
select directories.dir_name,
instr(dir_size.dir_name, directories.dir_name) INSTR,
mbytes
from directories, dir_size
)
where INSTR = 1
group by dir_name
order by 1;

DIR_NAME SUM(MBYTES)
---------------------------------------- -----------
c:\ 500
c:\aaa 400
c:\aaa\bbb 300
c:\aaa\bbb\ccc 200
c:\aaa\bbb\ccc\ddd 100

Charles Hooper

unread,
Nov 19, 2009, 4:33:38 PM11/19/09
to

This appears to be a hard problem. To avoid headaches, make certain
that each of the DIR_NAMES ends with \

Let's start here:
SELECT
'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
100 MBYTES
FROM
DUAL;

DIR_NAME MBYTES
-------------------- ----------


c:\aaa\bbb\ccc\ddd\ 100

In your example, you would like to put 100MB into the following
directories based on the above:
c:\
c:\aaa\
c:\aaa\bbb\
c:\aaa\bbb\ccc\
c:\aaa\bbb\ccc\ddd\

You somehow need to be able to break that one row into 5 rows. The
following might help
SELECT
LEVEL L
FROM
DUAL
CONNECT BY
LEVEL<=20;

L
---
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

If we join those two row sources together we might be able to create 5
rows from the one row:
SELECT
SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) DIR_NAME2,
MBYTES
FROM
(SELECT
'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
100 MBYTES
FROM
DUAL) DIR_SIZE,
(SELECT
LEVEL L
FROM
DUAL
CONNECT BY
LEVEL<=20) C
WHERE
SUBSTR(DIR_NAME,1,INSTR(DIR_NAME,'\',1,L)) IS NOT NULL;

DIR_NAME2 MBYTES
-------------------- ----------
c:\ 100
c:\aaa\ 100
c:\aaa\bbb\ 100
c:\aaa\bbb\ccc\ 100


c:\aaa\bbb\ccc\ddd\ 100

Now, if we performed the same process for all of the rows in the
DIR_SIZE table, grouping on DIR_NAME2, we might be able to find the
SUM of the MBYTES column.

I will leave the rest for your experimentation. You probably will not
have 20 \ characters in the DIR_NAME column, so you could optimize the
SQL statement a bit.

Charles Hooper
IT Manager/Oracle DBA
K&M Machine-Fabricating, Inc.

C M

unread,
Nov 19, 2009, 5:05:41 PM11/19/09
to
> K&M Machine-Fabricating, Inc.- Hide quoted text -
>
> - Show quoted text -

Thanks for the suggestion. I suspect the best way will involve some
kind of recursive processing. The tricky bit is the matching of the
rows in the directories table to the rows in the dir_size table. We
need to do a "like" (which we can't, of course) which is why I thought
of the instr.

CM

Charles Hooper

unread,
Nov 19, 2009, 6:26:44 PM11/19/09
to

> Thanks for the suggestion.  I suspect the best way will involve some


> kind of recursive processing.  The tricky bit is the matching of the
> rows in the directories table to the rows in the dir_size table.  We
> need to do a "like" (which we can't, of course) which is why I thought
> of the instr.
>
> CM

The LIKE keyword is not necessary.

Notice how closely the output of the following SQL statement:


SELECT
'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
100 MBYTES
FROM
DUAL;

Matches the row created by one of your insert statements:


insert into dir_size values ('c:\aaa\bbb\ccc\ddd', 100);

You might try replacing in the above examples:


SELECT
'c:\aaa\bbb\ccc\ddd\' DIR_NAME,
100 MBYTES
FROM
DUAL;

With a SQL statement that selects all of the rows from your DIR_SIZE
table - the results might surprise you IF each of the DIR_NAME values
end with a \

You really need more variety in the insert statements to see what is
happening, for example:
insert into dir_size values ('c:\ddd\', 800);
insert into dir_size values ('c:\ddd\kkk\', 300);

The first of the above SQL statements will increase the calculated SUM
in the c:\ directory by 800, and the second insert statement will
increase the SUM in both of the c:\ and c:\ddd\ directories by 300 if
you modify my original example to use the DIR_SIZE table rather than
the DUAL table.

0 new messages