Linking records together

PJ

unread,

Sep 23, 2009, 9:22:49 AM9/23/09

to

Hi all,

We need to track work items across several existing applications. Items can
start as a document on disk, can be renamed or moved, converted to a task
in an application, until they reach a certain state. The applications know
nothing about preceding or following applications and there is no common
identifier for the items, which has always been a good thing.

My idea is to create a table for all events. Each application will record
their
own events. A minimum is one event per application with a starting and
finishing situation for that application. Example: "file \\folder1\x.doc was
moved to \\folder2\y.doc". Another application may follow up with
"\\folder2\y.doc was archived with id 12345". Each event will have a date
and time.

My table will probably have the following columns:

event_id number(10)
start_item varchar2(255)
end_item varchar2(255)
event_date date

Sample data:

1 A B 2009-10-11
2 G H 2009-10-12
3 B C 2009-10-13
4 H I 2009-10-17

Records can be linked from end_item to start_item (the next application in
line will continue where the previous application stops). So in this sample
data I have two sets of records (1 with 3, 2 with 4) and recognize two
starting points A and G with two end points C and I. From A to C takes
two days, from G to I takes 5 days.

Data entry is no problem. Reporting is. My resultset (a view, another table)
should be this:

A C 2
G I 5

How can I achieve this? I have looked at the connect by, but I get to many
records. So basically I have no idea where to start.

Thanks in advance,
PJ

ddf

unread,

Sep 23, 2009, 10:27:16 AM9/23/09

to

You're making this too difficult:

SQL> create table event_track(
2 event_id number(10),
3 start_item varchar2(255),
4 end_item varchar2(255),
5 event_date date
6 );

Table created.

SQL>
SQL>
SQL> insert all
2 into event_track
3 values(1,'A','B',to_date('2009-10-11', 'RRRR-MM-DD'))
4 into event_track
5 values(2,'G','H',to_date('2009-10-12', 'RRRR-MM-DD'))
6 into event_track
7 values(3,'B','C',to_date('2009-10-13', 'RRRR-MM-DD'))
8 into event_track
9 values(4,'H','I',to_date('2009-10-17', 'RRRR-MM-DD'))
10 select * from dual;

4 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> column start_item format a10
SQL> column end_item format a10
SQL>
SQL> select e1.event_id, e1.start_item, e2.end_item, round
(e2.event_date - e1.event_date, 0) event_days
2 from event_track e1, event_track e2
3 where e2.start_item = e1.end_item;

EVENT_ID START_ITEM END_ITEM EVENT_DAYS
---------- ---------- ---------- ----------
1 A C 2
2 G I 5

SQL>

David Fitzjarrell

PJ

unread,

Sep 23, 2009, 1:36:37 PM9/23/09

to

I should have known, the sample data is too easy. It won't work with more
than two records per item. Real world will have a varying number of records
per item, probably with an average of 6.

PJ

"ddf" <ora...@msn.com> wrote in message
news:7c8b9226-0a29-47ef...@g6g2000vbr.googlegroups.com...

ddf

unread,

Sep 23, 2009, 2:35:54 PM9/23/09

to

On Sep 23, 12:36 pm, "PJ" <pjv_dev.nos...@geenspam.hotmail.com> wrote:
> I should have known, the sample data is too easy. It won't work with more
> than two records per item. Real world will have a varying number of records
> per item, probably with an average of 6.
>
> PJ
>

> "ddf" <orat...@msn.com> wrote in message

> David Fitzjarrell- Hide quoted text -
>
> - Show quoted text -

Provide more accurate sample data, then. We work with what we're
given; if that sample is not representative of the actual data how can
anyone expect an accurate solution? Remember, please, "Garbage In,
Garbage Out".

David Fitzjarrell

ddf

unread,

Sep 23, 2009, 2:58:59 PM9/23/09

to

On Sep 23, 12:36 pm, "PJ" <pjv_dev.nos...@geenspam.hotmail.com> wrote:

> I should have known, the sample data is too easy. It won't work with more
> than two records per item. Real world will have a varying number of records
> per item, probably with an average of 6.
>
> PJ
>

> "ddf" <orat...@msn.com> wrote in message

> David Fitzjarrell- Hide quoted text -
>
> - Show quoted text -

I gave this a try and it appears to be working:

SQL> --
SQL> -- Create table
SQL> --

SQL> create table event_track(
2 event_id number(10),
3 start_item varchar2(255),
4 end_item varchar2(255),
5 event_date date
6 );

Table created.

SQL>
SQL> --
SQL> -- Load data
SQL> --

SQL> insert all
2 into event_track
3 values(1,'A','B',to_date('2009-10-11', 'RRRR-MM-DD'))
4 into event_track
5 values(2,'G','H',to_date('2009-10-12', 'RRRR-MM-DD'))
6 into event_track
7 values(3,'B','C',to_date('2009-10-13', 'RRRR-MM-DD'))
8 into event_track
9 values(4,'H','I',to_date('2009-10-17', 'RRRR-MM-DD'))

10 into event_track
11 values(5,'C','D',to_date('2009-10-19', 'RRRR-MM-DD'))
12 into event_track
13 values(6,'I','J',to_date('2009-10-22', 'RRRR-MM-DD'))
14 into event_track
15 values(7,'J','K',to_date('2009-10-23', 'RRRR-MM-DD'))
16 into event_track
17 values(8,'D','E',to_date('2009-10-27', 'RRRR-MM-DD'))
18 select * from dual;

8 rows created.

SQL>
SQL> commit;

Commit complete.

SQL>
SQL> column start_item format a10
SQL> column end_item format a10
SQL>

SQL> --
SQL> -- Display source data
SQL> --
SQL> select event_id, start_item, end_item ,event_date
2 from event_track;

EVENT_ID START_ITEM END_ITEM EVENT_DATE
---------- ---------- ---------- --------------------
1 A B 11-OCT-2009 00:00:00
2 G H 12-OCT-2009 00:00:00
3 B C 13-OCT-2009 00:00:00
4 H I 17-OCT-2009 00:00:00
5 C D 19-OCT-2009 00:00:00
6 I J 22-OCT-2009 00:00:00
7 J K 23-OCT-2009 00:00:00
8 D E 27-OCT-2009 00:00:00

8 rows selected.

SQL>
SQL> --
SQL> -- Generate desired report
SQL> --
SQL> with etrack as(
2 select e1.event_id, e1.start_item, e2.end_item,
e2.event_date edate2, e1.event_date edate1
3 from event_track e1, event_track e2
4 where e2.start_item = e1.end_item
5 )
6 select e11.event_id, e11.start_item, e21.end_item, round
(e21.edate2 - e11.edate1,0) evt_days
7 from etrack e11, etrack e21
8 where e21.start_item = e11.end_item
9 order by 1;

EVENT_ID START_ITEM END_ITEM EVT_DAYS
---------- ---------- ---------- ----------
1 A E 16
2 G K 11

SQL>

Test it on your data to see if the query returns correct results.

David Fitzjarrell

Shakespeare

unread,

Sep 23, 2009, 3:55:02 PM9/23/09

to

ddf schreef:

This does not work for a sequence of more than 4 events. All solutions
of this type can only work for a fixed max. number of events.
Proof:
insert into event_track values (8,'E','F',to_date('2009-11-27', 'RRRR-MM-DD
'));
commit;

and try again.

I think pl/sql is your friend here. Create a function that calculates
the end-point for a given start point (that one could use a connect by
or a cursor loop) and use it in your select statement:
select start_item, f_calc_end_item(startitem) end_item from event_track.
Only problem remaining is: what are the start items? (I leave that to
you..., hint: use a not exists clause.

I hope performance will not be an issue ;-)

Shakespeare

ddf

unread,

Sep 23, 2009, 4:50:28 PM9/23/09

to

> Shakespeare- Hide quoted text -

>
> - Show quoted text -

I never said it was an absolute answer. :D

David Fitzjarrell

Shakespeare

unread,

Sep 23, 2009, 5:09:30 PM9/23/09

to

I just responded to your challenge:

"> Test it on your data to see if the query returns correct results."

;-)

Shakespeare

unread,

Sep 23, 2009, 5:46:34 PM9/23/09

to

Shakespeare schreef:

I just could not leave it alone:

CREATE OR REPLACE FUNCTION f_endpoint(p_startpoint IN NUMBER) RETURN
NUMBER IS
CURSOR c_nextpoint(b_startpoint NUMBER) IS
SELECT t2.event_id
FROM event_track t1, event_track t2
WHERE t1.event_id = b_startpoint
AND t1.end_item = t2.start_item;
r_nextpoint c_nextpoint%ROWTYPE;
BEGIN
OPEN c_nextpoint(p_startpoint);
FETCH c_nextpoint
INTO r_nextpoint;
IF c_nextpoint%NOTFOUND THEN
CLOSE c_nextpoint;
RETURN p_startpoint;
ELSE
CLOSE c_nextpoint;
RETURN f_endpoint(r_nextpoint.event_id);
END IF;
-- this could be done by a connect by and taking the last record
-- as well, but I love recursion ....
END;

SELECT t3.event_id, t3.start_item, t4.end_item
FROM (SELECT t.event_id, t.start_item, f_endpoint(t.event_id) end_event
FROM event_track t
WHERE NOT EXISTS
(SELECT 1 FROM event_track t2 WHERE t2.end_item =
t.start_item)) t3,
event_track t4
WHERE t4.event_id = t3.end_event
ORDER BY 1;

EVENT_ID START_ITEM END_ITEM
1 A E
2 G K

Not the most optimal version I'm sure, but it works. Tested with more
than 4 steps as well.

Shakespeare

Malcolm Dew-Jones

unread,

Sep 23, 2009, 6:05:21 PM9/23/09

to

PJ (pjv_dev...@geenspam.hotmail.com) wrote:
: Hi all,

: We need to track work items across several existing applications. Items can
: start as a document on disk, can be renamed or moved, converted to a task
: in an application, until they reach a certain state. The applications know
: nothing about preceding or following applications and there is no common
: identifier for the items, which has always been a good thing.

: My idea is to create a table for all events. Each application will record
: their
: own events. A minimum is one event per application with a starting and
: finishing situation for that application. Example: "file \\folder1\x.doc was
: moved to \\folder2\y.doc". Another application may follow up with
: "\\folder2\y.doc was archived with id 12345". Each event will have a date
: and time.

Use connect by to join the entries and sys_connect_by_path to show each
complete path.

You will get extra entries such as
e.g.

A->B
A->B->D
A->B->D->X

Filter that list so you only see the results where the end (e.g. X) does
not have a next connection.

CONNECT_BY_ISLEAF might help do that.

CONNECT_BY_ROOT might be useful when displaying the end points.

sys_connect_by_path might not even be required here.

PJ

unread,

Sep 24, 2009, 2:46:53 AM9/24/09

to

Thanks for the suggestions, I will definitely look into it.

"Malcolm Dew-Jones" <yf...@vtn1.victoria.tc.ca> wrote in message
news:4aba9ba1$1...@news.victoria.tc.ca...

PJ

unread,

Sep 24, 2009, 2:50:07 AM9/24/09

to

Thanks for your suggestions, I will look into it. Considering the amount of
records I will have in the event table, I will probably use a second table
with only the resulting report, and recalculate and update the appropriate
record when an event is inserted.

PJ

"Shakespeare" <wha...@xs4all.nl> wrote in message
news:4aba973d$0$83238$e4fe...@news.xs4all.nl...

Michel Cadot

unread,

Sep 24, 2009, 12:17:34 PM9/24/09

to

"PJ" <pjv_dev...@geenspam.hotmail.com> a �crit dans le message de news: 4aba2123$0$23459$703f...@news.kpn.nl...

create table event_track (
event_id number(10),
start_item varchar2(10),
end_item varchar2(10),
event_date date
)
/
insert all
into event_track values(1,'A','B',to_date('2009-10-11', 'RRRR-MM-DD'))
into event_track values(2,'G','H',to_date('2009-10-12', 'RRRR-MM-DD'))
into event_track values(3,'B','C',to_date('2009-10-13', 'RRRR-MM-DD'))
into event_track values(4,'H','I',to_date('2009-10-17', 'RRRR-MM-DD'))
into event_track values(5,'C','D',to_date('2009-10-19', 'RRRR-MM-DD'))
into event_track values(6,'I','J',to_date('2009-10-22', 'RRRR-MM-DD'))
into event_track values(7,'J','K',to_date('2009-10-23', 'RRRR-MM-DD'))
into event_track values(8,'D','E',to_date('2009-10-27', 'RRRR-MM-DD'))
into event_track values(9,'E','F',to_date('2009-11-27', 'RRRR-MM-DD'))
select * from dual;
alter session set nls_date_format='YYYY-MM-DD';

SQL> select * from event_track order by 1;
EVENT_ID START_ITEM END_ITEM EVENT_DATE
---------- ---------- ---------- ----------

1 A B 2009-10-11
2 G H 2009-10-12
3 B C 2009-10-13
4 H I 2009-10-17

5 C D 2009-10-19
6 I J 2009-10-22
7 J K 2009-10-23
8 D E 2009-10-27
9 E F 2009-11-27

9 rows selected.

SQL> select connect_by_root start_item start_item,
2 end_item,
3 event_date - connect_by_root event_date evt_days
4 from event_track
5 where connect_by_isleaf = 1
6 connect by prior end_item = start_item
7 start with start_item not in (select end_item from event_track)
8 order by 1
9 /

START_ITEM END_ITEM EVT_DAYS
---------- ---------- ----------

A F 47
G K 11

2 rows selected.

Regards
Michel

Shakespeare

unread,

Sep 24, 2009, 4:49:04 PM9/24/09

to

PJ schreef:

That sounds like a good idea to me.

Shakespeare

unread,

Sep 24, 2009, 4:53:09 PM9/24/09

to

Michel Cadot schreef:

In my first post, I originally mentioned you as one of the solutions,
but decided not to. Somehow I knew you would come up with a good and
simple answer.....

Compliments!

Shakespeare

PJ

unread,

Sep 25, 2009, 2:56:37 AM9/25/09

to

Very nice! Very clean coding and just what I need. Thanks!

"Michel Cadot" <micadot{at}altern{dot}org> wrote in message
news:4abb9b9e$0$15180$426a...@news.free.fr...