Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Looking for "gaps" (from a time perspective) in data

0 views
Skip to first unread message

jeremy

unread,
Oct 10, 2008, 5:53:14 AM10/10/08
to
This is on 10gR2 standard edition.

We have a table defined like this:

txn_id
date_start
date_end
url_called

which contains a record for every time our application processes a
request from a browser. We would like to be able to spot any "gaps" in
the data - which would indicate a period when the service was
unavailable for whatever reason.

To do this we need to do a comparison between the date_start of each
row and the one that preceded (or followed) it and then see if the
period of time is greater than a to-be-determined threshold.

Can a single SQL statement be constructed which could achieve this? if
so could you point me in the right direction? For those who haven't
followed what I was asking about, if you are in the majority then
please accept my apologies for not being more concise :)

regards,

--
jeremy

sybrandb

unread,
Oct 10, 2008, 7:04:16 AM10/10/08
to

Please refer to the LAG function in the Analytical Functions chapter
in the Datawarehousing documentation for the version you don't care to
mention.
Please apologize for being *too* concise. At a minimum you should
always post your 4 digit version number.
If you can't be bother to do it, please consider not posting at all.
Crystal balls have been worn out by similar lazy dudes like you.

--
Sybrand Bakker
Senior Oracle DBA

Robert Klemme

unread,
Oct 10, 2008, 7:19:16 AM10/10/08
to

Yes, this should be possible with analytic SQL. You define a sliding
window of one row preceding ordered by date_start and take the
difference of date_start - min(date_start).

An alternative might be to define the window based on a range of dates
preceding (i.e. use you threshold) and emit all rows where the count is
1 (i.e. there is no other record within the threshold time).

HTH

Kind regards

robert

ca11...@gmail.com

unread,
Oct 10, 2008, 8:04:40 AM10/10/08
to
Hi Jeremy,

Instead of looking at individual records it may be easier to analyze
the issue from statistical perspective.
Calculate number of records, average duration, and standard deviation
with averaging
over second/minute/hour - whatever is appropriate.

For example, for second

select
to_char(date_start,'YYYY-MM-DD HH24:MI:SS') d_start,
count(*) cnt,
avg((date_end - date_start)*24*3600) avg_duration
from tableA
group by
to_char(date_start,'YYYY-MM-DD HH24:MI:SS');

jeremy

unread,
Oct 10, 2008, 8:52:24 AM10/10/08
to
You mean where I stated 10gR2 up there ^^^^?

LAG and LEAD were exactly the functions I was looking for.

> Please apologize for being *too* concise. At a minimum you should
> always post your 4 digit version number.

Blimey I didn't realise the rules now included that level of detail.

> If you can't be bother to do it, please consider not posting at all.
> Crystal balls have been worn out by similar lazy dudes like you.

Heh - I know you've been flamed before for being rude, but I'll not
rise to the bait.

Thanks for your help, the pointer to the relevant functions was all I
needed.

--
jeremy

jeremy

unread,
Oct 10, 2008, 8:56:32 AM10/10/08
to


Ended up with a query like this:

select *
from (
select txn_id, web_site_id, username, date_start,
lag(date_start) over (order by txn_id) prev_date_start,
lead(date_start) over (order by txn_id) next_date_start,
url
from system_activity_log
where date_start >= sysdate-5
and ip_address is not null
order by 1) tabview
where date_start - prev_date_start > 1/1440
and to_number(to_char(date_start,'hh24')) between 6 and 22;

Thanks for your comments.

--
jeremy


jeremy

unread,
Oct 10, 2008, 9:43:09 AM10/10/08
to
On Oct 10, 12:19 pm, Robert Klemme <shortcut...@googlemail.com> wrote:

jeremy

unread,
Oct 10, 2008, 10:27:37 AM10/10/08
to

Hi this makes sense, but for us to identify periods of no activity, we
would want to get a row for every minute if the day (obviosuly for
that purpose we would start by altering the date format masks to not
include the :SS) - can't think at the moment how we could do that -
any thoughts?

--
jeremy

ErikYkema

unread,
Oct 10, 2008, 11:03:20 AM10/10/08
to
> jeremy- Hide quoted text -
>
> - Show quoted text -

You may enjoy http://download.oracle.com/docs/cd/B19306_01/server.102/b14223/analysis.htm#DWHSG02013
about "Data Densification" and
http://download.oracle.com/docs/cd/B19306_01/server.102/b14223/analysis.htm#sthref1845
on "Time Series Calculations on Densified Data".
You can create a densely filled matrix along e.g. the time axis and
then do analysis on that.

joel garry

unread,
Oct 10, 2008, 6:16:15 PM10/10/08
to

Also put those in the search box at asktom.oracle.com for many more
examples.

The book mentioned here sounds useful (I haven't checked it out yet,
I'm afraid I might have to recode a huge old project :-) ):
http://oraclesponge.wordpress.com/2008/06/12/the-overlaps-predicate/

>
> > Please apologize for being *too* concise. At a minimum you should
> > always post your 4 digit version number.
>
> Blimey I didn't realise the rules now included that level of detail.

See http://www.dbaoracle.net/readme-cdos.htm#subj12

>
> > If you can't be bother to do it, please consider not posting at all.
> > Crystal balls have been worn out by similar lazy dudes like you.
>
> Heh - I know you've been flamed before for being rude, but I'll not
> rise to the bait.
>
> Thanks for your help, the pointer to the relevant functions was all I
> needed.
>

Thanks for posting the query you wound up with, too, bound to be a
FAQ.

jg
--
@home.com is bogus.
So, how do you run a trace on what exadata is doing?

0 new messages