
help with regex


Obama

Nov 20, 2009, 7:07:13 PM
I have a log file that I would like to parse for the following
fields:
1- start date, 2- server name, 3- end date matching the server name
that started, 4- size for the server that started and ended

I got this code:


while (<FILE>) {
next if (/^\s*$/); # skip empty lines
next if (/^slk/); # skip empty lines
next if (/^=/); # skip = lines
if ( /(\S+) (\S+\s\S+\s\S+) (.*?) (\w+) \S+ (\S+) (\S+)( \(([^)]+
\)))?/ )
{
if ($6 eq 'Start')
{
push @server_start_name, $5;
push @server_start_date, $2;
push @time_start, $3;
}

if ($6 eq 'End')
{
push @server_end_name, $5;
push @time_end, $3;
push @size_end, $7;
}
}

my $num_end = scalar (@time_end);
for (my $i=0; $i < $num_start; $i++)
{
$t1= $time_start[$i];
$t2= $time_end[$i];
$tx= $size_end[$i];
$s_server= $server_start_name[$i];
$e_server= $server_end_name[$i];
print "Server $s_server start: $s_server\nServer $e_server End:
$e_server size: $tx\n\n";
}

The output I get is not right: the start server does not match the
end server. Here is the output:

output:
====================
Start: Hercules:sm_fv_servicedata
End : Hercules:sm_fv_students

Start: Hercules:sm_fv_phd
End : Hercules:sm_fv_phd


Log file:
====================
dst Wed Nov 18 16:00:04 EST galaxy.fuqua.duke.edu:fv_students
Hercules:sm_fv_students Start
dst Wed Nov 18 16:00:05 EST galaxy.fuqua.duke.edu:fv_servicedata
Hercules:sm_fv_servicedata Start
dst Wed Nov 18 16:00:05 EST galaxy.fuqua.duke.edu:fv_phd
Hercules:sm_fv_phd Start
dst Wed Nov 18 16:00:22 EST galaxy.fuqua.duke.edu:fv_students
Hercules:sm_fv_students End (14532 KB)
dst Wed Nov 18 16:00:33 EST galaxy.fuqua.duke.edu:fv_servicedata
Hercules:sm_fv_servicedata End (15212 KB)
dst Wed Nov 18 16:01:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Request (Retry)
dst Wed Nov 18 16:01:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Defer (volume is not online; cannot execute
operation)
dst Wed Nov 18 16:01:10 EST galaxy.fuqua.duke.edu:fv_phd
Hercules:sm_fv_phd End (339576 KB)
dst Wed Nov 18 16:02:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Request (Retry)
dst Wed Nov 18 16:02:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Defer (volume is not online; cannot execute
operation)
dst Wed Nov 18 16:03:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Request (Retry)
dst Wed Nov 18 16:03:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Defer (volume is not online; cannot execute
operation)
~

Any help will be appreciated; thanks in advance.

Jim Gibson

Nov 20, 2009, 7:43:55 PM
In article
<44e92eab-08e8-4393...@g1g2000pra.googlegroups.com>,
Obama <cyrus...@gmail.com> wrote:

This is because the End statements do not appear in the log file in the
same order as the Start statements. This is to be expected. Your data
structures assume that they do.

What you need to do is use a data structure that is impervious to the
order of the statements in the log file. Use a hash-of-hashes: the
first-level hash key is the name of the server and the value is a hash
reference. The second-level hash keys are 'Start', 'End', 'Date', and
'Size', and the hash values are the corresponding data extracted from
the file (untested):

my %times;
while(<FILE>) {
    if( big-long-regex ) {
        $times{$5}->{$6} = $3;
        if( $6 eq 'Start' ) {
            $times{$5}->{Date} = $2;
        }elsif( $6 eq 'End' ) {
            $times{$5}->{Size} = $7;
        }
    }
}

You could also use unpack or substr to extract the data instead of a
big regex. If you do stick with a regex, use the x modifier and split
the regex over several lines for readability.
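
To illustrate the x-modifier suggestion, here is a minimal sketch. It is a restatement written for this example, not the regex from the original post: the variable names, the use of \s+ between fields, and the sample line (one log record joined onto a single line) are all assumptions.

```perl
use strict;
use warnings;

# One record from the posted log, joined onto a single line (assumption).
my $line = 'dst Wed Nov 18 16:00:22 EST galaxy.fuqua.duke.edu:fv_students '
         . 'Hercules:sm_fv_students End (14532 KB)';

# With /x, whitespace in the pattern is ignored and "#" starts a comment.
# Comments inside the pattern avoid sigils, which Perl would interpolate.
my $log_re = qr{
    ^ (\S+)                \s+   # capture 1: direction, e.g. "dst"
    (\S+ \s+ \S+ \s+ \S+)  \s+   # capture 2: date, e.g. "Wed Nov 18"
    (\S+)                  \s+   # capture 3: time
    (\w+)                  \s+   # capture 4: time zone
    \S+                    \s+   # source volume, not captured
    (\S+)                  \s+   # capture 5: server name
    (\S+)                        # capture 6: event, e.g. "Start" or "End"
    (?: \s+ \( ([^)]+) \) )?     # capture 7: optional detail in parentheses
}x;

my ( $dir, $date, $time, $tz, $server, $event, $detail ) = $line =~ $log_re
    or die "line did not match\n";

printf "%s %s at %s %s, detail: %s\n",
    $server, $event, $date, $time, defined $detail ? $detail : 'none';
```

Using a non-capturing group around the parentheses keeps the surrounding " (" and ")" out of the captured size.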

--
Jim Gibson

Obama

Nov 20, 2009, 8:05:39 PM
Thanks Jim,
then how do you access it each one?

Obama

Nov 21, 2009, 12:01:01 AM
On Nov 20, 5:05 pm, Obama <cyrusgre...@gmail.com> wrote:
> Thanks Jim,
> then how do you access it each one?

Will someone please help me on this, thanks in advance..
Jim, it does not work


my %times;
while(<FILE>) {

if( /(\S+) (\S+\s\S+\s\S+) (.*?) (\w+) \S+ (\S+) (\S+)( \(([^)]+
\)))?/) {


$times{$5}->{$6} = $3;
if( $6 eq 'Start' ) {

$times{$5}->{Date} = $2; <----should this be 'Date' not Date


}elsif( $6 eq 'End' ) {

$times{$5}->{Size} = $7; <----- same for this one
}
}

}

Also how to you access teach key such as server, date/time, for both
start and end and size?

Martien Verbruggen

Nov 21, 2009, 12:46:59 AM
On Fri, 20 Nov 2009 21:01:01 -0800 (PST),
Obama <cyrus...@gmail.com> wrote:
> On Nov 20, 5:05 pm, Obama <cyrusgre...@gmail.com> wrote:
>> Thanks Jim,
>> then how do you access it each one?
>
> Will someone please help me on this, thanks in advance..
> Jim, it does not work

Please, read this:

http://www.rehabitation.com/clpmisc.shtml

You're not making it easy for anyone to help you. Make it easier. Read
that link. Do what it suggests.

When I tried to look at your problem, I had to make changes to both
the code and the data. Even after making the snippet that you provided
syntactically correct, it still wouldn't run with strict, and only with
a large number of warnings without.

So I gave up.

If you had been a bit more proactive, there would have been a decent
chance you would have had a solution by now.

Martien
--
|
Martien Verbruggen | In the fight between you and the world,
first...@heliotrope.com.au | back the world - Franz Kafka
|

Obama

Nov 21, 2009, 2:00:11 AM
>
> So I gave up.
>
Please don't, I'm new to this and please be patient I try my best,
here what I got so far

open (FILE, $file) || die "Can't open datafile '$file' $!";

while (<FILE>)
{
next if (/^\s*$/); # skip empty lines
next if (/^slk/); # skip empty lines
next if (/^=/); # skip = lines

if ( /(\S+) (\S+\s\S+\s\S+) (.*?) (\w+) \S+ (\S+) (\S+)( \(([^)]+
\)))?/ )
{


if ($6 eq 'Start')
{

$end{$5}= {
s_name => $5,
s_date => $2,
s_time => $3,


};
}
if ($6 eq 'End')
{

$end{$5}= {
e_name => $5,
e_date => $2,
e_time => $3,
e_size => $7
};
}
}
}
close (FILE);

print Dumper(%end);

I got the following

$VAR1 = 'Hercules:sm_fv_mmvideo';
$VAR2 = {
'e_name' => 'Hercules:sm_fv_mmvideo',
'e_size' => ' (692 KB)',
'e_time' => '15:30:17',
'e_date' => 'Wed Nov 18'
};
$VAR3 = 'Hercules:sm_esx_sata_nfs1';
$VAR4 = {
'e_name' => 'Hercules:sm_esx_sata_nfs1',
'e_size' => ' (1161316 KB)',
'e_time' => '15:42:12',
'e_date' => 'Wed Nov 18'
};
What is missing is the start data:
s_name => $5,
s_date => $2,
s_time => $3,


Basically, I want to make sure each server has the right info (start
and end time), since there are other servers in the log, as listed above.


foreach $key (keys %end){ # loop thru the hash
my $s_server = $end{$key}{s_name}; # retrieve the name
my $e_server = $end{$key}{e_name}; # retrieve the name

my $s_date = $end{$key}{s_date}; # retrieve date
my $e_date = $end{$key}{e_date}; # retrieve date

my $s_time = $end{$key}{s_time}; # retrieve time
my $e_time = $end{$key}{e_time}; # retrieve time


my $e_size = $end{$key}->{e_size}; # size

print "$key:\n------------------\nStart: $s_server\nDate: $s_date
\nTime: $s_time\n";;
print "$key:\n------------------\nEnd: $e_server\nDate: $e_date
\nTime: $e_time\nSize: $e_size\n\n";;
}

Thanks again and hope I can get something from you good guys...

the log file has

Tad McClellan

Nov 21, 2009, 8:22:29 AM
Obama <cyrus...@gmail.com> wrote:
>>
>> So I gave up.
>>
> Please don't,


Did you read all of the paragraphs before that part?

Did you understand what they were saying?

Are you trying to correct the deficiencies pointed out there?

Did you read the Posting Guidelines?


> I'm new to this


New to Perl is no problem.

New to programming is no problem.

Most folks will not ignore you for that.

New to asking a coherent question on Usenet newsgroups can become
a problem, so concentrate on getting that right before it is too late.


> and please be patient I try my best,


I am sorry, but I do not believe you.

I do not think you are trying very hard at all.

The Posting Guidelines suggest you post a *complete* program that
we can run. (my earlier followup to you included a complete program
for example.)

You did not do that.

The Posting Guidelines suggest that you include file data in
a __DATA__ section (also as has been shown to you in my earlier
followup).

You did not do that either.


> next if (/^slk/); # skip empty lines


That code does not "skip empty lines"...


> print Dumper(%end);


You should pass a *reference* to Dumper:

print Dumper(\%end);
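
A small sketch of the difference, assuming a made-up one-entry %end hash (the key and value below are illustrative, not taken from a real run):

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order for readable output

# Hypothetical data in the shape used in the quoted code.
my %end = ( 'Hercules:sm_fv_phd' => { e_time => '16:01:10' } );

# Passing the hash itself flattens it into a list of keys and values,
# so Dumper reports separate $VAR1, $VAR2, ... items.
my $flat = Dumper(%end);

# Passing a reference keeps the structure together in a single $VAR1.
my $nested = Dumper(\%end);

print $flat, "\n", $nested;
```

The flattened form is what produced the alternating $VAR1, $VAR2, $VAR3 lines in the quoted output.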


> hope I can get something from you good guys...


You are still making it hard for us to do that.

If you make it easy for us to help you, then it is more likely
that we will end up actually helping you...


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.noitatibaher\100cmdat/"

Jürgen Exner

Nov 21, 2009, 9:20:27 AM
Obama <cyrus...@gmail.com> wrote:
>On Nov 20, 5:05 pm, Obama <cyrusgre...@gmail.com> wrote:
>> Thanks Jim,
>> then how do you access it each one?
>
>Will someone please help me on this, thanks in advance..
>Jim, it does not work

Who is Jim, what did he do, and what doesn't work?

>my %times;
>while(<FILE>) {
> if( /(\S+) (\S+\s\S+\s\S+) (.*?) (\w+) \S+ (\S+) (\S+)( \(([^)]+
>\)))?/) {
> $times{$5}->{$6} = $3;
> if( $6 eq 'Start' ) {
> $times{$5}->{Date} = $2; <----should this be 'Date' not Date

No, not necessarily. Using barewords as keys for hashes is perfectly
fine as long as they don't contain whitespace.

> }elsif( $6 eq 'End' ) {
> $times{$5}->{Size} = $7; <----- same for this one
> }
> }
>
>}
>
>Also how to you access teach key such as server, date/time, for both
>start and end and size?

There are no such keys in the code sample you showed above. The only
keys used are 'Date', 'Size', and whatever the contents of $5 and $6 are.

Either way, you got a Hash-of-Hashes, so the answer would depend upon if
you are talking about keys for the top-level hash or the sub-hashes.
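
As a rough sketch of both levels, assuming a %times hash shaped like the one built by the quoted loop (the values below are typed in by hand from the posted log, not produced by running that code):

```perl
use strict;
use warnings;

# Hypothetical contents: server name => { Start, End, Date, Size }.
my %times = (
    'Hercules:sm_fv_phd' => {
        Start => '16:00:05',
        End   => '16:01:10',
        Date  => 'Wed Nov 18',
        Size  => '339576 KB',
    },
    'Hercules:sm_fv_students' => {
        Start => '16:00:04',
        End   => '16:00:22',
        Date  => 'Wed Nov 18',
        Size  => '14532 KB',
    },
);

my @report;
for my $server ( sort keys %times ) {       # top-level keys: server names
    my $info = $times{$server};             # each value is a hash reference
    for my $field ( sort keys %$info ) {    # second-level keys
        push @report, "$server $field=$info->{$field}";
    }
}
print "$_\n" for @report;

# A single value can also be read directly with two subscripts.
my $phd_end = $times{'Hercules:sm_fv_phd'}{End};
print "phd ended at $phd_end\n";
```

The arrow between adjacent subscripts is optional, so $times{$server}{End} and $times{$server}->{End} mean the same thing.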

jue

Obama

Nov 21, 2009, 11:19:05 AM
Here is complete code that I need your help, please do let me know if
you do have any other question.

#!/usr/bin/perl -w

use Data::Dumper;
my %HoH=();

while (<DATA>)
{
if ( /(\S+) (\S+\s\S+\s\S+) (.*?) (\w+) \S+ (\S+) (\S+)( \(([^)]+
\)))?/ )
{


if ($6 eq 'Start')
{

%HoH = (
'start_server' => {'start_name' =>$5, 'start_date' => $2,
'start_time' => $3},
);
}

if ($6 eq 'End')
{

%HoH = (
'ende_name' => {'end_name' =>$5, 'end_date' => $2,
'end_time' => $3, 'size' => $7}
);
}
}
}

print Dumper(\%HoH);

=item
What I want is to match right server for right date, time and size,
the out put I'm looking for is:

Server: Hercules:sm_fv_servicedata
Start date: Sun Nov 15
Start time: 00:00:03
End date: Sun Nov 15
End time: 00:00:55
End size: (53664 KB)

Same for other servers
=cut

__DATA__
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_servicedata
Hercules:sm_fv_servicedata Start
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_phd
Hercules:sm_fv_phd Start
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_students
Hercules:sm_fv_students Start
dst Sun Nov 15 00:00:25 EST galaxy.fuqua.duke.edu:fv_students
Hercules:sm_fv_students End (3368 KB)
dst Sun Nov 15 00:00:26 EST galaxy.fuqua.duke.edu:fv_phd
Hercules:sm_fv_phd End (4528 KB)
dst Sun Nov 15 00:00:55 EST galaxy.fuqua.duke.edu:fv_servicedata
Hercules:sm_fv_servicedata End (53664 KB)
dst Sun Nov 15 00:01:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Request (Retry)
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_faculty
Hercules:sm_fv_faculty Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_researchdata
Hercules:sm_fv_researchdata Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_emba
Hercules:sm_fv_emba Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:root
Hercules:sm_galaxy_root Start
dst Sun Nov 15 00:15:13 EST galaxy.fuqua.duke.edu:fv_emba
Hercules:sm_fv_emba End (1900 KB)
dst Sun Nov 15 00:15:18 EST galaxy.fuqua.duke.edu:fv_researchdata
Hercules:sm_fv_researchdata End (1820 KB)
dst Sun Nov 15 00:15:32 EST galaxy.fuqua.duke.edu:root
Hercules:sm_galaxy_root End (39128 KB)
dst Sun Nov 15 00:16:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2
Hercules:sm_esx_fc_nfs2 Request (Retry)

Martien Verbruggen

Nov 21, 2009, 3:22:33 PM
On Fri, 20 Nov 2009 23:00:11 -0800 (PST),
Obama <cyrus...@gmail.com> wrote:
>>
>> So I gave up.
>>
> Please don't, I'm new to this and please be patient I try my best,
> here what I got so far

And this is what I see when I copy and paste (which is ALL I should have
to do), and add

#!/usr/bin/perl
use warnings;
use strict;

Which is what YOU should have already done.

Global symbol "$file" requires explicit package name at /tmp/foo.pl line 5.
Global symbol "$file" requires explicit package name at /tmp/foo.pl line 5.
Global symbol "%end" requires explicit package name at /tmp/foo.pl line 18.
Global symbol "%end" requires explicit package name at /tmp/foo.pl line 26.
Global symbol "%end" requires explicit package name at /tmp/foo.pl line 37.
Execution of /tmp/foo.pl aborted due to compilation errors.

Again, I gave up.

Did you follow that link that I suggested? And did you read all it said
and all it referred to? If so, I suggest you read it again.

You're simply making it too hard for anyone to help you.

John W. Krahn

Nov 21, 2009, 3:34:04 PM
Jürgen Exner wrote:
> Obama <cyrus...@gmail.com> wrote:
>> On Nov 20, 5:05 pm, Obama <cyrusgre...@gmail.com> wrote:
>>> Thanks Jim,
>>> then how do you access it each one?
>> Will someone please help me on this, thanks in advance..
>> Jim, it does not work
>
> Who is Jim, what did he do, and what doesn't work?
>
>> my %times;
>> while(<FILE>) {
>> if( /(\S+) (\S+\s\S+\s\S+) (.*?) (\w+) \S+ (\S+) (\S+)( \(([^)]+
>> \)))?/) {
>> $times{$5}->{$6} = $3;
>> if( $6 eq 'Start' ) {
>> $times{$5}->{Date} = $2; <----should this be 'Date' not Date
>
> No, not necessarily. Using barewords as keys for hashes is perfectly
> fine as long as they don't contain whitespace.

as long as they don't contain any \W characters.
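
A short sketch of the quoting rule under discussion; the hash names and values here are made up for illustration:

```perl
use strict;
use warnings;

my %h;
$h{Date}         = 'Wed Nov 18';    # simple identifier: quoted automatically
$h{'Date'}       = 'Thu Nov 19';    # the same key, so this overwrites it
$h{'start date'} = 'Wed Nov 18';    # contains whitespace: quotes needed
$h{'start-date'} = 'Wed Nov 18';    # contains a hyphen: quotes needed

# The left-hand side of => is quoted automatically under the same rule.
my %g = ( Size => '14532 KB' );

my @keys = sort keys %h;
print join( ', ', @keys ), "\n";
```

Anything inside the braces that is not a simple identifier is parsed as an expression, which is why the last two keys have to be quoted.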

John
--
The programmer is fighting against the two most
destructive forces in the universe: entropy and
human stupidity. -- Damian Conway

Martien Verbruggen

Nov 21, 2009, 4:07:52 PM
On Sat, 21 Nov 2009 08:19:05 -0800 (PST),
Obama <cyrus...@gmail.com> wrote:
> Here is complete code that I need your help, please do let me know if
> you do have any other question.

Thanks for submitting code that compiled. Using that as a starting
point, I came up with a possible solution to your problem. I had to fix
your regex to correctly capture the size information in $7, but other
than that, it seems to work.

You should try to convince your posting software to not wrap long lines.
I had to reassemble the log lines. Not a huge deal.

In the below, I'm using the %data hash to store information about each
'server'. Under each server key is a hash reference with keys 'start'
and 'end'. Each start hash has two keys, and each end one three. Other
data structures can be used, particularly for the last level, but I
figured this was closest to what you were trying to do, and therefore
probably most comfortable. I noticed you know about Data::Dumper, so use

print Dumper(\%data);

somewhere near the end of the program to see what the structure looks
like if you want.

Depending on requirements I wasn't aware of when I was writing this, you
may need to change one or two things. This should give you a decent
start though.

If you have any questions, don't hesitate to ask.

Martien

PS. Here's the program:


#!/usr/bin/perl
use strict;
use warnings;
my %data;

while (<DATA>)
{
    if (/(\S+) (\S+\s\S+\s\S+) (.*?) (\w+) \S+ (\S+) (\S+)(?: \(([^)]+)\))?/)
    {
        if ($6 eq 'Start')
        {
            $data{$5}{start} = {date => $2, time => $3};
        }
        elsif ($6 eq 'End')
        {
            $data{$5}{end} = {date => $2, time => $3, size => $7};
        }
    }
}

sub print_start
{
    my ($start) = @_;
    if ($start)
    {
        printf " Start date: %s\n Start time: %s\n",
            $start->{date}, $start->{time};
    }
    else
    {
        print " No start information\n";
    }
}

sub print_end
{
    my ($end) = @_;
    if ($end)
    {
        printf " End date: %s\n End time: %s\n End size: %s\n",
            $end->{date}, $end->{time}, $end->{size};
    }
    else
    {
        print " No end information\n";
    }
}

while (my ($name, $data) = each %data)
{
    print "Server: $name\n";
    print_start($data->{start});
    print_end($data->{end});
}

__DATA__
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata Start
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_phd Hercules:sm_fv_phd Start
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_students Hercules:sm_fv_students Start
dst Sun Nov 15 00:00:25 EST galaxy.fuqua.duke.edu:fv_students Hercules:sm_fv_students End (3368 KB)
dst Sun Nov 15 00:00:26 EST galaxy.fuqua.duke.edu:fv_phd Hercules:sm_fv_phd End (4528 KB)
dst Sun Nov 15 00:00:55 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata End (53664 KB)
dst Sun Nov 15 00:01:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2 Hercules:sm_esx_fc_nfs2 Request (Retry)
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_faculty Hercules:sm_fv_faculty Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_researchdata Hercules:sm_fv_researchdata Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:root Hercules:sm_galaxy_root Start
dst Sun Nov 15 00:15:13 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba End (1900 KB)
dst Sun Nov 15 00:15:18 EST galaxy.fuqua.duke.edu:fv_researchdata Hercules:sm_fv_researchdata End (1820 KB)
dst Sun Nov 15 00:15:32 EST galaxy.fuqua.duke.edu:root Hercules:sm_galaxy_root End (39128 KB)
dst Sun Nov 15 00:16:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2 Hercules:sm_esx_fc_nfs2 Request (Retry)

--
| Everything that can be invented has been
Martien Verbruggen | invented. -- Charles H. Duell,
first...@heliotrope.com.au | Commissioner, U.S. Office of Patents,
| 1899

Martien Verbruggen

Nov 21, 2009, 4:13:38 PM
On Sat, 21 Nov 2009 12:34:04 -0800,
John W. Krahn <som...@example.com> wrote:

> Jürgen Exner wrote:
>
>> No, not necessarily. Using barewords as keys for hashes is perfectly
>> fine as long as they don't contain whitespace.
>
> as long as they don't contain any \W characters.

\begin{pedantry}[level='slight']

It wouldn't, by definition. A 'bareword' is a 'word' (see perldata), and
a word consists entirely of \w characters, as that is how \w is defined.

\end{pedantry}

Martien
--
|
Martien Verbruggen | There are only 10 types of people in the
first...@heliotrope.com.au | world; those who understand binary and
| those who don't.

Obama

Nov 21, 2009, 4:42:48 PM
On Nov 21, 1:07 pm, Martien Verbruggen
<martien.verbrug...@invalid.see.sig> wrote:
> On Sat, 21 Nov 2009 08:19:05 -0800 (PST),
> If you have any questions, don't hesitate to ask.
> Martien

Martien,
thanks for your help, I do have a log file which contains more than
3000 records, after running the program it records only 23, the reason
I guess that code overwrites if one server has more than one 'Start|
End'. One server could 'Start' say at Nov 15 00:03:45 and 'End' Nov 15
00:3:55 and have another session later on the log, say at Nov 18
00:08:41 and 'End' Nov 18 00:08:59. The last one overwrite the earlier
ones! Again thanks for your help if you can help me out on this!


John W. Krahn

Nov 21, 2009, 11:32:26 PM
Martien Verbruggen wrote:
> On Sat, 21 Nov 2009 12:34:04 -0800,
> John W. Krahn <som...@example.com> wrote:
>> Jürgen Exner wrote:
>>
>>> No, not necessarily. Using barewords as keys for hashes is perfectly
>>> fine as long as they don't contain whitespace.
>> as long as they don't contain any \W characters.
>
> \begin{pedantry}[level='slight']
>
> It wouldn't, by definition. A 'bareword' is a 'word' (see perldata), and
> a word consists entirely of \w characters, as that is how \w is defined.
>
> \end{pedantry}

Yes, and a 'bareword' with whitespace isn't really a 'bareword' either.
Perhaps you meant to direct this to Jürgen instead of me?

Jürgen Exner

Nov 22, 2009, 1:24:35 PM
Obama <cyrus...@gmail.com> wrote:
>On Nov 21, 1:07 pm, Martien Verbruggen
><martien.verbrug...@invalid.see.sig> wrote:
>> On Sat, 21 Nov 2009 08:19:05 -0800 (PST),
>> If you have any questions, don't hesitate to ask.
>> Martien
>
>Martien,
>thanks for your help, I do have a log file which contains more than
>3000 records, after running the program it records only 23, the reason

I am confused! Is this a different problem than the one you posted under
"Help with HoH and accessing right keys"? Because in that thread you
claimed 23 out of 300 records.

>I guess that code overwrites if one server has more than one 'Start|
>End'. One server could 'Start' say at Nov 15 00:03:45 and 'End' Nov 15
>00:3:55 and have another session later on the log, say at Nov 18
>00:08:41 and 'End' Nov 18 00:08:59. The last one overwrite the earlier
>ones! Again thanks for your help if you can help me out on this!

Yes, as people have explained in that other thread, that's what hashes
do.
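
One way around the overwriting, sketched here under assumptions rather than offered as the fix: keep an array of sessions per server and push a new entry for every Start. The sample lines, the sizes, the %sessions name, and the simplified regex are all invented for this illustration.

```perl
use strict;
use warnings;

# Hypothetical log lines in the thread's format: one server, two sessions.
my @lines = (
    'dst Sun Nov 15 00:03:45 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba Start',
    'dst Sun Nov 15 00:03:55 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba End (1900 KB)',
    'dst Wed Nov 18 00:08:41 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba Start',
    'dst Wed Nov 18 00:08:59 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba End (2100 KB)',
);

my %sessions;    # server name => [ { start => {...}, end => {...} }, ... ]

for (@lines) {
    next unless /^\S+ (\S+ \S+ \S+) (\S+) \w+ \S+ (\S+) (Start|End)(?: \(([^)]+)\))?/;
    my ( $date, $time, $server, $event, $size ) = ( $1, $2, $3, $4, $5 );

    if ( $event eq 'Start' ) {
        # Each Start opens a new session instead of replacing the last one.
        push @{ $sessions{$server} }, { start => { date => $date, time => $time } };
    }
    elsif ( @{ $sessions{$server} || [] } ) {
        # An End is attached to the most recently opened session.
        $sessions{$server}[-1]{end} = { date => $date, time => $time, size => $size };
    }
}

for my $server ( sort keys %sessions ) {
    print "Server: $server\n";
    for my $s ( @{ $sessions{$server} } ) {
        my $end = $s->{end};
        printf "  %s %s - %s\n", $s->{start}{date}, $s->{start}{time},
            $end ? "$end->{date} $end->{time} ($end->{size})" : 'no end recorded';
    }
}
```

This sketch silently skips an End that has no earlier Start, and a second End for the same session replaces the first; real logs may need stricter handling of both cases.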

jue

s...@netherlands.com

Nov 24, 2009, 3:40:15 PM

Since you refuse to treat this like a database, here is a
hybrid, putting the record in fixed strings, sorting, then
extracting. All in a fixed way, since you refuse everything else.

-sln
---------
Output:

Hercules:sm_fv_emba
Tue Nov 12 00:15:04 - Wed Nov 13 00:00:13 (6098287 KB)
Sun Nov 15 00:15:04 - Sun Nov 15 00:15:13 (1900 KB)

Hercules:sm_fv_faculty


Sun Nov 15 00:15:04

Hercules:sm_fv_phd
Sun Nov 15 00:00:03 - Sun Nov 15 00:00:26 (4528 KB)

Hercules:sm_fv_researchdata
Sun Nov 15 00:15:04 - Sun Nov 15 00:15:18 (1820 KB)

Hercules:sm_fv_servicedata
- Tue Nov 15 00:00:58 (53664 KB)
- Sun Nov 14 00:00:55 (53664 KB)
- Sun Nov 14 00:02:01 (53664 KB)
- Sun Nov 15 00:00:01 (53664 KB)
Sun Nov 15 00:00:03 - Sun Nov 15 00:01:00 (4445554 KB)

Hercules:sm_fv_students
Sun Nov 15 00:00:03 - Sun Nov 15 00:00:25 (3368 KB)

Hercules:sm_galaxy_root
Sun Nov 15 00:15:04 - Sun Nov 15 00:15:32 (39128 KB)

network-1:Test1
Fri May 25 00:13:20 - Fri May 25 00:13:49 (2048 KB)

network-2:Test2
Sat May 26 00:15:20 - Sat May 26 00:15:49 (212048 KB) - Sat May 26 00:15:50 (212048 KB)
Sat May 26 00:16:20
Sat May 26 00:16:22 - Sat May 26 00:16:49 (212048 KB)
---------

## misc_parse13.pl, sln
##
use strict;
use warnings;

my %servers;

my %day2num = (
mon=>'1', tue=>'2', wed=>'3', thu=>'4',
fri=>'5', sat=>'6', sun=>'7');
my %month2num = (
jan=>'01', feb=>'02', mar=>'03', apr=>'04',
may=>'05', jun=>'06', jul=>'07', aug=>'08',
sep=>'09', oct=>'10', nov=>'11', dec=>'12');
my %num2day = reverse ( %day2num );
my %num2month = reverse ( %month2num );

while (<DATA>)
{
    my @all = split /[()\s]+/;
    next if (@all<9 or $all[8] !~ /^(?:start|end)/i);
    $all[9] = '' if !defined ($all[9]);
    $all[10] = '' if !defined ($all[10]);

    my $rec =
        $day2num{lc $all[1]}.            # day of week/year
        $month2num{lc $all[2]}.          # month
        $all[3].                         # day
        $all[4].                         # time
        '-'.                             # -
        $all[8].                         # start/end
        ' ('.$all[9].' '.$all[10].')';   # ( usage size )

    push @{$servers{$all[7]}}, $rec;
}

## sort by server

for my $srv (sort keys %servers)
{
    print "\n\n$srv";
    my $nostart = "\n ";

    ## sort by year/month/day/time

    for (sort @{$servers{$srv}})
    {
        ## print results

        (/start/i .. /start/i) and (print "\n", $nostart = '');

        if ( /(\d)(\d\d)(\d\d)(.+)-start/i )
        {
            print ' '.
                ucfirst($num2day{$1}).' '.ucfirst($num2month{$2}).
                ' '.$3.' '.$4.' ';
        }
        elsif ( /(\d)(\d\d)(\d\d)(.+)-end (.*)/i )
        {
            print $nostart.
                '- '.ucfirst($num2day{$1}).' '.ucfirst($num2month{$2}).
                ' '.$3.' '.$4.' '.$5.' ';
        }
    }
}

__DATA__
src Fri May 25 00:13:49 EDT myserver1:Test1 network-1:Test1 End (2048 KB)
src Fri May 25 00:13:20 EDT myserver1:Test1 network-1:Test1 Start
src Sat May 26 00:15:20 EDT myserver2:Test2 network-2:Test2 Start
src Sat May 26 00:15:49 EDT myserver2:Test2 network-2:Test2 End (212048 KB)
src Sat May 26 00:15:50 EDT myserver2:Test2 network-2:Test2 End (212048 KB)
src Sat May 26 00:16:20 EDT myserver2:Test2 network-2:Test2 Start
src Sat May 26 00:16:22 EDT myserver2:Test2 network-2:Test2 Start
src Sat May 26 00:16:49 EDT myserver2:Test2 network-2:Test2 End (212048 KB)


dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata Start
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_phd Hercules:sm_fv_phd Start
dst Sun Nov 15 00:00:03 EST galaxy.fuqua.duke.edu:fv_students Hercules:sm_fv_students Start
dst Sun Nov 15 00:00:25 EST galaxy.fuqua.duke.edu:fv_students Hercules:sm_fv_students End (3368 KB)
dst Sun Nov 15 00:00:26 EST galaxy.fuqua.duke.edu:fv_phd Hercules:sm_fv_phd End (4528 KB)

dst Tue Nov 15 00:00:58 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata End (53664 KB)
dst Sun Nov 14 00:00:55 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata End (53664 KB)
dst Sun Nov 14 00:02:01 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata End (53664 KB)
dst Sun Nov 15 00:00:01 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata End (53664 KB)
dst Sun Nov 15 00:01:00 EST galaxy.fuqua.duke.edu:fv_servicedata Hercules:sm_fv_servicedata End (4445554 KB)
dst Wed Nov 13 00:00:13 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba End (6098287 KB)


dst Sun Nov 15 00:01:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2 Hercules:sm_esx_fc_nfs2 Request (Retry)
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_faculty Hercules:sm_fv_faculty Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_researchdata Hercules:sm_fv_researchdata Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba Start
dst Sun Nov 15 00:15:04 EST galaxy.fuqua.duke.edu:root Hercules:sm_galaxy_root Start
dst Sun Nov 15 00:15:13 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba End (1900 KB)
dst Sun Nov 15 00:15:18 EST galaxy.fuqua.duke.edu:fv_researchdata Hercules:sm_fv_researchdata End (1820 KB)
dst Sun Nov 15 00:15:32 EST galaxy.fuqua.duke.edu:root Hercules:sm_galaxy_root End (39128 KB)
dst Sun Nov 15 00:16:00 EST andromeda.fuqua.duke.edu:esx_fc_nfs2 Hercules:sm_esx_fc_nfs2 Request (Retry)

dst Tue Nov 12 00:15:04 EST galaxy.fuqua.duke.edu:fv_emba Hercules:sm_fv_emba Start
