I have a GC log file with entries like this one:
2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
(502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
secs]
I would like to parse this to output for easy plotting using gnuplot
and would like the following output:
2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
0.04, 0.02
I have tried with a command like this:
awk '{if($1~/[0-9]+/ && $2=="[GC" && $3=="[PSYoungGen:")printf("%s %s %s %s %s %s\n", $1,$2,$3,$4,$5,$6)}' gc_20091104_024256_psghlc301.log |
sed "s/[0-9][0-9]:.*GC \[PSYoungGen: /, /" | sed "s/K.*->/, /" |
sed "s/K.*(/, /" | sed "s/K)//"
but it jumps over several fields and gives me the following output:
2.7, 70850, 6800, 502464, 0.0165440
How can I set sed to not look at the last match ( "K(" ), but trigger
on the first match?
Thanks
Assuming the input is all on one line:
$ cat file
2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K),
0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]
$ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02
Ed.
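For anyone wondering why the one-liner above works, it can be unrolled into separate steps. This is only a sketch: the function name gc_to_csv is made up for illustration, and [^0-9.] is used in place of [^[:digit:].] on the assumption that some awks lack POSIX character classes.

```shell
# gc_to_csv is a hypothetical helper name, not something from the thread.
gc_to_csv() {
    awk '
    BEGIN { OFS = ", " }        # separator used when the record is rebuilt
    {
        gsub(/[^0-9.]/, " ")    # blank out everything except digits and dots
        $1 = $1                 # assigning to a field makes awk rebuild $0 with OFS
        print
    }' "$@"
}

printf '%s\n' '2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]' | gc_to_csv
```

The gsub() leaves the numbers separated by runs of blanks, awk's default field splitting collapses those runs, and the $1=$1 assignment is what actually joins the fields with ", ".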
Kaz's txr utility to the rescue.
txr -c '@(collect)
@num: [GC [PSYoungGen: @{size1}K->@{size2}K(@{size3}K)] @{size4}K->@{size5}K (@{size6}K), @secs secs] [Times: user=@utime sys=@systime, real=@realtime secs]
@(end)
@(output)
@(repeat)
@num, @size1, @size2, @size3, @size4, @size5, @size6, @secs, @utime, @systime, @realtime
@(end)
@(end)
' logfile
That's a beautiful solution! Now, there's a change in the log file
output. The first field is now a date and time stamp
2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170
secs]
2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
0.0043760 secs]
and applying this command
cat ${gclogfile}|sed 's/^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9][A-Z]:*//'|
sed 's/\.[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]:*//'|
awk -F: '{print (NR==1||(!$1&&$1!=p)?++c:c),$0;p=$1}'
generates the following output:
15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760
where the time stamp (15:00:16) shows up as 15, 00, 16. Is there a
way to have the output look like this:
15:00:16, 0.405, 2112, 750, 7680, 0.0204170
15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
Thanks!
"beautiful solution" discarded apparently!
> generates the following output:
>
> 15, 00, 16, 0.405, 2112, 750, 7680, 0.0204170
> 15, 00, 17, 0.527, 2862, 1010, 7680, 0.0043760
>
> where the time stamp (15:00:16) shows up as 15, 00, 16. Is there a
> way to have the output look like this:
>
> 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
> 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>
> Thanks!
Why do you keep going back to pipelines of cat, sed, and awk? If
you're going to use awk anyway, you don't need sed or cat.
Try this:
awk '{OFS=", "; t=substr($0,12,8); $0=substr($0,30);
gsub(/[[:digit:].]/," "); $1=$1; print t,$0}' file
Ed.
Thanks, Ed!
Well, I'm by no means a shell expert.
Running your command on the file, I get the following output:
15:00:16, :, [GC, K->, K(, K),, secs]
Are you sure you copy/pasted my script instead of retyping it?
Are you sure your input file is the same as you posted?
Look:
$ cat file
2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K),
0.0204170 secs]
2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
0.0043760 secs]
$ awk '{OFS=", "; t=substr($1,12,8); $0=substr($0,30); gsub(/[^[:digit:].]/," "); $1=$1; print t,$0}' file
15:00:16, 0.405, 2112, 750, 7680, 0.0204170
15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
Please post exactly the same commands and their output so we can see
where something's going wrong.
Ed.
Hint: check if you mistyped the gsub() as
gsub(/[[:digit:].]/," ")
instead of what I had:
gsub(/[^[:digit:].]/," ")
Note the "^".
Regards,
Ed.
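The effect of the missing "^" is easy to reproduce side by side. The sketch below is only an illustration: extract is a made-up helper that takes the bracket expression as its argument, and [0-9.] stands in for [[:digit:].] on the assumption that not every awk supports POSIX character classes.

```shell
line='2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K), 0.0204170 secs]'

# Hypothetical helper: the shell argument is the bracket expression given to gsub().
extract() {
    printf '%s\n' "$line" |
    awk -v re="$1" '{OFS=", "; t=substr($1,12,8); $0=substr($0,30); gsub(re," "); $1=$1; print t,$0}'
}

extract '[0-9.]'     # caret missing: the numbers themselves are blanked out
extract '[^0-9.]'    # caret present: everything except the numbers is blanked out
```

The first call keeps only the punctuation and letters, which is the symptom reported earlier in the thread; the second keeps only the numbers.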
My bad! I lost the ^ in the copy/paste.
Thanks, Ed!
> I have a GC log file with entries like this one:
>
> 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K
> (502464K), 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02
> secs]
>
> I would like to parse this to output for easy plotting using gnuplot
> and would like the following output:
>
> 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09,
> 0.04, 0.02
If you can live without the spaces:
tr -sc '0-9.' ,
or (since gnuplot won't mind):
tr -sc '0-9.' ' '
<snip>
--
Ben.
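One thing to watch with the tr approach on a multi-line log: with -c, newline is part of the complemented set too, so it is translated like any other non-digit and the records run together. A possible variation, offered as a sketch rather than as what Ben intended, is to add \n to the set of characters that are kept. The two sample lines are the shortened entries from later in the thread, minus the date stamp.

```shell
log='0.405: [GC 2112K->750K(7680K), 0.0204170 secs]
0.527: [GC 2862K->1010K(7680K), 0.0043760 secs]'

# As posted: the newline is replaced too, so both records end up on one line.
printf '%s\n' "$log" | tr -sc '0-9.' ','
echo

# With \n added to the kept characters, each record stays on its own line.
printf '%s\n' "$log" | tr -sc '0-9.\n' ','
```

Either way a separator is left at the end of each record, because the trailing " secs]" is translated as well.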
>> > > > > Assuming the input is all on one line:
>>
>> > > > > $ cat file
>> > > > > 2.729: [GC [PSYoungGen: 70850K->6800K(152896K)] 70850K->6800K (502464K),
>> > > > > 0.0165440 secs] [Times: user=0.09 sys=0.04, real=0.02 secs]
>>
>> > > > > $ awk '{OFS=", "; gsub(/[^[:digit:].]/," "); $1=$1}1' file
>> > > > > 2.729, 70850, 6800, 152896, 70850, 6800, 502464, 0.0165440, 0.09, 0.04, 0.02
This looks like the original code.
<snip>
>> > > Try this:
>>
>> > > awk '{OFS=", "; t=substr($0,12,8); $0=substr($0,30);
>> > >         gsub(/[[:digit:].]/," "); $1=$1; print t,$0}' file
>>
>> > >    Ed.
This looks like your amended code.
>>
>> > Thanks, Ed!
>>
>> > Well, I'm by no means a shell expert.
>>
>> > Running your command on the file, I get the following output:
>>
>> > 15:00:16, :, [GC, K->, K(, K),, secs]
>>
>> Are you sure you copy/pasted my script instead of retyping it?
>> Are you sure your input file is the same as you posted?
>>
>> Look:
>>
>> $ cat file
>> 2009-11-05T15:00:16.965-0600: 0.405: [GC 2112K->750K(7680K),
>> 0.0204170 secs]
>> 2009-11-05T15:00:17.087-0600: 0.527: [GC 2862K->1010K(7680K),
>> 0.0043760 secs]
>>
>> $ awk '{OFS=", "; t=substr($1,12,8); $0=substr($0,30); gsub(/[^[:digit:].]/," "); $1=$1; print t,$0}' file
>> 15:00:16, 0.405, 2112, 750, 7680, 0.0204170
>> 15:00:17, 0.527, 2862, 1010, 7680, 0.0043760
>>
>> Please post exactly the same commands and their output so we can see
>> where something's going wrong.
I could be wrong, but I noticed what I thought was a missing "^" in
your amended code. It could be that I only saw a munged reply in the
thread and missed the original reply.
I actually stopped for a couple of minutes when I saw your amended code
because I couldn't figure out how it worked and I typically learn
something nearly every time I read your code. Local events overcame my
studies and I never got to try it out. My point here is not to call
out others' errors; I thought my knowledge was leaking through a hole
and your response actually put a finger in the hole! I wanted to say
thanks.
As I get older I can't distinguish between senior moments and actual
ignorance. The only bright side is that I can enjoy old movies.g
ruby -ne'puts [$_[11,8], $_[30..-1].scan(/[\d.]+/)].join(", ")' file
=== output ===
You're right, looks like I did drop the "^" in one of my posts.
Ed.
sed -e 's/[^0-9.]/ /g;s/  */ /g;s/^ //;s/ $//;s/ /, /g'
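The sed one-liner is easier to follow one expression at a time. The sketch below is an unrolled version under two assumptions: the function name numbers_only is invented for illustration, and the second expression is written with two spaces before the *, since a single space followed by * would also match the empty string.

```shell
# numbers_only is a hypothetical name; the five steps mirror the one-liner.
# 1. replace every character that is not a digit or a dot with a space
# 2. squeeze each run of spaces down to a single space
# 3. drop a leading space
# 4. drop a trailing space
# 5. turn each remaining space into ", "
numbers_only() {
    sed -e 's/[^0-9.]/ /g' \
        -e 's/  */ /g' \
        -e 's/^ //' \
        -e 's/ $//' \
        -e 's/ /, /g' "$@"
}

printf '%s\n' '0.405: [GC 2112K->750K(7680K), 0.0204170 secs]' | numbers_only
```

Unlike the awk version, nothing here rebuilds fields; the separator is produced purely by textual substitution, which is why the leading and trailing spaces have to be trimmed explicitly.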
You could do this in one go:
perl -lne '$,=", ";print/\d+[.]?(?:\d+)?|[.]\d+/g' yourfile
perl -lpe '$"=", ";$_="@{[/\d+[.]?(?:\d+)?|[.]\d+/g]}"' yourfile
--Rakesh