Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.
Dismiss

Stange field mangling behavior - Linux

2 views
Skip to first unread message

Stephen

unread,
Jul 23, 2008, 4:38:41 PM7/23/08
to
My company uses a bourne script to get performance values. Problems
are coming from iostat (-d & -dxk) on Linux. The script is
automatically installed on many UNIX/Linux machines which cannot
(company policy) have gawk/awk/whatever upgraded. On some machines it
outputs managled fields.

Fields are separated by commas.
Bad Record Examples:
C,2008071200807161745,Disk,Physical Writes/sec,sdaw,
10000000,THOG2QU2,0.00,900,0.00
C,07161745,Disk,Block Writes/sec,sdqv,
10000000,THOG2QU2,0.00,900,0.00
C,200807161745,Disk,Quions/
sec,vpathcc1,10000000,THOG2QU2,0.28,900,0.28
Good Record Examples:
C,200807161745,Disk,Physical Writes/sec,sdfb,
10000000,THOG2QU2,0.00,900,0.00
C,200807161745,Disk,Block Writes/sec,sdgm,
10000000,THOG2QU2,7.34,900,7.34
C,200807161745,Disk,Queue Length,sdz,
00000000,THOG2QU2,0.00,900,0.00

The weird thing is field 2 comes from `date -u +"%Y%m%d%H%M"` and
field 4 is hard coded string "Queue Length". More fields have been
mangled but those are the ones I could find this go around. Now this
wouldn't be weird if it happened all the time but by and large it
doesn't happen that often. Some 1200 disks on that specific machine
and 3 records are bad.
The symptoms resemble
http://groups.google.com/group/comp.lang.awk/browse_thread/thread/2711ad3c63b6aaa0/5613113ae5dbde16?lnk=gst&q=mangle#5613113ae5dbde16.
However, my script is not using an arcane syntax.
Any help to figure out what could be causing this problem would be
appreciated. And to reiterate I need to solve this in script not by
using different software. Thanks in advance

Any who here the offending part of my script:

# iostat -d | -dxk processing for AIX and linux.
func_iostat() {
iostat -$1 $sampsize `expr $freq + $sampleOffset` |\
awk -v offset=$sampleOffset -v opsys=$opsys -v samp=$sampsize -v
sys="$sysname" -v ts=$timestamp '
function cRecord(cname,ctype,val) { print "C", ts, "Disk", cname,
inst, ctype, sys, val, samp, val }
BEGIN { OFS=","; }
NF<=4 || /System configuration:/ { next; }
/Device:|Disks:/ { # Header
gsub(/% tm_act/,"%tm_act",$0) # AIX was dumb and placed a
space in a header field
split($0,a); next;
}
/"Disk history since boot not available."/ { useFirstSample=1;
next; } # Aix might NOT show the first sample
offset==0 { useFirstSample=1; } # Aix 5.3+ doesnt output the first
sample
useFirstSample!=1 && dev[$1]!=1 { dev[$1]=1; next; } # first
sample skip it
{ # Not the first sample thus we process it
physical_kb_sec=0; # Zero out calculated var
for(i=1; i<=NF; i++) {
if(a[i]~/Device:|Disks:/) inst=$i;
if (opsys=="AIX") {
# AIX d processing. Dont use tps,%tm_act because we
get it from %busy
if(a[i]=="Kbps") cRecord("Physical KB/sec",
"00000000", $i)
if(a[i]=="Kb_read") cRecord("Read KB", "00000000", $i)
if(a[i]=="Kb_wrtn") cRecord("Write KB", "00000000",
$i)
} else {
# Linux d processing
if(a[i]=="tps") cRecord("Physical Operations/
sec", "10000000", $i)
if(a[i]=="Blk_read/s") cRecord("Block Reads/sec",
"10000000", $i)
if(a[i]=="Blk_wrtn/s") cRecord("Block Writes/sec",
"10000000", $i)
if(a[i]=="Blk_read") cRecord("Blocks Read",
"00000000", $i)
if(a[i]=="Blk_wrtn") cRecord("Blocks Written",
"00000000", $i)
# Linux dxk processing. Unmapped: rrqm/s, wrqm/s,
avgrq-sz, await
if(a[i]=="r/s") cRecord("Physical Reads/sec",
"10000000", $i)
if(a[i]=="w/s") cRecord("Physical Writes/sec",
"10000000", $i)
if(a[i]=="rkB/s") cRecord("Physical Read KB/
sec", "10000000", $i)
if(a[i]=="wkB/s") cRecord("Physical Write KB/
sec", "10000000", $i)
if(a[i]=="avgqu-sz") cRecord("Queue Length",
"00000000", $i)
if(a[i]=="svctm") cRecord("Service Time",
"00000000", $i)
if(a[i]=="%util") cRecord("% Busy Time",
"20000000", $i)
if(a[i]=="rkB/s" || a[i]=="wkB/s") {
flag_PhysicalKbSec=1; # Only log metric if its
parents exist
physical_kb_sec += $i;
}
}
}
if(flag_PhysicalKbSec==1) cRecord("Physical KB/sec",
"10000000", physical_kb_sec)
} ' >> $logname 2>> $uxpalog
}

sampleOffset=1
opsys=`uname`
sampsize=10
sysname=`uname -n`
timestamp=`date -u +"%Y%m%d%H%M"`
logname=test.log
uxpalog=test.err

func_iostat d &
func_iostat dxk &

Grant

unread,
Jul 23, 2008, 7:45:29 PM7/23/08
to
On Wed, 23 Jul 2008 13:38:41 -0700 (PDT), Stephen <stephen.mic...@gmail.com> wrote:

>My company uses a bourne script to get performance values. Problems
>are coming from iostat (-d & -dxk) on Linux. The script is
>automatically installed on many UNIX/Linux machines which cannot
>(company policy) have gawk/awk/whatever upgraded. On some machines it
>outputs managled fields.

Just like your whitespace mangled post? :o)

Grant.
--
http://bugsplatter.mine.nu/

Ed Morton

unread,
Jul 23, 2008, 11:50:31 PM7/23/08
to
On 7/23/2008 3:38 PM, Stephen wrote:
> My company uses a bourne script to get performance values. Problems
> are coming from iostat (-d & -dxk) on Linux. The script is
> automatically installed on many UNIX/Linux machines which cannot
> (company policy) have gawk/awk/whatever upgraded. On some machines it
> outputs managled fields.
<snip>

>
> Any who here the offending part of my script:
>
> # iostat -d | -dxk processing for AIX and linux.
> func_iostat() {
> iostat -$1 $sampsize `expr $freq + $sampleOffset` |\
> awk -v offset=$sampleOffset -v opsys=$opsys -v samp=$sampsize -v
> sys="$sysname" -v ts=$timestamp '
<snip>
> } ' >> $logname 2>> $uxpalog

NOTE The above ^^^^^^

> }
>
<snip>


> logname=test.log
> uxpalog=test.err
>
> func_iostat d &
> func_iostat dxk &

NOTE The above ^^^^^^

You're running the 2 functions simultaneously in the background with both of
them writing to the same file simultaneously, hence the mangling. Either use
separate log files or run the functions sequentially (e.g. by having them both
run in the foreground).

This is an issue with your OS rather than with awk and so is OT here so you
should follow up at comp.unix.shell or similair if you have questions.

Ed.

Stephen

unread,
Jul 24, 2008, 8:52:21 AM7/24/08
to
> should follow up at comp.unix.shell or similar if you have questions.
>
>         Ed.

But I am also running many many other processes at the same time all
outputting to the log file... Though I will acknowledge this is the
only instance of the same function being called twice. But why if that
is the case does this problem only present itself so infrequently to
only be seen 2 out of 1200 times? Sorry I am debating like this but I
am not in a test friendly environ so the theory has to be there. Both
iostats must run at the same time and I don't want the problem of
tracking an output file for them.

Ed Morton

unread,
Jul 24, 2008, 9:20:13 AM7/24/08
to
On 7/24/2008 7:52 AM, Stephen wrote:
> On Jul 23, 11:50 pm, Ed Morton <mor...@lsupcaemnt.com> wrote:
<snip>

>>You're running the 2 functions simultaneously in the background with both of
>>them writing to the same file simultaneously, hence the mangling. Either use
>>separate log files or run the functions sequentially (e.g. by having them both
>>run in the foreground).
>>
>>This is an issue with your OS rather than with awk and so is OT here so you
>>should follow up at comp.unix.shell or similar if you have questions.
>>
>> Ed.
>
>
> But I am also running many many other processes at the same time all
> outputting to the log file... Though I will acknowledge this is the
> only instance of the same function being called twice. But why if that
> is the case does this problem only present itself so infrequently to
> only be seen 2 out of 1200 times?

Luck (good or bad).

> Sorry I am debating like this but I
> am not in a test friendly environ so the theory has to be there. Both
> iostats must run at the same time and I don't want the problem of
> tracking an output file for them.

Like I said, this is an OS issue, not an awk one so your best chance of getting
the right, precise answer is in an OS-specific NG like comp.unix.shell. You
might get someone apparently answering it here, but it might be the wrong answer
as this isn't where most of the OS experts hang out.

Ed.

Stephen

unread,
Jul 30, 2008, 10:11:10 AM7/30/08
to
Forgive me for posting again but I just wanted to document the fix for
anyone finding this post.

It indeed was that I was running the function twice in the background.
The code:


func_iostat d &
func_iostat dxk &

was the culprit. I changed func_iostat to take an output file as arg1,
and to output straight to logname (the variable). Then:
iostat -d $sampsize `expr $freq + $sampleOffset` > iostat_d.tmp &
iostat -dxk $sampsize `expr $freq + $sampleOffset` > iostat_x.tmp
&
wait
func_iostat iostat_d.tmp
func_iostat iostat_x.tmp

Works like a champ. Thanks anyways for the responses!

0 new messages