ksh awk regexp help?


Husam

Jun 19, 2011, 8:21:57 AM
to Jo...@googlegroups.com
Hello JoLug,

I need help, in any scripting language, with the following task.

I have an input file in the following format:

0000FILE_HEADER
0001DATA_TYPE_ONE_R01
0001DATA_TYPE_ONE_R02
0001DATA_TYPE_ONE_R03
0001DATA_TYPE_ONE_R04
0001DATA_TYPE_ONE_R05
0001DATA_TYPE_ONE_R06
0002DATA_TYPE_TWO_R01
0002DATA_TYPE_TWO_R02
0002DATA_TYPE_TWO_R03
0002DATA_TYPE_TWO_R04
0002DATA_TYPE_TWO_R05
0003DATA_TYPE_THREE_R01
0003DATA_TYPE_THREE_R02
0003DATA_TYPE_THREE_R03
0003DATA_TYPE_THREE_R04
0003DATA_TYPE_THREE_R05
0004FILE_END

The field separator, shown as |FS|, is the non-printable character 0x1C (ASCII FS).
The maximum line length after joining records is 115 characters.

The result should be:

0000FILE_HEADER
0001DATA_TYPE_ONE_R01|FS|DATA_TYPE_ONE_R02|FS|DATA_TYPE_ONE_R03|FS|DATA_TYPE_ONE_R04|FS|DATA_TYPE_ONE_R05
0001DATA_TYPE_ONE_R06
0002DATA_TYPE_TWO_R01|FS|DATA_TYPE_TWO_R02|FS|DATA_TYPE_TWO_R03|FS|DATA_TYPE_TWO_R04|FS|DATA_TYPE_TWO_R05
0003DATA_TYPE_THREE_R01|FS|DATA_TYPE_THREE_R02|FS|DATA_TYPE_THREE_R03|FS|DATA_TYPE_THREE_R04|FS|DATA_TYPE_THREE_R05
0004FILE_END



--
Husam Habannakeh

+971 505 516 489 Dubai
+962 777 656 086 Amman
+966 561 154 798 Riyadh


Yaman Saqqa

Jun 19, 2011, 1:21:04 PM
to jo...@googlegroups.com
Try this one, bearing in mind:

  • It is not perfect or optimized
  • It takes data from stdin and writes to stdout
  • It assumes a single data value does not itself exceed 115 characters

#!/usr/bin/perl
use strict;
use warnings;

my $sep = "\x1C";   # ASCII FS; "\x001C" would be a NUL byte followed by "1C"
my $max = 115;      # maximum length of a joined line, prefix included
my (@order, %r);

# Collect the data fields of each record type, keeping first-seen order.
while (my $line = <STDIN>) {
    chomp $line;
    my ($r_num, $r_data) = $line =~ m/^([0-9]{4})(.*)$/ or next;
    push @order, $r_num unless exists $r{$r_num};
    push @{ $r{$r_num} }, $r_data;
}

# Print each record type, starting a new line whenever the next field
# would push the current line past $max.
for my $r_num (@order) {
    my $out = '';
    for my $field (@{ $r{$r_num} }) {
        if ($out eq '') {
            $out = $r_num . $field;
        } elsif (length($out) + length($sep) + length($field) <= $max) {
            $out .= $sep . $field;
        } else {
            print "$out\n";
            $out = $r_num . $field;
        }
    }
    print "$out\n";
}


Hope it helps.

Yaman

@abulyomon

--
### Jordan Linux Users Group ###
http://Jolug.org/
http://groups.google.com/group/Jolug
 
### Ubuntu Jordan LoCo Team ###
https://wiki.ubuntu.com/JordanTeam
http://lists.ubuntu.com/ubuntu-jo
 
### Ojuba Linux ###
http://ojuba.org/
 
### Jordan PHP ###
http://groups.google.com/group/JoPHP



Husam

Jun 20, 2011, 1:37:49 AM
to jo...@googlegroups.com
Thanks Mohammad and Yaman

I will try this and report back

maq...@gmail.com

Jun 20, 2011, 3:22:09 AM
to jo...@googlegroups.com
Hi,
I tried it in bash and it seems to work for me:

SEP=$(printf '\034')   # ASCII FS (0x1C); "\01C" would be octal 01 followed by "C"
MAX=115
LINE=""
while IFS= read -r l; do
    DATA_TYPE_NEW=${l:0:4}   # 4-digit record prefix
    DATA=${l:4}              # record data without the prefix
    if [ -n "$LINE" ] && [ "$DATA_TYPE" = "$DATA_TYPE_NEW" ] &&
       [ $(( ${#LINE} + ${#SEP} + ${#DATA} )) -le "$MAX" ]; then
        # Same record type and the result still fits: append the data only.
        LINE="${LINE}${SEP}${DATA}"
    else
        # New record type, or the line is full: flush and start again.
        [ -n "$LINE" ] && printf '%s\n' "$LINE"
        DATA_TYPE=$DATA_TYPE_NEW
        LINE=$l
    fi
done < FILE
# Flush the last pending line.
[ -n "$LINE" ] && printf '%s\n' "$LINE"
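
A note on the separator itself: 0x1C is a single control byte, and its octal form is 034. A small check along these lines (the variable name fs is only illustrative, and it assumes od is installed) can confirm that the shell really produces that one byte before it goes into a script:

```shell
# The FS control character is hex 1C, which is octal 034 (decimal 28).
# POSIX printf understands octal escapes, so this yields exactly one byte.
fs=$(printf '\034')   # "fs" is just an illustrative variable name

# Dump the byte in hex to confirm what was produced.
printf '%s' "$fs" | od -An -tx1
```

If the dump shows more than one byte, the escape was not interpreted the way the script expects.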


Mohammed Ameen Al-Qudah



Husam

Jun 23, 2011, 3:45:04 AM
to jo...@googlegroups.com
Thanks, everybody.

I managed to fix a Perl script based on what you gave me.
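
The fixed script was not posted to the thread. For readers who want the awk route the subject line asks about, here is a minimal sketch of the joining rule from the original question. It is not Husam's script: the function name join_records and its two optional arguments (separator, maximum length) are made up for illustration, and it assumes that every line starts with a 4-character prefix and that records of the same type are consecutive.

```shell
# join_records [SEP] [MAX]
# Reads records on stdin and joins consecutive lines that share the same
# 4-character prefix, as long as the joined line stays within MAX characters.
# SEP defaults to the single byte 0x1C, MAX to 115 (both hypothetical defaults
# taken from the question, not from any posted script).
join_records() {
    sep=${1-$(printf '\034')}
    max=${2-115}
    awk -v sep="$sep" -v max="$max" '
    {
        id   = substr($0, 1, 4)   # record type prefix
        data = substr($0, 5)      # payload after the prefix
        if (NR > 1 && id == cur && length(line) + length(sep) + length(data) <= max) {
            line = line sep data  # same type and still fits: append payload only
        } else {
            if (NR > 1) print line  # flush the previous output line
            line = $0               # start a new line, prefix included
            cur  = id
        }
    }
    END { if (NR > 0) print line }  # flush the last pending line
    '
}
```

It would be called as `join_records < input.txt > output.txt` (file names hypothetical). With the literal four-character placeholder `|FS|` passed as the separator, the sketch reproduces the sample result from the question. With the real one-byte separator, the six DATA_TYPE_ONE records total 111 characters and would all fit on one line.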