The problem: read fixed-length records (for example, 80 characters each)
that have no record separator.
Thank you in advance.
Christian
You can use dd to unblock the records. With conv=unblock, dd strips the
trailing spaces from each cbs-sized record and appends a newline:
$ cat block
012345678 012345678 012345678 012345678 012345678 012345678 ...
(240 bytes long)
$ dd if=block cbs=10 conv=unblock
0+1 records in
0+1 records out
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
012345678
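For the 80-character case in the question, the same idea might be wrapped up
roughly as follows. This is only a sketch: the function name unblock_records,
the file name records.dat and the 8-byte demo records are made up for
illustration, and 2>/dev/null is there just to hide dd's "records in/out"
statistics.

```shell
# Turn fixed-length records into newline-terminated lines.
# $1 = input file, $2 = record length in bytes (both chosen by the caller).
unblock_records() {
    dd if="$1" cbs="$2" conv=unblock 2>/dev/null
}

# Hypothetical sample: three 8-byte records, space padded, no separators.
printf 'alpha   bravo   charlie ' > records.dat

# Once the records are lines, ordinary awk can process them.
unblock_records records.dat 8 | awk '{ printf "%d: [%s]\n", NR, $0 }'
```

On the sample above this prints "1: [alpha]", "2: [bravo]" and "3: [charlie]".
For real 80-byte records you would pass 80 as the second argument.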
--
Dan Mercer
dame...@mmm.com
Opinions expressed herein are my own and may not represent those of my employer.
I have not.
> The problem: read fixed length records (for example 80 chars) without
> record separator.
Awk is a text processing language. Use Perl or C for binary data.
Just curious... Are you SURE these are binary files? Can you describe
their binary-ness exactly? From your brief problem description, I
infer that you have fixed-length data records and I suspect that
this data is partly corrupted with NULs. If this is the case, you
could use tr to fix the corruption, dd to turn the fixed-length
data records into one-line-per-record ASCII text files, then awk
to parse the records. But this is a wild guess.
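If the guess is right, the tr / dd / awk pipeline could be sketched like
this. The names clean_and_unblock and corrupt.dat, and the 6-byte record
length, are invented for the example; it also assumes the stray NULs should
simply become spaces.

```shell
# Replace NUL bytes with spaces, then cut the stream into one line per record.
# $1 = input file, $2 = record length in bytes (both chosen by the caller).
clean_and_unblock() {
    tr '\000' ' ' < "$1" | dd cbs="$2" conv=unblock 2>/dev/null
}

# Hypothetical sample: two 6-byte records, the first padded with NULs.
printf 'ab\000\000\000\000cd    ' > corrupt.dat

# awk now sees plain text lines and can parse them as usual.
clean_and_unblock corrupt.dat 6 | awk '{ print NR ": " $1 }'
```

On the sample this prints "1: ab" and "2: cd".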
--
Jim Monty
mo...@primenet.com
Tempe, Arizona USA
Standard awk is not the right tool for this purpose.
On the other hand, gawk handles binary files well, and without introducing
a new, awful language in the process. For your particular purpose,
there is a special variable called FIELDWIDTHS, which holds a
space-separated list of integers. If you set it, gawk stops splitting each
record on FS and instead cuts it into fixed-width fields of the given
widths. Records are still delimited by RS, so a file with no newlines
arrives as a single record.
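A minimal sketch of FIELDWIDTHS on plain text, assuming gawk is installed and
the input has one record per line. The function name split_fixed and the
4/2/6 column layout are hypothetical. LC_ALL=C is set so that the widths are
counted in bytes rather than multibyte characters.

```shell
# Split each input line into three fixed-width fields: 4, 2 and 6 bytes wide.
split_fixed() {
    LC_ALL=C gawk 'BEGIN { FIELDWIDTHS = "4 2 6" }
                   { printf "%s|%s|%s\n", $1, $2, $3 }' "$@"
}

# Example: printf '2024ABwidget\n1999XYgadget\n' | split_fixed
# prints "2024|AB|widget" and then "1999|XY|gadget".
```

Note that no FS is involved: the fields are cut purely by position.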
For instance, according to /usr/include/elf.h, an ELF executable starts
with a 4-byte magic number, a byte giving the class of machine, a byte
giving the data format, a byte giving the ELF version, a byte of padding,
8 bytes of brand information, and various other crap. You could set
FIELDWIDTHS in a BEGIN block,
BEGIN { FIELDWIDTHS = "4 1 1 1 1 8" }
and then parse this much of the file header with
FNR == 1 {
    if ($1 == "\177ELF") print "elf file"
    else { print "not elf file"; nextfile }
    if ($2 == "\001") print "32-bit"
    else if ($2 == "\002") print "64-bit (2x as good as 32-bit)"
    else print "you've got some odd kind of bits here"
    if ($3 == "\001") print "little endian"
    else if ($3 == "\002") print "big endian"
    else print "medium endian"
    if ($4 == "\001") print "this is the only elf version I know about"
    else print "you've got a wacky elf version going here"
    print "brand", $6
    nextfile
}
[note that nextfile is a gawk language extension]
--
Patrick TJ McPhee
East York Canada
pt...@interlog.com