C-struct style data reads?

Paul

Apr 23, 2003, 11:11:20 AM

to perl6-l...@perl.org

Ok, perhaps the one thing I've missed most in P5 that you get in C:

struct {
    int someInt;
    float someFloat;
    char strdata[42];
    float otherFloat;
    char moreStr[123];
} buf;
fread(&buf,sizeof buf,1,fp);
printf("%.0f %s\n",buf.someFloat,buf.moreStr);

To get the same thing in P5:

@buf{ qw/ someInt someFloat strData otherFloat moreStr / } =
unpack "ifa42fa123", <$fp>;
printf "%.0f %s\n", @buf{qw/ someFloat moreStr /};

Accounting for the quick knock-off code, the lack of detail and
stricture, etc., I admit that there are ways to make it more readable,
but it's still a bit ugly. The main thing is the binary unpacking; the
format is, well, icky. This isn't exactly horrible, but is P6 going to
have a native binary (read-and-parse-to-struct)'ish idiom? Since we
have types now, it should be reasonably easy to build an object that
represents data as stored in a binary file, like the struct above.
Maybe unpack() can accept a struct-ish layout now that we have types?

Hmm....

Class Foo {
    method new ($class, Str $.source) {
        has int $.someInt is target;       # maybe order could be
        has num $.someFloat is target;     # somehow significant
        has str $.strdata[42] is target;   # for these?
        has num $.otherFloat is target;    # supposing "is target"
        has str $.moreStr[123] is target;  # makes it a target for
        has IO .file;                      # unpacking to the object?
        open .file: $.source or die $!;
    } # I assume a constructor always returns the new object?

    method nextrec ($me) {
        unpack $me, <.file>;
    }
}

I couldn't remember the initialization method name (build? new? :), so
I sort of rolled it into the new(). I know there are flaws here....

But this assumes that unpack() knows how to look at the object (notice
I didn't say C<unpack $me:> because it isn't a method call) to
determine which attributes are targets for data from the file, and how
to interpret the file data accordingly.

As an aside (sorry, Piers), is $class passed in? How are constructors
inherited? Since I don't bless() things anymore, does the compiler
automatically bless the returned object into the invocant class?

Boy, LOTS to learn. ~sigh~ :)

Luke Palmer

Apr 23, 2003, 11:49:55 AM

to Hod...@writeme.com, perl6-l...@perl.org

>
> Ok, perhaps the one thing I've missed most in P5 that you get in C:
>
> struct {
> int someInt;
> float someFloat;
> char strdata[42];
> float otherFloat;
> char moreStr[123];
> } buf;
> fread(&buf,sizeof buf,1,fp);
> printf("%.0f %s\n",buf.someFloat,buf.moreStr);
>
> To get the same thing in P5:
>
> @buf{ qw/ someInt someFloat strData otherFloat moreStr / } =
> unpack "ifa42fa123", <$fp>;
> printf "%.0f %s\n", @buf{qw/ someFloat moreStr /};
>
> Accounting for the quick knock-off code, the lack of detail and
> stricture, etc., I admit that there are ways to make it more
> readable, but it's still a bit ugly.

No kidding.

> The main thing is the binary unpacking; the format is, well,
> icky. This isn't exactly horrible, but is P6 going to have a native
> binary (read-and-parse-to-struct)'ish idiom? Since we have types
> now, it should be reasonably easy to build an object that represents
> data as stored in a binary file, like the struct above. Maybe
> unpack() can accept a struct-ish layout now that we have types?
>
> Hmm....
>
> Class Foo {
> method new ($class, Str $.source) {
> has int $.someInt is target; # maybe order could be
> has num $.someFloat is target; # somehow significant
> has str $.strdata[42] is target; # for these?
> has num $.otherFloat is target; # supposing "is target"
> has str $.moreStr[123] is target; # makes it a target for
> has IO .file; # unpacking to the object?
> open .file: $.source or die $!;
> } # I assume a constructor always returns the new object?
>
> method nextrec ($me) {
> unpack $me, <.file>;
> }
> }

This would be a good test for Perl6's introspection capabilities.

And I hope you mean:

has str $.strdata is size(42) is target;

Don't you dare try to pull C-style fixed size char array stuff here. :)

There are issues with alignment and endianness, etc. I presume they
could be solved with traits.
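
A note on the endianness point: one common way to sidestep host byte order is to assemble integers from individual bytes in a stated order. The C below is only a sketch under that assumption; `read_u32_be` and `read_u32_le` are hypothetical names, not part of anything proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a 32-bit unsigned integer stored
 * most-significant byte first (big-endian), whatever the host order is. */
uint32_t read_u32_be(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: the same four bytes read least-significant
 * byte first (little-endian). */
uint32_t read_u32_le(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

A trait such as the "is BigEnd" idea floated later in the thread would presumably have to select between decoders of this kind for each field.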

> I couldn't remember the initialization method name (build? new? :), so
> I sort of rolled it into the new(). I know there are flaws here....

It appears to be BUILD.

> But this assumes that unpack() knows how to look at the object (notice
> I didn't say C<unpack $me:> because it isn't a method call) to
> determine which attributes are targets for data from the file, and how
> to interpret the file data accordingly.
>
> As an aside (sorry, Piers), is $class passed in?

When you define C<new>, $class is the invocant. Most often, you won't
define C<new>, you'll define C<BUILD>, in which case a constructed
object is ready for you in the invocant (C<new> looks, perl5 style,
something like this:)

    method Object::new($class, *@args) {
        my $self = bless NewOpaqueObjectContainer => $class;
        $self.BUILD(*@args)
    }

> How are constructors inherited?

They're usually C<submethod>s, so they're usually not. It's probably
default that all base constructors are called before the current one,
which is why you wouldn't want them to be inherited, lest they be
called twice for the same object.

> Since I don't bless() things anymore, does the compiler
> automatically bless the returned object into the invocant class?

It blesses it before it gets to C<BUILD>, if bless is even a valid
concept anymore.

> Boy, LOTS to learn. ~sigh~ :)

Better to learn it now than when everybody else is going to: when it's
released.

Luke

Paul

Apr 23, 2003, 12:32:20 PM

to Luke Palmer, perl6-l...@perl.org

--- Luke Palmer <fibo...@babylonia.flatirons.org> wrote:
> >
> > Ok, perhaps the one thing I've missed most in P5 that you get in C:
> >
> > struct {
> > int someInt;
> > float someFloat;
> > char strdata[42];
> > float otherFloat;
> > char moreStr[123];
> > } buf;
> > fread(&buf,sizeof buf,1,fp);
> > printf("%.0f %s\n",buf.someFloat,buf.moreStr);
> >
> > To get the same thing in P5:
> >
> > @buf{ qw/ someInt someFloat strData otherFloat moreStr / } =
> > unpack "ifa42fa123", <$fp>;
> > printf "%.0f %s\n", @buf{qw/ someFloat moreStr /};
> >
> > Accounting for the quick knock-off code, the lack of detail and
> > stricture, etc., I admit that there are ways to make it more
> > readable, but it's still a bit ugly.
>
> No kidding.

I rest my case, lol....

Even after adding six lines of "clarification" code and comments, it'd
still be ugly.

> > The main thing is the binary unpacking; the format is, well,
> > icky. This isn't exactly horrible, but is P6 going to have a native
> > binary (read-and-parse-to-struct)'ish idiom? Since we have types
> > now, it should be reasonably easy to build an object that
> > represents data as stored in a binary file, like the struct above.
> > Maybe unpack() can accept a struct-ish layout now that we have
> > types?
> >
> > Hmm....
> >
> > Class Foo {
> > method new ($class, Str $.source) {
> > has int $.someInt is target; # maybe order could be
> > has num $.someFloat is target; # somehow significant
> > has str $.strdata[42] is target; # for these?
> > has num $.otherFloat is target; # supposing "is target"
> > has str $.moreStr[123] is target; # makes it a target for
> > has IO .file; # unpack to the object?
> > open .file: $.source or die $!;
> > } # I assume a constructor always returns the new object?
> >
> > method nextrec ($me) { unpack $me, <.file>; }
> > }
>
> This would be a good test for Perl6's introspection capabilities.

Yup.

> And I hope you mean:
>
> has str $.strdata is size(42) is target;
>
> Don't you dare try to pull C-style fixed size char array stuff here.
> :)

LOL!!! Absolutely!
Actually, I just copied the struct and forgot to remove the bracket
stuff. I wasn't even worried about adding size constraints back in, but
thanks, you solved that one, too. :)

I usually wouldn't actively *want* the size constraint, but sometimes
you *need* it.

> There are issues with alignment and endianness, etc. I presume they
> could be solved with traits.

Hmm, hadn't thought of those, either.
Though in my mind I was working with native data as a simple case.

I presume there might be traits to specify?

has int $.someInt is BigEnd is doubleword is target; # ??

Then again, maybe it's just

has BigEndDoubleWord $.someInt is target;

and could be farmed out to a class/module that need not bloat the "core
language" code, if that actually means anything. I *do* believe in
streamlining. :)

The thing I don't see as easily done with an external class is adding
"is target" functionality, unless that's just a simple boolean
trait.... or maybe it makes the parser enumerate them as it
compiles....

Maybe it would be better to say

has BigEndDoubleWord $.someInt is target(1);

and enumerate them manually? Feels kludgy, but it would allow that to
be more easily abstracted. The problem is still unpack(), though. Yet
again, hmm.... Of course, if the bytecode is built in a way that makes
the order of declaration apparent it's a nonissue, though we'd still
need a way to designate which traits are datafile and which are other.

Then there's the different behavior of unpack(). Maybe one could wrap()
it in a way that introspects a layout object. If it's just a simple
Str, do what you always do, but if it's a more complex object, dig
around in its innards for "target" attr's, and build the layout from
them. That does introduce another bug(ger), though: suppose your layout
should be "a7x5if3xa". The x means "skip this many bytes", right?

Ok, I *don't* think you should have to declare "has $.dummy1 is
size(5)", but I don't really know if it's worth yet another keyword. Is
there another reasonable option? Maybe

has undef is size(5); # ?

That's not too entirely onerous, is it? It maps back to the logic of

($x,undef, $y, $z, undef, $rest) = foo();

Comments?
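
To picture the skip idea, here is a rough C sketch in which a record layout is a table of fields and a "skip" entry takes up space without producing a value, much like an `undef` slot in a list assignment. Everything in it (`struct field`, `layout_size`, `field_offset`, `field_get`) is a hypothetical name made up for illustration, and it assumes native byte order with no implicit padding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Kinds of entry in a record layout. K_SKIP occupies bytes in the
 * record but yields no value. */
enum kind { K_INT, K_FLOAT, K_STR, K_SKIP };

struct field {
    enum kind kind;
    size_t    len;   /* bytes this entry occupies in the record */
};

/* Total record length, derived from the layout table alone. */
size_t layout_size(const struct field *f, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += f[i].len;
    return total;
}

/* Byte offset of entry idx: the combined length of the entries before
 * it, skipped gaps included. */
size_t field_offset(const struct field *f, size_t n, size_t idx)
{
    assert(idx <= n);
    return layout_size(f, idx);
}

/* Copy entry idx out of the raw record into dst. Returns the number of
 * bytes copied, or 0 for a skip entry (nothing is written to dst). */
size_t field_get(const struct field *f, size_t n, size_t idx,
                 const unsigned char *rec, void *dst)
{
    assert(idx < n);
    if (f[idx].kind == K_SKIP)
        return 0;
    memcpy(dst, rec + field_offset(f, n, idx), f[idx].len);
    return f[idx].len;
}
```

With a table like { K_STR 7, K_SKIP 5, K_INT, K_FLOAT }, the int sits at offset 12 without any dummy attribute being declared, and the record length falls out of the same table.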

===

The rest is new/BUILD stuff.

> > I couldn't remember the initialization method name (build? new? :),
> > so I sort of rolled it into the new(). I know there are flaws
> > here....
>
> It appears to be BUILD.

Yeah.

> > But this assumes that unpack() knows how to look at the object
> > (notice I didn't say C<unpack $me:> because it isn't a method
> > call) to determine which attributes are targets for data from
> > the file, and how to interpret the file data accordingly.
> >
> > As an aside (sorry, Piers), is $class passed in?
>
> When you define C<new>, $class is the invocant. Most often,
> you won't define C<new>, you'll define C<BUILD>, in which case
> a constructed object is ready for you in the invocant (C<new>
> looks, perl5 style, something like this:)
>
> method Object::new($class, *@args) {
> my $self = bless NewOpaqueObjectContainer => $class;
> $self.BUILD(*@args)
> }
>
> > How are constructors inherited?
>
> They're usually C<submethod>s, so they're usually not. It's probably
> default that all base constructors are called before the current one,
> which is why you wouldn't want them to be inherited, lest they be
> called twice for the same object.

Again, just for clarity -- then I don't usually even need to define a
new() unless it's doing something strange, right? Just BUILD(), which
will customize this particular class's object instantiations?

And if this looks like a thread worthy of more than another brief post
or two, shall I split it off?

> > Since I don't bless() things anymore, does the compiler
> > automatically bless the returned object into the invocant class?
>
> It blesses it before it gets to C<BUILD>, if bless is even a valid
> concept anymore.
>
> > Boy, LOTS to learn. ~sigh~ :)
>
> Better to learn it now than when everybody else is going to: when
> it's released.

(lemmehearan) AMEN, brother! >:O)

Paul

Apr 23, 2003, 1:39:58 PM

to Luke Palmer, perl6-l...@perl.org

> Ok, perhaps the one thing I've missed most in P5 that you get in C:
>
> struct {
> int someInt;
> float someFloat;
> char strdata[42];
> float otherFloat;
> char moreStr[123];
> } buf;
> fread(&buf,sizeof buf,1,fp);
> printf("%.0f %s\n",buf.someFloat,buf.moreStr);
>
> To get the same thing in P5:
>
> @buf{ qw/ someInt someFloat strData otherFloat moreStr / } =
> unpack "ifa42fa123", <$fp>;
> printf "%.0f %s\n", @buf{qw/ someFloat moreStr /};

Oops. Found another problem that makes it even uglier.
How does that <$fp> know how many bytes to read?
Now it needs to be

read $fp, $rec, $reclen;
@buf{ qw/ someInt someFloat strData otherFloat moreStr / } =
unpack "ifa42fa123", $rec;
printf "%.0f %s\n", @buf{qw/ someFloat moreStr /};

Which means that if you have to change the layout, you have to modify
not only the lovely "ifa42fa123" but now $reclen as well.
Ugh!
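
For contrast, the C version keeps the record length in step with the layout because it comes from `sizeof`. The sketch below is only an illustration of that; the tag `struct rec` and the helpers `rec_clear`, `write_rec` and `read_rec` are hypothetical, and the records are in the host's native layout, so they are not portable between platforms.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* The struct from the start of the thread, with a tag added. */
struct rec {
    int   someInt;
    float someFloat;
    char  strdata[42];
    float otherFloat;
    char  moreStr[123];
};

/* Zero a record so padding bytes and unused string tails are defined. */
void rec_clear(struct rec *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/* Write one record in native layout. Returns 1 on success, 0 on failure. */
int write_rec(FILE *fp, const struct rec *r)
{
    assert(fp != NULL && r != NULL);
    return fwrite(r, sizeof *r, 1, fp) == 1;
}

/* Read one record. The length is sizeof *r, so adding or resizing a
 * member changes the read length with no second constant to update.
 * Returns 1 on success, 0 on end of file or a short read. */
int read_rec(FILE *fp, struct rec *r)
{
    assert(fp != NULL && r != NULL);
    return fread(r, sizeof *r, 1, fp) == 1;
}
```

The Perl 5 equivalent of that single source of truth would be to derive the length from the template rather than keeping a separate $reclen by hand.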

Nicholas Clark

Apr 23, 2003, 6:05:18 PM

to Hod...@writeme.com, perl6-l...@perl.org

On Wed, Apr 23, 2003 at 08:11:20AM -0700, Paul wrote:
>
> Ok, perhaps the one thing I've missed most in P5 that you get in C:
>
> struct {
> int someInt;
> float someFloat;
> char strdata[42];
> float otherFloat;
> char moreStr[123];
> } buf;
> fread(&buf,sizeof buf,1,fp);
> printf("%.0f %s\n",buf.someFloat,buf.moreStr);
>
> To get the same thing in P5:
>
> @buf{ qw/ someInt someFloat strData otherFloat moreStr / } =
> unpack "ifa42fa123", <$fp>;
> printf "%.0f %s\n", @buf{qw/ someFloat moreStr /};
>
> Accounting for the quick knock-off code, the lack of detail and
> stricture, etc., I admit that there are ways to make it more readable,

You realise that the above perl is not necessarily correct? You're making
alignment assumptions which aren't true on all Linux platforms, let alone
other OSes.
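
The alignment assumption can be checked directly in C. The sketch below is an illustration, not something from the thread: it compares the compiler's actual layout of the struct with the padding-free layout that a template like "ifa42fa123" describes. The tag `struct rec` and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The struct from the start of the thread, with a tag added so that
 * offsetof() can be applied to it. */
struct rec {
    int   someInt;
    float someFloat;
    char  strdata[42];
    float otherFloat;
    char  moreStr[123];
};

/* Sum of the member sizes: what a layout with no padding would occupy. */
size_t rec_packed_size(void)
{
    return sizeof(int) + sizeof(float) + 42 + sizeof(float) + 123;
}

/* Bytes of padding the compiler added, between members and at the end. */
size_t rec_padding(void)
{
    assert(sizeof(struct rec) >= rec_packed_size());
    return sizeof(struct rec) - rec_packed_size();
}

/* Print where each member actually lives on this platform. */
void rec_print_layout(void)
{
    printf("someInt    at %zu\n", offsetof(struct rec, someInt));
    printf("someFloat  at %zu\n", offsetof(struct rec, someFloat));
    printf("strdata    at %zu\n", offsetof(struct rec, strdata));
    printf("otherFloat at %zu\n", offsetof(struct rec, otherFloat));
    printf("moreStr    at %zu\n", offsetof(struct rec, moreStr));
    printf("sizeof = %zu, packed = %zu, padding = %zu\n",
           sizeof(struct rec), rec_packed_size(), rec_padding());
}
```

On many common ABIs a float must sit on a 4-byte boundary, so padding is likely after the 42-byte strdata; where that happens, the unpack template would need matching "x" entries. The exact numbers are platform-dependent, which is the point being made.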

There's a module on CPAN to do this sort of thing: Convert::Binary::C.
I've never used it, but its author did a good talk on it, so I'm taking it
on trust that it's good. Its approach may be useful to "borrow".

Nicholas Clark

Paul

Apr 23, 2003, 6:53:44 PM

to Nicholas Clark, perl6-l...@perl.org

> > Ok, perhaps the one thing I've missed most in P5 that you get in C:
> >
> > struct {
> > int someInt;
> > float someFloat;
> > char strdata[42];
> > float otherFloat;
> > char moreStr[123];
> > } buf;
> > fread(&buf,sizeof buf,1,fp);
> > printf("%.0f %s\n",buf.someFloat,buf.moreStr);
> >
> > To get the same thing in P5:
> >
> > @buf{ qw/ someInt someFloat strData otherFloat moreStr / } =
> > unpack "ifa42fa123", <$fp>;
> > printf "%.0f %s\n", @buf{qw/ someFloat moreStr /};
> >
> > Accounting for the quick knock-off code, the lack of detail and
> > stricture, etc., I admit that there are ways to make it more
> > readable,
>
> You realise that the above perl is not necessarily correct? You're
> making alignment assumptions which aren't true on all Linux
> platforms, let alone other OSes.

Actually, I wasn't even getting that detailed -- thinking of it more as
an example in terms of code and the desired effect, not as a literal
and executable C-to-Perl data conversion. I don't *care* what the other
language was, I just want to be able to define a *readable* Perl
construct that can automatically parse a line of data into a form where
I can access an individual field usefully the way I can when reading
into a C struct. I have no problem accounting for boundaries and such
(cf. C<has undef is size(whatever)>); I'm just not trying to
completely define an interface until it's been presented in rough draft
and enough folk have commented that it might just be a good idea.

I apologize to anyone who was expecting executable code -- that wasn't
the point of the example, so I didn't waste the extra cycles trying to
figure out what the boundary alignments were. Such things are
frequently not portable across several platforms anyway. If I were
actually doing this, then yes, I'd've checked my hardware guide and
added the necessary padding bytes of wasted space in the unpacking
layout.

As an aside, that kinda was part of the point, though I didn't want to
sidetrack too much. In the C struct, the boundary alignments aren't
specified; they're just handled. If unpack() could introspect and DTRT,
then maybe pack() could also when writing binary data. That's Perl to
Perl, but I could usually live with a file header specifying such
details, as then it would become platform-portable when read as
individual bytes....but that's another discussion, which can wait for later.

Piers Cawley

Apr 28, 2003, 12:31:37 PM

to Hod...@writeme.com, perl6-l...@perl.org

Paul <ydb...@yahoo.com> writes:

Okay, I'm making some stuff up on the fly here, and deliberately
ignoring platform issues for the time being (just assuming platform
native files...)

class int is Packable { method packspec { "i" } }
class num is Packable { method packspec { "d" } }
class str is Packable { method packspec { "i/U*" } }

class Class {
    method is_packable {
        .isa(Packable) || all(.attribs).is_packable;
    }
}

class Object {

    method packspec {
        fail "Can't pack a ", .class unless .class.is_packable;
        join " ", map .packspec, .class.attribs;
    }

    method read_from (Handle $stream) {
        my @attribs = $stream.read_record_with(packstring => .packspec);
        .class.make_from_attrib_hash(.attribs.paired_with(@attribs));
    }
}

The implementation of Handle::read_record_with(),
Object::make_from_attrib_hash() and Array::paired_with() is left as an
exercise for the interested reader. Note that it seems far more likely
that a better approach for this kind of work would be to use a grammar
rather than a simple pack string for reading. Which raises the question:
given an object and a grammar which specifies how that object should
be read from a stream, how do you output a stream which conforms to
that grammar and will reproduce the object when read back
in? (Assuming we're talking about objects with 'simple' attributes for
the time being...)
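
For the simplest case of that round-trip question, here is a C sketch of a length-prefixed string, loosely in the spirit of the "i/U*" packspec above: a count, then that many items. It is an illustration under its own assumptions (a 4-byte big-endian count and raw bytes rather than Unicode characters), and `lstr_put` and `lstr_get` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Encode s as a 4-byte big-endian length followed by its bytes.
 * Returns the number of bytes written, or 0 if out is too small. */
size_t lstr_put(unsigned char *out, size_t cap, const char *s)
{
    assert(out != NULL && s != NULL);
    size_t len = strlen(s);
    if (len > 0xFFFFFFFFu || cap < 4 || cap - 4 < len)
        return 0;
    uint32_t n = (uint32_t)len;
    out[0] = (unsigned char)(n >> 24);
    out[1] = (unsigned char)((n >> 16) & 0xFF);
    out[2] = (unsigned char)((n >> 8) & 0xFF);
    out[3] = (unsigned char)(n & 0xFF);
    memcpy(out + 4, s, len);
    return 4 + len;
}

/* Decode one length-prefixed string into dst, NUL-terminated.
 * Returns the number of input bytes consumed, or 0 if the input is
 * truncated or dst is too small. */
size_t lstr_get(const unsigned char *in, size_t avail,
                char *dst, size_t dstcap)
{
    assert(in != NULL && dst != NULL);
    if (avail < 4)
        return 0;
    uint32_t n = ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
                 ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
    if (avail - 4 < n || dstcap == 0 || dstcap - 1 < n)
        return 0;
    memcpy(dst, in + 4, n);
    dst[n] = '\0';
    return 4 + (size_t)n;
}
```

Because the writer and the reader agree on one small format, anything `lstr_put` emits reads back unchanged through `lstr_get`; doing the same from an arbitrary grammar is the harder, open part of the question.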

--
Piers