
PIO Questions.


Benjamin Goldberg

Sep 2, 2003, 9:04:05 PM
to perl6-i...@perl.org

I'm looking for, but not finding, information regarding the character
type and encoding on parrot io objects.

As an example of why... I found this in io.ops :

op write(in PMC) {
    PMC *p = $1;
    STRING *s = (VTABLE_get_string(interpreter, p));
    if (s) {
        PIO_write(interpreter, PIO_STDOUT(interpreter),
                  s->strstart, s->bufused);
    }
    goto NEXT();
}

Surely, blindly writing the bytes that are in a string, without
checking that the string's encoding and type match that of the stream,
is wrong.

I would expect something like:

op write(in PMC) {
    PMC *p = $1;
    STRING *s = VTABLE_get_string(interpreter, p);
    if (s) {
        PMC *o = PIO_STDOUT(interpreter);
        string_transcode(interpreter, s,
                         PIO_encoding(interpreter, o),
                         PIO_chartype(interpreter, o), &s);
        PIO_write(interpreter, o, s->strstart, s->bufused);
    }
    goto NEXT();
}

Except that I can't seem to find any PIO_encoding and PIO_chartype
functions.

#############

Actually... I think that the op print(in PMC) and write(in PMC) are
designed wrong. Instead of asking for a string, and printing that
string, they should call a print_self and/or write_self vtable method on
the PMC, passing in the PIO object. In turn, the default
implementations of those methods should print or write the results of
DYNSELF.get_string() to that PIO object.

This way, a PMC whose string representation is very large doesn't need
to serialize itself to a string before it can be printed -- it can print
itself directly to the PIO object, thus avoiding allocating memory for
that big string, and probably lots of copying.

To avoid loss of synchronization between the get_string form of a pmc
and the print_self/write_self form of a pmc, one should be able to
define a string's get_string as creating a new stringstream PIO object,
printing itself to that stream, and returning the corresponding string.
However, there doesn't (yet) seem to be a stringstream layer. When do
we expect to have one?
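
A minimal sketch of the dispatch proposed above, in plain C rather than
real Parrot code. Everything here (Stream, Obj, VTable, op_print and the
two example vtables) is a hypothetical stand-in, not an existing Parrot
API. It shows a default print_self that falls back to get_string, an
object that overrides print_self to emit itself piece by piece, and a
get_string built on top of print_self so the two forms cannot drift
apart.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical output stream: a fixed in-memory buffer. */
typedef struct {
    char   buf[256];
    size_t used;
} Stream;

static void stream_write(Stream *out, const char *p, size_t n)
{
    assert(out->used + n <= sizeof out->buf);
    memcpy(out->buf + out->used, p, n);
    out->used += n;
}

typedef struct Obj Obj;

/* Hypothetical two-slot vtable; real Parrot vtables are far larger. */
typedef struct {
    const char *(*get_string)(Obj *self);
    void        (*print_self)(Obj *self, Stream *out);
} VTable;

struct Obj {
    const VTable *vt;
    int           n;    /* payload: element count for the range demo */
};

/* Default print_self: stringify first, then write the string. */
static void default_print_self(Obj *self, Stream *out)
{
    const char *s = self->vt->get_string(self);
    stream_write(out, s, strlen(s));
}

static const char *scalar_get_string(Obj *self)
{
    (void)self;
    return "scalar";
}

/* Override: emit the object piece by piece, never building it whole. */
static void range_print_self(Obj *self, Stream *out)
{
    char piece[16];
    for (int i = 0; i < self->n; i++) {
        int len = snprintf(piece, sizeof piece, "%d ", i);
        stream_write(out, piece, (size_t)len);
    }
}

/* get_string defined via print_self on a temporary string stream. */
static const char *range_get_string(Obj *self)
{
    static Stream tmp;          /* not reentrant; acceptable in a sketch */
    tmp.used = 0;
    range_print_self(self, &tmp);
    stream_write(&tmp, "", 1);  /* terminating NUL */
    return tmp.buf;
}

static const VTable scalar_vt = { scalar_get_string, default_print_self };
static const VTable range_vt  = { range_get_string,  range_print_self };

/* What an op like print(in PMC) would do under this design. */
static void op_print(Obj *o, Stream *out)
{
    o->vt->print_self(o, out);
}
```

Calling op_print on an object using scalar_vt goes through get_string,
while an object using range_vt writes straight to the stream.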

#############

PIO_putps converts to a cstring, then calls PIO_puts.

Since PIO_puts doesn't take a length, obviously it must be determining
the length of the string based on the presence of a nul character in
it. Thus, we cannot use PIO_puts to print binary data. This means that
PIO_write must be used. Since there's no op write(in STR), the only way
to do it from parrot is to create a PerlString, store our Sreg into it,
then call write. Blech.
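
To make the truncation concrete, here is a small sketch in plain C. Sink,
sink_write and sink_puts are hypothetical stand-ins for a PIO stream,
PIO_write and PIO_puts; they only model the difference between a
length-based call and a NUL-terminated one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory sink standing in for a PIO stream. */
typedef struct {
    char   data[64];
    size_t used;
} Sink;

/* Length-based write: copies exactly len bytes, NULs included. */
static size_t sink_write(Sink *s, const void *buf, size_t len)
{
    assert(s->used + len <= sizeof s->data);
    memcpy(s->data + s->used, buf, len);
    s->used += len;
    return len;
}

/* puts-style write: the length comes from strlen(), so output
 * stops at the first NUL byte. */
static size_t sink_puts(Sink *s, const char *cstr)
{
    return sink_write(s, cstr, strlen(cstr));
}
```

Given the five bytes 'a', 'b', NUL, 'c', 'd', sink_puts emits only the
first two, while sink_write with an explicit length of 5 emits them all.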

#############

Shouldn't everything in io_unix.c (except for pio_unix_layer) be
static? This isn't "just" about namespace pollution -- it slows down
linking and dynamic loading. I think.

Hmm, the same applies to the other io_foo.c files.

#############

Why does PIO_unix_seek clear errno before calling lseek?

#############

Why does PIO_unix_tell have a temp variable "pos"?

--
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}

Leopold Toetsch

Sep 3, 2003, 3:33:56 AM
to Benjamin Goldberg, perl6-i...@perl.org
Benjamin Goldberg <ben.go...@hotpop.com> wrote:

> As an example of why... I found this in io.ops :

> op write(in PMC) {

Sorry for that. This was a quick hack, to get
languages/parrot_compiler/parrot_pasm working again. As you mentioned
below, there was no print routine that transparently handled binary
data.

> Surely, blindly writing the bytes that are in a string, without
> checking that the string's encoding and type match that of the stream,
> is wrong.

Yep for sure. But as long as we have no means to change/set an IO
stream's encoding and type, it doesn't matter - yet.

[ much skipped, all very reasonable and true ]

leo

Juergen Boemmels

Sep 3, 2003, 9:26:03 AM
to Benjamin Goldberg, perl6-i...@perl.org
Benjamin Goldberg <ben.go...@hotpop.com> writes:

> I'm looking for, but not finding, information regarding the character
> type and encoding on parrot io objects.
>
> As an example of why... I found this in io.ops :
>
> op write(in PMC) {
>     PMC *p = $1;
>     STRING *s = (VTABLE_get_string(interpreter, p));
>     if (s) {
>         PIO_write(interpreter, PIO_STDOUT(interpreter),
>                   s->strstart, s->bufused);
>     }
>     goto NEXT();
> }
>
> Surely, blindly writing the bytes that are in a string, without
> checking that the string's encoding and type match that of the stream,
> is wrong.

The PIO system does not know anything about encodings yet, but it
needs to. We might push an encoding-transforming layer on top of the
current layerstack.
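
As a rough illustration of such a layer, here is a sketch in plain C.
The Layer struct, MemLayer and both write functions are hypothetical and
do not match the real ParrotIOLayerAPI. The Latin-1 to UTF-8 conversion
is chosen only because it is short. The top layer converts each byte and
passes the result to the layer below it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct Layer Layer;
struct Layer {
    size_t (*write)(Layer *self, const unsigned char *buf, size_t len);
    Layer  *down;               /* next layer toward the OS */
};

/* Bottom layer: collects bytes in memory instead of calling write(2). */
typedef struct {
    Layer         base;         /* must stay the first member */
    unsigned char data[128];
    size_t        used;
} MemLayer;

static size_t mem_write(Layer *self, const unsigned char *buf, size_t len)
{
    MemLayer *m = (MemLayer *)self;
    assert(m->used + len <= sizeof m->data);
    memcpy(m->data + m->used, buf, len);
    m->used += len;
    return len;
}

/* Top layer: treats input as Latin-1 and passes UTF-8 down the stack. */
static size_t latin1_to_utf8_write(Layer *self,
                                   const unsigned char *buf, size_t len)
{
    unsigned char out[2];
    for (size_t i = 0; i < len; i++) {
        size_t n;
        if (buf[i] < 0x80) {
            out[0] = buf[i];
            n = 1;
        } else {
            out[0] = (unsigned char)(0xC0 | (buf[i] >> 6));
            out[1] = (unsigned char)(0x80 | (buf[i] & 0x3F));
            n = 2;
        }
        self->down->write(self->down, out, n);
    }
    return len;                 /* bytes consumed from the caller */
}
```

The caller only ever talks to the top of the stack, so the op itself
would not need to know which encoding the stream uses.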

> I would expect something like:
>
> op write(in PMC) {
> PMC *p = $1;
> STRING *s = VTABLE_get_string(interpreter, p);
> if( s ) {
> PMC *o = PIO_STDOUT(interpreter);
> string_transcode(interpreter, s,
> PIO_encoding(interpreter, o),
> PIO_chartype(interpreter, o), &s);
> PIO_write(interpreter, o, s->strstart, s->bufused);
> }
> goto NEXT();
> }

My current plan is to have it more like:

op write(in PMC) {
    STRING *s = VTABLE_get_string(interpreter, $1);
    if (s) {
        PIO_puts(interpreter, PIO_STDOUT(interpreter), s);
    }
    goto NEXT();
}

with a revised PIO_puts API (using a parrotstring instead of a
c-string).

> Except that I can't seem to find any PIO_encoding and PIO_chartype
> functions.

Maybe we need to implement them. They surely must get integrated in
the ParrotIOLayerAPI to support transcoding layers.

> #############
>
> Actually... I think that the op print(in PMC) and write(in PMC) are
> designed wrong. Instead of asking for a string, and printing that
> string, they should call a print_self and/or write_self vtable method on
> the PMC, passing in the PIO object.

Putting an object to a stream is one of the most basic operations for
a PMC. But I don't think it's a good idea to introduce many more vtable
functions. (Vtables are already really fat, mostly because every
PMC needs to know how to multiply a ParrotIO with the keyed version of
a Continuation.) print, write, dump and the like should all call one
function. The way of printing/writing/dumping the data should be
controlled via flags which can be stored in the stream.

> In turn, the default
> implementations of those methods should print or write the results of
> DYNSELF.get_string() to that PIO object.
>
> This way, a PMC whose string representation is very large doesn't need
> to serialize itself to a string before it can be printed -- it can print
> itself directly to the PIO object, thus avoiding allocating memory for
> that big string, and probably lots of copying.
>
> To avoid loss of synchronization between the get_string form of a pmc
> and the print_self/write_self form of a pmc, one should be able to
> define a string's get_string as creating a new stringstream PIO object,
> printing itself to that stream, and returning the corresponding string.
> However, there doesn't (yet) seem to be a stringstream layer. When do
> we expect to have one?

There will be one. When? When I find time to do it or you (or anybody
else) submit a patch.

Implementations of read/write and seek/tell are fairly
trivial. The problem is fdopen/getfd: the PIOHANDLE needs to be
interpreted as a string pointer. We already have semantic problems
with fdopen on platforms other than Unix (stdio, win32). This normally
leads to failing t/pmc/io 3-4.
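
For reference, the trivial part might look roughly like the sketch
below. StrStream and the ss_* functions are hypothetical names over a
fixed-size buffer, not an existing Parrot layer, and the fdopen/getfd
question is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical string-backed stream over a fixed buffer. */
typedef struct {
    char   data[128];
    size_t size;    /* bytes currently stored */
    size_t pos;     /* current read/write offset, always <= size */
} StrStream;

static size_t ss_write(StrStream *s, const void *buf, size_t len)
{
    assert(s->pos <= s->size);
    if (len > sizeof s->data - s->pos)
        len = sizeof s->data - s->pos;  /* short write when full */
    memcpy(s->data + s->pos, buf, len);
    s->pos += len;
    if (s->pos > s->size)
        s->size = s->pos;
    return len;
}

static size_t ss_read(StrStream *s, void *buf, size_t len)
{
    assert(s->pos <= s->size);
    size_t avail = s->size - s->pos;
    if (len > avail)
        len = avail;                    /* short read at end of data */
    memcpy(buf, s->data + s->pos, len);
    s->pos += len;
    return len;
}

/* Absolute seek only; returns 0 on success, -1 if out of range. */
static int ss_seek(StrStream *s, size_t offset)
{
    if (offset > s->size)
        return -1;
    s->pos = offset;
    return 0;
}

static size_t ss_tell(const StrStream *s)
{
    return s->pos;
}
```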



> #############
>
> PIO_putps converts to a cstring, then calls PIO_puts.

PIO_puts is subject to change. The printing of cstrings can be
trivially done by calls to PIO_write. The cstring version will be
ripped out of the ParrotIOLayerAPI, and a new puts which passes
parrotstrings down the layers will be introduced.

> Since PIO_puts doesn't take a length, obviously it must be determining
> the length of the string based on the presence of a nul character in
> it. Thus, we cannot use PIO_puts to print binary data. This means that
> PIO_write must be used. Since there's no op write(in STR), the only way
> to do it from parrot is to create a PerlString, store our Sreg into it,
> then call write. Blech.
>
> #############
>
> Shouldn't everything in io_unix.c (except for pio_unix_layer) be
> static? This isn't "just" about namespace pollution -- it slows down
> linking and dynamic loading. I think.
>
> Hmm, the same applies to the other io_foo.c files.

Yes it should. There should not be any inter-layer calls except via
the layerstack. Making them static would enforce this. If nobody
objects I will commit this change.
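
The shape being suggested would be roughly the following. IOLayer,
demo_write and demo_layer are hypothetical names with a trimmed-down
signature, not the real io_unix.c contents. Every function is
file-local, and the only symbol with external linkage is the table of
function pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down layer API. */
typedef struct {
    const char *name;
    size_t (*write)(char *dst, size_t cap, const char *src, size_t len);
} IOLayer;

/* File-local: not visible to the linker, reachable only via the table. */
static size_t demo_write(char *dst, size_t cap, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len > cap)
        len = cap;
    memcpy(dst, src, len);
    return len;
}

/* The one symbol with external linkage in this translation unit. */
const IOLayer demo_layer = { "demo", demo_write };
```

Code in another file can call demo_layer.write(...) but cannot name
demo_write directly, which is what would enforce going through the
layerstack.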

> #############
>
> Why does PIO_unix_seek clear errno before calling lseek?

Don't know. It's been in the code since Melvin added the seek in Jan 2002.
I will remove it.
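
One possible reason, which nothing in this thread confirms: lseek
signals failure by returning (off_t)-1, and older POSIX wording did not
rule out a negative offset being a valid result on some devices, so
callers cleared errno first to tell the two cases apart. A sketch of
that idiom, with checked_seek as a hypothetical helper name:

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 0 and stores the new offset on success, or -1 with errno set. */
static int checked_seek(int fd, off_t offset, int whence, off_t *result)
{
    assert(result != NULL);
    errno = 0;                          /* so a stale value cannot mislead us */
    off_t pos = lseek(fd, offset, whence);
    if (pos == (off_t)-1 && errno != 0)
        return -1;                      /* genuine failure */
    *result = pos;
    return 0;
}
```

If the code only tests the return value against -1 and never inspects
errno for this purpose, clearing it beforehand has no effect and can be
dropped.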

> #############
>
> Why does PIO_unix_tell have a temp variable "pos"?

This value gets returned.

bye
boe
--
Juergen Boemmels boem...@physik.uni-kl.de
Fachbereich Physik Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F 23 F6 C7 2F 85 93 DD 47
