
[RfC] constant PMCs and classes


Leopold Toetsch

Aug 20, 2003, 8:05:19 AM
to P6I
We currently have constant Key and Sub PMCs both created from the
packfile at load time. They live in the constant_table pointing to a
constant PMC pool. But we need more.

Code like this is all over the core:

string_from_cstring(interpreter, "pIt", 0)
key = key_new_cstring(interpreter, "_message");

to create some STRINGs or entries in hashes. The keys should be constant
PMCs and they should be shared as well as the STRINGs.
We need this in objects.c ("\0\0anonymous"), for setting standard
property names internally and for the current hash based implementation
of Exceptions.

This leads to my proposal:
* const_string_from_cstring - return cached constant string or create one
* const_key_new_cstring - return cached PMC or create a constant PMC
* constant_pmc_new* - ditto (called from above)

Further, some classes would need a Const$Class variant in which the
set-like vtable methods throw an exception. These classes should be
autogenerated from *.pmc for all classes that have a "const_too" flag
(or similar) in their class's $flags.

This leads to changes in parsing the vtable.tbl - which we need anyway
to do the proposed var/value split of vtables.

e.g.
[FETCH]
get_integer
...
[STORE]
set_integer
...
[PUSH]
push_integer
...
and so on.
The section names conform (where applicable) to methods described in
Tie::*(3pm).

Comments welcome
leo

Brent Dax

Aug 20, 2003, 1:25:54 PM
to Leopold Toetsch, P6I
Leopold Toetsch:
# Further we would need for some classes a Const$Class variant, where
the
# set-like vtables throw an exception. These classes should be
# autogenerated from *.pmc for all classes that have a "const_too" or
such
# in their classes $flags.

Isn't The Plan(tm) to use properties for this?

--Brent Dax <br...@brentdax.com>
Perl and Parrot hacker

"Yeah, and my underwear is flame-retardant--that doesn't mean I'm gonna
set myself on fire to prove it."

Leopold Toetsch

Aug 20, 2003, 2:12:03 PM
to Brent Dax, P6I
Brent Dax wrote:

> Leopold Toetsch:
> # Further we would need for some classes a Const$Class variant, where
> the
> # set-like vtables throw an exception. These classes should be
> # autogenerated from *.pmc for all classes that have a "const_too" or
> such
> # in their classes $flags.
>
> Isn't The Plan(tm) to use properties for this?

The property 'constant', 'ro' or whatever can only be some kind of
communication: the HLL is telling the PMC to be read only. We could now
have in each set-like vtable:

if (we_have_props && !is_bool(prop("ro")))  // pseudo code
    set_the_value ...
else
    throw_exception

or we have a different class that has just the exception in its
set-like vtable slots. No penalty for rw classes.

The lookup of whether we have that property is a hash lookup; the
is_bool is an int test, string_bool or a vtable call, depending on the
value of the property.

Setting/resetting the "ro" property could morph() the PMC to the
const/non-const variant.
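
A minimal sketch of the second option, assuming simplified stand-ins for the PMC and vtable types (the names Int_vtable, ConstInt_vtable, ro_set and pmc_set_readonly are hypothetical, not Parrot API). The rw class pays nothing for the read-only check, and setting the "ro" property only swaps a pointer:

```c
#include <stdio.h>

/* Hypothetical, simplified stand-ins for Parrot's PMC and vtable types. */
typedef struct PMC PMC;

typedef struct VTable {
    const char *name;
    long (*get_integer)(const PMC *self);
    int  (*set_integer)(PMC *self, long value); /* 0 = ok, -1 = "exception" */
} VTable;

struct PMC {
    const VTable *vtable;
    long int_val;
};

static long int_get(const PMC *self) { return self->int_val; }

static int int_set(PMC *self, long value)
{
    self->int_val = value;
    return 0;
}

/* One shared setter for every read-only class: it only reports the error. */
static int ro_set(PMC *self, long value)
{
    (void)value;
    fprintf(stderr, "cannot modify read-only %s\n", self->vtable->name);
    return -1;
}

/* The Const variant reuses the getters and replaces only the setters. */
static const VTable Int_vtable      = { "Int",      int_get, int_set };
static const VTable ConstInt_vtable = { "ConstInt", int_get, ro_set  };

/* Stand-in for morph(): switch between the rw and the const variant. */
void pmc_set_readonly(PMC *self, int readonly)
{
    self->vtable = readonly ? &ConstInt_vtable : &Int_vtable;
}
```

A caller would keep using p.vtable->set_integer(&p, v) unchanged; only the outcome differs once pmc_set_readonly(&p, 1) has been called.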

I think that "read-only" is important enough (and used internally all
over the place) to get special treatment.


leo


Leopold Toetsch

Aug 20, 2003, 3:52:12 PM
to Brent Dax, P6I
Brent Dax wrote:

> Leopold Toetsch:
> # The property 'constant', 'ro' or whatever can only be some kind of
> # communication: the HLL is telling the PMC to be read only. We could
> now
> # have in each set-like vtable:
> #
> # if (we_have_props && !is_bool(prop("ro")) // pseudo code
> # set_the_value ...
> # else
> # throw_exception
> #
> # or we have a different class, that have in the set-like vtable
> slots
> # just the exception. No penalty for rw classes.
>
> I'm starting to wonder if we shouldn't copy the vtable at construction
> time. That way, we can have our read-only set_* behaviors without
> requiring a second class:
>
> void init(...) {
> SELF->vtable=copy(this_class's_vtable);
> if(is_bool(prop("ro")) {
> SELF->vtable->set_int=&Parrot_pmc_ro_set_int;

This would work for properties known at construction time. But it doesn't
help much if the "ro" property is set at runtime. This scheme would
work e.g. for tie, where a new variable (ref) is returned, with
different vtable pieces allocated and copied in. Allocating new vtables
at runtime for the same object instance would need some book-keeping
(vtable already allocated, which entries...). And all the more because
constant objects are used internally, I think it's better to create
these at compile time.


> This would also allow for other class-specific vtable customization.
> After all, I doubt ro will be the *only* property that's much more
> efficiently implemented as a modified vtable...

Yep. That's right, your scheme roughly matches the tied sample code I
posted a long time ago.

A generalization of my compile-time class variations for runtime
modifications will surely be necessary at some point. But for now (and
as long as Dan's specs for the vtable changes are hidden) I don't want
to go further.

leo

Brent Dax

Aug 20, 2003, 3:14:55 PM
to Leopold Toetsch, P6I
Leopold Toetsch:
# The property 'constant', 'ro' or whatever can only be some kind of
# communication: the HLL is telling the PMC to be read only. We could
now

# have in each set-like vtable:
#
# if (we_have_props && !is_bool(prop("ro")) // pseudo code
# set_the_value ...
# else
# throw_exception
#
# or we have a different class, that have in the set-like vtable
slots

# just the exception. No penalty for rw classes.

I'm starting to wonder if we shouldn't copy the vtable at construction
time. That way, we can have our read-only set_* behaviors without
requiring a second class:

void init(...) {
    SELF->vtable = copy(this_class's_vtable);
    if (is_bool(prop("ro"))) {
        /*
         * I see no reason we can't provide one implementation
         * for all read-onlys--after all, it just needs to throw
         * an exception.
         */
        SELF->vtable->set_int    = &Parrot_pmc_ro_set_int;
        SELF->vtable->set_num    = &Parrot_pmc_ro_set_num;
        SELF->vtable->set_string = &Parrot_pmc_ro_set_string;
        SELF->vtable->set_pmc    = &Parrot_pmc_ro_set_pmc;
        ...
    }
}

This would also allow for other class-specific vtable customization.
After all, I doubt ro will be the *only* property that's much more
efficiently implemented as a modified vtable...

--Brent Dax <br...@brentdax.com>

Juergen Boemmels

Aug 21, 2003, 8:22:20 AM
to Perl6 Internals
[
The listserver does not like my attachments
ezmlm-send: fatal: Sorry, after removing unacceptable MIME parts from your message I was left with nothing (#5.7.0)
ezmlm-gate: fatal: fatal error from child

Here is the resend with code inline
]

Leopold Toetsch <l.to...@nextra.at> writes:

> We currently have constant Key and Sub PMCs both created from the
> packfile at load time. They live in the constant_table pointing to a
> constant PMC pool. But we need more.
>
>
> We have allover the core code like this:
>
> string_from_cstring(interpreter, "pIt", 0)
> key = key_new_cstring(interpreter, "_message");
>
> to create some STRINGs or entries in hashes. The keys should be
> constant PMCs and they should be shared as well as the STRINGs.
>
> We need this in objects.c ("\0\0anonymous"), for setting standard
> property names internally and for the current hash based
> implementation of Exceptions.

[...]

Some time ago I did some experiments with initialising strings at
compile time. With some preprocessor magic and a perl program scanning
the c-file I got it running.

The changes to an existing c-file are minimal: At the beginning add a
#include "FILE.str" and replace all these string_from_cstring calls with
_S("text"). The _S macros are replaced by static string structures
with static initialisers. These string structures are in the ro-data
segment of the executable and should load really fast. No calls to
any functions.

===== c2str.pl
#! perl

use Text::Balanced qw(extract_bracketed);
use Data::Dumper;

die "$0: Usage $0 FILE.c" unless $#ARGV == 0;

my $file = shift @ARGV;

$file =~ s/\.c$//;

my $infile = $file . '.c';
my $outfile = $file . '.str';

die "$0: $infile: $!" unless -e $infile;

print <<"HEADER";
/*
* !!!!!!! DO NOT EDIT THIS FILE !!!!!!!
*
* This file is generated automatically from '$infile'
* by $0.
*
* Any changes made here will be lost!
*
*/

#define CONCAT(a,b) a##b
#define _S(name) (__PARROT_STATIC_STR(__LINE__))
#define __PARROT_STATIC_STR(line) CONCAT(&static_string_, line)
#include <parrot/pobj.h>

#if ! DISABLE_GC_DEBUG
# define GC_DEBUG_VERSION ,0
#else
# define GC_DEBUG_VERSION
#endif

HEADER

my %known_strings = ();

sub output_string {
my ($text, $line) = @_;

if (exists $known_strings{$text}) {
<<"DATA";
#define static_string_${line} static_string_$known_strings{$text}

DATA
}
else {
$known_strings{$text} = $line;
<<"DATA";
static const char static_string_${line}_data\[\] = $text;
static const struct parrot_string_t static_string_${line} = {
{ /* pobj_t */
{{
(void*)static_string_${line}_data,
sizeof(static_string_${line}_data)
}},
PObj_constant_FLAG
GC_DEBUG_VERSION
},
sizeof(static_string_${line}_data),
(void*)static_string_${line}_data,
sizeof(static_string_${line}_data) - 1,
NULL,
NULL,
0
};

DATA
}
}

open IN, $infile;

my $line = 0;
while (<IN>) {
$line++;
next if m/^\s*\#/; # ignore preprocessor
next unless s/.*\b_S\b//;

my $str = extract_bracketed $_, '(")';

print output_string (substr($str,1,-1), $line);
}
===== example.c
/*
* example.c
*
* demonstrating static allocation of string
*
* to compile:
* perl c2str.pl example.c > example.str
* gcc -o example -Iinclude example.c blib/lib/libparrot.a -lm -ldl
*/

#include <parrot/parrot.h>

#include "example.str"

int
main(int argc, char* argv[]) {
struct Parrot_Interp * interpreter;

interpreter = Parrot_new();
Parrot_init(interpreter, &interpreter);

PIO_putps(interpreter, PIO_STDOUT(interpreter), _S("foo\n"));
PIO_putps(interpreter, PIO_STDOUT(interpreter), _S("bar\n"));
PIO_putps(interpreter, PIO_STDOUT(interpreter), _S("foo\n"));

return 0;
}
=====

Issues with this are:
* My macro voodoo is based on __LINE__, so only one _S directive per line
* The strings are local to one compilation unit (but that is no
different from the C string constants)
* It exposes the internal structure of PObj (this does not bother us at
the moment, nearly every file includes <parrot/parrot.h>)
* The encoding/chartype must be set to 0 or we get many relocations;
the string functions have to deal with this case.

Maybe a scheme like this can be used for keys too.

> Comments welcome
> leo

Same here
boe
--
Juergen Boemmels boem...@physik.uni-kl.de
Fachbereich Physik Tel: ++49-(0)631-205-2817
Universitaet Kaiserslautern Fax: ++49-(0)631-205-3906
PGP Key fingerprint = 9F 56 54 3D 45 C1 32 6F 23 F6 C7 2F 85 93 DD 47

Nicholas Clark

Aug 21, 2003, 8:43:10 AM
to Leopold Toetsch, P6I
On Wed, Aug 20, 2003 at 02:05:19PM +0200, Leopold Toetsch wrote:

> This leads to changes in parsing the vtable.tbl - which we need anyway
> to do the proposed var/value split of vtables.
>
> e.g.
> [FETCH]
> get_integer
> ...
> [STORE]
> set_integer
> ...
> [PUSH]
> push_integer
> ...
> and so on.
> The section names conform (where applicable) to methods described in
> Tie::*(3pm).
>
> Comments welcome

It's been noted that FETCHSLICE/STORESLICE are missing from the perl5 tied
hash API, but that adding them at this stage would cause subtle bugs:

http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2001-11/msg02044.html


Could we split the vtable further, so var/value by read/write, which would
allow constant objects to swap their write vtable to a throw implementation.
Did I misread what you were suggesting either in this message or later, and
you were suggesting this split?

I'm not sure if a simple read/write split also helps PMCs shared across
threads. Maybe sharing between threads is a whole new can of worms that we
should leave untouched for now.

Nicholas Clark

Leopold Toetsch

Aug 21, 2003, 9:31:42 AM
to Nicholas Clark, perl6-i...@perl.org
Nicholas Clark <ni...@ccl4.org> wrote:
> On Wed, Aug 20, 2003 at 02:05:19PM +0200, Leopold Toetsch wrote:

>> This leads to changes in parsing the vtable.tbl - which we need anyway
>> to do the proposed var/value split of vtables.
>>
>> e.g.
>> [FETCH]
>> get_integer
>> ...
>> [STORE]
>> set_integer
>> ...
>> [PUSH]
>> push_integer
>> ...
>> and so on.
>> The section names conform (where applicable) to methods described in
>> Tie::*(3pm).
>>
>> Comments welcome

> Could we split the vtable further, so var/value by read/write, which would


> allow constant objects to swap their write vtable to a throw implementation.
> Did I misread what you were suggesting either in this message or later, and
> you were suggesting this split?

I'm suggesting exactly this split. Or lets say pmc2c.pl knows that e.g.
[FETCH] is a read and [STORE] is a write, which will get a throw
implementation for the Const$Class.
WRT var/value split, I think only FETCH/STORE get duplicated in both
sections.

> I'm not sure if a simple read/write split also helps PMCs shared across
> threads. Maybe sharing between threads is a whole new can of worms that we
> should leave untouched for now.

When a PMC is shared it is either ro or rw - in all threads. I don't see
a problem here.

> Nicholas Clark

leo

Nicholas Clark

Aug 21, 2003, 2:32:52 PM
to Leopold Toetsch, Nicholas Clark, perl6-i...@perl.org
On Thu, Aug 21, 2003 at 03:31:42PM +0200, Leopold Toetsch wrote:
> Nicholas Clark <ni...@ccl4.org> wrote:

> > I'm not sure if a simple read/write split also helps PMCs shared across
> > threads. Maybe sharing between threads is a whole new can of worms that we
> > should leave untouched for now.
>
> When a PMC is shared it is either ro or rw - in all threads. I don't see
> a problem here.

I don't see a problem but I do see a possible optimisation.
I can't remember if you were still in the p5p meeting in Paris when Arthur
explained Sarathy/his idea about doing copy on write of whole scalars in
perl5 for ithreads.

Basically the idea was that when a new thread is created, scalars aren't
copied immediately. However, they aren't locked - they are treated as
read only (with a reference count, boo hiss, of interested threads)
If a thread finds that it needs to modify something (with a reference count
>= 2) then it does lock that scalar (and checks again)
At that point it makes its own copy of the scalar, leaving the original
intact (untouched) for the other threads to keep on seeing.
(Obviously when it has made the copy it then unlocks the other)
The last thread inherits the original.

The beauty is that no shared scalar is ever written.

I think that something like this could also work in parrot - you replace the
write vtable ops with the lock/check/copy versions, but leave the read ops
intact. When you do the copy and it's now an unshared PMC you revert the
write vtable ops to the usual fast set. Hence why I made a connection
between split read/write ops and threading.
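
A rough, single-threaded sketch of that swap, with the locking reduced to comments. All names here (Body, Scalar, WriteOps, scalar_share) are hypothetical, and a real implementation would need an actual lock around the check and the copy:

```c
#include <stdlib.h>

typedef struct Scalar Scalar;

typedef struct Body {          /* value storage, possibly seen by several threads */
    long value;
    int  interested;           /* number of handles still referring to this body */
} Body;

typedef struct WriteOps {      /* the "write" part of a split vtable */
    void (*set_integer)(Scalar *self, long value);
} WriteOps;

struct Scalar {                /* per-thread handle */
    const WriteOps *write;
    Body *body;
};

static void fast_set(Scalar *self, long value) { self->body->value = value; }
static void shared_set(Scalar *self, long value);

static const WriteOps fast_ops   = { fast_set };
static const WriteOps shared_ops = { shared_set };

/* Write op installed while the body may be shared: copy first, then write. */
static void shared_set(Scalar *self, long value)
{
    /* A real implementation would lock the body here and check again. */
    if (self->body->interested >= 2) {
        Body *copy = malloc(sizeof *copy);
        if (copy == NULL)
            abort();
        copy->value = self->body->value;
        copy->interested = 1;
        self->body->interested--;   /* the original stays intact for the others */
        self->body = copy;
    }
    /* ... and unlock the original body here. */
    self->write = &fast_ops;        /* now unshared: revert to the fast setter */
    self->write->set_integer(self, value);
}

/* Called at thread creation: nothing is copied, both handles become "shared". */
Scalar scalar_share(Scalar *orig)
{
    Scalar clone;
    orig->body->interested++;
    orig->write = &shared_ops;
    clone.write = &shared_ops;
    clone.body  = orig->body;
    return clone;
}
```

The last handle left on the original body finds a count of 1, skips the copy, and simply reverts to the fast setter, which corresponds to "the last thread inherits the original".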

Nicholas Clark

Leopold Toetsch

Aug 21, 2003, 3:38:53 PM
to Nicholas Clark, perl6-i...@perl.org
Nicholas Clark <ni...@ccl4.org> wrote:
> On Thu, Aug 21, 2003 at 03:31:42PM +0200, Leopold Toetsch wrote:

>> When a PMC is shared it is either ro or rw - in all threads. I don't see
>> a problem here.

> I don't see a problem but I do see a possible optimisation.
> I can't remember if you were still in the p5p meeting in Paris when Arthur
> explained Sarathy/his idea about doing copy on write of whole scalars in
> perl5 for ithreads.

Ah yes, that. I wasn't on my first beer at that time but I remember it. I
didn't fully understand the implications of this proposal, but with ...

[ snip ]

> The beauty is that no shared scalar is ever written.

> I think that something like this could also work in parrot - you replace the
> write vtable ops with the lock/check/copy versions, but leave the read ops
> intact. When you do the copy and it's now an unshared PMC you revert the
> write vtable ops to the usual fast set.

... this excellent (thanks) explanation, this seems to be very promising
and doable, because ...

> ... Hence why I made a connection


> between split read/write ops and threading.

... a Const$PMC has a different vtable than $PMC. A Shared$Something has
the set-like part of its vtable changed, so it's yet another PMC type.

When there is a need to copy the PMC, because it would be written to,
the original vtable (being the Const one or not) has to be swapped in,
and that's it. So basically, we just have to remember the original
vtable* (or class enum) in the changed vtable, to restore the previous
one. Then the vtable method gets redone; for the shared case it would
finally throw an exception if the PMC was ro (and we could again set the
shared vtable here).

Albeit, we might end up with a non-trivial number of vtable combinations
consisting of permutations of different vtable pieces ...

> Nicholas Clark

leo

Luke Palmer

Aug 23, 2003, 3:23:36 PM
to Leopold Toetsch, Nicholas Clark, perl6-i...@perl.org
Leopold Toetsch writes:

> Nicholas Clark <ni...@ccl4.org> wrote:
> > Could we split the vtable further, so var/value by read/write, which would
> > allow constant objects to swap their write vtable to a throw implementation.
> > Did I misread what you were suggesting either in this message or later, and
> > you were suggesting this split?
>
> I'm suggesting exactly this split. Or lets say pmc2c.pl knows that e.g.
> [FETCH] is a read and [STORE] is a write, which will get a throw
> implementation for the Const$Class.
> WRT var/value split, I think only FETCH/STORE get duplicated in both
> sections.

Ok, so supposing this split happens, what would it look like? As in,
would there now be two vtables, one for variables and one for values?
Would it just be a logical split?

Thanks,
Luke

Leopold Toetsch

Aug 24, 2003, 4:56:20 AM
to Luke Palmer, perl6-i...@perl.org
Luke Palmer wrote:


> Ok, so supposing this split happens, what would it look like? As in,
> would there now be two vtables, one for variables and one for values?
> Would it just be a logical split?

AFAIK we would still have one vtable, divided into sub-structures. At
least the get/set_<type> methods would be duplicated. For plain scalars
the methods are the same: get_integer would directly return the int. For
"magic" classes, one get_integer fetches the value, and get_integer on
the value returns the int.
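
One way to picture this, as a sketch only: a single vtable struct with a "var" and a "value" part, each holding its own fetch/store slots. The layout and all names below are assumptions for illustration; the actual split was not specified at this point.

```c
typedef struct PMC PMC;

typedef struct FetchOps { long (*get_integer)(PMC *self); } FetchOps;
typedef struct StoreOps { void (*set_integer)(PMC *self, long v); } StoreOps;

/* One vtable, divided into sub-structures (hypothetical layout). */
typedef struct VTable {
    struct { FetchOps fetch; StoreOps store; } var;    /* the container */
    struct { FetchOps fetch; StoreOps store; } value;  /* its contents  */
} VTable;

struct PMC {
    const VTable *vtable;
    long int_val;
    long fetch_count;   /* only touched by the "magic" class below */
};

static long plain_get(PMC *self) { return self->int_val; }
static void plain_set(PMC *self, long v) { self->int_val = v; }

/* "Magic" variable: do the extra work, then ask the value for the int. */
static long magic_get(PMC *self)
{
    self->fetch_count++;
    return self->vtable->value.fetch.get_integer(self);
}

static void magic_set(PMC *self, long v)
{
    self->vtable->value.store.set_integer(self, v);
}

/* Plain scalar: the same functions sit in both parts. */
static const VTable Plain_vtable = {
    { { plain_get }, { plain_set } },
    { { plain_get }, { plain_set } }
};

/* Magic class: the var part differs, the value part is still the plain one. */
static const VTable Magic_vtable = {
    { { magic_get }, { magic_set } },
    { { plain_get }, { plain_set } }
};
```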

>
> Thanks,
> Luke

leo


Leopold Toetsch

Aug 26, 2003, 8:31:21 AM
to perl6-i...@perl.org
Leopold Toetsch <l.to...@nextra.at> wrote:

[ snip ]

> * const_string_from_cstring - return cached constant string or create one
> * const_key_new_cstring - return cached PMC or create a constant PMC

Some more thoughts WRT this issue:
- PerlHash isn't suitable for storing or looking up these STRINGs/Keys
- so we would need a new hash type with a C<const char*> key
(a simple implementation like the SymReg* hash inside symreg.c would
suffice IMHO)
- the keys shouldn't be copied into the hash, so they should be
declared C<static const char s[] = "xxx"> (allocated in rodata).

constant_key_new_cstring() is only needed because the hash interface is
missing an optimized {s,g}et_<type>_keyed_str() interface.

- the constant hash can be in the interpreter->iglobals array
- implementation in a new F<constants.c> source file
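
A minimal sketch of such a cache, assuming a small chained hash keyed by C<const char*>. ConstString and ConstTable are hypothetical stand-ins (not Parrot's STRING or iglobals), and only the function name is taken from the proposal. Keys are stored by pointer, never copied, so callers must pass storage that outlives the table:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CONST_BUCKETS 64

typedef struct ConstString {      /* stand-in for a constant STRING */
    const char *start;            /* points at the caller's static buffer */
    size_t      len;
} ConstString;

typedef struct ConstEntry {
    const char        *key;       /* not copied: must point to static storage */
    ConstString        str;
    struct ConstEntry *next;
} ConstEntry;

typedef struct ConstTable {       /* would live in the interpreter's globals */
    ConstEntry *buckets[CONST_BUCKETS];
} ConstTable;

static unsigned hash_cstr(const char *s)
{
    unsigned h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % CONST_BUCKETS;
}

/* Return the cached constant string for cstr, creating it on first use. */
const ConstString *const_string_from_cstring(ConstTable *t, const char *cstr)
{
    unsigned idx = hash_cstr(cstr);
    ConstEntry *e;

    for (e = t->buckets[idx]; e != NULL; e = e->next)
        if (strcmp(e->key, cstr) == 0)
            return &e->str;

    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->key       = cstr;
    e->str.start = cstr;
    e->str.len   = strlen(cstr);
    e->next      = t->buckets[idx];
    t->buckets[idx] = e;
    return &e->str;
}
```

Because this sketch relies on strlen and strcmp, a name with embedded NULs such as "\0\0anonymous" would need a variant that takes an explicit length.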

Comments welcome,
leo

PS: why we need it IMHO:
$ find . -name '*.c' | xargs egrep '_cstring.*"\w+"'
There are more, which are using string_make or multiple lines.

Benjamin Goldberg

Aug 26, 2003, 6:29:30 PM
to perl6-i...@perl.org
Juergen Boemmels wrote:
> Leopold Toetsch wrotes:

>
> > We currently have constant Key and Sub PMCs both created from the
> > packfile at load time. They live in the constant_table pointing to a
> > constant PMC pool. But we need more.
> >
> > We have allover the core code like this:
> >
> > string_from_cstring(interpreter, "pIt", 0)
> > key = key_new_cstring(interpreter, "_message");
> >
> > to create some STRINGs or entries in hashes. The keys should be
> > constant PMCs and they should be shared as well as the STRINGs.
> >
> > We need this in objects.c ("\0\0anonymous"), for setting standard
> > property names internally and for the current hash based
> > implementation of Exceptions.
>
> [...]
>
> Some time ago I did some experiments with initialising strings at
> compile time. With some preprocessor magic and a perl program scanning
> the c-file I got it running.
>
> The changes to an existing c-file are minimal: At the beginning add a
> #include "FILE.str" and replace all these string_form_cstring with
> _S("text"). The _S macros are replaced by static string_structures
> with static initialisers. These string structures are in the ro-data
> segement of the executable and should load really fast. No calls to
> any functions.

I can think of a slightly easier method, one which would not depend on
running a helper perl program.

#define PCONCAT(b,c) _Parrot_static_##b##c
#define PARROT_DECLARE_STATIC_STRING(name, cstring) \
static const char PCONCAT(name,_cstring) [] = cstring; \
static const struct parrot_string_t \
PCONCAT(name,_STRING) = { \
{ /* pobj_t */ \
{{ \
(void*)PCONCAT(name,_cstring), \
sizeof(PCONCAT(name,_cstring)) \
}}, \
PObj_constant_FLAG \
GC_DEBUG_VERSION \
}, \
sizeof(PCONCAT(name,_cstring)), \
(void*)PCONCAT(name,_cstring), \
sizeof(PCONCAT(name,_cstring)) - 1, \
NULL, \
NULL, \
0 \
}, * const name = &PCONCAT(name,_STRING);
[untested]

Then, of course, one can simply declare the names and values of one's
constant strings in any place where variables may be declared, using:

PARROT_DECLARE_STATIC_STRING(mystr, "some string here");

Now, there's a const STRING * const mystr variable, whose contents are the
"some string here". Actually, I'm not sure that there's a need for the
PCONCAT(name,_cstring) variable -- it might be possible to use cstring
directly when initializing the struct.

--
$a=24;split//,240513;s/\B/ => /for@@=qw(ac ab bc ba cb ca
);{push(@b,$a),($a-=6)^=1 for 2..$a/6x--$|;print "$@[$a%6
]\n";((6<=($a-=6))?$a+=$_[$a%6]-$a%6:($a=pop @b))&&redo;}

Leopold Toetsch

Aug 27, 2003, 4:21:11 AM
to Benjamin Goldberg, perl6-i...@perl.org
Benjamin Goldberg <ben.go...@hotpop.com> wrote:

> #define PARROT_DECLARE_STATIC_STRING(name, cstring) \

[ big snip ]

While Juergen's original or your proposal are fine, they don't work
(or not as proposed). First there are compiler issues:

$ gcc -Iinclude -Lblib/lib -lparrot -ldl -Wall -g bg.c && ./a.out

#v+
bg.c: In function `main':
bg.c:35: warning: braces around scalar initializer
bg.c:35: warning: (near initialization for `_Parrot_static_mystr_STRING.obj.u.int_val')
bg.c:35: warning: initialization makes integer from pointer without a cast
bg.c:35: warning: excess elements in scalar initializer
bg.c:35: warning: (near initialization for `_Parrot_static_mystr_STRING.obj.u.int_val')
#v-

The declaration looks ok at first sight, my gcc 2.95.2 might be wrong, but
anyway, a bigger problem is here:

PIO_printf(interpreter, "%S\n", mystr);

Program received signal SIGSEGV, Segmentation fault.
0x400f8725 in make_COW_reference (interpreter=0x8049828, s=0x80486e0)
at string.c:95
95 PObj_constant_CLEAR(s);
(gdb) bac
#0 0x400f8725 in make_COW_reference (interpreter=0x8049828, s=0x80486e0)
at string.c:95
#1 0x400f91b6 in string_copy (interpreter=0x8049828, s=0x80486e0)
at string.c:497

snippet from objdump:

.rodata 00000011 _Parrot_static_mystr_cstring.15
.rodata 00000024 _Parrot_static_mystr_STRING.16

If we get a general, compiler-independent solution for declaring static
STRINGs, we still have nothing for static keys.

leo


#include "parrot/parrot.h"
#include "parrot/embed.h"

#if ! DISABLE_GC_DEBUG
# define GC_DEBUG_VERSION ,0
#else
# define GC_DEBUG_VERSION
#endif

#define PCONCAT(b,c) _Parrot_static_##b##c
#define PARROT_DECLARE_STATIC_STRING(name, cstring) \
static const char PCONCAT(name,_cstring) [] = cstring; \
static const struct parrot_string_t \
PCONCAT(name,_STRING) = { \
{ /* pobj_t */ \
{{ \
(void*)PCONCAT(name,_cstring), \
sizeof(PCONCAT(name,_cstring)) \
}}, \
PObj_constant_FLAG \
GC_DEBUG_VERSION \
}, \
sizeof(PCONCAT(name,_cstring)), \
(void*)PCONCAT(name,_cstring), \
sizeof(PCONCAT(name,_cstring)) - 1, \
NULL, \
NULL, \
0 \
}, * const name = &PCONCAT(name,_STRING)

int main(int argc, char* argv[]) {

int dummy_var;
struct Parrot_Interp * interpreter;


PARROT_DECLARE_STATIC_STRING(mystr, "some string here");

interpreter = Parrot_new();
Parrot_init(interpreter, (void*) &dummy_var);
PIO_printf(interpreter, "%S\n", mystr);
return 0;
}

Juergen Boemmels

Aug 27, 2003, 8:14:40 AM
to l...@toetsch.at, Benjamin Goldberg, perl6-i...@perl.org
Leopold Toetsch <l...@toetsch.at> writes:

> Benjamin Goldberg <ben.go...@hotpop.com> wrote:
>
> > #define PARROT_DECLARE_STATIC_STRING(name, cstring) \
>
> [ big snip ]
>
> While Juergen's original or your proposal are fine, they don't work
> (or not as proposed). First there are compiler issues:
>
> $ gcc -Iinclude -Lblib/lib -lparrot -ldl -Wall -g bg.c && ./a.out
>
> #v+
> bg.c: In function `main':
> bg.c:35: warning: braces around scalar initializer
> bg.c:35: warning: (near initialization for `_Parrot_static_mystr_STRING.obj.u.int_val')
> bg.c:35: warning: initialization makes integer from pointer without a cast
> bg.c:35: warning: excess elements in scalar initializer
> bg.c:35: warning: (near initialization for `_Parrot_static_mystr_STRING.obj.u.int_val')
> #v-

The first warning is about the union initialisation. An initialiser of
a union always initialises the first member, which is an INTVAL, but I
wanted to initialise the last member (which is not possible in pure
ANSI C, but it turned out gcc got it right anyway). A simple change of
order in the union would make this warning go away.
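
The rule can be illustrated with a reduced union. This is a sketch, not Parrot's actual PObj layout; Val, text, first and by_name are hypothetical names. A plain brace initialiser always lands in the first member, while C99 designated initialisers (standardised after the compiler discussed here) can name any member:

```c
typedef union Val {
    long  int_val;     /* first member: the only one a plain initialiser can set */
    void *ptr_val;
} Val;

static const char text[] = "foo";

/* ANSI C (C89): the initialiser always applies to the first member. */
static const Val first = { 42 };

/* C99: a designated initialiser picks the member by name. */
static const Val by_name = { .ptr_val = (void *)text };
```

Reordering the union so that the pointer member comes first, as suggested above, gets the same effect without needing C99.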

> The declaration looks ok at first sight, my gcc 2.95.2 might be wrong, but
> anyway, a bigger problem is here:
>
> PIO_printf(interpreter, "%S\n", mystr);
>
> Program received signal SIGSEGV, Segmentation fault.
> 0x400f8725 in make_COW_reference (interpreter=0x8049828, s=0x80486e0)
> at string.c:95
> 95 PObj_constant_CLEAR(s);
> (gdb) bac
> #0 0x400f8725 in make_COW_reference (interpreter=0x8049828, s=0x80486e0)
> at string.c:95
> #1 0x400f91b6 in string_copy (interpreter=0x8049828, s=0x80486e0)
> at string.c:497

Ah it seems that make_COW_reference wants to change a constant
string. As my code puts the static strings in .rodata (this was one of
the targets of my experiments) it is not possible in any way to change
this item.

I even think PObj_constant_CLEAR(s) is evil. One reason for
setting a PObj_constant_FLAG is to declare that an object will not
change. Unsetting this flag means breaking this contract.

> snippet from objdump:
>
> .rodata 00000011 _Parrot_static_mystr_cstring.15
> .rodata 00000024 _Parrot_static_mystr_STRING.16
>
> If we get a general compiler independend solution for declaring static
> STRINGs we still have nothing for static keys.

The main problem of all this code is the union initialiser. Static
keys could also be created with an initialiser, but I think this needs
another union member to be initialised. There are gcc extensions to
initialise arbitrary members, but ANSI C allows only the initialisation
of the first member.

bye

Leopold Toetsch

Aug 28, 2003, 5:42:45 AM
to perl6-i...@perl.org
The issues WRT the union initializer are gone now (can people please
test, if the program below compiles cleanly now) - but:

- make_COW_* sets flag on the source string [1]
- if that is solved (with a special RO flag, or whatever) we have:
- string_make sets a default type and encoding if these arguments were
NULL. These constant strings have NULL for encoding/type so during
string_append as the encodings differ transcode is called - next SIGSEGV
with NULL encoding pointer.

So using these .rodata-constant STRINGs needs a lot of work:
- COW code (there are problems with COWed stacks anyway)
- transcoding stuff (should be changed to use {strstart, byteidx} and do
something about default vs NULL type/encoding).

leo

[1] this was called via string_copy in the spf_vtable code (I don't see
the point, why a string should be copied just to print it - removed)

Juergen Boemmels

Aug 28, 2003, 12:30:15 PM
to Leopold Toetsch, perl6-i...@perl.org
Leopold Toetsch <l.to...@nextra.at> writes:

> The issues WRT the union initializer are gone now (can people please
> test, if the program below compiles cleanly now) - but:
>
>
> - make_COW_* sets flag on the source string [1]
> - if that is solved (with a special RO flag, or whatever) we have:
> - string_make sets a default type and encoding if these arguments were
> NULL. These constant strings have NULL for encoding/type so during
> string_append as the encodings differ transcode is called - next
> SIGSEGV with NULL encoding pointer.

It is in principle possible to use a &default_encoding in the static
initialiser, but then a relocation is needed because the static string
and &default_encoding are not in the same object-file. This would lead
to a longer load-time.

On the other hand accessing the encoding through macros
#define ENCODING_skip_forward(enc, p, n) ((enc) ? \
(enc)->skip_forward(p, n) : \
default_enc->skip_forward(p, n))
has a runtime cost; the test will fool the branch prediction.

> So using these .rodata-constant STRINGs needs a lot of work.
> - COW code (there are problems with COWed stacks anyway)
> - transcoding stuff (should be changed to use {strstart, byteidx} and
> do something about default vs NULL type/encoding.

Next idea:
create a new function
STRING *string_from_static_cstring(const char *cstr);
which does not copy the string to newly allocated memory. But maybe
then the problems with COW are still there.

[...]
bye

Benjamin Goldberg

Aug 28, 2003, 7:56:03 PM
to perl6-i...@perl.org

Leopold Toetsch wrote:
>
> We currently have constant Key and Sub PMCs both created from the
> packfile at load time. They live in the constant_table pointing to a
> constant PMC pool. But we need more.
>
> We have allover the core code like this:
>
> string_from_cstring(interpreter, "pIt", 0)

Don't do that. Instead, do:

string_from_cstring(interpreter, "pIt", PObj_external_FLAG);

This will allow parrot to avoid making a copy of "pIt".

Actually, IMHO, we should disallow (both for this, and for string_make)
a simple default of 0 for 'flags'. Instead, I think we should require
that the programmer explicitly indicate exactly what the pointer we're
passing in is. It could be any of:

1. A buffer which might be reused by something else, so we should
   allocate space, and copy the data into it.
2. A const static buffer which won't (can't) ever change. We can
   capture it and use it, but shouldn't [ever] try altering it.
3. An external buffer which won't change on us, but which is not
   constant memory ... like non-const static memory, or perhaps a string
   from a program in which parrot is embedded. We can freely use it
   (without copying it) and even change it, but growing it requires that
   we allocate from our own arenas. Also, when we no longer need it, just
   let go of it.
4. A buffer allocated by mem_sys_allocate, which the calling function
   wants to *give up* to the string. We can use it directly without
   copying it, reallocate it as needed, and free it to our memory arenas
   when we're done with it. Indeed, by indicating this, it would now be
   the string's responsibility to free up the memory when it's gone,
   instead of being the caller's responsibility.

If there are any other scenarios, I'm not aware of them.
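
As a sketch, the four cases could be made explicit with an enum that the caller must pass. BufferKind, Str and the str_* functions are hypothetical names, and malloc/free stand in for mem_sys_allocate and the memory arenas:

```c
#include <stdlib.h>
#include <string.h>

typedef enum BufferKind {
    BUF_COPY,      /* 1: caller may reuse the buffer, so copy it        */
    BUF_CONST,     /* 2: static const memory: borrow, never modify      */
    BUF_EXTERNAL,  /* 3: stable but not ours: borrow, may modify        */
    BUF_TAKE       /* 4: heap buffer handed over: the string frees it   */
} BufferKind;

typedef struct Str {
    char      *start;
    size_t     len;
    BufferKind kind;
} Str;

/* Initialise s from buf; only BUF_COPY allocates. Returns 0, or -1 on OOM. */
int str_init(Str *s, const char *buf, size_t len, BufferKind kind)
{
    s->len  = len;
    s->kind = kind;
    if (kind == BUF_COPY) {
        s->start = malloc(len + 1);
        if (s->start == NULL)
            return -1;
        memcpy(s->start, buf, len);
        s->start[len] = '\0';
    }
    else {
        /* Borrowed or handed-over buffer; BUF_CONST must never be written. */
        s->start = (char *)buf;
    }
    return 0;
}

int str_owns_buffer(const Str *s)
{
    return s->kind == BUF_COPY || s->kind == BUF_TAKE;
}

int str_is_writable_in_place(const Str *s)
{
    return s->kind != BUF_CONST;
}

/* Free the buffer only when the string owns it; otherwise just let go. */
void str_destroy(Str *s)
{
    if (str_owns_buffer(s))
        free(s->start);
    s->start = NULL;
    s->len   = 0;
}
```

With no default value to fall back on, a call such as str_init(&s, "pIt", 3, BUF_CONST) states at the call site that the literal may be captured without copying.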

The majority of calls from user-written C code will be of type 2: A
constant static buffer which won't (can't) ever change (like in your
example). Some user-land code will be type 1 (for example, turning the
result of inet_ntoa into a string).

I'm not sure if type 3 has a use -- one can claim that the string is
type 1, without harm.

There wouldn't be too many times that type 4 would need to be used,
except as an optimization (change places where we make a STRING then
free the memory that the cstring was in, into making a STRING and giving
up ownership of the cstring-pointer to the STRING). As I recall, there
are places in perl5 where buffers are "stolen" instead of copied... this
would be the same sort of thing.

Strings whose buffers are of type 2 would have a big advantage with
respect to COW: nothing needs to be done to make them COW (we copy the
pointer, and copy the flags), and nothing needs to be done to make them
non-COW.

In addition, I think we need *another* argument: One to tell what type
of STRING* object we want back. In particular, we might want back a
constant one, for which appending or truncating is illegal. In addition
to that, we might (or might not) be willing to accept a STRING which
isn't newly allocated, but was cached from an earlier call.
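
To make the "cached from an earlier call" idea concrete, here is a
minimal sketch of a constant-string cache. Everything in it (const_str,
const_str_get, the fixed-size table and the linear search) is an
assumption for illustration; a real version would presumably hash the
contents and live in the interpreter rather than in file-scope statics.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: a tiny cache of constant strings. */
typedef struct {
    const char *data;   /* borrowed from the caller, never written to */
    size_t      len;
} const_str;

#define CACHE_SLOTS 64

static const_str cache[CACHE_SLOTS];
static size_t    cache_used = 0;

/* Return the cached header for cstr, creating it on first use.
 * cstr must outlive the cache (e.g. a string literal).
 * Returns NULL when the table is full. */
static const const_str *const_str_get(const char *cstr)
{
    size_t i;
    size_t len = strlen(cstr);

    for (i = 0; i < cache_used; i++) {
        if (cache[i].len == len && memcmp(cache[i].data, cstr, len) == 0)
            return &cache[i];
    }
    if (cache_used == CACHE_SLOTS)
        return NULL;
    cache[cache_used].data = cstr;
    cache[cache_used].len  = len;
    return &cache[cache_used++];
}
```

Handing back a pointer to a const header is one way of expressing that
appending to or truncating the result is illegal; two calls with equal
contents yield the same header.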

> key = key_new_cstring(interpreter, "_message");

IMHO, key_new_cstring should allow a "flags" argument, so that we can
pass in either a PObj_external_FLAG or 0; otherwise, it won't know
whether or not it needs to make a copy.

> to create some STRINGs or entries in hashes. The keys should be constant
> PMCs and they should be shared as well as the STRINGs.
> We need this in objects.c ("\0\0anonymous"), for setting standard
> property names internally and for the current hash based implementation
> of Exceptions.
>
> This leads to my proposal:
> * const_string_from_cstring - return cached constant string or create
> one

Instead of a new entry point, how about adding whatever behavior you
want here to be an added behavior of string_from_cstring, dependent on
what flags were passed as a third argument?

> * const_key_new_cstring - return cached PMC or create a constant PMC

If

> * constant_pmc_new* - ditto (called from above)
>
> Further we would need for some classes a Const$Class variant, where the
> set-like vtables throw an exception. These classes should be
> autogenerated from *.pmc for all classes that have a "const_too" or such
> in their classes $flags.
>
> This leads to changes in parsing the vtable.tbl - which we need anyway
> to do the proposed var/value split of vtables.
>
> e.g.
> [FETCH]
> get_integer
> ...
> [STORE]
> set_integer
> ...
> [PUSH]
> push_integer
> ...
> and so on.
> The section names conform (where applicable) to methods described in
> Tie::*(3pm).

--

Benjamin Goldberg

Aug 28, 2003, 8:11:44 PM
to perl6-i...@perl.org

Juergen Boemmels wrote:
>
> Leopold Toetsch <l...@toetsch.at> writes:
>
> > Benjamin Goldberg <ben.go...@hotpop.com> wrote:
> >
> > > #define PARROT_DECLARE_STATIC_STRING(name, cstring) \
> >
> > [ big snip ]
> >
> > While Juergen's original or your proposal are fine, they don't work
> > (or not as proposed). First there are compiler issues:
> >

[snip]


> > The declaration looks ok at first sight, my gcc 2.95.2 might be wrong, but
> > anyway, a bigger problem is here:
> >
> > PIO_printf(interpreter, "%S\n", mystr);
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > 0x400f8725 in make_COW_reference (interpreter=0x8049828, s=0x80486e0)
> > at string.c:95
> > 95 PObj_constant_CLEAR(s);
> > (gdb) bac
> > #0 0x400f8725 in make_COW_reference (interpreter=0x8049828, s=0x80486e0)
> > at string.c:95
> > #1 0x400f91b6 in string_copy (interpreter=0x8049828, s=0x80486e0)
> > at string.c:497
>
> Ah it seems that make_COW_reference wants to change a constant
> string. As my code puts the static strings in .rodata (this was one of
> the targets of my experiments) it is not possible in any way to change
> this item.

The problem is that as far as the COW logic knows, only the *buffer* is
constant. In this (peculiar?) case, both the string buffer and the
string header are constant.

> I even think PObj_constant_CLEAR(s) is evil. One reason for
> setting a PObj_constant_FLAG is to declare that an object will not
> change. Unsetting this flag means breaking this contract.

It's clearing the constant flag, then making the copy, then turning the
constant flag back on. If it were only the buffer that were in const
memory, and not the header, this would be perfectly fine. I think.
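
One possible way around the crash would be for the copy routine to
leave a constant source header completely alone and fix up the flags on
the new header only. The sketch below uses made-up flag and struct
names; it is not the actual string.c code, just an illustration of the
idea.

```c
#include <stddef.h>

/* Hypothetical flags and header layout, not Parrot's. */
#define F_CONSTANT 0x1u   /* header and buffer must not be written */
#define F_COW      0x2u   /* buffer is shared, copy before writing */

typedef struct {
    const char *buf;
    size_t      len;
    unsigned    flags;
} cow_str;

/* Make dst share src's buffer.
 * A constant source is only read, never written, so its header may
 * live in read-only memory. A writable source is marked F_COW so that
 * it too copies the buffer before modifying it. */
static void cow_reference(cow_str *dst, cow_str *src)
{
    if (!(src->flags & F_CONSTANT))
        src->flags |= F_COW;

    dst->buf   = src->buf;
    dst->len   = src->len;
    /* The new header is an ordinary, writable one sharing the buffer. */
    dst->flags = (src->flags & ~F_CONSTANT) | F_COW;
}
```

Since a constant string never changes, there is no need to mark it as
shared, and so no need to clear and re-set its constant flag.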

> > snippet from objdump:
> >
> > .rodata 00000011 _Parrot_static_mystr_cstring.15
> > .rodata 00000024 _Parrot_static_mystr_STRING.16
> >
> > If we get a general compiler-independent solution for declaring static
> > STRINGs we still have nothing for static keys.
>
> The main problem of all this code is the union initialiser. Static
> keys could also be created with an initialiser, but I think this needs
> another union-val to be initialized. There are gcc extensions to init
> arbitrary members, but ANSI-C allows only the initialization of the
> first member.
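
For anyone following along, the union initializer restriction looks
like this in a small example. The union below is a made-up stand-in for
the value slot of a header, not Parrot's actual layout. The second form
is a C99 designated initializer, so compilers without C99 support will
reject it.

```c
/* Hypothetical union, standing in for the value slot of a header. */
typedef union {
    long        int_val;   /* first member: the only one C89 can initialize */
    const char *str_val;
} slot;

/* C89: a brace initializer always applies to the first member. */
static const slot as_int = { 42 };

/* C99 designated initializer: any member can be named explicitly. */
static const slot as_str = { .str_val = "_message" };
```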

--

Vladimir Lipskiy

Aug 29, 2003, 2:29:26 AM
to perl6-internals, Leopold Toetsch
> can people please test, if the program below compiles cleanly now:

It doesn't. MSVC++ ain't happy when the * comes after the name.


> static const char PCONCAT(name,_cstring) * = cstring;

this
static const char *PCONCAT(name,_cstring) = cstring;
would be okay.
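
A self-contained version of the corrected declaration, for testing with
other compilers. The PCONCAT helpers and the DECLARE_STATIC_CSTRING
wrapper here are my own guesses at what the macros look like; the real
definitions may differ.

```c
/* Hypothetical token-pasting helpers; the real PCONCAT may differ. */
#define PCONCAT_(a, b) a##b
#define PCONCAT(a, b)  PCONCAT_(a, b)

/* The '*' belongs before the declared name, not after it. */
#define DECLARE_STATIC_CSTRING(name, cstring) \
    static const char *PCONCAT(name, _cstring) = cstring

/* Expands to: static const char *mystr_cstring = "some string"; */
DECLARE_STATIC_CSTRING(mystr, "some string");
```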

Leopold Toetsch

Aug 29, 2003, 4:00:21 AM
to Benjamin Goldberg, perl6-i...@perl.org
Benjamin Goldberg <ben.go...@hotpop.com> wrote:

> Leopold Toetsch wrote:
>>
>> We currently have constant Key and Sub PMCs both created from the
>> packfile at load time. They live in the constant_table pointing to a
>> constant PMC pool. But we need more.
>>
>> We have allover the core code like this:
>>
>> string_from_cstring(interpreter, "pIt", 0)

> Don't do that. Instead, do:

> string_from_cstring(interpreter, "pIt", PObj_external_FLAG);

> This will allow parrot to avoid making a copy of "pIt".

Yep, that's right. And such constant strings should be constructed with
the terminating '\0' in place, so that string_to_cstring() for these
doesn't have to grow the string and place a NUL at the end.
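
A sketch of what that could look like, with invented names (lit_str,
LIT_STR, lit_str_cstring) and a caller-supplied scratch buffer standing
in for growing the string; none of this is existing Parrot code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical header for a string built from a C literal. */
typedef struct {
    const char *buf;
    size_t      len;             /* length without the trailing NUL  */
    int         nul_terminated;  /* buf[len] is known to be '\0'     */
} lit_str;

/* sizeof counts the literal's trailing NUL, so it is already in place. */
#define LIT_STR(literal) { literal, sizeof(literal) - 1, 1 }

/* Return a NUL-terminated view of s.
 * Fast path: the buffer is already terminated, hand it back directly.
 * Slow path: copy into scratch and terminate there.
 * Returns NULL if scratch is too small. */
static const char *lit_str_cstring(const lit_str *s,
                                   char *scratch, size_t scratch_size)
{
    if (s->nul_terminated)
        return s->buf;
    if (s->len + 1 > scratch_size)
        return NULL;
    memcpy(scratch, s->buf, s->len);
    scratch[s->len] = '\0';
    return scratch;
}
```

For a string declared with LIT_STR("pIt") the conversion is then just a
pointer return, with no allocation or copy.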

> Actually, IMHO, we should disallow (both for this, and for string_make)
> a simple default of 0 for 'flags'.

We could redefine the meaning of string_from_cstring() to be "String
from constant C-string".

>> key = key_new_cstring(interpreter, "_message");

> IMHO, key_new_cstring should allow a "flags" argument, so that we can
> pass in either a PObj_external_FLAG or 0; otherwise, it won't know
> whether or not it needs to make a copy.

The major problem here is that it's used to create hash keys, because
the vtables are missing string-optimized variants. So each hash access
like the one above needs a new PMC at runtime just to hold a String.
We will probably have many more hash accesses with constant items at
runtime once objects are in, so I think it's worth the optimization.

The flags issue applies too, of course.

>> * const_string_from_cstring - return cached constant string or create
>> one

> Instead of a new entry point, how about adding whatever behavior you
> want here to be an added behavior of string_from_cstring, dependent on
> what flags were passed as a third argument?

Yep, that's better.

leo
