When embedding or extending Parrot through the external API, most of the
strings go into and come out of Parrot as the type Parrot_STRING. This is
painful and somewhat tedious from C (where these are usually -- but not
always -- C strings already), and it has implications for memory management
(do you use const_string()? Create a new Parrot_STRING through a function?).
Where possible, the external API should take and receive C strings.
-- c
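To make the memory-management question concrete, here is a minimal sketch of what a C-string-facing wrapper layer could look like. Everything in it is hypothetical: demo_string is only a stand-in for a counted string type such as Parrot_STRING, and none of the function names are part of the real Parrot API. The point is the ownership rule each direction has to document.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a counted string such as Parrot_STRING.
 * This is NOT the real Parrot type or API. */
typedef struct demo_string {
    size_t len;  /* number of bytes in buf; no terminator is stored */
    char  *buf;  /* owned by the struct, not NUL-terminated */
} demo_string;

/* Wrap a C string. The bytes are copied, so the caller keeps ownership
 * of cstr and may free or reuse it immediately. Returns NULL on
 * allocation failure. */
static demo_string *demo_string_from_cstr(const char *cstr) {
    assert(cstr != NULL);
    demo_string *s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->len = strlen(cstr);
    s->buf = malloc(s->len ? s->len : 1);  /* avoid a zero-size malloc */
    if (s->buf == NULL) {
        free(s);
        return NULL;
    }
    memcpy(s->buf, cstr, s->len);
    return s;
}

/* Unwrap to a fresh NUL-terminated copy. The caller must free() it. */
static char *demo_string_to_cstr(const demo_string *s) {
    assert(s != NULL);
    char *out = malloc(s->len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, s->buf, s->len);
    out[s->len] = '\0';
    return out;
}

/* Release a demo_string and the buffer it owns. */
static void demo_string_free(demo_string *s) {
    if (s != NULL) {
        free(s->buf);
        free(s);
    }
}
```

With wrappers like these, an embedder would pass "hello" in directly and get a char * back that it frees itself, without ever handling the counted type.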
The distinct STRING* or Parrot_STRING* (in extend) type doesn't make much
sense at all. It's a full-blown Unicode-capable type and much more
complicated than a plain PMC. And it's not a light-weight low-level type at
all, despite what some explanations and docs say.
I have already proposed the following:
* STRING arguments use a String PMC
  - gets rid of one extra indirection in the String PMC
  - removes a lot of duplicate code with STRING vs. (String) PMC args
  - all our dynamic HLLs except Perl6 don't have a notion of 'str' anyway
    and are using a *String PMC
* the current S register type becomes a C-string
  - this hopefully matches the Perl6 'str' type (a buffer of 'short' ints)
  - it's nicely covered and optimizable by libc's string functions
my 2c
leo
Some problems/questions on this ticket (which admittedly is very old by
now)...
* A number of Parrot's string-related opcodes currently only work on
string registers. The 'downcase' opcode comes to mind, but there are
quite a few others. So, we either have to allow string registers to
contain Unicode, or we have to provide PMC-capable versions of all string
operations.
* C-strings are null terminated, whereas Perl6 'str' type is not.
* I don't know how this ticket plays with the recent "strings pdd" (PDD28).
Let's abandon this ticket or incorporate its items into the strings pdd
discussions/documentation. Or both.
Pm
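On the null-termination point above, a small sketch of why a counted buffer and a C string disagree once the data contains a NUL byte. The counted_buf type and visible_to_libc() are made up for illustration and are not Parrot or Perl6 definitions; only memchr() and strlen() are real libc calls.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted buffer: the length is carried explicitly,
 * so the data may legally contain NUL bytes. */
typedef struct counted_buf {
    const char *data;
    size_t      len;
} counted_buf;

/* How many bytes a NUL-terminated consumer (strlen, strcpy, ...) would
 * see: everything up to the first NUL, or the whole buffer if there is
 * none. memchr is bounded by len, so it never reads past the buffer. */
static size_t visible_to_libc(const counted_buf *b) {
    assert(b != NULL && b->data != NULL);
    const char *nul = memchr(b->data, '\0', b->len);
    return nul ? (size_t)(nul - b->data) : b->len;
}
```

For a five-byte buffer holding 'a', 'b', NUL, 'c', 'd', the counted length is 5 but libc's string functions would stop after 2 bytes, so any S register that becomes a plain C string either forbids embedded NULs or silently truncates.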