Proposed amendment to chartype structure, is_digit and get_digit

Peter Gibbs

Sep 5, 2003, 7:18:49 AM
to perl6-internals
I am working on adding support for additional chartypes to Parrot, plus
the capability for run-time registration of same. To this end, I would like to:

1) Add CHARTYPE* as a parameter to the <chartype>_is_digit/_get_digit functions
2) Create a new struct chartype_digit_map_t to contain mappings from code values
to digit values
3) Add a pointer to the above struct to the CHARTYPE structure
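
As a rough illustration of the three changes listed above, the following C sketch assumes the simplest possible digit map: a single contiguous run of code values for the digits 0 to 9. Only CHARTYPE and chartype_digit_map_t are names from the proposal; the field names, the generic_* function names, and the usascii example are hypothetical, and none of this is taken from the Parrot source.

```c
#include <stddef.h>

/* Hypothetical layout: one contiguous run of code values for digits 0..9. */
typedef struct chartype_digit_map_t {
    unsigned int first_code;   /* code value of digit zero */
    unsigned int last_code;    /* code value of digit nine */
} chartype_digit_map_t;

/* Cut-down stand-in for the CHARTYPE structure (assumed fields only). */
typedef struct chartype {
    const char *name;
    const chartype_digit_map_t *digit_map;   /* item 3: pointer to the map */
} CHARTYPE;

/* Item 1: the chartype is passed in, so one generic routine can serve
 * every chartype whose digits fit the simple map. */
static int generic_is_digit(const CHARTYPE *type, unsigned int c)
{
    const chartype_digit_map_t *map = type->digit_map;
    return map != NULL && c >= map->first_code && c <= map->last_code;
}

/* Returns the digit value 0..9, or -1 if c is not a digit in this chartype. */
static int generic_get_digit(const CHARTYPE *type, unsigned int c)
{
    if (!generic_is_digit(type, c))
        return -1;
    return (int)(c - type->digit_map->first_code);
}

/* Example instance (hypothetical): ASCII digits '0'..'9'. */
static const chartype_digit_map_t usascii_digit_map = { 0x30, 0x39 };
static const CHARTYPE usascii_chartype = { "usascii", &usascii_digit_map };
```

With this shape, a chartype registered at run time would only need to supply its own map to get digit handling, without new code.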

Any comments on the above before I go ahead?

--
Peter Gibbs
EmKel Systems

Dan Sugalski

Sep 5, 2003, 8:46:52 AM
to Peter Gibbs, perl6-internals
On Fri, 5 Sep 2003, Peter Gibbs wrote:

> I am working on adding support for additional chartypes to Parrot, plus
> the capability for run-time registration of same. To this end, I would like to:
>
> 1) Add CHARTYPE* as a parameter to the <chartype>_is_digit/_get_digit functions
> 2) Create a new struct chartype_digit_map_t to contain mappings from code values
> to digit values
> 3) Add a pointer to the above struct to the CHARTYPE structure

Go for it. There is the possibility that identifying a digit is more
involved than we might otherwise want for some chartypes, however--the
first thing that pops to mind is the fun that ensues in the
Chinese-derived writing systems where the characters for the various
numbers may have non-numeric meanings in some circumstances. Those cases
(and some other functions, such as "what is a word character" and
"what is a word boundary") require something more complex than a plain
lookup table.

Dan

Peter Gibbs

Sep 5, 2003, 8:58:46 AM
to Dan Sugalski, perl6-internals
On Fri, 5 Sep 2003, Dan Sugalski wrote:

> Go for it. There is the possibility that identifying a digit is more
> involved than we might otherwise want for some chartypes, however--the

Yeah - I am hoping to handle the simpler cases generically, so that we only
need to write specific code for the less simple ones.
The current methods for both digit handling and transcoding are
context-free, which I suspect may become a problem later; if so, some form
of iterator with context information will be required.

Dan Sugalski

Sep 5, 2003, 9:05:21 AM
to Peter Gibbs, perl6-internals

Which should be just *so* much fun... :)

Since you're modifying the struct, make sure there are entries for
*functions* that do all the things you're putting in pointers to data
members for. We can NULL them out for now, but it'll mean that when
someone throws the Shift-JIS chartype code in we won't have to change the
struct and rebuild the world.
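
One way to read this suggestion is sketched below in C: each data pointer in the struct gets a matching function slot, and a dispatcher uses the function when it is set and falls back to the table when it is NULL. The field names, chartype_get_digit, and the two example chartypes are hypothetical, and the single-range digit map is an assumption made for brevity, not Parrot's actual layout.

```c
#include <stddef.h>

typedef struct chartype CHARTYPE;

/* Assumed simple map: one contiguous run of code values for digits 0..9. */
typedef struct chartype_digit_map_t {
    unsigned int first_code;
    unsigned int last_code;
} chartype_digit_map_t;

struct chartype {
    const char *name;
    const chartype_digit_map_t *digit_map;                /* data: simple cases */
    int (*get_digit)(const CHARTYPE *, unsigned int);     /* code: may be NULL  */
};

/* Dispatcher: prefer the function slot, fall back to the lookup data. */
static int chartype_get_digit(const CHARTYPE *type, unsigned int c)
{
    const chartype_digit_map_t *map = type->digit_map;

    if (type->get_digit != NULL)
        return type->get_digit(type, c);
    if (map != NULL && c >= map->first_code && c <= map->last_code)
        return (int)(c - map->first_code);
    return -1;   /* not a digit in this chartype */
}

/* Hypothetical "less simple" chartype: accepts ASCII digits and the
 * fullwidth digits U+FF10..U+FF19, which a single range cannot express. */
static int twin_range_get_digit(const CHARTYPE *type, unsigned int c)
{
    (void)type;   /* unused in this example */
    if (c >= 0x30 && c <= 0x39)
        return (int)(c - 0x30);
    if (c >= 0xFF10 && c <= 0xFF19)
        return (int)(c - 0xFF10);
    return -1;
}

static const chartype_digit_map_t ascii_map = { 0x30, 0x39 };
static const CHARTYPE table_only = { "table-only", &ascii_map, NULL };
static const CHARTYPE with_func  = { "twin-range", NULL, twin_range_get_digit };
```

Because the slot already exists, a chartype that needs real code only fills in the pointer; the struct layout, and everything compiled against it, stays unchanged.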

Dan
