On Tuesday, July 5, 2016 at 8:25:06 AM UTC-5, Richard Heathfield wrote:
> On 05/07/16 14:18, Richard Bos wrote:
> > Eric Sosman <eso...@comcast-dot-net.invalid> wrote:
> >
> >> On 7/5/2016 1:30 AM, luser droog wrote:
> >>> I've been reworking my utf8<->ucs4 conversion code in
> >>> preparation for overlaying my own codes over the unused
> >>> portions of the first byte [...]
> >>
> >> comp.lang.tactile.basic is down the hall to your left.
> >
> > Not on my server, it isn't. Now I'm curious. A quick websearch reveals
> > nothing.
>
> Perhaps he's being soscastic, and actually means VB (which has been
> well-described as 'stickle-brick programming' - very tactile in every
> sense (except, it must be admitted, in the sense of the sense of touch)).
>
I am aware of the infelicity of creating a new thing that's
ever-so-slightly incompatible with everything else. And in
the case of single-byte encodings for APL languages, there
are a lot of existing options to choose from:
http://meta.codegolf.stackexchange.com/questions/9428/when-can-apl-characters-be-counted-as-1-byte-each
But none of the existing ones work quite the way I want.
In particular, none cooperates with UTF8-encoded characters
in the same stream. Overlaying my shortcut codes on the
80-BF range of the first byte enables a workflow where
a source file can be edited with normal UTF8 tools and then
passed through a compression program to yield the program-specific
encoded form. Similar programs could easily convert to/from other
popular APL encodings if desired.
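
To make the overlay concrete, here is a minimal sketch of the decoding
side. It is not my actual code: the name decode_one and the contents of
the shortcut table are made up for illustration, and only three of the
64 slots are filled in. The idea is that a byte in 80-BF can never start
a well-formed UTF8 sequence, so in lead position it is free to mean
something else, while the same byte values keep their usual role as
continuation bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical shortcut table, one slot per lead byte 0x80..0xBF.
   The assignments below are illustrative assumptions only;
   a zero entry means "unassigned". */
static const uint32_t shortcut[64] = {
    [0x00] = 0x2190, /* 0x80 -> U+2190 LEFTWARDS ARROW */
    [0x01] = 0x2374, /* 0x81 -> U+2374 APL FUNCTIONAL SYMBOL RHO */
    [0x02] = 0x2373, /* 0x82 -> U+2373 APL FUNCTIONAL SYMBOL IOTA */
};

/* Decode one code point from s[0..n).
   Stores it in *out and returns the number of bytes consumed,
   or 0 if the input is empty, truncated, or malformed.
   Sketch only: overlong forms and surrogates are not rejected. */
size_t decode_one(const unsigned char *s, size_t n, uint32_t *out)
{
    if (n == 0)
        return 0;

    unsigned char b = s[0];

    if (b < 0x80) {             /* plain ASCII */
        *out = b;
        return 1;
    }
    if (b < 0xC0) {             /* 0x80..0xBF in lead position: shortcut */
        uint32_t cp = shortcut[b - 0x80];
        if (cp == 0)
            return 0;           /* unassigned slot */
        *out = cp;
        return 1;
    }

    /* Ordinary multi-byte UTF8 lead byte. */
    size_t len;
    uint32_t cp;
    if (b < 0xE0)      { len = 2; cp = b & 0x1F; }
    else if (b < 0xF0) { len = 3; cp = b & 0x0F; }
    else if (b < 0xF8) { len = 4; cp = b & 0x07; }
    else
        return 0;               /* 0xF8..0xFF not handled in this sketch */

    if (n < len)
        return 0;               /* truncated sequence */

    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;           /* not a continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }
    *out = cp;
    return len;
}
```

With this table, the single byte 81 and the three-byte UTF8 sequence
E2 8D B4 both decode to U+2374, so a file can mix the two forms freely.
The compression program would simply do the reverse lookup.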
I really am attempting to consider compatibility issues and
make the least-constraining choices to achieve the goal
(here: devising a single UTF8-compatible encoding for all
input to the interpreter, in one fell swoop enabling script files,
pasting Unicode into the repl, and 1-byte-per-char counting
of common extended symbols for golfing metrics).
I am not making a new VB.