As of 2007-10-20, the APEX APL Compiler was made available under the
GNU General Public License.
You may download it from www.snakeisland.com, as a tar file.
The compiler runs under Linux Dyalog APL (www.dyalog.com), but the
distribution includes all functions in Weigang Interchange Format
(aka APLASCII) as well, so porting to a different OS or APL
interpreter should be relatively easy. Markus Triska suggested
changing those functions to Unicode; this seems eminently sensible
to me.
The compiler generates SAC (www.sac-home.org) code, which must then be
compiled before use.
As of this writing, things are fairly rocky from a stability
standpoint, and performance is all over the map, so I do not
recommend that you attempt to use APEX-generated code to run your
pacemaker. The compiler was built for researching optimizations and
has been very much a back-burner/crockpot item for many moons; it is
definitely not product quality.
In the coming weeks, I'll be dealing with performance issues in the
back end (SAC). Mostly, I wanted to get the compiler out the door, so
don't expect me to listen very closely to complaints. If you don't
like it, you get double your money back...
Bob
> changing those functions to Unicode; this seems eminently sensible
To everyone interested in converting files containing Jim Weigang's
ASCII transliterations to Unicode APL characters, check out UnicAPL:
http://stud4.tuwien.ac.at/~e0225855/unicapl/unicapl.html
With that apl.el loaded in Emacs, open any file containing such ASCII
transliterations and do M-x apl-ascii-to-unicode. For example,
s{<-}{times}/{rho},a is converted to: s←×/⍴,a
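The same conversion can be sketched outside Emacs. The Python below is only an illustration of the idea, not how UnicAPL does it: the table holds just the three transliterations from the example above, and the names `TRANSLIT` and `ascii_to_unicode` are made up for this sketch. Tokens that are not in the table are left untouched.

```python
import re

# Only the three transliterations used in the example above; a real
# table would be much larger. The table itself is an assumption made
# for this sketch.
TRANSLIT = {
    "<-": "\u2190",     # ← left arrow (assignment)
    "times": "\u00d7",  # × multiplication sign
    "rho": "\u2374",    # ⍴ APL rho
}

def ascii_to_unicode(text):
    """Replace {name} tokens with their Unicode APL characters."""
    def repl(match):
        # Leave tokens we do not know exactly as they were.
        return TRANSLIT.get(match.group(1), match.group(0))
    return re.sub(r"\{([^{}]+)\}", repl, text)

print(ascii_to_unicode("s{<-}{times}/{rho},a"))
```

Because unknown tokens pass through unchanged, a brace pair that is really APL code rather than a transliteration, such as {a+w}, is not altered.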
To use Emacs's built-in newsreader/mailer "Gnus" to send UTF-8 encoded
Unicode APL programs to this group, add a line like:
(setq gnus-select-method '(nntp "your.newsserver.com"))
to your .emacs, start Gnus with M-x gnus, do M-x gnus-fetch-group,
enter comp.lang.apl, and try it out!
If any APL vendor needs specialised conversion functions to or from
Unicode, I will add them (there's already one for A+); just send me
the mapping. Also, please tell me about transliterations you miss.
> If any APL vendor needs specialised conversion functions to or from
> Unicode, I will add them (there's already one for A+); just send me
> the mapping. Also, please tell me about transliterations you miss.
For the record, Dyalog version 11.0 does contain Unicode support in
the form of a class which was written for the SALT (Simple APL Library
Toolkit), but not widely publicised. If you have SALT enabled (see the
document on SALT in the "manuals" folder for more details), you can
write unicode data as follows:
[]SE.UnicodeFile.Write (filename) (data) [format]
... where [format] defaults to UTF-8 but can be set to UCS-2 or any
other common Unicode representation.
If not using SALT, copy/paste the text in the file classes\dyalog
\UnicodeFile.dyalog into a class first.
In the version 11.1 (about to enter Beta Test), Unicode support is
built-in to the language:
'UTF-8' []UCS '{a+w}' // {alpha plus omega}
123 226 141 186 43 226 141 181 125
('UTF-8' []UCS '{a+w}') []NAPPEND tn 80 // Write to native file
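For readers without a Unicode-capable interpreter at hand, the byte values can be cross-checked in any language with UTF-8 support. A minimal Python sketch, assuming the string is the dfn with alpha (U+237A) and omega (U+2375):

```python
# Encode a short APL dfn as UTF-8 and show the byte values in decimal.
dfn = "{\u237a+\u2375}"          # {⍺+⍵}: alpha is U+237A, omega is U+2375
utf8_bytes = list(dfn.encode("utf-8"))
print(utf8_bytes)

# Decoding the bytes gives the original text back.
assert bytes(utf8_bytes).decode("utf-8") == dfn
```

Alpha comes out as 226 141 186 and omega as 226 141 181; the braces and the plus sign are plain ASCII and stay one byte each.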
Morten
Richard
> If you see []av in Dyalog and APLX as ordinal positions 0 ... 255,
> what are the unicode, say UTF-8, values for each position?
As an aside: The normal notation for Unicode code points (in the BMP)
is U+HHHH, where HHHH is the hexadecimal number of the code point. For
example, APL's lamp symbol (up shoe jot, with fonts: ⍝) is U+235D.
It makes sense to refer to the hex values of actual code points, as
they can be readily looked up in other tables like the widely
distributed unidata.txt file, and they are typically entered easily in
editors (Emacs: M-x ucs-insert RET HHHH RET). For vendor-supplied
conversion tables, I suggest referring to Unicode code points in this
or a similar notation instead of particular encodings (like UTF-8).
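As a small illustration of the notation, the following Python sketch prints the U+HHHH label and the official character name for the lamp symbol. The helper name `code_point_label` is made up here; `unicodedata` is part of the Python standard library.

```python
import unicodedata

def code_point_label(ch):
    """Return the U+HHHH label for a single character."""
    return f"U+{ord(ch):04X}"

lamp = "\u235d"  # ⍝
print(code_point_label(lamp), unicodedata.name(lamp))
```

Going the other way, `chr(int("235D", 16))` turns the hex digits back into the character.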
Unicode assigns a numeric code point to each character the world has ever
known (this is a work in progress); code points run from U+0000 to U+10FFFF,
so each one fits comfortably in a 32-bit value
UTF = Unicode Transformation Format
UTF-8 is one way of mapping these code points onto an 8-bit data
stream -- this makes it handy for data transmission, but characters are
always identified by their Unicode code points, not by their UTF-8 bytes
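To make the distinction concrete, here is a short Python sketch that takes one character, the lamp symbol U+235D, and prints its bytes under three different transformation formats. The choice of character and of the big-endian variants is just for illustration.

```python
lamp = "\u235d"  # ⍝, code point U+235D

# Same code point, three different byte-level representations.
for encoding in ("utf-8", "utf-16-be", "utf-32-be"):
    data = lamp.encode(encoding)
    hex_bytes = " ".join(f"{b:02x}" for b in data)
    print(f"{encoding:10} {hex_bytes}")
```

The code point is the same in all three cases; only the byte stream differs (three bytes in UTF-8, two in UTF-16, four in UTF-32).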
there comes a point where the concept of []av is entirely superfluous,
but we're not there yet
HTH and all the best . . . /phil
> there comes a point where the concept of []av is entirely superfluous,
> but we're not there yet
Dyalog version 11.1 still has []AV, but the mapping between []AV and
Unicode is soft, controlled by a translate table called []AVU (AV to
Unicode). This is because users of earlier versions have different
mappings and fonts defined - depending on which country they are in.
This means that there is no single interpretation of []AV. We need
this mapping for some years to make it easier to migrate to Unicode.
Once all data is Unicode, conversion to UTF-8 is algorithmic, UTF-8 is
simply an encoding of Unicode "code point" numbers.
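To show what "algorithmic" means here, the following Python sketch encodes a single code point by hand, following the standard UTF-8 bit layout. It is not taken from any APL interpreter; the function name `utf8_encode` is made up for this example.

```python
def utf8_encode(code_point):
    """Encode one Unicode code point as a list of UTF-8 byte values."""
    if code_point < 0 or code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a Unicode scalar value")
    if code_point < 0x80:                      # 1 byte: 0xxxxxxx
        return [code_point]
    if code_point < 0x800:                     # 2 bytes: 110xxxxx 10xxxxxx
        return [0xC0 | (code_point >> 6),
                0x80 | (code_point & 0x3F)]
    if code_point < 0x10000:                   # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return [0xE0 | (code_point >> 12),
                0x80 | ((code_point >> 6) & 0x3F),
                0x80 | (code_point & 0x3F)]
    return [0xF0 | (code_point >> 18),         # 4 bytes: 11110xxx + 3 continuation bytes
            0x80 | ((code_point >> 12) & 0x3F),
            0x80 | ((code_point >> 6) & 0x3F),
            0x80 | (code_point & 0x3F)]

print(utf8_encode(0x2375))  # omega
```

No lookup table is involved: the byte count depends only on the size of the code point, and the bits are copied into fixed positions.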
Morten
so you're saying we're not there yet, right?
> Once all data is Unicode, conversion to UTF-8 is algorithmic, UTF-8 is
> simply an encoding of Unicode "code point" numbers.
given that the term "encoding" is quite frequently misunderstood or
misused, might it have been better to call UTF-8 a "transformation"?
and all the best to you, Morten -- you're doing a good job
Found the utf codes for APLX 4.0 beta in the help files. Done the
mapping for APL+Win. How do I get hold of beta (?) 11.1?
> How do I get hold of beta (?) 11.1?
You write to v111...@dyalog.com and explain what you are able to
commit to in terms of helping us test it.
Morten
Dear APL2, Dyalog and APLX vendors, _please_ do this. Thanks.
/Sasha.
For the record... In a standard Dyalog 11.1, (16 16 {reshape} []AVU)
returns the following result, which is the correct transliteration to
use for earlier versions of Dyalog, using the font "Dyalog ALT" (in
which underscores are replaced by various European letters):
0 8 10 13 32 12 6 7 27 9 9014 619 37 39 9082 9077
95 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
112 113 114 115 116 117 118 119 120 121 122 1 2 175 46 9068
48 49 50 51 52 53 54 55 56 57 3 164 165 36 163 162
8710 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89 90 4 5 253 183 127
9049 193 194 195 199 200 202 203 204 205 206 207 208 210 211 212
213 217 218 219 221 254 227 236 240 242 245 123 8364 125 8867 9015
168 192 196 197 198 9064 201 209 214 216 220 223 224 225 226 228
229 230 231 232 233 234 235 237 238 239 241 91 47 9023 92 9024
60 8804 61 8805 62 8800 8744 8743 45 43 247 215 63 8714 9076 126
8593 8595 9075 9675 42 8968 8970 8711 8728 40 8834 8835 8745 8746 8869 8868
124 59 44 9073 9074 9042 9035 9033 9021 8854 9055 9017 33 9045 9038 9067
9066 8801 8802 243 244 246 248 34 35 30 38 8217 9496 9488 9484 9492
9532 9472 9500 9508 9524 9516 9474 64 249 250 251 94 252 8216 8739 182
58 9079 191 161 8900 8592 8594 9053 41 93 31 160 167 9109 9054 9059
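A table like this is applied by simple indexing: each old []AV position selects one Unicode code point. The Python sketch below copies only the first 32 entries of the table above (row-major order) and assumes 0-origin positions; the names `AVU_PREFIX` and `av_to_text` are made up for the illustration.

```python
# First 32 entries of the 16x16 table above, in row-major order.
# A full table would list all 256 entries.
AVU_PREFIX = [
    0, 8, 10, 13, 32, 12, 6, 7, 27, 9, 9014, 619, 37, 39, 9082, 9077,
    95, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111,
]

def av_to_text(indices, table=AVU_PREFIX):
    """Map 0-origin []AV positions to a Unicode string via the table."""
    return "".join(chr(table[i]) for i in indices)

# Positions 24 21 28 28 31 select the code points 104 101 108 108 111.
print(av_to_text([24, 21, 28, 28, 31]))
```

Once the text is a Unicode string, writing it out as UTF-8 needs no further table.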
0 167 8364 186 170 168 8222 9223 9224 9225 9226 8250 9228 9229 339 181
229 230 236 144 249 242 208 190 8224 8225 8230 27 129 164 8220 8221
32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63
64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79
80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111
112 113 114 115 116 117 118 119 120 121 122 123 166 125 126 127
199 252 233 226 228 224 172 231 234 235 232 239 238 8212 196 732
201 8216 215 244 246 338 251 141 381 214 220 162 163 165 174 254
225 237 243 250 241 209 169 8482 191 222 245 248 253 161 171 187
338 338 338 124 124 124 124 43 43 124 124 43 43 43 43 43
192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207
45 209 210 211 212 213 214 43 216 217 218 219 220 221 124 255
184 223 188 240 227 8249 402 8218 178 180 352 353 8217 179 185 157
173 143 8240 710 8226 8211 247 34 176 177 376 189 382 175 124 160
The one for Visual APL is different ....
The one for APLX is in their help file.
APL2 has had this facility for years. It's called QuadUCS.
I believe Dyalog has added QuadUCS recently too.
David Liebtag