On 10 June 2021 at 01:05:08, Tristano Ajmone (taj...@gmail.com) wrote:
Ciao @Thomas,
> Hi Everyone!
> I'm happy to announce that the Alan Continuous Integration pipeline has been rebuilt and now continuously delivers new builds of the command line SDKs for Linux and Windows, as well as the Windows installers for WinArun and the Alan SDK. As usual, they can be downloaded from https://www.alanif.se/download-alan-v3/development-snapshots but also directly from the CI at https://ci.alanif.se.
I apologize for the feedback delay, but the weekend didn't roll out as planned, and I didn't manage to check the new SDK until tonight.
No need to apologize, we all do this in our spare time, in the time we have.
So, first things first, I can confirm that the Alpha SDK for Windows seems to work fine (i.e. the previous DLL related errors are now gone).
I didn't yet try to install WinARun, for the reasons explained below...
> The latest alpha/snapshot contains substantial improvements for those of you that work with non-ASCII languages as Alan now supports UTF-8 which is the predominant encoding on most systems of today. The advantage of this is that you (normally) don't have to do anything special to get Alan to correctly interpret your ñ, ä, ß and other characters. This has previously required special setup of editors and consoles to get to work right.
Now I'm trying to work out how to set up a dev branch for the StdLib where I can test switching the whole project to UTF-8 sources.
My main concern right now is how I'm going to handle this in my editor, Sublime Text, for which I've created the Sublime Alan
package. I'm trying to figure out how to tweak the package so it will default to UTF-8 encoded Alan sources (and transcripts,
solutions) but be able to fall back on ISO encoding if old project files are opened.
Since ST in most cases won't be able to determine if an English adventure is encoded in UTF-8 or ISO, due to the text containing
only chars from the ASCII range, all I can safely rely on is the presence of a BOM in the UTF-8 source.
I remember discussing the BOM on some repository Issue or Discussion, but can't remember where it was and how it ultimately
turned out. Is a UTF-8 BOM in ALAN sources now allowed, mandatory or not allowed?
At the time of that discussion it seemed to me that you knew a lot more than me on the subject. Now it sounds like you have forgotten some of that knowledge as you transferred it to me ;-)
AFAIK, there is no difference between an ISO-encoded file and a UTF-8-encoded file if they only contain ASCII-range characters. So in that respect a file could be either and it does not matter. From the Alan compiler's perspective it will not matter either. So you actually don't have to know if an ASCII-only file is UTF-8 or ISO, they are the same.
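
The point about ASCII-only files can be checked with a small Python sketch. It assumes that "ISO" means ISO-8859-1 (Latin-1), which this thread does not state explicitly, and the sample strings are made up for illustration.

```python
# Assumption: "ISO" means ISO-8859-1 (Latin-1); the thread does not name the variant.

def same_bytes_in_both(text: str) -> bool:
    """Return True if text encodes to identical bytes in ISO-8859-1 and UTF-8."""
    return text.encode("iso-8859-1") == text.encode("utf-8")

# ASCII-only text: the two encodings produce the same bytes.
print(same_bytes_in_both("The door is locked."))

# Text with an accented letter: 'è' is one byte in Latin-1 but two bytes in UTF-8.
print(same_bytes_in_both("La porta è chiusa."))
```

So an editor (or the compiler) only has to care about the encoding once a file contains at least one non-ASCII character.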
What I decided from that discussion in #12 was that the Alan compiler can be instructed to assume UTF-8, but even if not, a file that has a BOM will be converted (as described in A.3. Encodings and character sets in the alpha manual).
If adding a BOM to UTF8 source is permitted, my best option is to use that to direct Sublime Alan on switching encoding, which
would mean that I could still leave ISO as the default encoding, and let ST auto-switch to UTF8 when a BOM is found.
In case of sources/transcripts/solutions which are the same in both encodings, end users will have to manually switch to UTF8
if they want to start working in that encoding — and if the BOM is supported, switch to UTF8 with BOM.
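
The "ISO by default, UTF-8 when a BOM is found" rule described above could be sketched roughly as follows. This is plain Python for illustration only, not the Sublime Text API and not what the Sublime Alan package actually does; the function name `pick_encoding` and the ISO-8859-1 default are assumptions.

```python
import codecs

def pick_encoding(raw: bytes, default: str = "iso-8859-1") -> str:
    """Choose a decoder for a file's raw bytes.

    A UTF-8 BOM selects UTF-8; anything else keeps the configured default.
    (Hypothetical helper; the default encoding name is an assumption.)
    """
    if raw.startswith(codecs.BOM_UTF8):
        # Python's "utf-8-sig" codec decodes UTF-8 and drops the leading BOM.
        return "utf-8-sig"
    return default

with_bom = codecs.BOM_UTF8 + "señor".encode("utf-8")
legacy = "señor".encode("iso-8859-1")

print(with_bom.decode(pick_encoding(with_bom)))
print(legacy.decode(pick_encoding(legacy)))
```

Both calls print the same text, even though the two byte sequences differ.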
Users in ASCII-only environments don't actually have to do anything at all ever, I think. All files will be identical if they do not contain non-ASCII characters. And even after Alan has gone all-UTF, the files will still work and be the same.
The only snag might be if a particular environment/editor will open a file and assume some random (non-UTF-8) encoding if the BOM is missing, and then only if the user will then add non-ASCII characters to that file. But that is a rather unlikely scenario, I think.
(Another note to self: add check for BOM on "solution files" when they are opened, I don't know if opening a text file with a BOM will botch the command reading, but at least we could do an automatic encoding switch for that file as for source files.)
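
To illustrate why a BOM in a solution file could matter, here is a small Python sketch (not the interpreter's actual code; the commands are made up). If the BOM is not stripped, it ends up as an invisible character at the start of the first command.

```python
import codecs

# A made-up solution file: one player command per line, saved as UTF-8 with BOM.
raw = codecs.BOM_UTF8 + "north\ntake lamp\n".encode("utf-8")

def first_command(data: bytes, encoding: str) -> str:
    """Decode the file contents and return the first command line."""
    return data.decode(encoding).splitlines()[0]

# Plain UTF-8 decoding keeps the BOM as U+FEFF in front of the first command.
print(repr(first_command(raw, "utf-8")))      # '\ufeffnorth'

# Python's "utf-8-sig" codec strips the BOM, so the command comes out clean.
print(repr(first_command(raw, "utf-8-sig")))  # 'north'
```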
How is the AlanIDE going to approach this new feature?
AlanIDE has not been updated to address this, or even at all for a long time (so that is another project to take on...). I'm not even sure how the underlying Eclipse functions and editors handle UTF-8. But again the safe route (for a user) would be to use UTF-8 with BOM. I need to investigate this and also think about what needs to be done in terms of user experience here.
Right now I'm unsure about my next steps, because on the one hand I want to be able to work with both old projects in ISO and
newer ones in UTF8, without going bonkers. On the other hand, I would prefer not to update the Sublime Alan package to
adopt UTF-8 as the default Alan encoding until the next Beta is out — i.e. I would implement these changes in a local
dev branch of the package, and not push them on the Sublime Alan repository; but I would also like to plan how to roll
out this feature.
What's your advice in this respect?
I also think that restricting the impact of this to after beta8 is important. I'm not exactly sure what the Sublime package does with the encoding. Would it make sense to let it
I believe that right now attempting to switch the whole StdLib repo to UTF-8, in a dedicated branch, is an essential
step for testing the new feature, and for understanding how editor plugins for Alan should approach the upcoming
switch to UTF8 encoding as the default, in Beta9.
A bulk conversion of the StdLib would be a valuable (repeatable) experiment, but I'm not sure there is value in keeping that as a separate branch. *I* think that gradual conversion would be a better route for the StdLib. But that then has to wait for beta8, so that's the drawback.
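
A repeatable bulk conversion could look roughly like the Python sketch below. The assumptions are mine, not from the StdLib repository: sources are currently ISO-8859-1, the files to convert match `*.alan` and `*.i`, and converted files get a UTF-8 BOM. The BOM is what makes a second run skip already converted files.

```python
import codecs
from pathlib import Path

def convert_to_utf8(path: Path, source_encoding: str = "iso-8859-1") -> bool:
    """Rewrite one file as UTF-8 with BOM. Return True if the file was changed."""
    raw = path.read_bytes()
    if raw.startswith(codecs.BOM_UTF8):
        return False  # already marked as UTF-8, so a rerun leaves it alone
    text = raw.decode(source_encoding)
    # Working on bytes keeps the original line endings untouched.
    path.write_bytes(codecs.BOM_UTF8 + text.encode("utf-8"))
    return True

def convert_tree(root: Path, patterns=("*.alan", "*.i")) -> list:
    """Convert every matching file below root; return the paths that changed."""
    converted = []
    for pattern in patterns:  # hypothetical file patterns
        for path in sorted(root.rglob(pattern)):
            if convert_to_utf8(path):
                converted.append(path)
    return converted
```

Calling `convert_tree(Path("."))` from the repository root would convert the whole tree in place, so it is best tried on a throwaway branch or copy first.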
Another hindrance that has delayed my experimentation with StdLib UTF8 is that I'll now need to add to the repository
a script that should determine which SDK version to use based on the branch, and invoke it with the required options
to enable UTF8 (until it becomes the default). Since the master branch will keep using an older version of the SDK
(i.e. the latest Beta that was out when the StdLib was last released), having a toolchain that is branch-aware now
becomes mandatory, and relying on the ALAN binaries on the Sys Path might no longer be a viable solution in this
transition period from one encoding to the other.
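
A branch-aware wrapper of the kind described above might be sketched like this in Python. Everything specific here is hypothetical: the branch names, the `sdk/beta` and `sdk/alpha` folders, and the compiler binary being called `alan`. The switch spelling `-encoding UTF-8` is copied from the announcement quoted in this thread, and passing it as two separate arguments is an assumption.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping: branch name -> (SDK folder, extra compiler options).
TOOLCHAINS = {
    "master": (Path("sdk/beta"), []),
    "utf8": (Path("sdk/alpha"), ["-encoding", "UTF-8"]),
}

def current_branch() -> str:
    """Ask git for the name of the checked-out branch."""
    result = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()

def compile_command(branch: str, source: str) -> list:
    """Build the compiler command line for a branch (unknown branches use master's SDK)."""
    sdk_dir, options = TOOLCHAINS.get(branch, TOOLCHAINS["master"])
    return [str(sdk_dir / "alan"), *options, source]

def compile_source(source: str) -> None:
    """Compile one source file with the SDK that matches the current branch."""
    subprocess.run(compile_command(current_branch(), source), check=True)
```

Keeping the mapping in one table means the UTF-8 option can simply be dropped from it once UTF-8 becomes the compiler default.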
Again, I *personally* would think this is too much effort for a temporary situation. I would, as outlined above, experiment with converting one or a couple of files...
WAIT, isn't the StdLib in English!?!?!? Why is this a problem then? See the discussion about ASCII-only environments above.
This should only be a problem for the Italian library. Or is that what you are referring to?
If so ... then after converting a few files and running all tests, take a step back, report any errors to me ;-) and we can discuss how to solve them, then repeat until all non-ASCII files are in UTF-8.
Still a bit confused about how the configuration/function of the Sublime package fits into this. Maybe a user scenario would help me....
/Thomas
Thanks
Tristano
> You can read more about the function in the alpha-level documentation Compiler Switches ('-encoding UTF-8') and Interpreter Switches ('-u').
> For the upcoming beta8 you will explicitly have to tell the Alan compiler that the source files are in UTF-8, but for beta9 this will be the default. If your text file happens to be encoded with UTF-8 with a "BOM" (a special indicator in the file) Alan will already automatically respect that. (Some environments add this "marker", some don't).
> Also the interpreter accepts a UTF-8 option which of course controls command line input and output for the command line interpreters. But it also is useful for the GLK-based interpreters (WinArun, Gargoyle, ...) as logs and canned command input will have to be read with the correct encoding.
> So after beta9 (the beta release after the upcoming one), Alan authoring and running will be more natural for those of us who work in non-English languages.
> /Thomas
Tristano Ajmone (Italy)