Very simply: by (hopefully) not breaking any of the locale stuff in Cygnal,
relative to upstream Cygwin.

The run-time support libraries from Cygwin provide the locale support,
as specified by POSIX. If you patch some things in Cygwin to make it
more native-Windows-like here and there, and don't break the locale
stuff, then the locale stuff continues to work.

> Anyway, I have done some tests and it looks really good. Great work so far!

Great almost-no-work, really; some of what you're testing might not
even be different between Cygwin and Cygnal.

> We should test your version thoroughly and if no one finds an issue we may
> use your version as best GAWK Windows port.

If you want to validate a Cygnal-based awk for yourself, the focus
should probably be on the things that behave differently from that
same awk executable running on Cygwin.

Mainly that would be in the area of path handling and also running
external processes.

Windows paths should work, and also the concept of a current working
directory per drive letter. If you're currently in C:\Users
but the current directory on your D: drive is D:\whatever, and you
pass a path like D:foo.txt to the Cygnal gawk, it should open
D:\whatever\foo.txt, and not D:\foo.txt.

Commands run with system("..."), the pipe syntax and whatnot
should be using the CMD.EXE command interpreter under
Cygnal. Under Cygwin, they look for a /bin/sh shell.

Cygnal isn't likely to break anything internal to Gawk.

Because Gawk doesn't use stdio streams, it doesn't benefit from
Cygnal making text-mode streams default to Windows (CR-LF) line
endings. This is the downside. When you do printf("foo\n") in gawk,
it puts out a Unix newline.

One useful feature in Gawk is that the RT variable is set to the piece
of text which matched the RS record separator. So with that, if you
have a record separator regex that matches either LF or CR-LF, RT
holds the actual separator text which occurred. If you
explicitly use RT, you can write code that preserves the line
termination style.