Hi,
here are the sources from Tcl:
int
Tcl_UniCharIsSpace(
    int ch)     /* Unicode character to test. */
{
    /*
     * If the character is within the first 127 characters, just use the
     * standard C function, otherwise consult the Unicode table.
     */
    if (((Tcl_UniChar) ch) < ((Tcl_UniChar) 0x80)) {
        return isspace(UCHAR(ch)); /* INTL: ISO space */
    } else {
        return ((SPACE_BITS >> GetCategory(ch)) & 1);
    }
}

int
Tcl_UniCharIsPrint(
    int ch)     /* Unicode character to test. */
{
    return (((GRAPH_BITS|SPACE_BITS) >> GetCategory(ch)) & 1);
}
As you can see, for '\n' the function Tcl_UniCharIsSpace asks the native C library, using its character classes, so you are right!
But the function Tcl_UniCharIsPrint doesn't use the native C library for 1-byte code points (those below 0x80)!
If spaces are part of the printable characters, then IMHO the function Tcl_UniCharIsPrint should behave like Tcl_UniCharIsSpace when detecting spaces:
int
Tcl_UniCharIsPrint(
    int ch)     /* Unicode character to test. */
{
    if (Tcl_UniCharIsSpace(ch)) {
        return 1;
    }
    return ((GRAPH_BITS >> GetCategory(ch)) & 1);
}
Wouldn't this be better? Or am I missing something Unicode-specific?
Is any Tcl core team member around?
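To make the difference concrete, here is a small standalone C sketch of the two variants. It is not Tcl code: the Sketch* functions, the SK_* categories and the tiny category lookup are hypothetical stand-ins I made up for illustration. The only real assumption about Unicode is that '\n' is a control character (general category Cc), not a separator, so a table lookup alone does not count it as a space.

```c
#include <assert.h>
#include <ctype.h>

/* Hypothetical, heavily reduced stand-in for the Unicode category table. */
enum SketchCategory {
    SK_CONTROL,         /* Cc, e.g. '\n' and '\t' */
    SK_SPACE_SEPARATOR, /* Zs, e.g. ' ' and U+00A0 */
    SK_LETTER,          /* stands in for all graphic categories */
    SK_OTHER
};

#define SK_SPACE_BITS (1 << SK_SPACE_SEPARATOR)
#define SK_GRAPH_BITS (1 << SK_LETTER)

/* Toy category lookup covering only the characters used in this sketch. */
static int SketchGetCategory(int ch)
{
    if (ch < 0x20 || ch == 0x7F) return SK_CONTROL;
    if (ch == ' ' || ch == 0xA0) return SK_SPACE_SEPARATOR;
    if ((ch >= 'A' && ch <= 'Z') || (ch >= 'a' && ch <= 'z')) return SK_LETTER;
    return SK_OTHER;
}

/* Same shape as Tcl_UniCharIsSpace: C library below 0x80, table otherwise.
 * The result is normalized to 0 or 1 here to keep the checks simple. */
static int SketchIsSpace(int ch)
{
    if (ch >= 0 && ch < 0x80) {
        return isspace((unsigned char) ch) != 0;
    }
    return (SK_SPACE_BITS >> SketchGetCategory(ch)) & 1;
}

/* Same shape as the current Tcl_UniCharIsPrint: table lookup only. */
static int SketchIsPrintCurrent(int ch)
{
    return ((SK_GRAPH_BITS | SK_SPACE_BITS) >> SketchGetCategory(ch)) & 1;
}

/* Same shape as the proposed Tcl_UniCharIsPrint: spaces decided by IsSpace. */
static int SketchIsPrintProposed(int ch)
{
    if (SketchIsSpace(ch)) {
        return 1;
    }
    return (SK_GRAPH_BITS >> SketchGetCategory(ch)) & 1;
}

/* Usage: the two variants agree on ' ' and 'a' but disagree on '\n'. */
void SketchSelfCheck(void)
{
    assert(SketchIsSpace('\n') == 1);
    assert(SketchIsPrintCurrent('\n') == 0);
    assert(SketchIsPrintProposed('\n') == 1);
    assert(SketchIsPrintCurrent(' ') == SketchIsPrintProposed(' '));
    assert(SketchIsPrintCurrent('a') == SketchIsPrintProposed('a'));
}
```

In this sketch '\n' counts as a space but not as printable under the current logic, while the proposed variant reports it as both. Whether the real Tcl tables behave the same way is exactly what I would like to have confirmed.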
Best regards,
Martin