Weird LCMonetary initialization bug?

Hans-Martin Mosner

Nov 12, 2025, 10:04:15 AM
to VAST Community Forum
Hello, while analyzing a user report about a number formatting bug I found that the primitive for LCMonetary initialization works differently in the Windows and Linux VMs:
The field monGrouping is initialized with '3 0' in the Windows VM, while the Linux VM initializes it as '3 ', which leads to incorrect formatting of numbers of 1 million and above.
Correct formatting for 1000000 in the German locale would be '1.000.000,00', and under Windows that's what is produced, but VA under Linux shows '1000.000,00'.
(easy way to reproduce: AbtNumberConverter new decimalPlace: 2; objectToPrint: 1000000)
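
A minimal sketch of why a missing repeat marker matters, written in Python rather than Smalltalk and not taken from the VAST sources. It assumes the grouping value is a list of group sizes in the style of C's `struct lconv`, where 0 means "repeat the previous size" and running out of entries means "stop grouping". The function name `group_digits` is made up for illustration.

```python
import locale


def group_digits(digits, grouping, sep):
    """Insert sep into a string of integer digits, working from the right.

    grouping is a list of group sizes: 0 repeats the previous size forever,
    locale.CHAR_MAX (or simply running out of entries) stops grouping.
    """
    parts = []
    rest = digits
    index = 0
    size = None
    repeat = False
    while rest:
        if not repeat:
            if index >= len(grouping) or grouping[index] == locale.CHAR_MAX:
                break  # no more group sizes: leave the remaining digits alone
            if grouping[index] == 0:
                if size is None:
                    break  # nothing to repeat
                repeat = True
            else:
                size = grouping[index]
                index += 1
        if len(rest) <= size:
            break  # the remaining digits form the leftmost group
        parts.insert(0, rest[-size:])
        rest = rest[:-size]
    parts.insert(0, rest)
    return sep.join(parts)


print(group_digits("1000000", [3, 0], "."))  # 1.000.000
print(group_digits("1000000", [3], "."))     # 1000.000
```

Under these assumptions, the list without the trailing 0 reproduces the integer part of the Linux result, while the list with it gives the Windows result.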

The VAST version in use is 14.0.0.0; I have not yet checked whether it might already be fixed in 14.1.0.0.

Cheers,
Hans-Martin

Marcus Wagner

Nov 13, 2025, 10:10:15 AM
Hello, Hans-Martin,

to confirm: I ran 
AbtNumberConverter new decimalPlace: 2; objectToPrint: 1000000
under AlmaLinux 9.6 (Sage Margay), which happened to be at hand,
using VAST 11.0.1 64-bit, and the answer was (same as you reported)
'1 000 000,00'
but inspecting it more closely revealed hidden unprintable garbage in the resulting string: it has length 16, but only 12 characters are visible, as shown below:
[Attached image: inspect.png]
Even worse, inspecting individual character positions (such as at: 2 and at: 3) leads to corrupt inspector windows like this one:
[Attached image: at3.png]
The character values are 226 at: 2 and 128 at: 3. See the Character inspector details in the images above.
So under Linux, inspectors are also broken when character values are >= 128.
That is perhaps related to 7-bit ASCII, as in the original X-Motif of the 1990s.

Hidden underneath, there seems to be an even more severe VM problem under Linux concerning characters.

Fortunately, the broken code did not cause walkbacks or VM traps.
That points to tool-side representation issues, whereas the intertwined characters stem from somewhere else, likely an API to the OS (where the localization information originates).
Unfortunately, this may cause further trouble with the recent UTF enhancements.
There is more to be fixed here than the formatting problem...
Kind regards
M

For comparison, under Windows the inspector shows 12 (correct) characters in the result string:
[Attached image: WINDOWS.png]

Hans-Martin Mosner

Nov 13, 2025, 11:08:47 AM
Thanks for investigating! Your results are actually different from mine, which makes me suspect that the underlying problem might be more complex than I initially thought. In addition, given the huge amount of work that went into Unicode support, there may be differences between 11.0.1 and 14.0.0.

Could you (and perhaps others using Linux as well) run "(LCMonetary for: #('' '')) monGrouping" on your platform, possibly with different locale settings?
I found that with LANG=de_DE.utf-8 I get '3 ', which yields the incorrect formatting, while LANG=en_US.utf-8 gets me '', which results in the formatted result '100000000', i.e. 6 zeroes for the million and 2 for the decimal places, but neither a thousands separator nor a decimal point/comma.
So it's definitely locale-dependent, but wrong in several cases.
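
As a cross-check outside of VAST, the grouping that the C library itself reports can be read with Python's standard `locale` module. This is only a sketch: the helper name `monetary_grouping` is made up, and the locale names in the loop are assumptions about what is installed on the machine. It shows what the OS provides, not what the VM primitive does with it.

```python
import locale


def monetary_grouping(locale_name):
    """Return the mon_grouping list reported for locale_name,
    or None if that locale cannot be selected on this machine."""
    saved = locale.setlocale(locale.LC_ALL)  # query only, changes nothing
    try:
        locale.setlocale(locale.LC_ALL, locale_name)
        return locale.localeconv()["mon_grouping"]
    except locale.Error:
        return None
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # restore the previous setting


if __name__ == "__main__":
    for name in ("de_DE.UTF-8", "en_US.UTF-8"):
        print(name, monetary_grouping(name))
```

If the list printed here ends in 0 while monGrouping in the image does not, the repeat marker is being lost somewhere between the OS and Smalltalk.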

Cheers,
Hans-Martin

Marcus Wagner

Nov 16, 2025, 9:24:07 AM
Hello Hans-Martin,
sorry for the delayed response.
In the meantime I tried to track down one particular idea, OS dependencies; I have not yet followed up on your questions.

In the past, when I was forced to use Linux, the distributions I had to touch were Red Hat, then Scientific Linux and finally CentOS.
And I did not develop code on those target systems; most of the time I only tested cross-packaged images.
Under CentOS 8 (64-bit), the responses to
AbtNumberConverter new decimalPlace: 2; objectToPrint: 1000000
are OK for 64-bit VAST 13 and VAST 12.0.1; see the pictures below.

I was just able to work around the EOL of CentOS 8 (missing updates), but I still cannot run the 32-bit versions of 13 and 12.0.1.
Another matter is the GUI subsystem.
I found that under CentOS 8 the old X-Motif runs well (including all the peculiar things like resize), whereas under newer systems like AlmaLinux and Rocky Linux (Wayland) I found other defects (like black blinking regions occasionally overlapping VAST) using X-Term.
My conclusion for now is that VAST's behaviour depends heavily on the Linux distribution, the bitness, the X-Motif subsystem and its version, and the GUI in which it is embedded.
As far as I have seen, SELinux did not block the 32-bit versions under CentOS 8 (my first suspicion as to why they do not run).
I suspect that the garbage shown in VAST 11.0.1 64-bit comes from faulty OS library primitives.
I have to go back to school, as I have not run the Linux versions frequently and missed several things that happened here in the meantime.
I'll report again when I have made more progress.
Kind regards
M

[Attached images: centos8.png, centos8-2.png]

Richard Sargent

Nov 17, 2025, 12:58:25 PM
Those odd characters make me think that the String has been "polluted" with the UTF-8 representation of some two characters.
Bytes 2 and 3 suggest the three-byte encoding used for the range U+0800 to U+FFFF:

1110wwww 10xxxxyy 10yyzzzz


Where it's coming from is an entirely different question, of course.
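
The reported lengths are consistent with that reading. Here is a small Python sketch that assumes, purely for illustration, that the separator is U+202F (NARROW NO-BREAK SPACE), whose UTF-8 encoding starts with the bytes 226 and 128. The third byte of the actual separator was not reported in this thread, so the real character may be a different one from the same range.

```python
# Assumption: U+202F stands in for the unknown separator character.
text = "1\u202f000\u202f000,00"
raw = text.encode("utf-8")

print(len(text))       # 12 characters
print(len(raw))        # 16 bytes: 10 single-byte characters + 2 * 3 separator bytes
print(raw[1], raw[2])  # 226 128, i.e. Smalltalk's at: 2 and at: 3 (1-based)
```

Twelve visible characters turning into a string of length 16 is what one would expect if each of the two separators arrived as three raw UTF-8 bytes.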


Also, some thoughts about running 32-bit applications under modern Linux versions:
We had to cease any possible support for GBS for VA under Linux because modern versions don't have 32-bit GUI support (any more?).
I think we first noticed this with Ubuntu 2020. Newer versions may have reinstated it. I haven't tried.

Johan Brichau

Nov 17, 2025, 1:15:15 PM
With the locale set to 'xxx.utf8', the contents of `Locale current lcMonetary currencySymbol` are the UTF-8 bytes for the symbol. These bytes are then added to the String instance, and the result is what you reported. The localization functionality requires locale settings configured with the charset ISO-8859-15 instead of utf8. These can be installed, but doing so requires some manual steps that differ depending on the Linux distro.
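
To illustrate the difference between the two codesets, here is a short Python sketch (not VAST code). It uses the euro sign as an example currency symbol, which is an assumption: the thread does not state which symbol was involved.

```python
symbol = "\u20ac"  # EURO SIGN, chosen only as an example

utf8_bytes = symbol.encode("utf-8")
latin9_bytes = symbol.encode("iso8859-15")

print(list(utf8_bytes))    # [226, 130, 172]
print(list(latin9_bytes))  # [164]

# Treating each UTF-8 byte as one character of a single-byte string
# turns the one symbol into three characters.
as_single_byte_chars = utf8_bytes.decode("latin-1")
print(len(as_single_byte_chars))  # 3
```

With an ISO-8859-15 locale the symbol fits into a single byte, so a byte-per-character String can hold it without extra characters appearing.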

Hans-Martin Mosner

Nov 18, 2025, 3:33:19 AM
While I think it's entirely possible that UTF-8 handling in this area might be incomplete, this does not explain my initial observation. Even with the ISO-8859-1 locale, the monGrouping string stays '3 '. It would be good to see what is actually happening in the primitives; that might lead to a solution much faster than tinkering and speculating :-)

Cheers,
Hans-Martin

Johan Brichau

Nov 18, 2025, 1:04:49 PM
Hi Hans-Martin,

Indeed, we are looking at that as well. I failed to mention it in my previous reply but I'll get back to it asap.

Marcus Wagner

Nov 19, 2025, 10:22:11 AM
A short summary of what I found: the basis is the POSIX standard of the Open Group.
Platforms follow it more or less strictly.
Over time it came to cover symbolic writing systems as well, and the representation became based on UTF.
The VM uses the API offered by the respective OS and in turn offers primitives that expose the provided information.
In this way, characters that cannot be represented properly are silently imported into the image. This explains the observed garbage.
The Unix command to check the originating OS setting is
locale -k LC_MONETARY
To give an example, under AlmaLinux (the example above) I get the following, which indicates faults in decoding the information before it is handed over to Smalltalk:
[Attached image: moneyalma.png]
whereas under CentOS 8 I got
[Attached image: moneycentos.png]
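
For comparing machines, the output of that command can be turned into a dictionary. This is a rough Python sketch that assumes the usual one `keyword=value` pair per line, with string values in double quotes. The function names are made up, and the sample text is hypothetical, not the output of either machine shown in the screenshots.

```python
import subprocess


def parse_locale_keywords(text):
    """Parse `locale -k` style output (keyword=value per line) into a dict.

    Surrounding double quotes are stripped; values stay plain strings.
    """
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue  # skip anything that is not keyword=value
        value = value.strip()
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        result[key.strip()] = value
    return result


def monetary_settings():
    """Run `locale -k LC_MONETARY` in the current environment and parse it."""
    completed = subprocess.run(
        ["locale", "-k", "LC_MONETARY"],
        capture_output=True, text=True, check=True,
    )
    return parse_locale_keywords(completed.stdout)


# Hypothetical sample input, NOT taken from the machines in this thread.
sample = 'mon_thousands_sep="."\nmon_grouping=3;3\nmonetary-codeset="UTF-8"\n'
settings = parse_locale_keywords(sample)
print(settings["mon_grouping"])  # 3;3
```

Calling `monetary_settings()` on two systems and comparing the dictionaries makes differences in entries such as int_curr_symbol or mon_grouping easy to spot.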

The result smells like a lot of work, as different OSes provide different representations beneath similar localization configuration options (I consistently used German (Austria)).
And then the VM fetches this and converts it, so that the inconsistencies increase (e.g. EUR vs. $, which may be a result of cached instances in the image).
Look also at int_curr_symbol or mon_grouping, for instance: there are subtle variations already in the OS.
The whole thing, decorated with UTF / non-UTF, ends up in a mess.
At least the configuration provides a monetary-codeset='UTF-8' entry, which helps to avoid even more garbage.
To be honest, this would escalate to the extreme if it had to be extended to cover even symbols such as those of the Near and Far East...
I just found something about ancient Hungarian numbers... not to speak of Chinese and Japanese symbols.
Kind regards
M

Marcus Wagner

Nov 19, 2025, 10:28:55 AM
To be honest, most of us select a language and a keyboard layout, and even that is inconsistent and leads to unexpected results (as I found out)...


Marcus Wagner

Nov 19, 2025, 10:41:06 AM
I already found several deviations between different distros.
I recommend running the command locale -k LC_MONETARY to check at the source.
I assume that the interpreting VM or Smalltalk code in this area has been stable and unchanged for ages... which contributes to the situation.
Kind regards
M

Johan Brichau

Nov 20, 2025, 3:38:49 AM
Hi all,

After analysis, we can confirm that the primitive that reads the grouping information from the locale has a bug on Linux/UNIX. We track this in internal development case 74499 "LC_MONETARY parsing for repeated number group on Unix". This will most probably be fixed in the upcoming VAST 15.

Thank you for reporting and investigating this issue.