Bug: Capture() includes CR for UTF-16LE output


Carlo Hogeveen

Feb 27, 2026, 9:07:55 AM
to sem...@googlegroups.com

Context:
Probably not relevant: Tested in Windows 11 with GUI TSE.
In Windows, the "wmic" command's output uses the character encoding UTF-16LE with a BOM.
I tested that this happens regardless of Windows' code page setting.

The bug: when Capture() "converts" UTF-16LE+BOM output to ASCII, it leaves a carriage return (CR) at the end of each line.
The demo macro below shows this.
The captured buffer can also be checked with the Potpourri menu's ShowCurr extension, which likewise shows the carriage return at the end of each line.
Capture() does not have this bug for ANSI output.
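
For readers without TSE at hand, here is a rough Python illustration of what such a check looks for. It is only a sketch: the sample line is made up, and tail_codes is a hypothetical helper, not part of TSE. A final code of 13 is the stray carriage return.

```python
def tail_codes(line: str, count: int = 26) -> list[int]:
    """Return the character codes of the last `count` characters of a line."""
    return [ord(ch) for ch in line[-count:]]


# Made-up example of a captured line that still ends in a carriage return.
captured_line = "Name   \r"
print(tail_codes(captured_line))  # [78, 97, 109, 101, 32, 32, 32, 13]
```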

Aside:
For UTF-16LE files the above conversion fails for non-ASCII characters.
That is unavoidable when converting Unicode's 159,801 characters to TSE's 256 characters, but in practice it has not been a problem so far, including in this case.
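
To make the loss concrete, here is a small Python sketch of a UTF-16LE+BOM to single-byte conversion. It is not how TSE converts; to_single_byte, the sample text, and the choice of code pages are assumptions for illustration only. Characters with no slot in the target character set become "?".

```python
import codecs


def to_single_byte(raw: bytes, codepage: str = "ascii") -> str:
    """Decode UTF-16 (the BOM selects the byte order), then squeeze the
    text into a single-byte code page, replacing what does not fit."""
    text = raw.decode("utf-16")
    return text.encode(codepage, errors="replace").decode(codepage)


# Made-up sample: a BOM followed by UTF-16LE text with non-ASCII names.
sample = codecs.BOM_UTF16_LE + "Name\r\nZoë.exe\r\n日本.exe\r\n".encode("utf-16-le")

print(repr(to_single_byte(sample)))            # 'Name\r\nZo?.exe\r\n??.exe\r\n'
print(repr(to_single_byte(sample, "cp1252")))  # 'Name\r\nZoë.exe\r\n??.exe\r\n'
```

Plain ASCII loses every non-ASCII character, while a 256-character code page such as cp1252 keeps the few it happens to contain.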

Carlo



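proc Main()
    integer i = 0
    string r [MAXSTRINGLEN] = ''

    // Empty the capture buffer, then capture wmic's output.
    EmptyBuffer(Query(CaptureId))
    Capture('wmic process get name', _STDOUT_|_STDERR_)

    // Collect the character codes of (at most) the last 26 characters
    // of the first line.
    BegFile()
    for i = Max(1, CurrLineLen() - 25) to CurrLineLen()
        r = Format(r; Asc(GetText(i, 1)))
    endfor

    // A trailing 13 in the list is the stray carriage return.
    Warn(r)
    PurgeMacro(CurrMacroFilename())
end Main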


S.E. Mitchell

Feb 28, 2026, 6:23:36 AM
to sem...@googlegroups.com
If I run:
proc Main()
    dos("wmic process get name>wmic.txt", _DONT_PROMPT_|_TEE_OUTPUT_)
    EditFile("wmic.txt")
end Main

I have the same problem. However, if I run (remove the _TEE_OUTPUT_):
proc Main()
    dos("wmic process get name>wmic.txt", _DONT_PROMPT_)
    EditFile("wmic.txt")
end Main

I do not have the problem. So it appears to be a problem with the way
I capture output in both the internal capture() command and the
external tee.exe.
Stay tuned.

S.E. Mitchell

Feb 28, 2026, 7:04:56 AM
to sem...@googlegroups.com
It looks like Windows is doing something. Maybe. Run this at a command line:
wmic process get name>wmic2.txt
tee32 wmic process get name>wmic3.txt
wmic2.txt is in utf-16 format.
But now look at wmic3.txt, try: g32 -b-3 wmic3.txt
That is what the editor gets when we run capture(...).
Ah Ha!
I'm using CreateProcessA() in both the editor's capture() and tee32.exe.
It is translating the utf-16 to ascii.
Ok, so now that I know that, I can investigate ReadFile() leaving the
extra carriage-returns.
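
One way extra carriage returns could survive, sketched in Python. This is an assumption, not something confirmed in this thread: suppose the translated stream ends each line in CR CR LF rather than CR LF. A reader that splits on CR LF then leaves one CR behind on every line. The function names and sample bytes are hypothetical.

```python
def split_on_crlf(data: bytes) -> list[bytes]:
    """Split captured bytes on CR LF only, as a simple line reader might."""
    lines = data.split(b"\r\n")
    if lines and lines[-1] == b"":
        lines.pop()  # drop the empty piece after the final line ending
    return lines


def strip_stray_cr(lines: list[bytes]) -> list[bytes]:
    """Remove any carriage returns left at the end of each line."""
    return [line.rstrip(b"\r") for line in lines]


# Hypothetical captured bytes: each line ends in CR CR LF instead of CR LF.
captured = b"Name\r\r\nSystem\r\r\n"
raw_lines = split_on_crlf(captured)
print(raw_lines)                  # [b'Name\r', b'System\r']
print(strip_stray_cr(raw_lines))  # [b'Name', b'System']
```

If the stream really looks like this, stripping trailing CRs after splitting would serve as a stopgap until the capture code itself is fixed.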