Re: [Midnight Commander] #3972: lib/vfs testsuite failures on illumos

1 view
Skip to first unread message

Ticket System

unread,
Sep 3, 2024, 4:41:42 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------
Changes (by zaytsev):

* owner: => zaytsev
* status: new => accepted


Comment:

Mentioned Alpine problems are due to musl iconv: #4495 .

Unfortunately, Cfarm doesn't have an Illumos / !OpenIndiana host. I see
that mc 4.8.31 is packaged. What is the current situation there, do the
tests pass? Is there a development shell host one could use? Any
build/test logs? What is the libc like, who provides iconv? Not sure if
these failures are due to (to be fixed) mocking things, or iconv.

Maybe a good approximation is to check what is the situation on Solaris.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:3>
Midnight Commander <https://midnight-commander.org>
Midnight Development Center

Ticket System

unread,
Sep 3, 2024, 4:49:59 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by mnowak):

I am not involved with the project anymore, but you can get a Vagrant Box
from https://app.vagrantup.com/openindiana/boxes/hipster.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:4>

Ticket System

unread,
Sep 3, 2024, 5:05:16 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

Thanks for the tip, do you happen to know if it's possible to download
this somehow and import into UTM instead of setting up Vagrant? UTM is an
interface for QEMU, so I'd need a qcow2 image and information about the
machine configuration. Apparently recently a plugin for Vagrant was
developed to use UTM behind the scenes, but if I don't have to install
lots of stuff, that would be great...

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:5>

Ticket System

unread,
Sep 3, 2024, 5:21:28 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by mnowak):

The libvirt provider of the Box might be a QCOW2 underneath, but likely
won't boot as such if extracted and imported.

There are also installation images https://openindiana.org/downloads/;
even the minimal image has a small runtime environment with shell, but if
you need to install a package to debug things, you'd rather install the
whole OS and go from there.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:6>

Ticket System

unread,
Sep 3, 2024, 5:29:37 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

Thanks again! That looks good so far. Is there any way to know how the
machine is configured? Vargantfile has only the following inside:

{{{
{"architecture":"amd64","disks":[{"format":"qcow2","path":"box_0.img"}],"provider":"libvirt"}
}}}

Like memory, CPU, devices...

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:7>

Ticket System

unread,
Sep 3, 2024, 5:45:01 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by mnowak):

The configuration is likely sourced from https://github.com/OpenIndiana
/oi-packer.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:8>

Ticket System

unread,
Sep 3, 2024, 6:44:57 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

Perfect, I've got it to work - the configuration is here:

https://github.com/OpenIndiana/oi-packer/blob/master/hipster.pkr.hcl

Default PC, 4G RAM, AMD64.

Sadly, everything is extremely slow on ARM, but apparently OI doesn't
support ARM natively, so emulation is the only option.

But how do I get in on the console? jack/jack & root/openindiana are not
working!

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:9>

Ticket System

unread,
Sep 3, 2024, 6:59:37 AMSep 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by mnowak):

Try, root/vagrant https://github.com/OpenIndiana/oi-
packer/blob/master/hipster.pkr.hcl#L16?

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:10>

Ticket System

unread,
Sep 4, 2024, 2:36:15 PMSep 4
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

Hey, not sure how I have missed it. You were of great help, thank you! I
guess I should add some notes on how to test on !OpenIndiana on our wiki.
I was still able to reproduce test failures. There are two types of
problems:

1. Relates to our vfs scripts and test suite not compatible with `ksh`,
because of the `local` keyword - see attached patch - seems uncomplicated
to fix
2. Everything that has to do with charset conversion fails - with
`--disable-charset` all test pass (surprisingly!) - which seems to be the
same issue as with Alpine / musl.

The charset stuff is annoying, because to my understanding we are actually
not using it directly, but pulling it from glib, so either we do something
wrong, or glib iconv is broken on these platforms.

I guess I need to look into small reproducer. In the worst case we should
include it in autoconf tests and disable charsets if this test fails,
because then mc is not going to work correctly.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:11>

Ticket System

unread,
Sep 4, 2024, 3:04:25 PMSep 4
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by mnowak):

You are welcome. iconv might be different on illumos distributions like
OpenIndiana (source from https://github.com/illumos/illumos-
gate/tree/master/usr/src/cmd/iconv) and might do a few things differently
to more common GNU libinconv.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:12>

Ticket System

unread,
Sep 5, 2024, 2:37:46 AMSep 5
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking:
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

I have tested iconv on the command line and it seems to convert between
UTF-8 and KOI8-R used in the tests just fine. All aliases also available.
I was even able to read the file in translation mode with mcview coming
from the shipped mc! So at least for that the conversions are working
correctly.

So I started reading `path_len.c` and following the functions with
`HAVE_CHARSET`, and I actually don't quite get how it works. Apparently
the idea was that the elements of the path are converted from terminal
encoding into whatever #enc says they should be encoded in memory, and the
encoding is also set to what enc says:

https://github.com/MidnightCommander/mc/blob/master/lib/vfs/vfs.c#L170

But then apparently it was supposed to stay UTF-8 (or rather terminal
encoding) in `vpath->str` when all elements are joined again?

For me on macOS this conversion doesn't do anything though (WHAT?!), it
seems that the element stays encoded as UTF-8. But apparently as there is
a bug in the conversion back, which also doesn't happen, so the end result
is correct and the test pass.

It seems that on !OpenIndiana/Illumos and Alpine/musl it actually DOES the
conversion to KOI8-R as intended, but due to missing conversion back the
test fails.

Andrew, can you help, at least tell me how it was supposed to work before
I debug further? This is very confusing.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:13>

Ticket System

unread,
Sep 5, 2024, 2:39:26 AMSep 5
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495

Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------
Changes (by zaytsev):

* blocking: => 4495


--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:14>

Ticket System

unread,
Sep 5, 2024, 5:46:16 AMSep 5
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

On Illumos: `term_encoding: 646` (!?)

{{{
$ iconv -l
The following are all supported code set names. All combinations
of those names are not necessarily available for the pair of the
fromcode-tocode. Some of those code set names have aliases, which
are case-insensitive and described in parentheses following the
canonical name:

5601,
646 (ASCII, US-ASCII, US_ASCII, USASCII),
}}}

With the following patch:

{{{
diff --git a/tests/lib/vfs/path_len.c b/tests/lib/vfs/path_len.c
index 6bab6f551..24beca8ac 100644
--- a/tests/lib/vfs/path_len.c
+++ b/tests/lib/vfs/path_len.c
@@ -42,7 +42,7 @@
static void
setup (void)
{
- str_init_strings (NULL);
+ str_init_strings ("UTF-8");

vfs_init ();
vfs_init_localfs ();
}}}

The tests work on Illumos. But for any single bit (KOI8-R or ASCII) seems
to break.

On macOS: `term_encoding: US-ASCII`, but also forcing "UTF-8" works.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:15>

Ticket System

unread,
Sep 5, 2024, 10:18:21 AMSep 5
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Replying to [comment:13 zaytsev]:


> Andrew, can you help, at least tell me how it was supposed to work
before I debug further? This is very confusing.

Sorry :-(

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:16>

Ticket System

unread,
Sep 5, 2024, 11:57:28 AMSep 5
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------
Changes (by zaytsev):

* cc: slavazanko (added)


Comment:

I see, so the whole VFS magic including recoding was Slava's work, and you
don't know how this #enc: stuff was actually supposed to work? Slava, are
you alive :) ? Can you say something?

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:17>

Ticket System

unread,
Sep 7, 2024, 3:20:42 AMSep 7
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

So I traced the functions again on macOS, and I was initially wrong about
conversion not happening there which confused me. Everything is working as
expected (?):

* `vpath = vfs_path_from_str_uri_parser (path);`
* Here, deep inside, the conversion from `UTF-8` to `KOI8-R` in-memory
representation is happening in `_vfs_translate_path`
* `vpath->str = vfs_path_to_str_flags (vpath, 0, flags);`
* Here the opposite conversion from `KOI8-R` memory representation to
`UTF-8` is happening using `str_vfs_convert_from`

It seems that there is some problem with `vfs_path_to_str_flags` on
Illumos, which apparently doesn't have to do with `iconv`, but with
detecting terminal encoding and the need to convert back. Or maybe the
conversion fails and this is not handled correctly. In any case, the
problem is that the conversion from `UTF-8` to `KOI8-R` happens correctly
in `vfs_path_from_str_uri_parser`, but the conversion back to `UTF-8`
doesn't happen.

If I force terminal encoding to `UTF-8` with `str_init_strings ("UTF-8")`
as shown above, then both conversions are happening correctly like on
macOS.

I wonder if I can get any further with a debugger...

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:18>

Ticket System

unread,
Sep 7, 2024, 6:43:20 AMSep 7
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

gdbserver is not supported on Solaris, and gdbgui needed fixing:

* https://github.com/miguelgrinberg/Flask-SocketIO/pull/2088
* https://github.com/cs01/gdbgui/pull/491

And best of all, the tests pass under graphical debugger - probably
because it's run inside Python and stuff is set to UTF there :(

I have the feeling that `g_iconv_open (codeset, from_enc)` where
`codeset="646"` and `from_enc="KOI8-R"` in `strutil.c:269` returns
something that doesn't do anything, which is weird, because the conversion
in the forward direction works.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:19>

Ticket System

unread,
Sep 11, 2024, 4:49:12 AMSep 11
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

I think I've figured out what the problem is with Illumos and Alpine. The
issues are not exactly the same, but the root cause is the same.

In the VFS library we have conversions that go in two directions:

1. From terminal encoding to the desired encoding
* Handled by the `vfs_translate_path` function
2. From desired encoding to terminal encoding
* Handled by `str_vfs_convert_from` function

Both functions return `estr_t`, which can be `ESTR_SUCCESS`,
`ESTR_PROBLEM` (some characters could not be translated, so it's not
possible to reverse the operation) and `ESTR_FAILURE` (conversion
impossible due to wrong / unsupported encoding).

The result of the forward conversion `vfs_translate_path` is somehow
checked and returns a NULL pointer on failure, which is supposed to be
checked (hopefully?), but on problems it doesn't do anything and just
continues.

The result of the backward conversion (`str_vfs_convert_from`) is simply
ignored!

=== `path_len` test

This test is started with `str_init_strings (NULL)`, which means `US-
ASCII` on macOS and equivalents elsewhere. The `HAVE_CHARSET` part
contains a raw UTF-8 string that should be recoded as `KOI8-R` for the
file system.

==== Linux and macOS

`iconv` just interprets the `UTF-8` string as bytes of `US-ASCII` and
"converts" them from bytes of `KOI8-R` without changing anything.

The problem happens during the conversion from `KOI8-R` to `US-ASCII` and
is signaled as `ESTR_PROBLEM`, but it's not a hard error, just some
characters can't be translated. And what `iconv` does is just leave the
string as it is.

Since the result of the backward conversion is ignored, everything
matches.

==== Solaris

Here `iconv` actually decides to do a forward conversion because it can
and it's valid, even though it probably shouldn't. So the path element is
now a valid `KOI8-R` string.

The backward conversion from `KOI8-R` to `US-ASCII` fails, but `iconv`
leaves the string as it is and ignores the result, so the `KOI8-R` string
is returned.

The length obviously doesn't match and the test fails.

==== Alpine

Similar to Solaris, but instead of returning the string unchanged, `iconv`
returns an empty string, so path ends up being `/#enc:KOI8-R` and the
assertion fails:

{{{
Assertion 'actual_length == data->expected_length' failed: actual_length
== 12, data->expected_length == 38
}}}

== Conclusion

This particular test is simply broken and only works on Linux and macOS by
(bad) luck. It should be initialized with `str_init_strings ("UTF-8")` and
this would make the test work correctly on all systems.

I haven't analyzed other tests, but probably they can be fixed somehow.

My question is, what do we do about error handling? It doesn't sound like
a good idea to just ignore problems. At the very least, it leads to buggy
tests working by accident without anyone noticing. But there are also
segfaults on Alpine.

I think maybe it was designed that way because the whole recoding story
was for like Linux (KOI8-R) systems accessing Windows FTP (CP1251) with
wrong encoding returned or something, so in the worst case the errors mean
the file list is garbage or something, and there was actually no intention
to propagate errors.

I think we should at least not ignore `ESTR_FAILURE` on reverse conversion
and return a string of ??? of the same length as the input one, so nothing
segfaults.

I'm also not sure about returning a `NULL` pointer on forward conversion.
I see some `NULL` checks, but many seem to be missing. Maybe also return a
string of ??? instead?

We can then test for invalid conversions.

Andrew, what do you think?

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:20>

Ticket System

unread,
Sep 11, 2024, 6:17:02 AMSep 11
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Replying to [comment:20 zaytsev]:


> This particular test is simply broken and only works on Linux and macOS
by (bad) luck. It should be initialized with `str_init_strings ("UTF-8")`

Because test strings are in UTF-8?

> and this would make the test work correctly on all systems.

> My question is, what do we do about error handling? It doesn't sound


like a good idea to just ignore problems.

Definitely, an error handling should be done.

> I think we should at least not ignore `ESTR_FAILURE` on reverse
conversion and return a string of ??? of the same length as the input one,
so nothing segfaults.

> I'm also not sure about returning a `NULL` pointer on forward
conversion. I see some `NULL` checks, but many seem to be missing. Maybe
also return a string of ??? instead?

Probably, in cases of `ESTR_FAILURE` return of an input string is a
workable solution. But if it hides a conversion problem from the caller,
it's better to return `NULL` and a caller should decide to do then.

> We can then test for invalid conversions.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:21>

Ticket System

unread,
Sep 11, 2024, 2:01:31 PMSep 11
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

> Because test strings are in UTF-8?

Yes.

Suppose you want to test the correctness of the conversions from the
terminal encoding to the desired encoding and back at a very superficial
level. You need to provide a path that will be encoded differently than
the terminal encoding, but the conversion should be lossless.

For example, you can provide the test string in UTF-8 (which will be your
terminal encoding) and say that a part of it should be converted to
KOI8-R: `/#enc:KOI8-R/тестовый/путь`. You can now check that after the
roundtrip the path has the correct length in bytes (38). You should also
check that `/тестовый/путь` takes 14 bytes in KOI8-R and not 26 bytes. But
this was never checked, and so no one has found out that the test is
wrong.

The thing is, you have to do `str_init_strings ("UTF-8")` for the code to
work correctly (UTF-8 -> KOI8-R -> UTF-8). If you do `str_init_strings
(NULL)`, it will assume that the terminal encoding is ASCII and your
string is in ASCII, but it's actually in UTF-8. What happens then depends
on how exactly `iconv` fails, but because we don't do proper error
handling, the roundtrip actually works on Linux and macOS as no conversion
happens and the raw bytes are just preserved. On Solaris and musl,
different things happen, but at least one of the two conversions fails.

So we can either fix the test by adding a missing assertion and setting
the encoding to UTF-8, which it is, and then it will work everywhere. Or
we can choose a different pair of encodings.

For example, hardcode bytes to CP1251 and do a CP1251 -> KOI8-R -> CP1251
conversion. But this might not be so good for length checks, because both
are singlebyte encodings. Unfortunately, the only multibyte encoding we
support is UTF-8, because the whole path handling depends on ASCII symbols
like `/` being encoded as single-byte ASCII. So I think it's fine to keep
UTF-8. It's actually a good test, and single byte conversions should be
tested elsewhere, like `vfs_path_string_convert'.

I have started a branch to fix the tests. Unfortunately, the following
tests still need to be investigated (some of them may fail due to broken
mocks, see #4584):

{{{
FAIL: path_cmp
FAIL: path_manipulations
FAIL: path_serialize
FAIL: vfs_path_string_convert
FAIL: edit_complete_word_cmd
}}}

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:22>

Ticket System

unread,
Sep 12, 2024, 5:05:49 AMSep 12
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

`path_cmp` - fixed in the same way as `path_len`.

`path_manipulations` & `path_serialize` have the same problem: for some
reason I don't understand yet, `#enc:KOI8-R` is dropped when manipulating
the path (in `vfs_path_append_vpath_new` and so `vfs_path_to_str_flags`).

Setting terminal encoding to UTF-8 or a valid Cyrillic encoding like
`CP1251` fixes the tests.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:23>

Ticket System

unread,
Sep 12, 2024, 2:09:52 PMSep 12
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

Ok, I think I know why this happens. On Solaris, `g_iconv_open ("ASCII",
"KOI8-R")` fails because this conversion is considered invalid. On macOS
and apparently Linux, this converter can be created, and as long as the
compatible subset is given as strings, it even works.

The problem is the lack of error handling in the `str_crt_conv_from'
function in the VFS. At other call sites there are some cases where errors
are handled. But I don't know how to add error handling there. My idea is
to keep fixing tests and then make a list of places where error handling
is missing, so this went unnoticed.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:24>

Ticket System

unread,
Sep 13, 2024, 4:43:49 AMSep 13
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

`vfs_path_string_convert` segfaults, because the `NULL` pointer returned
in `vfs_translate_path_n` is simply ignored and later explodes as
`element->path` is `NULL`:

{{{

$ pstack core
core 'core' of 10170: ./vfs_path_string_convert
0000000000427cd0 vfs_path_to_str_flags () + 15f2
00000000004280b5 vfs_path_from_str_flags () + b2
000000000042b05d test_from_to_string_fn () + 3d
00007fffac43945c srunner_run_tagged () + 55c
0000000000424808 mctest_run_all () + 5f
000000000042b510 main () + b3
0000000000424687 _start_crt () + 87
00000000004245e8 _start () + 18
}}}

`edit_complete_word_cmd` segfaults due to missing error handling:

{{{
$ pstack core
core 'core' of 19717: ./edit_complete_word_cmd
00007fffaf379c05 iconv_close (ffffffffffffffff) + 15
00000000004f1af9 str_close_conv () + 28
00000000004ea422 str_nconvert_to_display () + 90
00000000004656a9 edit_collect_completion_from_one_buffer () + 2cd
0000000000465972 edit_collect_completions () + 1c4
0000000000466104 edit_complete_word_cmd () + 2d5
0000000000455319 test_autocomplete_fn () + c1
00007fffac43945c srunner_run_tagged () + 55c
00000000004545e8 mctest_run_all () + 5f
000000000045583e main () + 8a
0000000000454487 _start_crt () + 87
00000000004543e8 _start () + 18
}}}

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:25>

Ticket System

unread,
Sep 13, 2024, 5:32:40 AMSep 13
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Replying to [comment:25 zaytsev]:


> `edit_complete_word_cmd` segfaults due to missing error handling:
>
> {{{

> 00000000004ea422 str_nconvert_to_display () + 90
> }}}

I've added a commit with fix of that.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:26>

Ticket System

unread,
Sep 13, 2024, 6:52:50 AMSep 13
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

> I've added a commit with a fix for this.

Wow, with your change, the test passes when I remove my `UTF-8` commit!

However, if the terminal encoding is initialized as `UTF-8`, the test
fails (see screenshot for a better view - expected completion is
"эъйцукен", but actual completion is "эъ" in UTF-8 + "йцукен" in KOI8-R).
I can reproduce this on Mac as well by changing to `str_init_strings
("UTF-8")`.

I think this is of the same nature as other problems - with `UTF-8` in the
loop, the conversion of "йцукен" to KOI8-R (which probably shouldn't be
there) works and the test fails, but if some other encoding is used, the
string is simply copied without conversion and nobody notices because of
the lack of error handling.

I also tried running mc in a CP866 terminal, and two screenshots are
attached. It seems that there are indeed conversion/display bugs in the
editor. When I open the file from the test, it looks about right with no
translation, with UTF-8 it looks "sort of" right, but there is some
display bug, and with "KOI8-R" translation everything is wrong.

Unfortunately I'm completely lost with the editor stuff. Maybe if you
could help debug this, we can open a separate ticket for it, as it seems
to be a general problem with encoding translation in the editor...

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:27>

Ticket System

unread,
Sep 13, 2024, 7:08:43 AMSep 13
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

Regarding your changes to `str_nconvert_to_display` and
`str_nconvert_to_input`: this means that if conversion is not possible, it
is silently not done.

That's why I don't know how to implement error handling myself. I can't
throw an exception that will propagate the call stack, and if I just
return empty or unchanged strings, no one will notice that the conversion
didn't work. If I return NULL pointers, then the caller has to check, but
callers are usually pretty deep and can't display user messages or
anything.

So maybe it is better to return strings of the same size as the input, but
made of "????" - at least then the problem will be noticed? At the moment
I think that skipping the conversion as you did is what "fixed" the test,
but the conversions that happen might be wrong - see the strange way it
fails with "UTF-8".

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:28>

Ticket System

unread,
Sep 13, 2024, 8:44:39 AMSep 13
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Replying to [comment:27 zaytsev]:


> with UTF-8 it looks "sort of" right, but there is some display bug,

If you try input the cyrillic 'Н' or 'н' (Unicode 0x41d and 0x43d
respectively) you can see the dot as in screenshot (this looks like this
character is non-printable). All other letters are displayed correctly.
This is very strange.

> and with "KOI8-R" translation everything is wrong.

Can't confirm. Looks correct for me. Locale is ru_RU.ibm866 '''and'''
IBM866 in mc.charsets.
But can confirm if locale is ru_RU.ibm866 '''and''' CP866 in mc.charsets.
Seems the same name (IBM866 or CP866) must be as in locale as in
mc.charsets.

> Unfortunately I'm completely lost with the editor stuff. Maybe if you
could help debug this, we can open a separate ticket for it, as it seems
to be a general problem with encoding translation in the editor...

Ok.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:29>

Ticket System

unread,
Sep 13, 2024, 2:05:19 PMSep 13
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

> Seems the same name (IBM866 or CP866) must be as in locale as in
mc.charsets.

You are right! I don't have IBM866 on Mac, so I started mc as follows and
set the Terminal to Russian (DOS) - see screenshot.

{{{
zaytsev@Yurys-MBP ~ % locale -a | grep 866
ru_RU.CP866
zaytsev@Yurys-MBP ~ % export LANG=ru_RU.CP866 LC_ALL=ru_RU.CP866
zaytsev@Yurys-MBP ~ % locale
LANG="ru_RU.CP866"
LC_COLLATE="ru_RU.CP866"
LC_CTYPE="ru_RU.CP866"
LC_MESSAGES="ru_RU.CP866"
LC_MONETARY="ru_RU.CP866"
LC_NUMERIC="ru_RU.CP866"
LC_TIME="ru_RU.CP866"
LC_ALL="ru_RU.CP866"
}}}

After I changed IBM866 to CP866 in the charsets, the translation to KOI8-R
started working correctly, but I still have the weird non-printable
character for UTF-8, which is not right.

So do you have IBM866 on your system and no CP866? If yes, then I think we
have a rather annoying problem, because we can't make an entry that works
on all systems, unless we add a concept of aliases :( In any case, since
there is no error handling, instead of showing the error message or
interacting with the user somehow, mcedit just does a completely wrong
translation...

Maybe when the main translation converters are initialized, if `iconv`
returns an invalid converter, we can alert the user?

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:30>

Ticket System

unread,
Sep 13, 2024, 2:42:53 PMSep 13
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Replying to [comment:30 zaytsev]:


> So do you have IBM866 on your system and no CP866?

Yes. This is glibc.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:31>

Ticket System

unread,
Sep 14, 2024, 2:31:24 AMSep 14
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

> Yes. This is glibc.

I see, you are right: glibc only has `IBM866` - I think this is where it
all comes from. I tested on the latest Fedora: it doesn't have 866
locales, so I generated them:

{{{
root@fedora:~# localedef --no-archive --inputfile=ru_RU --charmap=IBM866
ru_RU.cp866
root@fedora:~# localedef --no-archive --inputfile=ru_RU --charmap=IBM866
ru_RU.ibm866
root@fedora:~# locale -a | grep 866
ru_RU.cp866
ru_RU.ibm866
root@fedora:~# iconv -l | grep 866
866//
866NAV//
CP866//
CP866NAV//
CSIBM866//
IBM866//
IBM866NAV//
}}}

When I start mc with these locales and terminal set to 866, translation
works regardless of the entry in `mc.charsets`. The only thing that
doesn't work reproducibly is the Russian `н`, which is shown as
unprintable.

What is your Linux system and what does iconv say? Is it possible that you
don't have the `CP866` alias in iconv?

{{{
vagrant@openindiana:~/src/mc/tests/src/editor$ iconv -l | grep 866
CP866 (CP866, CP-866, CP_866, 866),
IBM-866,

zaytsev@Yurys-MBP ~ % iconv -l | grep 866
CP866 866 CSIBM866 IBM866 MSCP866
}}}

On Solaris it doesn't even have `IBM866` as an alias, which is why the
tests segfaulted, on Mac it has an alias, but then the same strange thing
happens as on your system.

The change to IBM866 in our code from CP866 happened in changeset:56d95515
without any explanation.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:32>

Ticket System

unread,
Sep 14, 2024, 3:09:04 AMSep 14
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

Actually, I wonder why the values in `mc.charset` and their correspondence
to the current system locale matter at all. I thought they were just for
the transcoding menu, to map iconv-accepted conversion names to human-
readable descriptions. It sounds completely wrong for them to affect
anything else. If we could fix this, we could use aliases that work on all
systems.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:33>

Ticket System

unread,
Sep 14, 2024, 3:28:51 AMSep 14
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Replying to [comment:32 zaytsev]:


> When I start mc with these locales and terminal set to 866, translation

works regardless of the entry in mc.charsets. The only thing that doesn't
work reproducibly is the Russian н, which is shown as unprintable.

I've created ru_RU.cp866:
{{{


# localedef --no-archive --inputfile=ru_RU --charmap=IBM866 ru_RU.cp866
}}}

In ru_RU.CP866 locale and CP866 terminal character encoding:
* CP866 in mc.charsets: translation to KOI8-R doesn't work as in your
[/attachment/ticket/3972/Screenshot%202024-09-13%20at%2012.37.54.png
Screenshot 2024-09-13 at 12.37.54]
* IBM866 in mc.charsets: translation to KOI8-R works correctly.

In ru_RU.IBM866 locale and CP866 terminal character encoding:
* CP866 in mc.charsets: translation to KOI8-R doesn't work
* IBM866 in mc.charsets: translation to KOI8-R works correctly.

With CP866 in mc.charsets, translation to KOI8-R doesn't work in both
cases. Very strange.

> What is your Linux system

[https://www.basealt.ru/en/download#c1665 ALT Workstation 10]

and what does iconv say? Is it possible that you don't have the `CP866`
alias in iconv?

{{{
$ iconv -l | grep 866


866//
866NAV//
CP866//
CP866NAV//
CSIBM866//
IBM866//
IBM866NAV//
}}}

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:34>

Ticket System

unread,
Sep 14, 2024, 3:35:00 AMSep 14
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Regardless of 866 locale name (ru_RU.CP866 or ru_RU.IBM866),
`nl_langinfo(CODESET)` returns `IBM866` (glibc doesn't have `CP866`).

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:35>

Ticket System

unread,
Sep 14, 2024, 4:44:59 AMSep 14
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

> Very strange.

I think I'm beginning to understand why this happens. Your hunch about
`nl_langinfo` is correct.

In `check_codeset`, the result of `str_detect_termencoding` is compared to
the values in the `mc.charset` file to set `display_codepage`. In your
case it's `IBM866` and in mine it's `CP866`. If there is no match, the
result is `ASCII`. If `display_codepage` is set to a wrong value, things
break - but no warning is shown to the user.

https://github.com/MidnightCommander/mc/blob/master/src/main.c#L103

It's possible to find the same name or alias for important encodings on
different platforms that are supported by all iconvs (like `CP866` or
`CP1251`), but charmap filenames are incompatible between glibc and other
libcs such as BSD, Solaris, AIX and musl.

However, mc relies on charmap names instead of iconv names because it
wants to display "friendly" names of encodings in dialogs, and this
requires a key. This is also why a hack for ANSI1251 was added for Solaris
in the past.

I wonder how we can solve this problem:

* Is it possible to decouple `display_codepage` & `source_codepage` from
`mc.charsets`? I guess not :(
* Maybe we could introduce aliases to `mc.charset` (and change the format
to something that can be automatically parsed by glib to throw away old
complicated custom parsing code)?
* Another (bad?) idea is to make entries in `mc.charsets` like this:

{{{
IBM866 CP 866 (Linux)
CP866 CP 866 (Mac, Solaris)
}}}

Any thoughts on what we should do? Also, I'm not sure if it's possible to
show a warning in `check_codeset`...

For letter "Н" I still have no explanation :(

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:36>

Ticket System

unread,
Sep 14, 2024, 8:45:04 AMSep 14
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Replying to [comment:36 zaytsev]:


> Another (bad?) idea is to make entries in mc.charsets like this:

What about aliases: comma-separated items in 1st column:
{{{
IBM866,CP866 CP 866
CP1251,Windows-1251 Windows 1251
}}}

> For letter "Н" I still have no explanation :(

src/editor/editdraw.c:606: get utf-8 char from editbuffer.
src/editor/editdraw.c:732: convert utf-8 char to 8-bit one.
src/editor/editdraw.c:765: mis-interpretate 8-bit char as gnichar and pass
it to g_unichar_isprint().

I think I've fixed that. Patch edit_printable.patch is attached: the
(edit->utf8) branch (src/editor/editdraw.c:763) was removed.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:37>

Ticket System

unread,
Sep 14, 2024, 1:31:18 PMSep 14
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

> What about aliases: comma-separated items in 1st column

Yes, we can do that. Rewrite the parser with
`g_data_input_stream_read_line` & `g_strsplit` :) But what about the data
structure? Currently it's an array of `codepage_desc`. If we don't change
it, we might as well add additional entries like I suggested.

If we do change it (support aliases), then we can do something like add an
`aliases` GArray, but then we have to be very careful to make sure that
whenever encoding comes from "outside", aliases are considered and it's
normalized to an `id`. The first alias (id) will then be what iconv
accepts everywhere, and the next aliases are charmap names on different
systems.

> Patch version 2

Wow! Will have a look. I was thinking, maybe introduce a constant like
`UNPRINTABLE` for '.'?

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:38>

Ticket System

unread,
Sep 15, 2024, 5:56:01 AMSep 15
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

I looked a bit more into implementing aliases support. As I suggested,
there are two ways to do it:

1. Parse file into existing data structure (id & name)
2. Parse file into new data structure (id, aliases & name)

I tried the second option, which I think is the only one that makes things
better, but it just seems too crazy:

My understanding is that we need to check all usages of
`get_codepage_index` and `get_codepage_id` and the variables where the
results are stored. Every place where some comparisons are done needs to
be replaced with a new function like `codepage_equals` which takes the
aliases into account. I think it's about ~100 places to check.

So I tried the first option just to see how it feels and came up with a
patch (I know it's dirty, just to show what I mean!). I'm not sure if it
makes things any better compared to manual entries in the file. You need
some kind of way to disambiguate the descriptions for aliases, so I added
the `id` in parentheses. None of this feels like it's worth it... except
maybe if you like my `gio` / `glib` conversion instead of the usual
C-style manual parsing.

P.S. I also threw away the `default` thing, which I think was never used.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:39>

Ticket System

unread,
Sep 15, 2024, 10:02:36 AMSep 15
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by andrew_b):

Aliases mean that we will have several codepages assigned to one menu
item. We don't want several menu items with same text, are we? If one menu
item will cover several codepages we should somehow try them one by one to
found workable convertor.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:40>

Ticket System

unread,
Sep 15, 2024, 11:40:30 AMSep 15
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com, slava...@gmail.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+---------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: Future Releases
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495
Branch state: no branch | Votes for changeset:
--------------------------+---------------------------------------

Comment (by zaytsev):

> Aliases mean that we will have several codepages assigned to one menu
item.

Yes, this is the option (2) described in comment:39. But how to do this?
Do you have any idea how to avoid going through all the call sites? I will
not manage to rewrite this in several months. Parsing turned out to be the
"easy" part...

Maybe we can do more hacks like the Solaris one, even though it's
disgusting? I think not many code pages are affected. I have started to
make a list of differences between platforms.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:41>

Ticket System

unread,
Sep 24, 2024, 6:21:37 AMSep 24
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33

Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------
Changes (by zaytsev):

* cc: slavazanko (removed)
* branch_state: no branch => on review
* milestone: Future Releases => 4.8.33


Comment:

Branch: 3972_fix_illumos
Initial changeset:c4a29da290dd6ebf6fb88071f9e6456820e92f8c

I have rebased everything, cleaned up the patches and went for the
encoding name detection. The other options are just an insane amount of
work with the current code base. Follow-up ticket is #4591. Maybe one day
someone could have a look.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:43>

Ticket System

unread,
Sep 30, 2024, 4:04:08 AMSep 30
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by zaytsev):

It seems that there is still one test, that now fails on Linux :( The
paths are dropped due to missing error handling as explained in #4591.

{{{
FAIL: vfs_path_string_convert
}}}

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:44>

Ticket System

unread,
Sep 30, 2024, 5:16:34 AMSep 30
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by zaytsev):

The test passes on Linux if I add `CP866` in addition to `IBM866` to
`mc.charsets` for tests.

{{{
IBM866 CP 866
CP866 CP 866
}}}

What is this evil magic again :( I thought that `iconv` translations were
independent of the display stuff, but somehow there is still a
relationship.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:45>

Ticket System

unread,
Oct 1, 2024, 3:05:12 AMOct 1
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by andrew_b):

Replying to [comment:45 zaytsev]:


> The test passes on Linux if I add `CP866` in addition to `IBM866` to
`mc.charsets` for tests.
>
> {{{
> IBM866 CP 866
> CP866 CP 866
> }}}
>
> What is this evil magic again :( I thought that `iconv` translations
were independent of the display stuff, but somehow there is still a
relationship.

The test is passed if encoding in mc.charsets is the same as in
vfs_path_string_convert.c.

{{{
sed -i -e 's/CP866/IBM866/' vfs_path_string_convert.c
}}}
makes the test passed with IBM866 in mc.charsets.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:46>

Ticket System

unread,
Oct 1, 2024, 4:24:55 AMOct 1
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by zaytsev):

> The test is passed if encoding in mc.charsets is the same as in
vfs_path_string_convert.c.

Yes, I understand that. My question is WHY?

I found the answer for initialisation (codeset must match), and figured
it's impossible for me to fix without rewriting mc. But I was hoping that
conversions would be independent of this list of encodings: The `iconv`
function works with both `CP866` and `IBM866`.

The only suspicious place I've found is `is_supported_encoding`, but when
I set it to return `true` it still doesn't work. There seems to be some
other hidden connection.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:47>

Ticket System

unread,
Oct 3, 2024, 5:40:53 AMOct 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by zaytsev):

It is the `is_supported_encoding` function! Apparently the source files
were not rebuilt properly the last time I checked, so this confused me.
This function determines if encoding is supported by looking at the
charsets table, which I think is wrong. My idea would be to try creating a
converter with iconv instead, and if it works - great, if not, the
encoding is not supported.

{{{
diff --git a/lib/charsets.c b/lib/charsets.c
index ccaf4f6ae..67553fa62 100644
--- a/lib/charsets.c
+++ b/lib/charsets.c
@@ -267,17 +267,10 @@ get_codepage_index (const char *id)
gboolean
is_supported_encoding (const char *encoding)
{
- gboolean result = FALSE;
- guint t;
-
- for (t = 0; t < codepages->len; t++)
- {
- const char *id;
-
- id = ((codepage_desc *) g_ptr_array_index (codepages, t))->id;
- result |= (g_ascii_strncasecmp (encoding, id, strlen (id)) == 0);
- }
-
+ GIConv coder = str_crt_conv_from (encoding);
+ gboolean result = (coder != INVALID_CONV) ? TRUE : FALSE;
+ if (result)
+ str_close_conv (coder);
return result;
}
}}}

But I have the next problem: `canonicalize_pathname` test fails :(

{{{
/b/#enc:utf-8/../c != /c
/b/c/#enc:utf-8/.. != /b
}}}

Other test cases pass...

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:48>

Ticket System

unread,
Oct 3, 2024, 5:52:51 AMOct 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by zaytsev):

This seems to be a bug in the code: the old version of
`is_supported_encoding` would ignore everything in the encoding name if a
partial match was found. The new version, of course, wants a full match.
The code in these test cases passes `utf-8/../c` as the encoding name, and
the new function returns false.

I think a partial match is a bad idea. It could allow old code to accept
completely invalid encoding names like `koi8-rus`. It wouldn't be removed
from the path even though it's invalid and unsupported.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:49>

Ticket System

unread,
Oct 3, 2024, 7:32:26 AMOct 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by zaytsev):

Suggested fix in changeset:3ef535f8b48e850f7ef465c2dee36e9cde3f55b6 .
Please review.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:50>

Ticket System

unread,
Oct 3, 2024, 2:25:26 PMOct 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by andrew_b):

Replying to [comment:50 zaytsev]:


> Suggested fix in changeset:3ef535f8b48e850f7ef465c2dee36e9cde3f55b6 .
Please review.

Fixes:
* coding style;
* memory leaks in canonicalize_pathname_custom() due to usage of
vfs_get_encoding();
vfs_get_encoding() was move to the "public functions" section of file.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:51>

Ticket System

unread,
Oct 3, 2024, 3:02:44 PMOct 3
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
--------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: on review | Votes for changeset:
--------------------------+----------------------------------

Comment (by zaytsev):

Thank you very much! I didn't realise that this function allocates a new
string that needs to be freed :(

What about the rest? I think we fixed some bugs, but the aliases and error
handling stuff are more like design issues. I have created a follow up if
anyone wants to address this one day. As far as I'm concerned, this one is
done.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:52>

Ticket System

unread,
Oct 5, 2024, 4:26:16 AMOct 5
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-------------------------+----------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: accepted
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: approved | Votes for changeset: andrew_b
-------------------------+----------------------------------
Changes (by andrew_b):

* votes: => andrew_b
* branch_state: on review => approved


Comment:

tests: vfs_path_string_convert - use UTF-8 to prevent creation of invalid
converters
tests: path_manipulations / path_serialize - use UTF-8 to prevent
creation of invalid converters

can be squashed.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:53>

Ticket System

unread,
Oct 5, 2024, 5:51:30 AMOct 5
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: testing

Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:

Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------
Changes (by zaytsev):

* status: accepted => testing
* votes: andrew_b => committed-master
* resolution: => fixed
* branch_state: approved => merged


Comment:

Merged as changeset:a24d45ffab3ee9f5e74f5b210f79e6ea91e3b69e . Wow!

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:54>

Ticket System

unread,
Oct 8, 2024, 2:54:49 AMOct 8
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: testing
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------

Comment (by andrew_b):

An aftershok.

{{{
if ! xgettext -h 2>&1 | grep -e '--keyword=' >/dev/null ; then
}}}
This line in '''autogen.sh''' is not quite correct. The help of program is
translated and may not matched the pattern '--keyword='. Particularly, in
Russian
{{{
$ xgettext -h 2>&1 | grep -e '--keyword='
$
}}}
and po-related files are not created. Because
{{{
$ xgettext -h 2>&1 | grep -e '--keyword'
-k, --keyword[=СЛОВО] дополнительное ключевое слово для поиска
-k, --keyword не использовать ключевые слова по умолчанию
}}}

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:55>

Ticket System

unread,
Oct 8, 2024, 3:12:01 AMOct 8
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: testing
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------

Comment (by zaytsev):

Oh, this is annoying, sorry.

What do you think is the best way to fix it? I wasn't happy about it
anyway, because of course the text might go unnoticed, but I think the
situation before, with random errors, was even worse. Do you have a better
idea?

* `grep -e '--keyword'` without =
* `LANG=C`
* Print it in red using ANSI sequences?
* Print to stderr?
* Go to the beginning of the script and `exit 1`?

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:56>

Ticket System

unread,
Oct 8, 2024, 3:37:27 AMOct 8
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: testing
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------

Comment (by andrew_b):

`LANG=C xgettext --help`: output is in Russian.
`LC_MESSAGES xgettext --help`: output is in English

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:57>

Ticket System

unread,
Oct 17, 2024, 3:17:56 AM (9 days ago) Oct 17
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: testing
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------

Comment (by zaytsev):

Unfortunately, I managed to also break a bootstrapped build on Solaris. Ok
to commit to master?

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:58>

Ticket System

unread,
Oct 17, 2024, 5:53:41 AM (9 days ago) Oct 17
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: testing
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------

Comment (by andrew_b):

Ok.

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:59>

Ticket System

unread,
Oct 17, 2024, 7:06:44 AM (9 days ago) Oct 17
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: testing
Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------

Comment (by zaytsev):

Done, thanks!

--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:60>

Ticket System

unread,
Oct 18, 2024, 3:11:54 PM (7 days ago) Oct 18
to mno...@startmail.com, yu...@shurup.com, mc-...@googlegroups.com
#3972: lib/vfs testsuite failures on illumos
-----------------------+----------------------------------------
Reporter: mnowak | Owner: zaytsev
Type: defect | Status: closed

Priority: major | Milestone: 4.8.33
Component: tests | Version: master
Resolution: fixed | Keywords:
Blocked By: | Blocking: 4495, 4591
Branch state: merged | Votes for changeset: committed-master
-----------------------+----------------------------------------
Changes (by zaytsev):

* status: testing => closed


--
Ticket URL: <http://www.midnight-commander.org/ticket/3972#comment:61>

Reply all
Reply to author
Forward
0 new messages