Hi Kimmo,
> This is not a big problem, but is this something that could be fixed?
Frankly, I don't know if I can fix it at all or how long it will take to find the cause. I tried umlauts in Linux Mint now as well and visone behaved as it should, so I don't have an immediate failing test case.
visone itself doesn't do any kind of manual path or string manipulation, but defers all these manipulation operations to Apache Commons Configuration2 and Java (which internally might defer to the operating system). So the cause is likely hidden somewhere in the JDK or the Commons package, and even if I find it, I might not really be able to do anything about it.
I have one guess as to what's going on. More complex characters like umlauts often don't have a unique representation in Unicode, but several. For example, the umlaut ü can be represented as the single precomposed character ü (U+00FC) or as a plain u followed by a combining diaeresis ◌̈ (U+0308). Even if you use the same Unicode encoding, such as UTF-8, these two representations result in two different byte sequences. Because of this, there are a number of Unicode normalization algorithms that translate the different representations of the same character into one of them.
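To make this concrete, here is a small Java sketch. It is not visone code, and the class and method names are made up purely for illustration. It only shows that the two forms of ü are different strings with different UTF-8 byte lengths, and that normalization (here NFC, via the JDK's java.text.Normalizer) maps one form onto the other.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Illustrative only: class and method names are hypothetical, not part of visone.
public class UmlautForms {
    // Precomposed form: the single code point U+00FC.
    static final String COMPOSED = "\u00FC";
    // Decomposed form: 'u' followed by U+0308 COMBINING DIAERESIS.
    static final String DECOMPOSED = "u\u0308";

    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Convert a string to Unicode Normalization Form C (composed).
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // Both look like "ü" on screen, but they are different strings.
        System.out.println(COMPOSED.equals(DECOMPOSED));        // false
        System.out.println(utf8Length(COMPOSED));               // 2
        System.out.println(utf8Length(DECOMPOSED));             // 3
        // After normalization to NFC, the two forms are equal.
        System.out.println(toNfc(DECOMPOSED).equals(COMPOSED)); // true
    }
}
```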
Likely, the file paths on your system use one of the possible ways to represent umlauts. visone then extracts the directory, stores it as a string in the configuration, reads it back, and passes it to the file dialog, which tries to look up the directory. Maybe this sequence of operations triggers some kind of Unicode normalization at some point, replacing one representation of the umlauts with another. So the paths in the file system use one representation of umlauts, but visone/Java ends up looking for a directory using the other representation.
Unicode support on Linux often just boils down to "we can just use UTF-8 everywhere and then compare strings byte by byte". That's also the case for file names, which are basically just treated as byte sequences. But that means that a file lookup doesn't account for these different ways to encode the same character in Unicode. Rather, when the byte sequences for the directory names do not match because the representation of an umlaut differs, the directory won't be found anymore.
So my guess is: Somewhere between extraction of the directory path, storage in and extraction from the configuration, conversion into a File object and lookup of the directory, some kind of Unicode normalization is performed. This exchanges the representation of the umlaut character. Because Linux effectively compares file paths byte by byte, lookup of the directory fails, and so the file dialog defaults to the home directory.
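If you want to check this guess on your side, something like the following sketch could help. Again, this is not visone code: the class, the method names and the example paths are hypothetical. It reports which normalization form a path string is in, and whether two path strings differ only in their normalization, using the JDK's java.text.Normalizer.

```java
import java.text.Normalizer;

// Illustrative only: names and example paths are hypothetical, not part of visone.
public class PathFormCheck {
    // Report which Unicode normalization form a path string is in.
    static String describe(String path) {
        boolean nfc = Normalizer.isNormalized(path, Normalizer.Form.NFC);
        boolean nfd = Normalizer.isNormalized(path, Normalizer.Form.NFD);
        if (nfc && nfd) return "unaffected"; // e.g. plain ASCII
        if (nfc) return "NFC";               // precomposed characters
        if (nfd) return "NFD";               // base letter + combining mark
        return "mixed";
    }

    // True if the two strings differ as stored but are equal after NFC normalization.
    static boolean differOnlyByNormalization(String a, String b) {
        return !a.equals(b)
                && Normalizer.normalize(a, Normalizer.Form.NFC)
                        .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String composed = "/tmp/M\u00FCll";    // hypothetical path, ü as U+00FC
        String decomposed = "/tmp/Mu\u0308ll"; // same path, ü as u + U+0308
        System.out.println(describe(composed));   // NFC
        System.out.println(describe(decomposed)); // NFD
        System.out.println(differOnlyByNormalization(composed, decomposed)); // true
    }
}
```

Running describe on the directory name as the file system reports it and on the path stored in the configuration would show whether the two really use different representations.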
Best,
Julian