Hi,
Stephan Weinberger wrote:
I'm struggling with encodings (again), particularly the sketchy/broken implementation of TRANSLIT in my libiconv.
//TRANSLIT would not work on inputs, as the target encoding (UTF-8) can handle all characters from whatever source encoding, so no transliteration is necessary when reading user input.
Sure. I was trying to use to_text(to_bytes(...)) to coerce it to
do the transliteration for me, but I only got "invalid character
sequence" errors (or very strange transliterations like "ö" ->
"A?") when trying to use "ASCII//TRANSLIT" as a target encoding.
After some googling I came to the conclusion that there are
various different implementations of iconv in circulation that all
do TRANSLIT a bit differently (or not at all).
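To illustrate the point about transliteration only mattering on the way out, here is a small sketch in Python (not LPC, and not what any iconv implementation does internally). The helper ascii_translit is hypothetical: it approximates ASCII//TRANSLIT by stripping accents and replacing everything else with "?". The last step shows one possible, unconfirmed explanation for the "ö" -> "A?" result: the UTF-8 bytes of "ö" being treated as Latin-1 before transliteration.

```python
import unicodedata


def ascii_translit(text):
    """Hypothetical stand-in for ASCII//TRANSLIT: strip accents, '?' for the rest."""
    # Split accented letters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base characters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Anything still outside ASCII becomes '?'.
    return base.encode("ascii", errors="replace").decode("ascii")


# Reading input: every Latin-1 byte has a Unicode code point, so decoding
# never needs transliteration.
assert bytes([0xF6]).decode("latin-1") == "ö"

# Writing output: transliteration only matters when the target is narrower.
print(ascii_translit("ö"))  # o

# Assumption: the UTF-8 bytes of "ö" get misread as Latin-1 first,
# which turns one character into two before transliteration happens.
mojibake = "ö".encode("utf-8").decode("latin-1")
print(ascii_translit(mojibake))  # A?
```

Real iconv implementations differ in what they emit for "ö", which matches the observation above that TRANSLIT is not consistent across implementations.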
That's what we had before setting the encoding. I don't recall having problems with static functions (the simul-efun does set_this_object(previous_object()); at the beginning).
Does simul_efun get the same special treatment for call_other() as input_to() does? I admit I haven't tried that yet.
A much cleaner solution IMHO would be a new hook H_MODIFY_INPUT that is applied to _all_ incoming text.
A companion to H_MODIFY_COMMAND for input_to()s would make sense; that one could get input_to information as well. Combining both into a single function for all text could be done on the mudlib side.
My thinking was that, if you want to do something like that, it would likely affect all input anyway, so having one "central" hook would make more sense to me. The special hook for commands would then be geared more towards aliases, handling of paralysis, etc.
But either way is fine with me, as long as there is a way to capture any input at some point. Commands and input_to()s are the only ways to get text input, aren't they?
But would it be sufficient to get the already converted Unicode string, or do you need the actual byte input? Because that would be messy. The conversion is done right at the beginning, interwoven with telnet handling (as telnet handling can change the encoding), and before command (newline) detection.
Yes, telneg handling and conversion to Unicode should happen
before that. That's exactly what I _don't_ want to deal with in
the lib ;-)
So either H_MODIFY_INPUT right after telnegs/conversion but before
the branch into command/input_to, or H_MODIFY_INPUT_TO right
before calling the input_to target function.
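The two placements can be modelled roughly like this. It is a Python sketch of the control flow only, not driver code: process_line and the hooks dictionary are made up for illustration, and H_MODIFY_INPUT and H_MODIFY_INPUT_TO are the proposed hooks from this thread, not existing ones.

```python
def process_line(raw, encoding, pending_input_to, hooks):
    """Model one input line travelling through the driver (hypothetical structure)."""
    # Stands in for telneg handling and conversion to Unicode.
    text = raw.decode(encoding)

    # Option 1: one central hook, applied before the branch.
    if "H_MODIFY_INPUT" in hooks:
        text = hooks["H_MODIFY_INPUT"](text)

    if pending_input_to is not None:
        # Option 2: a hook applied right before the input_to target is called.
        if "H_MODIFY_INPUT_TO" in hooks:
            text = hooks["H_MODIFY_INPUT_TO"](text)
        return ("input_to", pending_input_to(text))

    # Command path: aliases, paralysis handling etc. would live here.
    if "H_MODIFY_COMMAND" in hooks:
        text = hooks["H_MODIFY_COMMAND"](text)
    return ("command", text)


hooks = {"H_MODIFY_INPUT": str.strip}
print(process_line(b" look ", "utf-8", None, hooks))      # ('command', 'look')
print(process_line(b" yes ", "utf-8", str.upper, hooks))  # ('input_to', 'YES')
```

With option 1 the lib sees every line once, already as Unicode; with option 2 the same function would have to be registered for both the command hook and the input_to hook to cover all input.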