We've had one report of someone using a non-utf-8 encoding, which means there are probably more non-utf-8 users out there.
We shouldn't avoid doing the right thing - whatever that is - but breaking a user's long-working setup, for no clear benefit to that person, is not a good thing. So I am getting some cold feet here, at least for the imminent 1.26 release.
Here are some options, some of them feasible for 1.26 and some maybe not:
1. Keep the status quo: read files using the encoding of the system locale.
2. Always read files as utf-8, ignoring the system locale (the idea floated above).
3. Option 2, but with a fallback flag like --use-system-locale for people who want to keep the old behaviour. (This was always my plan, time allowing.)
4. New idea: option 1, but detect the most common system locale misconfigurations ($LANG unset, set to C, or set to some value not including a variation of "utf8"?) and in those cases assume utf-8. (Too much complication?)
5. Option 4, but without the $LANG heuristics: instead, catch all decoding-related errors and in that case try again with utf-8. (Feasible?)
6. Something else?
Any thoughts?
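
To make options 4 and 5 a bit more concrete, here is a rough sketch in Python. It is only an illustration, not the project's code: the function names `locale_looks_misconfigured` and `decode_with_fallback` are made up, treating `POSIX` the same as `C` is my assumption, and the looser "value doesn't mention utf8" check from option 4 is left out.

```python
import os


def locale_looks_misconfigured(env=None):
    """Option 4 heuristic (hypothetical helper): is $LANG unset or C?

    Treating "POSIX" like "C" is an assumption, not something decided above.
    """
    env = os.environ if env is None else env
    lang = env.get("LANG", "")
    return lang in ("", "C", "POSIX")


def decode_with_fallback(data, system_encoding):
    """Option 5 (hypothetical helper): try the system locale's encoding first;
    if decoding fails, try again with utf-8."""
    try:
        return data.decode(system_encoding)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an encoding name Python does not recognise.
        return data.decode("utf-8")


if __name__ == "__main__":
    utf8_bytes = "café".encode("utf-8")
    # An ASCII (C locale) decode fails on the non-ASCII bytes, so the
    # utf-8 retry kicks in.
    print(decode_with_fallback(utf8_bytes, "ascii"))
    print(locale_looks_misconfigured({"LANG": "C"}))
```

One thing the sketch makes visible for option 5: the retry only helps when the first decode actually raises an error. A single-byte encoding such as ISO-8859-1 accepts every byte sequence, so utf-8 data read under such a locale would be silently misdecoded rather than retried.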