Git Bash not decoding unicode characters?

Alex Budovski

Jan 8, 2011, 1:15:17 AM

to msysgit

Hi,

If you view the log of, say, the git project from Git Bash, you'll
find lines like

Nguy<E1><BB><85>n Th<C3><A1>i Ng<E1><BB><8D>c Duy <pcl...@gmail.com>

where I assume things like <E1><BB><85> are UTF-8 sequences that
weren't decoded at all by the pager (GNU less).
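
To check that assumption, here is a minimal Python sketch (not part of git or less; the helper name decode_less_escapes is made up for illustration) that turns the <XX> hex escapes back into raw bytes and decodes the result as UTF-8:

```python
import re


def decode_less_escapes(line: str) -> str:
    """Replace <XX> hex escapes (as printed by the pager) with the raw
    bytes they stand for, then decode the whole line as UTF-8."""
    raw = re.sub(
        rb"<([0-9A-Fa-f]{2})>",                   # exactly two hex digits in angle brackets
        lambda m: bytes([int(m.group(1), 16)]),   # escape -> single byte
        line.encode("ascii"),
    )
    return raw.decode("utf-8")


name = decode_less_escapes("Nguy<E1><BB><85>n Th<C3><A1>i Ng<E1><BB><8D>c Duy")

# Print the non-ASCII code points in an ASCII-safe form, so the check
# itself does not depend on the console encoding.
print([hex(ord(c)) for c in name if ord(c) > 127])
# → ['0x1ec5', '0xe1', '0x1ecd']
```

Each escaped group decodes to exactly one character (U+1EC5, U+00E1, U+1ECD), so the bytes in the log are valid UTF-8 and only the display step fails.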

Is this a known issue, or is there a solution?

-Alex

Alex Budovski

Jan 8, 2011, 2:50:39 AM

to msysgit

A related, possibly even simpler, problem is the following:

In Git Bash:

$ xxd -g1 ~/a.txt
0000000: c3 a9 // file contains 2 bytes: 0xC3 0xA9
$ cat ~/a.txt
Ac

The file contains the utf-8 sequence for U+00E9 LATIN SMALL LETTER E WITH ACUTE.

But the display is 2 characters, A followed by c. The two bytes were
each shown as a separate character instead of being decoded as 1 character.
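
The difference can be reproduced outside the console. The Python sketch below only illustrates the decoding step, not what cat or the console do internally; it decodes the same two bytes as UTF-8 and as two single-byte code pages (cp437 and cp1252 are picked as examples, the code page actually in effect may differ):

```python
# The two bytes from a.txt, as shown by xxd above.
data = bytes([0xC3, 0xA9])

as_utf8 = data.decode("utf-8")      # multi-byte decoding: one character
as_cp437 = data.decode("cp437")     # single-byte OEM code page: two characters
as_cp1252 = data.decode("cp1252")   # single-byte ANSI code page: two characters

# Report lengths and code points in ASCII-safe form.
for label, text in (("utf-8", as_utf8), ("cp437", as_cp437), ("cp1252", as_cp1252)):
    print(label, len(text), [hex(ord(c)) for c in text])
# utf-8 1 ['0xe9']
# cp437 2 ['0x251c', '0x2310']
# cp1252 2 ['0xc3', '0xa9']
```

Only the UTF-8 decoding gives U+00E9. Any single-byte code page maps each byte to its own character, which matches the two-character output seen in Git Bash.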

I'm using a TrueType font in the console window (Lucida Console).

If I use cmd.exe, by default it also mis-decodes it (since the default
codepage is OEM == 437), but I can easily change it to UTF-8 by
running: chcp 65001, then 'type a.txt' will decode and print the
correct character, é.

I've seen issue 358 but I don't believe anything there remedies this situation.

-Alex

Karsten Blees

Jan 8, 2011, 6:22:56 PM

to msysGit

Hi,

MSys programs (i.e. bash, less, cat etc. that come with git) don't
support Unicode/UTF-8. There are several workarounds for better pager
UTF-8 support though, see http://groups.google.com/group/msysgit/msg/14ad6d31d4fa8c0e

Karsten

On Jan 8, 7:15 am, Alex Budovski <abudov...@gmail.com> wrote:
> Hi,
>
> if you look at the log of say the git project from git bash, you'll
> find lines like
>

> Nguy<E1><BB><85>n Th<C3><A1>i Ng<E1><BB><8D>c Duy <pclo...@gmail.com>

Alex Budovski

Jan 8, 2011, 7:56:46 PM

to Karsten Blees, msysGit

Thanks for the workaround -- though it is a bit unfortunate to hear
the news. Hopefully this will be addressed at some stage in the future
(perhaps by upgrading the versions of msys utilities to match
cygwin's, since cygwin less seems to work).
