Here is v13 of the Unicode patch series:
git: http://repo.or.cz/w/git/mingw/4msysgit/kblees.git/kb/unicode-v13
less-444 (unchanged):
https://github.com/kblees/msysgit/commits/kb/msys/less
Unicode msys.dll (unchanged):
https://github.com/kblees/msysgit/commits/kb/msys/unicode
A git installer that bundles the Unicode msys.dll, less-444 and git
v1.7.7.1 with unicode-v13 patches can be found here:
https://docs.google.com/uc?id=0BxXUoUg2r8ftNWZkY2U1NGQtZDVmNy00N2FhLWJhZTItYjRjYTg3NGJhNzQx&export=download
And TortoiseGit V1.7.5.0 built with CP_ACP replaced by CP_UTF8
(unchanged):
32 bit:
https://docs.google.com/uc?id=0BxXUoUg2r8ftYjFmMzg0MmItYjZhMy00MjM4LWFkYjktN2RiOTUxNDdiMzdk&export=download
Changes since kb/unicode-v12:
- Win32: Thread-safe windows console output:
- added handling of split UTF-8 sequences
- minimized changes (old initialization code, didn't move
write_console)
- reformatted to fit into 80 cols with tab width 8
- improved commit message
- Win32: add Unicode conversion functions (and subsequent patches)
- renamed conversion functions to xutftowcs/xwcstoutf
- changed ENAMETOOLONG to ERANGE
- added filename-specific xutftowcs_path
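The errno convention described in the last three items can be sketched in portable C. This is only an illustration under stated assumptions: `sketch_convert` and `sketch_convert_path` are hypothetical stand-ins that copy bytes instead of converting UTF-8 to UTF-16, and `SKETCH_MAX_PATH` stands in for the Windows `MAX_PATH` constant. The real functions are `xutftowcsn` and `xutftowcs_path` in compat/mingw.h.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_PATH 260 /* stand-in for the Windows MAX_PATH constant */

/*
 * Hypothetical generic converter (a plain copy here, not a real
 * UTF-8 to UTF-16 conversion). dstlen is the capacity of dst,
 * including the terminator.
 * Returns the converted length, or -1 with errno set to
 * EINVAL (bad parameters) or ERANGE (output buffer too small).
 */
static int sketch_convert(char *dst, const char *src, size_t dstlen)
{
	size_t len;
	if (!dst || !src || dstlen < 1) {
		errno = EINVAL;
		return -1;
	}
	len = strlen(src);
	if (len >= dstlen) {
		dst[0] = 0;
		errno = ERANGE;
		return -1;
	}
	memcpy(dst, src, len + 1);
	return (int) len;
}

/*
 * Hypothetical path-specific wrapper: assumes dst holds SKETCH_MAX_PATH
 * elements and reports an overlong input as ENAMETOOLONG, which is what
 * callers of file system functions expect.
 */
static int sketch_convert_path(char *dst, const char *src)
{
	int result = sketch_convert(dst, src, SKETCH_MAX_PATH);
	if (result < 0 && errno == ERANGE)
		errno = ENAMETOOLONG;
	/* a successful result always fits into the path buffer */
	assert(result < SKETCH_MAX_PATH);
	return result;
}
```

The point of the split is that the generic function reports a neutral "buffer too small" error, while only the wrapper used for file names translates it into the file-system-specific error code.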
Commit logs ('!': modified in v13)
---
[01/25] MSVC: link dynamically to the CRT
[02/25] git-gui: fix encoding in git-gui file browser
[03/25] gitk: fix file name encoding in diff hunk headers
[04/25] Revert "Disable test on MinGW that challenges its bash quoting"
! [05/25] Win32: Thread-safe windows console output
! [06/25] Win32: add Unicode conversion functions
! [07/25] Win32: Unicode file name support (except dirent)
! [08/25] Win32: Unicode file name support (dirent)
[09/25] Unicode file name support (gitk and git-gui)
! [10/25] Win32: Unicode arguments (outgoing)
! [11/25] Win32: Unicode arguments (incoming)
! [12/25] Win32: sync Unicode console output and file system
! [13/25] Win32: Unicode environment (outgoing)
! [14/25] Win32: Unicode environment (incoming)
[15/25] MinGW: disable legacy encoding tests
[16/25] Win32: fix environment memory leaks
[17/25] Win32: unify environment case-sensitivity
[18/25] Win32: simplify internal mingw_spawn* APIs
[19/25] Win32: move environment functions
[20/25] Win32: unify environment function names
[21/25] Win32: move environment block creation to a helper method
[22/25] Win32: don't copy the environment twice when spawning child processes
[23/25] Win32: reduce environment array reallocations
[24/25] Win32: keep the environment sorted
[25/25] Win32: patch Windows environment on startup
Makefile | 8 +-
compat/mingw.c | 703 +++++++++++++++++++++++++++++++----------------
compat/mingw.h | 134 ++++++++-
compat/win32/dirent.c | 32 +--
compat/win32/dirent.h | 2 +-
compat/winansi.c | 387 +++++++++++++++++---------
git-gui/git-gui.sh | 6 +-
git-gui/lib/browser.tcl | 2 +-
git-gui/lib/index.tcl | 6 +-
gitk-git/gitk | 15 +-
run-command.c | 10 +-
t/t3901-i18n-patch.sh | 19 +-
t/t4201-shortlog.sh | 6 +-
t/t5505-remote.sh | 5 +-
t/t8005-blame-i18n.sh | 8 +-
15 files changed, 900 insertions(+), 443 deletions(-)
--
Diff between v12 and v13...
---
$ git diff -p --stat kb/unicode-v12..kb/unicode-v13
compat/mingw.c | 65 +++++++++++++---------------
compat/mingw.h | 39 +++++++++++++----
compat/win32/dirent.c | 9 +++-
compat/winansi.c | 113 +++++++++++++++++++++++++++++++++----------------
4 files changed, 144 insertions(+), 82 deletions(-)
diff --git a/compat/mingw.c b/compat/mingw.c
index 15c1029..2104f25 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -205,7 +205,7 @@ int mingw_unlink(const char *pathname)
{
int ret, tries = 0;
wchar_t wpathname[MAX_PATH];
- if (utftowcs(wpathname, pathname, MAX_PATH) < 0)
+ if (xutftowcs_path(wpathname, pathname) < 0)
return -1;
/* read-only files cannot be removed */
@@ -253,7 +253,7 @@ int mingw_rmdir(const char *pathname)
{
int ret, tries = 0;
wchar_t wpathname[MAX_PATH];
- if (utftowcs(wpathname, pathname, MAX_PATH) < 0)
+ if (xutftowcs_path(wpathname, pathname) < 0)
return -1;
while ((ret = _wrmdir(wpathname)) == -1 && tries < ARRAY_SIZE(delay)) {
@@ -293,7 +293,7 @@ void mingw_mark_as_git_dir(const char *dir)
{
wchar_t wdir[MAX_PATH];
if (hide_dotfiles != HIDE_DOTFILES_FALSE && !is_bare_repository())
- if (utftowcs(wdir, dir, MAX_PATH) < 0 || make_hidden(wdir))
+ if (xutftowcs_path(wdir, dir) < 0 || make_hidden(wdir))
warning("Failed to make '%s' hidden", dir);
git_config_set("core.hideDotFiles",
hide_dotfiles == HIDE_DOTFILES_FALSE ? "false" :
@@ -305,7 +305,7 @@ int mingw_mkdir(const char *path, int mode)
{
int ret;
wchar_t wpath[MAX_PATH];
- if (utftowcs(wpath, path, MAX_PATH) < 0)
+ if (xutftowcs_path(wpath, path) < 0)
return -1;
ret = _wmkdir(wpath);
if (!ret && hide_dotfiles == HIDE_DOTFILES_TRUE) {
@@ -335,7 +335,7 @@ int mingw_open (const char *filename, int oflags, ...)
if (filename && !strcmp(filename, "/dev/null"))
filename = "nul";
- if (utftowcs(wfilename, filename, MAX_PATH) < 0)
+ if (xutftowcs_path(wfilename, filename) < 0)
return -1;
fd = _wopen(wfilename, oflags, mode);
@@ -385,8 +385,8 @@ FILE *mingw_fopen (const char *filename, const char *otype)
hide = access(filename, F_OK);
if (filename && !strcmp(filename, "/dev/null"))
filename = "nul";
- if (utftowcs(wfilename, filename, MAX_PATH) < 0 ||
- utftowcs(wotype, otype, 10) < 0)
+ if (xutftowcs_path(wfilename, filename) < 0 ||
+ xutftowcs(wotype, otype, 10) < 0)
return NULL;
file = _wfopen(wfilename, wotype);
if (file && hide && make_hidden(wfilename))
@@ -404,8 +404,8 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream)
hide = access(filename, F_OK);
if (filename && !strcmp(filename, "/dev/null"))
filename = "nul";
- if (utftowcs(wfilename, filename, MAX_PATH) < 0 ||
- utftowcs(wotype, otype, 10) < 0)
+ if (xutftowcs_path(wfilename, filename) < 0 ||
+ xutftowcs(wotype, otype, 10) < 0)
return NULL;
file = _wfreopen(wfilename, wotype, stream);
if (file && hide && make_hidden(wfilename))
@@ -416,7 +416,7 @@ FILE *mingw_freopen (const char *filename, const char *otype, FILE *stream)
int mingw_access(const char *filename, int mode)
{
wchar_t wfilename[MAX_PATH];
- if (utftowcs(wfilename, filename, MAX_PATH) < 0)
+ if (xutftowcs_path(wfilename, filename) < 0)
return -1;
/* X_OK is not supported by the MSVCRT version */
return _waccess(wfilename, mode & ~X_OK);
@@ -425,7 +425,7 @@ int mingw_access(const char *filename, int mode)
int mingw_chdir(const char *dirname)
{
wchar_t wdirname[MAX_PATH];
- if (utftowcs(wdirname, dirname, MAX_PATH) < 0)
+ if (xutftowcs_path(wdirname, dirname) < 0)
return -1;
return _wchdir(wdirname);
}
@@ -433,7 +433,7 @@ int mingw_chdir(const char *dirname)
int mingw_chmod(const char *filename, int mode)
{
wchar_t wfilename[MAX_PATH];
- if (utftowcs(wfilename, filename, MAX_PATH) < 0)
+ if (xutftowcs_path(wfilename, filename) < 0)
return -1;
return _wchmod(wfilename, mode);
}
@@ -465,7 +465,7 @@ static int do_lstat(int follow, const char *file_name, struct stat *buf)
{
WIN32_FILE_ATTRIBUTE_DATA fdata;
wchar_t wfilename[MAX_PATH];
- if (utftowcs(wfilename, file_name, MAX_PATH) < 0)
+ if (xutftowcs_path(wfilename, file_name) < 0)
return -1;
if (GetFileAttributesExW(wfilename, GetFileExInfoStandard, &fdata)) {
@@ -608,7 +608,7 @@ int mingw_utime (const char *file_name, const struct utimbuf *times)
int fh, rc;
DWORD attrs;
wchar_t wfilename[MAX_PATH];
- if (utftowcs(wfilename, file_name, MAX_PATH) < 0)
+ if (xutftowcs_path(wfilename, file_name) < 0)
return -1;
/* must have write permission */
@@ -656,11 +656,11 @@ unsigned int sleep (unsigned int seconds)
char *mingw_mktemp(char *template)
{
wchar_t wtemplate[MAX_PATH];
- if (utftowcs(wtemplate, template, MAX_PATH) < 0)
+ if (xutftowcs_path(wtemplate, template) < 0)
return NULL;
if (!_wmktemp(wtemplate))
return NULL;
- if (wcstoutf(template, wtemplate, strlen(template) + 1) < 0)
+ if (xwcstoutf(template, wtemplate, strlen(template) + 1) < 0)
return NULL;
return template;
}
@@ -729,7 +729,7 @@ char *mingw_getcwd(char *pointer, int len)
wchar_t wpointer[MAX_PATH];
if (!_wgetcwd(wpointer, MAX_PATH))
return NULL;
- if (wcstoutf(pointer, wpointer, len) < 0)
+ if (xwcstoutf(pointer, wpointer, len) < 0)
return NULL;
for (i = 0; pointer[i]; i++)
if (pointer[i] == '\\')
@@ -1056,7 +1056,7 @@ static wchar_t *make_environment_block(char **deltaenv)
for (i = 0; tmpenv[i]; i++) {
size = 2 * strlen(tmpenv[i]) + 2;
ALLOC_GROW(wenvblk, (envblkpos + size) * sizeof(wchar_t), envblksz);
- envblkpos += utftowcs(&wenvblk[envblkpos], tmpenv[i], size) + 1;
+ envblkpos += xutftowcs(&wenvblk[envblkpos], tmpenv[i], size) + 1;
}
/* add final \0 terminator */
wenvblk[envblkpos] = 0;
@@ -1114,9 +1114,9 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
si.hStdOutput = winansi_get_osfhandle(fhout);
si.hStdError = winansi_get_osfhandle(fherr);
- if (utftowcs(wcmd, cmd, MAX_PATH) < 0)
+ if (xutftowcs_path(wcmd, cmd) < 0)
return -1;
- if (dir && utftowcs(wdir, dir, MAX_PATH) < 0)
+ if (dir && xutftowcs_path(wdir, dir) < 0)
return -1;
/* concatenate argv, quoting args as we go */
@@ -1137,7 +1137,7 @@ static pid_t mingw_spawnve_fd(const char *cmd, const char **argv, char **deltaen
}
wargs = xmalloc((2 * args.len + 1) * sizeof(wchar_t));
- utftowcs(wargs, args.buf, 2 * args.len + 1);
+ xutftowcs(wargs, args.buf, 2 * args.len + 1);
strbuf_release(&args);
wenvblk = make_environment_block(deltaenv);
@@ -1599,9 +1599,7 @@ int mingw_rename(const char *pold, const char *pnew)
DWORD attrs, gle;
int tries = 0;
wchar_t wpold[MAX_PATH], wpnew[MAX_PATH];
- if (utftowcs(wpold, pold, MAX_PATH) < 0)
- return -1;
- if (utftowcs(wpnew, pnew, MAX_PATH) < 0)
+ if (xutftowcs_path(wpold, pold) < 0 || xutftowcs_path(wpnew, pnew) < 0)
return -1;
/*
@@ -1842,9 +1840,8 @@ int link(const char *oldpath, const char *newpath)
typedef BOOL (WINAPI *T)(LPCWSTR, LPCWSTR, LPSECURITY_ATTRIBUTES);
static T create_hard_link = NULL;
wchar_t woldpath[MAX_PATH], wnewpath[MAX_PATH];
- if (utftowcs(woldpath, oldpath, MAX_PATH) < 0)
- return -1;
- if (utftowcs(wnewpath, newpath, MAX_PATH) < 0)
+ if (xutftowcs_path(woldpath, oldpath) < 0 ||
+ xutftowcs_path(wnewpath, newpath) < 0)
return -1;
if (!create_hard_link) {
@@ -1973,7 +1970,7 @@ int mingw_offset_1st_component(const char *path)
return offset + is_dir_sep(path[offset]);
}
-int utftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen)
+int xutftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen)
{
int upos = 0, wpos = 0;
const unsigned char *utf = (const unsigned char*) utfs;
@@ -1993,7 +1990,7 @@ int utftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen)
if (wpos >= wcslen) {
wcs[wpos] = 0;
- errno = ENAMETOOLONG;
+ errno = ERANGE;
return -1;
}
@@ -2045,7 +2042,7 @@ int utftowcsn(wchar_t *wcs, const char *utfs, size_t wcslen, int utflen)
return wpos;
}
-int wcstoutf(char *utf, const wchar_t *wcs, size_t utflen)
+int xwcstoutf(char *utf, const wchar_t *wcs, size_t utflen)
{
if (!wcs || !utf || utflen < 1) {
errno = EINVAL;
@@ -2054,7 +2051,7 @@ int wcstoutf(char *utf, const wchar_t *wcs, size_t utflen)
utflen = WideCharToMultiByte(CP_UTF8, 0, wcs, -1, utf, utflen, NULL, NULL);
if (utflen)
return utflen - 1;
- errno = ENAMETOOLONG;
+ errno = ERANGE;
return -1;
}
@@ -2099,14 +2096,14 @@ void mingw_startup()
buffer = xmalloc(maxlen);
/* convert command line arguments and environment to UTF-8 */
- len = wcstoutf(buffer, _wpgmptr, maxlen);
+ len = xwcstoutf(buffer, _wpgmptr, maxlen);
__argv[0] = xmemdupz(buffer, len);
for (i = 1; i < argc; i++) {
- len = wcstoutf(buffer, wargv[i], maxlen);
+ len = xwcstoutf(buffer, wargv[i], maxlen);
__argv[i] = xmemdupz(buffer, len);
}
for (i = 0; wenv[i]; i++) {
- len = wcstoutf(buffer, wenv[i], maxlen);
+ len = xwcstoutf(buffer, wenv[i], maxlen);
environ[i] = xmemdupz(buffer, len);
}
environ[i] = NULL;
diff --git a/compat/mingw.h b/compat/mingw.h
index 675eb06..675d19a 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -375,13 +375,33 @@ void mingw_mark_as_git_dir(const char *dir);
* utflen: size of string to convert, or -1 if 0-terminated
*
* Returns:
- * length of converted string (_wcslen(wcs)), or -1 on failure (errno is set
- * to EINVAL or ENAMETOOLONG)
+ * length of converted string (_wcslen(wcs)), or -1 on failure
+ *
+ * Errors:
+ * EINVAL: one of the input parameters is invalid (e.g. NULL)
+ * ERANGE: the output buffer is too small
+ */
+int xutftowcsn(wchar_t *wcs, const char *utf, size_t wcslen, int utflen);
+
+/**
+ * Simplified variant of xutftowcsn, assumes input string is \0-terminated.
+ */
+static inline int xutftowcs(wchar_t *wcs, const char *utf, size_t wcslen)
+{
+ return xutftowcsn(wcs, utf, wcslen, -1);
+}
+
+/**
+ * Simplified file system specific variant of xutftowcsn, assumes output
+ * buffer size is MAX_PATH wide chars and input string is \0-terminated,
+ * fails with ENAMETOOLONG if input string is too long.
*/
-int utftowcsn(wchar_t *wcs, const char *utf, size_t wcslen, int utflen);
-static inline int utftowcs(wchar_t *wcs, const char *utf, size_t wcslen)
+static inline int xutftowcs_path(wchar_t *wcs, const char *utf)
{
- return utftowcsn(wcs, utf, wcslen, -1);
+ int result = xutftowcsn(wcs, utf, MAX_PATH, -1);
+ if (result < 0 && errno == ERANGE)
+ errno = ENAMETOOLONG;
+ return result;
}
/**
@@ -410,10 +430,13 @@ static inline int utftowcs(wchar_t *wcs, const char *utf, size_t wcslen)
* utflen: size of target buffer
*
* Returns:
- * length of converted string, or -1 on failure (errno is set to EINVAL or
- * ENAMETOOLONG)
+ * length of converted string, or -1 on failure
+ *
+ * Errors:
+ * EINVAL: one of the input parameters is invalid (e.g. NULL)
+ * ERANGE: the output buffer is too small
*/
-int wcstoutf(char *utf, const wchar_t *wcs, size_t utflen);
+int xwcstoutf(char *utf, const wchar_t *wcs, size_t utflen);
/*
* A replacement of main() that adds win32 specific initialization.
diff --git a/compat/win32/dirent.c b/compat/win32/dirent.c
index 37f56b7..c69a689 100644
--- a/compat/win32/dirent.c
+++ b/compat/win32/dirent.c
@@ -9,7 +9,7 @@ struct DIR {
static inline void finddata2dirent(struct dirent *ent, WIN32_FIND_DATAW *fdata)
{
/* convert UTF-16 name to UTF-8 */
- wcstoutf(ent->d_name, fdata->cFileName, sizeof(ent->d_name));
+ xwcstoutf(ent->d_name, fdata->cFileName, sizeof(ent->d_name));
/* Set file type, based on WIN32_FIND_DATA */
if (fdata->dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
@@ -27,9 +27,12 @@ DIR *opendir(const char *name)
DIR *dir;
/* convert name to UTF-16, check length (-2 for '/' '*') */
- len = utftowcs(pattern, name, MAX_PATH - 2);
- if (len < 0)
+ len = xutftowcs(pattern, name, MAX_PATH - 2);
+ if (len < 0) {
+ if (errno == ERANGE)
+ errno = ENAMETOOLONG;
return NULL;
+ }
/* append optional '/' and wildcard '*' */
if (len && !is_dir_sep(pattern[len - 1]))
diff --git a/compat/winansi.c b/compat/winansi.c
index d11b532..18f2cdf 100644
--- a/compat/winansi.c
+++ b/compat/winansi.c
@@ -53,7 +53,8 @@ static void warn_if_raster_font(void)
/* GetCurrentConsoleFontEx is available since Vista */
pGetCurrentConsoleFontEx = (PGETCURRENTCONSOLEFONTEX) GetProcAddress(
- GetModuleHandle("kernel32.dll"), "GetCurrentConsoleFontEx");
+ GetModuleHandle("kernel32.dll"),
+ "GetCurrentConsoleFontEx");
if (pGetCurrentConsoleFontEx) {
CONSOLE_FONT_INFOEX cfi;
cfi.cbSize = sizeof(cfi);
@@ -62,8 +63,8 @@ static void warn_if_raster_font(void)
} else {
/* pre-Vista: check default console font in registry */
HKEY hkey;
- if (ERROR_SUCCESS == RegOpenKeyExA(HKEY_CURRENT_USER, "Console", 0,
- KEY_READ, &hkey)) {
+ if (ERROR_SUCCESS == RegOpenKeyExA(HKEY_CURRENT_USER, "Console",
+ 0, KEY_READ, &hkey)) {
DWORD size = sizeof(fontFamily);
RegQueryValueExA(hkey, "FontFamily", NULL, NULL,
(LPVOID) &fontFamily, &size);
@@ -72,10 +73,10 @@ static void warn_if_raster_font(void)
}
if (!(fontFamily & TMPF_TRUETYPE)) {
- const wchar_t *msg = L"\nWarning: Your console font probably doesn\'t "
- L"support Unicode. If you experience strange characters in the "
- L"output, consider switching to a TrueType font such as Lucida "
- L"Console!\n";
+ const wchar_t *msg = L"\nWarning: Your console font probably "
+ L"doesn\'t support Unicode. If you experience strange "
+ L"characters in the output, consider switching to a "
+ L"TrueType font such as Lucida Console!\n";
WriteConsoleW(console, msg, wcslen(msg), NULL, NULL);
}
}
@@ -85,6 +86,8 @@ static int is_console(int fd)
CONSOLE_SCREEN_BUFFER_INFO sbi;
HANDLE hcon;
+ static int initialized = 0;
+
/* get OS handle of the file descriptor */
hcon = (HANDLE) _get_osfhandle(fd);
if (hcon == INVALID_HANDLE_VALUE)
@@ -99,11 +102,34 @@ static int is_console(int fd)
return 0;
/* initialize attributes */
- attr = plain_attr = sbi.wAttributes;
- negative = 0;
+ if (!initialized) {
+ attr = plain_attr = sbi.wAttributes;
+ negative = 0;
+ initialized = 1;
+ }
+
return 1;
}
+#define BUFFER_SIZE 4096
+#define MAX_PARAMS 16
+
+static void write_console(unsigned char *str, size_t len)
+{
+ /* only called from console_thread, so a static buffer will do */
+ static wchar_t wbuf[2 * BUFFER_SIZE + 1];
+
+ /* convert utf-8 to utf-16 */
+ int wlen = xutftowcsn(wbuf, (char*) str, 2 * BUFFER_SIZE + 1, len);
+
+ /* write directly to console */
+ WriteConsoleW(console, wbuf, wlen, NULL, NULL);
+
+ /* remember if non-ascii characters are printed */
+ if (wlen != len)
+ non_ascii_used = 1;
+}
+
#define FOREGROUND_ALL (FOREGROUND_RED | FOREGROUND_GREEN | FOREGROUND_BLUE)
#define BACKGROUND_ALL (BACKGROUND_RED | BACKGROUND_GREEN | BACKGROUND_BLUE)
@@ -288,39 +314,21 @@ static void set_attr(char func, const int *params, int paramlen)
}
}
-#define BUFFER_SIZE 4096
-#define MAX_PARAMS 16
-
-static void write_console(char *str, size_t len)
-{
- /* only called from console_thread, so a static buffer will do */
- static wchar_t wbuf[2 * BUFFER_SIZE + 1];
-
- /* convert utf-8 to utf-16 */
- int wlen = utftowcsn(wbuf, str, 2 * BUFFER_SIZE + 1, len);
-
- /* write directly to console */
- WriteConsoleW(console, wbuf, wlen, NULL, NULL);
-
- /* remember if non-ascii characters are printed */
- if (wlen != len)
- non_ascii_used = 1;
-}
-
enum {
TEXT = 0, ESCAPE = 033, BRACKET = '[', EXIT = -1
};
static DWORD WINAPI console_thread(LPVOID unused)
{
- char buffer[BUFFER_SIZE];
+ unsigned char buffer[BUFFER_SIZE];
DWORD bytes;
- int start, end, c, parampos = 0, state = TEXT;
+ int start, end = 0, c, parampos = 0, state = TEXT;
int params[MAX_PARAMS];
while (state != EXIT) {
/* read next chunk of bytes from the pipe */
- if (!ReadFile(hread, buffer, BUFFER_SIZE, &bytes, NULL)) {
+ if (!ReadFile(hread, buffer + end, BUFFER_SIZE - end, &bytes,
+ NULL)) {
/* exit if pipe has been closed */
if (GetLastError() == ERROR_BROKEN_PIPE)
break;
@@ -329,6 +337,7 @@ static DWORD WINAPI console_thread(LPVOID unused)
}
/* scan the bytes and handle ANSI control codes */
+ bytes += end;
start = end = 0;
while (end < bytes) {
c = buffer[end++];
@@ -337,7 +346,8 @@ static DWORD WINAPI console_thread(LPVOID unused)
if (c == ESCAPE) {
/* print text seen so far */
if (end - 1 > start)
- write_console(buffer + start, end - 1 - start);
+ write_console(buffer + start,
+ end - 1 - start);
/* then start parsing escape sequence */
start = end - 1;
@@ -358,7 +368,10 @@ static DWORD WINAPI console_thread(LPVOID unused)
params[parampos] *= 10;
params[parampos] += c - '0';
} else if (c == ';') {
- /* next parameter, bail out if out of bounds */
+ /*
+ * next parameter, bail out if out of
+ * bounds
+ */
parampos++;
if (parampos >= MAX_PARAMS)
state = TEXT;
@@ -366,7 +379,10 @@ static DWORD WINAPI console_thread(LPVOID unused)
/* "\033[q": terminate the thread */
state = EXIT;
} else {
- /* end of escape sequence, change console attributes */
+ /*
+ * end of escape sequence, change
+ * console attributes
+ */
set_attr(c, params, parampos + 1);
start = end;
state = TEXT;
@@ -375,9 +391,32 @@ static DWORD WINAPI console_thread(LPVOID unused)
}
}
- /* print remaining text unless we're parsing an escape sequence */
- if (state == TEXT && end > start)
- write_console(buffer + start, end - start);
+ /* print remaining text unless parsing an escape sequence */
+ if (state == TEXT && end > start) {
+ /* check for incomplete UTF-8 sequences and fix end */
+ if (buffer[end - 1] >= 0x80) {
+ if (buffer[end -1] >= 0xc0)
+ end--;
+ else if (end - 1 > start &&
+ buffer[end - 2] >= 0xe0)
+ end -= 2;
+ else if (end - 2 > start &&
+ buffer[end - 3] >= 0xf0)
+ end -= 3;
+ }
+
+ /* print remaining complete UTF-8 sequences */
+ if (end > start)
+ write_console(buffer + start, end - start);
+
+ /* move remaining bytes to the front */
+ if (end < bytes)
+ memmove(buffer, buffer + end, bytes - end);
+ end = bytes - end;
+ } else {
+ /* all data has been consumed, mark buffer empty */
+ end = 0;
+ }
}
/* check if the console font supports unicode */
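The split-sequence handling in console_thread above can also be looked at in isolation. The following is a minimal, portable sketch of the same idea and not the code from the patch: `utf8_complete_len` is a hypothetical helper name, and it assumes the input is otherwise well-formed UTF-8. It returns how many leading bytes of a chunk end on a sequence boundary, so the caller can print those and carry the rest over to the next read.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns the number of leading bytes of buf that
 * can be converted now; the remaining (len - result) bytes are the
 * start of a multi-byte sequence whose tail has not arrived yet.
 */
static size_t utf8_complete_len(const unsigned char *buf, size_t len)
{
	size_t back;

	assert(buf != NULL || len == 0);

	/* a lead byte is at most 3 bytes before the end of a cut sequence */
	for (back = 1; back <= 3 && back <= len; back++) {
		unsigned char c = buf[len - back];
		size_t need;

		if (c < 0x80)
			return len; /* ASCII byte: nothing is pending */
		if (c < 0xc0)
			continue; /* continuation byte: keep looking back */

		/* lead byte: how many bytes does its sequence need? */
		need = c >= 0xf0 ? 4 : c >= 0xe0 ? 3 : 2;
		return back < need ? len - back : len;
	}
	/* empty input, or three continuation bytes ending a 4-byte sequence */
	return len;
}
```

A caller would write the first `utf8_complete_len(buf, len)` bytes, move the leftover bytes to the front of the buffer, and append the next chunk after them, which mirrors the memmove step in the patch.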
Does this fix the problem for you?
---
[PATCH] git-gui: fix git work tree encoding
...if the work tree path contains non-ASCII characters.
Signed-off-by: Karsten Blees <bl...@dcon.de>
---
git-gui/git-gui.sh | 1 +
1 files changed, 1 insertions(+), 0 deletions(-)
diff --git a/git-gui/git-gui.sh b/git-gui/git-gui.sh
index 5f1faeb..6b531e8 100755
--- a/git-gui/git-gui.sh
+++ b/git-gui/git-gui.sh
@@ -1228,6 +1228,7 @@ apply_config
# v1.7.0 introduced --show-toplevel to return the canonical work-tree
if {[package vsatisfies $_git_version 1.7.0]} {
set _gitworktree [git rev-parse --show-toplevel]
+ set _gitworktree [encoding convertfrom utf-8 $_gitworktree]
} else {
# try to set work tree from environment, core.worktree or use
# cdup to obtain a relative path to the top of the worktree. If
--
1.7.7.1.msysgit.2
I've tested the latest versions of git and TortoiseGit with Unicode support
and found a bug: diff doesn't work with files that have Russian
filenames; it says that the file in the temporary folder doesn't exist.
I tested it with simple text files and OpenOffice/LibreOffice files and
got the same error.
I think that TortoiseGit extracts the file using the wrong codepage, and
the filesystem prevents creation of a file with such a filename.
Maybe this is a bug in git (the Unicode version) rather than in
TortoiseGit; I don't know, but the error exists.
With regards, Michael!
Here I want to report a bug: it doesn't allow diffing two OpenOffice/
LibreOffice files with Russian filenames; it says that the file in the
temporary folder doesn't exist. I think TortoiseGit tries to extract the
file using the wrong codepage and the filesystem can't create the file.
With regards, Michael!
On Fri, 2 Dec 2011, karste...@dcon.de wrote:
> [01/25] MSVC: link dynamically to the CRT
This is not really Unicode, but it is uncontroversial, I think, so I
merged it into devel.
> [02/25] git-gui: fix encoding in git-gui file browser
> [03/25] gitk: fix file name encoding in diff hunk headers
> [04/25] Revert "Disable test on MinGW that challenges its bash quoting"
What about this? Why is it in the middle of the series? Can't we move it to
the end?
> ! [05/25] Win32: Thread-safe windows console output
> ! [06/25] Win32: add Unicode conversion functions
> ! [07/25] Win32: Unicode file name support (except dirent)
> ! [08/25] Win32: Unicode file name support (dirent)
> [09/25] Unicode file name support (gitk and git-gui)
> ! [10/25] Win32: Unicode arguments (outgoing)
> ! [11/25] Win32: Unicode arguments (incoming)
> ! [12/25] Win32: sync Unicode console output and file system
This patch is the one that probably got the most comments. Since it is not
obvious how it relates to the Unicode series, can we please back it out
into its own branch?
> ! [13/25] Win32: Unicode environment (outgoing)
> ! [14/25] Win32: Unicode environment (incoming)
> [15/25] MinGW: disable legacy encoding tests
> [16/25] Win32: fix environment memory leaks
> [17/25] Win32: unify environment case-sensitivity
> [18/25] Win32: simplify internal mingw_spawn* APIs
> [19/25] Win32: move environment functions
> [20/25] Win32: unify environment function names
> [21/25] Win32: move environment block creation to a helper method
> [22/25] Win32: don't copy the environment twice when spawning child
> processes
> [23/25] Win32: reduce environment array reallocations
> [24/25] Win32: keep the environment sorted
> [25/25] Win32: patch Windows environment on startup
Probably the rest is good to go, no?
To show what I mean, I rebased your -v13 branch, backed out the console
synching, moved the patch to the test to the end, and pushed the result to
rebased-unicode:
https://github.com/msysgit/git/commits/rebased-unicode
It builds fine, but please be patient with me; I really could not follow
the developments all that closely. So I might very well have missed
something really, really important.
Unfortunately, /share/msysgit/run-tests.sh stops at t7400 because it
cannot find the libiconv-2.dll (and I do not have time to debug it,
aargh!) so I cannot easily run the performance comparison that I would
like to have.
Ciao,
Dscho