hg-git chokes on filenames git can handle and hg can't

mcc

Feb 28, 2012, 8:48:20 PM
to hg-git
I don't know if the right place to send this is here or the hg-git
issues list.

There is a git repository I wish to clone. It contains, in one or more
of its recent revisions, a file named "Icon\r" (i.e. Icon followed by
a CR). This is a standard filename which is used for custom icons in
OS X; the repository contains an OS X app.

Even with 2.1, apparently, hg cannot handle filenames containing newlines.
If I attempt to clone or pull this repository-- I don't even have to
update, if I just pull in revisions-- hg immediately chokes:

$ hg pull
pulling from git+ssh://[snip]
["git-upload-pack '[snip]'"]
Counting objects [snip...]
importing git objects into hg
Icon
transaction abort!
rollback completed
abort: '\n' and '\r' disallowed in filenames: 'Icon\r'!

What do I do?
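
For anyone who wants to reproduce the situation without a Mac, here is a minimal Python sketch that creates a file literally named "Icon\r" and flags names containing CR or LF. The helper names `make_icon_file` and `has_forbidden_chars` are made up for illustration and are not part of Mercurial or hg-git; the check only mirrors the wording of the abort message above, and the file creation assumes a filesystem that permits CR in names (most Unix-like ones do).

```python
import os
import tempfile


def make_icon_file(directory):
    """Create an empty file literally named 'Icon\\r' inside directory."""
    path = os.path.join(directory, "Icon\r")
    with open(path, "w"):
        pass  # an empty file is enough; only the name matters here
    return path


def has_forbidden_chars(filename):
    """Return True if the name contains CR or LF (hypothetical helper)."""
    return "\r" in filename or "\n" in filename


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        make_icon_file(tmp)
        for name in os.listdir(tmp):
            # repr() makes the otherwise invisible trailing CR visible
            print(repr(name), has_forbidden_chars(name))
```

Printing with `repr()` matters because a terminal will usually show the name as plain "Icon", which matches the bare "Icon" line in the pull output above.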

Augie Fackler

Feb 29, 2012, 11:33:59 AM
to hg-...@googlegroups.com

The likely answer is give up. hg can't support \n in filenames (it's
actually impossible for technical reasons).


Dirkjan Ochtman

Feb 29, 2012, 11:43:01 AM
to hg-...@googlegroups.com
On Wed, Feb 29, 2012 at 17:33, Augie Fackler <li...@durin42.com> wrote:
> The likely answer is give up. hg can't support \n in filenames (it's
> actually impossible for technical reasons).

Ah, but does hg also technically have to abort on \r in filenames?

Cheers,

Dirkjan

Augie Fackler

Feb 29, 2012, 5:07:21 PM
to hg-...@googlegroups.com

I didn't get a straight answer out of anyone in IRC for that, but I've had a stupidly busy day today and haven't looked any further.


mcc

Feb 29, 2012, 6:00:13 PM
to hg-git
So, couple things--

One, from informal poking around on google it appears (?) the reason
this is the case is that newline is being used as an internal
delimiter in some hg file format. Apparently the error message + abort
when you try to add a file containing a newline in the name is a
relatively recent addition; before that, it sounds like if you
attempted to add a newline-containing file mercurial would just do
this and the internal file formats would break (?). If I'm correct
about this I personally wouldn't describe this as "impossible for
technical reasons" so much as "Mercurial is broken, they made a bad
decision in specifying a file format and this needs to be fixed". It
seems to me a cross platform file tracking program should (1) not make
assumptions about filesystem, and (2) should support things like
escaping in its file formats if the file formats have to have magic
characters..! But in either case it doesn't sound like this is
something that can be fixed on this list...
http://mercurial.selenic.com/bts/issue352
http://mercurial.selenic.com/bts/issue671
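
To make the delimiter problem concrete, here is a toy sketch of a record format that uses newline as its separator. This is NOT Mercurial's actual file format or code; `serialize` and `parse` are hypothetical names, and the layout (name, NUL byte, value, newline) is only an assumption chosen to show how an unescaped newline in a name corrupts the round trip.

```python
def serialize(entries):
    """Write one 'name<NUL>value' record per line, with no escaping."""
    return "".join(f"{name}\x00{value}\n" for name, value in entries)


def parse(data):
    """Split on newlines, then on the first NUL in each line."""
    records = []
    for line in data.split("\n"):
        if not line:
            continue  # skip the empty piece after the final newline
        name, _, value = line.partition("\x00")
        records.append((name, value))
    return records


if __name__ == "__main__":
    # A normal name survives the round trip.
    print(parse(serialize([("Icon", "abc123")])))
    # A name containing LF is split into two bogus records.
    print(parse(serialize([("Icon\nextra", "abc123")])))
    # A name containing only CR survives in this particular toy format.
    print(parse(serialize([("Icon\r", "abc123")])))
```

In this toy format the LF case comes back as two records instead of one, while the CR case round-trips unchanged. Whether the same holds for every place Mercurial stores filenames is exactly the open question raised earlier in the thread, so treat the CR result as a property of the sketch only.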

Two, it seems to me that *if* there's a filename that git can handle
and hg can't, then hg-git can and should (?) be doing something like
converting to an hg-acceptable equivalent. There seem to be several
places in hg-git where hg-git is doing something like this already. It
seems like the Icon\r file could be getting either its name mangled to
something that hg-git knows how to convert back to Icon\r, or could be
omitted from the hg side of the repository entirely (as long as this
could be done without screwing up rev hash calculations). Or maybe
hg-git could keep a list of special-case "X file on git side is named Y
on hg side" exceptions. Do any of these solutions sound plausible or
better/worse than any other?

If the answer is "give up" then that basically is saying, assume you
cannot use hg-git with source repositories containing Macintosh
projects :/ because a git repository of a mac program could sensibly
contain an Icon\r at any time. Remember the failure mode here is that
if *any* file containing a newline exists in *any* revision, hg-git
rejects the entire pull/clone and commits nothing to disk... I feel
like there must be some way to get this to work.
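
As a rough illustration of the name-mangling option floated above, here is a sketch of a reversible escaping scheme in Python. The functions `escape_name` and `unescape_name` are hypothetical and do not exist in hg-git; percent-style escaping is just one possible choice. The scheme escapes the escape character itself so that the mapping can be inverted unambiguously.

```python
def escape_name(name):
    """Replace CR, LF and '%' with %XX so the result has no CR or LF."""
    out = []
    for ch in name:
        if ch in "%\r\n":
            out.append("%%%02X" % ord(ch))
        else:
            out.append(ch)
    return "".join(out)


def unescape_name(name):
    """Invert escape_name; raises ValueError on a malformed escape."""
    out = []
    i = 0
    while i < len(name):
        if name[i] == "%":
            out.append(chr(int(name[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(name[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    mangled = escape_name("Icon\r")
    print(repr(mangled))
    print(repr(unescape_name(mangled)))
```

Note the cost of this design: a file that already has "%" in its name also gets renamed on the hg side, and any tool that expects the original name in a working copy will not find it.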


Augie Fackler

Mar 1, 2012, 9:55:58 AM
to hg-...@googlegroups.com

On Feb 29, 2012, at 5:00 PM, mcc wrote:

> So, couple things--
>
> One, from informal poking around on google it appears (?) the reason
> this is the case is that newline is being used as an internal
> delimiter in some hg file format. Apparently the error message + abort
> when you try to add a file containing a newline in the name is a
> relatively recent addition; before that, it sounds like if you
> attempted to add a newline-containing file mercurial would just do
> this and the internal file formats would break (?). If I'm correct
> about this I personally wouldn't describe this as "impossible for
> technical reasons" so much as "Mercurial is broken, they made a bad
> decision in specifying a file format and this needs to be fixed".

It's eminently reasonable. No reasonable user puts newlines in filenames. Most GUIs don't allow it, and it's tricky even with command line tools. Some OSes forbid it, and many many software packages can't handle it either. In fact, in a quick check, I'm not figuring out the right incantation to make one at a shell prompt.

> It seems to me a cross platform file tracking program should (1) not make
> assumptions about filesystem, and (2) should support things like
> escaping in its file formats if the file formats have to have magic
> characters..! But in either case it doesn't sound like this is
> something that can be fixed on this list…

You can't change the newline limitation in Mercurial. It's just not possible. We will /never/ break backwards compatibility, so it's going to have to be a new system.

> http://mercurial.selenic.com/bts/issue352
> http://mercurial.selenic.com/bts/issue671
>
> Two, it seems to me that *if* there's a filename that git can handle
> and hg can't, then hg-git can and should (?) be doing something like
> converting to an hg-acceptable equivalent.

Can? Yes, it's technically possible to do that. Should? I think you're wrong. The reason is that if we rename the files, we (potentially) change their meaning, and probably break build scripts or similar. If we rename Icon\r to Icon\\r or something, we break the file's only reason for existing.

> There seem to be several
> places in hg-git where hg-git is doing something like this already.

Not on file content. File content is sacred.

> It seems like the Icon\r file could be getting either its name mangled to
> something that hg-git knows how to convert back to Icon\r, or could be
> omitted from the hg side of the repository entirely (as long as this
> could be done without screwing up rev hash calculations). Or maybe
> hg-git could keep a list of special-case "X file on git side is named Y
> on hg side" exceptions. Do any of these solutions sound plausible or
> better/worse than any other?

I'd be amenable to an off by default flag that would let you elide files with broken names.

> If the answer is "give up" then that basically is saying, assume you
> cannot use hg-git with source repositories containing Macintosh
> projects :/

Slow down there. I'm a Mac user, since the early 90s. There's NO reason anyone should be checking in Icon\r files. It's just silly and non-portable.

> because a git repository of a mac program could sensibly
> contain an Icon\r at any time. Remember the failure mode here is that
> if *any* file containing a newline exists in *any* revision, hg-git
> rejects the entire pull/clone and commits nothing to disk... I feel
> like there must be some way to get this to work.

See the patch I'm amenable to above. I don't personally have time for it, but you're welcome to hack on it.

For what it's worth, I intend to mail a patch to mercurial-devel unblocking \r in filenames. I don't know if it stands a chance. mom didn't dismiss the idea entirely, but others may have good technical reasons to forbid it (I'm actually of the opinion that we should block this from the command line, and only allow users of internal APIs to create such bogons). If this happens, it'll be in Mercurial 2.2 at the soonest.
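
The off-by-default flag discussed above could look roughly like the following Python sketch. This is not hg-git code: `split_importable` and the `skip_bad_names` parameter are hypothetical names, and the real import path in hg-git would need its own handling. The sketch only shows the intended behaviour, which is to abort by default and to drop offending paths only when explicitly asked.

```python
def split_importable(paths, skip_bad_names=False):
    """Partition paths into (kept, skipped) by CR/LF in the name.

    With skip_bad_names=False (the default) the first bad name raises
    ValueError, which mirrors the current abort-everything behaviour.
    """
    kept, skipped = [], []
    for path in paths:
        if "\r" in path or "\n" in path:
            if not skip_bad_names:
                raise ValueError(
                    "'\\n' and '\\r' disallowed in filenames: %r" % path
                )
            skipped.append(path)
        else:
            kept.append(path)
    return kept, skipped


if __name__ == "__main__":
    tree = ["README", "App/Icon\r", "src/main.c"]
    kept, skipped = split_importable(tree, skip_bad_names=True)
    print(kept)
    print([repr(p) for p in skipped])
```

Returning the skipped names separately lets the caller warn about what was left out. It also makes clear that the hg side would no longer be an exact mirror of the git tree, which is the hash-calculation concern raised earlier in the thread.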

Augie Fackler

Mar 1, 2012, 10:11:55 AM
to hg-...@googlegroups.com

Obviously I mean mpm here. Stupid fingers.

mcc

Mar 1, 2012, 3:05:59 PM
to hg-git
On Mar 1, 6:55 am, Augie Fackler <duri...@gmail.com> wrote:
> It's eminently reasonable. No reasonable user puts newlines in filenames. Most GUIs don't allow it, and it's tricky even with command line tools. Some OSes forbid it, and many many software packages can't handle it either. In fact, in a quick check, I'm not figuring out the right incantation to make one at a shell prompt... Slow down there. I'm a Mac user, since the early 90s. There's NO reason anyone should be checking in Icon\r files. It's just silly and non-portable.

So, although I'm actually not going to disagree with this, the files
are still *there* and still get created by the OS as of Lion (try it--
give a folder a custom icon, you get a garbage Icon? file in
Terminal), and there are thus various paths by which one could get
checked into a git repository by *accident*. It's also the case that,
even if this is poor practice, there is nothing to stop the git user
from making a poor decision in this case. If either of these two
scenarios occurs, it is then the hg-git user who gets punished for the
git user's mistake and/or dubious decision...

(In fact in the case of the repository I'm trying to check out,
shortly after the Icon\r file was checked in it was found to be
breaking shell scripts and was removed! However like I said just the
file being present in the history makes the whole repository
radioactive from hg-git's perspective...)

> I'd be amenable to an off by default flag that would let you elide files with broken names.
>

OK, I'll try to take a crack at implementing this. Thanks.

> For what it's worth, I intend to mail a patch to mercurial-devel unblocking \r in filenames. I don't know if it stands a chance. mom didn't dismiss the idea entirely, but others may have good technical reasons to forbid it (I'm actually of the opinion that we should block this from the command line, and only allow users of internal APIs to create such bogons). If this happens, it'll be in Mercurial 2.2 at the soonest.

That's very useful, thank you. I guess the backward compatibility
requirement really does take priority here, but it does seem to me
that, if at all possible, the basic engine / file formats should be
able to handle these files for pathological cases (like compatibility
with a foreign SCM).

Blocking access to \r\n support through the normal command line does
sound totally reasonable, for all the reasons you've given. I notice
that checking on google for git and newlines in filenames-- seemingly
every single reference to this online refers to people who checked in
Icon\r files by accident, and then couldn't figure out how to remove
them and/or couldn't figure out how to add Icon\r to their .gitignore!
So there seems to be evidence most/all users would actually prefer to
be *protected* from getting those Icon\r files in their repository...

Lester Caine

Mar 1, 2012, 3:46:30 PM
to hg-...@googlegroups.com
mcc wrote:
> Blocking access to \r\n support through the normal command line does
> sound totally reasonable, for all the reasons you've given. I notice
> that checking on google for git and newlines in filenames-- seemingly
> every single reference to this online refers to people who checked in
> Icon\r files by accident, and then couldn't figure out how to remove
> them and/or couldn't figure out how to add Icon\r to their .gitignore!
> So there seems to be evidence most/all users would actually prefer to
> be *protected* from getting those Icon\r files in their repository...

This is a perfect example of a bug in git ... THAT is where it should be fixed
rather than getting other packages to sort out a mess that should never have
been allowed in the first place. I consider this like the problem of committing
'file' and 'File' and then wondering why windows can't clone it. The source of
the problem needs fixing rather than wasting time providing bodges to cope with a
condition that DOES need to be fixed at source anyway?

--
Lester Caine - G8HFL
-----------------------------
Contact - http://lsces.co.uk/wiki/?page=contact
L.S.Caine Electronic Services - http://lsces.co.uk
EnquirySolve - http://enquirysolve.com/
Model Engineers Digital Workshop - http://medw.co.uk//
Firebird - http://www.firebirdsql.org/index.php

mcc

Mar 1, 2012, 3:54:27 PM
to hg-git
I would tend to guess the git maintainers view supporting \n as a
feature rather than a bug.

(For what it's worth, svn does *not* support this.)