'.git file' alternative, native (cross-platform) workdir support.

Marius Storm-Olsen

Feb 29, 2008, 7:27:53 AM
to Git Mailing List, msysGit
Hi guys,

I just caught a glimpse of the '.git file' efforts, as a file for
redirection to a real repository.

As far as I can tell, the reason for adding the support is to in the
end provide a cross-platform way of supporting workdirs. (If this is
not the [main] point, please point me to the thread describing the
real reason, I couldn't find it.)

However, wouldn't simply redirecting everything into a real repo then
create problems with shared index file and more? A problem which could
be tackled by file suffixes or other methods, I'm sure, but which would
require even more patches to achieve the goal.


I was actually thinking about the whole workdir thing the other day,
since I mainly work on Windows, and am constantly green with envy of the
'mainly Linux' guys. I figured, why not just add support for file
redirection in
    char *git_path(const char *fmt, ...)
which we use all over the place? That surely is easy and
cross-platform :-)
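
To make the idea concrete, here is a minimal sketch of redirection inside a git_path()-like helper. It is not the attached patch: the function name redirected_path(), its two directory parameters and the list of shared entries are assumptions made for illustration (the list is loosely modelled on what a workdir typically shares with its repository).

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical list of entries that live in the real repository. */
static const char *shared_entries[] = {
    "objects", "refs", "packed-refs", "config", "info", "hooks"
};

/* Returns 1 if rel is exactly name, or lies beneath name/. */
static int is_under(const char *rel, const char *name)
{
    size_t n = strlen(name);
    return strncmp(rel, name, n) == 0 && (rel[n] == '\0' || rel[n] == '/');
}

/*
 * Format a path relative to a git directory. Shared entries are
 * redirected into real_git_dir; everything else (HEAD, index, ...)
 * stays in local_git_dir. Returns a pointer to a static buffer, so
 * the result is overwritten by the next call.
 */
const char *redirected_path(const char *local_git_dir,
                            const char *real_git_dir,
                            const char *fmt, ...)
{
    static char rel[1024], out[2048];
    const char *base = local_git_dir;
    va_list ap;
    size_t i;

    assert(local_git_dir && real_git_dir && fmt);

    va_start(ap, fmt);
    vsnprintf(rel, sizeof(rel), fmt, ap);
    va_end(ap);

    for (i = 0; i < sizeof(shared_entries) / sizeof(shared_entries[0]); i++) {
        if (is_under(rel, shared_entries[i])) {
            base = real_git_dir;
            break;
        }
    }

    snprintf(out, sizeof(out), "%s/%s", base, rel);
    return out;
}
```

With these assumptions, redirected_path("/w/.git", "/r/.git", "refs/heads/%s", "master") resolves inside /r/.git, while "index" and "HEAD" stay under /w/.git, so each workdir keeps its own checkout state.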


Attached you'll find a patch which will achieve 'native workdir'
support on all platforms, independent of underlying file system. And
if you ignore the code added to builtin-init-db.c &
builtin-rev-parse.c for minimal usage support, it's a mere 104 lines
touched. The patch should apply cleanly on both git's next branch, and
Hannes' j6t master branch (the mingw port).

Please note that the patch is not meant for end-user consumption, nor
does it follow Git coding standards. It's just meant as a proof of
concept, and as a means of discussion.
Also note that this way of supporting workdirs suffers from the same
'flaws' the current git-new-workdir has (locking of repo etc).

If people want, I can work with it, and implement this properly (with
its own builtin-workdir, test cases etc). As I see it, the '.git file'
concept still has some way to go. Maybe the '.git file' concept is the
best way in the end, but that this way is an 'ok' temporary way of
doing it?

^shrug^ - Input please!

To try out the patch, apply it on git's 'next' branch, then in some
random directory:
    git init --workdir-for=<abs-path to repo>
    git reset --hard
    .. hack away as normal ..
(I know, I know, it's just a quick hack to let ppl play with it.)

PS. Sorry for the patch being an attachment, I'm a Thunderbird slave.
(Someone really needs to make a Thunderbird addon for sending Git
patches ;-)

--
.marius

0001-Add-cross-platform-workdir-support.patch

Johannes Schindelin

Feb 29, 2008, 7:54:11 AM
to Marius Storm-Olsen, Git Mailing List, msysGit
Hi,

On Fri, 29 Feb 2008, Marius Storm-Olsen wrote:

> I just caught a glimpse of the '.git file' efforts, as a file for
> redirection to a real repository.
>
> As far as I can tell, the reason for adding the support is to in the end
> provide a cross-platform way of supporting workdirs. (If this is not the
> [main] point, please point me to the thread describing the real reason,
> I couldn't find it.)

This is the main reason, yes.

However, you can also use the .git file to separate the working directory
from the repository, say, on two different drives, when you do not have
symbolic links.

> However, wouldn't simply redirecting everything into a real repo then
> create problems with shared index file and more? A problem which could
> be tackled by file suffixes or other methods, I'm sure, but which would
> require even more patches to achieve the goal.

Not only would it require these patches, but it would actually make a
_safe_ multiple-workdirs feature possible.

ATM the problem is that you can change a ref that is checked out
elsewhere, and if you are not a Git expert, it will just make your life
miserable.

However, if we do not pretend to have different repositories, but actually
use the _identical_ repository for multiple working directories, we can
make the mechanisms safe!

This is basically the reason why I do not like the current new-workdir
script (and the patch in my private tree where I taught git-branch about
it).

So while your approach may seem easier in the short run, there is no way
you can make it safe. No way, except going the full nine yards, and
actually using the same repository, which means that you have to have the
"other patches", too.

Ciao,
Dscho

Marius Storm-Olsen

Feb 29, 2008, 8:24:22 AM
to Johannes Schindelin, Git Mailing List, msysGit
Johannes Schindelin said the following on 29.02.2008 13:54:

Sure, I'm aware of that. The initial goal was to make something which
works as the current contrib/workdir/git-new-workdir, just
cross-platform. Then we can take it from there, step by step, until we
have something which works safely; instead of taking a single big leap.

I'm actually not sure that it's impossible to make it safe.
My implementation works by redirecting files into the real repo.
However, we can also detect when redirection is in effect, and do
extra 'maintenance' things then, to avoid the bad effects.

For example, when setting up a workdir, we could duplicate
    <real repo>/.git/refs/*
into
    <real repo>/.git/refs/workdir-<sha1>/*
(<sha1> being the sha1 of the abs path to the workdir)
and have the redirection mechanism redirect all git_path("refs/*") to
the duplicated locations. That way, when pulling in the workdir, it
wouldn't create havoc with the real repo's refs. Then in the real
repo, you can easily refer to the refs from the workdir too, when
you need to.
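
A rough sketch of that ref namespacing, purely for illustration: workdir_ref_path() is a hypothetical helper, and workdir_id stands in for the sha1 of the workdir's absolute path (computing the hash is left out here).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Rewrite "refs/<rest>" to "refs/workdir-<id>/<rest>".
 * Paths outside refs/, and paths that already name a workdir
 * namespace, are copied unchanged.
 * Returns 0 on success, -1 if the result does not fit into out.
 */
int workdir_ref_path(const char *rel, const char *workdir_id,
                     char *out, size_t outlen)
{
    int n;

    assert(rel && workdir_id && out);

    if (strncmp(rel, "refs/", 5) == 0 &&
        strncmp(rel, "refs/workdir-", 13) != 0)
        n = snprintf(out, outlen, "refs/workdir-%s/%s", workdir_id, rel + 5);
    else
        n = snprintf(out, outlen, "%s", rel);

    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

Leaving "refs/workdir-..." paths untouched is one possible way to let the real repo, or another workdir, name a specific workdir's refs explicitly; whether that is desirable is an open design question.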

There are several possibilities here. Since file redirection works
from the beginning, we have a place to start, which can slowly migrate
into whatever. When you think about it, my approach is kinda similar
to the '.git file' approach, just that I don't redirect everything
from the start, just parts to make it work as today on Linux. In the
end my technique could also redirect everything into the real repo,
giving you the same effect as the '.git file'.

--
.marius


Jakub Narebski

Feb 29, 2008, 9:14:36 AM
to Marius Storm-Olsen, Johannes Schindelin, Git Mailing List, msysGit
Marius Storm-Olsen <mar...@trolltech.com> writes:

> Johannes Schindelin said the following on 29.02.2008 13:54:
>> On Fri, 29 Feb 2008, Marius Storm-Olsen wrote:
>>>
>>> However, wouldn't simply redirecting everything into a real repo
>>> then create problems with shared index file and more? A problem
>>> which could be tackled by file suffixes or other methods, I'm
>>> sure, but which would require even more patches to achieve the
>>> goal.
>>
>> Not only would it require these patches, but it would actually make
>> a _safe_ multiple-workdirs feature possible.
>> ATM the problem is that you can change a ref that is checked out
>> elsewhere, and if you are not a Git expert, it will just make your
>> life miserable.
>> However, if we do not pretend to have different repositories, but
>> actually use the _identical_ repository for multiple working
>> directories, we can make the mechanisms safe!
>> This is basically the reason why I do not like the current
>> new-workdir script (and the patch in my private tree where I taught
>> git-branch about it).
>

> Sure, I'm aware of that. The initial goal was to make something which
> works as the current contrib/workdir/git-new-workdir, just
> cross-platform. Then we can take it from there, step by step, until we
> have something which works safely; instead of taking a single big leap.
>
> I'm actually not sure that it's impossible to make it safe.
> My implementation works by redirecting files into the real
> repo. However, we can also detect when redirection is in effect, and
> do extra 'maintenance' things then, to avoid the bad effects.
>
> For example, when setting up a workdir, we could duplicate
> <real repo>/.git/refs/*
> into <real repo>/.git/refs/workdir-<sha1>/*
> (<sha1> being the sha1 of the abs path to the workdir)
> and have the redirection mechanism redirect all git_path("refs/*") to
> the duplicated locations. That way, when pulling in the workdir, it
> wouldn't create havoc with the real repo's refs. Then in the real
> repo, you can easily refer to the refs from the workdir too, when
> you need to.

I have had yet another idea, namely a shadow / unionfs-like approach, which I
have abandoned due to perceived difficulties in implementing it. But
perhaps this would be the best solution for the multiple working
directories problem.

The idea is to add core.gitdir variable to the config, which would
point to "master" (main) GIT_DIR.

When requesting any file from the repository, be it .git/HEAD,
.git/index, .git/refs/heads/master or any other file, git would first
try to find it in the current GIT_DIR, as the existing heuristics find it,
and if it is not found, try again with GIT_DIR set to core.gitdir. When
trying to create any file, git would check whether the directories leading
to the file exist in GIT_DIR, and if not, create it under GIT_DIR set to
core.gitdir.
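
The lookup rule just described could be sketched roughly as follows. Everything here is hypothetical: shadow_find() and shadow_create_target() are illustrative names, master_dir plays the role of the directory named by the proposed core.gitdir, and access() is POSIX, so a Windows build would need an equivalent existence check.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Join dir and the first rel_len bytes of rel; 0 on success, -1 on truncation. */
static int join(char *out, size_t outlen, const char *dir,
                const char *rel, size_t rel_len)
{
    int n = snprintf(out, outlen, "%s/%.*s", dir, (int)rel_len, rel);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Reading: prefer the local git dir, fall back to the master one.
 * Returns out, or NULL if the file exists in neither place. */
const char *shadow_find(const char *local_dir, const char *master_dir,
                        const char *rel, char *out, size_t outlen)
{
    assert(local_dir && master_dir && rel && out);

    if (join(out, outlen, local_dir, rel, strlen(rel)) == 0 &&
        access(out, F_OK) == 0)
        return out;
    if (join(out, outlen, master_dir, rel, strlen(rel)) == 0 &&
        access(out, F_OK) == 0)
        return out;
    return NULL;
}

/* Creating: use the local git dir if the parent directory of rel
 * already exists there, otherwise create under the master git dir. */
const char *shadow_create_target(const char *local_dir, const char *master_dir,
                                 const char *rel, char *out, size_t outlen)
{
    const char *slash, *base;
    size_t parent_len;

    assert(local_dir && master_dir && rel && out);

    slash = strrchr(rel, '/');
    parent_len = slash ? (size_t)(slash - rel) : 0;
    base = (join(out, outlen, local_dir, rel, parent_len) == 0 &&
            access(out, F_OK) == 0) ? local_dir : master_dir;

    return join(out, outlen, base, rel, strlen(rel)) == 0 ? out : NULL;
}
```

Under this sketch, a workdir that has its own HEAD and index but no refs/ directory would read and create refs in the master repository, which is the unionfs-like behaviour described above.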

The only exception would be the config file, where the current GIT_DIR
config would be a fourth layer, on top of the system-wide config, the
global (per-user) config and the core.gitdir config.
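
As an illustration of that layering (a sketch only, with made-up types and names, not Git's config machinery): layers are ordered from lowest to highest priority, and the highest layer that defines a key wins.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct config_entry {
    const char *key;
    const char *value;
};

struct config_layer {
    const char *name;                   /* e.g. "system", "global" */
    const struct config_entry *entries;
    size_t count;
};

/*
 * layers[] is ordered lowest priority first. Returns the value from
 * the highest-priority layer defining key (the last matching entry
 * within that layer), or NULL if no layer defines it.
 * Simplification: keys are compared case-sensitively.
 */
const char *layered_config_get(const struct config_layer *layers,
                               size_t nlayers, const char *key)
{
    size_t i, j;

    assert(layers && key);

    for (i = nlayers; i-- > 0;)
        for (j = layers[i].count; j-- > 0;)
            if (strcmp(layers[i].entries[j].key, key) == 0)
                return layers[i].entries[j].value;
    return NULL;
}
```

With four layers in the order system, global, core.gitdir config, current GIT_DIR config, a key set in the workdir's own config overrides the master repository's value, and keys the workdir does not set fall through to the lower layers.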

What do you think of this idea?

--
Jakub Narebski
Poland
ShadeHawk on #git

Johannes Schindelin

Feb 29, 2008, 9:25:16 AM
to Marius Storm-Olsen, Git Mailing List, msysGit
Hi,

On Fri, 29 Feb 2008, Marius Storm-Olsen wrote:

> Johannes Schindelin said the following on 29.02.2008 13:54:
> > On Fri, 29 Feb 2008, Marius Storm-Olsen wrote:
> > > However, wouldn't simply redirecting everything into a real repo
> > > then create problems with shared index file and more? A problem
> > > which could be tackled by file suffixes or other methods, I'm sure,
> > > but which would require even more patches to achieve the goal.
> >
> > Not only would it require these patches, but it would actually make a
> > _safe_ multiple-workdirs feature possible.
>

> Sure, I'm aware of that. The initial goal was to make something which
> works as the current contrib/workdir/git-new-workdir, just
> cross-platform. Then we can take it from there, step by step, until we
> have something which works safely; instead of taking a single big leap.

That's what I am saying: there is no way to make it safe.

> I'm actually not sure that it's impossible to make it safe. My
> implementation works by redirecting files into the real repo. However,
> we can also detect when redirection is in effect, and do extra
> 'maintenance' things then, to avoid the bad effects.

From the perspective of Windows, I guess it is easy to overlook the fact
that permissions can break your idea.

Even after creating a second working tree for an existing repository, the
permissions of the original repository can change.

The only way to be on the safe side is to use _the repository_ twice. IOW
not having a second .git/ directory.

Also, having a single .git is just a very simple, and thus preferable,
concept compared to having part of this, and part of that repository.

Ciao,
Dscho

Marius Storm-Olsen

Feb 29, 2008, 9:51:52 AM
to Johannes Schindelin, Git Mailing List, msysGit
Johannes Schindelin said the following on 29.02.2008 15:25:

> On Fri, 29 Feb 2008, Marius Storm-Olsen wrote:
>> I'm actually not sure that it's impossible to make it safe. My
>> implementation works by redirecting files into the real repo.
>> However, we can also detect when redirection is in effect, and do
>> extra 'maintenance' things then, to avoid the bad effects.
>
> From the perspective of Windows, I guess it is easy to overlook the
> fact that permissions can break your idea.
>
> Even after creating a second working tree for an existing
> repository, the permissions of the original repository can change.

Sure, but that would break _any_ working tree implementation. Without
access to the original data, the whole thing is bust, no matter if you
redirect all or part of .git/.

Checking if we have access to the redirected .git is trivial in both
cases. (partial or whole redirection)

> The only way to be on the safe side is to use _the repository_
> twice. IOW not having a second .git/ directory.
>
> Also, having a single .git is just a very simple, and thus
> preferable concept, to having part of this, and part of that
> repository.

I wholeheartedly agree. I'm not proposing to keep it split in the
long run. I'm just proposing something that 'works' *now*, and can be
improved incrementally; as opposed to something that doesn't work now,
and needs to be fully implemented before it works for the Windows crowd.

PS. The redirection method I propose already alleviates an issue
the current git-new-workdir has, which Shawn has experienced many
a time: the deletion of .git/config and .git/packed-refs, making
'git-config' and 'git tag -d' unsafe in a workdir. (Though I'm unsure
if that has been fixed already. In any case, since the files are
really redirected, there's no chance that deleting a file will remove a
symlink, only to have it recreated as a normal file instead.)

--
.marius


Junio C Hamano

Feb 29, 2008, 3:02:42 PM
to Johannes Schindelin, Marius Storm-Olsen, Git Mailing List, msysGit
Johannes Schindelin <Johannes....@gmx.de> writes:

> On Fri, 29 Feb 2008, Marius Storm-Olsen wrote:
>
>> I just caught a glimpse of the '.git file' efforts, as a file for
>> redirection to a real repository.
>>
>> As far as I can tell, the reason for adding the support is to in the end
>> provide a cross-platform way of supporting workdirs. (If this is not the
>> [main] point, please point me to the thread describing the real reason,
>> I couldn't find it.)
>
> This is the main reason, yes.

I do not think so. For repository and work tree separation, we already
have core.worktree. Multiple work trees attached to a single repository
is what contrib/workdir/ does, and it could probably be extended, but that
one needs more than "redirect .git elsewhere".

The primary reason we may want to do the ".git file" thing is to sanely
support switching between branches (or checking out a different revision,
which amounts to the same thing) when one has a submodule and the other
one either does not have that submodule anywhere or has it in a different
location in its tree.

Suppose the HEAD one binds a submodule "gitk" at gitk-git. Then suppose
we want to switch to an old branch that did not have that submodule bound
yet. Or the branch we are switching to has the submodule at modules/gitk.
What happens?

Currently, when we are on HEAD, we create a directory at gitk-git and make
gitk-git/.git directory its controlling repository (i.e. GIT_DIR).
Switching to a branch that did not have the submodule bound will need to
rmdir gitk-git (this needs to happen no matter what) but as a side effect
we will lose gitk-git/.git repository. Switching back to where we were
would require reloading that repository from somewhere else, but if you
are "the upstream", that somewhere else may not even exist.

One way to solve this would be to add .git/submodules/paulus.git
repository inside the controlling repository of the toplevel project, and
point at it with the ".git file" installed at gitk-git/.git, when we are on
HEAD. We can lose gitk-git directory and everything below it when
switching away from the revision, but when we come back, we can recreate
gitk-git directory, point gitk-git/.git back to .git/submodules/paulus.git
kept in the toplevel repository, and check the appropriate commit out there.

After switching to an old revision that did not have the submodule,
further switching to a branch that has the submodule at modules/gitk would
be the same deal. Instead of creating gitk-git directory and installing
the ".git file" there (which is what we did when we came back to the
original HEAD), create modules/gitk and install the ".git file" there, to
point at the same .git/submodules/paulus.git/.
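
For illustration, here is a minimal sketch of writing and reading such a ".git file". It assumes the one-line "gitdir: <path>" format that Git's gitfile mechanism uses; the thread itself does not spell the format out, and the helper names and the /proj/... path in the usage note are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Produce the one-line content of a ".git file" pointing at target.
 * Returns 0 on success, -1 if it does not fit into out. */
int format_gitfile(const char *target, char *out, size_t outlen)
{
    int n;

    assert(target && out);
    n = snprintf(out, outlen, "gitdir: %s\n", target);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Extract the repository path from ".git file" content.
 * Returns 0 on success, -1 if the content is not in the expected
 * format or the path does not fit into out. */
int parse_gitfile(const char *content, char *out, size_t outlen)
{
    static const char prefix[] = "gitdir: ";
    size_t plen = sizeof(prefix) - 1;
    size_t len;

    assert(content && out);

    if (strncmp(content, prefix, plen) != 0)
        return -1;
    content += plen;
    len = strcspn(content, "\r\n");     /* stop at the end of the line */
    if (len == 0 || len >= outlen)
        return -1;
    memcpy(out, content, len);
    out[len] = '\0';
    return 0;
}
```

In the scenario above, both gitk-git/.git and modules/gitk/.git would contain the same line, for example "gitdir: /proj/.git/submodules/paulus.git", so the submodule's repository survives whichever working directory happens to be checked out.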

We should be able to do this today without ".git file" using symlinks.
It's just a Porcelain hackery, so I'll leave it to interested parties as
an exercise.

Marius Storm-Olsen

Feb 29, 2008, 4:32:30 PM
to Junio C Hamano, Johannes Schindelin, Git Mailing List, msysGit
Junio C Hamano wrote:
> The primary reason we may want to do the ".git file" thing is to
> sanely support switching between branches (or checking out a
> different revision, which amounts to the same thing) when one has a
> submodule and the other one either does not have that submodule
> anywhere or have it in a different location in its tree.
...

> One way to solve this would be to add .git/submodules/paulus.git
> repository inside the controlling repository of the toplevel project,
> and point at it with the ".git file" installed at gitk-git/.git, when
> we are on HEAD. We can lose gitk-git directory and everything below
> it when switching away from the revision, but when we come back, we
> can recreate gitk-git directory, point gitk-git/.git back to
> .git/submodules/paulus.git kept in the toplevel repository, and check
> the appropriate commit out there.
>
> After switching to an old revision that did not have the submodule,
> further switching to a branch that has the submodule at modules/gitk
> would be the same deal. Instead of creating gitk-git directory and
> installing the ".git file" there (which is what we did when we came
> back to the original HEAD), create modules/gitk and install the ".git
> file" there, to point at the same .git/submodules/paulus.git/.

Ahh, ok, this makes sense. *Then* you need to point all of .git to a
specific location. But, in this case you're not interested in keeping
n-different states, as we are in a multiple workdir situation. So, it's
a much easier case.

The question is, is this also the appropriate basis for solving the
multiple workdir case? If so, we need to come up with a scheme that lets
us keep n number of states inside one single .git structure. Is this
reasonable? It's not like it's too hard, just a bit messy.

The reason I ask is to evaluate if I should clean up the patch I did for
the native workdir support that started out this thread, or just let it
die in favor of the '.git file' solution, which still would need a lot of
work before it's finished. (Though, as I said before, my redirection way
could certainly migrate into a '.git file' solution over time.)

I basically just need a certain amount of knowledgeable people to either
say 'drop it' or 'roll with it'.. I'm ambivalent, though I want workdirs
on Windows yesterday. :-)
So, I guess I have Dscho's -1, and my own +1 = 0 :-p

> We should be able to do this today without ".git file" using
> symlinks. It's just a Porcelain hackery, so I'll leave it to
> interested parties as an exercise.

Symlinks wouldn't really work, unless you force people to always keep
their full .git next to the workdir. So, multiple workdirs with
submodules would then fail, as would FS without symlink support (duh).
This seems to be the issue with the current '.git file' implementation
as well. If the repo is not located next to the workdir, it will fail.
Or am I reading the code incorrectly? And what happens with recursive
submodules?

--
.marius

