On Wed, May 20, 2009 at 10:49 PM, Shawn Pearce <s...@google.com> wrote:
>
> During `repo sync`, repo will only update to the SHA-1 listed in the manifest's gitlink file entry. If the branch named by the revision property is 5 commits ahead, `repo sync` will ignore the branch and will stick to what is listed in the gitlink; that entry displayed by `git ls-files --stage`.
I assume that the developer's 'repo sync' will just continue to work
as before, and only the admin at the central repository would have to
update the project SHA-1 within the manifest (either with Gerrit, or
some script) correct?
That brings me to the next point.
>
> Someone, somewhere, must be responsible for updating these project SHA-1s within the manifest, otherwise `repo sync` will never see new changes. Enter Gerrit Code Review.
Doesn't this add a rather tight dependency on Gerrit? I.e., if one
does not use Gerrit, there is a really painful maintenance task on the
admin side. Or so, it seems from your description. Is this the
proposal? Is it not possible to retain the loose coupling and provide
some out-of-the-box solution, for admins with non-gerrit setup too?
-Jey
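For readers less familiar with gitlinks, here is a rough sketch of what "that entry displayed by `git ls-files --stage`" refers to. A gitlink entry has mode 160000 and records the SHA-1 that the superproject has pinned for that path. The sample output, the paths, and the helper name `parse_gitlinks` are made up for illustration; this is not how repo itself is implemented.

```python
# Sample `git ls-files --stage` output: "<mode> <sha1> <stage>\t<path>".
# The SHA-1 values and paths below are placeholders, not real commits.
FAKE_DALVIK = "a" * 40
FAKE_BUILD = "b" * 40
SAMPLE = "\n".join([
    "100644 {} 0\t.gitmodules".format("c" * 40),
    "160000 {} 0\tdalvik".format(FAKE_DALVIK),
    "160000 {} 0\tbuild".format(FAKE_BUILD),
])

GITLINK_MODE = "160000"  # mode git uses for a submodule (gitlink) entry


def parse_gitlinks(ls_files_output):
    """Return {path: pinned SHA-1} for every gitlink entry in the output."""
    pinned = {}
    for line in ls_files_output.splitlines():
        if not line.strip():
            continue
        meta, path = line.split("\t", 1)
        mode, sha, _stage = meta.split()
        if mode == GITLINK_MODE:
            pinned[path] = sha
    return pinned


if __name__ == "__main__":
    for path, sha in sorted(parse_gitlinks(SAMPLE).items()):
        print(path, sha)
```

Under the proposal as described, these pinned values (and not the branch tips) are what a sync would check out.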
-the new scheme also needs to work with back-end pushes to
refs/heads/cupcake, not just submits done as a result of pushes to
refs/for/cupcake (e.g. Google's auto-importers and auto-mergers work
like that, and it's also used for some maintenance tasks).
-let's keep in mind the use case where someone uses a private variant
of some projects and a shared variant of most others (a use case that
we've seen both inside and outside Google). It's quite painful today,
and I'd hate to see it become harder.
-Gerrit will need some logic to know which projects to include in the
manifest (it can't just rely on existing branches), i.e. it'll need a
manifest template from which to build the manifest.
-having a format that allows diffing manifests that contain explicit
revisions would be a nice added bonus - the current format makes this
hard as every line ends up being different.
I'm not sure if I fully understand the proposal, so forgive me if this
seems uninformed.
> After some discussion with my coworkers I've come to realize that the
> current "repo sync" behavior of jumping to the latest version in each
> project is insane.
Why is that? If it is because of possible instability, couldn't you
just specify SHA1 tags in your manifest and update them when needed?
It seems like the proposed change is to always specify the SHA1 tag in
the manifest, and have gerrit automatically update the manifest when
code is merged.
Won't that result in the same behavior of always
jumping to the latest version?
I guess that isn't the case if you aren't currently using gerrit...
but why would anyone not be using gerrit? ;)
> - Every project always has a current SHA-1.
By this, do you mean that a repo sync won't pull down changes past
what the manifest SHA1 specifies?
This has been a nice feature for
us, so any developer can see what has changed in a project and merge
it in to their branch or update their manifest to point to a more
recent version. Will this capability still exist?
> - There is no .repo/local_manifest.xml. I can't decide how to represent it,
> or what aspects are important to maintain. Comments from users about this
> removal in functionality would be most appreciated.
> Manifest inheritance is supposed to fix that. It's never been implemented.
We are really looking forward to manifest inheritance, but are using
local_manifest.xml currently as a workaround. They aren't required
for our setup, but are very handy.
Other than better manifest diffs and automatic branch detection, I
don't understand what new features or benefits this change will
provide which aren't already supported by the current xml file. I
think the branch detection could be added to the xml format.
Here is
our use-case (still somewhat under development):
We have several products sharing one gerrit install. Of the 170ish
repositories, about 110 are pristine copies from upstream and are used
by all products. Another 50 or so have a MyCompany branch, and all
the products use this branch. The other 10 repositories are fairly
product-specific and each product has their own branch. While the
product is in development mode, it follows tips of all internal
repositories and specifies SHA1 tags for upstream unmodified
repositories. When we hit release/bug squash mode, that product will
specify SHA1 hashes in the manifest for all repositories and only
update to newer versions of a repository when needed.
The current xml format works quite well for this. With the new
format, if product A submits a change to one of the 50 shared
repositories, will products B-E have to update their manifest by hand
to use the new change?
I'm also curious about this. My main reason for using repo is that I
don't have to manually specify or update SHA-1 hashes. I'm sort of
worried now because I've been proposing my group to switch to repo,
but if we have to manually update each manifest repository every time
somebody pushes a change to a sub-project (which is like 30 times a
day on average), we're in for a ton of work to keep everything in
sync. It's not possible for us to use an update hook (at least I don't
think it is) because some people that have access to a sub-project
don't have access to all the super projects that use it so they
wouldn't be able to update some super projects.
I'm concerned about the complexity of propagating/preserving manifest
changes through an auto-merge process.
We (Google) currently have an auto-merge process that walks our 100+
projects every few minutes, and whenever it finds a new change in some
specified branch (e.g. in cupcake) it tries to merge it into some
other specified branch (e.g. into donut).
The process is currently already subject to race conditions
(short-term livelocks): the auto-merger syncs, performs a merge, and
attempts a simple push of the result (which fails if the result isn't
a fast-forward, i.e. if someone else touched that project during that
time window).
I'm trying to picture how such a process would work (or could be made
to work) with the new scheme that you are proposing.
I haven't looked at Gitosis yet. It seems like that would allow
visibility of all the projects if they could see what the update hook
is doing. Secrecy is the main reason some users don't have access to
certain projects. We don't want to disclose certain projects to
certain users because of sometimes crazy company policy. But maybe I
don't fully understand how Gitosis works.
> I wonder if a middleground is possible. Like a user level configuration
> that says "just always float to the latest", and have the client do the
> floating, like it does now. So the manifest SHA-1s may be stale, but the
> client would just float ahead to current branch tips anyway. With that, you
> might only update the manifest SHA-1s to tag a stable release point, e.g. a
> build you give to testing prior to release to a customer, but during normal
> development, the manifest just stays really old.
Is there a reason it even needs to store the SHA-1 hash if it's not
being used? Maybe the SHA-1 hash could be optional. If it's present
then it always uses it even if the branch tip has advanced. If the
SHA-1 hash is missing then it uses the branch tip.
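A minimal sketch of the rule suggested here, combined with the quoted "just always float to the latest" idea. The function name and the `float_to_tip` setting are hypothetical; nothing like this exists in repo as far as this thread shows.

```python
def choose_revision(pinned_sha, branch_tip, float_to_tip=False):
    """Pick the commit a sync would check out for one project.

    pinned_sha   -- SHA-1 recorded in the manifest, or None if omitted
    branch_tip   -- current tip of the project's branch
    float_to_tip -- hypothetical per-user setting: ignore the pin
    """
    if float_to_tip or pinned_sha is None:
        return branch_tip
    return pinned_sha


if __name__ == "__main__":
    print(choose_revision("abc123", "def456"))                     # pinned
    print(choose_revision(None, "def456"))                         # no pin
    print(choose_revision("abc123", "def456", float_to_tip=True))  # floating
```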
If a change only touches one project, then yes, as soon as it submits, the manifest jumps to that new revision. So there is no difference.
However, if a change touches 2 or more projects, and they are interrelated, the first project submits, and the manifest does nothing. When the second project submits, then the manifest jumps both projects simultaneously, in the same manifest update commit. Thus clients syncing off that manifest either see no update, or they see both updates, but they are never presented a version of the manifest where only one of the 2 projects has been updated.
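The behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, the idea of a "topic" that groups interrelated changes, and the way the group's members are passed in are all hypothetical, since the thread does not say how Gerrit would learn that changes are interdependent.

```python
class ManifestUpdater:
    """Toy model: the manifest maps project name -> pinned SHA-1."""

    def __init__(self, manifest):
        self.manifest = dict(manifest)
        self.commits = []   # one snapshot per manifest update commit
        self.pending = {}   # topic -> {project: sha} submitted so far

    def submit(self, topic, project, sha, projects_in_topic):
        """Record a submit; update the manifest only when the group is complete."""
        staged = self.pending.setdefault(topic, {})
        staged[project] = sha
        if set(staged) != set(projects_in_topic):
            return False                 # manifest does nothing yet
        self.manifest.update(staged)     # all projects jump together
        self.commits.append(dict(self.manifest))
        del self.pending[topic]
        return True


if __name__ == "__main__":
    updater = ManifestUpdater({"frameworks": "old1", "dalvik": "old2"})
    group = ["frameworks", "dalvik"]
    print(updater.submit("t1", "frameworks", "new1", group), updater.manifest)
    print(updater.submit("t1", "dalvik", "new2", group), updater.manifest)
```

A client reading `manifest` between the two submits still sees both old revisions, which is the property being described.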
The thread starter email references the classic debate about working copy update policies: "latest greatest" or "controlled, tested version".
I think this is an issue that any reasonably sized project has to overcome in its lifecycle.
On Tue, May 26, 2009 at 5:48 PM, Shawn Pearce <s...@google.com> wrote:
> If a change only touches one project, then yes, as soon as it submits, the manifest jumps to that new revision. So there is no difference.
> However, if a change touches 2 or more projects, and they are interrelated, the first project submits, and the manifest does nothing. When the second project submits, then the manifest jumps both projects simultaneously, in the same manifest update commit. Thus clients syncing off that manifest either see no update, or they see both updates, but they are never presented a version of the manifest where only one of the 2 projects has been updated.
Probably I don't understand something: Is it planned to also submit manifest updates as part of a repo upload?
Actually I think it would be a good idea: it creates a "project-wide" atomic commit, which references other commits in other repositories. This way it is very easy to reproduce the exact state of the project: a single commit in the manifest repository does the trick.
It also provides an easy way to communicate between developers: something like SVN supports with tagging a working copy into the repository.
Also, different manifest commits could be tagged by different labels, thus implement a build/configuration promotion system.
If repo itself can easily create new manifest revisions from specific project states, and submit those to Gerrit or just push them into the master manifest repository, I don't see a problem with the "manual" gitlink changes. Repo should also include a possibility to update to the tip of each branch in the project, and generate the appropriate manifest for that.
I would like to add one more consideration into the mix: handling continuous integration.
I started to implement repo support for Hudson (hudson.dev.java.net), so it can be used as a CI for Android.
Of course, builds have to be reproducible. I saw that repo can output a manifest xml format, which lists all the commits for each repository. I planned to store this xml file as a build artifact so the build can be reproduced. It would be much nicer to use a single commit hash which points to the manifest repository.
I was also thinking on how to support the review process with a CI server.
It would periodically check the not yet merged and not abandoned changes in Gerrit, merge them into a workspace, and try to build it. If it does not build, it would automatically add a -1 comment; if it builds, it would add a +1 comment.
It is of course a question whether changes are applied one at a time, or the CI tries to include as many changes as possible into each build.
Also, most probably after a change is merged, the pending changes should be checked whether they can be still merged or not. If not, then they should be excluded from further CI runs, and a report should be sent to the creator of the patch, so they can submit an updated version.
What do you think?
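To make the idea concrete, here is a small sketch of the one-change-at-a-time variant. The two callables stand in for a real merge attempt and a real build; they and the function name are hypothetical, and nothing here talks to Gerrit or Hudson.

```python
def verify_open_changes(open_changes, merges_cleanly, builds):
    """Return (votes, needs_update).

    votes        -- {change: +1 or -1} for changes that could be merged
    needs_update -- changes that no longer merge; their authors would be
                    notified and the changes skipped in later CI runs
    """
    votes = {}
    needs_update = []
    for change in open_changes:
        if not merges_cleanly(change):
            needs_update.append(change)
            continue
        votes[change] = 1 if builds(change) else -1
    return votes, needs_update


if __name__ == "__main__":
    votes, stale = verify_open_changes(
        ["c1", "c2", "c3"],
        merges_cleanly=lambda c: c != "c3",   # pretend c3 conflicts
        builds=lambda c: c == "c1",           # pretend only c1 builds
    )
    print(votes, stale)
```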
> If repo itself can easily create new manifest revisions from specific project states, and submit those to Gerrit or just push them into the master manifest repository, I don't see a problem with the "manual" gitlink changes. Repo should also include a possibility to update to the tip of each branch in the project, and generate the appropriate manifest for that.
Yes, but it gets ugly with live-lock when you are talking about updating the manifest repository. E.g. you might succeed in pushing to the projects, but fail on pushing the manifest, as someone else has already updated the manifest. Then repo would need to retry the manifest. A bad client could push the manifest first, then the projects, resulting in the classic "out of order lock acquisition" problem, but instead of causing deadlock, you'd get a nasty merge conflict in the manifest repository.
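The retry described here can be modelled in a few lines. This is a toy simulation, not repo code: `ManifestRef` stands in for the manifest branch on the server and only accepts fast-forwards, and `build_update` is a hypothetical callback that regenerates the manifest commit on top of whatever tip was just read.

```python
class ManifestRef:
    """Stand-in for the manifest branch on the server."""

    def __init__(self, tip):
        self.tip = tip

    def push(self, parent, new_tip):
        # Accept only fast-forwards: the pusher must have built on the tip.
        if parent != self.tip:
            return False
        self.tip = new_tip
        return True


def push_manifest(ref, build_update, max_attempts=5):
    """Retry the manifest push until it fast-forwards; return attempts used."""
    for attempt in range(1, max_attempts + 1):
        parent = ref.tip                # re-read the tip on every attempt
        new_tip = build_update(parent)  # regenerate the manifest commit
        if ref.push(parent, new_tip):
            return attempt
    raise RuntimeError("manifest push kept losing the race")
```

The bound on attempts matters: with enough concurrent pushers the loop could otherwise keep losing the race, which is the live-lock concern raised above.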
I agree about the CI aspects. FWIW, the "Verified" field is meant to be set +1/-1 by a CI system, as you describe above.
Actually, we've already talked about all of this internally at Google before we launched the AOSP. But it's all been pie-in-the-sky engineering, because we've been dragged down in more mundane issues. I haven't even had time to write it out as part of a roadmap for Gerrit Code Review. I'm glad someone else shares the same thoughts on the matter, and has taken the time to write it out for us. :-)
- I don't know where to put the review URL. Given the URLs being relative, it suddenly seems odd to put the review URL into a "submodule.dalvik.review" key in the .gitmodules file. Suggestions would be appreciated.
- The manifest format matches what `git submodule` expects. That means one could clone the manifest and manage to do a checkout (and possibly build) without repo. Conversely, repo could automatically be used on any other already existing `git submodule` style project.
On Thu, May 21, 2009 at 7:49 AM, Shawn Pearce <s...@google.com> wrote:
> - I don't know where to put the review URL. Given the URLs being relative, it suddenly seems odd to put the review URL into a "submodule.dalvik.review" key in the .gitmodules file. Suggestions would be appreciated.
In the current manifest a review URL is specified for a remote name, e.g. korg.
I think that the best way would be not to extend the .gitmodules file, but add a new .gerrit file. There you could specify the code review URL.
> - The manifest format matches what `git submodule` expects. That means one could clone the manifest and manage to do a checkout (and possibly build) without repo. Conversely, repo could automatically be used on any other already existing `git submodule` style project.
If I understand it correctly, repo creates the following working tree format:
.repo/
.repo/manifests.git -> the manifest repository
.repo/manifests -> a linked working tree of the manifest repository
.repo/projects/*/*.git -> the cloned repositories
.repo/repo -> the repo repository checked out with a working tree, so repo can find its files
dir1/dir2
dir3
....
dirN -> these are linked working trees, which link to .repo/projects/*/*.git
If I understand it correctly, the main advantage of this setup is that it should be possible to create multiple working tree setups without duplicating the repository clones themselves.
Also, it could be possible to configure different working trees for different products (e.g. dream requires different kernel ... etc.)
If I am not mistaken, this functionality is not yet implemented in repo (or it is well hidden :) )
In order to be able to build Android simply with git submodules, we need to be able to do the following:
git clone git://path/to/manifest.git mydroid
cd mydroid
git submodule init
git submodule update
make (*)
*) for make to work, we need to add the currently <copyfile/> -d top level Makefile to the manifest repository.
In order for this to work we need to add the absolute urls (git://a.g.k.o/*/*.git) into .gitmodules.
How I would make repo work with this new scheme (assuming that I was right about my assumption above, why repo makes use of the linked working tree approach):
The directory hierarchy would look like this:
.repo
.repo/manifests.git -> clone of the manifests repository
.repo/projects/*/*.git -> clones of the other repositories
.repo/repo -> clone of the repo repository with a working tree so repo can find its files
.repo/working-trees -> a list of directories where workingtrees are stored based on this .repo storage
workingtree1 -> linked working tree of the manifest repository
workingtree1/.git/config -> includes the submodule links to the relative ../.repo/projects/*/*.git
workingtree1/.gitmodules -> the unaltered manifest configuration file, containing the full URLs of all the project repositories.
workingtree1/.repo-inherit -> a file to specify the parent manifest branch (or even commit)
workingtree1/.gerrit -> The code review URL
workingtree1/Makefile -> the currently <copyfile/> -ed Makefile which only includes the makefile from build anyways.
workingtree1/project1
....
workingtree1/projectN -> linked working trees of the projects to ../.repo/projects/*/*.git.
workingtree2
...
workingtreeN -> similar to workingtree1
workingtree1 ... workingtreeN would be named after the manifest branch name, e.g. master, master-dream (master branch configured for dream hardware), donut, donut-dream ... etc.
So if my manifest says 'Follow tips of projects A - E', and someone
submits interdependent changes to projects A, B, and C, and they merge
cleanly on A and B but have a conflict on C, what will gerrit do?
I assume you want it to modify the manifest to specify the last good
commit for projects A and B while it waits on someone to fix the merge
conflicts with C?
What if there are other manifests following tips of A-E - it sounds
like gerrit will update their manifests as well?
What if a manifest tracks tips of A and B, but is on a static revision
of C? (This is more a problem in general, and not with the current
proposal)
How will gerrit know that changes are interdependent? Will the user
specify this through git or a gerrit page?
I agree that changing the manifest from xml to a git config type file
will be beneficial on lots of fronts. Having gerrit update it
concerns me though
... as an alternative, would it be possible for
gerrit to run a test merge on interdependent changes to projects A-C,
and only do the actual merge if it knows everything will merge
cleanly?
> In order to be able to build Android simply with git submodules, we need to be able to do the following:
> git clone git://path/to/manifest.git mydroid
> cd mydroid
> git submodule init
> git submodule update
> make (*)
> *) for make to work, we need to add the currently <copyfile/> -d top level Makefile to the manifest repository.
Or just copy it by hand. It's 3 lines. You already went through two git submodule calls just to use this. Might as well go through a 3rd cp.
> In order for this to work we need to add the absolute urls (git://a.g.k.o/*/*.git) into .gitmodules.
The relative URL format I chose for the .gitmodules file was based upon the gitmodules documentation saying it supported relative URLs if the URL starts with "./". Maybe the version of git you are looking at doesn't yet support relative URLs inside .gitmodules?
The advantage of the relative URL format is clear... a very large number of organizations mirror the Android tree internally. Not needing to hack all of the URLs in the .gitmodules file would be an advantage to them.
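To illustrate why relative URLs help mirrors, here is a simplified sketch of resolving a relative submodule URL against the URL the manifest was cloned from. It approximates git's behaviour for "./" and "../" prefixes and is not git's actual implementation; the function name and the mirror host are made up.

```python
def resolve_submodule_url(superproject_url, url):
    """Resolve a ./ or ../ submodule URL against the superproject's URL."""
    if not (url.startswith("./") or url.startswith("../")):
        return url  # already absolute, use as-is
    base = superproject_url.rstrip("/")
    while True:
        if url.startswith("./"):
            url = url[2:]
        elif url.startswith("../"):
            url = url[3:]
            base = base.rsplit("/", 1)[0]  # drop the last path component
        else:
            break
    return base + "/" + url


if __name__ == "__main__":
    relative = "../dalvik.git"
    for origin in (
        "git://android.git.kernel.org/platform/manifest.git",
        "ssh://mirror.example.com/android/platform/manifest.git",
    ):
        print(resolve_submodule_url(origin, relative))
```

The same `.gitmodules` entry resolves to the upstream server or to the mirror, depending only on where the manifest itself was cloned from.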
> How I would make repo work with this new scheme (assuming that I was right about my assumption above, why repo makes use of the linked working tree approach):
> The directory hierarchy would look like this:
> .repo
> .repo/manifests.git -> clone of the manifests repository
> .repo/projects/*/*.git -> clones of the other repositories
> .repo/repo -> clone of the repo repository with a working tree so repo can find its files
> .repo/working-trees -> a list of directories where workingtrees are stored based on this .repo storage
> workingtree1 -> linked working tree of the manifest repository
> workingtree1/.git/config -> includes the submodule links to the relative ../.repo/projects/*/*.git
> workingtree1/.gitmodules -> the unaltered manifest configuration file, containing the full URLs of all the project repositories.
> workingtree1/.repo-inherit -> a file to specify the parent manifest branch (or even commit)
> workingtree1/.gerrit -> The code review URL
> workingtree1/Makefile -> the currently <copyfile/> -ed Makefile which only includes the makefile from build anyways.
> workingtree1/project1
> ....
> workingtree1/projectN -> linked working trees of the projects to ../.repo/projects/*/*.git.
> workingtree2
> ...
> workingtreeN -> similar to workingtree1
> workingtree1 ... workingtreeN would be named after the manifest branch name, e.g. master, master-dream (master branch configured for dream hardware), donut, donut-dream ... etc.
Interesting.
But, let's not make a ton of changes at once. Some people are already unhappy they need to create a directory to execute "repo init" inside of. Asking them to move one more level down for "workingtree1" is going to annoy them. Others though already have 3, 4 parallel repo checkouts. Clearly these people would benefit from this structure.
I like the multiple work tree approach... but let's do it after the manifest changes are done, rather than trying to do it at the same time, and let's make it optional.
I had planned on *not* creating a ".git" at the top of the working tree, so that commands like "git commit" always fail up here. This way it's clear that you need to give more context to operate on a project, like cd'ing into a project's working directory, before you can execute a command. Thus far we have explicitly never had a ".git" at the top level for this very reason.
With the switch to a submodule style format, I want to keep that approach. The supermodule (aka manifest) is a VCS implementation detail that should be semi-hidden from users, not in their face at the root level. I'm unhappy that it's "cd .repo/manifests" to access it, but it's better that it is out of sight most of the time. I think many Android engineers at Google agree with me; they are used to "p4 client" being how they view/manipulate the equivalent of the manifest, and usually they don't think about what their client spec says, they just copy one from another engineer, and that's that.
As far as manifest inheritance, I realized this morning that the most likely proper way to do that is to just merge the parent manifest into your own. The best way to explain this example is the T-Mobile G1 device. We really should have 6 manifests:
platform/manifest.git -> the Android Open Source Project, e.g. build, dalvik, framework
hardware/msm.git -> msm chipset specific projects
google/google-experience.git -> projects related to the "Google experience" device
t-mobile/tmus.git -> T-Mobile customizations, like "myFaves"
htc/g1.git -> a manifest that pulls all of these together.
The way to build htc/g1.git is:
$ mkdir g1
$ cd g1
$ git init
$ git pull git://android.git.kernel.org/platform/manifest.git cupcake
$ git pull git://android.git.kernel.org/hardware/msm.git cupcake
$ git pull google:/google/-experience.git cupcake
$ git pull tmobile:/tmus.git cupcake
If you need a new version of the base platform, just pull it in again:
$ git pull git://android.git.kernel.org/platform/manifest.git cupcake
Where that gets ugly is the .gitmodules file. A semantic merge of the .gitmodules file wouldn't be too difficult to define; it's a pretty simple thing to merge together. We can do a "repo merge-driver-gitmodules" or something and help users define it in .git/config as a merge driver, and set up a .gitattributes to use our repo based merge driver for the .gitmodules file.
But you can't just create your own manifest like this without first having everything mirrored locally, otherwise the relative URLs would fail to resolve.
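As a rough idea of what a semantic merge of .gitmodules could look like, here is a per-submodule three-way merge over already-parsed entries. It is only a sketch: the "repo merge-driver-gitmodules" command mentioned above is a proposal in this thread, the function name and the dict representation are assumptions, and parsing or writing the file is left out.

```python
def merge_gitmodules(base, ours, theirs):
    """Three-way merge of {submodule name: {"path": ..., "url": ...}}.

    Returns (merged, conflicts). A submodule conflicts only when both
    sides changed it relative to base and disagree; ours is kept then.
    """
    merged = {}
    conflicts = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:
            result = o          # both sides agree (or both deleted it)
        elif o == b:
            result = t          # only theirs changed it
        elif t == b:
            result = o          # only ours changed it
        else:
            conflicts.append(name)
            result = o
        if result is not None:
            merged[name] = result
    return merged, conflicts


if __name__ == "__main__":
    platform = {"build": {"path": "build", "url": "../build.git"}}
    msm = {"msm": {"path": "hardware/msm", "url": "../msm.git"}}
    print(merge_gitmodules({}, platform, msm))
```

Pulling two unrelated manifests together, as in the G1 example, then yields the union of their submodule entries without a textual conflict.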
I think this is a good idea.
> *) for make to work, we need to add the currently <copyfile/> -d top level Makefile to the manifest repository.
But then your repository won't be clean, you will have an untracked file. It is just ugly :)
> In order for this to work we need to add the absolute urls (git://a.g.k.o/*/*.git) into .gitmodules.
> The advantage of the relative URL format is clear... a very large number of organizations mirror the Android tree internally. Not needing to hack all of the URLs in the .gitmodules file would be an advantage to them.
I don't really see the advantage.
If an organization mirrors the android repositories, they most likely will have their own repositories added to the manifest, or just use different branches ... etc. So they will have their own manifest / manifest branch anyways.
It can always be supported with tools to create / maintain these manifests, but I think it is unrealistic from an organization to assume that they can mirror the repositories and not touch the manifest. There are many more areas that need work and would be much greater help to these organizations.
> I had planned on *not* creating a ".git" at the top of the working tree, so that commands like "git commit" always fail up here. This way it's clear that you need to give more context to operate on a project, like cd'ing into a project's working directory, before you can execute a command. Thus far we have explicitly never had a ".git" at the top level for this very reason.
> With the switch to a submodule style format, I want to keep that approach. The supermodule (aka manifest) is a VCS implementation detail that should be semi-hidden from users, not in their face at the root level. I'm unhappy that it's "cd .repo/manifests" to access it, but it's better that it is out of sight most of the time. I think many Android engineers at Google agree with me; they are used to "p4 client" being how they view/manipulate the equivalent of the manifest, and usually they don't think about what their client spec says, they just copy one from another engineer, and that's that.
Regarding p4: I know the same "copy my neighbors' config spec" approach from ClearCase :)
However, with this change I wanted to shift the role of the "manifest" repository a bit. It would be no longer a mere "configspec" storage, but really the "top-level" umbrella repository.
> As far as manifest inheritance, I realized this morning that the most likely proper way to do that is to just merge the parent manifest into your own. The best way to explain this example is the T-Mobile G1 device. We really should have 6 manifests:
> platform/manifest.git -> the Android Open Source Project, e.g. build, dalvik, framework
> hardware/msm.git -> msm chipset specific projects
> google/google-experience.git -> projects related to the "Google experience" device
> t-mobile/tmus.git -> T-Mobile customizations, like "myFaves"
> htc/g1.git -> a manifest that pulls all of these together.
> The way to build htc/g1.git is:
> $ mkdir g1
> $ cd g1
> $ git init
> $ git pull git://android.git.kernel.org/platform/manifest.git cupcake
> $ git pull git://android.git.kernel.org/hardware/msm.git cupcake
> $ git pull google:/google/-experience.git cupcake
> $ git pull tmobile:/tmus.git cupcake
> If you need a new version of the base platform, just pull it in again:
> $ git pull git://android.git.kernel.org/platform/manifest.git cupcake
Would these repositories you list only contain the manifests?
In fact this is very similar to what I was proposing with .repo-inherit. The .repo-inherit would only be a convenience feature: With a "repo update-manifest" command it would automatically find the "parent" of the current manifest and do the merge if the parent has changed.
I expect that a build manager / CM person will be responsible for managing each organization's manifest repository which would track a.g.k.o and possibly other private / public manifest repositories from their partners.
From these sources they would derive their own manifest branch(es), and use those.
> However, with this change I wanted to shift the role of the "manifest" repository a bit. It would be no longer a mere "configspec" storage, but really the "top-level" umbrella repository.
We've never wanted the manifest to be some sort of top level. It's always supposed to be *just* a "configspec". So it's small, doesn't change that often, etc. It wasn't even a git repository in the early days of repo; we tacked that on once we realized that editing the XML file by hand was horrible, and people would want to share the XML files back and forth... so we shoved it in git just to facilitate sharing.
I'm not sure what value we get from it being on the top level, other than the "git add $subproject; git commit" benefit. Which may or may not confuse the heck out of a user used to p4 or svn, for example.
> $ git pull git://android.git.kernel.org/platform/manifest.git cupcake
> $ git pull git://android.git.kernel.org/hardware/msm.git cupcake
> $ git pull google:/google/-experience.git cupcake
> $ git pull tmobile:/tmus.git cupcake
> If you need a new version of the base platform, just pull it in again:
> $ git pull git://android.git.kernel.org/platform/manifest.git cupcake
> Would these repositories you list only contain the manifests?
Yes. Or, well, each repository is a manifest repository. They merge down into the "htc/g1" repository to provide the manifest necessary for HTC to build a release system image for manufacturing. "htc/adp1" would be a very similar manifest, but is slightly different, due to the different keys on the device, etc.
> In fact this is very similar to what I was proposing with .repo-inherit. The .repo-inherit would only be a convenience feature: With a "repo update-manifest" command it would automatically find the "parent" of the current manifest and do the merge if the parent has changed.
Sure, but why not just "git pull" ? And note above, there's more than one parent for "htc/g1". There are technically 4.
On Fri, May 29, 2009 at 4:06 AM, Shawn Pearce <s...@google.com> wrote:
> However, with this change I wanted to shift the role of the "manifest" repository a bit. It would be no longer a mere "configspec" storage, but really the "top-level" umbrella repository.
> We've never wanted the manifest to be some sort of top level. It's always supposed to be *just* a "configspec". So it's small, doesn't change that often, etc. It wasn't even a git repository in the early days of repo; we tacked that on once we realized that editing the XML file by hand was horrible, and people would want to share the XML files back and forth... so we shoved it in git just to facilitate sharing.
Fair enough, but with the proposed format change the manifest repository will have to change on each and every merged commit into any of the project repositories. So the "seldom changing" argument no longer applies.
> I'm not sure what value we get from it being on the top level, other than the "git add $subproject; git commit" benefit. Which may or may not confuse the heck out of a user used to p4 or svn, for example.
The main benefit is together with my "atomic change" proposal where each change also includes a commit in the manifest repository, so every change is globally defined. This is a property that should not be underestimated. It is something that works in SVN, in a single git repository, and even in a regular git submodule based project if we look at the superproject's commits.
In my opinion, my proposal blends in nicely with the intent of the whole change: to have more control over what gets into the developer's working tree.
At the basis it remains completely compatible with a regular "git submodule" project, no need to touch repo or gerrit if you want. (You can add "freedom of choice" to your marketing flyers :-) )
Repo only builds on this foundation:
- linked working trees to save disk space and sync times
- make working with submodules easier (not having to do separate commits in subprojects ... etc.)
- topic branch management
- code review workflow
etc.
In any case, I can submit patches for repo to add support for this layout, and test how it sits with the developers / CM people. However, the biggest advantage of this layout would be the easy support for atomic changes, which would also require Gerrit support.
After sleeping on it, I have changed my mind, and I think we are now in agreement. I've moved over to your idea that the top level should just be a git repository working directory. Well, actually, it depends on whether or not you use the "multiple parallel work trees" approach.
I think we should support both layouts, starting/defaulting with the single work tree layout, and then allowing the user to switch to multiple if they so choose. E.g. "repo init --multi -u ..." would immediately use the multiple layout you proposed several messages back. "git new-work-dir" is still in contrib for a reason, it's easy to get confused and wind up with a branch checked out in two places at once, etc. As it is repo and git give you enough rope to hang yourself, let's at least make it an advanced user option to ask for a gun along with that rope. Moving from the default/single layout to multi should just be "repo init --multi" in the existing client, and only require moving the top level directories and fixing some symlinks. So you wouldn't even lose your build products, or need to recompile the world.
But with either layout, the "top level" should be the manifest project, like you have been arguing for.
Yes, but it's also a throwback to BitKeeper for some. Committing a file, and then committing again to create the change set. It's a very awkward UI.
If you haven't yet looked at the repo code, I'm going to warn you, it isn't pretty. I learned Python by writing it. I know there are more elegant ways to write the code. I know some of the style is inconsistent. It's, uh, not my proudest moment. And since it works well enough for the folks that are using it today, it doesn't get any attention to clean it up... my time goes to Gerrit. So, uh, apologies in advance for the sad state that the code is currently in.
On Fri, May 29, 2009 at 4:45 PM, Shawn Pearce <s...@google.com> wrote:
> After sleeping on it, I have changed my mind, and I think we are now in agreement. I've moved over to your idea that the top level should just be a git repository working directory. Well, actually, it depends on whether or not you use the "multiple parallel work trees" approach.
> I think we should support both layouts, starting/defaulting with the single work tree layout, and then allowing the user to switch to multiple if they so choose. E.g. "repo init --multi -u ..." would immediately use the multiple layout you proposed several messages back. "git new-work-dir" is still in contrib for a reason, it's easy to get confused and wind up with a branch checked out in two places at once, etc. As it is repo and git give you enough rope to hang yourself, let's at least make it an advanced user option to ask for a gun along with that rope. Moving from the default/single layout to multi should just be "repo init --multi" in the existing client, and only require moving the top level directories and fixing some symlinks. So you wouldn't even lose your build products, or need to recompile the world.
> But with either layout, the "top level" should be the manifest project, like you have been arguing for.
So the "single layout" would be like this:
.repo
.repo/manifest.git -> clone of the manifest repository without working tree
.repo/repo -> repo.git clone with working tree
.repo/projects/*/*.git
.git/* -> Linked working tree to .repo/manifest.git
.git/config -> Generated content, listing submodules as relative urls to .repo/projects/*/*.git
.gitmodules -> File checkout from .repo/manifest.git
.gerrit -> File checkout from .repo/manifest.git
dir1 ... dirN -> Submodule working trees, essentially unchanged from the current layout.
This makes the conversion of the current workspaces very easy. A repo sync can first update the manifest repository, then generate the .git/* alongside .repo, then execute the new sync algorithm by looking at .gitmodules and the gitlink entries.
Are we on the same page here?
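To make the last step of that sync sequence concrete, here is a minimal sketch of how the gitlink entries could be read from the output of `git ls-files --stage`. It assumes the usual `<mode> <sha1> <stage><TAB><path>` line format and that gitlinks carry mode 160000; the function name `parse_gitlinks` and the sample data are made up for illustration and are not taken from the repo code.

```python
GITLINK_MODE = "160000"  # index mode git uses for submodule (gitlink) entries


def parse_gitlinks(ls_files_output):
    """Map submodule path -> pinned commit SHA-1.

    Expects the text printed by `git ls-files --stage`, one entry per line:
    "<mode> <sha1> <stage>\t<path>". Entries that are not gitlinks are skipped.
    """
    gitlinks = {}
    for line in ls_files_output.splitlines():
        meta, tab, path = line.partition("\t")
        fields = meta.split()
        if not tab or len(fields) != 3:
            continue  # blank or malformed line
        mode, sha1, _stage = fields
        if mode == GITLINK_MODE:
            gitlinks[path] = sha1
    return gitlinks


# Hypothetical sample output; the SHA-1 values are placeholders.
SAMPLE = "\n".join([
    "100644 " + "a" * 40 + " 0\t.gitmodules",
    "160000 " + "b" * 40 + " 0\tdir1",
    "160000 " + "c" * 40 + " 0\tdir2/sub",
])

if __name__ == "__main__":
    for path, sha1 in sorted(parse_gitlinks(SAMPLE).items()):
        print(path, sha1)
```

A sync pass could then walk this mapping and check out each project at the listed commit, which is the "stick to what is listed in the gitlink" behaviour discussed earlier in the thread.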
> Yes, but it's also a throwback to BitKeeper for some. Committing a file, and then committing again to create the change set. It's a very awkward UI.
Well, I thought that repo could provide a frontend to make this easier. In fact a repo commit in the top level project could commit recursively in all projects with the appropriate topic branch, for example.
Or repo commit in a subproject could automatically execute "git add $subproject" in the top level project. Also, repo upload could check for commits in subproject topic branches that are not reflected in the top level project.
We should probably think this through systematically by collecting all the relevant use cases to have a consistent experience for developers.
> If you haven't yet looked at the repo code, I'm going to warn you, it isn't pretty. I learned Python by writing it. I know there are more elegant ways to write the code. I know some of the style is inconsistent. It's, uh, not my proudest moment. And since it works well enough for the folks that are using it today, it doesn't get any attention to clean it up... my time goes to Gerrit. So, uh, apologies in advance for the sad state that the code is currently in.
I already had a peek before we started this discussion, so I know what to expect. I don't think it is in too bad shape. I saw that you already started with the preparations for the manifest format change.
On May 29, 1:36 pm, Shawn Pearce <s...@google.com> wrote:
>
> I still have more code to push out. I started that stuff a while ago, but
> couldn't get around to testing it enough to get it out there. Right now I'm
> working on a bug in some code I did that split the "revision" property of a
> manifest into two fields, SHA-1 and revision, so that in the submodule style
> manifest we have access to both the commit the gitlink refers to, and also
> to the project's remote branch name, if one is specified for gerrit to
> auto-update.
You mentioned a few days ago about supporting a "floating" branch such
that it would always bring in the tip of the specified branch even if that
made the SHA-1 in the manifest/modules file stale. Are you still
planning on supporting this?
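To pin down what "floating" versus pinned would mean in practice, here is a rough sketch built on the two fields mentioned in the quote (the gitlink SHA-1 and the remote branch name). Everything here is an assumption for illustration: `ProjectRevision`, `revision_to_sync`, `is_stale` and the `floating` switch are invented names, not part of repo.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ProjectRevision:
    """Hypothetical record holding the two halves of the old "revision"."""
    path: str
    sha1: str                     # commit the gitlink refers to
    branch: Optional[str] = None  # remote branch name, if one is specified


def revision_to_sync(project: ProjectRevision,
                     branch_tips: Dict[str, str],
                     floating: bool = False) -> str:
    """Pick the commit a sync would check out.

    Pinned (default): always the gitlink SHA-1, even if the branch is ahead.
    Floating: the current tip of the named branch, when one is known.
    """
    if floating and project.branch is not None:
        return branch_tips.get(project.branch, project.sha1)
    return project.sha1


def is_stale(project: ProjectRevision, branch_tips: Dict[str, str]) -> bool:
    """True when the branch tip has moved past the recorded gitlink SHA-1."""
    tip = branch_tips.get(project.branch) if project.branch else None
    return tip is not None and tip != project.sha1
```

With a project pinned at one commit and a branch whose tip has moved, the pinned mode returns the recorded SHA-1 while the floating mode returns the tip, and `is_stale` reports the difference.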
But assuming the floating/stale thing is allowed, it sounds like the
main change from a usage standpoint with the new manifest format is
that instead of creating the manifest repository, the super project
would just be a git repository where you do 'git submodule add' for
each sub project then create another file to specify the branch names
of each sub project (since I didn't think git submodule let you
specify a branch name)?
Once that was set up, if I commit a change to a sub project and push
it to the sub project's repository and then somebody does a 'repo
sync' of the super project, they'd automatically get that newly
committed/pushed sub project change, right?
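As a sketch of the setup being asked about, the following reads a `.gitmodules` file (as written by 'git submodule add') and merges in branch names from a separate source. The `[submodule "name"]` sections with `path` and `url` keys are the standard `.gitmodules` layout; the separate branch mapping and its format are purely hypothetical, since the thread does not define that file, and the URLs below are placeholders.

```python
import re

SECTION_RE = re.compile(r'^\[submodule "(?P<name>[^"]+)"\]$')


def parse_gitmodules(text):
    """Return {submodule name: {"path": ..., "url": ...}} from .gitmodules text."""
    modules = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        if line.startswith("["):
            match = SECTION_RE.match(line)
            # Ignore any section that is not a submodule section.
            current = modules.setdefault(match.group("name"), {}) if match else None
            continue
        if current is not None and "=" in line:
            key, _, value = line.partition("=")
            current[key.strip()] = value.strip()
    return modules


def attach_branches(modules, branches_by_path):
    """Add a "branch" key per submodule from a hypothetical path -> branch map."""
    return {
        name: dict(info, branch=branches_by_path.get(info.get("path")))
        for name, info in modules.items()
    }


SAMPLE_GITMODULES = """\
[submodule "dir1"]
\tpath = dir1
\turl = ../projects/dir1.git
[submodule "dir2"]
\tpath = dir2
\turl = ../projects/dir2.git
"""

# Hypothetical branch assignments; dir2 has none and stays pinned.
SAMPLE_BRANCHES = {"dir1": "master"}

if __name__ == "__main__":
    merged = attach_branches(parse_gitmodules(SAMPLE_GITMODULES), SAMPLE_BRANCHES)
    for name, info in merged.items():
        print(name, info)
```

Whether a 'repo sync' then follows the branch or the recorded SHA-1 is exactly the open question in this message; the sketch only shows how the two pieces of information could be brought together.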