[RFC] repo export-manifest command


Shawn Pearce

Mar 2, 2009, 8:49:12 PM
to repo-discuss
I just posted a patch, https://review.source.android.com/9051, to export a manifest from an existing client.  What may be especially interesting is exporting the current revisions of each project.  E.g. I can now "snapshot" one client:

  repo export-manifest -r -o mine.xml

and recreate that exact same state elsewhere, at any future point in time, as the commit names are embedded in the XML:

  repo init -u git://android.git.kernel.org/platform/manifest.git
  cp mine.xml .repo/manifest.xml
  repo sync

Recreating the state using init/cp/sync is a mess.  It's not a good idea for a long-lived client, as the manifest repository is now dirty and upstream updates of the manifest won't merge cleanly.  But it may be a step in the right direction.


The output manifest looks something like this; it's not nearly as neatly formatted as the hand-written ones:

  <?xml version="1.0" encoding="UTF-8"?>
  <manifest>
    <remote fetch="git://android.git.kernel.org/" name="korg" review="review.source.android.com"/>

    <default remote="korg" revision="cupcake"/>

    <project name="kernel/common" path="kernel" revision="462bfbff2e3b4e93b0509a48f8372f7dd3d17e0b"/>
    <project name="platform/bionic" path="bionic" revision="d37527501c85edcb3a6a7c8a0b6297d52d434897"/>
  ...

Ugly, but it beats creating the XML file by hand.
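
For readers who want to script against an exported manifest like the one above, here is a minimal Python sketch (standard library only). The function name `effective_revisions` and the embedded sample are illustrative assumptions, not part of repo; the sketch relies only on the `<default>` and `<project>` elements shown above, with a project that has no `revision` of its own falling back to the default one.

```python
import xml.etree.ElementTree as ET

# Sample modeled on the exported manifest above. The last project is
# hypothetical and only shows the fallback to the <default> revision.
SAMPLE_MANIFEST = """<manifest>
  <remote fetch="git://android.git.kernel.org/" name="korg" review="review.source.android.com"/>
  <default remote="korg" revision="cupcake"/>
  <project name="kernel/common" path="kernel" revision="462bfbff2e3b4e93b0509a48f8372f7dd3d17e0b"/>
  <project name="platform/bionic" path="bionic" revision="d37527501c85edcb3a6a7c8a0b6297d52d434897"/>
  <!-- hypothetical project without a pinned revision -->
  <project name="platform/example"/>
</manifest>
"""


def effective_revisions(xml_text):
    """Map each project's checkout path to the revision it would use."""
    root = ET.fromstring(xml_text)
    default = root.find("default")
    default_rev = default.get("revision") if default is not None else None
    revisions = {}
    for project in root.findall("project"):
        # "path" is optional; the project name is used when it is absent.
        path = project.get("path", project.get("name"))
        revisions[path] = project.get("revision", default_rev)
    return revisions


if __name__ == "__main__":
    for path, revision in sorted(effective_revisions(SAMPLE_MANIFEST).items()):
        print(path, revision)
```

In a manifest written with -r every project carries an explicit commit name, so the fallback branch should only matter for hand-written manifests.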


Before I commit this I would appreciate feedback from people who have asked for this sort of feature in the past, like for continuous build snapshots, or creating baselines for engineers to work against.

Jean-Baptiste Queru

Mar 3, 2009, 12:36:51 AM
to repo-d...@googlegroups.com
Thoughts:

-No concern from the export-manifest side, I think it's great.

-I think that in the most common case where someone would want to use
such a manifest they'll already have an instance inited and synced. I
think in such a case it'd be more convenient to have a repo checkout
command that'd take an exported manifest as an argument, verify that
its structure matches that of the inited instance, and checkout the
appropriate sha-1s.

-I think that capturing the entire information as a real manifest is
valuable, because it's actually self-contained (and in the distant
future it can be used with the pattern that you mentioned).

-I wonder what would be the appropriate way for a team to store such
manifests for easy future retrieval. It'd seem that having them
"somewhere" in manifest.git would make the most sense. I'm not sure
what granularity would make sense from a scalability point of view
(daily build per branch, i.e. about a thousand per year, or each
individual change, i.e. tens of thousands per year). If they're stored
as branches, this allows the usual repo init -b syntax to work, and
the counterpart might be repo checkout -b.

That way, as a user, I can see several major scenarios:

-capture a snapshot of my client with repo export-manifest -o
mine.xml, and sync back to it with repo checkout -i mine.xml

-init and sync a tree from an existing snapshot manifest: repo init -u
... -b PLAT-RC33 ; repo sync

-checkout my local tree to match an existing manifest: assuming (repo
init -u ... -b PLAT ; repo sync) at some point in the past, run repo
checkout -b PLAT-RC33.

The command names are entirely made up.

JBQ

--
Jean-Baptiste M. "JBQ" Queru
Android Engineer, Google.

Questions sent directly to me will likely get ignored or forwarded to
a public forum with no further warning.

lingy...@gmail.com

Mar 3, 2009, 3:08:38 AM
to Repo and Gerrit Discussion
Shawn,

Yes, we need this scenario. In an old repo client:
1. cp new_manifest.xml .repo/manifest.xml
2. repo sync

It would save sync time. But I found that something doesn't work correctly.
For example, in a new release, a new project A is added, and project A is
listed in new_manifest.xml.
When I did the above steps, everything seemed to be OK.
Then I ran "repo status", and it showed that 1 project was missing. I
found there was no project A in my old client, even though project A was
in the project list in .repo/manifest.xml.

Do you know what the problem is?

In such a case, must I create a new directory and then run repo init/sync?

Thanks,
Emily

Shawn Pearce

Mar 3, 2009, 3:56:54 PM
to repo-d...@googlegroups.com
On Mon, Mar 2, 2009 at 21:36, Jean-Baptiste Queru <j...@android.com> wrote:

> Thoughts:
>
> -No concern from the export-manifest side, I think it's great.

What about command name?  I'm a bit unhappy about "export-manifest" as a command name, it seems long and ugly to me compared to our more simple "init" and "sync".  But it's the best I could come up with.
 
> -I think that in the most common case where someone would want to use
> such a manifest they'll already have an instance inited and synced. I
> think in such a case it'd be more convenient to have a repo checkout
> command that'd take an exported manifest as an argument, verify that
> its structure matches that of the inited instance, and checkout the
> appropriate sha-1s.

I agree.  This is likely either an option to "init" or to "sync", or to both...  but sync is meant to perform that checkout you are talking about doing.  So we have to figure out a UI to feed the exported manifest file into sync.
 
> -I think that capturing the entire information as a real manifest is
> valuable, because it's actually self-contained (and in the distant
> future it can be used with the pattern that you mentioned).

Yes.  One downside shows up if we ever get around to doing inheritance (http://jira.source.android.com/jira/browse/REPO-3).  What do we do when we export a manifest that is based on inheritance?  Do we flatten it all out like I'm doing here in export-manifest, or do we export only parts?  It gets ugly for everyone: users, repo tool developers, etc.
 
> -I wonder what would be the appropriate way for a team to store such
> manifests for easy future retrieval. It'd seem that having them
> "somewhere" in manifest.git would make the most sense. I'm not sure
> what granularity would make sense from a scalability point of view
> (daily build per branch, i.e. about a thousand per year, or each
> individual change, i.e. tens of thousands per year). If they're stored
> as branches, this allows the usual repo init -b syntax to work, and
> the counterpart might be repo checkout -b.

IMHO, check them into the manifest Git repository.  Then you can reference any given snapshot as the commit SHA-1.  Making a thousand a day is no big deal, they are just unique commits in the manifest repository.

Interesting points in time can be tagged, e.g. builds actually OTA'd to testing.  Or shipped on consumer production models.

We just need to make "repo init -b" accept more than just a branch name.  If it can accept any arbitrary commit SHA-1 then we don't need to create a thousand branches per year in the manifest repository.

This is actually heading into the area where git submodules behave, where the submodules are always tagged with specific SHA-1s.  Android originally said we didn't want to do that, because we wanted to allow the submodules to "float" to the current latest version on some named branch.  Now we seem to be backpedaling into behaving like git submodules where we peg every subproject and have to update the manifests anytime the subproject changes?
 
> That way, as a user, I can see several major scenarios:
>
> -capture a snapshot of my client with repo export-manifest -o
> mine.xml, and sync back to it with repo checkout -i mine.xml

Yes, exactly.  Wink pestered me about this a while ago.  It's not well supported in repo currently.  If we store the manifest as a branch in your local manifest repository we can help you catalog these files and even version them, so you can diff across mine1.xml and mine2.xml if there's something funny going on between the two states.  But then we're also starting to say, hmm, maybe git submodule with its revision pegging is better?

Jean-Baptiste Queru

Mar 3, 2009, 4:27:37 PM
to repo-d...@googlegroups.com
Comments inline.

On Tue, Mar 3, 2009 at 12:56 PM, Shawn Pearce <s...@google.com> wrote:
> What about command name? I'm a bit unhappy about "export-manifest" as a
> command name, it seems long and ugly to me compared to our more simple
> "init" and "sync". But it's the best I could come up with.

That's just a name. It's probably not gonna be a command that gets run
manually a lot, and if it is we could always have a short name for it
(e.g. repo em). At least the name is very descriptive.

> Yes. One downside is when we ever get around to doing inheritance
> (http://jira.source.android.com/jira/browse/REPO-3). What do we do when we
> export a manifest that is based on inheritance? Do we flatten it all out
> like I'm doing here in export-manifest, or do we export only parts? It gets
> ugly for everyone, user, repo tool developer, etc.

As far as I can tell "flat" should be enough. Manifests are easy to
use, actually, and can be edited manually, so anything more advanced
can be done by hand.

> This is actually heading into the area where git submodules behave, where
> the submodules are always tagged with specific SHA-1s. Android originally
> said we didn't want to do that, because we wanted to allow the submodules to
> "float" to the current latest version on some named branch. Now we seem to
> be backpedaling into behaving like git submodules where we peg every
> subproject and have to update the manifests anytime the subproject changes?

I'd say that this is an issue of usage, and each project might want to
have different approaches. I personally like to have everything float,
and having the ability to "peg" at the manifest level seems like an
added bonus that will probably give other people some flexibility (if
anything, if it's similar to what git submodules do, it'll provide a
familiar migration path for existing single-tree git users). I think
that the Android side will be mostly interested in being able to
temporarily repo sync to a snapshot, and then sync back to the head.

Compare:

p4 sync @127436 # sync to the perforce state of cupcake at the last export
# Followed by:
p4 sync # sync to the head revision

With

repo sync @44 # sync to the git state of cupcake at the last export
# Followed by:
repo sync # sync to the head

(Yes, I made up that repo syntax, it actually feels out of place, but
the similarity is interesting IMHO).

BTW, I do think that creating the thousands of branches would be
useful, as it'd allow naming them in order, which I know is something
that many people on the Android side are asking for. If we start with
a cupcake client (repo init -u ... -b cupcake), maybe those branches
could be named cupcake@34, cupcake@41, cupcake@44...

JBQ

Joe Onorato

Mar 3, 2009, 4:30:58 PM
to repo-d...@googlegroups.com
On Tue, Mar 3, 2009 at 3:56 PM, Shawn Pearce <s...@google.com> wrote:
 
> IMHO, check them into the manifest Git repository.  Then you can reference any given snapshot as the commit SHA-1.  Making a thousand a day is no big deal, they are just unique commits in the manifest repository.

Would it be possible to make a synthetic repository, where each version was whatever version of everything is in the repository at any given moment?  That could become the manifest repository that everyone uses, and the head of that is always the head of everything.

 

> This is actually heading into the area where git submodules behave, where the submodules are always tagged with specific SHA-1s.  Android originally said we didn't want to do that, because we wanted to allow the submodules to "float" to the current latest version on some named branch.  Now we seem to be backpedaling into behaving like git submodules where we peg every subproject and have to update the manifests anytime the subproject changes?

The key to making this work might be that the versions are typically automatically updated, unless they're held for some reason, by making a manual copy (branch or tag).  I'm not sure we're backpedaling -- I think for normal development, most people want to float.  It's just for the extraordinary cases like debugging, or automated tools that actually care about stuff other than the head, that it becomes useful (required) to sync up at particular versions of everything.
 

-joe





lingy...@gmail.com

Mar 3, 2009, 9:32:56 PM
to Repo and Gerrit Discussion
>
> BTW, I do think that creating the thousands of branches would be
> useful, as it'd allow naming them in order, which I know is something
> that many people on the Android side are asking for. If we start with
> a cupcake client (repo init -u ... -b cupcake), maybe those branches
> could be named cupcake@34, cupcake@41, cupcake@44...
>
> JBQ
>
I agree with JBQ on this point. We have this requirement, too.
We want to know which version of cupcake we are synced to now, and which
version of cupcake we will sync to. This would help us
identify what changed between two versions.

Emily

Shawn Pearce

Mar 5, 2009, 1:10:53 PM
to repo-d...@googlegroups.com
On Tue, Mar 3, 2009 at 12:56, Shawn Pearce <s...@google.com> wrote:

> What about command name?  I'm a bit unhappy about "export-manifest" as a command name, it seems long and ugly to me compared to our more simple "init" and "sync".

So I've decided to call the command "manifest", as in "repo manifest".

"repo manifest -o -" will output the manifest to stdout.
"repo manifest -o foo.xml" will output the manifest to a file.

"repo help manifest" shows the documentation on the manifest file format folks keep asking about, in addition to the command syntax.

I'm sort of hoping "repo help manifest" will be enough DWIM-ery for people to discover it on their own.

Shawn Pearce

Mar 5, 2009, 1:46:41 PM
to repo-d...@googlegroups.com
On Tue, Mar 3, 2009 at 13:27, Jean-Baptiste Queru <j...@android.com> wrote:
> BTW, I do think that creating the thousands of branches would be
> useful, as it'd allow naming them in order, which I know is something
> that many people on the Android side are asking for. If we start with
> a cupcake client (repo init -u ... -b cupcake), maybe those branches
> could be named cupcake@34, cupcake@41, cupcake@44...

If these are all just points along the "cupcake" branch history, we don't actually need to create the branches.  Instead we can have a rule in repo that says "ref@nth" really means do this:

  git rev-list --topo-order --reverse "$ref" | head -n "$nth" | tail -n 1

(Grab the complete list of commits along that branch, sort them by topology, then commit date, reverse them so that the oldest comes first, then pick out the $nth record from that.)
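
The same rule can be sketched in Python. This is only an illustration of the idea above, not repo code: "ref@nth" is the made-up syntax from this thread, and the names `split_ref_at_nth`, `nth_commit` and `rev_list_oldest_first` are hypothetical. The helper that shells out to git uses the same rev-list options as the pipeline.

```python
import subprocess


def split_ref_at_nth(spec):
    """Split a hypothetical "ref@nth" spec, e.g. "cupcake@44" -> ("cupcake", 44)."""
    ref, sep, nth_text = spec.rpartition("@")
    if not sep or not ref:
        raise ValueError("expected ref@nth, got %r" % spec)
    nth = int(nth_text)  # raises ValueError for non-numeric text
    if nth < 1:
        raise ValueError("nth must be >= 1, got %d" % nth)
    return ref, nth


def rev_list_oldest_first(ref, cwd="."):
    """Run git rev-list for ref and return commit names, oldest first."""
    result = subprocess.run(
        ["git", "rev-list", "--topo-order", "--reverse", ref],
        cwd=cwd, check=True, capture_output=True, text=True)
    return result.stdout.split()


def nth_commit(oldest_first, nth):
    """Pick the nth (1-based) commit, mirroring head -n "$nth" | tail -n 1."""
    if nth < 1:
        raise ValueError("nth must be >= 1")
    kept = oldest_first[:nth]  # what head would let through
    if not kept:
        raise ValueError("no commits to choose from")
    return kept[-1]            # what tail would keep


if __name__ == "__main__":
    ref, nth = split_ref_at_nth("cupcake@44")
    # Requires a git checkout that actually has a "cupcake" ref:
    # print(nth_commit(rev_list_oldest_first(ref), nth))
    print(ref, nth)
```

Like the shell pipeline, this quietly returns the newest commit when nth is larger than the history; a real implementation would probably want to report that as an error.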

This is somewhat expensive, but probably is going to be faster than dealing with 1200 "cupcake@nth" branches in the branch namespace.  Git does a "table scan" over the branch namespace on almost every operation, but we only need this "cupcake@nn" infrequently, like during a sync.

Of course, one reason why Git itself doesn't support this is because the nth number is only valid globally if everyone has the same history.  It's very easy for two different parties to have "cupcake@200" and be talking about completely different states.

Jean-Baptiste Queru

Mar 5, 2009, 2:01:03 PM
to repo-d...@googlegroups.com
Is there a way to achieve this with repo across all projects? I was
under the impression that, given two changes done in two different
projects, there was no way to know which one was made before the
other.

JBQ


Shawn Pearce

Mar 5, 2009, 2:09:28 PM
to repo-d...@googlegroups.com
On Thu, Mar 5, 2009 at 11:01, Jean-Baptiste Queru <j...@android.com> wrote:

> Is there a way to achieve this with repo across all projects?

Not easily.
 
> I was
> under the impression that, given two changes done in two different
> projects, there was no way to know which one was made before the
> other.

You are correct.

One could guess by doing a "git rev-list --topo-order" in all projects and keeping track of the commit dates, and the positions of each commit, and try to create a super-ordering that covers all projects.

Git itself has no operations to do this.  However, one could write a program to process an output such as:

  repo forall -c 'git log --topo-order HEAD "--pretty=format:%H %ct" | sed "s,^,$REPO_PROJECT ,";echo'

and try to produce some sort of sense from it.
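
A rough sketch of such a program in Python, assuming the output format produced by the command above (one "project sha commit-time" line per commit, newest first within each project, with blank lines between projects). The function names are made up for illustration. It merges the per-project histories by commit time while never reordering commits within a single project, so it is a guess at a super-ordering and nothing more.

```python
import heapq


def per_project_oldest_first(text):
    """Group "project sha ctime" lines by project, oldest commit first."""
    projects = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 3:
            continue  # skip the blank separator lines
        project, sha, ctime = fields
        projects.setdefault(project, []).append((int(ctime), project, sha))
    # git log prints newest first; reverse each project's list.
    return {name: commits[::-1] for name, commits in projects.items()}


def guess_super_ordering(text):
    """Merge all projects by commit time, keeping each project's own order."""
    streams = per_project_oldest_first(text).values()
    return list(heapq.merge(*streams, key=lambda entry: entry[0]))


if __name__ == "__main__":
    # Made-up sample data in the assumed format.
    sample = """kernel k2 200
kernel k1 100

bionic b2 150
bionic b1 120
"""
    for ctime, project, sha in guess_super_ordering(sample):
        print(ctime, project, sha)
```

Because the merge only ever compares the oldest not-yet-emitted commit of each project, a commit whose date is out of sequence within its own project stays in its topological position rather than jumping ahead.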

Of course, since Gerrit fast-forwards when submitting changes, you can have a change enter a project months after its commit date, and it will show up in the wrong position relative to other changes submitted about the same time.  To get a more accurate picture Gerrit would either need to always force a merge commit node, or you would need to take reflogs into account as you do this ordering.

Jean-Baptiste Queru

Mar 5, 2009, 2:14:32 PM
to repo-d...@googlegroups.com
Yeah, my concern with fast-forwarding is that it means that the
numbering of builds wouldn't be unique even on a single machine (and
therefore obviously not globally).

I think there's a plan on the Android side to have a reference server
sync in a loop and remember the state after each change as it sees
them (and number those states monotonically). I was just wondering how
useful repo manifest would be in that case, and how much the result
could be stored in manifest.git

JBQ


Shawn Pearce

Mar 5, 2009, 2:18:01 PM
to repo-d...@googlegroups.com
On Thu, Mar 5, 2009 at 11:14, Jean-Baptiste Queru <j...@android.com> wrote:

> Yeah, my concern with fast-forwarding is that it means that the
> numbering of builds wouldn't be unique even on a single machine (and
> therefore obviously not globally).

Right.
 
> I think there's a plan on the Android side to have a reference server
> sync in a loop and remember the state after each change as it sees
> them (and number those states monotonically). I was just wondering how
> useful repo manifest would be in that case, and how much the result
> could be stored in manifest.git

Yup.

If we use the new "repo manifest -r -o <file>" to save the manifest each time the build server sees a change, we can commit it to a "cupcake-build" branch or something, and publish that.  So long as only that build server is updating "cupcake-build", and anyone who cares about talking about points in time along "cupcake-build" uses the build server's output, and only its output, we can get an ordering along that single branch.

And then the rev-list | head | tail trick I described can be applied to "cupcake-build" to get sequential points that are "well known" to anyone following AOSP.


Jean-Baptiste Queru

Mar 5, 2009, 2:23:57 PM
to repo-d...@googlegroups.com
Ah, that makes total sense, yeah.

JBQ, convinced.


johnny

Mar 6, 2009, 2:25:27 AM
to Repo and Gerrit Discussion
To me this command seems like the tag function of GIT. By saving the
manifest file, you actually take a snapshot of all the projects and
give the snapshot a name. Does that sound right? However, a GIT tag
will be maintained by GIT. But we have to maintain the "repo tag" by
ourselves.

With the "repo tag" we can reproduce the build easily and
exactly. The other thing I am thinking is that we also need a way to know
the differences between two "repo tags" or between one "repo tag" and
the current system. For example, which projects were added or deleted, and
which projects were modified.

Regards,
Johnny Xia.

Shawn Pearce

Mar 6, 2009, 10:15:16 AM
to repo-d...@googlegroups.com
On Thu, Mar 5, 2009 at 23:25, johnny <john...@gmail.com> wrote:

> To me this command seems like the tag function of GIT. By saving the
> manifest file, you actually take a snapshot of all the projects and
> give the snapshot a name. Does that sound right? However, a GIT tag
> will be maintained by GIT. But we have to maintain the "repo tag" by
> ourselves.

Eh.

The parallel in Git is more like a supermodule.  In a Git supermodule we have a listing of pairs:

  commit SHA-1 ("revision" property in the manifest)
  path ("path" property in the manifest)

That listing is then committed as a single commit, giving us a single commit SHA-1 for the supermodule, describing that snapshot.  Committing the manifest XML into the manifest repository gives us roughly the same thing.

The difference is, we also include in the manifest XML remote URLs, and code review server URLs.  And instead of encoding a commit SHA-1 we can encode a branch or tag name, allowing that repository to automatically "float".
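
To make the pinned versus floating distinction concrete, here is a small Python sketch of a heuristic check. The name `is_pinned` is made up for illustration and this is not how repo itself decides anything; it simply treats a revision that looks like a full 40-character SHA-1 as pinned and anything else (a branch or tag name) as floating.

```python
import re

# A full SHA-1 commit name is 40 hexadecimal characters.
_SHA1_RE = re.compile(r"[0-9a-fA-F]{40}")


def is_pinned(revision):
    """Heuristic: True if revision looks like a full SHA-1 commit name."""
    return _SHA1_RE.fullmatch(revision) is not None


if __name__ == "__main__":
    for revision in ("462bfbff2e3b4e93b0509a48f8372f7dd3d17e0b", "cupcake"):
        print(revision, "pinned" if is_pinned(revision) else "floating")
```

A branch that happens to be named with 40 hex characters would fool this check, which is why it is labeled a heuristic.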

Oh, and we use XML.

> With the "repo tag" we can reproduce the build easily and
> exactly. The other thing I am thinking is that we also need a way to know
> the differences between two "repo tags" or between one "repo tag" and
> the current system. For example, which projects were added or deleted, and
> which projects were modified.

Yea.

Git has difference support with its super/submodule code.  We don't have that with repo.  But I agree we need it.
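
Until repo grows something like that, a comparison of two saved manifests can be approximated outside the tool. The Python sketch below is only an illustration under stated assumptions: the names `project_revisions` and `diff_manifests` are hypothetical, projects are matched by their `name` attribute, and "modified" means only that the recorded revision differs.

```python
import xml.etree.ElementTree as ET


def project_revisions(xml_text):
    """Map project name -> revision, using the <default> revision if unset."""
    root = ET.fromstring(xml_text)
    default = root.find("default")
    default_rev = default.get("revision") if default is not None else None
    return {
        project.get("name"): project.get("revision", default_rev)
        for project in root.findall("project")
    }


def diff_manifests(old_xml, new_xml):
    """Report projects added, removed, or pointing at a different revision."""
    old = project_revisions(old_xml)
    new = project_revisions(new_xml)
    return {
        "added": sorted(set(new) - set(old)),
        "removed": sorted(set(old) - set(new)),
        "modified": sorted(
            name for name in set(old) & set(new) if old[name] != new[name]),
    }


if __name__ == "__main__":
    # Made-up manifests with short placeholder revisions.
    old = ('<manifest><default revision="cupcake"/>'
           '<project name="a" revision="aaa"/>'
           '<project name="b" revision="bbb"/></manifest>')
    new = ('<manifest><default revision="cupcake"/>'
           '<project name="a" revision="aaa"/>'
           '<project name="b" revision="ccc"/>'
           '<project name="c"/></manifest>')
    print(diff_manifests(old, new))
```

For a project reported as modified, running git log between the old and new revisions inside that project would show what actually changed.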