Repo vs. Submodule

Ali Al-Shabibi

Aug 9, 2016, 1:52:08 PM
to cord-d...@opencord.org
Hi all,

We have been using git submodule for cord in the first release. While it does present some advantages, its use seems error-prone.

Repo tracks a branch rather than an individual commit and uses a manifest file to describe the sub-repos in a project. Given that a manifest file is used, it will be easier for third parties to integrate their customisations.
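
To illustrate the manifest idea, here is a minimal sketch that reads a repo-style manifest and lists which revision each sub-repo would track. The element and attribute names (remote, default, project, name, path, revision) follow repo's manifest format; the URL, project names, paths and revisions are placeholders, not the real CORD layout.

```python
import xml.etree.ElementTree as ET

# Hypothetical manifest: the remote URL, project names, paths and
# revisions below are placeholders chosen for illustration only.
MANIFEST = """\
<manifest>
  <remote name="origin" fetch="https://example.org/git/" />
  <default remote="origin" revision="master" />
  <project name="platform-a" path="components/a" />
  <project name="platform-b" path="components/b" revision="stable" />
</manifest>
"""


def list_projects(manifest_xml):
    """Return a (name, path, revision) tuple for each project element."""
    root = ET.fromstring(manifest_xml)
    default = root.find("default")
    default_rev = default.get("revision") if default is not None else None
    projects = []
    for project in root.findall("project"):
        name = project.get("name")
        path = project.get("path", name)  # path falls back to the name
        revision = project.get("revision", default_rev)
        projects.append((name, path, revision))
    return projects


projects = list_projects(MANIFEST)
for name, path, revision in projects:
    print(f"{name} -> {path} @ {revision}")
```

A third party could, in principle, change a single revision attribute in such a file to pull in a customised branch of one component.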

Here are two discussions that could be useful:

[1] repo vs git submodule discussion on the Linaro mailing list: http://lists.linaro.org/pipermail/linaro-dev/2011-June/004941.html
[2] 'repo is built to manage an OS distribution, in Git' repo announcement mail http://lwn.net/Articles/304488/

Any thoughts?

Cheers.

--
Ali

Zack Williams

Aug 9, 2016, 3:49:15 PM
to Ali Al-Shabibi, cord-d...@opencord.org
On Aug 9, 2016, at 10:52 AM, Ali Al-Shabibi <ali.al-...@onlab.us> wrote:
>
> Any thoughts?

Two other options:

3. We automate the process of updating submodules to match branch head versions

4. We do away with submodules and unified repos and have scripts that handle the checkouts, which could then be programmatically defined by tag/branch/etc.

I'd prefer #3 to being dependent on an additional 3rd party tool like repo.

#4 might be even better, as it's easy to imagine scenarios where some versions should be on a stable branch/commit, and development work might occur on another. In XOS we went in this direction, with Makefiles that allow the repo and branch to be overridden per service:

https://github.com/opencord/service-profile/blob/master/common/Makefile.services
https://github.com/opencord/service-profile/blob/master/common/Makedefs

(not that we should use make for this, but the general concept still applies)
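
The general concept can be sketched outside of make. The following is a minimal sketch, assuming a table of default repos and refs plus per-service overrides; the service names, URLs and refs are hypothetical, and the commands are only built, not executed.

```python
# Hypothetical service table: names, URLs and refs are placeholders.
DEFAULTS = {
    "service-a": {"repo": "https://example.org/git/service-a", "ref": "master"},
    "service-b": {"repo": "https://example.org/git/service-b", "ref": "master"},
}


def resolve(defaults, overrides):
    """Merge per-service overrides (repo and/or ref) onto the defaults."""
    resolved = {}
    for name, spec in defaults.items():
        resolved[name] = {**spec, **overrides.get(name, {})}
    return resolved


def clone_commands(services):
    """Build one `git clone` argument list per service (nothing is run).

    `git clone --branch` accepts a branch or tag name, not a commit hash,
    so pinning to a commit would need a separate checkout step.
    """
    return [
        ["git", "clone", "--branch", spec["ref"], spec["repo"], name]
        for name, spec in sorted(services.items())
    ]


# Override only the ref of one service; everything else keeps its default.
services = resolve(DEFAULTS, {"service-b": {"ref": "stable"}})
for cmd in clone_commands(services):
    print(" ".join(cmd))
```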

- Zack

Ali Al-Shabibi

Aug 9, 2016, 4:25:55 PM
to Zack Williams, cord-d...@opencord.org

>> Any thoughts?
>
> Two other options:
>
> 3. We automate the process of updating submodules to match branch head versions

This can be done in gerrit but we decided that we would not want that because nasty things could happen behind a developer's back.

>
> 4. We do away with submodules and unified repos and have scripts that handle the checkouts, which could then be programmatically defined by tag/branch/etc.

This scares me. First, it creates confusion as to which metadata (.git) applies to which repo. Second, how does a developer/user know where one repo ends and the other starts? Finally, managing an extra layer of scripting for this seems unnecessary given that there are many tools out there (repo, submodule, subtree, etc.)

Scott Baker

Aug 9, 2016, 4:42:59 PM
to Ali Al-Shabibi, Zack Williams, cord-d...@opencord.org
Regarding #4, we're currently doing the checkouts in the wrong place (on the target machine instead of on the dev machine) and that is causing some grief with the requirement for Internet connectivity on pods, but we could fix that. I haven't given it much thought, as I had assumed XOS's current approach was only temporary until we moved to using submodules for everything.

A potential advantage to #4 is that, assuming the repo tags were all in one place, it would be easy for someone to build a mixed configuration, for example "RC3, but with RC2's ONOS and RC4's vSG" if they had some reason to do so.
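
As a rough sketch of that mixed configuration, assuming the tags for every release live in one table: the component names come from the example above, while the tag strings and the table layout are hypothetical.

```python
# Hypothetical tag table: one entry per release, one tag per component.
RELEASES = {
    "RC2": {"onos": "rc2", "xos": "rc2", "vsg": "rc2"},
    "RC3": {"onos": "rc3", "xos": "rc3", "vsg": "rc3"},
    "RC4": {"onos": "rc4", "xos": "rc4", "vsg": "rc4"},
}


def mixed_config(base, picks):
    """Start from release `base`, then take each component named in
    `picks` from the release it points to."""
    config = dict(RELEASES[base])  # copy so the table is not modified
    for component, release in picks.items():
        config[component] = RELEASES[release][component]
    return config


# "RC3, but with RC2's ONOS and RC4's vSG"
config = mixed_config("RC3", {"onos": "RC2", "vsg": "RC4"})
print(config)
```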

I'm not familiar enough with the repo tool to appreciate its advantages and disadvantages compared to #4. I do know that what we have now is a bit clunky, especially when a developer is trying to develop across multiple repositories at the same time. I'm often working in xos, service-profile, and a service repository all at the same time, and I don't know if repo would ease my workflow.

Scott

--
You received this message because you are subscribed to the Google Groups "CORD Discuss" group.
To unsubscribe from this group and stop receiving emails from it, send an email to cord-discuss+unsubscribe@opencord.org.
To post to this group, send email to cord-d...@opencord.org.
Visit this group at https://groups.google.com/a/opencord.org/group/cord-discuss/.
To view this discussion on the web visit https://groups.google.com/a/opencord.org/d/msgid/cord-discuss/039D1AC4-8723-4CD0-B942-4C2AC3C13F07%40onlab.us.

Zack Williams

Aug 9, 2016, 5:03:00 PM
to Ali Al-Shabibi, cord-d...@opencord.org
On Aug 9, 2016, at 1:25 PM, Ali Al-Shabibi <ali.al-...@onlab.us> wrote:
>
>> 3. We automate the process of updating submodules to match branch head versions
>
> This can be done in gerrit but we decided that we would not want that because nasty things could happen behind a developer's back.

That's the tradeoff - automation and possible automated error vs. manual process and possible user error.

A goal of reducing error in general and human interaction at the same time seems like the right course of action, as it has benefits in either case.

>> 4. We do away with submodules and unified repos and have scripts that handle the checkouts, which could then be programmatically defined by tag/branch/etc.
>
> This scares me. First, it creates confusion as to which metadata (.git) applies to which repo.

I'm not quite clear on what you mean by this - each repo will retain its own metadata, correct? Or is this being stripped out somehow in the current scenario?

> Second, how does a developer/user know where one repo ends and the other starts?

A directory per repo seems like a fairly straightforward mapping.

> Finally, managing an extra layer of scripting for this seems unnecessary given that there are many tools out there (repo, submodule, subtree, etc.)

repo is just another script and, after reading it, it appears to have a lot of Android-specific stuff in it, like their certificates, etc., that we don't need.

I'm not trying to be "Not Invented Here" on this, but adding even more required tooling to make a build seems like a poor precedent to set. What gets checked out comes down to two elements:

- The repo to use
- The commit/branch within that repo

All these systems are variants on that basic concept with a variety of limitations.

I'm leaning toward #3 (continuing to use submodules, automation), with work to make that less error prone and reduce human interaction to a minimum. Maybe everything up until the application of final release tags is made automatic?

- Zack

Ali Al-Shabibi

Aug 9, 2016, 5:24:05 PM
to Zack Williams, cord-d...@opencord.org

>>
>>> 3. We automate the process of updating submodules to match branch head versions
>>
>> This can be done in gerrit but we decided that we would not want that because nasty things could happen behind a developer's back.
>
> That's the tradeoff - automation and possible automated error vs. manual process and possible user error.
>
> A goal of reducing error in general and human interaction at the same time seems like the right course of action, as it has benefits in either case.

Yes, this is true. But the idea here was to move to a friendlier system than submodules, which are relatively complex at times.

>
>>> 4. We do away with submodules and unified repos and have scripts that handle the checkouts, which could then be programmatically defined by tag/branch/etc.
>>
>> This scares me. First, it creates confusion as to which metadata (.git) applies to which repo.
>
> I'm not quite clear on what you mean by this - each repo will retain its own metadata, correct? Or is this being stripped out somehow in the current scenario?

Git commands find the repo meta information by looking for .git in the current working directory and, if there is none, moving up to the parent. However, if there are .git directories in any subdirectories, that doesn't matter: the meta information for the entire tree is held in the top-level .git directory. As a result, it's quite possible to create a situation where git status reports a file in a subdirectory as new (or modified), whereas when you run git status in the subdirectory, which is a "nested repo", it will appear untracked. And that's just one example -- the basic issue is that a repo tracks files in all its subdirectories, so duplicating that type of info in subdirectories creates a denormalised situation which will result in inconsistencies sooner or later.
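
The upward search described above can be sketched as follows. This is a simplified illustration only: real git also honours settings such as the GIT_DIR environment variable and validates what it finds, and the function name and directory names here are hypothetical.

```python
import os
import tempfile


def find_git_dir(start):
    """Walk from `start` towards the filesystem root and return the first
    directory that contains a `.git` entry, or None if there is none."""
    current = os.path.abspath(start)
    while True:
        if os.path.exists(os.path.join(current, ".git")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


# Build a throwaway tree with a nested layout:
#   outer/.git, outer/plain/, outer/inner/.git
root = os.path.realpath(tempfile.mkdtemp())
outer = os.path.join(root, "outer")
inner = os.path.join(outer, "inner")
os.makedirs(os.path.join(outer, ".git"))
os.makedirs(os.path.join(inner, ".git"))
os.makedirs(os.path.join(outer, "plain"))

print(find_git_dir(os.path.join(outer, "plain")))  # resolves to outer
print(find_git_dir(inner))                         # resolves to inner
```

Which metadata applies therefore depends on the directory a command is run from, which is the source of the confusion with nested checkouts.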

>
>> Second, how does a developer/user know where one repo ends and the other starts?
>
> A directory per repo seems like a fairly straightforward mapping.
>
>> Finally, managing an extra layer of scripting for this seems unnecessary given that there are many tools out there (repo, submodule, subtree, etc.)
>
> repo is just another script and, after reading it, it appears to have a lot of Android-specific stuff in it, like their certificates, etc., that we don't need.

That’s just a GnuPG check to verify that these repos have been handled with this version of repo (and it can be disabled). This seems like a desirable feature, though.