Code vendoring: Goven & subtree merging


Nathan Youngman

Oct 2, 2013, 11:25:49 PM
to go-package...@googlegroups.com
Let's discuss the merits of code vendoring (the Google way) and use this thread to evaluate current solutions:

Keith Rarick's Goven

Subtree merging:

Note: I don't think we (as the larger Go community) have the luxury of using a Git-specific solution, but I do like the collaboration potential of Subtree Merging, and it's certainly worth learning how it works.




William Kennedy

Oct 2, 2013, 11:43:44 PM
to Nathan Youngman, go-package...@googlegroups.com
Goven is performing the same operation as go get but then removing all the DVCS support files plus altering the import paths in the code. This really scares me. I am not sure I want to use a tool that will be altering the code. 

Sincerely,
Bill Kennedy
--
You received this message because you are subscribed to the Google Groups "Go Package Management" group.
To unsubscribe from this group and stop receiving emails from it, send an email to go-package-manag...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Jiahua Chen

Oct 2, 2013, 11:46:46 PM
to go-package...@googlegroups.com, Nathan Youngman
I definitely agree: it's not safe for a tool to change any code. I think a better way is to generate an alert file or message (or whatever) instead of changing the code.

Sebastien Douche

Oct 3, 2013, 12:02:32 AM
to Nathan Youngman, go-package...@googlegroups.com
On Thu, Oct 3, 2013 at 5:25 AM, Nathan Youngman <n...@nathany.com> wrote:
> Subtree merging:
> http://git-scm.com/book/en/Git-Tools-Subtree-Merging
>
> Note: I don't think we (as the larger Go community) have the luxury of using
> Git-specific solution, but I do like the collaboration potential of Subtree
> Merging, and it's certainly worth learning how it works.

+1. I love Git very much but it's dangerous to depend on another
technology. What happens if tomorrow a new tool supersedes Git (as
Git did with SVN)? Or if Git developers drop or change Subtree?


--
Sebastien Douche <sdo...@gmail.com>
Twitter: @sdouche / G+: +sdouche

Nathan Youngman

Oct 3, 2013, 12:08:38 AM
to go-package...@googlegroups.com, Nathan Youngman

My concern with changing code is not safety (I can review the changes in my DVCS tools). My concern is ease of collaboration.

Some people have suggested manipulating the GOPATH (eg. to include vendor/src) rather than rewriting imports.

Combine that with being able to easily move between the upstream code and the local vendored path (something subtree merging seems to provide), and I'm starting to get interested. ^_^

Though I think the Go Team actively discourages mucking with the GOPATH?

Sebastien Douche

Oct 3, 2013, 12:10:50 AM
to William Kennedy, Nathan Youngman, go-package...@googlegroups.com
On Thu, Oct 3, 2013 at 5:43 AM, William Kennedy <bi...@thekennedyclan.net> wrote:
> Goven is performing the same operation as go get but then removing all the
> DVCS support files plus altering the import paths in the code. This really
> scares me. I am not sure I want to use a tool that will be altering the
> code.

I understand your point of view. Speaking for myself, I don't like the
link between the imported code and the URL where that same code is
hosted. I think the right approach is to add one level of indirection here.


Example in Dart:

import 'package:mypackage/some_file.dart'; // I import mypackage. Don't care where it's hosted

And in pubspec.yaml:
dependencies:
  mypackage:
    git: https://github.com/path/to/mypackage.git

Note: pubspec (go get tool for Dart) is optional. You can live w/o it.

Sebastien Douche

Oct 3, 2013, 12:15:34 AM
to Nathan Youngman, go-package...@googlegroups.com
On Thu, Oct 3, 2013 at 6:08 AM, Nathan Youngman <n...@nathany.com> wrote:
> Though I think the Go Team actively discourages mucking with the GOPATH?
> https://groups.google.com/forum/m/#!msg/golang-nuts/dxOFYPpJEXo/mpQ5CwBuj-EJ

I don't understand the Go Team. It's as if they worked on a C
project, with only system dependencies and a few stalled libraries.

Keith Rarick

Oct 3, 2013, 12:55:37 AM
to go-package...@googlegroups.com
Assuming you want to vendor packages into your repo, goven
works okay. Its implementation probably still has a few rough
edges, but it's not horribly off the mark.

The problem I have with goven (or, AFAICT, any tool that
rewrites import paths), is dealing with transitive dependencies.
Say you are developing package A, which depends on B and C,
and B also depends on C. If you vendor C before B, B's imports
will not be rewritten. Maybe there's an elegant way to check for
this case and prevent or warn about it, or avoid it altogether.
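
One way the check described above could look, as a rough sketch only: the package names, the `vendorPrefix` constant and the `unrewritten` helper are made up for illustration and are not part of goven. It models each vendored package as a list of its imports and flags any import that still points at the original location of something that has already been vendored.

```go
package main

import (
	"fmt"
	"sort"
)

// vendorPrefix is a hypothetical location for vendored copies inside project A.
const vendorPrefix = "example.com/a/vendor/"

// unrewritten reports, for each vendored package, the imports that still
// point at the original location of another vendored package.
// The map key is a package's original import path; the value is the list
// of imports currently found in its vendored copy.
func unrewritten(vendored map[string][]string) map[string][]string {
	stale := make(map[string][]string)
	for pkg, imports := range vendored {
		for _, imp := range imports {
			if _, ok := vendored[imp]; ok {
				stale[pkg] = append(stale[pkg], imp)
			}
		}
	}
	for _, imps := range stale {
		sort.Strings(imps)
	}
	return stale
}

func main() {
	// C was vendored first, then B was copied in with its imports untouched.
	vendored := map[string][]string{
		"example.com/c": {"fmt"},
		"example.com/b": {"example.com/c"},
	}
	for pkg, imps := range unrewritten(vendored) {
		for _, imp := range imps {
			fmt.Printf("%s still imports %s; expected %s%s\n", pkg, imp, vendorPrefix, imp)
		}
	}
}
```

A tool could run a pass like this after every vendoring step and either warn or redo the rewrite for the flagged packages.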

(It's slightly offtopic for this thread, but after living with godep
for a few weeks, I prefer its model over vendoring. Now that
godep is integrated with the heroku buildpack, it really eases
development and collaboration on multiple related repositories.
You can push library code up in a branch, and have one or
more other projects using the branch before you've merged
those changes into the main line, so you don't disrupt normal
users of the library. This helps your users even if they're not
using godep!)

Dave Cheney

Oct 3, 2013, 2:40:21 AM
to Nathan Youngman, go-package...@googlegroups.com
Having worked in a C++ environment for a while that used svn, this was
an amazingly effective solution.

Having said that, I think it is not generally applicable for a general
purpose package management solution (yup, I'm aware I haven't defined
that term properly) for the following reasons.

1. Not everyone uses the same DVCS, which pretty much makes this idea a
non-starter from the get-go. I am not aware of any solution that lets
you set up an hg subrepo inside a git repo, or a git subtree inside a
bzr branch.
2. I have never used them, but I have heard that git subtrees, or
whatever passes for svn externals in git, are a bit sucky and the
general consensus is to avoid them (much as we discourage general
usage of the container/* std library types)
3. I can't see a world where everyone who wants to program in Go
*must* use a single DVCS

So, having poo pooed the general idea of submodules/externals, what
good parts can be taken from the idea?

Dave Cheney

Oct 3, 2013, 2:41:51 AM
to William Kennedy, Nathan Youngman, go-package...@googlegroups.com
I'm less worried about rewriting the source, gofmt does that every
time you run it.

What I am concerned about is losing the DVCS history, which, as a
maintenance programmer, is something I need every day. Nuking it as
part of the goven process sounds like a bad idea.

Nathan Youngman

Oct 3, 2013, 2:46:34 AM
to Dave Cheney, go-package...@googlegroups.com
I think submodules/externals are more in the realm of revision locking (storing a reference to the other code), whereas subtree merging is a means to move code between a path in the repo and a branch/proper repository. 

While I don't think subtree merging itself is the answer for the reasons mentioned, it may be worth investigating how it compares to Goven. Surely a cross-DVCS tool with similar capabilities would be possible? (if desirable)

--
Nathan Youngman
Email: n...@nathany.com
Web: http://www.nathany.com

Dave Cheney

Oct 3, 2013, 2:59:11 AM
to Nathan Youngman, go-package...@googlegroups.com
> While I don't think subtree merging itself is the answer for the reasons
> mentioned, it may be worth investigating how it compares to Goven. Surely a
> cross-DVCS tool with similar capabilities would be possible? (if desirable)

I'd like to see those investigations, but I have a concern with the approach.

If subtree/submodules/externals/whatever replaces goven, then how can
import paths be written? I.e., if I 'vendor' the following package via
some mechanism from the previous sentence,

github.com/pkg/a

what does the import path in my code look like? It can't be `import
"github.com/pkg/a"` because that would confuse go get, so it has to be
`import "github.com/mycorp/myproject/vendor/github.com/pkg/a"`, or
something like that. But that approach blows up if we find this code
inside pkg/a

package a

import "github.com/pkg/a/b"

The only solution that occurs to me to fix _this_ problem is relative
imports, and I don't want that to be the solution to anything.

kamil....@gmail.com

Oct 3, 2013, 3:01:29 AM
to go-package...@googlegroups.com
It's not super elegant, but I wrote github.com/kisielk/godepgraph as a means to make sure I've vendored all my dependencies properly. You can at least visually inspect the dependency graph. It would probably be possible to add some kind of automatic detection but since vendoring packages involves human interaction anyway, having a tool a person can use to verify it is pretty good.

It's worked well enough for our internal projects anyway.

Nathan Youngman

Oct 3, 2013, 3:14:41 AM
to go-package...@googlegroups.com, Nathan Youngman

I think how I might see this working (but it wouldn't actually work) is to have a project-specific GOPATH.

GOPATH=$HOME/go/myproj/vendor/src:$HOME/go

Use some tool to copy stuff into vendor/src which takes precedence over code installed in the normal way. This would allow you to evaluate a library by "go getting" it and importing it the usual way, and if satisfied, vendor a copy without changing the import paths at all. Presumably the tools would allow you to easily pull in upstream changes or push changes back.

Of course this won't work so well because go get will install into the vendor/src path.
I've ranted to no end on the GOPATH being backwards:

I still think go get should install things into the last thing in the GOPATH, but I've not created a CFP because it's a breaking change, and because someone else probably has a better idea. :-)

Any monkeying about with GOPATH is just kinda a hack for "virtualenvs" anyway, but IMO the lesser of two evils vs. import rewriting/unwriting.

If the Go Team is serious about pushing the Google way of using code vendoring, we're going to need some better tooling.

Nathan.

Nathan Youngman

Oct 5, 2013, 3:06:58 AM
to go-package...@googlegroups.com, Nathan Youngman

Other than Goven, Git subtree merging and a few Perl scripts in Camlistore, is anyone aware of any other tools for vendoring Go packages?

> Say you are developing package A, which depends on B and C,
> and B also depends on C. If you vendor C before B, B's imports
> will not be rewritten. Maybe there's an elegant way to check for
> this case and prevent or warn about it, or avoid it altogether.

If a tool kept track of the original locations and could rewrite all the dependencies each time something was added, would that solve the order-dependent issues? 

To do this, the original code repositories shouldn't be required, otherwise that kinda defeats the point of vendoring.

package a
import "github.com/pkg/a/b"

I think that would need to be rewritten as github.com/mycorp/myproject/vendor/github.com/pkg/a/b

Outside of the standard library, any import not inside that project should be an error (as determined by the tool). Otherwise one might also run into:


and somewhere else:


Nathan.

Nathan Youngman

Oct 5, 2013, 10:27:07 PM
to go-package...@googlegroups.com, William Kennedy, Nathan Youngman, Russ Cox


On Thursday, 3 October 2013 00:41:51 UTC-6, Dave Cheney wrote:
> I'm less worried about rewriting the source, gofmt does that every
> time you run it.
>
> What I am concerned about is losing the DVCS history, which, as a
> maintenance programmer, is something I need every day. Nuking it as
> part of the goven process sounds like a bad idea.

Yah. 

The vendor approach brings a degree of safety that is certainly desirable, but at least with current tools, it loses out with regards to collaboration (even with your own reusable libraries). 

Though, what if a tool could rewrite import paths both ways?

What if I could change the import paths in my project to go back to e.g. github.com/howeyc/fsnotify. If I ran into a bug due to a recent update, I could use the DVCS tools like bisect. I could do my upstream commit/pull request as usual. When I'm done I could vendor the changes back in.
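
At the level of path strings, the two directions are just inverses of each other. A tiny sketch, with `vendorPrefix`, `toVendored` and `toUpstream` as hypothetical names chosen for this example:

```go
package main

import (
	"fmt"
	"strings"
)

// vendorPrefix is a hypothetical project-local location for vendored code.
const vendorPrefix = "github.com/mycorp/myproject/vendor/"

// toVendored maps an upstream import path to its vendored location.
func toVendored(path string) string {
	return vendorPrefix + path
}

// toUpstream undoes toVendored. The second result is false when path is not
// under the vendor prefix, so unrelated imports are left alone.
func toUpstream(path string) (string, bool) {
	if !strings.HasPrefix(path, vendorPrefix) {
		return path, false
	}
	return strings.TrimPrefix(path, vendorPrefix), true
}

func main() {
	v := toVendored("github.com/howeyc/fsnotify")
	u, ok := toUpstream(v)
	fmt.Println(v)
	fmt.Println(u, ok)
}
```

The harder part, which this sketch does not touch, is carrying local commits and history across the two locations.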

Another situation is where a developer commits a local fix to the vendored copy, but then desires to push that fix upstream. That is something that Git subtree merging is capable of handling: http://git-scm.com/book/en/Git-Tools-Subtree-Merging
I don't know enough about how subtree merging is implemented to be able to say what's involved in building a DVCS-agnostic tool that can do that.

What I do know is that Goven has languished for a year while Keith turned his attention to godep, yet as of the moment, Goven will be the only third-party packaging tool mentioned in the Go 1.2 FAQ on December 1st. Of the many, many open source package-related tools for Go, I'm not aware of any other tools to vendor Go packages. 

At the same time several people on (and off) this list consider the ability to vendor packages a must. This approach to reproducible builds certainly deserves more attention than it has received.

Aside: I take back my previous comment that import rewriting is worse than mucking with the GOPATH, for three reasons.
1. Manipulating the GOPATH requires downstream users to do the same in order to build the project. Import rewrites only come into play when updating third-party libraries.
2. Having all the import paths (outside stdlib) prefixed with the path to the project makes it easier to verify that all third-party dependencies are in vendor.
3. My complaint is solely with relation to making upstream collaboration more difficult (as with losing DVCS history). If tools can mitigate those problems, then I'm happy :-)
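
Point 2 lends itself to an automated check. The sketch below is an assumption-laden illustration, not an existing tool: `outsideProject` and `isStdlib` are hypothetical helpers, and the standard library test is only the common heuristic that stdlib import paths have no dot in their first element.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
	"strings"
)

// isStdlib applies a heuristic: standard library import paths have no dot
// in their first element ("net/http" versus "github.com/...").
func isStdlib(path string) bool {
	first := path
	if i := strings.Index(path, "/"); i >= 0 {
		first = path[:i]
	}
	return !strings.Contains(first, ".")
}

// outsideProject returns the imports in src that are neither standard
// library nor located under projectPrefix.
func outsideProject(src, projectPrefix string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var bad []string
	for _, spec := range file.Imports {
		path, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		if isStdlib(path) || strings.HasPrefix(path, projectPrefix) {
			continue
		}
		bad = append(bad, path)
	}
	return bad, nil
}

func main() {
	src := `package main

import (
	"net/http"

	"github.com/lib/pq"
	"github.com/mycorp/myproject/vendor/github.com/howeyc/fsnotify"
)
`
	bad, err := outsideProject(src, "github.com/mycorp/myproject/")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, path := range bad {
		fmt.Println("not vendored:", path)
	}
}
```

Run across every file in the project, an empty result would mean all third-party dependencies resolve inside the repository.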

Nathan.


Dave Cheney

Oct 6, 2013, 8:16:55 PM
to Nathan Youngman, go-package...@googlegroups.com
I think I understand the basic gist of a bidirectional version of goven
that can 'undo' the vendor process to possibly automate, or at least
assist with keeping up to date with the upstream.

I have two points, one minor, one major.

1. should go get be used at all in this scenario once you have your
checkout of your package?
2. this approach is not going to be popular with the Debian/Ubuntu (I
guess fedora as well, I don't know their plans well), who are freaked
out by the fact that to get reproducible builds in Go currently, you
have to give them a whole $GOPATH as a tarball. The amount of
duplication of code raises their security hackles. Maybe for good
reason.

I raise point 2 not to shoot down your idea, just to mention that for
that subset of the Go community they will probably take issue.

Cheers

Dave

Nathan Youngman

Oct 7, 2013, 1:12:21 AM
to go-package...@googlegroups.com, Nathan Youngman


On Sunday, 6 October 2013 18:16:55 UTC-6, Dave Cheney wrote:
> I think I understand the basic gist of a bidirectional version of goven
> that can 'undo' the vendor process to possibly automate, or at least
> assist with keeping up to date with the upstream.
>
> I have two points, one minor, one major.
>
> 1. should go get be used at all in this scenario once you have your
> checkout of your package?

You mean to "go get -u" the repository and then merge that into vendor/?

I'm not that familiar with workflows used by those who vendor code, having used this approach only a wee bit.

But I would say that one advantage of the two step process is that any other tools that augment the "go get -u" process could be used alongside the tool to vendor packages.

On the other hand, more steps is more steps. I think Camlistore just involves running update.pl and you're done in one step. (Yet to try it out)

 
> 2. this approach is not going to be popular with the Debian/Ubuntu (I
> guess fedora as well, I don't know their plans well), who are freaked
> out by the fact that to get reproducible builds in Go currently, you
> have to give them a whole $GOPATH as a tarball. The amount of
> duplication of code raises their security hackles. Maybe for good
> reason.

Do you mean all the code in the app, including all transitive dependencies in vendor/? 

If there is something in $GOPATH not used by the app, I see no reason to include it in the tarball.

I think that for any commit made to the app, all dependencies should be in vendor/, and the tool should help to check that.

Rewriting import paths the other way would just be to access DVCS tools and contribute upstream, but only for temporary use during development.
 
> I raise point 2 not to shoot down your idea, just to mention that for
> that subset of the Go community they will probably take issue.

I'm not sure if this is what you're getting at.... having an app not contain code for all its dependencies?

There are more than a dozen tools that use the "revision locking" approach for "reproducible builds", despite the caveats:

1. No safety against repositories vanishing, resulting in a bad day for the developer, impacting downstream users of the app who build from source, and breaking the ability to build old versions of the app without first fixing import paths.
2. Requiring a separate solution for project-specific workspaces, which may also impact downstream users.

Even knowing the tradeoffs, some may prefer this approach. I think that's fine. I think our working group should help people be aware of the trade offs for the tools they choose, and help cut through the confusion of 20 tools that do the same thing slightly differently.
 
If we can reduce the tradeoffs, then even better, such as improving upstream collaboration for tools used to vendor Go packages.

The README for Goven suggests that it doesn't support Windows. Ensuring that it does work before Go 1.2 is released should take minimal effort, and hopefully we can do a whole lot more in the next 8 weeks.

Cheers,
Nathan.

P.S. I didn't make a separate thread for "revision locking" when the group began. Probably should.

Kamil Kisiel

Oct 7, 2013, 1:42:18 AM
to go-package...@googlegroups.com
My project this weekend was creating a new tool along the lines of goven. I called it "vendorize":


It works on the same principle as goven, but instead of providing a package you wish to vendor, you provide a package whose dependencies you want to copy. The tool walks the whole dependency graph of the package and copies dependencies (including transitive) as necessary. It updates all import paths as it goes. So far I've only tried it on a few limited test cases, and I'm sure there's probably bugs in it that I haven't thought of, but I thought I'd put it out there so people can give it a try and give me feedback.
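
The graph walk described above can be sketched roughly as follows. This is not vendorize's code: the `importsOf` lookup is a stand-in for reading a package's imports from disk, the graph in `main` is invented, and `isStdlib` is the usual heuristic that standard library paths have no dot in their first element.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// importsOf is a stand-in for reading a package's imports from disk.
type importsOf func(pkg string) []string

// isStdlib applies a heuristic: standard library import paths have no dot
// in their first element.
func isStdlib(path string) bool {
	first := path
	if i := strings.Index(path, "/"); i >= 0 {
		first = path[:i]
	}
	return !strings.Contains(first, ".")
}

// collect walks the graph from root and returns every third-party package
// reachable from it, direct or transitive, in sorted order.
func collect(root string, lookup importsOf) []string {
	seen := make(map[string]bool)
	var walk func(pkg string)
	walk = func(pkg string) {
		for _, imp := range lookup(pkg) {
			if isStdlib(imp) || seen[imp] {
				continue
			}
			seen[imp] = true
			walk(imp) // recurse so transitive dependencies are picked up
		}
	}
	walk(root)
	deps := make([]string, 0, len(seen))
	for d := range seen {
		deps = append(deps, d)
	}
	sort.Strings(deps)
	return deps
}

func main() {
	// Hypothetical graph: A depends on B and C, and B also depends on C.
	graph := map[string][]string{
		"example.com/a": {"fmt", "example.com/b", "example.com/c"},
		"example.com/b": {"example.com/c", "os"},
		"example.com/c": {"strings"},
	}
	lookup := func(pkg string) []string { return graph[pkg] }
	fmt.Println(collect("example.com/a", lookup))
}
```

Because the whole set is known before anything is copied, every import can be rewritten in one pass, regardless of the order in which packages are visited.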

I don't really think vendoring packages is a general solution to dependency management for Go programs, but it can be appropriate in some cases. Notably we've used the technique at my employer for our Go projects to keep from having to get external dependencies and have code change on us. Previously I did all the work manually and used godepgraph to make sure it worked out correctly, but I should be able to replace that with the use of vendorize in the future.

Hopefully it will be another useful tool in the toolkit. Feedback is welcome.


On Wednesday, October 2, 2013 8:25:49 PM UTC-7, Nathan Youngman wrote:

Nathan Youngman

Oct 7, 2013, 2:40:48 AM
to go-package...@googlegroups.com

Kamil,

I won't have a chance to try it out tonight (already too tired), but I just want to say I think it's awesome that you can build a tool like this over a weekend in ~280 lines of code. A testament to the usefulness of the Go stdlib? 

Super glad to have you involved in the group.

Nathan.

Dave Cheney

Oct 7, 2013, 5:55:14 AM
to Nathan Youngman, go-package...@googlegroups.com
> You mean to "go get -u" the repository and then merge that into vendor/?
>
> I'm not that familiar with workflows used by those who vendor code, having
> used this approach only a wee bit.
>
> But I would say that one advantage of the two step process is that any other
> tools that augment the "go get -u" process could be used alongside the tool
> to vendor packages.

I haven't given this a lot of thought, but I think I was wondering
what would happen if say the application imported both

github.com/lib/pq

and some vendorized version

github.com/you/pkg/vendor/github.com/lib/pq

I'm also concerned about what happens if the package you depend on has
also gone down the vendor route

github.com/your/pkg/vendor/github.com/someoneelse/pkg/github.com/lib/pq

would be surprising to discover in your dependency tree. It strikes me
that vendoring is only appropriate for projects that just consume
packages, not provide them.

> Do you mean all the code in the app, including all transitive dependencies
> in vendor/?

I probably didn't explain this point thoroughly enough, but yes, if
vendoring is the process whereby you package (in the Debian sense of
the word) everything that your application needs inside it so it is
just one repo, one tarball that just needs a Go compiler, then yes.

> If there is something in $GOPATH not used by the app, I see no reason to
> include it in the tarball.

Yup, not sure how I hinted at that, it's not an issue. I'm only
considering the big blob of code that lives in
github.com/you/pkg/vendor

> I think that for any commit made to the app, all dependencies should be in
> vendor/, and the tool should help to check that.

Sounds sane.

> Rewriting import paths the other way would just be to access DVCS tools and
> contribute upstream, but only for temporary use during development.
>
>>
>> I raise point 2 not to shoot down your idea, just to mention that for
>> that subset of the Go community they will probably take issue.

Yup, I wasn't clear before. I had on my Debian packaging hat without
giving anyone the requisite 10 minute warning.

> I'm not sure if this is what you're getting at.... having an app not contain
> code for all its dependencies?
>
> There are more than a dozen tools that use the "revision locking" approach
> for "reproducible builds", despite the caveats:
>
> 1. No safety against repositories vanishing, resulting in a bad day for the
> developer, impacting downstream users of the app who build from source, and
> breaking the ability to build old versions of the app without first fixing
> import paths.

Debian based distros (and Redhat for that matter) consider this a
solved problem, source debs or srpms contain everything they need to
be rebuilt.

> 2. Requiring a separate solution for project-specific workspaces, which may
> also impact downstream users.
>
> Even knowing the tradeoffs, some may prefer this approach. I think that's
> fine. I think our working group should help people be aware of the trade
> offs for the tools they choose, and help cut through the confusion of 20
> tools that do the same thing slightly differently.

The specific problem for distro makers can be summarised as this. If
your application is compiled against its own private copy of zlib or
openssl, it is harder to update when those packages have security
issues, and harder to _know_ those packages even need to be updated
because the package metadata does not declare a dependency on the
vulnerable package.

Now replace zlib and openssl with the Go code vendored into the tarball
that represents the input to the dpkg-buildpackage step, and you can
understand why they get grumpy about private (vendor) copies of
libraries.

>
> If we can reduce the tradeoffs, then even better, such as improving upstream
> collaboration for tools used to vendor Go packages.
>
> The README for Goven suggests that it doesn't support Windows. Ensuring that
> it does work before Go 1.2 is released should take minimal effort, and
> hopefully we can do a whole lot more in the next 8 weeks.
>
> Cheers,
> Nathan.
>
> P.S. I didn't make a separate thread for "revision locking" when the group
> began. Probably should.
>

Kamil Kisiel

Oct 7, 2013, 1:16:46 PM
to go-package...@googlegroups.com
Yes, the go/* libraries are very powerful and there is huge potential for more tooling that hasn't been written yet. But as usual, it's not the number of lines of code, but *which* lines of code ;) I spent a lot of time rewriting it many times over. Some clever use of recursion helps keep things simple.