Dependencies & vendoring

Brad Fitzpatrick

Mar 2, 2015, 12:38:02 PM
to golang-dev

Gophers,


One of the most frequent questions we’ve received since Go 1 was how to deal with dependencies and their versions. We’ve never recommended any particular answer.


In Google’s internal source tree, we vendor (copy) all our dependencies into our source tree and have at most one copy of any given external library. We have the equivalent of only one GOPATH and rewrite our imports to refer to our vendored copy. For example, Go code inside Google wanting to use “golang.org/x/crypto/openpgp” would instead import it as something like “google/third_party/golang.org/x/crypto/openpgp”. This has worked out very well for us and we get reproducible builds. If we ever want to update openpgp in our vendored directory, we update it, verify that all the affected code in the world still compiles and tests pass (making code changes as necessary), and then check it in. The world compiles and passes at all points in the version control system’s history, and if we bisect in time, we see which version of an external library was in use at any point.


We’re starting to see others in the Go community do the same, with tools like godep and nut.


We think it’s time to start addressing the dependency & vendoring issue, especially before too many conflicting tools arise and fragment best practices in the Go ecosystem, unnecessarily complicating tooling. It would be nice if the community could converge on a standard way to vendor.


Our proposal is that the Go project,


  1. officially recommends vendoring into an “internal” directory with import rewriting (not GOPATH modifications) as the canonical way to pin dependencies.

  2. defines a common config file format for dependencies & vendoring

  3. makes no code changes to cmd/go in Go 1.5. External tools such as “godep” or “nut” will implement 1) and 2). We can reevaluate including such a tool in Go 1.6+.


The important part is that as a community, we all do this the same way, so tooling can mature and interoperate.


In Go 1.5, the “internal” package mechanism introduced in Go 1.4 for the standard library will be extended to all go-gettable packages, so using the “internal” directory as the root of rewritten import paths makes sense (as opposed to “vendor” or “third_party”).


Consider an existing use of vendoring in the Go source tree: $GOROOT/src/cmd/internal currently contains copies of “rsc.io/x86/x86asm” and “rsc.io/arm/armasm” as $GOROOT/src/cmd/internal/rsc.io/x86/x86asm/ and $GOROOT/src/cmd/internal/rsc.io/arm/armasm/, respectively. When we use those inside the Go tools, however, we import them as “cmd/internal/rsc.io/x86/x86asm” and not “rsc.io/x86/x86asm”. (Although this example comes from the Go distribution repo, the effect is the same as a local project using $GOPATH/src/your.project/path instead of $GOROOT/src/cmd.)


We currently maintain those copies by hand. Instead, we want to write a file (filename and syntax to be determined), such as:


     src/cmd/internal/TBDCONFIG.CFG:

             “rsc.io/x86/x86asm” with revision af2970a7819d

             “rsc.io/arm/armasm” with revision 616aea947362


And then your vendoring tool (such as “godep” or “nut”) would read TBDCONFIG.CFG and write out,

     src/cmd/internal/rsc.io/x86/x86asm/*.go

     src/cmd/internal/rsc.io/arm/armasm/*.go


rewriting imports and import comments in these files for the new location.  It may also optionally change your source so any occurrence of


    import "rsc.io/x86/x86asm"


becomes


    import "cmd/internal/rsc.io/x86/x86asm"


The vendoring tool would be responsible for generating errors on conflicts or missing dependencies.


The thing that we as a community need to figure out is the recommended configuration file format.


We’d prefer something that the Go standard library can already parse easily. That includes XML, JSON, and Go.  Nobody likes XML, so that leaves JSON and Go.


godep already uses JSON, as do Go tools themselves (e.g. “go list -json fmt”), so we’ll probably want something like godep’s JSON format:


{
   "Deps": [
       {
           "ImportPath": "rsc.io/arm/armasm",
           "Rev": "616aea947362"
       },
       {
           "ImportPath": "rsc.io/x86/x86asm",
           "Rev": "af2970a7819d"
       }
   ]
}


We can start with that for discussion.
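
To illustrate the "something the standard library can already parse" point, here is a minimal sketch of how a vendoring tool might read a file in the JSON shape above with encoding/json. The type names Config and Dep, the function parseConfig, and the validation rule are assumptions for this example; the format itself is still up for discussion.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dep mirrors one entry of the "Deps" array in the proposed JSON.
type Dep struct {
	ImportPath string
	Rev        string
}

// Config mirrors the top-level object.
type Config struct {
	Deps []Dep
}

// parseConfig (hypothetical) decodes the file and rejects entries that
// lack an import path or a revision.
func parseConfig(data []byte) (Config, error) {
	var c Config
	if err := json.Unmarshal(data, &c); err != nil {
		return Config{}, err
	}
	for _, d := range c.Deps {
		if d.ImportPath == "" || d.Rev == "" {
			return Config{}, fmt.Errorf("dependency %+v needs both ImportPath and Rev", d)
		}
	}
	return c, nil
}

func main() {
	const data = `{
   "Deps": [
       {"ImportPath": "rsc.io/arm/armasm", "Rev": "616aea947362"},
       {"ImportPath": "rsc.io/x86/x86asm", "Rev": "af2970a7819d"}
   ]
}`
	c, err := parseConfig([]byte(data))
	if err != nil {
		panic(err)
	}
	for _, d := range c.Deps {
		fmt.Printf("%s at revision %s\n", d.ImportPath, d.Rev)
	}
}
```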


Note that we have rejected non-vendoring approaches that require modifications to GOPATH or new semantics inside the go command and toolchain. We believe it is important that the solution not require additional effort on the part of all the tools that already understand how to build, analyze, or modify code in the standard GOPATH hierarchy.


Aram Hăvărneanu

Mar 2, 2015, 12:47:03 PM
to Brad Fitzpatrick, golang-dev
One of the nicest things about Go is that all the information required
to build Go programs is in the source code, i.e. in .go files, and not
in any other files (Makefiles, .cfg files, etc). With this proposal,
this would change. We'd have information in non-Go files
(TBDCONFIG.CFG).

--
Aram Hăvărneanu

Brendan Tracey

Mar 2, 2015, 12:49:12 PM
to Brad Fitzpatrick, golang-dev
Who is the we in “Our proposal is that”?


cm...@golang.org

Mar 2, 2015, 12:51:30 PM
to golan...@googlegroups.com, brad...@golang.org
As noted in the OP, TBDCONFIG.CFG could be TBDCONFIG.go (or TBDCONFIG.json). I'm not sure if that alleviates the problem of having "other files" but the other files can be other Go files.
 

Brad Fitzpatrick

Mar 2, 2015, 12:56:19 PM
to Brendan Tracey, golang-dev
On Mon, Mar 2, 2015 at 9:49 AM, Brendan Tracey <tracey....@gmail.com> wrote:
Who is the we in “Our proposal is that”?

me, Rob, Russ, Andrew, Ian, David Crawshaw, David Symonds, Sameer, Alan Donovan, et al.

Russ Cox

Mar 2, 2015, 12:56:30 PM
to Aram Hăvărneanu, Brad Fitzpatrick, golang-dev
On Mon, Mar 2, 2015 at 12:46 PM, Aram Hăvărneanu <ara...@mgk.ro> wrote:
One of the nicest things about Go is that all the information required
to build Go programs is in the source code, i.e. in .go files, and not
in any other files (Makefiles, .cfg files, etc).

Not really true. cmd/internal/gc has information in a .y file. go generate arranges to write out the files needed to conform to the usual setup, without teaching the usual setup about yacc. The same thing is going on here. The tool (godep, nut, whatever) would use the config file to write the files needed to conform to the usual setup. This config file would only be read by that tool, not by the go command.

Russ

Nathan Youngman

Mar 2, 2015, 1:37:16 PM
to golan...@googlegroups.com, ara...@mgk.ro, brad...@golang.org

It's good to see this discussion happening.

Agreeing on a common file format and file layout below /internal/ would be a huge step forward.

There are some philosophical differences between nut and godep. Is the file generated from current imports or created manually? Does it only contain SHA commits or can it include branches or tags?

I think it's great to have multiple independent implementations (at first), to see what works better in the wild.

Nathan.

Rodrigo Kochenburger

Mar 2, 2015, 2:10:24 PM
to Nathan Youngman, golan...@googlegroups.com, ara...@mgk.ro, brad...@golang.org
I really like the fact we are discussing this as the community has a lot to benefit from a standard approach.

One thing I love about Go is how the workflow flows. It's smooth. How do you guys envision the workflow for: a) adding a new dependency; b) updating a dependency?

Does the developer have to manually update the dependency file (similar to nut) or is the dependency file generated from the code (similar to godep)? In which step would the import path rewrite happen? 

Also, will library packages also have dependencies specified like that or only application/main packages? What happens to the dependencies of dependencies? For example, my application (github.com/divoxx/app) imports "github.com/foo/foo", which in turn imports "github.com/bar/bar". After the rewrite, are they both gonna be "github.com/divoxx/app/internal/github.com/foo/foo" and "github.com/divoxx/app/internal/github.com/bar/bar" respectively?




Brad Fitzpatrick

Mar 2, 2015, 2:20:34 PM
to Rodrigo Kochenburger, Nathan Youngman, golang-dev, Aram Hăvărneanu
On Mon, Mar 2, 2015 at 11:10 AM, Rodrigo Kochenburger <div...@gmail.com> wrote:
I really like the fact we are discussing this as the community has a lot to benefit from a standard approach.

One thing I love about Go is how the workflow flows. It's smooth. How do you guys envision the workflow for: a) adding a new dependency; b) updating a dependency?

Does the developer have to manually update the dependency file (similar to nut) or is the dependency file generated from the code (similar to godep)? In which step would the import path rewrite happen? 

We're not looking to specify either of those right now. We only care about standardizing the configuration file format at this time. (We might say "for Go 1.5" by accident, but this has nothing to do with Go 1.5's release cycle timing... this could happen tomorrow or in 5 months.)
 
Also, will library packages also have dependencies specified like that or only application/main packages?

The mechanism should work for libraries too, but I think we'll discourage overuse of it for libraries. We imagine it'll be mostly used for package main.

What happens to the dependencies of dependencies? For example, my application (github.com/divoxx/app) imports "github.com/foo/foo", which in turn imports "github.com/bar/bar". After the rewrite, are they both gonna be "github.com/divoxx/app/internal/github.com/foo/foo" and "github.com/divoxx/app/internal/github.com/bar/bar" respectively?

That's up to the tool.  There might also be a mechanism to declare packages which are safe for duplication (no internal state in package-level vars) vs those which are not.  Then the vendoring tool can error if it finds a problem.

Rodrigo Kochenburger

Mar 2, 2015, 2:28:30 PM
to Brad Fitzpatrick, Nathan Youngman, golang-dev, Aram Hăvărneanu
Okay, gotcha. I also think Godeps seems to be the easiest and most comprehensive format out there. I like the fact that it also has the go version specified in it, which AFAIK is not enforced in any way but potentially could be.

Brad Fitzpatrick

Mar 2, 2015, 2:29:36 PM
to Rodrigo Kochenburger, Nathan Youngman, golang-dev, Aram Hăvărneanu
On Mon, Mar 2, 2015 at 11:28 AM, Rodrigo Kochenburger <div...@gmail.com> wrote:
Okay, gotcha. I also think Godeps seems to be the easiest and most comprehensive format out there. I like the fact that it also has the go version specified in it, which AFAIK is not enforced in any way but potentially could be.

I removed it from the example because it wasn't clear what it even meant.
 

ben.d...@gmail.com

Mar 2, 2015, 2:38:37 PM
to golan...@googlegroups.com
On Monday, March 2, 2015 at 12:38:02 PM UTC-5, Brad Fitzpatrick wrote:

In Google’s internal source tree, we vendor (copy) all our dependencies into our source tree and have at most one copy of any given external library. We have the equivalent of only one GOPATH and rewrite our imports to refer to our vendored copy. For example, Go code inside Google wanting to use “golang.org/x/crypto/openpgp” would instead import it as something like “google/third_party/golang.org/x/crypto/openpgp”. This has worked out very well for us and we get reproducible builds. If we ever want to update openpgp in our vendored directory, we update it, verify that all the affected code in the world still compiles and tests pass (making code changes as necessary), and then check it in. The world compiles and passes at all points in the version control system’s history, and if we bisect in time, we see which version of an external library was in use at any point.


There is a crucial difference between Google's use of import rewriting and how this would be used in the outside world: the third_party directory (which I assume would be renamed to "internal" under this proposal) exists in a scope that spans many projects, so google/gmail, google/plus, and google/selfdrivingcar can all share the same set of vendored packages. Outside of Google, there is no accessible common ancestor directory, so when two packages share a common dependency they must each import their own version of it into their own vendor directory.

For a concrete example, cockroachdb currently depends on etcd, and they both depend on golang.org/x/net/context. Etcd vendors their dependencies and so cockroach uses the unsightly import path "github.com/coreos/etcd/Godeps/_workspace/src/golang.org/x/net/context". If "Godeps/_workspace" were moved under "internal", we couldn't do that and we'd end up with two copies of the library (for net/context I think that would still work since the main thing used from that package is an interface, but it wouldn't work with e.g. glog).

Granted, this situation is a bit unusual since etcd is primarily intended to be an end-user executable rather than a reusable go library, and we intend to break out the parts that are used by cockroachdb (the 'raft' subpackage) into a separate package/repo, but that just moves the problem around. The new raft package would either have to vendor its dependencies and rewrite its imports, or use canonical import paths for everything with no way to control dependency versions even for its own tests. 

Personally, while I see the value of import-rewriting for corporate environments where you have one large meta-project, I think the better solution for the open-source world is to avoid rewriting imports and instead improve tooling support for a per-project GOPATH (I now let the emacs package go-projectile manage my GOPATH, which works very well for my workflow) and to use something like gpm or goop to pin dependency versions.

-Ben

Axel Wagner

Mar 2, 2015, 3:44:53 PM
to Brad Fitzpatrick, golang-dev
Hi,

the gist: I strongly dislike vendoring because
1. You end up with more than one copy of the same code in your binary
   (unless there is some magic possible that I am not aware of)
2. It discourages contribution to upstream (because it is quicker to fix
   it in the vendored copy, right?)
3. As a consequence it creates diverging versions of the same library
4. And possibly licensing issues (as observed with ruby)
5. It puts a lot of burden onto package maintainers, as distributions
   normally disallow vendoring (for security and maintainability concerns)

At the very least I would want to discourage vendoring for the creators
of libraries.

The rant (I will probably not say a lot more about this, but I at least
wanted to put my concerns out there):

I am a big non-fan of vendoring. My main question with this proposal is:
What happens, when two different packages vendor the same dependency?
Will we end up with two copies of the dependency? Will that be a
problem? I could see problems both in codesize and with stuff like
sql/driver, where you would effectively end up with two disjoint sources
for drivers, if the package gets duplicated.

I dislike the comparison with Google's mechanism. A big difference is,
that nobody imports Google's internal packages, so the above problem (or
any below) doesn't arise. Google's approach works, because it is
basically the same as the approach in a distribution:
third_party/foo/bar corresponds to a package in a distribution with
distribution-specific patches. As you have one codebase with one
controlling authority, you can do global updates and more or less atomic
changes over all of the codebase. This stops working with an
open-source ecosystem, where you have in part little to no control over
upstreams. Instead of updating a library on my system and maybe fixing
up dependencies, if there are build-failures, I now have to wait for all
upstreams to update their vendored versions (or in turn have to vendor
*them* and take over some maintainership responsibilities).

Vendoring is the solution the ruby-community has chosen, apparently, and
at least as far as I can tell, that is the main reason, why jekyll in
debian has been broken for year(s?). At least I gave up trying to
package it after having to figure out bugs in all the transitive
vendored dependencies and having to deal with the licensing jungle
(because once stuff gets vendored anyway, you can just add random files
right? Without regard for what is licensed how. Or even annotating what
files are from where).

I think vendoring is the wrong solution for reproducible
builds. A reproducible build only requires all the version information of
all transitive dependencies used to build a binary plus the version of
the go toolchain and stdlib used (btw: Does godep or nut address that?
Just out of curiosity).

Therefore I feel that reproducible builds are a red herring, when
advocating for vendoring. I think the main reason, why vendoring became
popular is, that it makes it easier to deal with API-incompatibilities
of upstream and to make software go-gettable. But I think this is a bad
optimization goal. Again, gem/pip/npm are things chosen by other
languages and I at least will try everything I can *not* to install
something that depends on any of those. I prefer clean releases and
*one* Package manager. Not everyone wants to install a go toolchain,
just to install one tool, just as I don't want to install a node.js
toolchain just to use the keybase.io CLI :)

I fear that with library vendoring, we are ultimately giving up on any
API-stability, because we don't even have to deal with an upstream that
changes their API all the time, we can just vendor and ship with a
frozen-in-time version…

minux

Mar 2, 2015, 3:51:14 PM
to Russ Cox, Aram Hăvărneanu, Brad Fitzpatrick, golang-dev
Then this view suggests indirectly that the go tool will never gain the ability to use such a file
to import new revisions?

Keith Rarick

Mar 2, 2015, 3:52:24 PM
to Brad Fitzpatrick, golang-dev
This proposal sounds good to me.
Especially using "internal"; it makes sense.

I'll make any changes necessary in godep to
work with the format and file paths we agree on.

I don't care about the details of the config file
format. I just want it to be possible to generate
it automatically from scratch from an existing
set of go source code and dependency code,
if the user so desires.

Keith Rarick

Mar 2, 2015, 3:56:48 PM
to Brad Fitzpatrick, Rodrigo Kochenburger, Nathan Youngman, golang-dev, Aram Hăvărneanu
>> Also, will library packages also have dependencies specified like that or
>> only application/main packages?
>
> The mechanism should work for libraries too, but I think we'll discourage
> overuse of it for libraries. We imagine it'll be mostly used for package
> main.

I support discouraging this for libraries.

If a library P does this with its dependency D,
then someone who wants to use both P and D
is required to also use a vendoring tool (unless
it happens to be ok to link in two copies of D).
That is annoying, especially for a person or
project that's just getting started.

It's much nicer to be able to 'go get P D' and
import P and D and get to work.

Keith Rarick

Mar 2, 2015, 4:04:52 PM
to Axel Wagner, Brad Fitzpatrick, golang-dev
On Mon, Mar 2, 2015 at 12:10 PM, Axel Wagner
<axel.wa...@googlemail.com> wrote:
> Reproducible build only requires all the version information of
> all transitive dependencies used to build a binary plus the version of
> the go toolchain and stdlib used

Unfortunately, experience shows this is not true.

A reproducible build requires the source code needed to make
the build. With vendoring, you have the source code. Without it,
you need not only the version information but also a *means* to
acquire the source code—a network that is functional and fast
enough. You might be surprised how often that requirement is
not met.

> (btw: Does godep or nut address that?
> Just out of curiosity).

Godep records the output of "go version" when it generates
the file Godeps.json.

Nathan Youngman

Mar 2, 2015, 4:23:47 PM
to Keith Rarick, Brad Fitzpatrick, Rodrigo Kochenburger, golang-dev, Aram Hăvărneanu

Libraries that depend on libraries other than the standard library do complicate things.

I've had good success with shallow dependencies:

main (Godep) -> library (gopkg.in or go get) -> standard library


So while the directory structure and revisions file may support vendoring within libraries, I would also discourage it.

Nathan.

--
Nathan Youngman 
Email: he...@nathany.com
Web: http://www.nathany.com

Sébastien Douche

Mar 2, 2015, 4:41:31 PM
to golan...@googlegroups.com
On Mon, 2 Mar 2015, at 18:46, Aram Hăvărneanu wrote:
> One of the nicest things about Go is that all the information required
> to build Go programs is in the source code, i.e. in .go files, and not
> in any other files (Makefiles, .cfg files, etc).

Hmm, I would say "the nicest thing about Go is you don't need an external
tool to build Go programs".


--
Sébastien Douche <s...@nmeos.net>
Twitter: @sdouche
http://douche.name

Owen Ou

Mar 2, 2015, 4:46:50 PM
to golan...@googlegroups.com

It's good to see such discussion going on. I'm the author of nut so my comments might be biased :)

It sounds like there're two things nut does differently from Brad's suggestions:

  1. the format of the config file
  2. the file structure for vendored dependencies

For 1), I'm open to suggestions as long as the format is clear and concise. Currently nut adopts TOML as the config format. An example of declaring dependencies:

[dependencies]

"rsc.io/arm/armasm" = "616aea947362"
"rsc.io/x86/x86asm" = "af2970a7819d"

I personally think this is very clear and less noisy than JSON. But I understand the preference for using something the Go standard library can parse. It wouldn't be hard for nut to support JSON as the config file. But I think the proposed JSON schema is a bit verbose: the keys "ImportPath" and "Rev" seem unnecessary. How about removing the keys?

{
    "Deps": {
        "rsc.io/arm/armasm": "616aea947362",
        "rsc.io/x86/x86asm": "af2970a7819d"
    }
}

For 2), I don't have a problem with renaming "vendor" as in nut to "internal": making vendored dependencies only accessible to the current package makes sense.

Brendan Tracey

Mar 2, 2015, 4:52:54 PM
to Brad Fitzpatrick, Rodrigo Kochenburger, Nathan Youngman, golang-dev, Aram Hăvărneanu

Also, will library packages also have dependencies specified like that or only application/main packages?

The mechanism should work for libraries too, but I think we'll discourage overuse of it for libraries. We imagine it'll be mostly used for package main.

What is the suggested behavior for libraries? I must be missing something, because this mentality does not seem in the spirit of Go. Go is excellent at supporting programming in the large. As programs get richer and as the ecosystem grows, there will inevitably be a hierarchy of libraries. The standard library will only implement so many ideas. I don’t intend to be critical, I just don’t see the intended vision.

Andrew Gerrand

Mar 2, 2015, 4:54:11 PM
to ben.d...@gmail.com, golang-dev
On 3 March 2015 at 06:34, <ben.d...@gmail.com> wrote:
Granted, this situation is a bit unusual since etcd is primarily intended to be an end-user executable rather than a reusable go library, and we intend to break out the parts that are used by cockroachdb (the 'raft' subpackage) into a separate package/repo, but that just moves the problem around. The new raft package would either have to vendor its dependencies and rewrite its imports, or use canonical import paths for everything with no way to control dependency versions even for its own tests. 

Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages. This particular problem could be solved by:
a) a policy of API stability adhered to by the raft subpackage and its dependent packages, 
b) testing infrastructure that is version-aware.

Personally, while I see the value of import-rewriting for corporate environments where you have one large meta-project, I think the better solution for the open-source world is to avoid rewriting imports and instead improve tooling support for a per-project GOPATH (I now let the emacs package go-projectile manage my GOPATH, which works very well for my workflow) and to use something like gpm or goop to pin dependency versions.

I think we can develop the tools to make it easier to test packages with different versions of their dependencies, and to make it easier for the ultimate consumer (a program binary) to vendor the correct dependencies (or provide diagnostics in the rare case of incompatible dependency versions). Those tools may be [based on] projects like gpm or goop.

Right now I'm working on an API stability policy similar to the Go 1 compatibility promise for the gokit project and its dependencies. If that goes well, maybe the greater Go community can adopt the policy for their projects.

Andrew

Andrew Gerrand

Mar 2, 2015, 4:56:01 PM
to Aram Hăvărneanu, Brad Fitzpatrick, golang-dev
This property doesn't change; you won't need non-Go files to *build* Go programs. You'll just need the metadata to update a program's dependencies, and that necessary metadata is not currently encoded in Go source code anywhere.

Andrew Gerrand

Mar 2, 2015, 4:58:02 PM
to Owen Ou, golang-dev

On 3 March 2015 at 08:46, Owen Ou <jing...@gmail.com> wrote:
I personally think this is very clear and less noisy than JSON. But I understand the preference for using something the Go standard library can parse. It wouldn't be hard for nut to support JSON as the config file. But I think the proposed JSON schema is a bit verbose: the keys "ImportPath" and "Rev" seem unnecessary. How about removing the keys?

One advantage to Brad's proposed syntax is that we can add extra fields (if need be) while maintaining backward compatibility with older tools.
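
The backward-compatibility point can be illustrated with encoding/json, which by default ignores object keys that have no matching struct field. In the sketch below, an "older tool" that only knows ImportPath and Rev still decodes a file that carries an extra key. The "Comment" key and the function decodeOld are purely hypothetical; nobody in this thread has proposed such a field.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dep is what an older tool knows about: only these two fields.
type Dep struct {
	ImportPath string
	Rev        string
}

type Config struct {
	Deps []Dep
}

// decodeOld (hypothetical) decodes a config the way an older tool would.
// Keys it does not know about are silently skipped by json.Unmarshal.
func decodeOld(data []byte) (Config, error) {
	var c Config
	err := json.Unmarshal(data, &c)
	return c, err
}

func main() {
	// "Comment" stands in for some field added by a newer tool.
	const newer = `{
   "Deps": [
       {"ImportPath": "rsc.io/arm/armasm", "Rev": "616aea947362", "Comment": "hypothetical extra field"}
   ]
}`
	c, err := decodeOld([]byte(newer))
	if err != nil {
		panic(err)
	}
	fmt.Println(c.Deps[0].ImportPath, c.Deps[0].Rev)
}
```

With the flat map form ("import path": "revision"), there is no per-dependency object to extend, so adding metadata later would mean changing the value type.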

 

Burcu Dogan

Mar 2, 2015, 5:16:26 PM
to Andrew Gerrand, ben.d...@gmail.com, golang-dev
> Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages.

Could you clarify what this exactly means? How am I supposed to vendor
a revision of package x, if that revision doesn't know which revision
of package y to depend on? In any case, a revision has to know about
the revision it is depending on.

And if the packages need to vendor, there is a critical problem with
this proposal.

> officially recommends vendoring into an “internal” directory with import rewriting (not GOPATH modifications) as the canonical way to pin dependencies.

How am I supposed to export a symbol from a vendored package?

package a

type Doer interface


package b

func X(doer a.Doer)


a's import path will be visible to package b only.

Burcu Dogan

Mar 2, 2015, 5:18:37 PM
to Andrew Gerrand, ben.d...@gmail.com, golang-dev
Ah, Gmail quoted and wrapped the bottom half of my email. Resending that part.

And if the packages need to vendor, there is a critical problem with
this proposal.

How am I supposed to export a symbol from a vendored package?

package a

type Doer interface


package b

func X(doer a.Doer)


a's import path will be visible to package b only.

Owen Ou

Mar 2, 2015, 5:23:23 PM
to golan...@googlegroups.com, jing...@gmail.com

On Monday, March 2, 2015 at 1:58:02 PM UTC-8, Andrew Gerrand wrote:

One advantage to Brad's proposed syntax is that we can add extra fields (if need be) while maintaining backward compatibility with older tools.


Ben Darnell

Mar 2, 2015, 5:24:41 PM
to Andrew Gerrand, golang-dev
On Mon, Mar 2, 2015 at 4:53 PM, Andrew Gerrand <a...@golang.org> wrote:

On 3 March 2015 at 06:34, <ben.d...@gmail.com> wrote:
Granted, this situation is a bit unusual since etcd is primarily intended to be an end-user executable rather than a reusable go library, and we intend to break out the parts that are used by cockroachdb (the 'raft' subpackage) into a separate package/repo, but that just moves the problem around. The new raft package would either have to vendor its dependencies and rewrite its imports, or use canonical import paths for everything with no way to control dependency versions even for its own tests. 

Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages. This particular problem could be solved by:
a) a policy of API stability adhered to by the raft subpackage and its dependent packages, 

The raft package was developed in tandem with the rest of etcd; it is only now reaching a point of API stability (after validating its interfaces with usage in two applications). In a world of rewritten imports (where at least some dependencies are like glog and must not be duplicated), reusing a piece of an application is a significant burden. Before we could even begin to use raft in cockroachdb, we would have had to move the master copy of the raft code out of etcd (into a new repo which would be developed under "package rules" instead of "binary rules") and vendor it back in to etcd. (we are in fact undertaking this work now, but only after the experiment has proven successful) I don't like the strict division of packages into "those which are so closely tied to a particular binary that they use its rewritten imports" and "those which use 'go get' for their own dependencies but should be vendored into any application that uses them"

 
b) testing infrastructure that is version-aware.

But if we had this version-aware infrastructure (which I admit is a harder problem), would we still want to rewrite imports? I claim that for purposes of reproducible builds it's better to place the entire GOPATH under version control (as long as you can have per-project GOPATHs) than to use an internal/vendor directory to ensure that all your dependencies share a common prefix deep in the source tree.
 

 

Personally, while I see the value of import-rewriting for corporate environments where you have one large meta-project, I think the better solution for the open-source world is to avoid rewriting imports and instead improve tooling support for a per-project GOPATH (I now let the emacs package go-projectile manage my GOPATH, which works very well for my workflow) and to use something like gpm or goop to pin dependency versions.

I think we can develop the tools to make it easier to test packages with different versions of their dependencies, and to make it easier for the ultimate consumer (a program binary) to vendor the correct dependencies (or provide diagnostics in the rare case of incompatible dependency versions). Those tools may be [based on] projects like gpm or goop.

Yes. I agree that the needs of library developers are different from the needs of application developers. Libraries must be as broad as possible in their dependencies to minimize conflicts (hopefully just a minimum version; occasionally a maximum or range) while applications want to pin things down exactly. I'd just like for this difference to be as localized as possible (e.g. "use == instead of >= in your dependencies.json file") instead of completely changing the workflow.

-Ben

zel...@gmail.com

Mar 2, 2015, 5:49:03 PM
to golan...@googlegroups.com, jing...@gmail.com
On Monday, March 2, 2015 at 1:58:02 PM UTC-8, Andrew Gerrand wrote:
On 3 March 2015 at 08:46, Owen Ou <jing...@gmail.com> wrote:
[...] I think the proposed JSON schema is a bit verbose: the keys of "ImportPath" and "Rev" seem unnecessary. How about removing the keys?

One advantage to Brad's proposed syntax is that we can add extra fields (if need be) while maintaining backward compatibility with older tools.

Agreed. As an example, I could see particular tools adopting conventions for noting which (semver.org -style) version of dependencies you were aiming for, and trying to merge shared dependencies.
Like Brad's "safe for duplication declaration" mentioned above, this would require some kind of metadata convention too, so I'm glad we're not trying here to decide what color to paint that bikeshed.
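
A small sketch of the backward-compatibility point, assuming a config that is a JSON array of objects with the "ImportPath" and "Rev" keys discussed above. The "Version" key, the revision value, and the names Dep and parseDeps are invented for illustration; no tool is known to write this exact file. It relies on the fact that Go's encoding/json ignores object keys that have no matching struct field, so an older tool keeps working when a newer tool adds fields. With a keyless, positional format there would be no comparable place to put an extra value.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dep is what a hypothetical older tool knows about: only the two keys
// named in the thread.
type Dep struct {
	ImportPath string
	Rev        string
}

// parseDeps decodes a JSON array of dependency records. Keys that Dep
// does not declare are silently skipped by encoding/json.
func parseDeps(data []byte) ([]Dep, error) {
	var deps []Dep
	if err := json.Unmarshal(data, &deps); err != nil {
		return nil, err
	}
	return deps, nil
}

func main() {
	// "Version" is an invented extra field that a newer tool might write;
	// "abc123" is a placeholder revision.
	data := []byte(`[
	  {"ImportPath": "golang.org/x/crypto/openpgp", "Rev": "abc123", "Version": "1.2.0"}
	]`)
	deps, err := parseDeps(data)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, d := range deps {
		fmt.Println(d.ImportPath, d.Rev)
	}
}
```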

Zellyn

Andrew Gerrand

Mar 2, 2015, 6:36:28 PM
to Ben Darnell, golang-dev
On 3 March 2015 at 09:24, Ben Darnell <ben.d...@gmail.com> wrote:
On Mon, Mar 2, 2015 at 4:53 PM, Andrew Gerrand <a...@golang.org> wrote:

On 3 March 2015 at 06:34, <ben.d...@gmail.com> wrote:
Granted, this situation is a bit unusual since etcd is primarily intended to be an end-user executable rather than a reusable go library, and we intend to break out the parts that are used by cockroachdb (the 'raft' subpackage) into a separate package/repo, but that just moves the problem around. The new raft package would either have to vendor its dependencies and rewrite its imports, or use canonical import paths for everything with no way to control dependency versions even for its own tests. 

Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages. This particular problem could be solved by:
a) a policy of API stability adhered to by the raft subpackage and its dependent packages, 

The raft package was developed in tandem with the rest of etcd; it is only now reaching a point of API stability (after validating its interfaces with usage in two applications). In a world of rewritten imports (where at least some dependencies are like glog and must not be duplicated), reusing a piece of an application is a significant burden. Before we could even begin to use raft in cockroachdb, we would have had to move the master copy of the raft code out of etcd (into a new repo which would be developed under "package rules" instead of "binary rules") and vendor it back into etcd. (We are in fact undertaking this work now, but only after the experiment has proven successful.) I don't like the strict division of packages into "those which are so closely tied to a particular binary that they use its rewritten imports" and "those which use 'go get' for their own dependencies but should be vendored into any application that uses them".

I think it's reasonable to expect some difficulty when re-using the internal packages of larger projects. Until those packages are officially supported by their authors, you're kind of in "here be dragons" territory. I don't think this proposal makes such issues better or worse. 
  
b) testing infrastructure that is version-aware.

But if we had this version-aware infrastructure (which I admit is a harder problem), would we still want to rewrite imports? I claim that for purposes of reproducible builds it's better to place the entire GOPATH under version control (as long as you can have per-project GOPATHs) than to use an internal/vendor directory to ensure that all your dependencies share a common prefix deep in the source tree.

There's an argument for keeping all the various mechanical pieces small and simple. It's definitely possible to imagine some larger infrastructure that manages everything for us, from end to end, but that's not the direction we have taken so far.

 

Personally, while I see the value of import-rewriting for corporate environments where you have one large meta-project, I think the better solution for the open-source world is to avoid rewriting imports and instead improve tooling support for a per-project GOPATH (I now let the emacs package go-projectile manage my GOPATH, which works very well for my workflow) and to use something like gpm or goop to pin dependency versions.

I think we can develop the tools to make it easier to test packages with different versions of their dependencies, and to make it easier for the ultimate consumer (a program binary) to vendor the correct dependencies (or provide diagnostics in the rare case of incompatible dependency versions). Those tools may be [based on] projects like gpm or goop.

Yes. I agree that the needs of library developers are different from the needs of application developers. Libraries must be as broad as possible in their dependencies to minimize conflicts (hopefully just a minimum version; occasionally a maximum or range) while applications want to pin things down exactly. I'd just like for this difference to be as localized as possible (e.g. "use == instead of >= in your dependencies.json file") instead of completely changing the workflow.

What I'm talking about is codifying our existing workflows and formal documentation of the stability of packages, not changing workflows entirely.

Andrew Gerrand

Mar 2, 2015, 6:38:45 PM
to Burcu Dogan, Ben Darnell, golang-dev
On 3 March 2015 at 09:16, Burcu Dogan <j...@google.com> wrote:
> Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages.

Could you clarify what this exactly means? How am I supposed to vendor
a revision of package x, if that revision doesn't know which revision
of package y to depend on? In any case, a revision has to know about
the revision it is depending on.

And if the packages need to vendor, there is a critical problem with
this proposal.

I'm saying the maintainers of the binary should vendor all of its transitive dependencies, if they vendor at all.
 
The package maintainers should not vendor anything.

> officially recommends vendoring into an “internal” directory with import rewriting (not GOPATH modifications) as the canonical way to pin dependencies.

How am I supposed to export a symbol from a vendored package?

package a

type Doer interface


package b

func X(doer a.Doer)


a's import path will be visible to package b only.

I don't understand the question. As maintainer of the code en masse you have visibility into all import paths. Maybe my answer to your first question helps?

Keith Rarick

Mar 2, 2015, 6:46:50 PM
to Andrew Gerrand, Owen Ou, golang-dev
For example: if the program has vendored a patch to one
of its dependencies and the patch is not yet merged upstream,
the author might want their tool to include an alternative URL
where the patch has been published.

Or: godep currently includes the output of "git describe --tags"
(or a similarly descriptive command for hg and bzr) for each
dependency, for the benefit of any human who is reading the file.

Russ Cox

Mar 2, 2015, 6:47:41 PM
to Andrew Gerrand, Burcu Dogan, Ben Darnell, golang-dev
People seem to agree that libraries should not vendor other libraries without a good reason. That's actually beyond the scope here. There is some question about what happens if libraries *do* vendor other libraries, and that gets to the heart of the proposal. 

If there is an agreed-upon vendoring approach (import path rewriting) and data format that describes what vendoring did happen, then no matter what vendoring helper did it, another tool (or perhaps the same one) can come along and analyze the situation and either just fix it or help the programmer fix it with minimal interactions.

On the other hand, if the vendoring tools use different semantics or even just different config files, this kind of metatool becomes much more difficult.

The goal here is to (1) agree on import path rewriting (as opposed to dynamic GOPATH tweaking or other complications), and (2) agree on a config file format that records the rewriting that happened, so that different projects can use different tools and still interoperate, both for just building things and for using meta tools that provide things like deduplication and diamond crushing.
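
As a rough illustration of what such a meta tool could do once every vendoring tool records the same facts, here is a sketch in Go. It assumes each config entry carries the original import path, the pinned revision, and the rewritten path; the VendoredDep type, its VendoredAs field, findConflicts, and the example.com paths and revisions are all hypothetical and not part of any agreed format. Only glog is taken from the thread, as an example of a package that must not be duplicated.

```go
package main

import (
	"fmt"
	"sort"
)

// VendoredDep is a hypothetical record of one rewrite: which upstream
// package was copied, at which revision, and where it now lives.
type VendoredDep struct {
	ImportPath string // original (canonical) import path
	Rev        string // pinned revision
	VendoredAs string // rewritten import path (assumed field)
}

// findConflicts groups records by original import path and returns the
// paths that were vendored at more than one revision, each with its
// sorted list of distinct revisions.
func findConflicts(deps []VendoredDep) map[string][]string {
	revs := make(map[string]map[string]bool)
	for _, d := range deps {
		if revs[d.ImportPath] == nil {
			revs[d.ImportPath] = make(map[string]bool)
		}
		revs[d.ImportPath][d.Rev] = true
	}
	conflicts := make(map[string][]string)
	for path, set := range revs {
		if len(set) < 2 {
			continue // one revision everywhere: safe to deduplicate
		}
		for r := range set {
			conflicts[path] = append(conflicts[path], r)
		}
		sort.Strings(conflicts[path])
	}
	return conflicts
}

func main() {
	// Records as they might be collected from two projects' config files.
	deps := []VendoredDep{
		{"github.com/golang/glog", "rev1", "example.com/app/internal/github.com/golang/glog"},
		{"github.com/golang/glog", "rev2", "example.com/lib/internal/github.com/golang/glog"},
		{"golang.org/x/crypto/openpgp", "rev9", "example.com/app/internal/golang.org/x/crypto/openpgp"},
	}
	for path, revs := range findConflicts(deps) {
		fmt.Println("conflict:", path, revs)
	}
}
```

A real tool would go on to pick a revision or ask the programmer; the sketch stops at detection, which is the part that depends on a shared config format.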

Russ

Ben Darnell

Mar 2, 2015, 6:59:26 PM
to Andrew Gerrand, golang-dev
On Mon, Mar 2, 2015 at 6:35 PM, Andrew Gerrand <a...@golang.org> wrote:



The raft package was developed in tandem with the rest of etcd; it is only now reaching a point of API stability (after validating its interfaces with usage in two applications). In a world of rewritten imports (where at least some dependencies are like glog and must not be duplicated), reusing a piece of an application is a significant burden. Before we could even begin to use raft in cockroachdb, we would have had to move the master copy of the raft code out of etcd (into a new repo which would be developed under "package rules" instead of "binary rules") and vendor it back into etcd. (We are in fact undertaking this work now, but only after the experiment has proven successful.) I don't like the strict division of packages into "those which are so closely tied to a particular binary that they use its rewritten imports" and "those which use 'go get' for their own dependencies but should be vendored into any application that uses them".

I think it's reasonable to expect some difficulty when re-using the internal packages of larger projects. Until those packages are officially supported by their authors, you're kind of in "here be dragons" territory. I don't think this proposal makes such issues better or worse. 

This proposal does make some such issues a bit worse by requiring (or at least encouraging) the use of an "internal" namespace for vendored packages. Currently we can experiment in "here be dragons" territory by importing another application's vendored packages; that becomes more difficult (at least in some cases) if such packages become internal. But your point is taken; in the absence of a specific counter-proposal it is probably better to move forward with increased standardization in this area since vendoring with rewritten imports does solve at least some of the versioning problem.

-Ben