Gophers,
One of the most frequent questions we’ve received since Go 1 was how to deal with dependencies and their versions. We’ve never recommended any particular answer.
In Google’s internal source tree, we vendor (copy) all our dependencies into our source tree and have at most one copy of any given external library. We have the equivalent of only one GOPATH and rewrite our imports to refer to our vendored copy. For example, Go code inside Google wanting to use “golang.org/x/crypto/openpgp” would instead import it as something like “google/third_party/golang.org/x/crypto/openpgp”. This has worked out very well for us and we get reproducible builds. If we ever want to update openpgp in our vendored directory, we update it, verify that all the affected code in the world still compiles and tests pass (making code changes as necessary), and then check it in. The world compiles and passes at all points in the version control system’s history, and if we bisect in time, we see which version of an external library was in use at any point.
We’re starting to see others in the Go community do the same, with tools like godep and nut.
We think it’s time to start addressing the dependency & vendoring issue, especially before too many conflicting tools arise and fragment best practices in the Go ecosystem, unnecessarily complicating tooling. It would be nice if the community could converge on a standard way to vendor.
Our proposal is that the Go project:
1) officially recommends vendoring into an “internal” directory with import rewriting (not GOPATH modifications) as the canonical way to pin dependencies,
2) defines a common config file format for dependencies & vendoring,
3) makes no code changes to cmd/go in Go 1.5. External tools such as “godep” or “nut” will implement 1) and 2). We can reevaluate including such a tool in Go 1.6+.
The important part is that as a community, we all do this the same way, so tooling can mature and interoperate.
In Go 1.5, the “internal” package mechanism introduced in Go 1.4 for the standard library will be extended to all go-gettable packages, so using the “internal” directory as the root of rewritten import paths makes sense (as opposed to “vendor” or “third_party”).
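The visibility rule behind "internal" can be sketched as a small check on import paths. This is a simplified illustration, not the go command's actual code: the names internalParent and canImport are made up for the example, and a top-level "internal" directory is treated as visible everywhere.

```go
package main

import (
	"fmt"
	"strings"
)

// internalParent returns the import path of the directory that contains the
// last "internal" element of importPath, and whether such an element exists.
func internalParent(importPath string) (string, bool) {
	elems := strings.Split(importPath, "/")
	for i := len(elems) - 1; i >= 0; i-- {
		if elems[i] == "internal" {
			return strings.Join(elems[:i], "/"), true
		}
	}
	return "", false
}

// canImport reports whether package importer may import package importee,
// under the rule that code outside the tree rooted at the parent of an
// "internal" directory must not import packages below that directory.
func canImport(importer, importee string) bool {
	parent, ok := internalParent(importee)
	if !ok {
		return true // no "internal" element, so no restriction
	}
	if parent == "" {
		return true // simplification for a top-level "internal" directory
	}
	return importer == parent || strings.HasPrefix(importer, parent+"/")
}

func main() {
	vendored := "cmd/internal/rsc.io/x86/x86asm"
	fmt.Println(canImport("cmd/objdump", vendored))       // importer inside the cmd tree
	fmt.Println(canImport("your.project/path", vendored)) // importer outside the cmd tree
}
```

With this rule, a vendored copy under a project's "internal" directory can only be imported by that project's own packages.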
Consider an existing use of vendoring in the Go source tree: $GOROOT/src/cmd/internal currently contains copies of “rsc.io/x86/x86asm” and “rsc.io/arm/armasm” as $GOROOT/src/cmd/internal/rsc.io/x86/x86asm/ and $GOROOT/src/cmd/internal/rsc.io/arm/armasm/, respectively. When we use those inside the Go tools, however, we import them as “cmd/internal/rsc.io/x86/x86asm” and not as “rsc.io/x86/x86asm”. (Although this example comes from the Go distribution repo, the effect is the same as a local project using $GOPATH/src/your.project/path instead of $GOROOT/src/cmd.)
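The mapping from an original import path to its vendored location is mechanical. A minimal sketch, assuming the layout described above; the helper names vendoredImportPath and vendoredDir and the workspace path are hypothetical:

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// vendoredImportPath returns the rewritten import path for dep when it is
// vendored under the "internal" directory of the package tree rooted at root.
func vendoredImportPath(root, dep string) string {
	return path.Join(root, "internal", dep)
}

// vendoredDir returns the directory holding the vendored copy, given a
// workspace directory (a GOPATH entry or GOROOT) that contains "src".
func vendoredDir(workspace, root, dep string) string {
	return filepath.Join(workspace, "src", filepath.FromSlash(vendoredImportPath(root, dep)))
}

func main() {
	// The Go distribution case from the text.
	fmt.Println(vendoredImportPath("cmd", "rsc.io/x86/x86asm"))
	// The equivalent for a local project.
	fmt.Println(vendoredImportPath("your.project/path", "rsc.io/x86/x86asm"))
	// Hypothetical workspace location, for illustration only.
	fmt.Println(vendoredDir("/home/gopher/go", "your.project/path", "rsc.io/arm/armasm"))
}
```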
We currently maintain those copies by hand. Instead, we want to write a file (filename and syntax to be determined), such as:
src/cmd/internal/TBDCONFIG.CFG:
“rsc.io/x86/x86asm” with revision af2970a7819d
“rsc.io/arm/armasm” with revision 616aea947362
And then your vendoring tool (such as “godep” or “nut”) would read TBDCONFIG.CFG and write out,
src/cmd/internal/rsc.io/x86/x86asm/*.go
src/cmd/internal/rsc.io/arm/armasm/*.go
rewriting imports and import comments in these files for the new location. It may also optionally change your source so any occurrence of
import “rsc.io/x86/x86asm”
becomes
import “cmd/internal/rsc.io/x86/x86asm”
The vendoring tool would be responsible for generating errors on conflicts or missing dependencies.
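The import-rewriting step could be built on the standard library's parser and printer. The following is only a sketch of that idea, not how godep or nut work: rewriteImports is a hypothetical helper, it handles import declarations only, and it does not touch import comments.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
)

// rewriteImports parses the Go source in src and, for every import whose
// path is listed in deps, prefixes that path with prefix. It returns the
// reformatted source and the list of rewritten import paths.
func rewriteImports(src, prefix string, deps map[string]bool) (string, []string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", nil, err
	}
	var rewritten []string
	for _, imp := range file.Imports {
		p, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return "", nil, err
		}
		if !deps[p] {
			continue // not a vendored dependency, leave it alone
		}
		// Record the old end position; by default it is derived from the
		// length of the path literal, which is about to change.
		imp.EndPos = imp.End()
		newPath := prefix + "/" + p
		imp.Path.Value = strconv.Quote(newPath)
		rewritten = append(rewritten, newPath)
	}
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, file); err != nil {
		return "", nil, err
	}
	return buf.String(), rewritten, nil
}

func main() {
	src := `package p

import "rsc.io/x86/x86asm"

var _ = x86asm.Decode
`
	deps := map[string]bool{"rsc.io/x86/x86asm": true}
	out, rewritten, err := rewriteImports(src, "cmd/internal", deps)
	if err != nil {
		fmt.Println("rewrite failed:", err)
		return
	}
	fmt.Println("rewritten:", rewritten)
	fmt.Print(out)
}
```

Working on the syntax tree rather than on raw text avoids touching string literals or comments that merely mention an import path.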
The thing that we as a community need to figure out is the recommended configuration file format.
We’d prefer something that the Go standard library can already parse easily. That includes XML, JSON, and Go. Nobody likes XML, so that leaves JSON and Go.
godep already uses JSON, as do Go tools themselves (e.g. “go list -json fmt”), so we’ll probably want something like godep’s JSON format:
{
"Deps": [
{
"ImportPath": "rsc.io/arm/armasm",
"Rev": "616aea947362"
},
{
"ImportPath": "rsc.io/x86/x86asm",
"Rev": "af2970a7819d"
}
]
}
We can start with that for discussion.
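A file in that shape is easy to read with encoding/json. Here is a minimal sketch, assuming the two fields shown above; the type names Dep and Config, the function parseConfig, and the particular validation rules are illustrative, not part of any agreed format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Dep pins one vendored package to a revision.
type Dep struct {
	ImportPath string
	Rev        string
}

// Config mirrors the JSON document shown in the discussion.
type Config struct {
	Deps []Dep
}

// parseConfig decodes the JSON and rejects incomplete or conflicting entries.
func parseConfig(data []byte) (Config, error) {
	var cfg Config
	if err := json.Unmarshal(data, &cfg); err != nil {
		return Config{}, err
	}
	revs := make(map[string]string)
	for _, d := range cfg.Deps {
		if d.ImportPath == "" || d.Rev == "" {
			return Config{}, fmt.Errorf("incomplete dependency entry: %+v", d)
		}
		if prev, ok := revs[d.ImportPath]; ok && prev != d.Rev {
			return Config{}, fmt.Errorf("%s pinned to both %s and %s", d.ImportPath, prev, d.Rev)
		}
		revs[d.ImportPath] = d.Rev
	}
	return cfg, nil
}

func main() {
	data := []byte(`{
		"Deps": [
			{"ImportPath": "rsc.io/arm/armasm", "Rev": "616aea947362"},
			{"ImportPath": "rsc.io/x86/x86asm", "Rev": "af2970a7819d"}
		]
	}`)
	cfg, err := parseConfig(data)
	if err != nil {
		fmt.Println("bad config:", err)
		return
	}
	for _, d := range cfg.Deps {
		fmt.Printf("%s at %s\n", d.ImportPath, d.Rev)
	}
}
```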
Note that we have rejected non-vendoring approaches that require modifications to GOPATH or new semantics inside the go command and toolchain. We believe it is important that the solution not require additional effort on the part of all the tools that already understand how to build, analyze, or modify code in the standard GOPATH hierarchy.
--
Aram Hăvărneanu
Who is the we in “Our proposal is that”?
One of the nicest things about Go is that all the information required
to build Go programs is in the source code, i.e. in .go files, and not
in any other files (Makefiles, .cfg files, etc).
--
I really like the fact we are discussing this as the community has a lot to benefit from a standard approach.
One thing I love about Go is how the workflow flows. It's smooth. How do you guys envision the workflow for: a) adding a new dependency; b) updating a dependency?
Does the developer have to manually update the dependency file (similar to nut) or is the dependency file generated from the code (similar to godep)? In which step would the import path rewrite happen?
Also, will library packages also have dependencies specified like that or only application/main packages?
What happens to the dependencies of dependencies? For example, my application (github.com/divoxx/app) imports "github.com/foo/foo", which in turn imports "github.com/bar/bar". After the rewrite, are they both going to be "github.com/divoxx/app/internal/github.com/foo/foo" and "github.com/divoxx/app/internal/github.com/bar/bar" respectively?
Okay, gotcha. I also think Godeps seems to be the easiest and most comprehensive format out there. I like the fact that it also has the Go version specified in it, which AFAIK is not enforced in any way but potentially could be.
It's good to see such discussion going on. I'm the author of nut so my comments might be biased :)
It sounds like there're two things nut does differently from Brad's suggestions:
For 1), I'm open to suggestions as long as the format is clear and concise. Currently nut adopts TOML as the config format. An example of declaring dependencies:
[dependencies]
"rsc.io/arm/armasm" = "616aea947362"
"rsc.io/x86/x86asm" = "af2970a7819d"
I personally think this is very clear and less noisy than JSON. But I understand the preference for using something the Go standard library can parse. It wouldn't be hard for nut to support JSON as the config file. But I think the proposed JSON schema is a bit verbose: the keys of "ImportPath" and "Rev" seem unnecessary. How about removing the keys?
{
"Deps": {
"rsc.io/arm/armasm": "616aea947362",
"rsc.io/x86/x86asm": "af2970a7819d"
}
}
For 2), I don't have a problem with renaming "vendor" as in nut to "internal": making vendored dependencies only accessible to the current package makes sense.
> Also, will library packages also have dependencies specified like that or only application/main packages?
The mechanism should work for libraries too, but I think we'll discourage overuse of it for libraries. We imagine it'll be mostly used for package main.
Granted, this situation is a bit unusual since etcd is primarily intended to be an end-user executable rather than a reusable go library, and we intend to break out the parts that are used by cockroachdb (the 'raft' subpackage) into a separate package/repo, but that just moves the problem around. The new raft package would either have to vendor its dependencies and rewrite its imports, or use canonical import paths for everything with no way to control dependency versions even for its own tests.
Personally, while I see the value of import-rewriting for corporate environments where you have one large meta-project, I think the better solution for the open-source world is to avoid rewriting imports and instead improve tooling support for a per-project GOPATH (I now let the emacs package go-projectile manage my GOPATH, which works very well for my workflow) and to use something like gpm or goop to pin dependency versions.
On 3 March 2015 at 06:34, <ben.d...@gmail.com> wrote:
> Granted, this situation is a bit unusual since etcd is primarily intended to be an end-user executable rather than a reusable go library, and we intend to break out the parts that are used by cockroachdb (the 'raft' subpackage) into a separate package/repo, but that just moves the problem around. The new raft package would either have to vendor its dependencies and rewrite its imports, or use canonical import paths for everything with no way to control dependency versions even for its own tests.
Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages. This particular problem could be solved by:
a) a policy of API stability adhered to by the raft subpackage and its dependent packages,
b) testing infrastructure that is version-aware.
> Personally, while I see the value of import-rewriting for corporate environments where you have one large meta-project, I think the better solution for the open-source world is to avoid rewriting imports and instead improve tooling support for a per-project GOPATH (I now let the emacs package go-projectile manage my GOPATH, which works very well for my workflow) and to use something like gpm or goop to pin dependency versions.
I think we can develop the tools to make it easier to test packages with different versions of their dependencies, and to make it easier for the ultimate consumer (a program binary) to vendor the correct dependencies (or provide diagnostics in the rare case of incompatible dependency versions). Those tools may be [based on] projects like gpm or goop.
On 3 March 2015 at 08:46, Owen Ou <jing...@gmail.com> wrote:
> [...] I think the proposed JSON schema is a bit verbose: the keys of "ImportPath" and "Rev" seem unnecessary. How about removing the keys?
One advantage to Brad's proposed syntax is that we can add extra fields (if need be) while maintaining backward compatibility with older tools.
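The compatibility argument can be made concrete with encoding/json, which by default ignores object fields that the destination struct does not declare. The sketch below is only an illustration: the "Comment" field, the type oldDep, and the functions readVerbose and readCompact are hypothetical, and nesting an object under each path is just one conceivable way to extend the compact form.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// oldDep is what an older tool knows about: two fields only.
type oldDep struct {
	ImportPath string
	Rev        string
}

// readVerbose decodes the list-of-objects format with the older tool's types.
func readVerbose(data []byte) ([]oldDep, error) {
	var cfg struct{ Deps []oldDep }
	err := json.Unmarshal(data, &cfg)
	return cfg.Deps, err
}

// readCompact decodes the compact format that maps import path to revision.
func readCompact(data []byte) (map[string]string, error) {
	var cfg struct{ Deps map[string]string }
	err := json.Unmarshal(data, &cfg)
	return cfg.Deps, err
}

func main() {
	// A newer file with a hypothetical extra "Comment" field per entry.
	verbose := []byte(`{"Deps": [{"ImportPath": "rsc.io/arm/armasm", "Rev": "616aea947362", "Comment": "hypothetical"}]}`)
	deps, err := readVerbose(verbose)
	fmt.Println(deps, err) // the unknown field is skipped by the older reader

	// Extending the compact form by turning each value into an object
	// changes the value's type, which an older reader cannot decode.
	compact := []byte(`{"Deps": {"rsc.io/arm/armasm": {"Rev": "616aea947362", "Comment": "hypothetical"}}}`)
	_, err = readCompact(compact)
	fmt.Println(err)
}
```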
On Mon, Mar 2, 2015 at 4:53 PM, Andrew Gerrand <a...@golang.org> wrote:
> On 3 March 2015 at 06:34, <ben.d...@gmail.com> wrote:
>> Granted, this situation is a bit unusual since etcd is primarily intended to be an end-user executable rather than a reusable go library, and we intend to break out the parts that are used by cockroachdb (the 'raft' subpackage) into a separate package/repo, but that just moves the problem around. The new raft package would either have to vendor its dependencies and rewrite its imports, or use canonical import paths for everything with no way to control dependency versions even for its own tests.
> Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages. This particular problem could be solved by:
> a) a policy of API stability adhered to by the raft subpackage and its dependent packages,
The raft package was developed in tandem with the rest of etcd; it is only now reaching a point of API stability (after validating its interfaces with usage in two applications). In a world of rewritten imports (where at least some dependencies are like glog and must not be duplicated), reusing a piece of an application is a significant burden. Before we could even begin to use raft in cockroachdb, we would have had to move the master copy of the raft code out of etcd (into a new repo which would be developed under "package rules" instead of "binary rules") and vendor it back into etcd. (We are in fact undertaking this work now, but only after the experiment has proven successful.) I don't like the strict division of packages into "those which are so closely tied to a particular binary that they use its rewritten imports" and "those which use 'go get' for their own dependencies but should be vendored into any application that uses them".
> b) testing infrastructure that is version-aware.
But if we had this version-aware infrastructure (which I admit is a harder problem), would we still want to rewrite imports? I claim that for purposes of reproducible builds it's better to place the entire GOPATH under version control (as long as you can have per-project GOPATHs) than to use an internal/vendor directory to ensure that all your dependencies share a common prefix deep in the source tree.
>> Personally, while I see the value of import-rewriting for corporate environments where you have one large meta-project, I think the better solution for the open-source world is to avoid rewriting imports and instead improve tooling support for a per-project GOPATH (I now let the emacs package go-projectile manage my GOPATH, which works very well for my workflow) and to use something like gpm or goop to pin dependency versions.
> I think we can develop the tools to make it easier to test packages with different versions of their dependencies, and to make it easier for the ultimate consumer (a program binary) to vendor the correct dependencies (or provide diagnostics in the rare case of incompatible dependency versions). Those tools may be [based on] projects like gpm or goop.
Yes. I agree that the needs of library developers are different from the needs of application developers. Libraries must be as broad as possible in their dependencies to minimize conflicts (hopefully just a minimum version; occasionally a maximum or range) while applications want to pin things down exactly. I'd just like for this difference to be as localized as possible (e.g. "use == instead of >= in your dependencies.json file") instead of completely changing the workflow.
> Vendoring should be used only for the dependencies of binaries, not for the dependencies of packages.
Could you clarify what this exactly means? How am I supposed to vendor
a revision of package x, if that revision doesn't know which revision
of package y to depend on? In any case, a revision has to know about
the revision it is depending on.
And if the packages need to vendor, there is a critical problem with
this proposal.
> officially recommends vendoring into an “internal” directory with import rewriting (not GOPATH modifications) as the canonical way to pin dependencies.
How am I supposed to export a symbol from a vendored package?
package a
type Doer interface
package b
func X(doer a.Doer)
a's import path will be visible to package b only.
> The raft package was developed in tandem with the rest of etcd; it is only now reaching a point of API stability (after validating its interfaces with usage in two applications). In a world of rewritten imports (where at least some dependencies are like glog and must not be duplicated), reusing a piece of an application is a significant burden. Before we could even begin to use raft in cockroachdb, we would have had to move the master copy of the raft code out of etcd (into a new repo which would be developed under "package rules" instead of "binary rules") and vendor it back in to etcd. (we are in fact undertaking this work now, but only after the experiment has proven successful) I don't like the strict division of packages into "those which are so closely tied to a particular binary that they use its rewritten imports" and "those which use 'go get' for their own dependencies but should be vendored into any application that uses them"
I think it's reasonable to expect some difficulty when re-using the internal packages of larger projects. Until those packages are officially supported by their authors, you're kind of in "here be dragons" territory. I don't think this proposal makes such issues better or worse.