vgo & vendoring

Russ Cox

Feb 26, 2018, 2:50:02 PM
to golang-dev
One common piece of feedback has been that we should not drop vendoring, at least not completely. I still 100% want to and intend to drop vendoring in the sense of "any package can have a vendor directory under it that changes the meaning of imports in that package". However, it also seems like we need some way to have a vendor directory at the top level of a module and have a way to make that vendor directory apply to direct builds of that module. I can see a few design questions.

1. What goes in the vendor directory? Unpacked sources or whole zip files? Whole modules or just necessary packages?

I think the answer here is unpacked sources, because you want tools like debuggers and compiler errors and the like to be able to refer to real files, instead of teaching every tool how to read zip files. A few times people mentioned uncompressed zip files, but honestly I don't see the point. It seems like you'd end up having all the size and none of the convenience of a file tree. So I'm pretty confident about unpacked sources.

As for whole modules vs just necessary packages, I can see arguments for both, but if you include the whole modules then you can verify directory hashes, and also you don't have to worry about having to update vendor just because you wrote a new import for a module you already vendored. Those tip me toward whole modules.
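To make the directory-hash argument concrete, here is a minimal Go sketch. It assumes a simple scheme (sort the file paths, hash each file, then hash the resulting list); the function name `hashTree` and the scheme itself are illustrative assumptions, not the hash format vgo actually uses. The point is only that a vendor copy containing a subset of a module's files can no longer be checked against a hash computed over the whole module.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// hashTree returns a digest for a set of files keyed by slash-separated
// relative path. Paths are sorted first so the result does not depend on
// map iteration order. This is an illustrative scheme, not vgo's format.
func hashTree(files map[string][]byte) string {
	paths := make([]string, 0, len(files))
	for p := range files {
		paths = append(paths, p)
	}
	sort.Strings(paths)

	h := sha256.New()
	for _, p := range paths {
		sum := sha256.Sum256(files[p])
		// One line per file: content hash, two spaces, path.
		fmt.Fprintf(h, "%x  %s\n", sum, p)
	}
	return hex.EncodeToString(h.Sum(nil))
}

func main() {
	whole := map[string][]byte{
		"quote.go":       []byte("package quote\n"),
		"buggy/buggy.go": []byte("package buggy\n"),
	}
	// A vendor copy that kept only the packages actually imported.
	pruned := map[string][]byte{
		"quote.go": []byte("package quote\n"),
	}
	fmt.Println("whole module matches itself:", hashTree(whole) == hashTree(whole))
	fmt.Println("pruned copy matches whole:", hashTree(pruned) == hashTree(whole))
}
```

With whole modules vendored, the same check works no matter which packages of the module the build happens to import.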

2. Does a vendored module's directory name include the full minor version?

That is, is it vendor/rsc.io/quote/quote.go or vendor/rsc.io/qu...@v1.5.2/quote.go? I think the former, since that makes diffs meaningful and also keeps the vendor directory layout the same as before, which lets us continue to use the vendor directory as a transition mechanism for old go get. There's no use introducing a second convention without a compelling benefit, and I don't see a benefit (yet).

3. When is the vendor directory used?

A few people pointed out Yarn's offline mode, where as I understand it you have to say --offline to avoid the usual network case. But that seems wrong. The people who put vendor directories in their repos seem to want them to apply by default. So maybe the rule is that the vendor directory gets used if it has the specific version vgo wants, and otherwise it doesn't get used. That seems workable. There needs to be an index of what versions are there; that's already created as vendor/vgo.list by 'vgo vendor', but we could change it any way we want.
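The rule proposed here (use vendor only when it holds exactly the version vgo selected) can be sketched in Go as follows. The index format is an assumption made for illustration: one "path version" pair per line. The real vendor/vgo.list layout may differ, and `parseIndex` and `useVendor` are hypothetical names.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseIndex reads a hypothetical vendor index with one "path version"
// pair per line. The real vendor/vgo.list layout may differ.
func parseIndex(text string) map[string]string {
	index := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 {
			index[fields[0]] = fields[1]
		}
	}
	return index
}

// useVendor reports whether the vendored copy of a module may be used:
// only when the index lists exactly the version the build selected.
func useVendor(index map[string]string, path, want string) bool {
	have, ok := index[path]
	return ok && have == want
}

func main() {
	index := parseIndex("rsc.io/quote v1.5.2\nrsc.io/sampler v1.3.0\n")
	fmt.Println(useVendor(index, "rsc.io/quote", "v1.5.2"))      // true
	fmt.Println(useVendor(index, "rsc.io/quote", "v1.5.3"))      // false: version differs
	fmt.Println(useVendor(index, "golang.org/x/text", "v0.3.0")) // false: not vendored
}
```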

For anyone here who has direct experience with wanting vendor to stay, does this seem like a reasonable set of answers?

Thanks.
Russ

Daniel Theophanes

Feb 26, 2018, 3:16:05 PM
to golang-dev
Re 1: Another reason to vendor by module and not by package is that some packages rely on sub-directories of C files or other resources.

Re 2: Agreed.

Re 3: If I understand you correctly, that would fulfill the needs of this workflow. That is:

If "vendor" is a folder and a sibling to "go.mod", do not hit the network when running "go (build,install,list)" but build from the vendor directory.
The only thing I would add is that in this "mode", "go get" should copy to the vendor folder, not GOPATH.

This is already stated in how (v)go works, but just for emphasis, the vendor folder in a given module would be used to build that module only; other packages that depend on that module will read the go.mod file, but will ignore the vendor folder.

Lastly, if this mode is supported, the issue https://golang.org/issue/24101 may become more important, in order to "prune" unused dependencies from both. But that could also easily be part of an external tool.

ra...@develer.com

Feb 26, 2018, 3:21:32 PM
to golang-dev
I think what's missing is a way to make sure that the vendor directory contains all the required modules to correctly build/test the project. If go build/test had something like "--offline" that disabled network access and just failed if a module is not present, that would probably work as one could configure the CI to use --offline.

I might be wrong, but I think that most people using vendor directories will want all modules to be there, not just a subset of them. In fact, I don't fully understand why vgo needs a different cache directory, in a different format; ideally, the vendor directory should be the vgo module cache (and vice versa), just like node_modules for npm. That way people can commit it if they want, or just git-ignore it otherwise. I don't care if it's vendor or .vendor or whatever name, but it would be great to have a single directory that I can commit and make sure that

Giovanni

matthe...@gmail.com

Feb 26, 2018, 3:43:22 PM
to golang-dev
Checking in vendor source isn’t critical for me although making fixes that way is tempting since the directory is just regular checked in code. This wouldn’t be a good practice due to the need for security and fix updates though.

What I’d like to do is include the modules pointed at by the module as part of the module download from the one server. If those module dependencies also have their original repository available then checking for consistency would be a plus.

> 1. What goes in the vendor directory? Unpacked sources or whole zip files? Whole modules or just necessary packages?

Whole zip files make sense to me because I don’t want to see the dependency changes here, just that the dependency was updated. Perhaps there could be an unpack option for the tooling need that doesn’t affect the repository history. Checking in the dependency modules seems necessary though.

I’ll try out vgo before providing more feedback. My goal is to have my one folder plus a Go installation allow use, inspection, and modification of the app even if the folder is only available by a thumb drive.

Matt

Filippo Valsorda

Feb 26, 2018, 6:53:01 PM
to golan...@googlegroups.com
2018-02-26 14:49 GMT-0500 Russ Cox <r...@golang.org>:
> 1. What goes in the vendor directory? Unpacked sources or whole zip files? Whole modules or just necessary packages?

I think it will be important to be able to match the hashes to these folders, for security, caching and to discourage local modifications. So definitely whole modules.

Unpacked sources seem better if they are guaranteed to be deterministically compressed into the same zip (which I think is a desirable property anyway).

> 2. Does a vendored module's directory name include the full minor version?

No strong opinion, your points make sense to me.

> 3. When is the vendor directory used?

I think it's really important to have clear, easy assurance that the network will not be used and that the build is self-contained for security reasons.

The easiest answer here might be: if the vendor folder is there, it's the only thing used. If it's incomplete, it's an error.

This avoids a kludgy extra flag, is simple, clear, and makes sure vendor folders are kept tidy and machine-generated. Errors would surface immediately, not later in CI, and be easily fixed.

Having a different minor version in vendor/ which doesn't get used in the build would be an extremely confusing experience, and incompatible with current tooling. Disallowing version mismatch but allowing missing modules starts being complicated.
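The all-or-nothing rule is easy to state in code. The following Go sketch assumes the build list and the vendor contents are both available as maps from module path to version; `checkVendor` is a hypothetical helper, not part of vgo. Any missing module or version mismatch is an error, so there is no partial use of vendor.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// checkVendor compares the build list (module path -> required version)
// with what the vendor directory holds. It returns an error on any
// missing module or version mismatch; a nil error means the build can be
// done from vendor alone.
func checkVendor(required, vendored map[string]string) error {
	var problems []string
	for path, want := range required {
		have, ok := vendored[path]
		switch {
		case !ok:
			problems = append(problems, fmt.Sprintf("%s@%s: missing from vendor", path, want))
		case have != want:
			problems = append(problems, fmt.Sprintf("%s: vendor has %s, build needs %s", path, have, want))
		}
	}
	if len(problems) == 0 {
		return nil
	}
	sort.Strings(problems) // stable output regardless of map order
	return fmt.Errorf("vendor directory is incomplete:\n  %s", strings.Join(problems, "\n  "))
}

func main() {
	required := map[string]string{"rsc.io/quote": "v1.5.2", "rsc.io/sampler": "v1.3.0"}
	vendored := map[string]string{"rsc.io/quote": "v1.5.1"}
	if err := checkVendor(required, vendored); err != nil {
		fmt.Println(err)
	}
}
```

Because the check runs before anything is built, the error surfaces on the developer's machine rather than later in CI.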

Henrik Johansson

Feb 27, 2018, 1:08:06 AM
to Filippo Valsorda, golan...@googlegroups.com
Whatever happens with vendor/, if it is kept, the rules for when it is used need to be crystal clear.
If it would simply work as "the cache" for vgo then why not externalize it and be gone with vendor.

I have grown to like the vendor/ folder over the past year but in light of vgo it returns to the weird corner I found it.
I would prefer it be gone personally but I don't know, perhaps there are really good reasons.


--
You received this message because you are subscribed to the Google Groups "golang-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

seaner...@gmail.com

Feb 27, 2018, 8:34:07 AM
to golang-dev
Vendoring can address the "disappearing repository" situation. Is this a use case about which vgo is not concerned? It happens, and it can utterly derail development for days.

Mateusz Czapliński

Feb 27, 2018, 8:36:05 AM
to golang-dev
Re 1: Sources are preferable for us, in accordance with the report I submitted at https://github.com/golang/go/issues/24088. We do quite often need to browse sources of third-party libs, be it:
 - for understanding what really happens (if godoc is unclear),
 - or fixing bugs (in our code or the library code; esp. w.r.t. races).
 - Also for evaluating viability of a fork to tweak the lib to our needs.
By the way, this reminded me that we have similar needs w.r.t. the Go stdlib from time to time. Now that I think of it, it would feel very weird if browsing the standard library source were easier than browsing third-party library sources.

Re 2: Feels irrelevant to me; cannot think of any reason why one or the other would make any difference.

Re 3: Mostly what other people mentioned:
 - both for CI builds (on specialized server) and dev builds (on workstations), it should be trivial to reproduce our whole app ecosystem from our monorepo, with an assurance that nothing is downloaded from the Internet;
 - it should be trivial for devs to add new libraries with transitive dependencies to the monorepo;
 - it should be trivial for devs to prune the monorepo of unused dependencies (transitively).
I'm mentioning the monorepo instead of "module" or "go.mod", as this is what's of concern to us personally. I see the notion of a "module" or "go.mod file" as a mostly irrelevant artifact.
Some extra notes Re 3:
 - notably, it would be preferable for us if vgo could fetch all dependencies for *all combinations of GOOS+GOARCH* we support, and merge them all into our monorepo. If basic 'vgo get' is additive, then this might be only relevant to a theorized 'vgo prune'. In case of our in-house vendoring tool, we chose to follow an algorithm of: 1) prune whole vendor.json; 2) iterate all GOOS+GOARCH combinations and additively fetch required packages from disk (if available) or network (otherwise). Notably, there's no need to further iterate GOOS+GOARCH when recursing through transitive dependencies.
 - I completely forgot that we actually tried to open-source the tool at some point in time (though we didn't have enough resources to find time to write at least a readme, unfortunately). The code dump is at: https://github.com/zpas-lab/vendo/. Notably, a study of use-cases which preceded it is at: https://github.com/zpas-lab/vendo/blob/master/use-cases.md, complete with high level drafts of the algorithms used.
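The prune-then-iterate algorithm described above can be sketched in Go. This is a reading of the description only, not code from the vendo tool: the `resolver` type is a hypothetical stand-in for whatever walks the import graph for one GOOS/GOARCH pair, and the example.com paths are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// platform is one GOOS/GOARCH pair the project supports.
type platform struct{ goos, goarch string }

// resolver returns the import paths needed on one platform. It is a
// hypothetical stand-in for whatever actually walks the import graph.
type resolver func(p platform) []string

// unionDeps starts from an empty set (the "prune" step) and additively
// collects the dependencies required on every platform.
func unionDeps(platforms []platform, resolve resolver) []string {
	seen := make(map[string]bool)
	for _, p := range platforms {
		for _, dep := range resolve(p) {
			seen[dep] = true
		}
	}
	deps := make([]string, 0, len(seen))
	for dep := range seen {
		deps = append(deps, dep)
	}
	sort.Strings(deps)
	return deps
}

func main() {
	platforms := []platform{{"linux", "amd64"}, {"windows", "amd64"}}
	resolve := func(p platform) []string {
		deps := []string{"example.com/common"}
		if p.goos == "windows" {
			deps = append(deps, "example.com/winonly")
		}
		return deps
	}
	fmt.Println(unionDeps(platforms, resolve)) // [example.com/common example.com/winonly]
}
```

Starting from an empty set on every run is what makes the result a prune as well as a fetch: anything no platform needs simply never gets added back.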

/Mateusz.

maciej.l...@gmail.com

Feb 27, 2018, 12:57:03 PM
to golang-dev
Re 3: anytime you have a need to distribute a subset of the vgo cache:

- shipping dependencies that are not go get-able under most circumstances (e.g. a proprietary module bundled with proprietary dependencies)
- preserving a copy of dependencies in case package(s) become unavailable (e.g. deleted), limiting the number of single points of failure when trying to get a module to a buildable state
- offline mode (no/limited connectivity, air-gapped or severely locked down environment)
- CI:
  - limit single points of failure ("can't build because GitHub is down")
  - hosted ones are generally brutally slow at building the dependency list and downloading. Even if you use whatever caching they provide, cloning a repo with vendor is usually quicker
  - internal ones will often be locked down

Re 2: I'd rather see full minor versions - just like vgo - with old format being a fallback to be deprecated and removed in a couple releases. I'm not sure how we could solve backwards compatibility :/

- no need for a separate vendoring tool if we could add a flag to vgo to put a specific dependency in vendor instead, which means it automatically gets better as vgo improves (e.g. you will start being able to vendor C dependencies)
- git, svn and mercurial can handle renames (for both history and diffs)
- converting old format to new one is trivial

Re 1: See 2 - whatever vgo does

giacomo...@gmail.com

Feb 28, 2018, 1:13:01 PM
to golang-dev

I think vendor should eventually be abandoned as it is in use today.
As I see it, GOPATH and vendor are both local code caches; vendor was introduced as a cache with control over the version of the code, a feature missing in GOPATH.
Semantic import versioning fixed that, rendering vendor (a second cache) obsolete.
There is still a case for vendor in the distribution of source trees of dependencies, but if the code is distributed in zip files, these zip archives can contain the bundled dependencies alongside the modules.
The tool that fetches the archives can put the dependencies in the local cache and check signatures/hashes, to be sure that the source tree being put in the cache is the correct one.

As for offline builds, that is another tooling problem; telling vgo to just use the local cache (or the proxy, now that it is in the game) is possible.
If a missing dependency in vendor breaks a build, why keep GOPATH at all? Why shouldn't vgo get just put code in vendor then? And at this point what is the use of GOPATH?

Assuming this makes sense I think that:
1) vendor should contain whole modules unzipped.
2) module directory names should include the full minor version to be compatible with the other local cache (GOPATH).
3) vendor should, eventually, be visible only in the zip distribution file and used only to populate the local cache (GOPATH).

Giacomo

Caleb Spare

Feb 28, 2018, 2:23:42 PM
to Russ Cox, golang-dev
> For anyone here who has direct experience with wanting vendor to stay, does this seem like a reasonable set of answers?

At my company, vendoring has been a robust method of managing
dependencies in a way that you mostly don't have to think about it
(except when you want to update them, which is more annoying than it
should be). But the normal vgo workflow seems like it will be even
better.

Assuming that we continue to use vendoring with vgo, your answers seem
reasonable to me, except that I want to echo an important point that
others have made: it would be useful to have a "vendor only" flag for
when you want to ensure that vgo isn't pulling in non-vendored modules
when you want a production-like build. (In a pre-vgo world we manage
this with various GOPATH manipulation tricks, but it's less than
ideal.)

Lorenz Bauer

Mar 1, 2018, 12:22:21 PM
to golang-dev
At Cloudflare we have a bunch of Go services, each in their own repository. Many of them store their dependencies in vendor to enforce versions and allow offline builds.


On Wednesday, February 28, 2018 at 6:13:01 PM UTC, Tartari Giacomo wrote:
> I think vendor should eventually be abandoned as it is in use today.
>
> As for offline builds, that is another tooling problem; telling vgo to just use the local cache (or the proxy, now that it is in the game) is possible.

That would mean that Go needs special support from CI. Also, requiring a cache is another external dependency which can break my build.
 
> If a missing dependency in vendor breaks a build, why keep GOPATH at all? Why shouldn't vgo get just put code in vendor then? And at this point what is the use of GOPATH?

Good question. For our use case GOPATH is not particularly useful. We even have https://github.com/cloudflare/hellogopher to kludge around this fact.

Filippo Valsorda

Mar 1, 2018, 3:01:15 PM
to golan...@googlegroups.com
2018-02-28 10:12 GMT-0800 giacomo...@gmail.com:
> If a missing dependency in vendor breaks a build, why keep GOPATH at all? Why shouldn't vgo get just put code in vendor then? And at this point what is the use of GOPATH?

Exactly what I'm hoping for. GOPATH-less development.

Stephen J Day

Mar 5, 2018, 4:55:08 PM
to golang-dev
In regards to item 1, what will happen with non-Go resources, such as protobufs or C files? Apologies if this was already answered in your posts, but we do use the vendor path to distribute and set the import path for protobuf files. While we would compile Go packages against pre-compiled protobuf output, we also have to consider the case of compiling new Go packages from protobuf files that depend on vendored dependencies (i.e. a file that references rpc.Status from another protobuf file). I can provide some more concrete examples if that description is too dense.

Stephen.


On Monday, February 26, 2018 at 11:50:02 AM UTC-8, rsc wrote:

Jakob Borg

Mar 6, 2018, 5:55:34 AM
to Stephen J Day, golang-dev
On 5 Mar 2018, at 22:30, Stephen J Day <stev...@gmail.com> wrote:

> we do use the vendor path to distribute and set the import path for protobuf files

I do this too, but there’s nothing to say this is the only way to do it. These files could just be added to the repository in some other way, like any other assets that don’t come from a dependency.

//jb

gab...@soundtrackyourbrand.com

Mar 6, 2018, 3:41:34 PM
to golang-dev
> 3. When is the vendor directory used?
>
> I think it's really important to have clear, easy assurance that the network will not be used and that the build is self-contained for security reasons.
>
> The easiest answer here might be: if the vendor folder is there, it's the only thing used. If it's incomplete, it's an error.
>
> This avoids a kludgy extra flag, is simple, clear, and makes sure vendor folders are kept tidy and machine-generated. Errors would surface immediately, not later in CI, and be easily fixed.
>
> Having a different minor version in vendor/ which doesn't get used in the build would be an extremely confusing experience, and incompatible with current tooling. Disallowing version mismatch but allowing missing modules starts being complicated.

I second this. I really appreciate the current vendor directory and the way it speeds up going from checkout to build. I think having a vendor directory should be enough to trigger "offline" builds that fail for an incomplete vendor directory. Only touching the network when changing dependencies is such a nice workflow.

--
Gabriel Falkenberg

Peter Waller

Mar 7, 2018, 5:41:55 AM
to Filippo Valsorda, golang-dev
On 26 February 2018 at 23:52, Filippo Valsorda <fil...@ml.filippo.io> wrote:
> The easiest answer here might be: if the vendor folder is there, it's the only thing used. If it's incomplete, it's an error.
>
> This avoids a kludgy extra flag, is simple, clear, and makes sure vendor folders are kept tidy and machine-generated. Errors would surface immediately, not later in CI, and be easily fixed.
>
> Having a different minor version in vendor/ which doesn't get used in the build would be an extremely confusing experience, and incompatible with current tooling. Disallowing version mismatch but allowing missing modules starts being complicated.

Great to see care and attention here. Happy with most of the thoughts expressed. I'd like to also second the above thoughts and ideas of Filippo.

A piece of context from my company: currently we use git submodules to put things in the vendor directory, a workflow which has happened to work well, though is clearly not to everyone's tastes. A benefit so far is not depending on any other not-yet-an-official-standard tooling.

I'm happy to see that process superseded by official tooling - as Caleb said, updates are more annoying than they should be - it would be good if old programs using submodules happened to continue to work into the future without modification, if possible.

> I still 100% want to and intend to drop vendoring in the sense of "any package can have a vendor directory under it that changes the meaning of imports in that package".

If having a vendor'd directory anywhere is disallowed, it will (until a .mod file is introduced to depA?) break things of this form, which I have occasionally encountered:

/cmd/main
/vendor/depA
/vendor/depA/cmd/A
/vendor/depA/pkg/A
/vendor/depA/vendor/depB

Where depA depends on and vendors depB because depA/cmd/A needs its dependencies vendored, and depA/pkg/A is a thing which you might want to depend on; in which case depB should be vendored at the top level.

I see the utility of disallowing (and failing) /vendor/depA/vendor/depB when building cmd/main, so perhaps we'll have to live with this particular arrangement breaking until intervention with the new rule. In the new world depA would anyway gain a .mod file, and then the arrangement would be allowed again? Then, when building cmd/main, the depB would not be used from the depA/vendor directory.

It may still be necessary to allow the presence of the depA/vendor directory, even if it is not used when building cmd/main, because depA/vendor/depB is required to build depA/cmd/A. This seems like a potential for confusion in this case.

Not sure I'm yet at the bottom of my thinking here. I can see an argument that go 1.1X might reasonably break the build of some existing "well vendor'd" programs in the case of such vendor directory nesting, to make the ecosystem easier to reason about in general. Nice to avoid if possible though.
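For illustration, here is a small Go sketch of how a tool might flag the nested arrangement shown in the layout above. It assumes the tool already has a list of slash-separated paths; `vendorDepth` and `nestedVendored` are hypothetical names, and this is not how the go command itself resolves imports.

```go
package main

import (
	"fmt"
	"strings"
)

// vendorDepth counts how many "vendor" elements appear in a
// slash-separated path.
func vendorDepth(path string) int {
	n := 0
	for _, elem := range strings.Split(path, "/") {
		if elem == "vendor" {
			n++
		}
	}
	return n
}

// nestedVendored returns the paths that live under a vendor directory
// which is itself inside another vendor directory.
func nestedVendored(paths []string) []string {
	var nested []string
	for _, p := range paths {
		if vendorDepth(p) > 1 {
			nested = append(nested, p)
		}
	}
	return nested
}

func main() {
	layout := []string{
		"/cmd/main",
		"/vendor/depA",
		"/vendor/depA/cmd/A",
		"/vendor/depA/pkg/A",
		"/vendor/depA/vendor/depB",
	}
	fmt.Println(nestedVendored(layout)) // [/vendor/depA/vendor/depB]
}
```

Whether such a path should be an error, a warning, or silently ignored when building cmd/main is exactly the open question in this message.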

roger peppe

Mar 10, 2018, 1:11:44 PM
to Peter Waller, Filippo Valsorda, golang-dev
Somewhat orthogonal to whether ignoring vendor directories is breaking the Go 1 compatibility promise, it is actually possible to use vendor directories with vgo by specifying modules in the vendor directory as the target of a replace clause of the go.mod file. 

It seems to me that this could work pretty well. The vendored modules are only used when you build the module itself but not when the module is imported by other modules, which avoids the usual problems with type pollution from vendoring.

One current issue with this is that it's not possible to use a replace clause robustly because currently one can only specify a single version to be replaced, so a patch version change in a dependency can render the replace clause useless.

I think I'd like to be able to specify a version without minor or patch components;  for example:

    replace "github.com/foo/bar" v2 => "./vendor/github.com/foo/bar"

would specify that any v2 version of the module would be replaced by the vendored module (also useful for temporarily modified local copies).
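The matching this proposal implies can be sketched in Go. The major-only replace syntax is a suggestion in this message, not something vgo is known to support, so everything below is hypothetical: the `replacement` type, `matchesMajor`, and `resolve` only show how "any v2 version" could be tested with a plain string check.

```go
package main

import (
	"fmt"
	"strings"
)

// replacement is a hypothetical in-memory form of the proposed
// major-only replace directive.
type replacement struct {
	path  string // module path being replaced
	major string // major series only, for example "v2"
	dir   string // local directory to build from instead
}

// matchesMajor reports whether version (for example "v2.3.1") belongs to
// the given major series. It is a plain string check for illustration
// and does not validate semantic versions.
func matchesMajor(major, version string) bool {
	return version == major || strings.HasPrefix(version, major+".")
}

// resolve returns the local directory for path@version if a rule applies.
func resolve(rules []replacement, path, version string) (string, bool) {
	for _, r := range rules {
		if r.path == path && matchesMajor(r.major, version) {
			return r.dir, true
		}
	}
	return "", false
}

func main() {
	rules := []replacement{{"github.com/foo/bar", "v2", "./vendor/github.com/foo/bar"}}
	for _, v := range []string{"v2.0.0", "v2.3.1", "v3.0.0"} {
		dir, ok := resolve(rules, "github.com/foo/bar", v)
		fmt.Println(v, ok, dir)
	}
}
```

With a rule like this, a patch release of the dependency no longer silently disables the replacement, which is the robustness problem described above.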


Paul Jolly

Apr 3, 2018, 11:57:06 AM
to Russ Cox, golang-dev
Apologies, joining this conversation rather late.

I’ll use the term “vendor”/“cache”, because I don’t per se see a need to keep the concept of vendor as it currently exists (but I might be missing something). Indeed I see a reason against it (see below).

Re question 1:
  • ultimately need/want access to the sources (either as a human or to make things easier for tooling, code generators etc)
  • but at a minimum whole modules referenced at that point in time
    • Per Daniel, we can for now ignore pruning of go.mod files and ultimately unreferenced modules in the “vendor”/“cache”
    • Per Mateusz, such pruning would to my mind need to consider GOOS, GOARCH etc in terms of a calculation of “currently referenced”

Per Filippo - this “vendor”/“cache” should be used in an all-or-nothing mode
  • I think it's really important to have clear, easy assurance that the network will not be used and that the build is self-contained for security reasons.
  • I.e. something missing from the “vendor”/“cache” breaks whatever is being attempted
  • This of course raises the question then of how to populate the “vendor”/“cache” with a new dependency or upgrade

Per Matt: If those module dependencies also have their original repository available then checking for consistency would be a plus
  • Agreed (and we would want this), but this could be made an orthogonal concern to vgo, i.e. another tool could do this

Per Henrik:
  • If it would simply work as "the cache" for vgo then why not externalize it and be gone with vendor.
  • Agreed; that is largely why I refer to “vendor”/“cache” here because I’m not wedded to the concept of vendor at all

Per Lorenz:
  • Also, requiring a cache is another external dependency which can break my build.
  • Whether this “vendor”/“cache” is committed as part of the same repository is to my mind a logistical concern (open source projects might not want to do so because generally speaking users of a module far outnumber developers). But I agree on the principle of this point. I think there are different use cases (partial list below)

One thing in support of dropping vendor as it is currently implemented is the need to be able to go install PKG / go run PKG (see also https://github.com/golang/go/issues/22726#issuecomment-368587344). This is a notable problem with the current vendor approach because you need to go install ./vendor/PKG, which makes READMEs more complex, complicates code generation steps, etc. More generally we must be able to use the (v)go tool on dependencies if they are in the “vendor”/“cache”. vgo already works 100% with dependencies; I wouldn’t want to see a regression with vgo’s use of vendor.

This has been covered elsewhere, but we need to support the workflow of taking a dependency in its current form (in the “vendor”/“cache”), making some changes, testing etc, then potentially forking and updating the module dependencies to be the fork.

More broadly I see the spectrum of use cases something like this:
  • UC0 no “vendor”/“cache” required, simply use vgo out of the box
  • UC1 An open source project (e.g. https://github.com/myitcv/react) where I do not want to commit the “vendor”/“cache” to the repo (because I have more users than developers). It instead statically lives elsewhere (another repository), no need to maintain a server anywhere. The “vendor”/“cache” (which would, in effect, be append-only) would be used by any developers of the project and its CI with appropriate tooling etc to help update it in step with the main repo
  • UC2 Company use-case where I would want the “vendor”/“cache” to be part of the repo. Because our code base and mono-repo is sufficiently small for this not to be a problem, we don’t want to maintain a proxy service, we would only have one “vendor”/“cache” that applies to all our code. A poor-man’s proxy if you will.
  • UC3 Company-maintained proxy. The full monty. Details not covered here

Incidentally, a lot of the functionality (ignoring detail) of what I would like from the “vendor”/“cache” for UC1 and UC2 is covered by having a subdirectory of the main repo that is used as GOPATH with a non-existent proxy. E.g. https://github.com/myitcv/vgocachedemo. I've deliberately not committed the source code because that gets expanded on the first vgo command that references a dependency.


Paul

rich...@gmail.com

Jun 28, 2018, 4:19:23 PM
to golang-dev
Are there any updates on point 3?
As it stands (with getmode) I would have expected getmode=vendor to make use of the vendor folder, but as of today that is not the case, at least not when NOT under GOPATH.

Russ Cox

Jun 28, 2018, 8:35:21 PM
to rich...@gmail.com, golang-dev
-getmode=vendor is supposed to use the vendor directory and only the vendor directory.
If you've observed otherwise, that's a bug. Please file an issue, and we'll fix it.

Thanks!
Russ

rich...@gmail.com

Jun 29, 2018, 2:47:11 AM
to golang-dev