Packaging a Go module for Nix

Wael Nasreddine

Mar 10, 2019, 7:30:02 PM
to golang-nuts
TL;DR Given a Go module, assuming that I have already done `go mod download`: Is it possible to prevent network access if I delete the entire `$GOPATH/pkg/mod/cache`?

Hello,

I'm a member of the Nix community, and I'm attempting to create a new infrastructure for packaging Go modules that relies on the reproducibility of Go modules. I have the following requirements:

  1. Nix comes with a sandbox for building packages.
  2. The sandbox does not allow writing to HOME. Writes are allowed in only three places: $NIX_BUILD_TOP, $TMPDIR, and output directories such as $out and $bin.
  3. The sandbox does not allow access to the internet unless the package is intended to fetch dependencies; we call such a package a `fetcher`.
  4. Every source of input, be it a tarball or dependencies created by a `fetcher`, must be compared against a fixed hash for security purposes.

Given the requirements above, I was able to work with Go modules by setting GOCACHE to `$TMPDIR/go-cache` and GOPATH to `$NIX_BUILD_TOP/go` (<off-topic>Please have a different variable control where Go modules are downloaded; GOPATH is confusing</off-topic>). I'm currently trying to figure out how to get Go modules to work without network access. Here's the algorithm for the packaging:

Please note that a derivation is just another name for a package.
  1. Intermediate derivation: Fetch all the dependencies. This is done by setting GOPATH to a temporary directory and running `go mod download`. I then remove `$GOPATH/pkg/mod/cache` before copying the entire `$GOPATH/pkg/mod` to the output of this derivation.
  2. Final derivation: Build the Go module. This is done by setting GOPATH to a temporary directory and restoring the output of the intermediate derivation to `$GOPATH/pkg/mod`. However, this is where Go attempts network access to re-download the modules, and the build fails.

I am willing to patch Go in order for it to assume that the modules are available under pkg/mod guarded by an environment variable such as __NIX_GO_SKIP_MOD_DOWNLOAD but I do not know where I can do that. Can someone please point me in the right direction?

Thank you,

Wael

Manlio Perillo

Mar 10, 2019, 8:01:01 PM
to golang-nuts
On Monday, March 11, 2019 at 12:30:02 AM UTC+1, Wael Nasreddine wrote:
> TL;DR Given a Go module, assuming that I have already done `go mod download`: Is it possible to prevent network access if I delete the entire `$GOPATH/pkg/mod/cache`?


Yes.  Use `go mod vendor` and the -mod=vendor build flag.

> [...]

Manlio Perillo

Wael Nasreddine

Mar 10, 2019, 9:06:44 PM
to golang-nuts
This works! Thank you!

Is vendor mode going to be covered by the Go 1.x compatibility promise? That is, can I assume it works for all 1.x versions >= 1.11?
 

> [...]

Manlio Perillo

Manlio Perillo

Mar 10, 2019, 9:44:10 PM
to golang-nuts
Go 1.x compatibility does not cover the go tools.

However the vendor mode should remain because it was added to solve a specific use case, and has nothing to do with the old vendoring support:

See also:

"If invoked with -mod=vendor, the go command assumes that the vendor directory holds the correct copies of dependencies and ignores the dependency descriptions in go.mod."


Manlio Perillo

Manlio Perillo

Mar 10, 2019, 9:59:07 PM
to golang-nuts
On Monday, March 11, 2019 at 12:30:02 AM UTC+1, Wael Nasreddine wrote:
> TL;DR Given a Go module, assuming that I have already done `go mod download`: Is it possible to prevent network access if I delete the entire `$GOPATH/pkg/mod/cache`?


Another solution is to delete only $GOPATH/pkg/mod/cache/vcs.  This is the directory that takes the most disk space, since it contains the full VCS history of each repository.

> [...]

Manlio

Wael Nasreddine

Mar 11, 2019, 1:12:05 AM
to golang-nuts
Thank you for the documentation link, this is very helpful! 

On Sunday, March 10, 2019 at 6:59:07 PM UTC-7, Manlio Perillo wrote:
> On Monday, March 11, 2019 at 12:30:02 AM UTC+1, Wael Nasreddine wrote:
> > TL;DR Given a Go module, assuming that I have already done `go mod download`: Is it possible to prevent network access if I delete the entire `$GOPATH/pkg/mod/cache`?
>
> Another solution is to delete only $GOPATH/pkg/mod/cache/vcs.  This is the directory that takes more disk space since it contains the full history of the vcs.
 
That was my first attempt; however, it did not work as expected because the contents of $GOPATH/pkg/mod/cache/download change any time a new version is created upstream. My hash failed to match multiple times a day!

Wael Nasreddine

Mar 11, 2019, 1:24:12 AM
to golang-nuts
On Sunday, March 10, 2019 at 5:01:01 PM UTC-7, Manlio Perillo wrote:
I found an issue: If one of the dependencies includes cgo dependencies, not every directory and file gets copied to the vendor directory.

Steps to reproduce:

λ export GO111MODULE=on
λ unset GOPATH
λ cd hugo
λ go mod download
libs  LICENSE

But if you look at https://github.com/wellington/go-libsass, you can see that quite a bit is missing, especially the submodule libsass-src and libsass-build, and this made my build of Hugo fail!

I'm going to try the following patch:

diff --git a/src/cmd/go/internal/modload/load.go b/src/cmd/go/internal/modload/load.go
index 6d6c037af2..77765023be 100644
--- a/src/cmd/go/internal/modload/load.go
+++ b/src/cmd/go/internal/modload/load.go
@@ -968,11 +968,44 @@ func (r *mvsReqs) required(mod module.Version) ([]module.Version, error) {
 		base.Fatalf("go: internal error: %s@%s: unexpected invalid semantic version", mod.Path, mod.Version)
 	}
 
-	data, err := modfetch.GoMod(mod.Path, mod.Version)
-	if err != nil {
-		base.Errorf("go: %s@%s: %v\n", mod.Path, mod.Version, err)
-		return nil, ErrRequire
+	// ~HACK~: Support for buildGoModule
+	//
+	// compute the content of go.mod for this module either by using
+	// modfetch.GoMod() in regular day by day use or by reading the version
+	// already on disk.
+	var (
+		data []byte
+	)
+	if _, ok := os.LookupEnv("__NIX_GO_SKIP_MOD_DOWNLOAD"); ok {
+		// XXX: Must not depend on hardcoded GOPATH
+		gomodPath := fmt.Sprintf("%s/pkg/mod/%s@%s/go.mod", os.Getenv("GOPATH"), mod.Path, mod.Version)
+
+		// does gomodPath exist? If so set data to the contents of this module. If
+		// it does not exist then we must set it to `module mod.Path`.
+		_, err := os.Stat(gomodPath)
+		if err == nil {
+			data, err = ioutil.ReadFile(gomodPath)
+			if err != nil {
+				base.Errorf("go [NIX-REQUIRED]: %s@%s: %v\n", mod.Path, mod.Version, err)
+				return nil, ErrRequire
+			}
+		} else {
+			data = []byte("module " + mod.Path)
+		}
+	} else {
+		// the following call has been taken as it existed before the patch. The
+		// `:=` had to be converted to an `=` as the data and error are now both
+		// defined at the top of the if block.
+		var err error
+		data, err = modfetch.GoMod(mod.Path, mod.Version)
+		if err != nil {
+			base.Errorf("go: %s@%s: %v\n", mod.Path, mod.Version, err)
+			return nil, ErrRequire
+		}
 	}
+	//
+	// ~HACK~: Support for buildGoModule
+
 	f, err := modfile.ParseLax("go.mod", data, nil)
 	if err != nil {
 		base.Errorf("go: %s@%s: parsing go.mod: %v", mod.Path, mod.Version, err)

Manlio Perillo

Mar 11, 2019, 6:02:10 AM
to golang-nuts
It may be caused by the missing pkg/mod/cache/vcs, but it seems unusual.
I was assuming GOPATH was on a temporary directory.
And indeed you wrote that GOPATH is set to $NIX_BUILD_TOP/go, but later you wrote that it is set to a temporary directory.

You can try the following flow (not tested):

1. download modules with `go mod download`
2. *copy* the content of $GOPATH/pkg/mod/cache/download to the output directory
3. set GOPROXY=file:///$out/... when you run `go build`



Manlio Perillo

Wael M. Nasreddine

Mar 11, 2019, 11:22:04 AM
to Manlio Perillo, golang-nuts
I am removing the entire cache folder because its content changes when the git repository of any of the dependencies changes. So if I hash the cache folder and get A, and one of the dependencies then gets a pull request merged upstream, I will get B when I hash it again, even though the merge does not technically change the version that Go is building with. This breaks the concept of packaging :(


> You can try with the following flow (not tested).
>
> 1. download modules with `go mod download`
> 2. *copy* the content of $GOPATH/pkg/mod/cache/download to the output directory
> 3. set GOPROXY=file:///$out/... when you `go build`

This is quite interesting, thank you for this! However, as I said above, the contents of the download directory change as upstream changes :(

Manlio Perillo

Mar 11, 2019, 1:01:04 PM
to golang-nuts
On Monday, March 11, 2019 at 4:22:04 PM UTC+1, Wael Nasreddine wrote:

> [...]
 
> > > On Sunday, March 10, 2019 at 6:59:07 PM UTC-7, Manlio Perillo wrote:
> > > > On Monday, March 11, 2019 at 12:30:02 AM UTC+1, Wael Nasreddine wrote:
> > > > > TL;DR Given a Go module, assuming that I have already done `go mod download`: Is it possible to prevent network access if I delete the entire `$GOPATH/pkg/mod/cache`?
> > > >
> > > > Another solution is to delete only $GOPATH/pkg/mod/cache/vcs.  This is the directory that takes more disk space since it contains the full history of the vcs.
> > >
> > > That was my first attempt, however it did not work as expect because the contents of $GOPATH/pkg/mod/cache/download does change anytime a new version is created upstream. My hash failed to match multiple times a day!
> >
> > It may be caused by the missing pkg/mod/cache/vcs, but it seems unusual.
> > I was assuming GOPATH was on a temporary directory.
> > And indeed you wrote that GOPATH is set to $NIX_BUILD_TOP/go, but later you wrote that it is set to a temporary directory.
>
> I am removing the entire cache folder, as the content changes when the git repository of any of the dependencies change. So if I would hash the cache folder to A and one of the dependencies get a pull request merged upstream, even though it does not technically change the version that Go is building with you will still get B when you hash it. This break the concept of packaging :(


Do you perhaps have the same requirements as in the thread
as reported by Nicolas Mailhot?

That is, you need to patch the upstream source but keep the same version, because you can't (or don't want to) update all the versions of the required modules.

In this case vendoring is, IMHO, currently the only solution, because the go tool ignores the go.sum files in this case.
You can try to open a bug report about the incorrect behavior of `go mod vendor` when using cgo.

Or you can patch cmd/go.  However, instead of writing a patch specific to Nix, I suggest writing a more generic patch.

You can add a new module download mode, e.g. `-mod=trust`, to instruct cmd/go not to check go.sum.
The future notary can be disabled with GONOVERIFY, so it's not a problem.
Maybe you can use GONOVERIFY to decide if the checksum should be ignored.

Finally, you can try to coordinate with other package managers, since this seems to be a shared problem.

Here is a patch.  It seems to work, but one of the Go tests fails and should probably be updated:

An alternative is to check for trust mode in the initGoSum function:

Finally the last alternative is to check for trust mode in the checkGoMod function, but this will also disable the notary.  Maybe it is the correct thing to do:

> [...]


Manlio Perillo 

thepud...@gmail.com

Mar 11, 2019, 3:33:29 PM
to golang-nuts
Hi Wael,

Sorry, I am not quite following what you have and have not tried yet, and which issues you have hit with which techniques.

Is one of the issues that the 'vcs' cache directory changes, even if the actual code you need for your build has not changed?

If so, I wonder if you might be able to use a filesystem-based module cache via 'GOPROXY=file:///file/path', which could avoid using the 'vcs' directory at build time?

It sounds like you have looked into multiple options already, so this might be well known to you, but:

 * The module download cache location is controlled by GOPATH. In particular, 'go mod download', 'go build', etc. populate the module cache in GOPATH/pkg/mod.
 * In addition, when you want to use a particular module cache, you can tell the 'go' command to use a local module cache by setting 'GOPROXY=file:///file/path' environment variable.
 * You can put those two things together:

     # Populate a module download cache in /tmp/gopath-for-cache
     $ GOPATH=/tmp/gopath-for-cache  go mod download

     # Build using the contents of the module download cache in /tmp/gopath-for-cache
     $ GOPROXY=file:///tmp/gopath-for-cache/pkg/mod/cache/download  go build

Note that /tmp/gopath-for-cache/pkg/mod/cache/download would not contain the 'vcs' directory.

I understand those are not the exact steps you would follow, but perhaps something along those lines could be adapted within your constraints?

Also, note that even though you are setting the GOPROXY environment variable in the steps above, there is no actual proxy process involved, and everything is just being read directly from the local filesystem. An even more detailed example is here in this "Go Modules by Example" walk-through: https://github.com/go-modules-by-example/index/tree/master/012_modvendor

Sorry if anything here is off-base, or if this is not helpful.

Regards, 
thepudds

Wael Nasreddine

Mar 13, 2019, 2:25:35 PM
to golang-nuts
On Monday, March 11, 2019 at 10:01:04 AM UTC-7, Manlio Perillo wrote:
> Do you perhaps have the same requirements as in the thread
> as reported by Nicolas Mailhot?
>
> That is, you need to patch the upstream source but keep the same version, because you can't (or don't want to) update all the versions of the required modules.

Not precisely. In my case, I'm doing the build in two stages: a) fetch dependencies and make sure they pass the hash, and b) use (a) to build the module. I can add patches to stage (a) to patch dependencies, but obviously, it does need some patching work due to the path of the dependency itself. I'm not too worried about patching at this time, as I'm more worried about packaging.

> In this case vendoring is, IMHO, currently the only solution, because the go tool ignores the go.sum files in this case.
> You can try to open a bug report about the incorrect behavior of `go mod vendor` when using cgo.
>
> Or you can patch cmd/go.  However instead of writing a patch specific to Nix, I suggest to write a more generic patch.
>
> You can add a new module download mode, e.g. `-mod trust` to instruct cmd/go to not check go.sum.
> The future notary can be disabled with GONOVERIFY, so its not a problem.
> Maybe you can use GONOVERIFY to decide if the checksum should be ignored.
>
> Finally, you can try to coordinate with other package managers, since this seems to be a shared problem.
>
> Here is a patch.  It seems to work but one the Go test fails, and it should probably be updated:
>
> An alternative is to check for trust mode in the initGoSum function:
>
> Finally the last alternative is to check for trust mode in the checkGoMod function, but this will also disable the notary.  Maybe it is the correct thing to do:

None of these patches seems to work if I remove the entire $GOPATH/pkg/mod/cache directory. Again, I'm removing this entire directory because both $GOPATH/pkg/mod/cache/vcs and $GOPATH/pkg/mod/cache/download seem to change whenever upstream has changed, even though go.mod/go.sum are still on the same commit.

I do agree with you, however, that the correct solution is to rely on vendoring instead of patching the module support. I will report a bug in vendoring with regards to cgo.

Wael Nasreddine

Mar 13, 2019, 2:46:59 PM
to golang-nuts
On Monday, March 11, 2019 at 12:33:29 PM UTC-7, thepud...@gmail.com wrote:
> Hi Wael,
>
> Sorry, I am not quite following what you have and have not tried yet, and which issues you have hit with which techniques.

No worries, I'll try to address your questions to clarify the issues I'm having.

> Is one of the issues that the 'vcs' cache directory changes, even if the actual code you need for your build has not changed?

I also noticed the download directory changing even when the actual code had not changed, but I might be incorrect on this assumption as I could not validate it today.

> If so, I wonder if you might be able use a filesystem-based module cache via 'GOPROXY=file:///file/path', which could avoid using the 'vcs' directory at build time?
>
> It sounds like you have looked into multiple options at this point, so this might be well known to you at this point, but:
>
>  * The module download cache location is controlled by GOPATH. In particular, 'go mod download', 'go build', etc. populate the module cache in GOPATH/pkg/mod.
>  * In addition, when you want to use a particular module cache, you can tell the 'go' command to use a local module cache by setting 'GOPROXY=file:///file/path' environment variable.
>  * You can put those two things together:
>
>      # Populate a module download cache in /tmp/gopath-for-cache
>      $ GOPATH=/tmp/gopath-for-cache  go mod download
>
>      # Build using the contents of the module download cache in /tmp/gopath-for-cache
>      $ GOPROXY=file:///tmp/gopath-for-cache/pkg/mod/cache/download  go build
>
> Note that /tmp/gopath-for-cache/pkg/mod/cache/download would not contain the 'vcs' directory.

Using GOPROXY worked; at least I'm not seeing any network attempt. I just need to validate that the contents of the download directory do not change unless the actual code of the module has changed.

Manlio Perillo

Mar 13, 2019, 5:18:19 PM
to golang-nuts
On Wednesday, March 13, 2019 at 7:25:35 PM UTC+1, Wael Nasreddine wrote:
> On Monday, March 11, 2019 at 10:01:04 AM UTC-7, Manlio Perillo wrote:
> > Do you perhaps have the same requirements as in the thread
> > as reported by Nicolas Mailhot?
> >
> > That is, you need to patch the upstream source but keep the same version, because you can't (or don't want to) update all the versions of the required modules.
>
> Not precisely. In my case, I'm doing the build in two stages a) fetch dependencies and make sure they pass the hash

How do you fetch the dependencies?
 
> and b) use (a) to build the module. I can add patches to the stage (a) to patch dependencies, but obviously, it does need some patching work due to the path of the dependency itself. I'm not too worried about patching at this time as I'm more worried about packaging instead.
>
> > In this case vendoring is, IMHO, currently the only solution, because the go tool ignores the go.sum files in this case.
> > You can try to open a bug report about the incorrect behavior of `go mod vendor` when using cgo.
> >
> > Or you can patch cmd/go.  However instead of writing a patch specific to Nix, I suggest to write a more generic patch.

 
> [...]
 
> None of these patches seems to work if I remove the entire $GOPATH/pkg/mod/cache directory.

What -mod=trust does is make cmd/go ignore go.sum hashes.
This is exactly what vendoring does, but instead of having the required modules in a per-module vendor directory, you have a shared pool of modules, accessible using GOPROXY.

By the way, note that only the first patch is correct.

> Again, I'm removing this entire directory because of both $GOPATH/pkg/mod/cache/vcs and $GOPATH/pkg/mod/cache/download seem to change whenever upstream has changed even though go.mod/go.sum are still on the same commit.

I still don't understand why you need to remove the cache directory.
vcs contains a clone of each required module repository.  Cloning a module repository may be a very *expensive* operation.
download contains the module data synthesized by cmd/go from the vcs repositories (AFAIK).  This is *not* an inexpensive operation.

When cmd/go downloads all the dependencies of a module, it may download different versions of the same module, even if they are not used in the build.
For this reason both vcs and download are updated, but the data for each module version is immutable.
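
To make the per-version layout concrete, here is a small sketch that lists the files the go command keeps for one module version under cache/download. The function names are hypothetical, and the case-encoding shown (an upper-case letter becomes `!` plus its lower-case form) is a simplified re-implementation for illustration, with no validation and no escaping of the version string.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// escapePath applies the module cache's case-encoding: each upper-case
// ASCII letter becomes '!' followed by its lower-case form.
// Simplified sketch: it performs no validation of the module path.
func escapePath(modulePath string) string {
	var b strings.Builder
	for _, r := range modulePath {
		if r >= 'A' && r <= 'Z' {
			b.WriteByte('!')
			b.WriteRune(r + 'a' - 'A')
		} else {
			b.WriteRune(r)
		}
	}
	return b.String()
}

// versionFiles lists the per-version files kept under cache/download,
// as slash-separated paths relative to that directory.
func versionFiles(modulePath, version string) []string {
	dir := path.Join(escapePath(modulePath), "@v")
	var files []string
	for _, ext := range []string{".info", ".mod", ".zip", ".ziphash"} {
		files = append(files, path.Join(dir, version+ext))
	}
	return files
}

func main() {
	for _, f := range versionFiles("github.com/BurntSushi/toml", "v0.3.1") {
		fmt.Println(f)
	}
}
```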


> I do agree with you, however, that the correct solution is to rely on vendoring instead of patching the module support. I will report a bug in vendoring with regards to cgo.


Regards
Manlio Perillo 

Manlio Perillo

Mar 13, 2019, 5:31:27 PM
to golang-nuts
On Wednesday, March 13, 2019 at 7:25:35 PM UTC+1, Wael Nasreddine wrote:
> On Monday, March 11, 2019 at 10:01:04 AM UTC-7, Manlio Perillo wrote:
> > Do you perhaps have the same requirements as in the thread
> > as reported by Nicolas Mailhot?
> >
> > That is, you need to patch the upstream source but keep the same version, because you can't (or don't want to) update all the versions of the required modules.
>
> Not precisely. In my case, I'm doing the build in two stages a) fetch dependencies and make sure they pass the hash and b) use (a) to build the module. I can add patches to the stage (a) to patch dependencies, but obviously, it does need some patching work due to the path of the dependency itself. I'm not too worried about patching at this time as I'm more worried about packaging instead.
 

This is how I would do things in order to have a consistent snapshot of Go modules for an OS distribution:

1) Clone each repository of the Go modules you want to include in the
    snapshot, and all the indirect dependencies
2) Patch all the go.mod files to ensure that *only* one version
    of each module is used.  Do not rely on the cmd/go dependency
    resolution algorithm
3) Synthesize the Go module data for each repository, and make
    it accessible from GOPROXY
4) Build

Note that 2) will cause hash checks to fail; this is where -mod=trust comes in.

> [...]

Manlio Perillo 

Wael Nasreddine

Mar 14, 2019, 1:52:52 AM
to golang-nuts
I'm going to describe how I ended up packaging the go modules (and so far it seems to work correctly). I have also replied inline below

I'm using a two-phase approach to package Go modules for Nix:
  1. During the first phase, a package named after the module with the suffix -go-modules is built by running `go mod download` and saving only $GOPATH/pkg/mod/cache/download. The contents of this package are then hashed and compared against a fixed known hash. The build fails if the hash does not match. My only concern is the stability of $GOPATH/pkg/mod/cache/download: does it ever change given the exact same go.mod?
  2. The Go module is then built with $GOPROXY set to file://${go-modules}, which lets Go fetch the dependencies from the local filesystem. No concerns during this step.

On Wednesday, March 13, 2019 at 2:31:27 PM UTC-7, Manlio Perillo wrote:
> > Not precisely. In my case, I'm doing the build in two stages a) fetch dependencies and make sure they pass the hash and b) use (a) to build the module. I can add patches to the stage (a) to patch dependencies, but obviously, it does need some patching work due to the path of the dependency itself. I'm not too worried about patching at this time as I'm more worried about packaging instead.
>
> This is how I would do things in order to have a consistent snapshot of Go modules for an OS distribution:
>
> 1) Clone each repository of the Go modules you want to include in the
>     snapshot, and all the indirect dependencies
> 2) Patch all the go.mod files to ensure that *only* one version
>     of each module is used.  Do not rely on cmd/go dependency
>     resolution algorithm
> 3) Synthesize a Go module data for each repository, and make
>     it accessible from GOPROXY
> 4) Build
>
> Note that 2) will cause hash checks to fail; this is where -mod=trust came to help.

 
What's wrong with using the Go toolchain to grab the dependencies with go mod download?
 
> [...]

Manlio Perillo 

Manlio Perillo

Mar 14, 2019, 7:32:48 AM
to golang-nuts
On Thursday, March 14, 2019 at 6:52:52 AM UTC+1, Wael Nasreddine wrote:
> I'm going to describe how I ended up packaging the go modules (and so far it seems to work correctly). I have also replied inline below
>
> I'm using a two-phase approach to package Go modules for Nix:
>   1. During the first phase, a package named after the module with the suffix -go-modules is built by running go mod download and saving only $GOPATH/pkg/mod/cache/download. The contents of this package are then hashed and compared against a fixed known hash. The build fails if the hash does not match. My only concern is with regards to the stability of $GOPATH/pkg/mod/cache/download, does it ever change given the exact same go.mod?

No, it shouldn't.  There are vxxx.ziphash files for each version.
>   2. The Go module is then built with $GOPROXY set to file://${go-modules} and allows Go to download the dependencies locally. No concerns during this step.
>
> > On Wednesday, March 13, 2019 at 2:31:27 PM UTC-7, Manlio Perillo wrote:
> > > Not precisely. In my case, I'm doing the build in two stages a) fetch dependencies and make sure they pass the hash and b) use (a) to build the module. I can add patches to the stage (a) to patch dependencies, but obviously, it does need some patching work due to the path of the dependency itself. I'm not too worried about patching at this time as I'm more worried about packaging instead.
> >
> > This is how I would do things in order to have a consistent snapshot of Go modules for an OS distribution:
> >
> > 1) Clone each repository of the Go modules you want to include in the
> >     snapshot, and all the indirect dependencies
> > 2) Patch all the go.mod files to ensure that *only* one version
> >     of each module is used.  Do not rely on cmd/go dependency
> >     resolution algorithm
> > 3) Synthesize a Go module data for each repository, and make
> >     it accessible from GOPROXY
> > 4) Build
> >
> > Note that 2) will cause hash checks to fail; this is where -mod=trust came to help.
>
> What's wrong with using the Go toolchain to grab the dependencies with go mod download?
 

Nothing, it is just that I have never used it, and when I tried to use it I didn't notice that it *requires* a version.
However, how do you find all the direct and indirect dependencies of a module?

In my test I created a temporary directory with a temporary module named snapshot, and used `go get` to download a module.
Then I used `go mod graph` to get all the dependencies, saved them, and deleted the temporary directory.


Regards
Manlio Perillo

thepud...@gmail.com

Mar 14, 2019, 11:24:02 AM
to golang-nuts
Hi Wael,

I am curious if the approach outlined in this branch of the thread has continued to work for you so far, vs. perhaps you hit a snag after the initial success you reported below?

Regards,
thepudds