What is the total max size of embed content?

Glen Newton

Sep 20, 2021, 12:19:51 PM
to golang-nuts
Hello,

The maximum size for any single embedded file as []byte is 4GB. What is the total maximum size for all embedded files included this way in a Go binary?
Also, are there any platform dependencies?

Thanks,
Glen

Ian Lance Taylor

Sep 20, 2021, 12:25:08 PM
to Glen Newton, golang-nuts
On Mon, Sep 20, 2021 at 9:20 AM Glen Newton <glen....@gmail.com> wrote:
>
> The maximum size for any single embedded file as []byte is 4GB. What is the total maximum size for all embedded files included this way in a Go binary?
> Also, are there any platform dependencies?

The total maximum size for all embedded files depends on what the
platform is able to support. And, yes, there are platform
dependencies. Most 32-bit platforms can't support a total size larger
than 2G. Most 64-bit platforms have much larger limits, though I
don't know specifically what they are offhand.

Ian

Glen Newton

Sep 20, 2021, 5:04:33 PM
to golang-nuts
Thanks.

I am testing this with an increasing number of 1GB files. At 3 files, I am getting this error:

compile: writing output: write $WORK/b001/_pkg_.a: no space left on device

I haven't been able to find how to change $WORK to point to a larger partition. Suggestions?

Oh, running:  
  go1.17.1.linux-amd64
  go version go1.17.1 linux/amd64
  Linux OptiPlex-7010 5.8.0-63-generic #71-Ubuntu SMP Tue Jul 13 15:59:12 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


Thanks,
Glen

Ian Lance Taylor

Sep 20, 2021, 5:11:45 PM
to Glen Newton, golang-nuts
On Mon, Sep 20, 2021 at 2:04 PM Glen Newton <glen....@gmail.com> wrote:
>
> I am testing this with an increasing number of 1GB files. At 3 files, and I am getting this error:
>
> compile: writing output: write $WORK/b001/_pkg_.a: no space left on device
>
> I haven't been able to find how to change $WORK to point to a larger partition. Suggestions?

Set the TMPDIR environment variable to control temporary files in
general, or the GOTMPDIR environment variable to control just
temporary files created by the go tool.

Ian

Glen Newton

Sep 20, 2021, 8:29:30 PM
to golang-nuts
Hello,

Thanks for this, and also sorry for the lazy question.

My testing on my machine indicates a 2GB total limit at compile time:

too much data in section SXCOFFTOC (over 2e+09 bytes)
too much data in section SDATA (over 2e+09 bytes)

I have built a bash script that creates large random files of a fixed size and then generates a short Go program that embeds them, increasing the number of files with each run, so you can see when your compile fails.
You can find the bash and instructions here: https://github.com/gnewton/test_go_embed

I just threw it together, so might be a little rough around the edges. Feedback welcome.

Question: why does the above error complain about too much data in two sections? Are the embedded files stored across multiple sections? (I know nothing about how this is done internally in the ELF format.)

Thanks,
Glen

Glen Newton

Sep 21, 2021, 9:22:44 AM
to golang-nuts
Hello,

Looking at https://groups.google.com/g/golang-codereviews/c/xkHLQHidF5s and https://github.com/golang/go/issues/9862 and the code to which they refer, it seems to me that the limit is 2GB independent of platform (is this true?), as the linker is limited to 2e9.
Again, should have done more of my homework!  :-)


https://github.com/golang/go/blob/master/src/cmd/internal/obj/objfile.go#L305

// cutoff is the maximum data section size permitted by the linker
// (see issue #9862).
const cutoff = 2e9 // 2 GB (or so; looks better in errors than 2^31)

The comment indicates that this is a limit in the linker; is this a limit in the ELF format? If not, could the linker et al. be modified to accept larger static data?
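
To make the arithmetic behind that constant concrete, here is a minimal sketch. It only compares byte counts against the quoted value of `cutoff`; the helper `overCutoff` is a hypothetical name for illustration, and how the toolchain actually totals its sections is not modelled here.

```go
package main

import "fmt"

// cutoff mirrors the constant quoted above from cmd/internal/obj.
const cutoff = 2e9

// overCutoff is a hypothetical helper: it reports whether a total
// data size in bytes is larger than cutoff.
func overCutoff(total int64) bool {
	return total > cutoff
}

func main() {
	const gib = 1 << 30 // 1 GiB = 1,073,741,824 bytes

	// One 1 GiB file stays under 2e9; two of them (2^31 bytes) do not.
	for _, total := range []int64{1 * gib, 2 * gib} {
		fmt.Printf("%d bytes: over cutoff = %v\n", total, overCutoff(total))
	}
}
```

Since 2e9 is slightly below 2^31 (2,147,483,648), two embedded files of exactly 1 GiB each are already over the limit.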

Thanks,
Glen

Ian Lance Taylor

Sep 21, 2021, 1:28:12 PM
to Glen Newton, golang-nuts
On Tue, Sep 21, 2021 at 6:23 AM Glen Newton <glen....@gmail.com> wrote:
>
> Looking at https://groups.google.com/g/golang-codereviews/c/xkHLQHidF5s and https://github.com/golang/go/issues/9862 and the code to which they refer, it seems to me that the limit is 2GB independent of platform (is this true?), as the linker is limited to 2e9.
> Again, should have done more of my homework! :-)
>
> https://github.com/golang/go/blob/master/src/cmd/internal/obj/objfile.go#L305
>
> // cutoff is the maximum data section size permitted by the linker
> // (see issue #9862).
> const cutoff = 2e9 // 2 GB (or so; looks better in errors than 2^31)
>
> The comment indicates that this is a limit in the linker; is this a limit in the elf format? If No, could the linker et al be modified to accept larger static data?

You're right: it does look like cmd/link imposes a 2G total limit.
This is not a limitation of the 64-bit ELF format. It should be
possible to make it larger.

Note that such files are going to take a long time to create and will
be unwieldy to use. While it should be possible to increase the size,
I would not be surprised if you hit other limits fairly quickly.

Ian

Glen Newton

Sep 21, 2021, 3:41:42 PM
to golang-nuts
Hello Ian,

Yes, I think it is time I explain the 'why' of my inquiries into this. :-)

My use case is this: Go, with its fast startup, pretty fast execution and pretty small memory footprint, as well as its ability to deploy as a single binary, makes it a great language for cloud function-as-a-service (FaaS).

My specific FAAS use case is for static, read-only databases: I am looking at embedding modest sized (few GB to 10s of GB) read-only databases in the Go binary.

This makes it possible to avoid the cost/complication of either using a cloud db or (in the case of AWS) storing the db on a file system like EFS (NFS), which the lambda has to mount. The databases are only updated every couple of months / yearly.
This is perhaps rather an obscure use case, but one that I think a number of people would want to take advantage of.

>Note that such files are going to take a long time to create and will be unwieldy to use.
Agreed. On my older desktop (Dell 7010, Kubuntu 20.10, 4-core i5, 16GB) it takes 45s to compile a trivial Go program that embeds three 500MB files.

Actually, this is no different from those who want the simplicity and cost savings of serving an entire static web site from embedded files. The primary difference is the scale.

Thanks,
Glen

Amnon

Sep 22, 2021, 12:19:40 PM
to golang-nuts
A cloud DB looks like a great option here.
The cost/complication would be trivial compared to the cost/complication of fixing the toolchain to handle enormous executables,
and of shipping these around.

Tamás Gulácsi

Sep 22, 2021, 1:16:53 PM
to golang-nuts
Go back to the venerable SFX zip archives: append the file as an (uncompressed?) ZIP file to the Go binary.
That binary can then open itself as a ZIP and read/use the database, and every other program will see it as a zip file.

Or just simply append the database and its uint64 size to the go binary.
Then it can open/mmap the exact bytes needed.
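
A minimal sketch of that second idea, under stated assumptions: the layout is taken to be executable bytes, then the database, then an 8-byte little-endian length. The byte order and the names `appendPayload`, `payloadReader`, and `demo` are all hypothetical choices for illustration. An in-memory byte slice stands in for the binary, and a plain `io.SectionReader` is used instead of mmap.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"errors"
	"fmt"
	"io"
)

// trailerLen is the size of the assumed uint64 length trailer.
const trailerLen = 8

// appendPayload returns exe followed by payload and an 8-byte
// little-endian payload length (the assumed layout).
func appendPayload(exe, payload []byte) []byte {
	var trailer [trailerLen]byte
	binary.LittleEndian.PutUint64(trailer[:], uint64(len(payload)))
	out := append([]byte{}, exe...)
	out = append(out, payload...)
	return append(out, trailer[:]...)
}

// payloadReader reads the trailer at the end of a file of the given
// size and returns a reader limited to exactly the payload bytes.
func payloadReader(r io.ReaderAt, size int64) (*io.SectionReader, error) {
	if size < trailerLen {
		return nil, errors.New("file too small for trailer")
	}
	var trailer [trailerLen]byte
	tr := io.NewSectionReader(r, size-trailerLen, trailerLen)
	if _, err := io.ReadFull(tr, trailer[:]); err != nil {
		return nil, err
	}
	n := binary.LittleEndian.Uint64(trailer[:])
	if n > uint64(size-trailerLen) {
		return nil, errors.New("payload length exceeds file size")
	}
	start := size - trailerLen - int64(n)
	return io.NewSectionReader(r, start, int64(n)), nil
}

// demo builds a fake image in memory and reads the payload back.
func demo() (string, error) {
	image := appendPayload([]byte("fake executable bytes"), []byte("key=value"))
	sec, err := payloadReader(bytes.NewReader(image), int64(len(image)))
	if err != nil {
		return "", err
	}
	data, err := io.ReadAll(sec)
	return string(data), err
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

In a real program the `io.ReaderAt` would presumably be the file returned by opening the path from `os.Executable()`, with the size taken from `Stat()`. Because `*io.SectionReader` also implements `io.ReaderAt`, a database library that accepts a reader can seek within the payload without loading it all into memory.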

Howard C. Shaw III

Sep 22, 2021, 1:22:53 PM
to golang-nuts
Before the addition of binary packaged assets into Go as a standard library feature, there were various tools to accomplish the same task. Some of them, such as https://github.com/GeertJohan/go.rice , could use an alternate embedding. Basically, instead of having the binary files packaged as Go code in the executable itself, they simply appended a .zip file to the binary, and accessed that directly. 

Now, I am not suggesting that as a direct solution (though it may be, as Go does apparently support Zip64 which does away with the 4GB limit), but perhaps instead looking at how it does what it does, and adapting that to simply appending a sqlite or other single-file database format to the Go binary and using the same basic technique for accessing it.
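
As a rough illustration of the appended-ZIP technique using only the standard library (this is not how go.rice itself is implemented): `zip.Writer.SetOffset` is documented for writing ZIP data that follows existing bytes such as an executable, and `zip.NewReader` can then be pointed at the whole file. The names `buildImage`, `readEntry`, `demo`, and the entry name `db.cdb` are hypothetical, and an in-memory slice stands in for the binary.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"io"
)

// buildImage returns prefix followed by a ZIP archive that holds one
// stored (uncompressed) entry.
func buildImage(prefix []byte, name string, content []byte) ([]byte, error) {
	var buf bytes.Buffer
	buf.Write(prefix)

	zw := zip.NewWriter(&buf)
	// Tell the writer that the ZIP data starts after the prefix, so
	// the offsets it records are relative to the start of the file.
	zw.SetOffset(int64(len(prefix)))

	w, err := zw.CreateHeader(&zip.FileHeader{Name: name, Method: zip.Store})
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(content); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// readEntry opens the whole image as a ZIP and returns one entry.
func readEntry(image []byte, name string) ([]byte, error) {
	zr, err := zip.NewReader(bytes.NewReader(image), int64(len(image)))
	if err != nil {
		return nil, err
	}
	for _, f := range zr.File {
		if f.Name != name {
			continue
		}
		rc, err := f.Open()
		if err != nil {
			return nil, err
		}
		defer rc.Close()
		return io.ReadAll(rc)
	}
	return nil, fmt.Errorf("entry %q not found", name)
}

// demo round-trips one entry through a fake "binary plus ZIP" image.
func demo() (string, error) {
	image, err := buildImage([]byte("fake executable bytes"), "db.cdb", []byte("key=value"))
	if err != nil {
		return "", err
	}
	data, err := readEntry(image, "db.cdb")
	return string(data), err
}

func main() {
	got, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(got)
}
```

Storing the entry with `zip.Store` keeps the database bytes uncompressed inside the archive, which matches the "(uncompressed?)" suggestion earlier in the thread.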

Glen Newton

Sep 23, 2021, 8:15:36 PM
to golang-nuts

TLDR: a MWE embedding a key-value store file of 20 million k/v pairs (1.9GB) into a Go binary and using the db from the binary.

The same Go program that writes the db then is recompiled, embedding the db in the newly compiled Go binary. Size: ~1.9GB

Then the same Go program is used to read the embedded db.

To read 1 random record from the embedded 1.9GB 20 million record key-value db, takes 133.166µs, 2244k (~2.2MB) resident. Cold start.

This does not seem too bad for this particular use case.  :-)

I'd be interested in trying out BBolt or Badger, as the K/V db I am using here (cdb https://en.wikipedia.org/wiki/Cdb_(software) ) does not support prefix or range scans or buckets.

Comments welcome.

Thanks,
Glen

Glen Newton

Sep 23, 2021, 8:32:16 PM
to golang-nuts
Oh, regarding the comments about breaking tool chains: valgrind fails on this binary:

$ valgrind --version
valgrind-3.16.1
$ valgrind ./goembeddb -S -q
valgrind: mmap(0x54e000, 1977884672) failed in UME with error 22 (Invalid argument).
valgrind: this can be caused by executables with very large text, data or bss segments.
$


Glen Newton

Oct 6, 2023, 11:18:03 AM
to golang-nuts
Just re-ran this (2 years later) with:
  go version go1.21.1 linux/amd64
  Ubuntu 22.04.3 LTS

and an additional section is now reported as over the limit:

  too much data, last section SXCOFFTOC (2097170064, over 2e+09 bytes)
  too much data, last section SDATA (2097170064, over 2e+09 bytes)
  too much data, last section SCOVERAGE_COUNTER (2097373648, over 2e+09 bytes)

(Which makes sense).


Glen