Heads up: Memory overflow while copying a 230GB file into upspin


Filip Filmar

Mar 6, 2024, 2:52:01 PM
to Upspin
Hi folks.

This is just a heads-up about an issue I observed recently. I have not done a full analysis yet, but I wanted to let you know about it, in case someone has ideas before I get the forensics results. (This may take some time, since the repro is lengthy.)

I tried to copy a 230GB file into upspin on a GCP Linux VM, using both `upspinfs` with `rsync` and `upspin cp`, with the same result: the file gets transferred, but at the end `upspinfs` (or `upspin`, as the case may be) uses up all the system memory (64GB RAM, no swap) and gets killed by the OOM killer.

This is what happens with `upspinfs` and `rsync`:

```
╰─>$ rsync --progress <redacted> $HOME/u/<redacted>
242,244,464,202 100% 40.18MB/s 1:35:50 (xfr#1, to-chk=0/1)
rsync: [receiver] close failed on "<redacted>": Software caused connection abort (103)
rsync error: error in file IO (code 11) at receiver.c(888) [receiver=3.2.7]
┬─[filmil@instance-3:~/b2]─[09:52:34]
╰─>$

┬─[filmil@instance-3:~/b2]─[09:48:18]
╰─>$ ./upspin/run.upspin.client.sh: line 13: 12539 Killed nice nohup upspinfs -allow_other -config=$HOME/upspin/<redacted> $HOME/u >> $HOME/run/upspin/nohup.out
```


Rodrigo Schio

Mar 6, 2024, 4:37:42 PM
to Upspin
Hi, I didn't dive deep into this, but the cause could be that the whole file is kept in memory:

```
// File is a simple implementation of upspin.File.
// It always keeps the whole file in memory under the assumption
// that it is encrypted and must be read and written atomically.
type File struct {
    name     upspin.PathName // Full path name.
    offset   int64           // File location for next read or write operation. Constrained to <= maxInt.
    writable bool            // File is writable (made with Create, not Open).
    closed   bool            // Whether the file has been closed, preventing further operations.

    // Used only by readers.
    config upspin.Config
    entry  *upspin.DirEntry
    size   int64
    bu     upspin.BlockUnpacker
    // Keep the most recently unpacked block around
    // in case a subsequent readAt starts at the same place.
    lastBlockIndex int
    lastBlockBytes []byte

    // Used only by writers.
    client upspin.Client // Client the File belongs to.
    data   []byte        // Contents of file.
}
```
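
To make the failure mode concrete, here is a sketch of the accumulate-in-memory pattern that comment describes (hypothetical names, not the real upspin client code): a Write path that grows a single backing slice needs roughly the whole file size in RAM before anything reaches the store, and 230GB does not fit in the VM's 64GB, which is consistent with the process being killed at the end of the transfer.

```go
// Sketch only: a hypothetical type illustrating the whole-file-in-memory
// write pattern. Not the actual upspin implementation.
type memFile struct {
	data   []byte // entire contents held until the file is closed and stored
	offset int64  // location of the next write
}

func (f *memFile) Write(p []byte) (int, error) {
	end := f.offset + int64(len(p))
	if end > int64(len(f.data)) {
		// Grow the backing slice to cover the write. After a 230GB
		// transfer this slice needs ~230GB of RAM, far more than the
		// 64GB available on the VM.
		grown := make([]byte, end)
		copy(grown, f.data)
		f.data = grown
	}
	copy(f.data[f.offset:end], p)
	f.offset = end
	return len(p), nil
}
```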

Eric Grosse

Mar 6, 2024, 4:58:13 PM
to Rodrigo Schio, Upspin
Right, that was the original and, to me, simplest design. We did revise the underlying implementation to split the file into blocks that can be independently encrypted and authenticated, but I'm not astonished if we missed carrying that through comprehensively to all pieces of the code.

Without disputing that this ought to be fixed, I am interested in understanding some of the 230GB context, to the extent that it can be described while respecting privacy. As an example, I have sometimes adopted the design of breaking big files into a Unix directory of smaller files. It just seemed to make things easier for editing, parallel processing, recovering from some errors, or whatever. PDF and zip are familiar examples of doing it the other way: a single big file with an index allowing random-access speedup. Maybe you have another illuminating format?


Filip Filmar

Mar 6, 2024, 5:12:06 PM
to Eric Grosse, Rodrigo Schio, Upspin
On Wed, Mar 6, 2024 at 1:58 PM Eric Grosse <gro...@gmail.com> wrote:
Without disputing that this ought to be fixed, I am interested in understanding some of the 230GB context, to the extent that can be described while respecting privacy.

The content is not confidential. It's a product of `docker save`, and is a .tar.gz of a Docker image. It is produced by https://github.com/filmil/vivado-docker. With Vivado being a monstrosity, I am not sure if the packaging could be made more palatable to upspin as-is. The whole reason I packaged it into a docker container is so that I wouldn't need to deal with its gazillion files and directories.

F

Filip Filmar

Mar 6, 2024, 5:36:12 PM
to Eric Grosse, Rodrigo Schio, Upspin
Hm, I suppose I could pre-split the archive before upload, and then reassemble it when needed. 
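
For what it's worth, a minimal sketch of that pre-split workaround, assuming the parts are written into a directory served by `upspinfs` (the 1 GiB chunk size and `part-NNNNN` naming are arbitrary choices here, not anything upspin prescribes):

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// chunkSize is an arbitrary choice; use whatever upspinfs handles comfortably.
const chunkSize = 1 << 30 // 1 GiB

// splitFile copies src into numbered part files under dstDir (for example an
// upspinfs mount), so memory use stays small regardless of the file size.
func splitFile(src, dstDir string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	for i := 0; ; i++ {
		name := filepath.Join(dstDir, fmt.Sprintf("part-%05d", i))
		out, err := os.Create(name)
		if err != nil {
			return err
		}
		n, err := io.CopyN(out, in, chunkSize)
		// Surface Close errors: with upspinfs the interesting failures
		// happen at close time, as the rsync log above shows.
		if cerr := out.Close(); cerr != nil && err == nil {
			err = cerr
		}
		if err == io.EOF {
			if n == 0 {
				os.Remove(name) // nothing left to copy; drop the empty part
			}
			return nil
		}
		if err != nil {
			return err
		}
	}
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: splitfile <src> <dstDir>")
		os.Exit(2)
	}
	if err := splitFile(os.Args[1], os.Args[2]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Reassembly would then just be concatenating the parts in order, e.g. `cat part-* > image.tar.gz`.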

F

Eric Grosse

Mar 6, 2024, 6:22:14 PM
to Filip Filmar, Rodrigo Schio, Upspin
Thanks, perfectly reasonable. It is mostly a matter of personal taste
these days whether to "tar c; scp; tar x" or "scp -rp". In the old
days I did the former; nowadays I do the latter.

I'll look into what upspin changes we need to make to help you out.

Filip Filmar

Mar 6, 2024, 6:25:06 PM
to Eric Grosse, Rodrigo Schio, Upspin
On Wed, Mar 6, 2024 at 3:22 PM Eric Grosse <gro...@gmail.com> wrote:
Thanks, perfectly reasonable. It is mostly a matter of personal taste
these days whether to "tar c; scp; tar x" or "scp -rp". In the old
days I did the former; nowadays I do the latter.

Well, had I known then what I know now, I might have done things differently... :)

F

Albert-Jan de Vries

Mar 11, 2024, 4:37:56 AM
to Upspin
I had some memory issues with writing files, and made this (rough) change to the upspin client:

On Thursday, March 7, 2024 at 00:25:06 UTC+1, Filip Filmar wrote:

Eric Grosse

Mar 16, 2024, 6:26:21 PM
to Albert-Jan de Vries, Upspin
When we adopted chunk encryption in upspin, I regret that we did not drop Client.Get and Put in favor of only the File interface. (We should have implemented buffering and streaming chunks in Write as well as Read there.)
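
A generic sketch of that buffering idea, in terms of plain io interfaces rather than upspin's actual API; `flushBlock` below is a hypothetical stand-in for whatever would pack, encrypt, and store one block:

```go
// blockWriter is a sketch of a streaming Write: it buffers at most one
// block and hands each full block to flushBlock, so memory use is bounded
// by blockSize rather than by the size of the file.
type blockWriter struct {
	buf        []byte
	blockSize  int
	flushBlock func([]byte) error // must consume or copy the slice before returning
}

func (w *blockWriter) Write(p []byte) (int, error) {
	written := 0
	for len(p) > 0 {
		n := w.blockSize - len(w.buf)
		if n > len(p) {
			n = len(p)
		}
		w.buf = append(w.buf, p[:n]...)
		p = p[n:]
		written += n
		if len(w.buf) == w.blockSize {
			if err := w.flushBlock(w.buf); err != nil {
				return written, err
			}
			w.buf = w.buf[:0]
		}
	}
	return written, nil
}

// Close flushes the final, possibly partial, block.
func (w *blockWriter) Close() error {
	if len(w.buf) == 0 {
		return nil
	}
	return w.flushBlock(w.buf)
}
```

Of course each flushed block is already in the store, which is exactly where the junk-file question below comes from.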

But now consider a streaming app S that uses the new style, compared to an app A that extravagantly spends memory accumulating all results in a byte slice and only calls Put on success. Doesn't S tend to create more junk files in upspin? If S fails partway, the blocks it has already stored become orphans, whereas A stores nothing until it succeeds. We're not great at garbage collection, so I wonder if there is a hidden cost.

Albert-Jan, since you have already implemented a form of this, I wonder if you have any experience on dealing with cleanup, or is it not an issue?


Albert-Jan de Vries

Mar 18, 2024, 7:05:14 AM
to Upspin
Hi Eric,

Thanks for your message. You're right: there is a hidden cost, and garbage collection for upspin services isn't great. I created a cronjob that runs the upspin audit commands to clean things up, and I'm working on an upspin store and dir server that is better at cleaning up the 'garbage' that is no longer needed.

I like the idea of the File interface (or maybe io.Reader and io.Writer are enough).

Regards,
-- AJ

On Saturday, March 16, 2024 at 23:26:21 UTC+1, Eric Grosse wrote:

Eric Grosse

Mar 18, 2024, 11:48:09 AM
to Albert-Jan de Vries, Upspin
Very glad to hear that; thanks!

"Correct" garbage collection of course requires the ability to see all pointers. Upspin audit can do this in a context where there is one dominant owner. In the general global case, where there are links in confidential directories to semi-public resources owned by other upspin users, this seems hard.

One possibility is a more shared version of audit that has a convention on how to publish lists of blocks in use. All readers of a storeserver would have to participate, at penalty of losing files.

Another idea, from the time of the original Plan 9 snapshot filesystem that Upspin builds on, is to have storage so cheap that garbage collection is not a pressing matter. It is never free, though, so we still want to avoid generating huge temporary files. (Hence my bringing the topic up on this thread.)

I'm unclear whether our existing cloud storage does an adequate automatic job of distinguishing cold, very cheap blocks from warmer ones with faster access and higher cost.

David Presotto

Mar 18, 2024, 12:14:04 PM
to Eric Grosse, Albert-Jan de Vries, Upspin
Garbage collection has always been our sore point. While I'd love a world with infinite free storage, that doesn't exist. I've started my storage from scratch a number of times to clean up. In addition to the problem of unknown or unreadable directory servers, our dump directories tie down storage forever. Unless you have a separate, dump-free dir and store server for temporary files, they're there forever once caught by a dump.


