multi-user file storage


Tom Lieber

Jan 29, 2020, 12:25:39 PM
to Upspin
I remember reading somewhere in the docs that the content a user uploads goes to their own storage server, regardless of whose namespace they write to.

Does that mean that if you overwrite a file in someone else's namespace, then shut your server down, the file's contents become unavailable? Does the content disappear from snapshots too, or does snapshotting copy it over?

David Presotto

Jan 29, 2020, 12:57:02 PM
to all...@gmail.com, Upspin
The contents become inaccessible to everyone via that name. However, the original contents are still in the original writer's store, and any previous snapshots will point to them, so they are still available (assuming a snapshot happened twixt the two writes).

Upspin is at heart a write-once storage system. When you overwrite something, nothing goes away. You are adding blocks with the new contents to the store, and rewriting the directory tree (also just blocks in the store) up to the root. Our snapshots are just a list of directory entries pointing to the roots at the time they were made. Thus snapshots (1) are fast and take very little room and (2) represent the directory tree and contents exactly as they were at the time of the snapshot.
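
A toy model may make the write-once idea concrete. This is a sketch under loud assumptions, not Upspin's code or API: `Store`, `write_file`, and `read_file` are hypothetical names, blocks are keyed by a SHA-256 hash of their contents, and the "directory tree" is a single flat directory serialized as JSON.

```python
import hashlib
import json


class Store:
    """Hypothetical write-once block store: a block is never modified or removed."""

    def __init__(self):
        self.blocks = {}

    def put(self, data):
        # Assumption for illustration: the reference is the hash of the contents.
        ref = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(ref, data)  # existing blocks are left untouched
        return ref

    def get(self, ref):
        return self.blocks[ref]


def write_file(store, root_ref, name, data):
    """Add the new contents and a new directory block; return the new root."""
    entries = json.loads(store.get(root_ref)) if root_ref else {}
    entries[name] = store.put(data)
    return store.put(json.dumps(entries, sort_keys=True).encode())


def read_file(store, root_ref, name):
    """Read a file as seen from a given root (current or snapshot)."""
    entries = json.loads(store.get(root_ref))
    return store.get(entries[name])


store = Store()
root1 = write_file(store, None, "notes.txt", b"first version")
snapshots = [root1]  # a snapshot is only a remembered root reference
root2 = write_file(store, root1, "notes.txt", b"second version")
```

Reading through `root2` gives the new contents, while reading through `snapshots[0]` still gives the old ones, because the overwrite only added blocks. The snapshot itself cost one list entry.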

The flip side is garbage collection. We don't have infinite storage. We have made a few attempts at garbage collectors, though nothing to write home about; this is actually something the community could help with. To clean up blocks that are no longer referenced, one has to descend the current tree and all snapshots, marking blocks as one goes. Any block not marked can be removed. Likewise, one could conceivably trim the number of snapshots to reduce the set of referenced blocks.
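
The mark-and-remove pass described above can be sketched roughly as follows. This is not one of the existing Upspin garbage collectors; `mark`, `collect`, and the `children` map (which blocks a directory block points to) are hypothetical stand-ins for walking real directory entries.

```python
def mark(blocks, children, root, marked):
    """Mark every block reachable from one root (current tree or snapshot)."""
    stack = [root]
    while stack:
        ref = stack.pop()
        if ref in marked or ref not in blocks:
            continue
        marked.add(ref)
        stack.extend(children.get(ref, ()))


def collect(blocks, children, roots):
    """Remove every block not reachable from any root; return what was removed."""
    marked = set()
    for root in roots:
        mark(blocks, children, root, marked)
    garbage = set(blocks) - marked
    for ref in garbage:
        del blocks[ref]
    return garbage


# Hypothetical store contents: two roots, their file blocks, and one stray block.
blocks = {"root1": b"dir", "v1": b"old", "root2": b"dir", "v2": b"new", "orphan": b"?"}
children = {"root1": ["v1"], "root2": ["v2"]}

# Current tree (root2) plus one snapshot (root1): only the stray block goes.
removed_first = collect(blocks, children, roots=["root2", "root1"])

# Trimming the snapshot frees the blocks that only it referenced.
removed_second = collect(blocks, children, roots=["root2"])
```

The first pass removes only `orphan`; dropping the snapshot from the root list lets the second pass remove `root1` and `v1` as well.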

The fly in the garbage collection ointment is actually related to your question. If you write into someone else's directory tree, the blocks are in your store. They will not be referenced in your directory tree or snapshots. Therefore, garbage collection will not know about them. I do not have a good solution to that, since you won't necessarily be able to walk their tree even if you know it is there. One could indeed occasionally copy foreign blocks into one's own store; that would be a possible solution.
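
To illustrate both the problem and the copying idea, here is a small sketch. Everything in it is hypothetical: `ann` owns the tree, `bob` wrote into it, stores are plain dicts, and blocks are identified by reference alone. If directory entries also record which store holds each block, they would need updating after a copy, which is not shown.

```python
def copy_foreign_blocks(tree_refs, own_store, foreign_store):
    """Copy into own_store any block the tree references that lives elsewhere."""
    copied = []
    for ref in tree_refs:
        if ref not in own_store and ref in foreign_store:
            own_store[ref] = foreign_store[ref]
            copied.append(ref)
    return copied


ann_store = {"a1": b"written by ann"}
bob_store = {"b1": b"written by bob into ann's tree"}

# Ann's tree references both blocks; nothing in Bob's own tree references b1.
ann_tree_refs = ["a1", "b1"]
bob_reachable = set()

# A collector rooted only at Bob's tree sees b1 as unreferenced.
unreferenced_in_bob = set(bob_store) - bob_reachable

# The tree owner pulls foreign blocks into her own store.
copied = copy_foreign_blocks(ann_tree_refs, ann_store, bob_store)

# Bob's server can now go away without Ann losing content.
bob_store.clear()
still_readable = all(ref in ann_store for ref in ann_tree_refs)
```

The copy has to be driven from the owner's side, since she is the one who can walk her tree and find the foreign references.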




Eric Grosse

Jan 29, 2020, 1:17:23 PM
to David Presotto, all...@gmail.com, Upspin
"will not be referenced in your directory tree"

Yes! That's my primary use of symbolic links in Upspin. I try to remember to create symbolic links to directories in other people's trees where I have written content. If I am consistent enough about it, this helps with the GC. But it is a kludge and I wish we had a better answer.

Tom Lieber

Jan 29, 2020, 1:43:31 PM
to David Presotto, Upspin
Interesting, thanks. That's a lot to think about. It sounds like if you want long-term access to the contents of a shared directory, you need to periodically archive it. Has anyone else wanted an option for snapshot to copy over remote blocks?
