upspinfs write with expired token returns error but writes anyway

61 views
Skip to first unread message

Tom Lieber

unread,
May 8, 2020, 4:12:48 PM5/8/20
to Upspin
I noticed that if I write a file after waking my laptop, the write appears to fail with "permission denied: auth token expired" but the data will be written anyway. I suppose that internally it refreshes my token and tries again?

What it looks like in a text editor is:
1) Save, editor says it failed
2) Save, editor says the file has been modified
3) Save, succeeds

I attached a (sanitized) log excerpt.
auth-error-excerpt.log

Eric Grosse

unread,
May 14, 2020, 5:32:38 PM5/14/20
to all...@gmail.com, David Presotto, Upspin
Not my part of the code, but I'll take a stab at it. I don't understand the sanitized log well enough to say if this summary is plausible or not. Dave?

Given the log dates and our previous email thread, I surmise you were using writethrough cacheserver and the laptop had been closed for a few hours and the storeserver had expired the session for the authtoken. The first editor Save would fail, no bytes would be written, and the cacheserver would clear its authtoken. The second editor Save would fail with warning inside the editor, without even talking to the cacheserver. The third editor Save would succeed and bytes would be written all the way.


--
You received this message because you are subscribed to the Google Groups "Upspin" group.
To unsubscribe from this group and stop receiving emails from it, send an email to upspin+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/upspin/CAProexsDXi%2BjfRmYMZXRx6GCLOT0X2QBRO6ipFmWFOddSMmkXg%40mail.gmail.com.

Tom Lieber

unread,
May 14, 2020, 6:30:25 PM5/14/20
to Eric Grosse, David Presotto, Upspin
Yup, that's about it.

From David's e-mail off-list, the only explanation missing from your description is that the first save puts the data in the cache even if the write fails, which is why the mtime is incremented before the second save. So, for example, this can happen if getting the token takes a little while (which I've seen a handful of times but didn't understand before):

1) Save, get "auth token expired"
2) Save, get "file has been modified"
3) Save, get "auth token expired"
4) Save, get "file has been modified"
5) Save, succeed

The best solutions I see are (in descending order):
1) Attempt to refresh the auth token synchronously before failing the write.
2) More aggressively seek to refresh an expired auth token, such as after wake, or when it expires.
3) Don't write to the local cache in writethrough mode if the remote write fails.

The way it things are, it's unclear when the data has actually reached the server. When I first started noticing these errors, I used to take "file has been modified" to mean that the data had been committed, but it really just means it's in the cache.

David Presotto

unread,
May 14, 2020, 11:15:20 PM5/14/20
to all...@gmail.com, Eric Grosse, Upspin
Sorry, my original reply didn't realize that you had writethrough set.

Looking at your log snippet I see a Flush followed by a Release which corresponds to a single Unix close() in the editor.  On some systems one gets both of these on a close(), one to flush the file and the other to actually forget about the open file.  On some we just get the Release. upspinfs tries to do a Put on both of these if the file is dirty.  If the Flush's Put fails, the file is left in dirty state and the subsequent Release will try the Put again, presumably reauthenticating and even reconnecting if the underlying tcp connection went away.

It was my belief however, that the Unix close() would not return until either or both of the Flush and/or Release had happened.  Therefore, if you didn't close your laptop until after the editor closed the file, I'm confused as to why it hadn't already been written to the store and dir servers.  That really should be synchronous in writethrough mode and at least guaranteeing that the file is in the cacheserver cache in writeback mode.

We changed both sam and acme to retry a number of times if they see errors on the return from close().  This was precisely to fix the case of the close happening after we had lost and regained connectivity.  But I'm not seeing that in your log, just the one attempt. 

David Presotto

unread,
May 15, 2020, 12:41:21 PM5/15/20
to all...@gmail.com, Eric Grosse, Upspin
And to make it clear, the cacheserver does cache the file on writethrough but does the Put as soon as the application does the *nix close() so that you should be safe immediately after the close(). Things are actually being doubled cached because upspinfs has any open files cached though forgets about them after the Release.  The cacheserver however will cache any files read or written for faster access next time you read them.  By Watch'ing the dir server, the cacheserver knows when to invalidate the cached files should they change around its back.  And we use PutSequenced wherever possible to make sure you don't accidentally overwrite a file that has been modified behind your back while you are working on it.
Reply all
Reply to author
Forward
0 new messages