Currently package cookiejar has no way to persist cookies:
Jar is in-memory only. This was a deliberate decision
because all proposed solutions for persistent storage of a
cookiejar had some drawbacks. Please see Nigel's excellent
writeup [1] for details and the open questions.
The following will use "disk" as a synonym for any kind
of persistent storage.
I did a (highly unscientific) experiment to see which kinds
of actions happen on a cookie jar and how often. It seems
that "normal web browsing" (some work in different web
applications, some information retrieval, procrastinating)
generates many more updates to the LastAccess field than
actual cookie mutations. Among the mutations, creations
and value updates dominate; deletions are very
rare. Thus a distribution like 100 : 10 : 1 for
(update LastAccess) : (create or modify cookie) : (delete cookie)
seems reasonable. Other use cases, e.g. interaction with
a handful of web services, might produce different ratios,
but I think deletions will still be rare.
One of the main issues with storing cookies to disk is
reporting errors: Cookie handling happens opaquely inside package
http, saving might happen e.g. while following redirects,
and it is unclear how to report or handle a "disk full"
error there. Any scheme where the Jar itself writes to disk
will face these types of problems. I'd thus like to propose
a different way in which the user of Jar is responsible for
initiating the dump to disk. While a bit more complicated
for the user of Jar than the previously proposed solutions
(see references in [1]), this is very flexible and doesn't
suffer from the lack of a way to report write errors.
The user of Jar may save to disk anytime she wants, e.g.:
- After each http request.
- Periodically, maybe every second.
- Only when the disk or overall system is idle.
- ...
Errors are reported to the user and she may retry
saving the unsaved data or handle the error in any other
appropriate way.
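A user-driven save with retry could look roughly like the sketch below. It is not part of the proposal: the saver type is a hypothetical stand-in for the proposed Save method with its storage and persistentOnly arguments already bound, flakySaver only simulates a storage that fails a few times, and resuming from the returned marker is my reading of how next is meant to be used.

```go
package main

import (
	"errors"
	"fmt"
)

// Marker mirrors the proposed type: a point in the lifetime of a Jar.
type Marker uint64

// saver is a hypothetical stand-in for the proposed (*Jar).Save with the
// storage and persistentOnly arguments already bound.
type saver func(from Marker) (next Marker, err error)

// saveWithRetry calls save up to attempts times. After a failure it resumes
// from the marker returned by the failed call.
func saveWithRetry(save saver, from Marker, attempts int) (Marker, error) {
	var err error
	for i := 0; i < attempts; i++ {
		from, err = save(from)
		if err == nil {
			return from, nil
		}
	}
	return from, fmt.Errorf("giving up after %d attempts: %w", attempts, err)
}

// flakySaver simulates a jar whose newest change has marker last and whose
// storage fails the first failures calls after persisting a single change.
func flakySaver(last Marker, failures int) saver {
	return func(from Marker) (Marker, error) {
		if failures > 0 {
			failures--
			if from < last {
				from++ // partial progress before the error
			}
			return from, errors.New("disk full")
		}
		return last, nil
	}
}

func main() {
	next, err := saveWithRetry(flakySaver(10, 2), 0, 3)
	fmt.Println(next, err)
}
```

The same helper could be called after each request, from a ticker, or only when the system is idle; the policy stays entirely with the caller.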
In code this is realized by numbering all the modifications
to Jar: Each cookie modification (create, change, update
LastAccess and delete) gets a serial number which acts as a
kind of time stamp. The user may use these time stamps,
in the form of type Marker, to specify which changes to
a Jar should be dumped to disk. This way of persisting
cookies should work pretty well as the number of deleted
cookies seems fairly low.
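To make the serial number idea concrete, here is a minimal sketch of a toy jar that stamps every modification and can list what changed after a given marker. The jar, entry, set, remove and changedSince names are made up for illustration and say nothing about how cookiejar.Jar would implement this; deletions are kept as tombstones so that they can be reported too.

```go
package main

import (
	"fmt"
	"sort"
)

// Marker is the serial number of a modification, as in the proposal.
type Marker uint64

// entry is a hypothetical internal record: the cookie payload plus the
// serial number of its last modification. deleted marks a tombstone.
type entry struct {
	value   string
	seq     Marker
	deleted bool
}

// jar is a toy stand-in for cookiejar.Jar, keyed by "Domain;Name;Path".
type jar struct {
	seq     Marker
	entries map[string]*entry
}

func newJar() *jar { return &jar{entries: make(map[string]*entry)} }

// set creates or modifies a cookie under a fresh serial number.
func (j *jar) set(key, value string) {
	j.seq++
	j.entries[key] = &entry{value: value, seq: j.seq}
}

// remove replaces a cookie by a tombstone with a fresh serial number.
func (j *jar) remove(key string) {
	j.seq++
	j.entries[key] = &entry{seq: j.seq, deleted: true}
}

// changedSince returns the sorted keys modified after from and the marker
// to pass to the next call. A zero from selects everything.
func (j *jar) changedSince(from Marker) (keys []string, next Marker) {
	for k, e := range j.entries {
		if e.seq > from {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys, j.seq
}

func main() {
	j := newJar()
	j.set("example.com;sid;/", "abc")
	j.set("example.com;lang;/", "en")
	_, saved := j.changedSince(0) // pretend everything so far was saved

	j.set("example.com;lang;/", "de")
	j.remove("example.com;sid;/")
	keys, next := j.changedSince(saved)
	fmt.Println(keys, next) // only the two changes made after saved
}
```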
Only drawback/pitfall: Saving twice with the same marker
but to two different persistent storages might not delete
cookies in the second write.
Two more changes:
First, the LastAccess field is used for limiting the number of
cookies in a Jar (by deleting the least used ones). I suggest
exposing this feature, again completely user controlled.
Second, handling of session cookies: RFC 6265 section 5.3 requires
that the user agent MUST remove non-persistent cookies at session
end but does not define what a session end is. I'd like to propose
that Jar gets an additional method to end a session and delete
all the non-persistent cookies while allowing the Save method to
store session cookies to disk. This would allow resuming
a broken session.
Any comments welcome.
V.
API proposal:
// Marker represents a certain time in the lifetime of a Jar.
type Marker uint64
// Save persists to storage all accumulated changes (new, modified and
// deleted cookies) made to jar since the marker from. A zero value for
// from means to save all cookies.
// persistentOnly controls the handling of non-persistent (session) cookies.
// next indicates the unsaved portion of changes. If err is non-nil
// then none or just some of the accumulated changes have been saved.
func (jar *Jar) Save(from Marker, storage Storage, persistentOnly bool) (next Marker, err error)
// Load merges the content from storage to jar's current content.
// The jar might drop cookies with domains which are not allowed
// according to its public suffix list. Also expired cookies won't
// be loaded.
// The returned next marker is the savepoint for upcoming changes.
// TODO: Explanation of next is incomprehensible.
func (jar *Jar) Load(storage Storage) (next Marker, err error)
// CookieData is used to transfer cookies between a Jar and a Storage.
type CookieData struct {
	Key  string // Key is the ID of this cookie in the form "Domain;Name;Path".
	Data string // Data is the opaque payload data of the cookie.
}
// Storage is a persistent storage for cookies.
type Storage interface {
	// Save writes cookies to the persistent storage.
	// An empty Data in a cookie indicates to delete the cookie identified by Key.
	// Save returns the number of successfully written or deleted cookies
	// which might be less than len(cookies) in which case err contains
	// the reason.
	Save(cookies []CookieData) (nWritten int, err error)

	// ReadAll calls callback for each persisted cookie.
	ReadAll(callback func(cookie CookieData)) error
}
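For illustration, a map-backed implementation of this interface might look like the sketch below. memStorage and its capacity field are hypothetical and exist only to simulate a "disk full" condition; the sketch also assumes that an empty Data string is the delete marker, since Data is a string and cannot be nil.

```go
package main

import (
	"errors"
	"fmt"
)

// CookieData and Storage are copied from the proposal.
type CookieData struct {
	Key  string
	Data string
}

type Storage interface {
	Save(cookies []CookieData) (nWritten int, err error)
	ReadAll(callback func(cookie CookieData)) error
}

// memStorage is a hypothetical map-backed Storage. capacity limits the
// number of stored cookies to simulate a full disk.
type memStorage struct {
	capacity int
	cookies  map[string]string
}

// Save applies cookies in order and stops at the first one it cannot store.
func (s *memStorage) Save(cookies []CookieData) (int, error) {
	for i, c := range cookies {
		if c.Data == "" { // assumption: empty Data means "delete"
			delete(s.cookies, c.Key)
			continue
		}
		if _, ok := s.cookies[c.Key]; !ok && len(s.cookies) >= s.capacity {
			return i, errors.New("memStorage: full")
		}
		s.cookies[c.Key] = c.Data
	}
	return len(cookies), nil
}

// ReadAll calls callback once per stored cookie, in no particular order.
func (s *memStorage) ReadAll(callback func(cookie CookieData)) error {
	for k, d := range s.cookies {
		callback(CookieData{Key: k, Data: d})
	}
	return nil
}

// Compile-time check that memStorage satisfies Storage.
var _ Storage = (*memStorage)(nil)

func main() {
	s := &memStorage{capacity: 2, cookies: map[string]string{}}
	n, err := s.Save([]CookieData{
		{Key: "example.com;a;/", Data: "1"},
		{Key: "example.com;b;/", Data: "2"},
		{Key: "example.com;a;/"}, // delete a
		{Key: "example.com;c;/", Data: "3"},
		{Key: "example.com;d;/", Data: "4"}, // does not fit
	})
	fmt.Println(n, err)
}
```

Because nWritten counts from the start of the slice, a caller can retry with cookies[nWritten:] after the error has been dealt with.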
// Limit deletes the least used cookies from jar until jar
// contains no more than n cookies. The number of deleted cookies
// is returned.
// Calling Limit(0) is the most expensive way to empty a jar.
func (jar *Jar) Limit(n int) int
// EndSession deletes all non-persistent (session) cookies from jar.
func (jar *Jar) EndSession()
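How persistentOnly and EndSession might interact is sketched below on a toy jar. All names (jar, entry, keysToSave, endSession) are hypothetical; keysToSave merely lists which cookies a Save call would consider, assuming that persistentOnly == true skips session cookies.

```go
package main

import (
	"fmt"
	"sort"
)

// entry is a hypothetical cookie record. persistent is false for session
// cookies, i.e. cookies without an expiry.
type entry struct {
	data       string
	persistent bool
}

// jar is a toy stand-in for cookiejar.Jar, keyed by "Domain;Name;Path".
type jar struct {
	entries map[string]entry
}

// keysToSave lists, in sorted order, the cookies a Save would consider.
func (j *jar) keysToSave(persistentOnly bool) []string {
	var keys []string
	for k, e := range j.entries {
		if persistentOnly && !e.persistent {
			continue
		}
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// endSession deletes all non-persistent (session) cookies.
func (j *jar) endSession() {
	for k, e := range j.entries {
		if !e.persistent {
			delete(j.entries, k)
		}
	}
}

func main() {
	j := &jar{entries: map[string]entry{
		"example.com;sid;/":  {data: "abc"},
		"example.com;lang;/": {data: "en", persistent: true},
	}}
	fmt.Println(j.keysToSave(false)) // session cookie included
	fmt.Println(j.keysToSave(true))  // session cookie skipped
	j.endSession()
	fmt.Println(len(j.entries)) // only the persistent cookie is left
}
```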