Starting today, binary tiddler handling on Tank has been adjusted so
that the actual binary content of the tiddler is stored on Amazon's
S3 service. This is done automatically, whether you use the drag and
drop service in the interface, or if you PUT a tiddler that has a
non-textual content type. The process is described below.
But first, some comments about privacy, as I imagine there will be
some curiosity in this area. When the binary content is saved to S3,
it is put in a bucket in which the objects within are "public-read"
but the bucket itself cannot be listed. That means that in order to
get an object out of the bucket you have to have the URI for the
object.
I've set up the code so that the URI is a uuid. These are seriously
hard to guess (you're more likely to win the lottery while getting
struck by lightning, twice). This means that the chance of someone
finding your binary content, despite the "public-read" nature
of the bucket, is vanishingly small, unless you have shared the URI
with them, either because you gave it to them, or you have made it
possible for them to read your tank.
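
To put a rough number on "hard to guess", here is a minimal sketch
of generating such a name. The post only says "uuid"; the random
(version 4) flavour and the helper name new_object_key are
assumptions for illustration, not necessarily what Tank does.

```python
import uuid


def new_object_key():
    """Return a random (version 4) UUID string to use as an object name.

    Assumption: version 4 is used; the post only says "uuid".
    """
    return str(uuid.uuid4())


# A version 4 UUID carries 122 random bits; the remaining 6 bits of
# the 128 are fixed and encode the version and variant.
RANDOM_BITS = 122
POSSIBLE_KEYS = 2 ** RANDOM_BITS

key = new_object_key()
print(key)
print(POSSIBLE_KEYS)
```

Someone who cannot list the bucket would have to hit one specific
value out of that many to stumble on a given object.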
How it works:
There are two entry points to saving a binary tiddler:
1. The original TiddlyWeb one where you PUT a tiddler to its URI
2. Dragging and dropping a file onto the interface, which does a POST
to /closet/<name of the tank>.
In case 1, a validator[1] checks the content type of the incoming
tiddler. If it is binary, a connection is made to S3 and the content
is put in a bucket there, with a uuid name. The URI of that content
is set as the value of the '_canonical_uri' field on the tiddler.
tiddler.text (the binary content) is then set to the empty string
and the usual PUT process continues.
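
The steps in case 1 can be sketched roughly as follows. This is not
the actual Tank code: the Tiddler class is a minimal stand-in, and
is_binary, externalize_binary, fake_upload and the bucket host are
all hypothetical names made up for illustration. The S3 call is
replaced by a callable so the sketch runs without any credentials.

```python
import uuid


class Tiddler:
    """Minimal stand-in for a TiddlyWeb tiddler (illustration only)."""

    def __init__(self, title, text=b'', content_type=None):
        self.title = title
        self.text = text
        self.type = content_type
        self.fields = {}


def is_binary(content_type):
    """Hypothetical check: anything that is not text/* counts as binary."""
    return bool(content_type) and not content_type.startswith('text/')


def externalize_binary(tiddler, upload):
    """Move binary content out of the tiddler, leaving a pointer behind.

    `upload(name, data, content_type)` is an assumed callable that
    stores the bytes somewhere and returns the URI of the result.
    """
    if not is_binary(tiddler.type):
        return tiddler  # textual tiddlers pass through untouched
    name = str(uuid.uuid4())  # uuid name for the stored object
    uri = upload(name, tiddler.text, tiddler.type)
    tiddler.fields['_canonical_uri'] = uri
    tiddler.text = ''  # the bytes now live elsewhere
    return tiddler


def fake_upload(name, data, content_type):
    """Stand-in for the S3 call; the host name is made up."""
    return 'https://bucket.example.com/' + name


image = externalize_binary(
    Tiddler('cat.png', b'\x89PNG', 'image/png'), fake_upload)
note = externalize_binary(
    Tiddler('note', 'hello', 'text/plain'), fake_upload)
print(image.fields, repr(image.text))
print(note.fields, repr(note.text))
```

Passing the uploader in as an argument is just a convenience for the
sketch; it keeps the content-type decision separate from the storage
call.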
In case 2, a custom handler at /closet checks to see if the content
is binary. If it is _not_, a normal tiddler is created and saved as
if a PUT were being done. If it is, the same S3 handling described
above is done, with a new tiddler created with the '_canonical_uri'
field.
In either case, the tank now has a new tiddler in it. When that
tiddler is referenced the '_canonical_uri' handling kicks in and the
browser is sent a redirect to the S3 URI.
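
The redirect step might look something like this sketch. The function
name respond and the choice of a 302 status are assumptions; the
description above only says that a redirect is sent.

```python
def respond(fields, text):
    """Return (status, headers, body) for a request for one tiddler.

    `fields` is the tiddler's fields dictionary and `text` its text.
    The 302 status is an assumption made for this sketch.
    """
    uri = fields.get('_canonical_uri')
    if uri:
        # The content lives elsewhere: point the browser at it.
        return '302 Found', [('Location', uri)], b''
    headers = [('Content-Type', 'text/plain; charset=UTF-8')]
    return '200 OK', headers, text.encode('utf-8')


redirected = respond(
    {'_canonical_uri': 'https://bucket.example.com/some-uuid'}, '')
plain = respond({}, 'hello')
print(redirected)
print(plain)
```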
This has a variety of benefits:
* It's easier to manage tags and other metadata on binaries, after
the fact.
* Processing of the bags/tanks that include binaries is much faster
because the bytes which make up the binary content are not read,
but the title and other metadata are. This is most relevant when
creating lists of tiddlers, like RecentChanges or the lists needed
for the navigation links that were added recently.
* It doesn't use my limited disk space.
The upshot of this is that it is now fairly easy to dump binaries in
Tank, if that's your thing.
I suspect there will be some issues with TWC and TW5 but we'll cross
that bridge when or if they show up.
Finally, creating this feature has been one of those moments where I
get to feel a sense of pride for how well the TiddlyWeb architecture
has supported creating the feature. I never felt any doubt that it
would be relatively trivial to do and it never felt awkward or like
I was fighting the system to make it happen.
[1] http://tiddlyweb.tiddlyspace.com/validator
--
Chris Dent
http://burningchrome.com/
[...]