[Tank] binary file handling


chris...@gmail.com

Mar 6, 2014, 7:08:25 AM
to tidd...@googlegroups.com

Starting today binary tiddler handling on Tank has been adjusted so
that the actual binary content of the tiddler is stored on Amazon's
S3 service. This is done automatically, whether you use the drag and
drop service in the interface, or if you PUT a tiddler that has a
non-textual content type. The process is described below.

But first, some comments about privacy, as I imagine there will be
some curiosity in this area. When the binary content is saved to S3,
it is put in a bucket in which the objects within are "public-read"
but the bucket itself cannot be listed. That means that in order to
get an object out of the bucket you have to have the URI for the
object.
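
As a rough illustration of the "readable but not listable" arrangement, here is a minimal sketch of a bucket policy built in Python. The bucket name and the statement id are made up for the example, and this is only one way to get the effect described above, not necessarily the configuration Tank uses.

```python
import json

# Hypothetical bucket name, for illustration only.
BUCKET = "example-tank-binaries"


def public_read_policy(bucket):
    """Build a policy that lets anyone GET an object by key but grants no listing."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "PublicReadObjects",  # hypothetical statement id
                "Effect": "Allow",
                "Principal": "*",
                # Only object reads are granted. The list action is left out,
                # so anonymous callers cannot enumerate the keys in the bucket.
                "Action": "s3:GetObject",
                "Resource": "arn:aws:s3:::%s/*" % bucket,
            }
        ],
    }


policy = public_read_policy(BUCKET)
print(json.dumps(policy, indent=2))
```

With a policy along these lines, a request for a known key succeeds while a request to list the bucket is refused.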

I've set up the code so that the URI is a uuid. These are seriously
hard to guess (you're more likely to win the lottery while getting
struck by lightning, twice). This means that the chances of someone
finding your binary content, despite the "public-read" nature
of the bucket, are vanishingly small, unless you have shared the URI
with them, either because you gave it to them, or you have made it
possible for them to read your tank.
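
To give a feel for why a uuid name is hard to guess, here is a small sketch. It assumes a random (version 4) uuid, which the message above does not specify, and the bucket URL is a placeholder.

```python
import uuid

# Hypothetical bucket endpoint, for illustration only.
BUCKET_URL = "https://example-tank-binaries.s3.amazonaws.com"


def new_object_uri(base=BUCKET_URL):
    """Return a URI whose final segment is a freshly generated random uuid."""
    key = str(uuid.uuid4())
    return "%s/%s" % (base, key)


uri = new_object_uri()
print(uri)

# A version 4 uuid carries 122 random bits, so this is the number of
# possible keys someone would have to search through.
print("possible keys: %d" % 2 ** 122)
```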

How it works:

There are two entry points to saving a binary tiddler:

1. The original TiddlyWeb one where you PUT a tiddler to its URI
2. Dragging and dropping a file onto the interface, which does a POST
to /closet/<name of the tank>.

In case 1, a validator[1] checks the content type of the incoming
tiddler. If it is binary, a connection is made to S3 and the content is put
in a bucket there, with a uuid name. The URI of that content is set
as the value of the '_canonical_uri' field on the tiddler.
tiddler.text (the binary content) is set to the empty string and
the usual PUT process continues.
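
The steps in case 1 can be sketched roughly as follows. This is not the Tank code: the Tiddler class is a minimal stand-in, is_binary is a guessed heuristic, and store_in_s3 writes to a plain dict instead of talking to S3, so all three names are hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Tiddler:
    """Minimal stand-in for a TiddlyWeb tiddler (hypothetical)."""
    title: str
    type: str = None
    text: object = ""
    fields: dict = field(default_factory=dict)


def is_binary(content_type):
    """Guessed heuristic: no type, text/* and a few textual types are not binary."""
    if not content_type:
        return False
    textual = ("application/json", "application/javascript")
    return not (content_type.startswith("text/")
                or content_type.endswith("+xml")
                or content_type in textual)


def store_in_s3(data, content_type, store):
    """Stand-in for the upload: keep the bytes under a uuid key, return a URI."""
    key = str(uuid.uuid4())
    store[key] = (content_type, data)
    return "https://example-bucket.s3.amazonaws.com/" + key


def validate_binary(tiddler, store):
    """Move binary content out of the tiddler and leave a pointer behind."""
    if not is_binary(tiddler.type):
        return tiddler  # textual tiddlers pass through untouched
    tiddler.fields["_canonical_uri"] = store_in_s3(
        tiddler.text, tiddler.type, store)
    tiddler.text = ""  # the tiddler itself keeps only its metadata
    return tiddler


fake_s3 = {}
image = validate_binary(Tiddler("logo", "image/gif", b"GIF89a"), fake_s3)
note = validate_binary(Tiddler("note", "text/plain", "hello"), fake_s3)
print(image.fields["_canonical_uri"], repr(image.text), repr(note.text))
```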

In case 2, a custom handler at /closet checks to see if the content
is binary. If it is _not_, a normal tiddler is created and saved as
if a PUT were being done. If it is, the same S3 handling described
above is done, with a new tiddler created with the '_canonical_uri'
field.

In either case, the tank now has a new tiddler in it. When that
tiddler is referenced the '_canonical_uri' handling kicks in and the
browser is sent a redirect to the S3 URI.
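
A minimal sketch of that redirect step, written against a WSGI-style start_response callable. The function name is hypothetical and the 302 status is an assumption; the message above only says that a redirect is sent.

```python
def serve_tiddler(fields, text, start_response):
    """Redirect to '_canonical_uri' when it is set, else serve the text."""
    canonical = fields.get("_canonical_uri")
    if canonical:
        # Assumed status code: the original only says "a redirect".
        start_response("302 Found", [("Location", canonical)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=UTF-8")])
    return [text.encode("utf-8")]


# Record what would be sent to the browser instead of running a server.
calls = []


def record(status, headers):
    calls.append((status, headers))


serve_tiddler({"_canonical_uri": "https://example.com/abc"}, "", record)
plain_body = serve_tiddler({}, "hello", record)
print(calls)
```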

This has a variety of benefits:

* It's easier to manage tags and other metadata on binaries, after
the fact.
* Processing of the bags/tanks that include binaries is much faster
because the bytes which make up the binary content are not read,
but the title and other metadata are. This is most relevant when
creating lists of tiddlers, like RecentChanges or the lists needed
for the navigation links that were added recently.
* It doesn't use my limited disk space.

The upshot of this is that it is now fairly easy to dump binaries in
Tank, if that's your thing.

I suspect there will be some issues with TWC and TW5 but we'll cross
that bridge when or if they show up.

Finally, creating this feature has been one of those moments where I
get to feel a sense of pride for how well the TiddlyWeb architecture
has supported creating the feature. I never felt any doubt that it
would be relatively trivial to do and it never felt awkward or like
I was fighting the system to make it happen.

[1] http://tiddlyweb.tiddlyspace.com/validator

--
Chris Dent http://burningchrome.com/

Jeremy Ruston

Mar 6, 2014, 7:41:48 AM
to tidd...@googlegroups.com
Hi Chris

That's very cool, well done.

In terms of getting it to work with TW5, at the moment TW5 will PUT binary tiddlers in JSON with the content base64 encoded. Would that still work?

Best wishes

Jeremy




--
You received this message because you are subscribed to the Google Groups "TiddlyWeb" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tiddlyweb+unsubscribe@googlegroups.com.
To post to this group, send email to tidd...@googlegroups.com.
Visit this group at http://groups.google.com/group/tiddlyweb.
For more options, visit https://groups.google.com/groups/opt_out.



--
Jeremy Ruston
mailto:jeremy...@gmail.com

chris...@gmail.com

Mar 6, 2014, 8:39:33 AM
to tidd...@googlegroups.com
On Thu, 6 Mar 2014, Jeremy Ruston wrote:

> That's very cool, well done.

Thanks. It all just sort of fell into place. The hardest part ended
up being decoding S3 bucket policies.

> In terms of getting it to work with TW5, at the moment TW5 will PUT binary
> tiddlers in JSON with the content base64 encoded. Would that still work?

PUTting them works just fine, and they initially display fine as
well. But when the wiki is reloaded the tiddler comes in as

{"text": "", "type": "image/gif", "fields": {
"_canonical_uri": "<the s3 uri>"}}

There's no image data to display. That _canonical_uri needs to be
chased to get the stuff in some fashion.
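
One possible shape for "chasing" the field on the client side, sketched in Python rather than in TW5's own code. The function name is hypothetical, and it assumes the JSON layout shown above plus base64 text for tiddlers that still carry their content inline.

```python
import base64
import json


def resolve_binary(tiddler_json):
    """Return ("uri", location) if the content lives elsewhere, else ("bytes", data)."""
    tiddler = json.loads(tiddler_json)
    canonical = tiddler.get("fields", {}).get("_canonical_uri")
    if canonical and not tiddler.get("text"):
        # Nothing inline: the caller has to fetch (or link to) this URI.
        return ("uri", canonical)
    # Inline content is assumed to be base64 encoded.
    return ("bytes", base64.b64decode(tiddler.get("text", "")))


external = json.dumps({"text": "", "type": "image/gif",
                       "fields": {"_canonical_uri": "https://example.com/abc"}})
inline = json.dumps({"text": base64.b64encode(b"GIF89a").decode("ascii"),
                     "type": "image/gif", "fields": {}})

print(resolve_binary(external))
print(resolve_binary(inline))
```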

Jeremy Ruston

Mar 6, 2014, 9:05:52 AM
to tidd...@googlegroups.com
Hi Chris

> There's no image data to display. That _canonical_uri needs to be chased to get the stuff in some fashion.

Great, I've added a ticket for it:


Best wishes

Jeremy


