Re: [tiddlyweb] binary content in tiddlyweb

cdent May 4, 2012 6:41 AM
Posted in group: TiddlyWeb
On Thu, 3 May 2012, Ben Gillies wrote:

> How much are binary tiddlers slowing things down by on different
> TiddlyWeb instances (e.g. in general? If large binary
> tiddlers slow things down a lot, but aren't generally PUT/GET very
> often, then it's likely not that important, on the other hand, if they
> are, then it is.

It's sort of a matter of what people want to be able to do. The
current situation on is that the inefficiency is noise
in the analysis done to understand usage patterns. The slowdowns we
saw while at the hothouse should be less of a problem now that I've
made some changes to how the content is saved into the database. Still
causes load, but not load the current user agent should notice.

>> 2 If that is important, how important is it that store and serialization
>>  code changes made to accommodate binaries take account of existing
>>  tiddlyweb plugin code and maintain backward compatibility?
> I'm not sure there's _that_ much code that exists that deals with
> binary tiddlers directly, so I'm not sure it would be _that_ much
> effort to patch things up. From my point of view, I'm quite happy to
> make any necessary changes to my various plugins and push them out at
> the same time to keep things working if it's needed.

The issue is that it might be useful to change the serialization api
so that the as_tiddler method takes either a string or a filehandle,
and that change might need to be reflected across various

It's not quite correct for me to say binary tiddlers. What I really
mean is "tiddlers which are really large". That these are usually
binary is coincidence.
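
Roughly, and only as a sketch: a method that takes either a string or
a filehandle could normalise its input up front. The name
as_filehandle and the UTF-8 assumption are made up for illustration;
this is not the existing TiddlyWeb serializer API.

```python
import io


def as_filehandle(source):
    """Return a binary file-like object for str, bytes, or file-like input."""
    if isinstance(source, str):
        source = source.encode('utf-8')  # assumption: text content is UTF-8
    if isinstance(source, bytes):
        return io.BytesIO(source)
    if hasattr(source, 'read'):
        return source  # already a file handle; pass it through untouched
    raise TypeError('expected str, bytes or a file-like object')
```

Code downstream would then only ever deal with something it can
read() from, whichever form the caller supplied.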

>> 3 How important is it that binary tiddlers remain raw tiddlers or could
>>  binary tiddlers be pointers to binary content?
> They could be, we'd likely have to make several changes to the various
> clientside plugins that deal with them (not _that_ big a deal). Though
> IIRC, large binary tiddlers already appear as links anyway.

That's only in TiddlyWiki. An important aspect of this discussion is
that binary tiddler handling really only matters _outside_ of
TiddlyWiki as that's the only place (currently) where you'd want to
mess with them directly (in TiddlyWiki links are better).

The main issue is that a large tiddler (greater than 10s of MB) gets
read into memory during a PUT or a GET. That's not ideal. Coming up
with ways to avoid that is _required_ if (and only if) people want to
store large tiddlers _in_ TiddlyWeb.
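
One way to avoid the whole-body read on a PUT, sketched under
assumptions: copy the request body to its destination in fixed-size
chunks. The function name and the chunk size are hypothetical. In a
WSGI app, body would typically be environ['wsgi.input'] and length
the CONTENT_LENGTH value.

```python
import io


def copy_body(body, length, out, chunk_size=64 * 1024):
    """Copy up to `length` bytes from `body` to `out`; return bytes copied."""
    copied = 0
    while copied < length:
        # Never ask for more than one chunk, or more than is left.
        chunk = body.read(min(chunk_size, length - copied))
        if not chunk:
            break  # client sent less than it promised
        out.write(chunk)
        copied += len(chunk)
    return copied
```

Memory use is then bounded by chunk_size rather than by the size of
the tiddler.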

My read on what's been said so far is that people are fairly keen on
being able to have them in the same place, with the same API, but
would be willing to accept a layer of indirection if that was

> What do you mean binary URI? Are you talking about a separate URI, or
> just treating the standard URI in different ways depending on the
> Accept header (e.g. request as JSON, txt, etc -> retrieve that
> standard tiddler, request as default, image/jpeg, etc -> retrieve from
> the binary store). If the latter, that seems fairly sensible.

I mean that binaries are stored in some kind of auxiliary storage,
which provides a URI for the content. The tiddler which operates as a
pointer to that content has a field of "_uri" or something like that
which points into the aux storage.

What gets sent to the client depends on the Accept header. If the
default is requested, then the request redirects to the binary URI (so
the aux storage can deliver the content efficiently, i.e. without a
big use of memory). If something else, like JSON, is requested, you'd
get a tiddler which has a pointer field to the binary content.
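
A minimal sketch of that dispatch, with several assumptions: the
tiddler is a plain dict, the pointer field is called "_uri", the
redirect is a 302, and the Accept check is a naive substring match
rather than real content negotiation. None of this is existing
TiddlyWeb code.

```python
import json


def respond(tiddler, accept):
    """Return (status, headers, body) for a hypothetical pointer tiddler."""
    if 'application/json' in accept:
        # JSON was asked for: send the tiddler itself, pointer field included.
        body = json.dumps(tiddler).encode('utf-8')
        return '200 OK', [('Content-Type', 'application/json')], body
    # Default representation: hand the client over to the aux storage.
    return '302 Found', [('Location', tiddler['_uri'])], b''
```

The large content never passes through this handler; only the small
pointer tiddler does.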

The details are still a bit fuzzy.

I'd prefer not to mess with the existing external data structures if
at all possible.

> I care to the extent that I want to be able to upload images to
> TiddlySpace without abusing things too much (that is, it's possible
> now, but better to use something else and hotlink (assuming the
> something else permits that of course)). I don't really do video, pdf,
> doc, etc, but if TiddlyWeb is supposed to hold notes, ideas, whatever,
> then I think it's important not to restrict the format that the notes
> come in.

This is pretty much what I think too.

Chris Dent