Gloda consumers, say hi now

Jonathan Protzenko

Aug 11, 2011, 8:21:14 PM
to tb-pl...@mozilla.org, dev-apps-t...@lists.mozilla.org
tl;dr

If you're using any sort of gloda query in your Thunderbird addon, please reply immediately to this thread and tell us which message attributes (from, starred, tags, ...) you're searching on. See <https://developer.mozilla.org/en/Thunderbird/Creating_a_Gloda_message_query> for a refresher. Track progress at <https://bugzilla.mozilla.org/show_bug.cgi?id=678405>.

the story

I'm currently working on boosting the performance of the Gloda database. Gloda was initially designed with high hopes in mind and, as such, has an extremely powerful query mechanism: one can say "return all read messages", or "return all messages with John in CC", etc. This is explained at <https://developer.mozilla.org/en/Thunderbird/Creating_a_Gloda_message_query>. However, this power comes at a cost: there's a messageAttributes table that (roughly) contains a list of (message id, attribute id, attribute value) triples, so that you can query on these attributes. This means there are many, many rows in that table.

jonathan@ramona:~ $ sqlite3 ~/.thunderbird/4ckb7o7v.default/global-messages-db.sqlite
SQLite version 3.7.6.3
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select count(*) from messageAttributes;
930757

I have nearly one million rows in that table. This costs not only space but also computing power, since SQLite needs to maintain an index on that table.

the plan

I plan on making this opt-in: that is, when an attribute is defined, by default, it won't generate rows in messageAttributes. This should be a big win, if not the biggest of this summer's gloda improvements. The attributes will still be readable once you have the GlodaMessage handy: they're also stored in the JSON blob that belongs to the messages table; you just won't be able to query on them. Making this opt-in will help make sure people who define their own attributes don't add an extra burden to the table. Most Gloda internal attributes will remain searchable; however, we plan to make a fair number non-searchable (non-exhaustive list):
read, fromMe, toMe, to, cc, bcc, isEncrypted, attachmentInfos, repliedTo, forwarded.
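
To make the opt-in idea concrete, here is a toy model in plain JavaScript. It is not Gloda's code: attributeDefs, the searchable flag and indexMessage are made-up names, and the attribute ids are arbitrary. It only shows the intended behaviour, namely that every attribute lands in the JSON blob while only attributes that opted in produce (message id, attribute id, attribute value) rows.

```javascript
// Toy model of the opt-in plan; NOT Gloda's actual implementation.
// `attributeDefs`, the `searchable` flag and `indexMessage` are hypothetical names.
var attributeDefs = {
  from:    { id: 1, searchable: true },
  starred: { id: 2, searchable: true },
  read:    { id: 3, searchable: false } // stays readable, but no longer queryable
};

function indexMessage(messageId, attrs) {
  var rows = [];
  Object.keys(attrs).forEach(function (name) {
    var def = attributeDefs[name];
    // Only attributes that explicitly opted in generate a row.
    if (def && def.searchable === true) {
      rows.push([messageId, def.id, attrs[name]]);
    }
  });
  // Every attribute is kept in the JSON blob, searchable or not.
  return { jsonBlob: JSON.stringify(attrs), attributeRows: rows };
}

var indexed = indexMessage(42, {
  from: "bob@example.com",
  starred: false,
  read: true
});

console.log(indexed.attributeRows.length);      // 2 rows: from and starred
console.log(JSON.parse(indexed.jsonBlob).read); // true: still readable from the blob
```

In this model, dropping read from the searchable set saves one row per message, which is where the savings on a table with close to a million rows would come from.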

the call

If you or your extension depends on being able to query on these attributes, speak up now and we can make sure you keep that ability. Otherwise, it will go away soon (Thunderbird 9). As far as I know, I'm the only one (Thunderbird Conversations) making serious use of Gloda. :TheOne from IRC mentioned using headerMessageID. If you're using Gloda in your Thunderbird addon, we'd like to know about it. Really.

Please note that if you only use these attributes to refine a query, you can still filter on them manually afterwards: instead of querying for messages from Bob AND read messages, you can just query for messages from Bob, and then filter the results to keep only the read ones.
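
A minimal sketch of that manual filtering, in plain JavaScript. The objects below merely stand in for the results of a "messages from Bob" query; their fields and the keepRead helper are assumptions made for illustration, not Gloda API.

```javascript
// Sketch only: plain objects stand in for the messages a query returned.
// The `read` and `from` fields and the `keepRead` helper are illustrative assumptions.
function keepRead(messages) {
  return messages.filter(function (msg) {
    return msg.read === true;
  });
}

// Pretend this is what the query on "messages from Bob" handed back.
var fromBob = [
  { subject: "Lunch?",     from: "bob@example.com", read: true },
  { subject: "Re: Lunch?", from: "bob@example.com", read: false },
  { subject: "Minutes",    from: "bob@example.com", read: true }
];

// Narrow the results in memory instead of asking the database to do it.
var readFromBob = keepRead(fromBob);
console.log(readFromBob.map(function (m) { return m.subject; }));
```

The trade-off is that the database returns more rows than strictly needed, so this works best when the remaining constraint (here, the sender) is already selective.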

If anything is unclear, I'd be happy to clarify.

Cheers,

jonathan

Jim

Aug 12, 2011, 4:13:34 AM
to Jonathan Protzenko, tb-pl...@mozilla.org
On 08/11/2011 07:21 PM, Jonathan Protzenko wrote:
> If you're using any sort of gloda query in your Thunderbird addon,
> please reply immediately to this thread and tell us which message
> attributes (from, starred, tags, ...) you're searching on.

I use gloda in my (currently very experimental) attachment tab add-on.
My query looks like:

let query = Gloda.newQuery(Gloda.NOUN_MESSAGE);

if (type)
  query.attachmentTypesCategory(MimeCategory(type));
else
  query.attachmentTypes();

if (name)
  query.attachmentNamesMatch(name);

though I imagine it will get more complicated as I add more features.
I'm not sure I'd use any of the to-be-removed attributes; maybe
attachmentInfos, if I could figure out how it's supposed to work (a lot
of the attachment querying seems strange).

Would it be possible to make all the current attributes searchable, but
just make the uncommon ones slower (i.e. not indexed)? That would ensure
that it's relatively easy to build queries while still making the
database smaller.

- Jim
_______________________________________________
tb-planning mailing list
tb-pl...@mozilla.org
https://mail.mozilla.org/listinfo/tb-planning

Jonathan Protzenko

Aug 12, 2011, 11:01:00 AM
to Jim, tb-pl...@mozilla.org
attachmentInfos is one attribute that I don't expect anyone to query,
since it's actually a list of objects, which I'm pretty sure Gloda
doesn't know how to query. It's merely intended as a way for people to
get detailed information about attachments (including size & urls)
without streaming each message in the collection. So this one is
definitely going away from messageAttributes.

If one would like to facet-search on attachment sizes, for instance,
it would be easy to add back an attribute (say, attachmentSizes, just
like attachmentTypes) that is queryable, and add special code to allow
queries on ranges (just like for .dateRanges). I don't know how the
search code works, but if anyone feels the need for extra attributes,
they are easy to add.

jonathan
