groups to tags changes


Mark Hammond

Jul 21, 2010, 3:23:15 AM

to raindr...@googlegroups.com

I've got some back-end changes wrt tags which I'm ready to push.
Specifically, the changes use the new 'Tags' table to record things like
mailing-lists and 'personal' messages and remove the 'MessageGroup' and
'MessageGroupRecipient' tables as we discussed last week. The back-end
tests all pass with these changes and I've attached the patch.

The API, however, is obviously broken with these changes - the entire
'groupings' API and a couple of the 'conversation' APIs relating to
groups need to be upgraded to work with tags instead of groupings. I
have 2 main issues which prevent me from just replacing the existing
grouping concepts with tags:

* Support for tags in request params (eg, /api/conversations supports a
groups param - this needs to change to support tags). I was thinking
that in the short term we could support 'tags=tagval[,tagval]...' where
'tagval' is of the form 'tag_type:tag_value' - eg:
tags=mailing-list:raindrop-dev@whereever. This means tag 'types' could
not have a ':' char, and tag values could not have a comma - not ideal,
but probably reasonable in the short (and possibly even long) term.

* How to perform efficient queries on these tags. As tags are currently
1:many, it isn't possible to first get an integer ID for a tag which
applies to all messages with that tag - so IIUC we *must* query based on
the string values. Then, given we need to match tag type *and* value, it
doesn't seem possible to use 'in_', so supporting an arbitrary number of
tags per request seems tricky.

Assuming I'm not missing something obvious, I guess I'm asking for some
thoughts from Shane on how we can support these concepts in the short
term - ie, how to specify tags in the request params, and also how to
turn this into a reasonable runtime query. The 'obvious' answer to me is
that tags become many-to-many so we can resolve a tag to a single ID -
but I understand Shane's concerns regarding performance in this model...
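
To make the query question concrete: matching on type *and* value for an
arbitrary number of tags can be written as OR'd (type, value) pairs
instead of a single IN. Below is a minimal sketch using Python's sqlite3
module. The 'tags' table layout, the column names, the sample rows and
the find_messages_with_any_tag helper are all hypothetical (none of them
come from the patch), and it assumes "any of the tags" semantics.

```python
import sqlite3


def find_messages_with_any_tag(conn, tags):
    """Return sorted ids of messages carrying at least one (type, value) pair.

    `tags` is a list of (tag_type, tag_value) tuples. Table and column
    names are hypothetical.
    """
    if not tags:
        return []
    # One "(tag_type = ? AND tag_value = ?)" clause per requested tag,
    # OR'd together; values are passed as bound parameters.
    clause = " OR ".join("(tag_type = ? AND tag_value = ?)" for _ in tags)
    params = [part for pair in tags for part in pair]
    rows = conn.execute(
        "SELECT DISTINCT message_id FROM tags WHERE "
        + clause
        + " ORDER BY message_id",
        params,
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tags (message_id INTEGER, tag_type TEXT, tag_value TEXT)"
)
# A composite index lets each (type, value) pair be looked up by string.
conn.execute("CREATE INDEX tags_type_value ON tags (tag_type, tag_value)")
# Made-up sample rows for illustration only.
conn.executemany(
    "INSERT INTO tags VALUES (?, ?, ?)",
    [
        (1, "mailing-list", "raindrop-dev@whereever"),
        (2, "personal", "yes"),
        (3, "mailing-list", "other-list@example.com"),
    ],
)

wanted = [("mailing-list", "raindrop-dev@whereever"), ("personal", "yes")]
print(find_messages_with_any_tag(conn, wanted))
```

The same shape could presumably be built in SQLAlchemy with or_() over
and_() pairs, though I have not measured how it performs against the
real schema.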

Thanks,

Mark

tags-to-groupings.patch

Shane Caraveo

Jul 21, 2010, 6:01:20 PM

to raindr...@googlegroups.com, Mark Hammond

On 10-07-21 12:23 AM, Mark Hammond wrote:
> I've got some back-end changes wrt tags which I'm ready to push.
> Specifically, the changes use the new 'Tags' table to record things like
> mailing-lists and 'personal' messages and remove the 'MessageGroup' and
> 'MessageGroupRecipient' tables as we discussed last week. The back-end
> tests all pass with these changes and I've attached the patch.
>
> The API, however, is obviously broken with these changes - the entire
> 'groupings' API and a couple of the 'conversation' APIs relating to
> groups need to be upgraded to work with tags instead of groupings. I
> have 2 main issues which prevent me from just replacing the existing
> grouping concepts with tags:
>
> * Support for tags in request params (eg, /api/conversations supports a
> groups param - this needs to change to support tags). I was thinking
> that in the short term we could support 'tags=tagval[,tagval]...' where
> 'tagval' is of the form 'tag_type:tag_value' - eg:
> tags=mailing-list:raindrop-dev@whereever. This means tag 'types' could
> not have a ':' char, and tag values could not have a comma - not ideal,
> but probably reasonable in the short (and possibly even long) term.

We had got to the same syntax, with the addition of !type:value.
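
A minimal sketch of parsing that syntax, for illustration only. The
function name parse_tags_param is hypothetical, it assumes the param
value has already been URL-decoded, and it assumes a leading '!' means
"exclude messages with this tag". Splitting on the first ':' only is
what lets tag values (but not tag types) contain a ':'.

```python
def parse_tags_param(raw):
    """Split 'type:value,!type:value' into (include, exclude) pair lists.

    Hypothetical helper: assumes `raw` is already URL-decoded and that a
    leading '!' marks a tag to exclude.
    """
    include, exclude = [], []
    for item in raw.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty entries such as a trailing comma
        negated = item.startswith("!")
        if negated:
            item = item[1:]
        # Split on the first ':' only, so values may contain ':'.
        tag_type, sep, tag_value = item.partition(":")
        if not sep or not tag_type or not tag_value:
            raise ValueError("expected 'tag_type:tag_value', got %r" % item)
        (exclude if negated else include).append((tag_type, tag_value))
    return include, exclude


include, exclude = parse_tags_param(
    "mailing-list:raindrop-dev@whereever,!personal:yes"
)
print(include, exclude)
```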

> * How to perform efficient queries on these tags. As tags are currently
> 1:many, it isn't possible to first get an integer ID for a tag which
> applies to all messages with that tag - so IIUC we *must* query based on
> the string values. Then, given we need to match tag type *and* value, it
> doesn't seem possible to use 'in_', so supporting an arbitrary number of
> tags per request seems tricky.
>
> Assuming I'm not missing something obvious, I guess I'm asking for some
> thoughts from Shane on how we can support these concepts in the short
> term - ie, how to specify tags in the request params, and also how to
> turn this into a reasonable runtime query. The 'obvious' answer to me is
> that tags become many-to-many so we can resolve a tag to a single ID -
> but I understand Shane's concerns regarding performance in this model...

I'm not sure which will perform better: a join across 3 tables or a
string index. I'm playing right now with changing to a many-to-many,
since after seeing this in action I feel the many-to-many might be
logically better, but I'm having trouble figuring out how to express
this model (best I can figure is many-to-many between Tags and
ContentIdentity).
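
For comparison, here is a rough sketch of what a many-to-many layout
could look like, again with Python's sqlite3 module. Everything in it is
an assumption for illustration: the table and column names, the helper
names, the sample rows, and in particular the choice to link tags
directly to message ids rather than to ContentIdentity. The point is
only that each (type, value) pair resolves to one integer id, after
which a plain IN over ids works.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Each distinct (type, value) pair is stored once and gets one id.
    CREATE TABLE tags (
        id INTEGER PRIMARY KEY,
        tag_type TEXT NOT NULL,
        tag_value TEXT NOT NULL,
        UNIQUE (tag_type, tag_value)
    );
    -- Hypothetical association table: many messages <-> many tags.
    CREATE TABLE message_tags (
        message_id INTEGER NOT NULL,
        tag_id INTEGER NOT NULL REFERENCES tags (id),
        PRIMARY KEY (message_id, tag_id)
    );
    """
)


def tag_id(conn, tag_type, tag_value):
    """Resolve a (type, value) pair to its integer id, or None if unknown."""
    row = conn.execute(
        "SELECT id FROM tags WHERE tag_type = ? AND tag_value = ?",
        (tag_type, tag_value),
    ).fetchone()
    return row[0] if row else None


def messages_for_tag_ids(conn, ids):
    """Return sorted ids of messages linked to any of the given tag ids."""
    ids = list(ids)
    if not ids:
        return []
    marks = ",".join("?" for _ in ids)
    rows = conn.execute(
        "SELECT DISTINCT message_id FROM message_tags "
        "WHERE tag_id IN (%s) ORDER BY message_id" % marks,
        ids,
    )
    return [row[0] for row in rows]


# Made-up sample rows for illustration only.
conn.executemany(
    "INSERT INTO tags (id, tag_type, tag_value) VALUES (?, ?, ?)",
    [(1, "mailing-list", "raindrop-dev@whereever"), (2, "personal", "yes")],
)
conn.executemany(
    "INSERT INTO message_tags VALUES (?, ?)",
    [(10, 1), (11, 1), (11, 2), (12, 2)],
)

list_id = tag_id(conn, "mailing-list", "raindrop-dev@whereever")
print(messages_for_tag_ids(conn, [list_id]))
```

Whether the extra join costs more than the string index lookup is
exactly the open question above; this sketch does not answer it.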