Related Tags


clayton collie

Jun 30, 2012, 9:04:21 AM
to rav...@googlegroups.com
I'm porting some generic tagging code from NHibernate, and I'm trying to figure out how to implement "related tags".
In other words, find related items based on a set of common tags.

An abbreviated model follows:

    public class Vocabulary
    {
        public long Id { get; set; }
        public string Name { get; set; }
    }

    public class Term
    {
        public long Id { get; set; }
        public string Name { get; set; }
        public long VocabularyId { get; set; }
    }

    [Serializable]
    public class TaggedItem : DomainEntity
    {
        private DateTime _taggedDate;
        public TaggedItem()
        {
            _taggedDate = SystemTime.Now();
        }
        public TaggedItem(Type itemType, string itemId, DateTime taggedDate, Term tag, Person taggedBy)
        {
            ItemType = itemType;
            ItemId = itemId;
            TaggedDate = taggedDate.EnsureUtc();
            TermId = tag.Id;
            TaggedById = taggedBy.Id;
        }
        public DateTime TaggedDate
        {
            get { return _taggedDate; }
            set { _taggedDate = value.EnsureUtc(); }
        }
        public Type ItemType { get; set; }
        // Raven Id of the item being tagged
        public string ItemId { get; set; }
        public long TermId { get; set; }
        public long TaggedById { get; set; }
    }
What I need is this: given a list of term names (forget vocabulary for now), give me the set of TaggedItems that were tagged with those terms, but exclude those terms from the result. So suppose I have
albums as follows:
Album1 - "rock", "punk"
Album2 - "jazz", "vocals"
Album3 - "punk", "rock", "jazz"

If I search for items related to {"punk", "rock"}, I should get the tagged item documents for Album3.


In NHibernate, I built this with the Criteria API, and it was involved even given that API's sophistication.

Any clues on:
1. How to set up an index to handle this efficiently
2. How to build the query. In NH, I used multiple self-joins. I don't think Raven handles this.


clayton

Kijana Woodard

Jun 30, 2012, 10:22:02 AM
to rav...@googlegroups.com
Your model *looks* like a port from a relational database. You may want to consider redesigning it in a more document-oriented manner.

Then your tag search is trivial:

If you want the "tagging occurrences" as separate docs, you should be able to query the "tagging occurrences" collection and then project back to your original entity type (see live projections):

And if you want to get back a mixed bag of DomainEntity objects:

Does that help?

clayton collie

Jun 30, 2012, 12:41:24 PM
to rav...@googlegroups.com
On Saturday, June 30, 2012 10:22:02 AM UTC-4, Kijana Woodard wrote:
Your model *looks* like a port from a relational database. You may want to consider re-designing in a more document oriented manner.

It is indeed a port. See below:
 
If you want the "tagging occurrences" as separate docs, you should be able to query the "tagging occurrences" collection and then project back to your original entity type (see live projections):

Independent tag occurrences allow me to show individual personal preferences (for recommendations)
as well as site-wide popularity of tags over time. They also allow for tagging orthogonal to the domain, i.e.
I can tag any document in the system without including tags in individual documents.

And if you want to get back a mixed bag of DomainEntity objects:

Does that help?
 
Somewhat. The clearest way I can express the question is: given a set of "tagging occurrences" (TaggedItem in my case), how do I query for the subset of these
whose items have a superset of the tags in a given list?

I've created an index (TaggedItem_Index) which includes the TermName in addition to the fields of TaggedItem.
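
The superset question above can be sketched in plain LINQ to Objects. This is only an in-memory illustration of the set logic, not a RavenDB index or query; `TagOccurrence` and `RelatedItems.FindSupersets` are hypothetical names, and the class is a trimmed-down stand-in for TaggedItem with the term name denormalized onto it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down stand-in for a TaggedItem document,
// carrying the term name directly (as TaggedItem_Index is said to do).
public class TagOccurrence
{
    public string ItemId { get; set; }
    public string TermName { get; set; }
}

public static class RelatedItems
{
    // Returns the ids of items whose tag set is a superset of requiredTerms.
    // excludeItemId lets the caller leave out the item currently being viewed.
    public static List<string> FindSupersets(
        IEnumerable<TagOccurrence> occurrences,
        IEnumerable<string> requiredTerms,
        string excludeItemId = null)
    {
        var required = new HashSet<string>(requiredTerms, StringComparer.OrdinalIgnoreCase);

        return occurrences
            .GroupBy(o => o.ItemId)                      // one group per tagged item
            .Where(g => g.Key != excludeItemId)
            .Where(g => required.IsSubsetOf(g.Select(o => o.TermName)))
            .Select(g => g.Key)
            .OrderBy(id => id, StringComparer.Ordinal)   // stable order for display
            .ToList();
    }
}
```

With the album data from the first post, asking for {"punk", "rock"} while excluding Album1 leaves only Album3. The grouping step stands in for the self-joins used in the NHibernate version.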

Oren Eini (Ayende Rahien)

Jun 30, 2012, 1:03:15 PM
to rav...@googlegroups.com
Clayton,
Let us leave aside the current model.
What is it that you are trying to _do_?

clayton collie

Jun 30, 2012, 1:54:50 PM
to rav...@googlegroups.com
I'm creating a generic tagging system, where "tagging occurrences" are stored as docs outside of individual models. When a user tags a document, one of these documents is created.

So let's say I have an Event, tagged with "jazz", "music", "adult". On the event page, I want to allow the user to find related events (documents) based on those tags. Essentially, find all other
documents which have those tags in common.

So suppose I have Events (concerts in this example):

Earl Klugh - "jazz", "music", "adult"
Mulgrew Miller - "jazz", "music", "classic", "adult"
Poncho Sanchez - "jazz", "latin"

If I'm on the Earl Klugh page and I click on "Related", I should find Mulgrew Miller, since the tags for Mulgrew Miller are a superset of Earl Klugh's.






Itamar Syn-Hershko

Jun 30, 2012, 2:05:40 PM
to rav...@googlegroups.com
You are still describing the scenario in relational terms.

What you want to do is have a list of strings in your classes, which will contain the tag names. When you click on Related, you simply issue a query to RavenDB asking it to find all documents with at least the tags the currently viewed document has.
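
A minimal sketch of that shape, assuming a hypothetical `Event` class with an embedded list of tag names. The filter below runs over an in-memory collection with LINQ to Objects; it shows the "at least these tags" condition, not the actual RavenDB query syntax, which would have to express the same condition against an index.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical document class: the tags live inside the document itself.
public class Event
{
    public string Id { get; set; }
    public string Name { get; set; }
    public List<string> Tags { get; set; }

    public Event()
    {
        Tags = new List<string>();
    }
}

public static class EventQueries
{
    // Returns every other event that carries at least all tags of `current`.
    public static List<Event> RelatedTo(Event current, IEnumerable<Event> all)
    {
        return all
            .Where(e => e.Id != current.Id)                             // skip the event being viewed
            .Where(e => current.Tags.All(tag => e.Tags.Contains(tag)))  // superset check
            .ToList();
    }
}
```

For the three concerts in the earlier post, calling `EventQueries.RelatedTo` with the Earl Klugh event yields only Mulgrew Miller, because Poncho Sanchez lacks "music" and "adult".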

clayton collie

Jun 30, 2012, 2:13:45 PM
to rav...@googlegroups.com

On Saturday, June 30, 2012 2:05:40 PM UTC-4, Itamar Syn-Hershko wrote:
You are still describing the scenario in relational terms

What you want to do is have a list of strings in your classes, which will contain the tag names. When you click on Related, you simply issue a query to RavenDB asking it to find all documents with at least the tags the currently viewed document has.
 
Strings would be ids, but that's certainly a way to go. My only difficulties:
1. Terms have identity ("set" has a different meaning if I'm talking math, music or sports), hence ids and not strings.
2. I want to know who tagged what and when.
3. I want to do this once, as a service, so I don't need to have ids in each document I want to tag. In my app, there are many such document types.

I'll try the embedded case and see if a general pattern emerges.

Oren Eini (Ayende Rahien)

Jun 30, 2012, 3:38:52 PM
to rav...@googlegroups.com
1) So use ids.
2) So use a complex relation (Entity.Tags, where Tags contains objects with TagId, TaggedBy, TaggedWhen).
3) You can do it as a base class collection, but I don't think you need a service for that. It is just data, and easily used.
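
One way the three points above might look as classes, purely as a sketch. The property names TagId, TaggedBy and TaggedWhen come from point 2. The class names `TagAssignment`, `TaggedEntity` and `TagIdQueries` are hypothetical, as is the choice of `long` for the ids (borrowed from the original model).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical embedded value: which tag, who applied it, and when.
public class TagAssignment
{
    public long TagId { get; set; }
    public long TaggedBy { get; set; }
    public DateTime TaggedWhen { get; set; }
}

// Hypothetical base class carrying the tag collection (point 3).
public class TaggedEntity
{
    public string Id { get; set; }
    public List<TagAssignment> Tags { get; set; }

    public TaggedEntity()
    {
        Tags = new List<TagAssignment>();
    }
}

public static class TagIdQueries
{
    // True when the entity carries every tag id in tagIds (in-memory check).
    public static bool HasAllTags(TaggedEntity entity, IEnumerable<long> tagIds)
    {
        var present = new HashSet<long>(entity.Tags.Select(t => t.TagId));
        return tagIds.All(id => present.Contains(id));
    }
}
```

Because each assignment keeps TaggedBy and TaggedWhen, the per-user and over-time questions can still be answered from the embedded data, while the superset check only looks at TagId.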

Oren Eini (Ayende Rahien)

Jun 30, 2012, 4:34:49 PM
to rav...@googlegroups.com

clayton collie

Jun 30, 2012, 6:50:07 PM
to rav...@googlegroups.com
Thanks a million, I'll try this route.

For the case of multiple document types having tags, an ITaggable interface (with a Tags collection property) may be sufficient to hack together a general solution.
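
A rough sketch of the ITaggable idea, assuming string tags for brevity. Only the interface name and its Tags property come from the post; the `Album` and `Concert` classes and the `WithAllTags` helper are hypothetical, and the filter works on in-memory sequences rather than on a RavenDB query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface ITaggable
{
    string Id { get; }
    ICollection<string> Tags { get; }
}

// Two unrelated (hypothetical) document types sharing the same tagging contract.
public class Album : ITaggable
{
    public string Id { get; set; }
    public ICollection<string> Tags { get; private set; }
    public Album() { Tags = new List<string>(); }
}

public class Concert : ITaggable
{
    public string Id { get; set; }
    public ICollection<string> Tags { get; private set; }
    public Concert() { Tags = new List<string>(); }
}

public static class TaggableExtensions
{
    // Keeps the items that carry every one of the requested tags.
    public static IEnumerable<T> WithAllTags<T>(this IEnumerable<T> source, IEnumerable<string> tags)
        where T : ITaggable
    {
        var required = tags.ToList(); // materialize once; the filter itself is lazy
        return source.Where(item => required.All(t => item.Tags.Contains(t)));
    }
}
```

The same helper then applies to a `List<Album>`, a `List<Concert>`, or a mixed `List<ITaggable>`, which is roughly what a general solution across document types would need.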

Justin A

Jul 2, 2012, 11:41:27 PM
to rav...@googlegroups.com
@Clayton - here's a demo app I have up on GitHub. It lists questions. You can filter the question list by tags. This is the equivalent of saying 'gimme all the Events that are tagged as "jazz"'.

This is the sample controller code which returns the filtered list.

And this is the sample domain object / POCO which has a list of tags (which are used to filter on).

Now I know it's not -exactly- what you're after, but it's a start. It's not too hard to replace my Tag property with something closer to your requirements.

HTH.

clayton collie

Jul 3, 2012, 5:48:33 AM
to rav...@googlegroups.com
Thanks, will have a look.