Add override points for Query transformation and processing per object


Ted Elliott

Oct 30, 2012, 12:03:05 PM
to mongodb...@googlegroups.com
Would it be possible to add some override points in the driver for LINQ query transformation, as well as for processing an object after it's deserialized but before it's returned from the enumerator?

Here is what we are trying to accomplish: we have some fields where we've "compressed" strings to save space in the database. Basically we take a string, put it in a lookup table to get back an int Id, and store the int in the database. We would like to put that as close to the database as possible, to make it more automatic and not require the developer to think about it too much. The object is modeled like this:


class Foo {
    [BsonIgnore]
    [MyCompressedField("BarId")]
    public string BarString { get; set; }

    [BsonElement("B")]
    public int BarId { get; set; }
}

So we want to do a few things here:
1. Rewrite LINQ queries so that when someone does FooCollection.Where(f => f.BarString == "bar"), it is translated to FooCollection.Where(f => f.BarId == 123), which results in the MongoDB query { B : 123 }, where 123 is the key for the string "bar".
2. As the object comes out of the database, repopulate our string field:
   f => { f.BarString = stringLookup.GetString(f.BarId); }
3. Inserts/Updates need to do the reverse:
   f => { f.BarId = stringLookup.GetStringId(f.BarString); }

We have 1 and 2 working, but we've had to wrap the MongoQueryProvider and MongoQueryable in odd ways.  The BeginInit/EndInit methods don't really work for this scenario, as we require a reference to an external object to do the conversion, and it is not a singleton.

So could something like this be added, so that we have places to register callbacks?  I would be willing to write the code if I could get some pointers as to where the best places to put these would be.  I think Item #1 would go on MongoCollection.

Thanks,
Ted

Robert Stam

Oct 30, 2012, 12:40:24 PM
to mongodb...@googlegroups.com
You can do this already using a custom serializer. Here's a program illustrating how you might do it:


I've created a StringTable class that maps strings to ids and back, and a StringIdSerializer that serializes a string by looking up its id and writing the id, and deserializes by reading the id and looking up the corresponding string. Since you might want to have multiple string tables, you can provide the string table as an argument to the StringIdSerializer.

This does mean that you can't completely initialize the class using attributes alone, but you can use AutoMap to do most of the work and then just set the serializer for your string property.

The LINQ implementation knows all about custom serializers, so by doing it this way your LINQ queries will be correctly transformed to MongoDB queries using the string ids. See the sample program.

Let me know if you have any questions.

Robert

p.s. Note: the StringTable class in my sample program is not thread safe. You would also need some way to load it with a predictable set of ids.
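
To illustrate the two caveats in the note above, here is a rough sketch of what a thread-safe string table with preloaded ids could look like. This is not the StringTable from the sample program; the class shape, the constructor, and the lock-based approach are all assumptions made for illustration, and the GetString/GetStringId names are borrowed from the original question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical thread-safe string table; not the class from the sample program.
public sealed class StringTable
{
    private readonly object _gate = new object();
    private readonly Dictionary<string, int> _idsByString = new Dictionary<string, int>();
    private readonly Dictionary<int, string> _stringsById = new Dictionary<int, string>();
    private int _nextId = 1;

    public StringTable() : this(new KeyValuePair<int, string>[0]) { }

    // Preload known (id, string) pairs, e.g. read from wherever the mapping
    // is persisted, so the same string always gets the same id.
    public StringTable(IEnumerable<KeyValuePair<int, string>> known)
    {
        if (known == null) throw new ArgumentNullException(nameof(known));
        foreach (var pair in known)
        {
            _idsByString.Add(pair.Value, pair.Key);
            _stringsById.Add(pair.Key, pair.Value);
            if (pair.Key >= _nextId) _nextId = pair.Key + 1;
        }
    }

    // Returns the id for a string, assigning the next free id if it is new.
    // Newly assigned ids live only in memory here; persisting them is left out.
    public int GetStringId(string value)
    {
        if (value == null) throw new ArgumentNullException(nameof(value));
        lock (_gate)
        {
            int id;
            if (!_idsByString.TryGetValue(value, out id))
            {
                id = _nextId++;
                _idsByString.Add(value, id);
                _stringsById.Add(id, value);
            }
            return id;
        }
    }

    // Returns the string for an id, or throws if the id was never assigned.
    public string GetString(int id)
    {
        lock (_gate)
        {
            string value;
            if (!_stringsById.TryGetValue(id, out value))
                throw new KeyNotFoundException("Unknown string id: " + id);
            return value;
        }
    }
}
```

A single lock keeps both dictionaries consistent with each other; a real implementation would also need to write newly assigned ids back to persistent storage.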

Ted Elliott

Oct 30, 2012, 2:04:50 PM
to mongodb...@googlegroups.com
I like the solution, but I'm not sure it would work for us.  We are a multi-tenant app, and the string lookup is backed by another collection in the MongoDB database, with a couple of layers of cache in between for performance.  The instance of the StringLookup would be different for each client database we're talking to, so it doesn't look like it would work to register it in the class map.  Is there a way around that?

Robert Stam

Oct 30, 2012, 2:13:23 PM
to mongodb...@googlegroups.com
Probably not. The C# driver's serialization is built on the assumption that there is ONE global serialization configuration, so we don't support serializing the same data type in different ways in different scenarios. Your case of having a different string lookup table for each database would require serializing things differently depending on which database was involved.

Unless you can share a single global string table (at least per data type) you may have to push this logic back to your application level.

The only other solution would be some kind of hack that lets you pass different string tables to your serializer out of band, something like thread local storage. Not really a good solution either.
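
For what it's worth, the thread-local hack could be sketched roughly as below. Everything here is hypothetical: IStringLookup and AmbientStringLookup are invented names, not driver types, and the sketch assumes that serialization and deserialization run on the same thread that set the scope.

```csharp
using System;
using System.Threading;

// Hypothetical lookup abstraction; method names follow the original question.
public interface IStringLookup
{
    string GetString(int id);
    int GetStringId(string value);
}

// Hypothetical per-thread holder that a globally registered serializer
// could read from, instead of capturing one fixed string table.
public static class AmbientStringLookup
{
    private static readonly ThreadLocal<IStringLookup> _current =
        new ThreadLocal<IStringLookup>();

    // The lookup for the current thread; throws if no scope is active.
    public static IStringLookup Current
    {
        get
        {
            var lookup = _current.Value;
            if (lookup == null)
                throw new InvalidOperationException("No string lookup is set on this thread.");
            return lookup;
        }
    }

    // Sets the lookup for the current thread until the returned scope is disposed.
    public static IDisposable Use(IStringLookup lookup)
    {
        if (lookup == null) throw new ArgumentNullException(nameof(lookup));
        var previous = _current.Value;
        _current.Value = lookup;
        return new Scope(previous);
    }

    private sealed class Scope : IDisposable
    {
        private readonly IStringLookup _previous;
        public Scope(IStringLookup previous) { _previous = previous; }

        // Restores whatever lookup was active before this scope began.
        public void Dispose() { _current.Value = _previous; }
    }
}
```

The calling code would wrap each tenant's database work in using (AmbientStringLookup.Use(tenantLookup)) { ... }, and the serializer would call AmbientStringLookup.Current. The scope has to stay open while query results are being enumerated, and any work that hops threads loses the value, which is part of why this is not really a good solution.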