pymongo: problems with as_class option

Paul-Olivier Dehaye

Feb 14, 2012, 4:25:40 AM
to mongod...@googlegroups.com
Hi
I have a class MyModel that can be instantiated directly from JSON records of a specific shape. For instance, MyModel might require a list with exactly 3 components whose second component is a dict with specific key-type pairings, the other entries being free (but all types that occur are always subclasses of valid JSON types). It then returns the same JSON record, but with the corresponding entries "upgraded" to the appropriate subclasses.
As a consequence, a bare MyModel() call does not work, since MyModel is not passed such a record.
I was hoping I could use the as_class=MyModel option to build a low-level ODM for myself, with all searching, sorting and so on handled by pymongo, but with immediate conversion to MyModel as documents come out of the database (seems natural to me, no?).
Unfortunately, it hiccups here:
https://github.com/mongodb/mongo-python-driver/blob/master/bson/__init__.py#L264
This seems to require that as_class have a valid constructor with empty arguments. MyModel does not.
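To make the failure concrete, here is a minimal sketch of the pattern that seems to be at work. It is a simplified stand-in written for illustration, not the driver's actual code: decode_pairs and NeedsRecord are hypothetical names, and the only assumption is that the decoder builds an empty as_class instance and then fills it key by key.

```python
from collections import OrderedDict


def decode_pairs(pairs, as_class=dict):
    """Simplified stand-in for a document decoder (not the driver's code)."""
    result = as_class()      # needs a constructor callable with no arguments
    for key, value in pairs:
        result[key] = value  # the document is filled in one key at a time
    return result


class NeedsRecord(dict):
    """Hypothetical model that can only be built from a complete record."""

    def __init__(self, record):
        super().__init__(record)


pairs = [("name", "Madrid"), ("population", 3273049)]

# A mapping type with an empty constructor works fine.
ordered = decode_pairs(pairs, as_class=OrderedDict)

# A model that insists on a record fails at the as_class() call.
try:
    decode_pairs(pairs, as_class=NeedsRecord)
    failed = False
except TypeError:
    failed = True
```

Under that assumption, any class used as as_class has to tolerate being created empty and populated afterwards, which a validating model cannot do.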
In addition, looking at the code around there, there seem to be many _get_ functions where the code does not use "as_class". Is this normal? Maybe I misunderstood what "as_class" is supposed to do.
What is the alternative for me? It should be possible to hook a last-minute call to MyModel.__init__ onto a pymongo cursor, right before it returns each of its entries, but I don't see how to actually implement that. Calling MyModel on each output myself works, but it is not as nice for various reasons.
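One way to get the "last-minute call" without touching the driver is a small generator wrapped around the cursor. The sketch below is only an illustration: iter_as and Point are hypothetical names, and a plain list of dicts stands in for the cursor so that it runs without a database.

```python
def iter_as(model, documents):
    """Lazily yield model(doc) for each document in an iterable."""
    for doc in documents:
        yield model(doc)


class Point(dict):
    """Hypothetical model: needs a record and coerces x and y to float."""

    def __init__(self, record):
        super().__init__(record)
        self["x"] = float(self["x"])
        self["y"] = float(self["y"])


# Any iterable of dicts works here; with pymongo this would be the
# cursor returned by a find() call on a collection.
fake_cursor = [{"x": "1", "y": 2}, {"x": 3, "y": "4.5"}]
points = list(iter_as(Point, fake_cursor))
```

Since a cursor is iterable, filtering and sorting stay with pymongo and the conversion happens one document at a time as results are consumed.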
Thank you

Ross Lawley

Feb 14, 2012, 6:28:17 AM
to mongod...@googlegroups.com
Hi Paul-Olivier,

I'm not sure I follow what you are trying to do - could you provide some pseudocode of what you want to achieve?

There are a couple of tickets open about as_class behaviour:

https://jira.mongodb.org/browse/PYTHON-215 and https://jira.mongodb.org/browse/PYTHON-175 which might be relevant; if so, add feedback there.

Otherwise, let's discuss this more to see what behaviour you are after and if there is something that could be done to help it.

Ross

Paul-Olivier Dehaye

Feb 14, 2012, 7:43:46 AM
to mongod...@googlegroups.com
Hi Ross,
It looks like the as_class comments there are in line with what I want (i.e. non-recursive application). It also looks like my use case pushes the ideas of the people who commented even further. I think it is very powerful.
Here is what I am trying to do. I have two functions, Dict and Array (see very rough code at http://pastebin.com/NgFZ8S7k ). These functions can be combined to construct arbitrarily complicated types. For instance:

class GeoData(Array(Float, n = 2)):
    # I know mongo includes geo data as well, this is just an example of a fixed length array.
    pass

class City(Dict({"name": String, "population": Int, "area": Float, "coordinates": GeoData})):
    def density(self):
        return self["population"]/self["area"]

Country = Dict({"cities": Array(City), "name": String})

>>> GeoData
<class '__main__.GeoData'>

>>> City
<class '__main__.City'>

These are so far just models of what the data should look like. Now let's look at actual data:

barcelona = {"name": "Barcelona", "population":"1621537", "area":"101.9"}
madrid = {"name":"Madrid", "population": 3273049, "area": 607}
spain = {"name":"Spain", "cities":[barcelona, madrid]}

You can see that the types are off (Barcelona's population and area are strings), but fortunately we have conversion (this could also simply perform type checking). Notice the difference:


>>> City(barcelona)
{'area': 101.90000000000001, 'name': 'Barcelona', 'population': 1621537}
>>> barcelona
{'area': '101.9', 'name': 'Barcelona', 'population': '1621537'}

This makes the following work (style could easily be improved with extra methods):

>>> for city in Country(spain)["cities"]:
...     print city["name"], city.density()
...
Barcelona 15913.0225711
Madrid 5392.17298188
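For readers without access to the pastebin, here is a minimal sketch of how two such factories might be written. It is not the pastebin code: it assumes Python 3, uses the built-ins float, int and str in place of Float, Int and String, and assumes that keys missing from a record are simply skipped (the records above have no "coordinates").

```python
def Array(item_type, n=None):
    """Build a list subclass that converts every item with item_type."""
    class TypedArray(list):
        def __init__(self, items=()):
            items = list(items)
            if n is not None and len(items) != n:
                raise ValueError("expected %d items, got %d" % (n, len(items)))
            super().__init__(item_type(item) for item in items)
    return TypedArray


def Dict(schema):
    """Build a dict subclass that converts known keys per the schema."""
    class TypedDict(dict):
        def __init__(self, data=()):
            super().__init__(data)  # shallow copy; the input is not modified
            for key, convert in schema.items():
                if key in self:     # keys absent from the record are skipped
                    self[key] = convert(self[key])
    return TypedDict


class GeoData(Array(float, n=2)):
    pass


class City(Dict({"name": str, "population": int, "area": float,
                 "coordinates": GeoData})):
    def density(self):
        return self["population"] / self["area"]


Country = Dict({"cities": Array(City), "name": str})

barcelona = {"name": "Barcelona", "population": "1621537", "area": "101.9"}
madrid = {"name": "Madrid", "population": 3273049, "area": 607}
spain = {"name": "Spain", "cities": [barcelona, madrid]}

country = Country(spain)
for city in country["cities"]:
    print(city["name"], city.density())
```

Because each factory returns an ordinary class, the resulting models nest freely and can be subclassed to add methods such as density().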

What's the link with mongo? Well, imagine I have a collection of countries whose data is just as messy as for spain. If as_class behaved the way I thought it did (and apparently the way others expected, judging from your links), I could just pass my model as a parameter to find() and get back instances of Country, with all the automatic conversions (or checks) performed. This is a very non-intrusive way to tie an ODM (or even several independent ones) to a mongo database. A further advantage is that it helps define an interface: people submitting data only need to make sure the model accepts their data, and people handling the data only need to know that it has been accepted by the model. Other pymongo ODMs don't have such a clear separation, I feel.

Ross Lawley

Feb 14, 2012, 8:42:12 AM
to mongod...@googlegroups.com
Hi Paul,

Thanks for the update - I think the best course of action is to add this to the feature request https://jira.mongodb.org/browse/PYTHON-175 and then upvote it. It seems like it would open up some interesting opportunities to interact with the data in different ways.

Currently, you have to get the data back as a dict and then pass it to your classes to achieve what's required.

Ross

Bernie Hackett

Feb 14, 2012, 3:11:47 PM
to mongodb-user
You can do this with SON manipulators. Here's an example:

import pprint
import pymongo

from pymongo.son_manipulator import SONManipulator

class MyWrapper(dict):
    # A dict subclass standing in for the model class.
    def __init__(self, *args, **kwargs):
        super(MyWrapper, self).__init__(*args, **kwargs)

    def what_am_i(self):
        return "I'm an instance of MyWrapper!"

class MyManipulator(SONManipulator):

    # Called on each document as it comes out of the database.
    def transform_outgoing(self, son, collection):
        return MyWrapper(son)

c = pymongo.Connection()

db = c.foo
# Manipulators are registered on the database, not on a collection.
db.add_son_manipulator(MyManipulator())

db.bar.remove()

db.bar.insert({'foo': {'bar': 'baz'}})

doc = db.bar.find_one()
assert isinstance(doc, MyWrapper)

print type(doc)
pprint.pprint(doc)
print doc.what_am_i()


Christopher Coté

Feb 14, 2012, 5:47:58 PM
to mongodb-user
I also just released Humongolus; see the thread linked below. It handles this
situation quite nicely.

http://groups.google.com/group/mongodb-user/browse_thread/thread/b2ed9ee76fe841ea

Paul-Olivier Dehaye

Feb 15, 2012, 3:51:27 AM
to mongod...@googlegroups.com
Hi Christopher,
It looks like the schema validation is somewhat limited for my usage. Correct me if I am wrong, but there is no way to specify in your schemas that a list has a fixed length, or a fixed sequence of types (as the name orm.Relationship implies). Otherwise many of the features are indeed present, and your work looks great!
Paul

Paul-Olivier Dehaye

Feb 15, 2012, 4:05:32 AM
to mongod...@googlegroups.com
I was not aware of SON manipulators, so yes, it might be necessary to expand on the documentation a bit.
Still, two comments:
1) SON manipulators seem to affect whole databases. I want something at the level of collections, which makes more sense from my perspective (of course I can add ifs in my SON manipulator, but with at least one per collection, this will become a long string of SON manipulators to test for on each document).
2) SON manipulators are a bit more cumbersome with dynamic types (as in my original example): compared to the suggestion of adding one keyword to find, I need to write a SON manipulator factory, add each manipulator to the db, etc. If the types are modified, I need to make sure previously added SON manipulators don't interfere.
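A single manipulator that looks up the model by collection name might soften both points. The sketch below is only an illustration under stated assumptions: PerCollectionManipulator, register and FakeCollection are hypothetical names, and the class is written as a plain object so that it runs without a driver. With pymongo it would subclass SONManipulator as in the example earlier in the thread, and it relies only on the collection's name attribute.

```python
from collections import namedtuple


class PerCollectionManipulator(object):
    """Pick a model by collection name (plain class, not pymongo's)."""

    def __init__(self):
        self._models = {}  # collection name -> callable taking a document

    def register(self, collection_name, model):
        # Registering again under the same name replaces the old model.
        self._models[collection_name] = model

    def transform_outgoing(self, son, collection):
        # One dict lookup replaces a chain of per-collection ifs.
        model = self._models.get(collection.name)
        return son if model is None else model(son)


class Country(dict):
    """Hypothetical model used only for this illustration."""


# Stand-in for a pymongo collection; only its name attribute is used.
FakeCollection = namedtuple("FakeCollection", "name")

manipulator = PerCollectionManipulator()
manipulator.register("countries", Country)

wrapped = manipulator.transform_outgoing(
    {"name": "Spain"}, FakeCollection("countries"))
untouched = manipulator.transform_outgoing(
    {"name": "x"}, FakeCollection("logs"))
```

Only one manipulator is added to the db, and documents from collections with no registered model pass through unchanged.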
Paul

Christopher Coté

Feb 15, 2012, 11:08:37 AM
to mongodb-user
Not a bad idea. I'll see if I can add it in. :)

Paul-Olivier Dehaye

Feb 15, 2012, 11:49:04 AM
to mongod...@googlegroups.com
Thanks, that would be great!

Christopher Coté

Feb 16, 2012, 11:14:50 AM
to mongodb-user
A few updates: Relationship is now called List.

List has a length attribute now for ensuring a maximum length.

It also supports multiple types for validation.