force_autorefs_current_db


Andrew Degtiariov

Feb 19, 2010, 7:42:41 AM
to MongoKit
Returning to the topic: where should I turn on force_autorefs_current_db? Is it possible to turn it on dynamically, only for some results?

I have a Document class, EventDocument. When I turn on force_autorefs_current_db globally for the EventDocument class, everything works.
But I want to turn this feature on only for the results of map_reduce, and when I enable it on the registered document (mongokit.schema_document.CallableEventDocument)
I get a RuntimeError with the message "It appears that you try to use autorefs. I found a DBRef without database specified...."

Here is the code:

    def __iter__(self):
        obj = object.__getattribute__(self, "_obj")
        result_document = object.__getattribute__(self, "_result_document")
        for item in obj:
            yield result_document(doc=item['value'])

result_document in this case is mongokit.schema_document.CallableEventDocument.
item is a document from the temporary collection produced by map_reduce; its "value" key holds a copy of the corresponding document from the EventDocument collection.
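
For readers unfamiliar with the error above, here is a conceptual sketch of the decision the flag controls, as described in this thread: a DBRef that names no database is either resolved against the current database or rejected. The helper `resolve_dbref_database` and the plain-dict reference are hypothetical and written only for illustration; this is not MongoKit's actual code.

```python
def resolve_dbref_database(ref, current_db, force_current_db=False):
    """Return the database name a DBRef-like dict should be fetched from.

    `ref` is a plain dict in the DBRef shape
    ({"$ref": ..., "$id": ..., optional "$db": ...}).
    Hypothetical helper: it only mirrors the behavior described in the thread.
    """
    db_name = ref.get("$db")
    if db_name is not None:
        # The reference names its database explicitly: use it.
        return db_name
    if force_current_db:
        # No database in the reference: fall back to the current one.
        return current_db
    raise RuntimeError("DBRef without database specified: %r" % (ref,))


ref = {"$ref": "events", "$id": "bla0"}  # note: no "$db" key
print(resolve_dbref_database(ref, "mydb", force_current_db=True))
```

With `force_current_db=False` the same call raises RuntimeError, which corresponds to the situation reported above.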

--
Andrew Degtiariov
DA-RIPE

Nicolas Clairon

Feb 23, 2010, 6:09:45 AM
to mong...@googlegroups.com
Which object is this `__iter__` method attached to?

If I understand correctly, you would like to pass
`force_autorefs_current_db` only some of the time,
not all the time.

So I expect you want something like this:

    event_doc = con.db.col.EventDocument(force_autorefs_current_db=True).find_one(...)

Is that right?

Andrew Degtiariov

Feb 23, 2010, 6:30:55 AM
to mong...@googlegroups.com
2010/2/23 Nicolas Clairon <cla...@gmail.com>

> Which object is this `__iter__` method attached to?
>
> If I understand correctly, you would like to pass
> `force_autorefs_current_db` only some of the time,
> not all the time.

Right.

> So I expect you want something like this:
>
> event_doc = con.db.col.EventDocument(force_autorefs_current_db=True).find_one(...)
>
> Is that right?

Not right :-)

We use map_reduce for complex queries. The result collection contains the full document under the key "value".
I want to load the results from the DB and return *Document objects, not raw dicts.

To do this I implemented FindResultCursor, a proxy over any iterable object (though it is only used over pymongo cursors), and FindResultCollection, a subclass of pymongo.collection.Collection.

So the code of RootDocument.find_all looks like:

        result = cls.collection.map_reduce(js_map, js_reduce, scope=scope, query=query, **kwargs)
        return FindResultCollection(result, cls)

where cls is Callable*Document.

Code of FindResultCollection:

    class FindResultCollection(pymongo.collection.Collection):
        def __init__(self, result_collection, result_document):
            super(FindResultCollection, self).__init__(result_collection.database, result_collection.name)
            self.result_document = result_document

        ...

        def find(self, spec=None, **kwargs):
            # Results are stored under "value", so prefix every key except _id.
            if spec:
                rv_spec = {}
                for key, value in spec.items():
                    if key != '_id' and not key.startswith('value.'):
                        key = 'value.' + key
                    rv_spec[key] = value
            else:
                rv_spec = spec
            result = super(FindResultCollection, self).find(spec=rv_spec, **kwargs)
            # Wrap the cursor so iteration yields documents instead of raw dicts.
            return FindResultCursor(result, self.result_document)

As you can see, FindResultCursor is initialized with the cursor object (result) and the Callable*Document (self.result_document).
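
The key rewriting done in find can be tried on its own, without a database. The function below is a standalone sketch of the same logic; the name `prefix_spec` and its `prefix` parameter are made up for this illustration.

```python
def prefix_spec(spec, prefix="value."):
    """Rewrite query keys so they address fields nested under "value".

    "_id" and keys that already carry the prefix are left untouched.
    An empty or missing spec is returned unchanged.
    """
    if not spec:
        return spec
    rewritten = {}
    for key, value in spec.items():
        if key != "_id" and not key.startswith(prefix):
            key = prefix + key
        rewritten[key] = value
    return rewritten


print(prefix_spec({"_id": 3, "title": "x", "value.owner": "bob"}))
```

One caveat of this approach: a top-level operator key such as "$or" would be prefixed as well, so a spec using such operators would probably need special handling.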

Code of FindResultCursor:

    class FindResultCursor(object):
        ...

        def __init__(self, obj, result_document):
            object.__setattr__(self, "_obj", obj)
            object.__setattr__(self, "_result_document", result_document)

        ...

        def __iter__(self):
            obj = object.__getattribute__(self, "_obj")
            result_document = object.__getattribute__(self, "_result_document")

            for item in obj:
                yield result_document(doc=item['value'])

FindResultCursor.__iter__ is a generator that yields Callable*Document instances.

I need a way to turn on force_autorefs_current_db only for the objects returned from FindResultCursor.__iter__.
I don't need it globally.
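
One possible direction, sketched below in plain Python, is to have the cursor proxy build a subclass of the document class with the flag switched on, so that the original class stays untouched. This is an untested assumption as far as MongoKit is concerned: `EventDocument` here is only a stand-in class, `with_forced_autorefs` is a hypothetical helper, and whether a Callable*Document can be subclassed this way is not confirmed anywhere in this thread.

```python
class EventDocument(object):
    """Stand-in for a MongoKit document class (hypothetical, for illustration)."""
    use_autorefs = True
    force_autorefs_current_db = False

    def __init__(self, doc=None):
        self.doc = doc or {}


def with_forced_autorefs(document_cls):
    """Build a subclass with force_autorefs_current_db switched on.

    type(document_cls) is called so that a custom metaclass, if any, is kept.
    """
    name = "Forced" + document_cls.__name__
    attrs = {"force_autorefs_current_db": True}
    return type(document_cls)(name, (document_cls,), attrs)


class FindResultCursor(object):
    """Proxy over an iterable of map/reduce results (variation of the class above)."""

    def __init__(self, obj, result_document):
        object.__setattr__(self, "_obj", obj)
        # Only documents produced by this cursor get the flag.
        object.__setattr__(self, "_result_document", with_forced_autorefs(result_document))

    def __iter__(self):
        result_document = object.__getattribute__(self, "_result_document")
        for item in object.__getattribute__(self, "_obj"):
            yield result_document(doc=item["value"])

    def __getattr__(self, name):
        # Delegate every other attribute to the wrapped object.
        return getattr(object.__getattribute__(self, "_obj"), name)


rows = [{"_id": 1, "value": {"title": "a"}}]
docs = list(FindResultCursor(rows, EventDocument))
```

In this sketch `docs[0]` is still an instance of EventDocument, but the flag is set only on the derived class, so documents created directly from EventDocument are not affected.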

--
Andrew Degtiariov
DA-RIPE

Nicolas Clairon

Feb 23, 2010, 8:06:20 AM
to mong...@googlegroups.com
What about creating two distinct documents: a vanilla document and a
MapDocument which handles
the result of the map/reduce:

    def test_simple_mapreduce(self):
        class MyDoc(Document):
            structure = {
                'user_id': int,
            }
        self.connection.register([MyDoc])

        for i in range(20):
            self.col.MyDoc({'user_id':i}).save()

        class MapDoc(Document):
            """
            document which handle result of map/reduce
            """
            structure = {
                'value':float,
            }
        self.connection.register([MapDoc])

        m = 'function() { emit(this.user_id, 1); }'
        r = 'function(k,vals) { return 1; }'
        mapcol = self.col.map_reduce(m,r)
        mapdoc = mapcol.MapDoc.find_one()
        assert mapdoc == {u'_id': 0.0, u'value': 1.0}, mapdoc
        assert isinstance(mapdoc, MapDoc)

The example below shows how to work with map/reduce and dbref :

    def test_mapreduce_with_dbref_force_autorefs_current_db(self):
        class MyDoc(Document):
            structure = {
                'user_id': int,
            }
        self.connection.register([MyDoc])

        for i in range(20):
            self.col.MyDoc({'_id':'bla'+str(i), 'user_id':i}).save()

        m = 'function() { emit(this.user_id, 1); }'
        r = 'function(k,vals) { return {"embed":{"$ref":"mongokit", "$id":"bla"+k}}; }'

        class MapDoc(Document):
            use_autorefs = True
            force_autorefs_current_db = True
            structure = {
                'value':{
                    "embed":MyDoc,
                }
            }
        self.connection.register([MapDoc])

        mapcol = self.col.map_reduce(m,r)
        mapdoc = mapcol.MapDoc.find_one()
        assert mapdoc == {u'_id': 0.0, u'value': {u'embed': {u'_id': u'bla0', u'user_id': 0}}}
        assert isinstance(mapdoc, MapDoc)

Is that what you want? You can then query easily:

    results = mapcol.MapDoc.find( my_complex_query )

Andrew Degtiariov

Feb 23, 2010, 4:29:52 PM
to mong...@googlegroups.com


2010/2/23 Nicolas Clairon <cla...@gmail.com>

> The example below shows how to work with map/reduce and dbref :
>
> ....

Very, very interesting. This will reduce disk IO but increase network IO and the query rate...
I need to analyze this...
Thank you for the advice.

--
Andrew Degtiariov
DA-RIPE