pymongo 1.4: tailable find?

Roger Binns

Feb 17, 2010, 1:48:59 AM
to mongod...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I am trying to use a tailable find with pymongo but there is no
documentation and it doesn't work the way I would expect.

The collection is used for log messages. I am doing something like this:

for row in db.log.find(tailable=True):
    print row

However, when there are no new items the find iteration finishes
(StopIteration), making it no different from a non-tailable find. Examining
the object returned by find shows no methods for waiting or anything
similar. Presumably I can do something like this, but it consumes 100% CPU:

finder = db.log.find(tailable=True)
while True:
    for row in finder:
        print row

I would expect a tailable find either to block on next()/iteration until
new results are available, or to provide a separate method that does the
blocking.

My actual find is a little more complex. I print out some existing log
messages first and then record the largest _id seen. I also only want a
particular log level. Consequently the actual query is:

db.log.find({"_id": {"$gt": highestid}, "level": {"$gte": loglevel}},
            tailable=True)
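
A small sketch of those two steps (record the largest _id seen, then build the filter), for illustration only. The helper names highest_id_seen and tail_query are hypothetical and are not part of pymongo; rows are assumed to be plain dicts with an "_id" key.

```python
def highest_id_seen(rows, start=None):
    """Return the largest _id among rows, or `start` if rows is empty."""
    highest = start
    for row in rows:
        if highest is None or row["_id"] > highest:
            highest = row["_id"]
    return highest


def tail_query(highest_id, log_level):
    """Build the filter: newer than highest_id, at or above log_level."""
    return {"_id": {"$gt": highest_id}, "level": {"$gte": log_level}}
```

The dict returned by tail_query could then be passed as the first argument to db.log.find(...), as in the query above.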

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkt7kVsACgkQmOOfHg372QRlkQCglzxK+97DdIrecnc0YIGMnRTT
c7UAoM8o0OSZmK/m3qdNDIYyKE67xgsD
=2SJY
-----END PGP SIGNATURE-----

Eliot Horowitz

Feb 17, 2010, 1:54:19 AM
to mongod...@googlegroups.com
I think you want it to work via a long poll basically, right?
We're going to be adding a way to do that as part of
http://jira.mongodb.org/browse/SERVER-510

Roger Binns

Feb 17, 2010, 2:14:36 AM
to mongod...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Eliot Horowitz wrote:
> I think you want it to work via a long poll basically, right?

That is exactly what the existing documentation implies tailable does. The
pymongo docs for it point to
http://www.mongodb.org/display/DOCS/Tailable+Cursors

'tail -f' is exactly the equivalent of what I am implementing.

If tailable cursors have no way of blocking then I don't see how they could
be particularly useful since you'd have no way of knowing if there are new
results short of a busy loop.

> We're going to be adding a way to do that as part of
> http://jira.mongodb.org/browse/SERVER-510

Is the release date listed in Jira (24 Feb) still relatively accurate? Will
the existing tailable cursor implementation in pymongo block or will I have
to wait for a new version and new api?

A further problem is that the pymongo tailable cursor only works as a
tailable cursor if there is no query provided. (If there is a query then
you don't get any results.) Looking at the test code in git, the testing
does not use a query.

In the short term I guess I am stuck repeating the query every second or so.

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkt7l1sACgkQmOOfHg372QSk4ACg0l92SmXYeCAdHxcotJQmLVWy
BQMAn15jyc5TfNasUr5uX2a4NTSrpIWO
=6Bpw
-----END PGP SIGNATURE-----

Eliot Horowitz

Feb 17, 2010, 9:15:47 AM
to mongod...@googlegroups.com
I would say that date is about 50% likely. I forgot about one vacation when scheduling.

I don't think the Python side will have to change; if it does, it will be a
trivial change.

Dwight Merriman

Feb 17, 2010, 9:39:36 AM
to mongod...@googlegroups.com
Right, this is coming. In the short term, re-poll with the tailable cursor every second or so.

The re-poll is a very lightweight operation because the cursor remembers where it was (at the end), although obviously it would be better if no re-poll were needed at all. The C++ example at http://www.mongodb.org/display/DOCS/Tailable+Cursors demonstrates the current usage, with a sleep call in the code.
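
A minimal Python sketch of that re-poll-with-sleep loop, under stated assumptions. It assumes the cursor behaves as described in this thread: iteration stops when nothing new is available, and iterating again resumes from where it left off. The tail function, its max_idle_polls parameter, and FakeTailableCursor are hypothetical names for illustration, not pymongo API; the fake cursor only exists so the loop can be tried without a server.

```python
import time


def tail(cursor, poll_interval=1.0, max_idle_polls=None, sleep=time.sleep):
    """Yield rows from a re-iterable cursor, sleeping after each empty poll.

    max_idle_polls is a hypothetical knob that ends the loop after that many
    consecutive empty polls; leave it as None to tail forever.
    """
    idle_polls = 0
    while max_idle_polls is None or idle_polls < max_idle_polls:
        got_row = False
        for row in cursor:
            got_row = True
            yield row
        if got_row:
            idle_polls = 0
        else:
            idle_polls += 1
            sleep(poll_interval)  # avoids the 100% CPU busy loop


class FakeTailableCursor(object):
    """In-memory stand-in for a tailable cursor (not part of pymongo)."""

    def __init__(self, rows):
        self.rows = rows     # shared list that a writer may append to
        self.position = 0    # index of the next row to hand out

    def __iter__(self):
        return self

    def __next__(self):
        if self.position >= len(self.rows):
            raise StopIteration  # nothing new right now
        row = self.rows[self.position]
        self.position += 1
        return row

    next = __next__  # Python 2 spelling of the iterator method
```

With a real collection, the call would presumably look like tail(db.log.find(tailable=True)), using the find call from the original post.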




Roger Binns

Feb 17, 2010, 2:43:41 PM
to mongod...@googlegroups.com
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dwight Merriman wrote:
> right - this is coming - in the short term repoll with the tailable
> cursor every second or so.

Sadly there is a bug with tailable cursors if the expression involves _id.
I have reported the bug as http://jira.mongodb.org/browse/PYTHON-104
although I suspect it is a server bug and not pymongo.

What I do first is a find().sort(descending).limit(n) to get the preceding n
entries. I note the highest _id seen. Then for the tailable activity I do:

find({"_id": {"$gt": highestid}})

If I make that tailable then it only returns one new entry and never any
other new ones. Consequently I have to use a non-tailable find every second,
remembering the highest _id I have seen so far. (This does assume that
records are returned in _id order, which appears to be true. I'll update the
code once mongo has a better way of doing this.)
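
A rough sketch of that workaround (a plain find repeated every second, tracking the highest _id), for illustration only. The names poll_new_rows, fetch_after and max_polls are hypothetical; fetch_after(highest_id) stands in for a call such as db.log.find({"_id": {"$gt": highest_id}}). As noted above, it relies on new records always getting larger _id values.

```python
import time


def poll_new_rows(fetch_after, highest_id, poll_interval=1.0,
                  max_polls=None, sleep=time.sleep):
    """Repeatedly fetch rows whose _id is greater than the highest one seen.

    fetch_after(highest_id) is a hypothetical callable returning an iterable
    of dicts. max_polls is a hypothetical knob to stop after that many polls;
    leave it as None to poll forever.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        for row in fetch_after(highest_id):
            # max() keeps the marker right even if a batch is not in _id order
            highest_id = max(highest_id, row["_id"])
            yield row
        polls += 1
        sleep(poll_interval)
```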

Roger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkt8Ru0ACgkQmOOfHg372QSIIQCcDxEFALh2Tn6Lc0sYz0nooO+3
CrsAn2qXfT+WUT52wfjTYfZXNVlftEWB
=p1ci
-----END PGP SIGNATURE-----

Michael Dirolf

Feb 17, 2010, 3:16:39 PM
to mongod...@googlegroups.com
On Wed, Feb 17, 2010 at 2:43 PM, Roger Binns <rog...@rogerbinns.com> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Dwight Merriman wrote:
>> right - this is coming - in the short term repoll with the tailable
>> cursor every second or so.
>
> Sadly there is a bug with tailable cursors if the expression involves _id.
> I have reported the bug as http://jira.mongodb.org/browse/PYTHON-104
> although I suspect it is a server bug and not pymongo.

You're correct - it's a server issue. I've moved that case and
committed a failing test for SERVER-645.

