Message from discussion
Explicit blocking
Date: Mon, 21 May 2012 00:03:22 -0700 (PDT)
Subject: Re: Explicit blocking
From: Saar Korren <s...@a.co.il>
To: mongodb-user <mongodb-user@googlegroups.com>
I don't see how subclassing MongoCursor would help, considering the
flags exist only in the private "opts" field of the native object, and
not in the exposed class. Since the field is not exposed in any way,
and since it is used directly in serializing the cursor options for
the outgoing packet, I have no way to manually modify the cursor's
options.
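To illustrate what I mean by the flags being serialized into the outgoing packet, here is a minimal Python sketch (not the PHP driver's code). It assumes the flag numbers I listed earlier are bit positions in a single 32-bit options field written in little-endian order, which is my reading of the driver source. The names `QUERY_OPTION_BITS`, `build_flags`, and `pack_flags` are made up for this example.

```python
import struct

# Bit positions as numbered in this thread; the names mirror the
# QueryOption_* constants. This table is illustrative, not a driver API.
QUERY_OPTION_BITS = {
    "CursorTailable": 1,
    "SlaveOk": 2,
    "NoCursorTimeout": 4,
    "AwaitData": 5,
}


def build_flags(*names):
    """Combine the named options into one integer bitmask."""
    flags = 0
    for name in names:
        flags |= 1 << QUERY_OPTION_BITS[name]
    return flags


def pack_flags(flags):
    """Serialize the bitmask as a little-endian signed 32-bit integer."""
    return struct.pack("<i", flags)


tailable_await = build_flags("CursorTailable", "AwaitData")
print(tailable_await)                    # 34 (bit 1 = 2, bit 5 = 32)
print(pack_flags(tailable_await).hex())  # 22000000
```

If the driver let me OR an arbitrary value into that field, turning on bit 5 would be a one-line change, and that is exactly the hook that seems to be missing.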
I've contemplated using db.eval, but I doubt that would work, since
the code is evaluated on the server.
On May 20, 7:26 pm, Sam Millman <sam.mill...@gmail.com> wrote:
> Actually, in the way you're using the query, it might be reliable, and should
> be unless I've forgotten something, since even though it evaluates in
> insertion order, as Spencer says, it does evaluate the query.
>
> On 20 May 2012 17:22, Sam Millman <sam.mill...@gmail.com> wrote:
>
> > "On the topic of question 2: It is possible to get an ObjectId from an
> > insert, that much I know. What I'm wondering is if I can make a query
> > that starts from an ObjectId, and proceeds in the natural order. Would
> > {"_id":{"$gte":my_obj._id}} work as expected?"
>
> > Possibly, well depends but normally no. _id is not ordered on disk and in
> > order for that query to be reliable you would need to sort by _id.
>
> > Though there are cases where _id with natural order might be feasible......
>
> > Why not just extend the mongo cursor into your own? I do it all the time.
> > Add a variable flag to the extended class and bam you have your stuff,
> > unless I'm missing something from not reading the rest of the thread.
>
> > On 20 May 2012 12:28, Saar Korren <s...@a.co.il> wrote:
>
> >> On the topic of question 2: It is possible to get an ObjectId from an
> >> insert, that much I know. What I'm wondering is if I can make a query
> >> that starts from an ObjectId, and proceeds in the natural order. Would
> >> {"_id":{"$gte":my_obj._id}} work as expected?
>
> >> On a slightly different note, I've stumbled into an issue which might
> >> be a show-stopper for me. The PHP driver seems to have no way to turn
> >> on flag 5 (QueryOption_AwaitData) on a cursor. I've looked into the
> >> code, and only flags 1 (QueryOption_CursorTailable), 2
> >> (QueryOption_SlaveOk), 4 (QueryOption_NoCursorTimeout), and 7 appear
> >> to be supported through special functions. There does not appear to be
> >> an option to set arbitrary flags.
>
> >> Short of modifying and compiling a custom MongoDB PHP driver, I see no
> >> solution to this. And that is hardly a viable option, due to
> >> deployment difficulties involving a custom driver.
>
> >> On May 14, 10:46 pm, Spencer T Brody <spen...@10gen.com> wrote:
> >> > 1) Yes, you can still do a value query when using a tailable cursor.
> >> > Tailable cursors should always sort on $natural - it will iterate over the
> >> > documents in insertion order, not using any index. But as it considers
> >> > every document in insertion order, it can check that the document satisfies
> >> > a given query expression and only return it if it does.
> >> > 2) There is no built-in way to do this, but you could do it by including a
> >> > timestamp or objectId in the documents as they get inserted, and query for
> >> > documents where the timestamp is greater than the one that you create with
> >> > the insert.
> >> > 3) It is currently not possible to shard a capped collection.
> >> > 4) In 2.0.x releases the timeout is approximately 4 seconds and there is no
> >> > way to configure that. If you want your code to be able to wait for new
> >> > documents for longer than that, you will need to put the querying client
> >> > code in a loop that will retry the query from where it left off if it times
> >> > out. The documentation at
> >> > http://www.mongodb.org/display/DOCS/Tailable+Cursors has some examples of
> >> > doing this.
>
> >> > On Sun, May 13, 2012 at 2:57 AM, Saar Korren <s...@a.co.il> wrote:
> >> > > Okay, just a couple more questions:
> >> > > 1. It is mentioned tailing cursors do not use indexes. I assume this
> >> > > only refers to the sorting order, but like I said, the documentation
> >> > > is confusing. Is it still possible to tail a collection based on a
> >> > > specific value query?
> >> > > 2. While I could probably find this out myself through some
> >> > > experimentation, it could save time if you could answer this: Is it
> >> > > possible to produce a tailable cursor from an insert? That is, insert
> >> > > a dummy document, and then wait for documents that are inserted
> >> > > "approximately after" the inserted document? (Preferably with some
> >> > > additional constraints)
> >> > > 3. Do tailable cursors support sharding? (Assuming the tail is on a
> >> > > single shard, of course. It would just make more sense to have a
> >> > > single collection with several tailable shards than to have a separate
> >> > > collection for each blockable event)
> >> > > 4. The documentation mentions a timeout on tailable cursors, which is
> >> > > good, but it does not mention what said timeout is. Is it possible to
> >> > > configure the timeout (per cursor or per connection or per
> >> > > collection)? Since I do need to handle a 30 seconds timeout, if the
> >> > > timeout is much longer than that I could be stuck with needlessly
> >> > > prolonged connections.
>
> >> > > On May 10, 6:10 pm, Spencer T Brody <spen...@10gen.com> wrote:
> >> > > > You're right, the documentation here was a bit confusing. I just updated
> >> > > > it to make the blocking behavior more explicit.
> >> > > > There is some more documentation about the QueryOption_AwaitData flag here:
> >> > > > http://api.mongodb.org/cplusplus/1.5.1/namespacemongo.html#a7261673f7...
>
> >> > > > Using a tailable cursor will not cause a busy-wait - replication uses
> >> > > > tailable cursors to read the oplog and that would be really bad if
> >> > > > replication maxed out your CPU.
>
> >> > > > On Thu, May 10, 2012 at 3:43 AM, Saar Korren <s...@a.co.il> wrote:
> >> > > > > Sounds like I could just use a tailable cursor with my own capped
> >> > > > > collection to create a global optimistic signalling system. However,
> >> > > > > the documentation on tailable cursors says nothing about blocking if
> >> > > > > no documents exist, save for a comment in the first example. Are you
> >> > > > > certain about the blocking behavior? Would it prevent a busy-wait
> >> > > > > spin-lock? (I wouldn't want to max my CPU on a signal wait) Is this
> >> > > > > just a case of unclear documentation?
>
> >> > > > > Actually, re-reading the examples, it seems to be related to the
> >> > > > > QueryOption_AwaitData flag. What does it do for cursors which are not
> >> > > > > tailable? (The documentation is quite confusing)
>
> >> > > > > On May 9, 6:35 pm, Spencer T Brody <spen...@10gen.com> wrote:
> >> > > > > > MongoDB does not currently have any general signaling and blocking
> >> > > > > > capabilities. If you just want to be notified when a document is updated,
> >> > > > > > however, one thing you could do is tail the oplog. The oplog records every
> >> > > > > > update that happens for use in replication. Because the oplog is a capped
> >> > > > > > collection, you can use a tailable cursor on it. Unlike a normal cursor,
> >> > > > > > when you query a tailable cursor and it runs out of results, rather than
> >> > > > > > returning it will block until there are new results to return. You could
> >> > > > > > write something to query the oplog with a tailable cursor so that you'll
> >> > > > > > receive notification of every new write right when it happens.
>
> >> > > > > > More information on the oplog is here:
> >> > > > > > http://www.mongodb.org/display/DOCS/Replica+Sets+-+Oplog, and
> >> > > > > > information on tailable cursors is here:
> >> > > > > > http://www.mongodb.org/display/DOCS/Tailable+Cursors
>
> >> > > > > > On Wednesday, May 9, 2012 10:24:10 AM UTC-4, Saar Korren wrote:
>
> >> > > > > > > I am currently planning the migration of a large-scale service from
> >> > > > > > > MySQL to MongoDB. While in most use cases involved in the system
> >> > > > > > > MongoDB seems to be a superior solution (especially considering most
> >> > > > > > > of the SQL queries were already optimized in a manner that is more
> >> > > > > > > consistent with MongoDB's paradigms), there is one feature of which I
> >> > > > > > > could not find an equivalent in MongoDB: Explicit locks.
>
> >> > > > > > > Currently, I'm using explicit locks for three purposes:
> >> > > > > > > 1. Schema migration - This will be completely removed with the move to
> >> > > > > > > MongoDB.
> >> > > > > > > 2. Thrash reduction in job-queue polling. I use a lock to reduce
> >> > > > > > > collisions between workers polling job queues. The polling queries
> >> > > > > > > themselves are already designed to be atomic and collision free.
> >> > > > > > > However, a collision would result in a wasted poll-spin for all but one of
> >> > > > > > > the workers, and with a sufficiently large quantity of workers a
> >> > > > > > > pessimistic lock has proven to drastically boost performance. This is
> >> > > > > > > not mandatory, but I would rather keep it.
> >> > > > > > > 3. Transforming asynchronous calls into a synchronous callback. This
> >> > > > > > > part is mandatory. I have one synchronous script which can block up to
> >> > > > > > > 30 seconds waiting for a call to another script which would release
> >> > > > > > > it. I was able to achieve a semaphore-like blocking system using
> >> > > > > > > MySQL's GET_LOCK, IS_USED_LOCK, and KILL. It's a bit of a hack, but it
> >> > > > > > > serves its purpose, to a point.
>
> >> > > > > > > Point number 3 above is the most important one for me. As far as I
> >> > > > > > > could tell, MongoDB has no method to explicitly block for any
> >> > > > > > > controllable event, and, as the wait can reach several seconds even on
> >> > > > > > > normal operations, a spin-lock is simply not an option for me.
> >> > > > > > > Obviously, if there was a "wait for change" command for documents,
> >> > > > > > > even an optimistic one with no hard guarantees about changes made,
> >> > > > > > > that would solve my problem (Using a semi-blocking semi-spin
>
> ...
>