Zardosht Kasheff
May 19, 2013, 10:22:49 PM
to mongo...@googlegroups.com
I think I see a potential issue with secondaries reporting the
position of other secondaries (through GhostSync), and I am wondering
if my understanding is correct.
Suppose we have a replication chain S3->S2->S1->P. P is the
primary, and S1, S2, S3 are secondaries that replicate down the chain.
S1's GhostSync will be responsible for percolating the slave positions
of S3 and S2. The problem I see is that the GhostSync::percolate calls
are serialized, and are done via getMore tailable cursor queries that
will wait if there is no new data. I think I am seeing the following
happening:
- S2 is up to date, and is percolating this fact through S1
- S1 relays this fact by doing a getMore query that is hanging,
because it is a tailable cursor awaiting data
- while the query is hanging, S3 percolates its slave position through
S2, and that update is now stuck on S1 behind the running query for S2.
As a result, S3's information is not immediately delivered to P,
because S1 is busy waiting on a getMore query while percolating S2's
information. S3's information can be delayed by seconds (whatever the
getMore timeout is).
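To make the timeline concrete, here is a toy model of the serialization in Python. It is not MongoDB code: the function name is made up for illustration, and the 2-second block is only an assumed stand-in for however long the getMore waits on the tailable cursor.

```python
def relay_start_times(updates, block_seconds):
    """Return {member: time at which S1 can begin relaying its position}.

    updates is a list of (member, arrival_time) pairs reaching S1.
    S1 handles one relay at a time, and each relay is assumed to occupy S1
    for block_seconds (a getMore waiting on a tailable cursor with no data).
    """
    free_at = 0.0  # time at which S1 finishes its current relay
    starts = {}
    for member, arrival in sorted(updates, key=lambda u: u[1]):
        start = max(arrival, free_at)   # wait for the previous relay to return
        starts[member] = start
        free_at = start + block_seconds
    return starts


# S2's update reaches S1 at t=0.0, S3's at t=0.5 (hypothetical numbers).
starts = relay_start_times([("S2", 0.0), ("S3", 0.5)], block_seconds=2.0)
print(starts)
```

In this model S3's update arrives at S1 at 0.5s but cannot start moving toward P until S2's relay returns, so its delay grows with the assumed block time.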
Is this issue valid?
On a side note, why does the GhostSync percolate code use a cursor to
update slave position through the chain and not use some custom
command? If it could use a command that said "my new position is X",
then this issue, along with something like SERVER-8073, would go away
(I think).
-Zardosht