Updates produce incorrect oplog after upgrade to 2.2.1

285 views
Skip to first unread message

DaveC

unread,
Nov 16, 2012, 9:06:03 AM11/16/12
to mongod...@googlegroups.com
The oplog seem broken on updates that perform multiple $set operations in 2.2.1.  In my case, this is causing problems tailing the oplog, but I think it is also probably causing some updates not to be replicated properly.

In prior versions, including 2.2.0, I would see this type of oplog entry when performing a update on a document, setting multiple fields:

{
        "ts" : Timestamp(1353074108000, 3),
        "h" : NumberLong("-3950642131243057651"),
        "op" : "u",
        "ns" : "mydb.mycollection",
        "o2" : {
                "_id" : "my-unique-id-1234"
        },
        "o" : {
                "$set" : {
                        "aaa" : "1234",
                        "bbb" : "2345",
                        "ccc" : "3456",
                        "ddd" : "4567",
                        "eee" : "5678",
                        "fff" : "6789",
                        "ggg" : "7890",
                        "hhh" : "asdf"
                }
        }
}

When I perform updates in 2.2.1, the rs.oplog gets strange entries.  For example, if I set 8 different fields on a document, I will see the same $set operation in the oplog entry 8 times, only including one of the $set operations from the update.  I see the same behavior in 2.2.2-rc0:

{
        "ts" : Timestamp(1353073548000, 3),
        "h" : NumberLong("7749278417229495821"),
        "v" : 2,
        "op" : "u",
        "ns" : "mydb.mycollection",
        "o2" : {
                "_id" : "my-unique-id-1234"
        },
        "o" : {
                "$set" : {
                        "aaa" : "12345"
                },
                "$set" : {
                        "aaa" : "12345"
                },
                "$set" : {
                        "aaa" : "12345"
                },
                "$set" : {
                        "aaa" : "12345"
                },
                "$set" : {
                        "aaa" : "12345"
                },
                "$set" : {
                        "aaa" : "12345"
                },
                "$set" : {
                        "aaa" : "12345"
                },
                "$set" : {
                        "aaa" : "12345"
                }
        }
}


gregor

unread,
Nov 16, 2012, 2:26:27 PM11/16/12
to mongod...@googlegroups.com
what opetataion was this for - what was the original text of the update on the primary?

Eliot

unread,
Nov 16, 2012, 4:00:33 PM11/16/12
to mongod...@googlegroups.com
I think this is a shell display issue.
To confirm, can you do a mongodump of local.oplog.rs.
Then do a bsondump of that file, and grep for 3950642131243057651
You should see the correct update.

DaveC

unread,
Nov 16, 2012, 4:28:02 PM11/16/12
to mongod...@googlegroups.com
Here is a sample of my shell commands on a brand new 2.2.1 replica set:

rs0:PRIMARY> db.mycoll.save({_id:"unique-id-1234", "aaa":"0000", "bbb": "1111"})
rs0:PRIMARY> db.mycoll.update({_id:"unique-id-1234"}, {$set:{"aaa":"1234","bbb":"7654"}})
rs0:PRIMARY> use local
rs0:PRIMARY> db.oplog.rs.find().pretty()
{
        "ts" : Timestamp(1353099630000, 1),
        "h" : NumberLong(0),
        "v" : 2,
        "op" : "n",
        "ns" : "",
        "o" : {
                "msg" : "initiating set"
        }
}
{
        "ts" : Timestamp(1353099716000, 1),
        "h" : NumberLong("6562103115049336849"),
        "v" : 2,
        "op" : "i",
        "ns" : "sample.mycoll",
        "o" : {
                "_id" : "unique-id-1234",
                "aaa" : "0000",
                "bbb" : "1111"
        }
}
{
        "ts" : Timestamp(1353099904000, 1),
        "h" : NumberLong("-7902296395036453900"),
        "v" : 2,
        "op" : "u",
        "ns" : "sample.mycoll",
        "o2" : {
                "_id" : "unique-id-1234"
        },
        "o" : {
                "$set" : {
                        "aaa" : "1234"
                },
                "$set" : {
                        "aaa" : "1234"
                }
        }
}

If I do a mongodump of local.oplog.rs, and then a bsondump, this is what I get:

c:\mongodb-2.2.1\bin>bsondump.exe dump\local\oplog.rs.bson
{ "ts" : { "t" : 1353099630000 , "i" : 1 }, "h" : 0, "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
{ "ts" : { "t" : 1353099716000 , "i" : 1 }, "h" : 6562103115049336849, "v" : 2,"op" : "i", "ns" : "sample.mycoll", "o" : { "_id" : "unique-id-1234", "aaa" : "0000", "bbb" : "1111" } }
{ "ts" : { "t" : 1353099904000 , "i" : 1 }, "h" : -7902296395036453900, "v" : 2, "op" : "u", "ns" : "sample.mycoll", "o2" : { "_id" : "unique-id-1234" }, "o" :{ "$set" : { "aaa" : "1234" }, "$set" : { "bbb" : "7654" } } }
3 objects found

This looks better, but there is still a problem since the "o" subdocument from the update contains multiple keys with the same name, "$set".  Is this a valid BSON representation?

"o" :{ "$set" : { "aaa" : "1234" }, "$set" : { "bbb" : "7654" } }

If I use a driver to query the oplog, I get an exception, so I don't think is it just a shell display issue.  Here is an example with the C# driver:

var local = MongoDB.Driver.MongoDatabase.Create("mongodb://localhost/local");
foreach (var doc in local["oplog.rs"].FindAll()) {
    Console.WriteLine(doc);
}

Unhandled Exception: System.InvalidOperationException: Duplicate element name '$set'.
   at MongoDB.Bson.BsonDocument.Add(BsonElement element) in C:\work\rstam\mongo-csharp-driver\Bson\ObjectModel\BsonDocument.cs:line 468
   at MongoDB.Bson.BsonDocument.Deserialize(BsonReader bsonReader, Type nominalType, IBsonSerializationOptions options) in C:\work\rstam\mongo-csharp-driver\Bso
n\ObjectModel\BsonDocument.cs:line 783
   at MongoDB.Bson.BsonDocument.ReadFrom(BsonReader bsonReader) in C:\work\rstam\mongo-csharp-driver\Bson\ObjectModel\BsonDocument.cs:line 409
   at MongoDB.Bson.BsonValue.ReadFrom(BsonReader bsonReader) in C:\work\rstam\mongo-csharp-driver\Bson\ObjectModel\BsonValue.cs:line 1069
   at MongoDB.Bson.BsonElement.ReadFrom(BsonReader bsonReader, BsonElement& element) in C:\work\rstam\mongo-csharp-driver\Bson\ObjectModel\BsonElement.cs:line 1
67
   at MongoDB.Bson.BsonDocument.Deserialize(BsonReader bsonReader, Type nominalType, IBsonSerializationOptions options) in C:\work\rstam\mongo-csharp-driver\Bso
n\ObjectModel\BsonDocument.cs:line 783
   at MongoDB.Bson.Serialization.BsonSerializer.Deserialize(BsonReader bsonReader, Type nominalType, IBsonSerializationOptions options) in C:\work\rstam\mongo-c
sharp-driver\Bson\Serialization\BsonSerializer.cs:line 211
   at MongoDB.Driver.Internal.MongoReplyMessage`1.ReadFrom(BsonBuffer buffer, IBsonSerializationOptions serializationOptions) in C:\work\rstam\mongo-csharp-driv
er\Driver\Internal\MongoReplyMessage.cs:line 105
   at MongoDB.Driver.Internal.MongoConnection.ReceiveMessage[TDocument](BsonBinaryReaderSettings readerSettings, IBsonSerializationOptions serializationOptions)
 in C:\work\rstam\mongo-csharp-driver\Driver\Internal\MongoConnection.cs:line 47
5
   at MongoDB.Driver.MongoCursorEnumerator`1.GetReply(MongoConnection connection, MongoRequestMessage message) in C:\work\rstam\mongo-csharp-driver\Driver\Core\
MongoCursorEnumerator.cs:line 296
   at MongoDB.Driver.MongoCursorEnumerator`1.GetFirst() in C:\work\rstam\mongo-csharp-driver\Driver\Core\MongoCursorEnumerator.cs:line 251
   at MongoDB.Driver.MongoCursorEnumerator`1.MoveNext() in C:\work\rstam\mongo-csharp-driver\Driver\Core\MongoCursorEnumerator.cs:line 141
   at OplogTest.Program.Main(String[] args) in C:\src\Samples\OplogTest\OplogTest\Program.cs:line 18

Eliot Horowitz

unread,
Nov 17, 2012, 9:06:43 AM11/17/12
to mongod...@googlegroups.com
Ok thanks.

Having multiple fields in the same document is actually valid BSON.
It i used internally for replication for some historical reasons.
However, it is not handled well by the shell or by the drivers.
We are you going to be changing replication to not use multiple fields
with the same name.
What this does mean is that the replication side of things is working
as expected, but introspection on the oplog is a bit trickier.

Do you need to introspect it or were you just looking?
> --
> You received this message because you are subscribed to the Google
> Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com
> To unsubscribe from this group, send email to
> mongodb-user...@googlegroups.com
> See also the IRC channel -- freenode.net#mongodb

DaveC

unread,
Nov 17, 2012, 12:45:03 PM11/17/12
to mongod...@googlegroups.com, el...@10gen.com
I tail the oplog to perform some application side logic when certain collections are updated, so I need to be able to parse the documents.  This was working until 2.2.1, so it seems strange that the oplog format would change to this new format where there are multiple $set keys.

I don't understand what you mean by this statement:

We are you going to be changing replication to not use multiple fields with the same name.

Does that mean that you are changing the format back or will there be a new format?

Eliot Horowitz

unread,
Nov 17, 2012, 11:16:22 PM11/17/12
to DaveC, mongod...@googlegroups.com
We've used duplicate names in replication forever, so 2.2.1 just
changed one case.

Yes, I meant we are changing to not use duplicate names anywhere
internally as they don't work that well with any driver except c++ at
this point.

Rob Moore

unread,
Nov 18, 2012, 11:22:34 AM11/18/12
to mongod...@googlegroups.com, DaveC, el...@10gen.com

Dave -

If your app is written in Java the async driver[1] can read and manipulate these documents correctly.

You will want to stay away from the "get(...)" methods on the Document interface as the implementations build a map of the keys in the background. 

If you have enabled assertions you will want to turn those off as there are a few places that will validate the document only contains a single value for each key.  Assertions are off by default but some test launchers will turn them on.

Rob.

[1] http://www.allanbank.com/mongodb-async-driver/

DaveC

unread,
Nov 18, 2012, 2:56:34 PM11/18/12
to mongod...@googlegroups.com, DaveC, el...@10gen.com
@Eliot - duplicate names in replication may have been supported, but they certainly weren't appearing as a result of $set operations until 2.2.1.  I'm not sure what queries even produce them as I've been running this code since MongoDB 1.8 on a variety of different applications and never encountered an oplog entry with duplicate keys until upgrading to 2.2.1.  Is there a chance that you'll revert back to the old format or at least choose a format that won't use duplicate key names for 2.2.2?  I can hold off on upgrading to 2.2.1, but at some point, there might be a fix or feature that necessitates an upgrade and I hope the oplog will be tailable again at that point.

@Rob - thanks, I can try that to see if it will suit my needs. My code is in .NET and has remained unchanged since it was written on MongoDB 1.8, so it's a little frustrating that the oplog in 2.2.1 uses a new format for $set operations that drivers don't handle well.  I can try the Java async driver, and if it works, maybe the rest of the code wouldn't be too difficult to port.

Thanks,
Dave

DaveC

unread,
Nov 20, 2012, 4:11:48 PM11/20/12
to mongod...@googlegroups.com, DaveC, el...@10gen.com
Since 2.2.2-rc1 is out, I tried there as well.  Same untailable oplog format that appeared in 2.2.1.  Do I need to create a JIRA to get this fixed?

Eliot Horowitz

unread,
Nov 21, 2012, 7:05:50 AM11/21/12
to DaveC, mongod...@googlegroups.com
There is a ticket to change overall, but can you open a ticket
specific to your case?

DaveC

unread,
Dec 7, 2012, 12:25:56 PM12/7/12
to mongod...@googlegroups.com, el...@10gen.com
I've created a ticket for this.  Sorry for the delay.  This behavior still exists in 2.2.2, preventing me from taking new releases.



On Wednesday, November 21, 2012 7:06:03 AM UTC-5, Eliot Horowitz wrote:
There is a ticket to change overall, but can you open a ticket
specific to your case?

Reply all
Reply to author
Forward
0 new messages