On Sun, Oct 21, 2012 at 1:09 PM, Assaf Lavie <assafla...@gmail.com> wrote:
> Hi Tad,
> I rebuilt the RS from a backup and the assertions are now gone, so it
> definitely looks like corruption.
> The mongod instances are running on my development Windows machine and are
> usually killed off by closing their console windows. There haven't been
> many hard shutdowns (e.g. power loss) at all - can't remember when one
> happened (there's a UPS here).
> In addition, the drive that contains the data is actually a mirrored RAID
> drive, so the chances of HW problems are very slim indeed.
> Nothing in the Windows Event Log related to NTFS.
> I should mention, though, that almost every time I re-launch the mongods
> of the RS I get the following:
> preallocating a journal file d:/data/mongosets/set3/journal/prealloc.2
> 52428800/1073741824 4%
> ...
> Which usually takes about a minute. This happens even now, after
> completely rebuilding the RS.
> Assaf
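A quick check of the figures in the "preallocating a journal file" progress line quoted above. This is only a sketch: it assumes the two numbers are byte counts (written/total) and that the percentage is truncated to a whole number. Both are inferences from the log line, not confirmed mongod behaviour, and `progress_summary` is a hypothetical helper name.

```python
# Figures copied from the "preallocating a journal file" progress line.
written = 52428800
total = 1073741824

MIB = 1024 * 1024


def progress_summary(written_bytes, total_bytes):
    """Return (written MiB, total MiB, whole-number percent)."""
    # Integer division truncates the percentage (assumed to match the log).
    percent = written_bytes * 100 // total_bytes
    return written_bytes // MIB, total_bytes // MIB, percent


print(progress_summary(written, total))
```

Under those assumptions the line reads as 50 MiB written out of a 1 GiB file, so a relaunch that preallocates the whole file has a full gigabyte to write.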
> On Sun, Oct 21, 2012 at 12:33 PM, Tad Marshall <t...@10gen.com> wrote:
>> Hi Assaf,
>> The assertion pointing to pdfile.h line 360 is indicating corruption.
>> This line verifies that an extent contains a valid signature, and the test
>> is failing.
>> This was also happening in the earlier log snippet you posted, which I
>> should have noticed, but I focused on the TTLMonitor aspect and missed the
>> corruption part. The second log post adds another indication of
>> corruption: Assertion: 10334:Invalid BSONObj size.
>> The collections that are named are in your 'local' database: 'local.me'
>> and 'local.slaves'. It seems that both are corrupt.
>> The messages every second are triggered by a loop that tries to reconnect
>> to the server that the secondary is trying to sync from. It is failing
>> each time, so you get repeating messages. At log level 1 you would see an
>> additional message: "replset could not connect to localhost:27017".
>> You are not getting a good stack trace (indicated by the "???" lines).
>> Is mongod.pdb in the same directory as mongod.exe? mongod.exe needs to
>> be able to find it in order to resolve the symbol names that appear in
>> the stack trace.
>> Has this server had hard shutdowns? Can you check the Windows Event Log
>> for NTFS errors?
>> Tad
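To illustrate why "Invalid BSONObj size: 0 (0x00000000)" points at damaged data: per the BSON format, every document starts with a little-endian int32 giving its total length, and the smallest valid document (an empty one) is 5 bytes. The sketch below is not mongod's code. `check_bson_length` is a hypothetical helper, and the zero-filled buffer is an assumed stand-in for what a damaged record might contain.

```python
import struct

# 4-byte length prefix + 1-byte terminator: the size of an empty document.
MIN_BSON_SIZE = 5


def check_bson_length(buf):
    """Return the declared size if the length prefix is plausible.

    Raises ValueError otherwise. Only the framing is checked, not the
    elements inside the document.
    """
    if len(buf) < 4:
        raise ValueError("buffer too short to hold a BSON length prefix")
    (declared,) = struct.unpack_from("<i", buf, 0)  # little-endian int32
    if declared < MIN_BSON_SIZE or declared > len(buf):
        raise ValueError(
            "Invalid BSON size: %d (0x%08x)" % (declared, declared & 0xFFFFFFFF)
        )
    if buf[declared - 1] != 0:
        raise ValueError("document is not NUL-terminated")
    return declared


empty_doc = b"\x05\x00\x00\x00\x00"  # a valid empty document
zeroed = b"\x00" * 16                # assumed stand-in for a zeroed record

print(check_bson_length(empty_doc))
try:
    check_bson_length(zeroed)
except ValueError as exc:
    print(exc)
```

A length of 0 can never come from a correctly written document, which is why it is read here as corruption rather than as a query problem.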
>> On Sunday, October 21, 2012 4:07:58 AM UTC-4, Assaf Lavie wrote:
>>> Thanks Tad.
>>> Running the recent nightly yields a different exception every second
>>> (regardless of actual queries):
>>> Sun Oct 21 09:59:38 [rsSyncNotifier] kernel32.dll
>>> BaseThreadInitThunk+0xd
>>> Sun Oct 21 09:59:38 [rsSyncNotifier] replset tracking exception:
>>> exception: 0 assertion d:\slave\windows_64bit_v2.2\mongo\src\mongo\db\pdfile.h:360
>>> Sun Oct 21 09:59:39 [rsSyncNotifier] replset setting oplog notifier to
>>> localhost:27017
>>> Sun Oct 21 09:59:39 [rsSyncNotifier] local.me Assertion failure isOk()
>>> d:\slave\windows_64bit_v2.2\mongo\src\mongo\db\pdfile.h 360
>>> However, this only happens on one of the secondary instances.
>>> If I run mongod with --traceExceptions I don't see an exception every
>>> second - I only see the following exceptions whenever a query is performed:
>>> Sun Oct 21 10:02:56 [slaveTracking] update local.slaves query: { _id:
>>> ObjectId('4fba5e0d0c61cd1166308cac'), host: "127.0.0.1", ns:
>>> "local.oplog.rs" } update: { $set: { syncedTo: Timestamp 1350806576000|1
>>> } } keyUpdates:0 exception: Invalid BSONObj size: 0 (0x00000000) first
>>> element: EOO code:10334 locks(micros) w:88764 88ms
>>> Sun Oct 21 10:02:56 [slaveTracking] Assertion: 10334:Invalid BSONObj
>>> size: 0 (0x00000000) first element: EOO
>>> Sun Oct 21 10:02:56 [slaveTracking] mongod.exe ???
>>> ...
>>> Sun Oct 21 10:02:56 [slaveTracking] mongod.exe ???
>>> Sun Oct 21 10:02:56 [slaveTracking] kernel32.dll
>>> BaseThreadInitThunk+0xd
>>> Sun Oct 21 10:02:56 [slaveTracking] warning: DBException thrown ::
>>> caused by :: 10334 Invalid BSONObj size: 0 (0x00000000) first element: EOO
>>> Assaf
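When several threads log at once ([rsSyncNotifier], [slaveTracking], [TTLMonitor]), it can help to tally which thread emits which assertion code. A minimal sketch, assuming the "Day Mon DD HH:MM:SS [thread] message" layout seen in the lines quoted above. `tally` is a hypothetical helper, and the sample lines are copied from this thread with the mail wrapping undone.

```python
import re
from collections import Counter

# "Sun Oct 21 09:59:38 [rsSyncNotifier] message..."
LINE = re.compile(
    r"^\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2} \[(?P<thread>[^\]]+)\] (?P<msg>.*)$"
)
CODE = re.compile(r"Assertion: (\d+):")


def tally(lines):
    """Count log lines per thread and assertion codes per occurrence."""
    threads = Counter()
    codes = Counter()
    for line in lines:
        match = LINE.match(line)
        if not match:
            continue  # skip continuation lines and anything unrecognised
        threads[match.group("thread")] += 1
        code = CODE.search(match.group("msg"))
        if code:
            codes[int(code.group(1))] += 1
    return threads, codes


sample = [
    "Sun Oct 21 09:59:38 [rsSyncNotifier] kernel32.dll BaseThreadInitThunk+0xd",
    "Sun Oct 21 09:59:39 [rsSyncNotifier] replset setting oplog notifier to localhost:27017",
    "Sun Oct 21 10:02:56 [slaveTracking] Assertion: 10334:Invalid BSONObj size: 0 (0x00000000) first element: EOO",
    "Sun Oct 21 10:02:56 [slaveTracking] mongod.exe ???",
]

threads, codes = tally(sample)
print(dict(threads), dict(codes))
```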
>>> On Fri, Oct 19, 2012 at 9:53 PM, Tad Marshall <t...@10gen.com> wrote:
>>>> Hi Assaf,
>>>> The error messages from TTLMonitor may be SERVER-6911
>>>> (https://jira.mongodb.org/browse/SERVER-6911),
>>>> which is fixed in version 2.2.1-rc1. Version 2.2.1-rc1 is close to release
>>>> but not released as of this writing. You could try a recent nightly build
>>>> in the 2.2 branch
>>>> ( http://downloads.mongodb.org/win32/mongodb-win32-x86_64-v2.2-latest.zip )
>>>> to see if it makes the TTLMonitor messages go away.
>>>> The [slaveTracking] message may be something different. Was there a
>>>> stack trace following this line? This would help pin down where the error
>>>> is coming from.
>>>> If there was no stack trace, could you try either restarting mongod.exe
>>>> with an added "--traceExceptions" command line switch or issuing a command
>>>> in the shell to enable stack traces for all exceptions:
>>>> db.adminCommand( { setParameter: 1, traceExceptions: true} )
>>>> On Windows, you need to have the mongod.pdb file in the same directory
>>>> as mongod.exe to get usable stack traces.
>>>> Let us know if this works for you, thanks!
>>>> Tad
>>>> On Wednesday, October 17, 2012 11:14:20 AM UTC-4, Assaf Lavie wrote:
>>>>> After upgrading Mongo to 2.3 on my local Windows dev machine, suddenly
>>>>> the replica set console windows report an assertion whenever any query is
>>>>> made against the DB:
>>>>> [slaveTracking] update local.slaves query: { _id:
>>>>> ObjectId('4fba5ef4bc2c86368476beed'), host: "127.0.0.1", ns:
>>>>> "local.oplog.rs" } update: { $set: { syncedTo: Timestamp
>>>>> 1350486453000|1 } } keyUpdates:0 exception: Invalid BSONObj size: 0
>>>>> (0x00000000) first element: EOO code:10334 locks(micros) w:44430 44ms
>>>>> [TTLMonitor] assertion 0 assertion
>>>>> d:\slave\windows_64bit_v2.2\mongo\src\mongo\db\pdfile.h:360
>>>>> ns:local.system.indexes query:{ expireAfterSeconds: { $exists: true } }
>>>>> [TTLMonitor] problem detected during query over local.system.indexes :
>>>>> { $err: "assertion d:\slave\windows_64bit_v2.2\mongo\src\mongo\db\pdfile.h:360" }
>>>>> [TTLMonitor] ERROR: error processing ttl for db: local 10065 invalid
>>>>> parameter: expected an object ()
>>>>> Any idea what could cause this? All 3 nodes in the RS are on the same
>>>>> machine and, aside from these errors, the queries themselves seem to succeed.
>>>>> Assaf
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "mongodb-user" group.
>>>> To post to this group, send email to mongodb-user@googlegroups.com
>>>> To unsubscribe from this group, send email to
>>>> mongodb-user+unsubscribe@googlegroups.com
>>>> See also the IRC channel -- freenode.net#mongodb