Account Options

  1. Sign in
The old Google Groups will be going away soon, but your browser is incompatible with the new version.
Google Groups Home
« Groups Home
"access violation" MongoDB v2.0.4 on Windows 2008 R2
There are currently too many topics in this group that display first. To make this topic appear first, remove this option from another topic.
There was an error processing your request. Please try again.
flag
  4 messages - Collapse all  -  Translate all to Translated (View all originals)
The group you are posting to is a Usenet group. Messages posted to this group will make your email address visible to anyone on the Internet.
Your reply message has not been sent.
Your post was successful
 
From:
To:
Cc:
Followup To:
Add Cc | Add Followup-to | Edit Subject
Subject:
Validation:
For verification purposes please type the characters you see in the picture below or the numbers you hear by clicking the accessibility icon. Listen and type the numbers you hear
 
Mo Cassidy  
View profile  
 More options May 22 2012, 5:18 pm
From: Mo Cassidy <mo.cass...@gmail.com>
Date: Tue, 22 May 2012 14:18:19 -0700 (PDT)
Local: Tues, May 22 2012 5:18 pm
Subject: "access violation" MongoDB v2.0.4 on Windows 2008 R2

Today, we had the following seemingly spontaneously occur:

Tue May 22 14:51:45 [conn27990] update default.instances query: {
[...SNIP...] } idhack:1 upsert:1 124ms
Tue May 22 14:51:47 access violation
Tue May 22 14:52:15 [rsHealthPoll] replSet member iis-man-002:27021 is now
in state PRIMARY
Tue May 22 14:52:17 [rsSync] replSet syncing to: iis-man-002:27021

***** SERVER RESTARTED *****

Tue May 22 14:52:19 [initandlisten] MongoDB starting : pid=1960 port=27021
dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 14:52:19 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 14:52:19 [initandlisten] git version:
329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 14:52:19 [initandlisten] build info: windows
sys.getwindowsversion(major=6, minor=1, build=7601, platform=2,
service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 14:52:19 [initandlisten] options: { dbpath:
"C:\SitesData\testing-website\mongodata", logappend: true, logpath:
"C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true,
replSet: "rsTest", rest: true, service: true }
Tue May 22 14:52:19 [initandlisten] journal
dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 14:52:19 [initandlisten] recover begin
Tue May 22 14:52:19 [initandlisten] recover lsn: 258699461
Tue May 22 14:52:19 [initandlisten] recover
C:/SitesData/testing-website/mongodata/journal/j._4
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209344204 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209382318 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209420683 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209458887 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209497082 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209535336 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209573520 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209611794 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
seq:209649958 < lsn:258699461
Tue May 22 14:52:20 [initandlisten] recover skipping application of section
more...
Tue May 22 14:52:58 [initandlisten] recover
C:/SitesData/testing-website/mongodata/journal/j._5
Tue May 22 14:53:17 [initandlisten] recover cleaning up
Tue May 22 14:53:17 [initandlisten] removeJournalFiles
Tue May 22 14:53:17 [initandlisten] recover done
Tue May 22 14:53:18 [initandlisten] waiting for connections on port 27021
Tue May 22 14:53:18 [websvr] admin web console waiting for connections on
port 28021
Tue May 22 14:53:19 [rsStart] trying to contact iis-man-002:27021
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-002:27021 is now
in state PRIMARY
Tue May 22 14:53:19 [rsStart] replSet STARTUP2
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 14:53:19 [rsHealthPoll] replSet member iis-man-001:27021 is now
in state ARBITER
Tue May 22 14:53:19 [rsSync] replSet SECONDARY
Tue May 22 14:53:23 [rsSync] replSet syncing to: iis-man-002:27021
Tue May 22 14:53:24 access violation

***** SERVER RESTARTED *****

Tue May 22 14:53:25 [initandlisten] MongoDB starting : pid=2356 port=27021
dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 14:53:25 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 14:53:25 [initandlisten] git version:
329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 14:53:25 [initandlisten] build info: windows
sys.getwindowsversion(major=6, minor=1, build=7601, platform=2,
service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 14:53:25 [initandlisten] options: { dbpath:
"C:\SitesData\testing-website\mongodata", logappend: true, logpath:
"C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true,
replSet: "rsTest", rest: true, service: true }
Tue May 22 14:53:25 [initandlisten] journal
dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 14:53:25 [initandlisten] recover begin
Tue May 22 14:53:25 [initandlisten] recover lsn: 844
Tue May 22 14:53:25 [initandlisten] recover
C:/SitesData/testing-website/mongodata/journal/j._0
Tue May 22 14:53:25 [initandlisten] recover cleaning up
Tue May 22 14:53:25 [initandlisten] removeJournalFiles
Tue May 22 14:53:25 [initandlisten] recover done
Tue May 22 14:53:25 [initandlisten] waiting for connections on port 27021
Tue May 22 14:53:25 [websvr] admin web console waiting for connections on
port 28021
Tue May 22 14:53:25 [rsStart] replSet STARTUP2
Tue May 22 14:53:25 [rsSync] replSet SECONDARY
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-002:27021 is now
in state PRIMARY
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 14:53:25 [rsHealthPoll] replSet member iis-man-001:27021 is now
in state ARBITER
Tue May 22 14:53:30 [rsSync] replSet syncing to: iis-man-002:27021
Tue May 22 14:53:30 access violation

***** SERVER RESTARTED *****

Tue May 22 14:53:30 [initandlisten] MongoDB starting : pid=2268 port=27021
dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 14:53:30 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 14:53:30 [initandlisten] git version:
329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 14:53:30 [initandlisten] build info: windows
sys.getwindowsversion(major=6, minor=1, build=7601, platform=2,
service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 14:53:30 [initandlisten] options: { dbpath:
"C:\SitesData\testing-website\mongodata", logappend: true, logpath:
"C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true,
replSet: "rsTest", rest: true, service: true }
Tue May 22 14:53:30 [initandlisten] journal
dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 14:53:30 [initandlisten] recover begin
Tue May 22 14:53:30 [initandlisten] info no lsn file in journal/ directory
Tue May 22 14:53:30 [initandlisten] recover lsn: 0
Tue May 22 14:53:30 [initandlisten] recover
C:/SitesData/testing-website/mongodata/journal/j._0
Tue May 22 14:53:30 [initandlisten] recover cleaning up
Tue May 22 14:53:30 [initandlisten] removeJournalFiles
Tue May 22 14:53:30 [initandlisten] recover done
Tue May 22 14:53:30 [initandlisten] waiting for connections on port 27021
Tue May 22 14:53:30 [websvr] admin web console waiting for connections on
port 28021
Tue May 22 14:53:30 [rsStart] replSet STARTUP2
Tue May 22 14:53:30 [rsSync] replSet SECONDARY
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-002:27021 is now
in state PRIMARY
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 14:53:30 [rsHealthPoll] replSet member iis-man-001:27021 is now
in state ARBITER
Tue May 22 14:53:35 [rsSync] replSet syncing to: iis-man-002:27021
Tue May 22 14:53:35 access violation

***** SERVER RESTARTED *****

The last pattern above repeated many times until someone noticed the error
in our website logging system.
After seeing that, I manually stopped the service, which gave the following
log messages:

***** SERVER RESTARTED *****

Tue May 22 15:20:03 [initandlisten] MongoDB starting : pid=3132 port=27021
dbpath=C:\SitesData\testing-website\mongodata 64-bit host=iis-man-003
Tue May 22 15:20:03 [initandlisten] db version v2.0.4, pdfile version 4.5
Tue May 22 15:20:03 [initandlisten] git version:
329f3c47fe8136c03392c8f0e548506cb21f8ebf
Tue May 22 15:20:03 [initandlisten] build info: windows
sys.getwindowsversion(major=6, minor=1, build=7601, platform=2,
service_pack='Service Pack 1') BOOST_LIB_VERSION=1_42
Tue May 22 15:20:03 [initandlisten] options: { dbpath:
"C:\SitesData\testing-website\mongodata", logappend: true, logpath:
"C:\SitesData\testing-website\mongo.log", port: 27021, quiet: true,
replSet: "rsTest", rest: true, service: true }
Tue May 22 15:20:03 [initandlisten] journal
dir=C:/SitesData/testing-website/mongodata/journal
Tue May 22 15:20:03 [initandlisten] recover begin
Tue May 22 15:20:03 [initandlisten] info no lsn file in journal/ directory
Tue May 22 15:20:03 [initandlisten] recover lsn: 0
Tue May 22 15:20:03 [initandlisten] recover
C:/SitesData/testing-website/mongodata/journal/j._0
Tue May 22 15:20:03 [initandlisten] recover cleaning up
Tue May 22 15:20:03 [initandlisten] removeJournalFiles
Tue May 22 15:20:03 [initandlisten] recover done
Tue May 22 15:20:03 [initandlisten] waiting for connections on port 27021
Tue May 22 15:20:03 [websvr] admin web console waiting for connections on
port 28021
Tue May 22 15:20:03 [rsStart] replSet STARTUP2
Tue May 22 15:20:03 [rsSync] replSet SECONDARY
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-002:27021 is up
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-002:27021 is now
in state PRIMARY
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-001:27021 is up
Tue May 22 15:20:03 [rsHealthPoll] replSet member iis-man-001:27021 is now
in state ARBITER
Tue May 22 15:20:04 shutdown: going to close listening sockets...
Tue May 22 15:20:04 closing listening socket: 320
Tue May 22 15:20:04 closing listening socket: 368
Tue May 22 15:20:04 shutdown: going to flush diaglog...
Tue May 22 15:20:04 shutdown: going to close sockets...
Tue May 22 15:20:04 shutdown: waiting for fs preallocator...
Tue May 22 15:20:04 shutdown: lock for final commit...
Tue May 22 15:20:04 shutdown: final commit...
Tue May 22 15:20:04 shutdown: closing all files...
Tue May 22 ...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Tad Marshall  
View profile  
 More options May 23 2012, 6:18 am
From: Tad Marshall <t...@10gen.com>
Date: Wed, 23 May 2012 03:18:26 -0700 (PDT)
Local: Wed, May 23 2012 6:18 am
Subject: Re: "access violation" MongoDB v2.0.4 on Windows 2008 R2

There are a couple of things going on here.  The main one is that it seems
that something being replicated is causing an access violation on your
secondary.  This is an error in memory addressing, the same thing as a
segfault (signal 11) on Unix.  The second thing is that the Windows service
settings that were created by mongod.exe when you used --install to install
the service instruct the Windows Service Control Manager to restart
mongod.exe if it ever fails.  Because you have journaling enabled, mongod
is able to recover from each restart, but when it recovers it starts
syncing from the primary again, reads the same oplog entry that killed it
the previous time and so gets another access violation and the process
repeats.

Version 2.0.4 gives very little information about the access violation,
just the faulting address, which makes it hard to debug.  In addition,
Address Space Layout Randomization (ASLR) makes the address hard to use in
finding out what source code line triggered the crash.  Version 2.2 will
behave a little better on access violations in Windows: it will show the
thread name that crashed (which is almost certainly rsSync for your crash),
the type of memory access and the address that was used.  In addition, it
will generate a minidump that can help diagnose the problem and then shut
down "cleanly" so no recovery will be required on restart.

There are a couple of things that could be tried to get you out of this
hole.  One possibility is to temporarily try the most recent nightly build
and see if the crashing problem has been fixed.  If it hasn't been fixed,
you will get a minidump that we could use to analyze the crash.  If this
fixes the problem, you could let it run long enough to fully resync from
the primary and then switch back to your 2.0.4 version.

There are other options that could be tried, such as wiping the data and
restoring from a dump from the primary, but it might be worth trying the
nightly build first.  If it worked, it would be a lot faster.  The current
nightly build is getting close to the version that will ship as 2.2.0-rc0
(the first release candidate for version 2.2) but it isn't there yet and we
don't recommend running "unstable" nightly builds in production.  As a
short term test to get you past this outage, it might be worth trying.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Mo Cassidy  
View profile  
 More options Jun 29 2012, 12:31 pm
From: Mo Cassidy <mo.cass...@gmail.com>
Date: Fri, 29 Jun 2012 09:31:48 -0700 (PDT)
Local: Fri, Jun 29 2012 12:31 pm
Subject: Re: "access violation" MongoDB v2.0.4 on Windows 2008 R2

Wow, sorry Tad, I had totally forgotten about this until today (guess what
happened again today). To get out of the hole, I just went ahead and did a
full resync. It didn't affect performance a whole lot and was done
relatively quickly (I think maybe 30 minutes). Thanks for all the info,
it's really interesting. I'll just do the full resync again today, and I
look forward to 2.2 so I'll have more info if this happens again! :)

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
Mo Cassidy  
View profile  
 More options Aug 7 2012, 11:01 am
From: Mo Cassidy <mo.cass...@gmail.com>
Date: Tue, 7 Aug 2012 08:01:12 -0700 (PDT)
Local: Tues, Aug 7 2012 11:01 am
Subject: Re: "access violation" MongoDB v2.0.4 on Windows 2008 R2

Hmmm. It happened a third time, today. I noticed something fishy about the
timing, so I calculated the time between failures (some of it is estimated,
because my records aren't complete), and it was ~38 days each time (930
hours between the 1st and 2nd failure, and 910 hours between the 2nd and
3rd failure). It's always my secondary (I have two active nodes and an
arbiter) that crashes like this, but the secondary is always running on our
8GB RAM server, while the primary is on a 12GB server. Both servers are
shared by Mongo and fairly busy IIS nodes of a webfarm. I haven't been
closely watching what's going on with 2.2, but hopefully this is useful
information for the devs.

...

read more »


 
You must Sign in before you can post messages.
To post a message you must first join this group.
Please update your nickname on the subscription settings page before posting.
You do not have the permission required to post.
End of messages
« Back to Discussions « Newer topic     Older topic »