After finally catching up, we get this line in the logs:
replSet initial sync query minValid
Then it has to do the final sync to get caught up.
Unfortunately, something seems to be wrong here because the DBs are
just syncing really slowly. I am not sure if it has something to do
with the box that it's syncing from, the RAID0 setup or what. At its
current pace, it is just not catching up to the master; it's falling
behind.
What I want to do is upgrade to 2.0.2, and try to get it caught up
again but I do not want to do this if the server is going to do an
entirely fresh sync again, because this has taken a long time
already. Will that happen or will it just try to catch up since the
'initial sync query minValid' has been reached?
Also, any ideas on what to do about our RAID0 setup? It's just weird,
the new servers are literally falling more and more behind and it just
seems like it shouldn't take this long to sync.
What does mongostat --discover look like? Are there lots of ops on the
primary? If so, that will show it.
What does rs.status() look like?
Please post all results to gist/pastie/pastebin so they are more readable.
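
For reading the rs.status() output, here is a rough sketch of how the lag per member can be worked out. It is plain JavaScript (runnable in Node.js) over a status-shaped document, not something from this thread: the field names (members, name, stateStr, optimeDate) follow rs.status() output, but the host names and timestamps in the sample are made up.

```javascript
// Sketch: per-member replication lag from an rs.status()-shaped document.
// The sample document below is hypothetical; real values come from rs.status().
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) {
    throw new Error("no PRIMARY in status document");
  }
  const lag = {};
  for (const m of status.members) {
    // Skip the primary itself and members that report no optime (e.g. arbiters).
    if (m === primary || !m.optimeDate) continue;
    lag[m.name] =
      (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
  }
  return lag;
}

// Hypothetical sample: one healthy secondary, one far behind.
const sampleStatus = {
  members: [
    { name: "db1.example:27017", stateStr: "PRIMARY",
      optimeDate: new Date("2011-12-15T02:00:00Z") },
    { name: "db4.example:27017", stateStr: "SECONDARY",
      optimeDate: new Date("2011-12-15T01:59:58Z") },
    { name: "db5.example:27017", stateStr: "SECONDARY",
      optimeDate: new Date("2011-12-14T20:00:00Z") },
  ],
};

// db4 is 2 seconds behind, db5 is 21600 seconds (6 hours) behind.
console.log(replicationLagSeconds(sampleStatus));
```

Running this on two snapshots taken some minutes apart shows whether a member's lag is growing or shrinking, which matters more than the absolute number.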
> --
> You received this message because you are subscribed to the Google Groups "mongodb-user" group.
> To post to this group, send email to mongod...@googlegroups.com.
> To unsubscribe from this group, send email to mongodb-user...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
>
I assume 4/5/6 (id from rs.conf) are the new ones, is that correct?
Where is iostat from, and what devices are what in those stats? Where
are the db files (which devices and what config)? (I assume md0 is a
stripe of xvd*)
It looks like there are a decent number of faults and the disk is
showing a good bit of use from that iostat.
It seems you have ~350GB of data files and only 24GB of memory (just
guessing; can you run 'free -ltm' on the new servers and post those
results as well?).
Please also run this:
db.printReplicationInfo()
db.printSlaveReplicationInfo()
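
Why these two matter: db.printReplicationInfo() reports the oplog window (log length start to end), and db.printSlaveReplicationInfo() reports how far behind each secondary is. A secondary whose lag exceeds the oplog window can no longer catch up and needs a full resync. Below is a back-of-the-envelope sketch in plain JavaScript for estimating how long that takes at the current trend. The function name and the numbers are hypothetical, and it assumes the oplog window and the lag growth rate both stay constant, which they will not exactly.

```javascript
// Sketch: estimate hours until a lagging secondary falls off the oplog.
// Assumes a constant oplog window and a constant lag growth rate.
// `earlier` and `later` are two lag samples: { atHours, lagHours }.
function hoursUntilStale(oplogWindowHours, earlier, later) {
  const growthPerHour =
    (later.lagHours - earlier.lagHours) / (later.atHours - earlier.atHours);
  if (growthPerHour <= 0) {
    return Infinity; // lag is flat or shrinking, so the member is not going stale
  }
  const headroomHours = oplogWindowHours - later.lagHours;
  return Math.max(0, headroomHours / growthPerHour);
}

// Hypothetical numbers: 48h oplog window, lag went from 6h to 7h over 2 hours.
// Growth is 0.5h per hour and headroom is 41h, so about 82 hours remain.
console.log(hoursUntilStale(48, { atHours: 0, lagHours: 6 },
                                { atHours: 2, lagHours: 7 }));
```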
On Thu, Dec 15, 2011 at 12:24 AM, progolferyo <sfa...@gmail.com> wrote:
> Scott,
>
> iostat:
>
> https://gist.github.com/b09c19e00a615b5696ca
>
> mongostat:
>
> https://gist.github.com/8efd20511d04aa185873
>
> rs.status()
>
> https://gist.github.com/180fc682e1f793d2632d
Yes
> Where is iostat from, and what devices are what in those stats? Where
> are the db files (which devices and what config)? (I assume md0 is a
> stripe of xvd*)
iostat is coming from the new db server. md0 is just a RAID0
stripe of xvd*. I have /dev/md0 mounted as:
/dev/md0 2.0T 378G 1.6T 19% /data3
and the db files are in /data3/mongo
free -ltm gives:
https://gist.github.com/c6270ee227967b9a5b26
the print commands give:
https://gist.github.com/b45d9fdbeb3f91df5bcf
I'm gonna try one and see if upgrading will do anything. Do you think
upgrading from 2.0 will have any added effect?
In general you are limited to a max of 2gbs from your instance to the
EBS system (all volumes). Depending on your instance type that could
effectively be much worse.
I suspect you are basically getting the worst write performance of the
worst EBS volume.
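
To make that concrete, here is a simplified model in plain JavaScript. It assumes a large striped write is split evenly across all RAID0 members and completes only when the slowest member finishes, so sustained throughput is roughly the member count times the slowest member, capped by the shared instance-to-EBS link. The function name and all numbers are hypothetical, not measurements from these servers.

```javascript
// Sketch: simplified RAID0 write-throughput model (an assumption, not a benchmark).
// volumeMBps: sustained write throughput of each member volume, in MB/s.
// linkCapMBps: shared instance-to-EBS bandwidth cap, in MB/s.
function stripeThroughputMBps(volumeMBps, linkCapMBps) {
  if (volumeMBps.length === 0) {
    throw new Error("need at least one volume");
  }
  // Every stripe waits for its slowest member.
  const slowest = Math.min(...volumeMBps);
  const striped = slowest * volumeMBps.length;
  // All volumes share one network link to EBS.
  return Math.min(striped, linkCapMBps);
}

// Hypothetical: four volumes, one of them having a bad day, 120 MB/s link cap.
// 4 * 9 = 36 MB/s, which is less than a single healthy 40 MB/s volume.
console.log(stripeThroughputMBps([40, 38, 9, 41], 120));
```

Under this model one slow volume is enough to make a four-volume stripe slower than a single healthy drive, which would be consistent with single-drive secondaries keeping up while the RAID0 ones do not.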
> I'm gonna try one and see if upgrading will do anything. Do you think
> upgrading from 2.0 will have any added effect?
No, probably not in this case, but it can't hurt.
On Dec 14, 6:26 pm, Scott Hernandez <scotthernan...@gmail.com> wrote:
Did you issue a db.shutdownServer() command from the mongo shell? What
does the log say?
Unfortunately with RAID0, it is falling more and more behind, it
doesn't look like this is going to work. It's weird, I have two other
secondary servers in this replica set with single drives that can
catch up just fine. It sounds like there may be some issues with EBS
and RAID0. Both servers on RAID0 just cannot keep up.
We have it on our to-do list to try RAID1+0 sometime soonish, after
all, if everyone says it's better, it must be, right?
On Dec 15, 1:16 am, progolferyo <sfan...@gmail.com> wrote:
> Yes, I am running with journaling. It did finally stop and I updated
> and then rebooted (and did not do a kill -9).
>
> On Dec 14, 6:57 pm, Scott Hernandez <scotthernan...@gmail.com> wrote:
>
> > Are you running with journaling? If not, don't kill -9 it.
>
> > Did you issue a db.shutdownServer() command from the mongo shell? What
> > does the log say?
>
> > On Thu, Dec 15, 2011 at 2:51 AM, progolferyo <sfan...@gmail.com> wrote:
> > > Argh, shutdown times out. Are you sure if I do a kill -9 on the
> > > instance, I won't have to start again from scratch? Or is there a
> > > better way to kill, update, start up?
>
> > > On Dec 14, 6:26 pm, Scott Hernandez <scotthernan...@gmail.com> wrote:
> > >> On Thu, Dec 15, 2011 at 2:19 AM, progolferyo <sfan...@gmail.com> wrote:
> > >> > Ok that's good. The issue is just that they are falling more and more
> > >> > behind. I'm wondering if my RAID0 setup with EBS on AWS is just not
> > >> >> >> >> > we expanded to new boxes (3) and set up RAID0 under 4 drives each. The
> > >> >> >> >> > new nodes were added to the primary and the initial sync began.