How to repair corrupted mongodb


Keith Laidlaw

Nov 15, 2018, 11:07:32 PM
to sipxcom-users
I have an 18.08 single-server system. A short while ago it had an unexpected, abrupt shutdown. I'm still mostly up and running, but I can't back up my config via the GUI; job status says the backup failed. Also, when I restart Supervisor (which sometimes shows as running and sometimes as shut down), job status says:

Could not read agent results. Write end dead null
and
Error. Error connecting to server (timeout). System error for connect: "Operation now in progress". Unable to connect to server 127.0.0.1. System reports error for connect: "Operation now in progress". No server is responding on this port null
 
Looking at messages, I see:

!! tchdbput: Could not write key to DB "/var/cfengine/state/cf_lock.tcdb": read error
 
Looking at /var/cfengine/state I see (among other files) these two files: cf_lock.tcdb and cf_lock.tcdb.corrupt.

I'm optimistic that I've found my problem.  I believe the abrupt shutdown happened while a lock was in effect and I now have a slightly corrupted system.

Obvious question: how can I repair this? This link documents the command "mongod --repair", but I hesitate to do such a low-level repair without checking with the experts first. If this command is my best option, should I stop sipxecs first? Are there any other precautions I should take? I do have a VM snapshot, since I can't actually back up the config/vmail.

I'm hopeful this is at the heart of the problem, but I'm open to any other advice. I was expecting to fix this and then see whether messages (and backup and Supervisor) come up fine. If they do, I'll test thoroughly and be happy. If they don't, I have one less error to fix and I will find the next one!

I'm also happy to post any other info, but I suspect it isn't yet necessary.

Thanks for any help

Niek Vlessert

Nov 16, 2018, 3:27:11 AM
to sipxcom-users
Hi,

MongoDB is the replacement for the earlier in-memory database system, which was not persisted to disk, so the source for almost all of the data in the mongo database is the postgres database. Probably not everything, though; other people in this group probably know the details.

If you do a send profiles from the server overview, the mongo database will be filled all over again. I'm not sure whether this fixes any database corruption. But before that, can you still access the mongo db through the shell? Run 'mongo', then 'show dbs', then 'use node' or 'use imdb', and then google for some commands to get the contents.

Regards,

Niek
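
For anyone following along, the shell check suggested above can also be run non-interactively. What follows is a minimal sketch, assuming the legacy `mongo` shell is installed and that the database names `node` and `imdb` mentioned above exist on the server. The function name `list_collections` and the `MONGO_BIN` variable are hypothetical, introduced here only for illustration.

```shell
#!/bin/sh
# Sketch only: print the collection names of one MongoDB database using
# the legacy mongo shell. MONGO_BIN is a hypothetical override for the
# shell binary; it defaults to "mongo" found on the PATH.
list_collections() {
    db_name=$1
    "${MONGO_BIN:-mongo}" "$db_name" --quiet --eval 'printjson(db.getCollectionNames())'
}

# Example calls (not executed here), using the database names from the post:
#   list_collections node
#   list_collections imdb
```

If the command returns a list of collections rather than a connection error, the database is at least reachable through the shell.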
 
Op vrijdag 16 november 2018 05:07:32 UTC+1 schreef Keith Laidlaw:

George Niculae

Nov 16, 2018, 3:31:44 AM
to niekvl...@gmail.com, sipxco...@googlegroups.com
The error you posted is related to cfengine, not MongoDB. Try service sipxsupervisor stop, then remove /var/cfengine/state/promise_compliance.tcdb and /var/cfengine/state/cf_lock.tcdb (make a copy of them first, just in case), then restart the supervisor and rerun sipxagent.

George
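
The steps George describes can be sketched as a small shell function. This is only a sketch under stated assumptions: the function name `reset_cfengine_state` and the backup directory are hypothetical, and the service commands are left as comments so that nothing is stopped or restarted by accident.

```shell
#!/bin/sh
# Sketch only: copy the cfengine state files named in the thread to a
# backup directory, then remove the originals. Files that do not exist
# are skipped. The function name and backup location are hypothetical.
reset_cfengine_state() {
    state_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    for name in cf_lock.tcdb promise_compliance.tcdb; do
        if [ -e "$state_dir/$name" ]; then
            # Remove the original only if the copy succeeded.
            cp -p "$state_dir/$name" "$backup_dir/" || return 1
            rm -f "$state_dir/$name" || return 1
        fi
    done
    return 0
}

# Intended order on the server (run as root; not executed here):
#   service sipxsupervisor stop
#   reset_cfengine_state /var/cfengine/state /root/cfengine-state-backup
#   service sipxsupervisor start
#   sipxagent
```

Keeping the copies means the original files can be put back if the supervisor does not come up cleanly afterwards.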


Keith Laidlaw

Nov 16, 2018, 11:59:11 AM
to sipxcom-users
You're awesome, George. I didn't see any promise_compliance.tcdb, and I also deleted cf_lock.tcdb.corrupt as that seemed to make sense. Worked perfectly first try and no downtime!

Many thanks