Corrupt database file

Chris Ashton

Mar 21, 2014, 7:46:12 AM
to us...@couchdb.apache.org
Hi

I am a novice with CouchDB and really struggling to support a product that someone developed before leaving the company. No one else has any skills in Couch so I'm trying to pick up the pieces.

Our database has stopped responding and we've tried compacting but to no avail.

We are getting errors like the following in the log file, which I presume mean that we have a corrupt db file for one reason or another:


[Fri, 21 Mar 2014 10:06:56 GMT] [error] [<0.213.0>] ** Generic server <0.213.0> terminating
** Last message in was {'EXIT',<0.217.0>,
                               {file_corruption,<<"file corruption">>}}

I was wondering if there was a utility which would scan DB files and remove badly formed parts or anything like that? We are running on Windows.

Many thanks

Chris

Tim Tisdall

Mar 21, 2014, 9:19:42 AM
to us...@couchdb.apache.org, Chris Ashton
I don't think there's any tool for fixing corrupted db files... What I'd
try doing is dumping all the content from the DB and reconstructing it.
You can fetch everything in the database by downloading
http://127.0.0.1:5984/my_database/_all_docs?include_docs=true (where the IP
and port are your server's, and 'my_database' is your DB). You'd then need
to write some sort of script to read that JSON document and then write the
values back into a _new_ database.
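
A minimal sketch of such a script, assuming Python 3 and that the dump was
already parsed into a dict. The function names and the batch size are
hypothetical. It strips `_rev` from each document so the new database
accepts it as a fresh document, and it does not handle attachments.

```python
import json
import urllib.request


def to_bulk_payloads(all_docs_response, batch_size=500):
    """Yield _bulk_docs request bodies built from an _all_docs response."""
    batch = []
    for row in all_docs_response.get("rows", []):
        doc = row.get("doc")
        if doc is None:
            # Row carries no document body; nothing to copy.
            continue
        doc = dict(doc)
        # Drop the old revision so the new database treats this as a new doc.
        doc.pop("_rev", None)
        batch.append(doc)
        if len(batch) >= batch_size:
            yield {"docs": batch}
            batch = []
    if batch:
        yield {"docs": batch}


def post_bulk(target_db_url, payload):
    """POST one payload to <target_db_url>/_bulk_docs and return the parsed reply."""
    req = urllib.request.Request(
        target_db_url.rstrip("/") + "/_bulk_docs",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Usage would be a loop such as
`for p in to_bulk_payloads(dump): post_bulk("http://127.0.0.1:5984/my_new_database", p)`,
where 'my_new_database' is a placeholder for a database created beforehand.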

Does anyone know if there's a way to do this same sort of thing with
replication? (I have to do it the other way because I don't have enough
space for 2 copies of my DB on my system)
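
For reference, CouchDB does expose a `_replicate` endpoint that copies one
database into another. A sketch of building that request in Python 3 follows;
the helper name and the target name are hypothetical, and whether replication
can get past a corrupted region of the source file is not something this
sketch can promise. The target may be a URL on another machine, which avoids
needing space for two copies locally.

```python
import json
import urllib.request


def build_replication_request(server_url, source, target, create_target=True):
    """Build a POST to <server_url>/_replicate; the caller decides when to send it."""
    body = {"source": source, "target": target, "create_target": create_target}
    return urllib.request.Request(
        server_url.rstrip("/") + "/_replicate",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending it would be `urllib.request.urlopen(build_replication_request(...))`
against a running server.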

Chris Ashton

Mar 21, 2014, 9:34:53 AM
to Tim Tisdall, us...@couchdb.apache.org
Hi Tim

Many thanks for this reply. I have tried this dump method but even that fails, complaining about the database file: "curl: (56) Problem (2) in the Chunked-Encoded data"

Our files are huge so everything takes ages.

The only other thought I have right now is to process this enormous text file, add whatever curly braces and the like are required to turn it back into valid JSON and then, as you suggest, rewrite it into the database in some way.
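
A rough sketch of that salvage step in Python 3, under one assumption that
should be checked against the actual dump: that the `_all_docs` output has one
row per line, each followed by a comma. Rather than patching braces by hand,
it keeps every line that parses as JSON on its own and skips the header, the
footer, and any cut-off line. The function name is hypothetical.

```python
import json


def salvage_rows(lines):
    """Yield each row that parses on its own; skip header, footer, damaged lines."""
    for line in lines:
        # Remove the line ending and the comma that separates rows.
        text = line.strip().rstrip(",")
        if not text.startswith("{"):
            continue
        try:
            row = json.loads(text)
        except ValueError:
            # Header line or a row cut off mid-way: not valid JSON by itself.
            continue
        if isinstance(row, dict) and "id" in row:
            yield row
```

Passing an open file object as `lines` processes the dump one line at a time,
so the whole file never has to fit in memory.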

I'm just surprised there is no utility to fix bad data sections, it's a real pity.


By the way, I probably should have mentioned, we appear to have couchbase single version 1.2.0

Thanks again

Chris

Knudsen, Ken

Mar 21, 2014, 9:49:34 AM
to us...@couchdb.apache.org, Chris Ashton, Tim Tisdall
Sorry that I can't help with your exact circumstance (I'm new as well), but is there any additional information you could provide about the use case and how this happened? My current reading and understanding of CouchDB is that something like this shouldn't happen at all, so I find this very interesting (in a sad way).

Thanks,

Ken
______________________________________________________________________
This email has been scanned by the Symantec Email Security.cloud service.
For more information please visit http://www.symanteccloud.com
______________________________________________________________________

Garren Smith

Mar 21, 2014, 9:53:26 AM
to us...@couchdb.apache.org, Chris Ashton, Tim Tisdall
Hi Chris,

If you have Couchbase, that is different to CouchDB. It's better to ask on the Couchbase mailing list. They might be able to help you there.

Cheers
Garren

Chris Ashton

Mar 21, 2014, 9:53:22 AM
to Knudsen, Ken, us...@couchdb.apache.org, Tim Tisdall
Thanks Ken

Unfortunately, I don't know much. We have a desktop application which uses CouchDB as its back end. Once the developer left, no-one knew how to run it and (of course) last week, it stopped responding.

My best guess as to how it happened is that the server ran out of disk space, but I can't confirm that: I didn't see the server immediately, and it had already been messed with by our hardware providers. Through ignorance we had never run any compaction, and the database grew very large indeed!

Cheers

Chris

Chris Ashton

Mar 21, 2014, 10:04:59 AM
to Garren Smith, us...@couchdb.apache.org, Tim Tisdall
Yikes, guess that shows just what a beginner I am with it!

Thanks for the tip

Chris

Jens Alfke

Mar 21, 2014, 10:12:37 AM
to us...@couchdb.apache.org, Chris Ashton, Tim Tisdall

On Mar 21, 2014, at 6:53 AM, Garren Smith <gar...@apache.org> wrote:

> If you have Couchbase, that is different to CouchDB. It's better to ask on the Couchbase mailing list. They might be able to help you there.

No, "Couchbase Single Server" was just a repackaged version of CouchDB. But Couchbase hasn’t offered or supported it since 2011. So this is the correct mailing list.

—Jens

Alexander Shorin

Mar 21, 2014, 11:43:25 AM
to us...@couchdb.apache.org
I'm not sure about that. Since this is a different product from a different
company, we don't have any knowledge about how it was repackaged, what
changes it includes, how it differs from the original CouchDB 1.2.0, and
so on and so forth. The Couchbase team should probably know more about
their products.

--
,,,^..^,,,

Jens Alfke

Mar 21, 2014, 3:46:33 PM
to us...@couchdb.apache.org

On Mar 21, 2014, at 8:43 AM, Alexander Shorin <kxe...@gmail.com> wrote:

> we don't have any knowledge about how it was repackaged, what
> changes it includes, how it differs from the original CouchDB 1.2.0, and
> so on and so forth.

Is that relevant? Nothing's come up in this thread that depends on exact details of CouchDB internals. Can we focus on the issue at hand, namely that the OP has a CouchDB that ran out of disk space and corrupted its database and he’d like to recover the data?

(IIRC, if there were any source changes from stock 1.2 they were minor, maybe just around branding. Maybe Jan, Dale, or Filipe remember more about it?)

> The Couchbase team should probably know more about their products.

Couchbase’s forums are not going to respond to support requests for a product that’s been discontinued for over two years, from someone who’s (presumably) not a paying customer.

—Jens

Tim Tisdall

Mar 21, 2014, 4:19:42 PM
to us...@couchdb.apache.org
If the DB was corrupted because the disk became full, shouldn't the DB
still be fine but just missing the most recent commits? Or would a
person need to truncate a certain number of bytes off the end of the DB to
get it to read properly?

As for JSON file size... I always dump the DB into a GZ file and then my
scripts work on it as a GZ'ed file. In my case the JSON is 20gb and the gz
file is 3.5gb. Dealing with the file as a gz adds a little more complexity
to the script you use to process it, though.
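
The extra complexity can be small in Python 3, since the standard `gzip`
module opens a compressed file as a text stream. A sketch, assuming the dump
has one row per line (worth checking on the real file) and using a
hypothetical function name:

```python
import gzip
import json


def iter_gz_docs(path):
    """Stream documents out of a gzipped _all_docs dump, one line at a time."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            # Remove the line ending and the comma that separates rows.
            text = line.strip().rstrip(",")
            try:
                row = json.loads(text)
            except ValueError:
                # Header, footer, or blank line: not a complete row.
                continue
            if isinstance(row, dict) and "doc" in row:
                yield row["doc"]
```

Because it is a generator, only one line is decompressed and parsed at a time.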

Robert Samuel Newson

Mar 21, 2014, 6:44:02 PM
to us...@couchdb.apache.org
"If the DB was corrupted because the disk became full, shouldn't the DB still be fine but just missing the most recent commits?"

Yes, that’s the virtue of the append-only nature of database files. That said, the file corruption error is raised when the md5 checksums fail to verify, so it’s hard to imagine it being a false positive.

Did the compaction attempt fail? Can it be replicated? If not, I would reluctantly truncate a few meg off the file and see if it can be opened (do this when couchdb is not running). The actual corrupted file would be useful to couchdb developers so that we could investigate the raw data at the corruption site.
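
A cautious way to try the truncation, sketched in Python 3: work on a copy so
the original corrupted file stays intact for investigation. The function name
and paths are hypothetical. Rounding down to a 4096-byte boundary is an
assumption added here (based on the understanding that CouchDB 1.x writes its
headers on 4 KiB block boundaries); it is not a requirement stated in this
thread.

```python
import os
import shutil

BLOCK = 4096  # assumed CouchDB block size; an assumption of this sketch


def truncated_copy(src_path, dst_path, bytes_to_drop):
    """Copy src to dst, then cut bytes_to_drop off the end of the copy only."""
    shutil.copyfile(src_path, dst_path)
    size = os.path.getsize(dst_path)
    new_size = max(0, size - bytes_to_drop)
    # Round down to a whole number of blocks.
    new_size -= new_size % BLOCK
    with open(dst_path, "r+b") as fh:
        fh.truncate(new_size)
    return new_size
```

For example, `truncated_copy("my_database.couch", "my_database_trimmed.couch", 4 * 1024 * 1024)`
drops roughly 4 MB from the copy; do this while CouchDB is not running.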

What was the disk system here? RAID? Filesystem? Would your disk controllers reorder writes at all?

B.