Database too large for partition

Matthias Eck

Jul 11, 2013, 2:56:25 PM
to us...@couchdb.apache.org
Hello,

I have a CouchDB database that is on a 1TB partition on Amazon EC2.
Unfortunately the database has now filled the whole partition, and 1TB is the
volume size limit on EC2, so I cannot simply copy it to a larger partition.

1. I would like to get the database running again. Are there any files that
I can safely delete to save some immediate space so CouchDB will run again?

2. Any suggestions to solve this problem longer-term? Is it possible to
have a database spanning multiple partitions?

Thanks,
Matthias

Robert Newson

Jul 11, 2013, 2:59:41 PM
to us...@couchdb.apache.org
Have you been compacting this database? That is, is it *really* a 1TB
database or is this just accumulated trash because you didn't compact?

B.

Tim Tisdall

Jul 11, 2013, 3:00:55 PM
to us...@couchdb.apache.org, matthi...@gmail.com
You could delete the generated view index files in
/var/lib/couchdb/.your_database_design/ and recover a little extra space.
There could also be a partially completed compaction file (ending in
.compact) left over in /var/lib/couchdb/.

The longer-term solution is to compact regularly... but you need space equal
to the content of the database to do a compaction.
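
A rough sketch of that cleanup, assuming a default Debian/Ubuntu layout
under /var/lib/couchdb and a database named "dbname" (both assumptions;
check the database_dir setting in your local.ini for the real location):

    # stop couchdb first so nothing is holding the files open
    sudo service couchdb stop

    # view index files for "dbname" live in a hidden .dbname_design directory
    sudo rm -rf /var/lib/couchdb/.dbname_design

    # a crashed or interrupted compaction leaves a .compact file behind
    sudo rm -f /var/lib/couchdb/dbname.couch.compact

    sudo service couchdb start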

Robert Newson

Jul 11, 2013, 3:13:41 PM
to us...@couchdb.apache.org
" but you need space equal to the content of the database to do a compaction."

That's not true. You need space equal to the "disk_size" value from
GET /dbname. The only time your statement would be true is if you
compacted a database that you had just compacted and had made no
changes to. Compacting it again will just write the whole thing out
again. Obviously this worst case is also the case where there's no
point compacting anyway.

B.

Robert Newson

Jul 11, 2013, 3:14:20 PM
to us...@couchdb.apache.org
Hit send too soon. I meant the 'data_size' field, of course.

b.

Matthias Eck

Jul 11, 2013, 3:20:22 PM
to us...@couchdb.apache.org
I did occasional compaction, but not very frequently.
The database file itself is about 500GB; the views take another 500GB.
If I delete the views, can I then run compaction without the views being
regenerated first?

Matthias

Robert Newson

Jul 11, 2013, 3:23:54 PM
to us...@couchdb.apache.org
Yes, you can delete the view files and then compact the database; just
don't query the views, or they'll be rebuilt. Were you also
compacting your views? It's a separate operation, one per design
document, and sometimes folks don't realize that.
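
For reference, the HTTP calls involved look like this (host, credentials,
and database/design document names are placeholders; compaction needs
server admin rights on a default setup):

    # compact the database itself
    curl -X POST -H 'Content-Type: application/json' \
        'http://admin:pass@127.0.0.1:5984/dbname/_compact'

    # compact the view index of one design document (repeat per design doc)
    curl -X POST -H 'Content-Type: application/json' \
        'http://admin:pass@127.0.0.1:5984/dbname/_compact/designname'

    # remove index files orphaned by deleted or changed design documents
    curl -X POST -H 'Content-Type: application/json' \
        'http://admin:pass@127.0.0.1:5984/dbname/_view_cleanup'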

The real answer here is to shard this huge database. BigCouch is the
current option and we're in the middle of integrating that into a
future CouchDB release.

B.

Robert Newson

Jul 11, 2013, 3:24:26 PM
to us...@couchdb.apache.org
And do check the 'data_size' property in GET /dbname; that will tell
you how big the db file will be after compaction.
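
Something like the following; the numbers are made up and the output is
abbreviated:

    curl 'http://127.0.0.1:5984/dbname'
    # {"db_name":"dbname", ..., "disk_size":1073127428096, "data_size":530242871296, ...}
    #
    # disk_size = bytes the file occupies on disk right now
    # data_size = live data only, i.e. roughly the file size after compaction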

Tim Tisdall

Jul 11, 2013, 3:38:16 PM
to us...@couchdb.apache.org
When I said "equal to the content" my intended meaning was the data size
and not the disk size of the database.

Robert Newson

Jul 11, 2013, 4:25:26 PM
to us...@couchdb.apache.org
ah, cool.

Keith Gable

Jul 11, 2013, 5:22:14 PM
to us...@couchdb.apache.org
Another, less desirable, option is software RAID or LVM. You could
create several 1TB EBS volumes, add them to a volume group with LVM,
and create a logical volume that spans those disks. There's no
redundancy, though, so if you lose one EBS volume you probably lose
the entire logical volume; software RAID would address that. I would
still go with BigCouch, of course, but it sounds like the OP needs a
right-now solution.
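
A rough sketch of the LVM route, assuming two freshly attached EBS
volumes at /dev/xvdf and /dev/xvdg (device names and volume count are
illustrative):

    # mark the volumes for LVM use
    sudo pvcreate /dev/xvdf /dev/xvdg

    # pool them into one volume group
    sudo vgcreate couch_vg /dev/xvdf /dev/xvdg

    # create a single logical volume spanning all free space in the group
    sudo lvcreate -l 100%FREE -n couch_lv couch_vg

    # make a filesystem and mount it where couchdb keeps its data
    sudo mkfs.ext4 /dev/couch_vg/couch_lv
    sudo mount /dev/couch_vg/couch_lv /var/lib/couchdb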

Jens Alfke

Jul 12, 2013, 3:03:15 PM
to us...@couchdb.apache.org, matthi...@gmail.com

On Jul 11, 2013, at 12:20 PM, Matthias Eck <matthi...@gmail.com> wrote:

> The database file itself is about 500GB; the views take another 500GB.

Is the database 500GB of pure JSON, or are there attachments? (The idea of 500GB of JSON boggles my mind, but then, I’m not a big-data guy.)

I ask because if a lot of that size is attachments, you can save space by storing those elsewhere, like in S3, and leaving just URLs or other IDs in the docs. This will also speed up compaction because the attachment data doesn’t have to be copied. The drawback is that you’ll have to manually delete attachments after their corresponding doc[s] are deleted or updated.
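
A hand-wavy sketch of that migration for a single attachment, assuming a
configured aws CLI and placeholder database, document, attachment, and
bucket names:

    # pull the attachment out of couchdb
    curl 'http://127.0.0.1:5984/dbname/docid/photo.jpg' -o photo.jpg

    # push it to s3
    aws s3 cp photo.jpg s3://my-bucket/docid/photo.jpg

    # remove the attachment from the doc (needs the doc's current _rev)
    curl -X DELETE 'http://127.0.0.1:5984/dbname/docid/photo.jpg?rev=CURRENT_REV'

    # finally, update the doc body to carry the s3 URL instead, e.g.
    # "photo_url": "https://my-bucket.s3.amazonaws.com/docid/photo.jpg"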

—Jens