Setting up smoosh for database compaction


Paul Milner

Aug 18, 2021, 2:58:39 AM
to us...@couchdb.apache.org
Hello

I'm looking at the maintenance of my databases and how I could implement
tools to do that. Smoosh seems to be the main option, but I'm struggling to
set it up as the documentation seems a bit limited.

I have only really found this:

5.1. Compaction — Apache CouchDB® 3.1 Documentation
<https://docs.couchdb.com/en/3.1.1/maintenance/compaction.html#database-compaction>

I could do it manually, but I wanted to explore this first and was wondering
if there are any smoosh examples around that could help me on my way.

If anyone could point me in the right direction please, I would appreciate
it.

Thanks a lot
Best regards
Paul

Adam Kocoloski

Aug 18, 2021, 2:01:16 PM
to us...@couchdb.apache.org
Hi Paul, sorry to hear you’re finding it a challenge to configure. The default configuration described in the documentation does give you an example of how things are set up:

https://docs.couchdb.org/en/3.1.1/maintenance/compaction.html#channel-configuration

Cross-referenced from that section you can find the full configuration reference that describes all the supported configuration keys at the channel level:

https://docs.couchdb.org/en/3.1.1/config/compaction.html#config-compactions

The general idea is that you create [smoosh.<channelname>] configuration blocks with whatever settings you deem appropriate to match a certain set of files and prioritize them, and then use the [smoosh] block to activate those channels.
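
As a minimal sketch of that layout (not taken from the docs), the Python below renders a local.ini fragment with one custom channel. The channel name `log_dbs` and its values are hypothetical, and the three default channel names are assumed to match the list shown in the linked docs, so check them against your version. The keys `priority`, `min_priority` and `concurrency` are channel-level settings from the configuration reference.

```python
import configparser
import io

# Assumed to be the default database channels in the 3.1 docs; verify locally.
DEFAULT_DB_CHANNELS = ("upgrade_dbs", "ratio_dbs", "slack_dbs")


def render_smoosh_config(channel, settings, active=DEFAULT_DB_CHANNELS):
    """Render a local.ini fragment that defines and activates one db channel."""
    cfg = configparser.ConfigParser()
    # The [smoosh] block activates channels by name.
    cfg["smoosh"] = {"db_channels": ",".join(list(active) + [channel])}
    # The [smoosh.<channelname>] block holds that channel's own settings.
    cfg["smoosh." + channel] = {key: str(value) for key, value in settings.items()}
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


# "log_dbs" and every value below are illustrative, not recommendations.
print(render_smoosh_config("log_dbs", {
    "priority": "ratio",   # rank files by file size relative to live data
    "min_priority": 1.5,   # ignore files whose ratio is below this
    "concurrency": 1,      # run one compaction at a time in this channel
}))
```

The printed text can be pasted into local.ini; the same keys can also be set at runtime through the node configuration API.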

Can you say a little more about what you’re finding lacking in the docs? Cheers,

Adam

Paul Milner

Aug 19, 2021, 12:29:39 AM
to us...@couchdb.apache.org
Hi Adam

Thanks for the feedback. I was actually struggling with which options to set per channel and what to set them to. Anyway, after more thought, I’ve decided on a manual approach, as I need something more customized than the automatic system.

But thanks again
I appreciate it.

Best regards
Paul


Robert Newson

Aug 19, 2021, 4:24:12 AM
to user
Hi Paul,

We welcome feedback on why the automatic compaction system (in its default configuration or custom) is not appropriate for you.

B.

Paul Milner

Aug 19, 2021, 6:32:50 AM
to us...@couchdb.apache.org
Hi B (?? ;-) )

I have a log database that could encounter high frequency updates and
deletes. It's not required to be read by multiple users, but will be
updated by all users. So rather than compacting it, which at certain
frequencies of updates could lead to possible race conditions (thinking of
extremes), I was going to do the following steps:

1) Switch the active log to a new database
2) Copy the old database without orphans/history to the new database
3) Delete the old database

I would toggle databases as needed.
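
The three steps above can be sketched as a list of HTTP requests. This is only an outline under stated assumptions: the base URL, the database names `log_a` and `log_b`, and the helper `rotation_plan` are hypothetical; authentication is left out; and using a one-off replication with a selector that skips deleted documents is one possible reading of "without orphans/history", not necessarily the method intended here. Nothing is sent over the network.

```python
import json
from urllib.parse import quote


def rotation_plan(old_db, new_db, base_url="http://127.0.0.1:5984"):
    """Return (method, url, json_body) tuples for one log rotation.

    The caller sends them in order with an HTTP client of their choice.
    """
    old_url = base_url + "/" + quote(old_db, safe="")
    new_url = base_url + "/" + quote(new_db, safe="")
    return [
        # 1) Create the new database; the application then writes its log there.
        ("PUT", new_url, None),
        # 2) One-off replication of the old database into the new one.
        #    The selector leaves deletion tombstones behind (assumption: this
        #    is what "without orphans/history" should mean in practice).
        ("POST", base_url + "/_replicate", {
            "source": old_url,
            "target": new_url,
            "selector": {"_deleted": {"$exists": False}},
        }),
        # 3) Delete the old database to reclaim its disk space.
        ("DELETE", old_url, None),
    ]


for method, url, body in rotation_plan("log_a", "log_b"):
    print(method, url, json.dumps(body))
```

Step 3 should only run once the replication in step 2 has reported success, otherwise documents that were not yet copied are lost.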

Best regards
Paul

Robert Newson

Aug 19, 2021, 9:19:36 AM
to user
Hi Paul,


I think that’s reasonable, though do note that compaction also matters for performance even if you never update or delete a document, because CouchDB defers rebalancing its on-disk b+tree structures until compaction (i.e., CouchDB isn’t adhering to the b+tree algorithm from the literature).

Left uncompacted, lookup/insert performance will drop from roughly O(log n) to O(n) over time (though only as a consequence of writing documents).

None of what I’ve said will apply in CouchDB 4.0 (compaction is no longer required there).

B (short for Bob)

Paul Milner

Aug 19, 2021, 9:40:52 AM
to us...@couchdb.apache.org
Hi Bob,

Ok thanks, interesting. Can you tell me when 4.0 is planned to be released
please?

Thanks
Paul

Kyle Snavely

Aug 19, 2021, 12:02:55 PM
to us...@couchdb.apache.org
4.0 is still in development today.

If you tweak compaction settings and have very large shards, do take care
to leave some disk space headroom to allow the compaction process to take
place. Basically, don't run your disks at 90% full in production unless you
have experience with that. ;)
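
A rough way to check that headroom is sketched below. It assumes the worst case, in which compaction reclaims nothing and the new file written next to the original grows to the same size. The function names, the `safety_factor` parameter and the idea of pointing it at a shard file path are all illustrative.

```python
import os
import shutil


def has_compaction_headroom(file_bytes, free_bytes, safety_factor=1.0):
    """True if free space covers a worst-case compacted copy of the file."""
    return free_bytes >= file_bytes * safety_factor


def check_shard(path, safety_factor=1.0):
    """Compare one shard file's size with the free space on its volume."""
    file_bytes = os.path.getsize(path)
    directory = os.path.dirname(os.path.abspath(path))
    free_bytes = shutil.disk_usage(directory).free
    return has_compaction_headroom(file_bytes, free_bytes, safety_factor)
```

Calling `check_shard` on the largest shard file before enabling a more aggressive channel gives a quick yes/no; a `safety_factor` above 1.0 leaves room for writes that arrive while compaction runs.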

Kyle