The official documentation has moved to http://docs.couchdb.org. The transition is not yet complete, but http://docs.couchdb.org should be treated as the most current source. In some cases, the wiki still has additional or older information on certain CouchDB topics.

Compaction

See also the official documentation for the database and design document compaction topics.

Database Compaction

Compaction compresses the database file by removing unused sections created during updates. Old revisions of documents are also removed from the database, though a small amount of metadata is kept for handling conflicts during replication. The number of revisions tracked (1000 by default) can be configured using the _revs_limit URL endpoint, available since version 0.8-incubating.

Compaction is triggered manually per database unless automatic compaction (see Automatic Compaction below) is configured. Support for queued compaction of multiple databases is planned. Compaction runs as a background task.

Example

Compaction is triggered by an HTTP POST request to the _compact sub-resource of your database. On success, HTTP status 202 is returned immediately. Although the request body is not used, you must still specify "application/json" as the Content-Type of the request.

curl -H "Content-Type: application/json" -X POST http://localhost:5984/my_db/_compact
#=> {"ok":true}
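
The same request can be issued from a script. The following is a minimal Python sketch using only the standard library. The helper names build_compact_request and trigger_compaction are hypothetical, and the host and database name are the ones from the curl example above. Building the request needs no server; sending it assumes a CouchDB instance is listening at the given URL.

```python
import urllib.request


def build_compact_request(base_url, db_name):
    """Build (but do not send) the POST request that triggers database compaction."""
    url = f"{base_url.rstrip('/')}/{db_name}/_compact"
    # The body is unused, but a JSON Content-Type header is still required.
    return urllib.request.Request(
        url,
        data=b"",
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def trigger_compaction(base_url, db_name):
    """Send the request and return the HTTP status (202 when compaction was accepted)."""
    with urllib.request.urlopen(build_compact_request(base_url, db_name)) as resp:
        return resp.status
```

Calling trigger_compaction("http://localhost:5984", "my_db") corresponds to the curl command above. Because compaction runs in the background, a 202 status only means the request was accepted, not that compaction has finished.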

A GET request to your database's base URL (see HTTP_database_API#Database_Information) returns a hash of status values that looks like this:

curl -X GET http://localhost:5984/my_db
#=> {"db_name":"my_db", "doc_count":1, "doc_del_count":1, "update_seq":4, "purge_seq":0, "compact_running":false, "disk_size":12377, "instance_start_time":"1267612389906234", "disk_format_version":5}

The compact_running key will be set to true during compaction.
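
A script can read this flag to find out whether compaction is still in progress. The sketch below parses the sample response shown above; the function name is_compacting is hypothetical, and treating a missing key as "not compacting" is an assumption made for illustration.

```python
import json


def is_compacting(db_info_json):
    """Return True if the database information reports a running compaction."""
    info = json.loads(db_info_json)
    # Assumption: a missing compact_running key is treated as "not compacting".
    return bool(info.get("compact_running", False))


# Sample response copied from the GET example above.
sample = (
    '{"db_name":"my_db", "doc_count":1, "doc_del_count":1, "update_seq":4, '
    '"purge_seq":0, "compact_running":false, "disk_size":12377, '
    '"instance_start_time":"1267612389906234", "disk_format_version":5}'
)
print(is_compacting(sample))  # False
```

A client that needs to wait for compaction to finish could repeat the GET request at intervals until this function returns False.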

Compaction of write-heavy databases

It is not a good idea to attempt compaction on a database node that is near full capacity for its write load. If the writes never let up, the compaction process may never catch up with them, and the node will eventually run out of disk space.

Compaction should be attempted when the write load is below full capacity. Read load, however, does not affect its ability to complete. To minimize the impact on clients, the database remains online and fully functional for readers and writers during compaction. It is a design limitation that database compaction cannot complete while the node is at capacity for write load, so it may be reasonable to schedule compactions during off-peak hours.

In a clustered environment, the write load can be switched off for any node before compaction, and the node can be brought back up to date through replication once compaction is complete.

In the future, CouchDB may be changed so that a single node stops or fails other updates when the write load is too heavy for compaction to complete in a reasonable time.

View compaction

Like databases, views need compaction. View compaction was introduced in CouchDB 0.10.0:

curl -H "Content-Type: application/json" -X POST http://localhost:5984/dbname/_compact/designname
#=> {"ok":true}

This compacts the view index from the current version of the design document. The HTTP response code is 202 Accepted (like compaction for databases) and a compaction background task will be created. Information on running compactions can be fetched with HTTP_view_API#Getting_Information_about_Design_Documents_(and_their_Views).

View index files on disk are named after the MD5 hash of the view definition. When you change a view, the old index files remain on disk. To remove all outdated view indexes (files named after the MD5 hash of view definitions that no longer exist), you can trigger a view cleanup:

curl -H "Content-Type: application/json" -X POST http://localhost:5984/dbname/_view_cleanup
#=> {"ok":true}
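
When a database has several design documents, each one needs its own compaction request. As a rough sketch, the following Python helper builds the list of URLs to POST to for a database: one _compact URL per design document, followed by the _view_cleanup URL. The function name view_maintenance_urls is hypothetical, and the order of the URLs is chosen only for illustration.

```python
from urllib.parse import quote


def view_maintenance_urls(base_url, db_name, design_names):
    """Return the URLs to POST to: one per design document, then the cleanup URL."""
    root = f"{base_url.rstrip('/')}/{quote(db_name, safe='')}"
    # One view compaction request per design document name.
    urls = [f"{root}/_compact/{quote(name, safe='')}" for name in design_names]
    # A single cleanup request removes index files of views that no longer exist.
    urls.append(f"{root}/_view_cleanup")
    return urls


for url in view_maintenance_urls("http://localhost:5984", "dbname", ["designname"]):
    print(url)
```

Each URL is then requested with POST and the "application/json" Content-Type header, exactly as in the curl examples above.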

Automatic Compaction

Since CouchDB 1.2, automatic compaction can be configured so that compaction of databases and views is triggered automatically based on various criteria. Automatic compaction is configured in CouchDB's configuration files. The compaction daemon is responsible for triggering the compaction. It is started automatically but is disabled by default:

[daemons]
#...
compaction_daemon={couch_compaction_daemon, start_link, []}

[compaction_daemon]
; The delay, in seconds, between each check for which database and view indexes
; need to be compacted.
check_interval = 300
; If a database or view index file is smaller than this value (in bytes),
; compaction will not happen. Very small files always have very high
; fragmentation, so it is not worth compacting them.
min_file_size = 131072

The criteria for triggering compactions are configured in the "compactions" section.

[compactions]
; List of compaction rules for the compaction daemon.
; The daemon compacts databases and their respective view groups when all the
; condition parameters are satisfied. Configuration can be per database or
; global, and it has the following format:
;
; database_name = [ {ParamName, ParamValue}, {ParamName, ParamValue}, ... ]
; _default = [ {ParamName, ParamValue}, {ParamName, ParamValue}, ... ]

Possible Parameters

Before a compaction is triggered, an estimation of how much free disk space is needed is computed. This estimation corresponds to 2 times the data size of the database or view index. When there's not enough free disk space to compact a particular database or view index, a warning message is logged.
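
The estimate described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the daemon's actual code: the function names are hypothetical, and treating free space exactly equal to the estimate as sufficient is an assumption.

```python
import shutil


def required_free_bytes(data_size):
    """Estimate from the text: twice the data size of the database or view index."""
    return 2 * data_size


def enough_space_to_compact(data_size, free_bytes):
    """Assumption: free space equal to the estimate counts as enough."""
    return free_bytes >= required_free_bytes(data_size)


def free_bytes_at(path):
    """Free space on the volume holding path, for example the database directory."""
    return shutil.disk_usage(path).free
```

For example, a database with 500 MB of data would need roughly 1 GB of free disk space before compaction is attempted.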

Examples

  1. [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}]
    A database governed by this rule (foo in this example) is compacted if its fragmentation is 70% or more. Any view index of this database is compacted only if its fragmentation is 60% or more.

  2. [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "00:00"}, {to, "04:00"}]
    Similar to the preceding example but a compaction (database or view index) is only triggered if the current time is between midnight and 4 AM.

  3. [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "00:00"}, {to, "04:00"}, {strict_window, true}]
    Similar to the preceding example, but if the database or one of its views is still compacting at 4 AM, the compaction process is canceled.

  4. [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "00:00"}, {to, "04:00"}, {strict_window, true}, {parallel_view_compaction, true}]
    Similar to the preceding example, but a database and its views can be compacted in parallel.
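
To make the fragmentation thresholds in these rules concrete, here is a hedged Python sketch of how such a check could work. It assumes fragmentation is the share of the file not occupied by live data, that is (file_size - data_size) / file_size * 100; that formula is not stated on this page and should be checked against the official configuration documentation. All function names are hypothetical, and the min_file_size default is the value from the [compaction_daemon] section above.

```python
def fragmentation_percent(file_size, data_size):
    """Share of the file not occupied by live data, in percent (assumed formula)."""
    if file_size <= 0:
        return 0.0
    # Multiply before dividing to keep round values such as 70% exact.
    return (file_size - data_size) * 100 / file_size


def parse_threshold(value):
    """Turn a rule value such as "70%" into the number 70.0."""
    return float(value.strip().rstrip("%"))


def should_compact(file_size, data_size, threshold, min_file_size=131072):
    """Apply one fragmentation rule to a database or view index file."""
    # Files smaller than min_file_size are never compacted.
    if file_size < min_file_size:
        return False
    return fragmentation_percent(file_size, data_size) >= parse_threshold(threshold)
```

With the rule {db_fragmentation, "70%"}, a 1,000,000-byte file holding 250,000 bytes of live data is 75% fragmented and would qualify, while the same file with 500,000 bytes of live data would not.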

Default Configuration

The default configuration, if enabled, applies to all databases. For example:

_default = [{db_fragmentation, "70%"}, {view_fragmentation, "60%"}, {from, "23:00"}, {to, "04:00"}]
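
Note that the window in this example runs from 23:00 to 04:00 and therefore crosses midnight. The following Python sketch shows one way to test whether a given time falls inside such a window. The function names are hypothetical, and treating both endpoints as inclusive is an assumption rather than documented daemon behavior.

```python
from datetime import time


def parse_hhmm(value):
    """Parse a rule value such as "23:00" into a datetime.time."""
    hours, minutes = value.split(":")
    return time(int(hours), int(minutes))


def in_window(now, start, end):
    """True if now lies within [start, end]; the window may cross midnight."""
    if start <= end:
        return start <= now <= end
    # The window wraps past midnight, for example 23:00 to 04:00.
    return now >= start or now <= end


start, end = parse_hhmm("23:00"), parse_hhmm("04:00")
print(in_window(time(23, 30), start, end))  # True
print(in_window(time(12, 0), start, end))   # False
```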

Compaction (last edited 2013-06-05 22:19:47 by NathanVanderWilt)