The official documentation has moved to http://docs.couchdb.org. The transition is not 100% complete, but http://docs.couchdb.org should be treated as having the latest information. In some cases, the wiki still has additional or older information on certain CouchDB topics.


Frequently Unasked Questions

On IRC and the Mailing List, these are the Questions People should have asked to help them stay Relaxed.

Documents

  1. Why should I generate my own UUIDs?
    • While CouchDB will generate a unique identifier for the _id field of any doc that you create, there are three reasons why you are in most cases better off generating them yourself.
    • If for any reason you miss the 201 Created reply from CouchDB and the request is attempted again, you end up with the same document content stored twice under two different _ids. This can easily happen with intermediary proxies and cache systems, which may retry a failed request without informing the developer.
    • _ids are the only value that CouchDB enforces as unique, so you might as well make use of this.
    • CouchDB stores its documents in a B+ tree. Each new or updated document is stored as a leaf node, and may require rewriting intermediary and parent nodes. If you can arrange for your own ids to be sequential, they may update the tree more efficiently than the automatically generated ids.
  2. What is the benefit of using the _bulk_docs API instead of PUTting single documents to CouchDB?
    • Aside from the HTTP overhead and round trips you save, the main advantage is that CouchDB can handle the B+ tree updates more efficiently. Fewer intermediary and parent nodes are rewritten, which both improves speed and saves disk space.
  3. Why can't I use MVCC in CouchDB as a revision control system for my docs?
    • The revisions CouchDB stores for each document are removed when the database is compacted. The database may be compacted at any time by a DB admin to save hard drive space. If you were using those revisions for document versioning, you'd lose them all upon compaction. In addition, your disk usage would grow with every document iteration and (if you prevented database compaction) you'd have no way to recover the used disk space.
  4. Does compaction remove deleted documents’ contents?
    • We keep the latest revision of every document ever seen, even if that revision has '"_deleted":true' in it. This is so that replication can ensure eventual consistency between replicas. Not only will all replicas agree on which documents are present and which are not, but also on the contents of both.
    • Deleted documents specifically allow for a body to be set in the deleted revision. The intention is to allow "who deleted this" type metadata for the doc. Some client libraries delete docs by grabbing the current object blob, adding a '"_deleted":true' member, and sending it back, which inadvertently (in most cases) keeps the last doc body around after compaction.
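
The answers above (client-chosen ids, _bulk_docs, and small deletion stubs) can be sketched roughly as follows. This Node.js snippet only builds request descriptions and sends nothing over the network. The helper names, the 32-character hex id scheme, and the deleted_by field are assumptions made for illustration, not part of CouchDB.

```javascript
const crypto = require('crypto');

// Hypothetical helper: pick the _id on the client, so that a retried
// request targets the same document instead of creating a second one.
function newDoc(body) {
  return { ...body, _id: crypto.randomBytes(16).toString('hex') };
}

// PUT /{db}/{id}: repeating this request with the same _id cannot create
// a duplicate; the repeat is rejected as a conflict if the first attempt
// was actually stored.
function putRequest(db, doc) {
  return {
    method: 'PUT',
    path: `/${db}/${encodeURIComponent(doc._id)}`,
    body: JSON.stringify(doc),
  };
}

// POST /{db}/_bulk_docs takes a body of the form {"docs": [...]} and
// writes many documents in a single request.
function bulkRequest(db, docs) {
  return {
    method: 'POST',
    path: `/${db}/_bulk_docs`,
    body: JSON.stringify({ docs }),
  };
}

// Minimal deletion stub: only _id, _rev and _deleted, plus optional
// metadata, so the previous body is not carried into the deleted revision.
function tombstone(doc, meta = {}) {
  return { ...meta, _id: doc._id, _rev: doc._rev, _deleted: true };
}
```

For example, tombstone(doc, { deleted_by: 'bob' }) yields a deleted revision that records who removed the document without keeping its old contents.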

Replication

  1. What is the difference between PULL and PUSH replication?
  2. Why do I need to permit deleted docs in validation functions?
  3. How do compaction and purging impact replication?

Views

  1. In a view, why should I not emit(key,doc) ?

    • The key point here is that by emitting the doc as the value, you duplicate a document that is already present in the database (a .couch file) by also including it in the results of the view (a .view file, with a similar structure). This is the same as having a SQL index that includes the original table, instead of using a foreign key.

      The same effect can be achieved by using emit(key,null) and ?include_docs=true with the view request. This approach has the benefit of not duplicating the document data in the view index, which reduces the disk space consumed by the view. On the other hand, the file access pattern is slightly more expensive for CouchDB. Including the document in the view is usually a premature optimization. If you think you may need to emit the document, it is best to test.

  2. What happens if I don't ducktype the variables I am using in my view?
  3. Does it matter if my map function is complex, or takes a long time to run?
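
For the first question in this list, a minimal sketch of the emit(key,null) approach might look like the following. The doc.type field and the database, design document, and view names are placeholders chosen for illustration. The emit() defined here is only a stand-in so the snippet runs outside CouchDB's view server.

```javascript
// Map function as it could appear in a design document.
function map(doc) {
  // Skip documents that lack the (hypothetical) type field.
  if (doc.type) {
    // null value: the document body is not copied into the view index.
    emit(doc.type, null);
  }
}

// Stand-in for CouchDB's emit(), collecting rows in memory.
const rows = [];
function emit(key, value) {
  rows.push({ key, value });
}

// Query path that asks CouchDB to attach each row's document at read time.
function viewPath(db, ddoc, view) {
  return `/${db}/_design/${ddoc}/_view/${view}?include_docs=true`;
}
```

The index then holds only keys, and the documents are fetched from the database file when the view is queried.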

Tools

  1. I decided to roll my own CouchApp tool or CouchDB client in <myfavouritelanguage>. How cool is that?

    • Pretty cool! In fact, it's a great way to get familiar with the API. However, wrappers around the HTTP API are not necessarily of great use, as CouchDB already makes this very easy. Mapping CouchDB semantics onto your language's native data structures is much more useful to people. Many languages are already covered, and we'd really like to see your ideas and enhancements incorporated into the existing tools if possible, along with help keeping them up to date. Ask on the mailing list about contributing!

Log Files

  1. Those Erlang messages in the log are pretty confusing. What gives?
    • While the Erlang messages in the log can be confusing to someone unfamiliar with Erlang, with practice they become very helpful. The CouchDB developers do try to catch and log messages that might be useful to a system administrator in a friendly format, but occasionally a bug or otherwise unexpected behavior manifests itself in more verbose dumps of Erlang server state. These messages can be very useful to CouchDB developers. If you find many confusing messages in your log, feel free to inquire about them. If they are expected, devs can work to ensure that the message is more cleanly formatted. Otherwise, the messages may indicate a bug in the code.

      In many cases, the message itself is enough to identify the problem. For example, OS errors are reported as tagged tuples such as {error,enospc} or {error,eacces}, which mean, respectively, "You ran out of disk space" and "CouchDB doesn't have permission to access that resource". Most of these errors derive from the C code used to build the Erlang VM and are documented in errno.h and related header files. IBM provides a good introduction to these, and the relevant POSIX, GNU, and Microsoft Windows standards cover most cases.
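
As a rough illustration of reading these tuples, the sketch below scans a piece of text for {error,Atom} patterns and looks up a plain-language meaning. The function name and the lookup table are assumptions made for this example. Only enospc and eacces come from the discussion above, and enoent is added as one more common POSIX code.

```javascript
// Meanings for a few POSIX error atoms as they appear in Erlang tuples.
const POSIX_ERRORS = new Map([
  ['enospc', 'no space left on device'],
  ['eacces', 'permission denied'],
  ['enoent', 'no such file or directory'],
]);

// Find every {error,Atom} tuple in the given text and attach a
// human-readable meaning where one is known.
function explainErrors(logText) {
  const found = [];
  const pattern = /\{error,\s*([a-z_0-9]+)\}/g;
  let match;
  while ((match = pattern.exec(logText)) !== null) {
    const atom = match[1];
    const meaning = POSIX_ERRORS.get(atom) || 'unknown, check errno.h';
    found.push({ atom, meaning });
  }
  return found;
}
```

Calling explainErrors on a log excerpt returns one entry per tuple found, which can be a quick first pass before asking on the mailing list.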

FUQ (last edited 2013-01-21 17:18:04 by DaveCottlehuber)