A simple example of a CouchDB application schema.

The way CouchDB works is still a bit cryptic for many of us coming from the RDBMS world. So the idea of this page is to write the schema of a typical application the RDBMS way and rewrite it the Document oriented way. I was thinking about a simple blog application:

RDBMS Schema

Relevant Threads

This part needs to be expanded by people who better understand the subject.

This thread describes how to manage comments:

These threads describe how to manage tags:

CouchDB Schema

These examples use the database blog.

Data Storage

Users

_id: autogenerated_user_id1
type: "user"
name: "J. D. Citizen"
password: "qwerty"

Articles

_id: autogenerated_article_id1
user_id: autogenerated_user_id1
type: "article"
title: "RDBMSs SUCK OMG"
body: "I thought what I'd do was, I'd pretend I was one of those deaf-
mutes."
created: "Fri Oct 12 04:46:58 +1000 2007"
comments: comment_id1, comment_id2, etc
tags: [ "rdbms", "things that suck" ]

Comments

_id: autogenerated_comment_id1
article_id: autogenerated_article_id1
user_id: autogenerated_user_id2
type: "comment"
title: "i liek ur blog"
body: "i liek ur blog a lot LOL. please ad me on my space. clik hear for hot girl action LOL"
created: "Fri Oct 12 04:46:59 +1000 2007"

Bear in mind that due to the flat nature of the document storage that there are no "tables" to do a SELECT from. Most of the real power of CouchDB is in it's views.

Views

Views are be stored into design documents for grouping and giving a sense of structure. Design documents are regular documents in CouchDB, it knows they are design documents because you preface the document's _id attr with _design/.

Let's add a group of views for our tags. Everybody wants to be able to fetch all documents tagged rdbms so lets add a view that will get us there.

_design/tags

"views": {
   "to_docs": "
      function(doc) {
        for( var i=0; i < doc.tags.length; i++) {
          emit(doc.tags[i], null);
        }
      }"
}

This view would be accessed by going to /blog/_design/tags/to_docs. The returned content would look like:

{"view":"_design/tags/to_docs_inc","total_rows":9, "offset":0,
 "rows":[
  {"_id":"autogenerated_article_id1","_rev":"684343246","key":"things that suck"},
  {"_id":"autogenerated_article_id1","_rev":"684343246","key":"rdbms"}

Note, 2 rows returned for the same article. This is intended. CouchDB will let us filter the data in our views a bit with extra query parameters. Going to /blog/_design/tags/to_docs?key="rdbms" yields only documents tagged rdbms.

Say we're returning a lot of records here and want to reduce them by including the title (or if no title exists, the name) attribute of the thing that was tagged along with the results. Change the following line in the to_docs query above:

// emit(doc.tags[i], null);
// becomes
emit(doc.tags[i], doc.title || doc.name);

Optionally, this view could be named to_docs_with_title so as to be available only if the extra data is desired.

Using the sample view given above many other views would likely be implemented. I give only a few sample URLs as follows, implementation is left as an exercise:

  • /blog/_design/comments/on_article?key="autogenerated_article_id1"
  • /blog/_design/comments/by_user?key="autogenerated_user_id1"
  • /blog/_design/articles/by_user?key="autogenerated_user_id1"
  • /blog/_design/articles/by_date?startkey="date1"&endkey="date2"

I'm not sure this last one is possible, how does CouchDB handle date indexing?