
An introduction to the CouchDB HTTP document API.

Naming/Addressing

Documents stored in a CouchDB database have a DocID. DocIDs are case-sensitive string identifiers that uniquely identify a document. Two documents cannot have the same identifier in the same database; if they do, they are considered the same document.

http://localhost:5984/test/some_doc_id
http://localhost:5984/test/another_doc_id
http://localhost:5984/test/BA1F48C5418E4E68E5183D5BD1F06476

The above URLs point to some_doc_id, another_doc_id and BA1F48C5418E4E68E5183D5BD1F06476 in the database test.

Valid Document Ids

  • Q: What's the rule on a valid document id? The examples suggest it's restricted to [a-zA-Z0-9_]? What about multi-byte UTF-8 characters? Any other non alphanums other than _? A: There is no restriction yet on document ids at the database level. However, I haven't tested what happens when you try to use multibyte in the URL. It could be it "just works", but most likely there is a multi-byte char escaping/encoding/decoding step that needs to be done somewhere. For now, I'd just stick with valid URI characters and nothing "special". The reason database names have strict restrictions is to simplify database name-to-file mapping. Since databases will need to replicate across operating systems, the file naming scheme needed to be the lowest common denominator.
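As a minimal sketch of the "escaping/encoding step" mentioned above, the following Python helper percent-encodes a DocID as UTF-8 before placing it in a URL. The function name doc_url is hypothetical, and whether the server decodes such IDs correctly is exactly the untested part noted in the answer.

```python
from urllib.parse import quote


def doc_url(base_url, db, doc_id):
    """Build a document URL, percent-encoding the DocID as UTF-8."""
    # safe="" also escapes "/" so the DocID stays a single path segment.
    return f"{base_url.rstrip('/')}/{db}/{quote(doc_id, safe='')}"


# Plain IDs pass through unchanged; multi-byte characters are escaped.
print(doc_url("http://localhost:5984", "test", "some_doc_id"))
print(doc_url("http://localhost:5984", "test", "caf\u00e9 notes"))
```

Sticking to IDs that survive this function unchanged is one way to follow the "valid URI characters only" advice.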

JSON

A CouchDB document is simply a JSON object (along with metadata revision info if ?full=true is in the URL query arguments).

This is an example document:

{
 "_id":"discussion_tables",
 "_rev":"D1C946B7",
 "Subject":"I like Plankton",
 "Author":"Rusty",
 "PostedDate":"2006-08-15T17:30:12-04:00",
 "Tags":["plankton", "baseball", "decisions"],
 "Body":"I decided today that I don't like baseball. I like plankton."
}

The document can be an arbitrary JSON object, but note that any top-level fields with a name that starts with a _ prefix are reserved for use by CouchDB itself. Common examples for such fields are _id and _rev, as shown above.

Another example:

{
 "_id":"discussion_tables",
 "_rev":"D1C946B7",
 "Sunrise":true,
 "Sunset":false,
 "FullHours":[1,2,3,4,5,6,7,8,9,10],
 "Activities": [
   {"Name":"Football", "Duration":2, "DurationUnit":"Hours"},
   {"Name":"Breakfast", "Duration":40, "DurationUnit":"Minutes", "Attendees":["Jan", "Damien", "Laura", "Gwendolyn", "Roseanna"]}
 ]
}

Note that by default the structure is flat; in this case, the Activities attribute is a structure imposed by the user.
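The reserved-prefix rule can be illustrated with a small Python sketch that separates CouchDB's own top-level fields (those starting with _) from the user's fields. The helper name split_reserved is hypothetical and only inspects top-level keys, as the rule above describes.

```python
def split_reserved(doc):
    """Return (reserved, user) dicts based on the top-level "_" prefix."""
    reserved = {k: v for k, v in doc.items() if k.startswith("_")}
    user = {k: v for k, v in doc.items() if not k.startswith("_")}
    return reserved, user


doc = {
    "_id": "discussion_tables",
    "_rev": "D1C946B7",
    "Subject": "I like Plankton",
    "Tags": ["plankton", "baseball", "decisions"],
}
reserved, user = split_reserved(doc)
print(sorted(reserved))  # names reserved for CouchDB
print(sorted(user))      # names owned by the application
```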

All Documents

To get a listing of all documents in a database, use the special _all_docs URI:

GET /somedatabase/_all_docs HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

Will return a listing of all documents and their revision IDs, ordered by DocID (case sensitive):

HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 0, "rows": [
    {"id": "doc1", "key": "doc1", "value": {"rev": "4324BB"}},
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}},
    {"id": "doc3", "key": "doc3", "value": {"rev":"74EC24"}}
  ]
}
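A client will usually want this listing as a mapping from DocID to revision. The sketch below assumes the row layout shown in the response above (a "rev" entry inside "value"); the function name revisions_by_id is hypothetical.

```python
import json


def revisions_by_id(body):
    """Map each DocID in an _all_docs response body to its revision ID."""
    listing = json.loads(body)
    return {row["id"]: row["value"]["rev"] for row in listing["rows"]}


body = """
{
  "total_rows": 3, "offset": 0, "rows": [
    {"id": "doc1", "key": "doc1", "value": {"rev": "4324BB"}},
    {"id": "doc2", "key": "doc2", "value": {"rev": "2441HF"}},
    {"id": "doc3", "key": "doc3", "value": {"rev": "74EC24"}}
  ]
}
"""
print(revisions_by_id(body))
```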

Use the query argument descending=true to reverse the order of the output table:

GET /somedatabase/_all_docs?descending=true HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

Will return the same as before but in reverse order:

HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 0, "rows": [
    {"id": "doc3", "key": "doc3", "value": {"rev":"74EC24"}},
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}},
    {"id": "doc1", "key": "doc1", "value": {"rev": "4324BB"}}
  ]
}

The query string parameters startkey and count may also be used to limit the result set. For example:

GET /somedatabase/_all_docs?startkey=doc2&count=2 HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

Will return:

HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 1, "rows": [
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}},
    {"id": "doc3", "key": "doc3", "value": {"rev":"74EC24"}}
  ]
}

And combined with descending:

GET /somedatabase/_all_docs?startkey=doc2&count=2&descending=true HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

Will return:

HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
  "total_rows": 3, "offset": 1, "rows": [
    {"id": "doc3", "key": "doc3", "value": {"rev":"74EC24"}},
    {"id": "doc2", "key": "doc2", "value": {"rev":"2441HF"}}
  ]
}
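The request paths used in this section can be assembled programmatically. This is a minimal sketch that only builds the path and query string from the three parameters described above (startkey, count, descending); the function name all_docs_path is hypothetical and no request is sent.

```python
from urllib.parse import urlencode


def all_docs_path(db, startkey=None, count=None, descending=False):
    """Build the _all_docs request path with optional query arguments."""
    params = {}
    if startkey is not None:
        params["startkey"] = startkey
    if count is not None:
        params["count"] = count
    if descending:
        params["descending"] = "true"
    query = urlencode(params)  # percent-encodes values where needed
    return f"/{db}/_all_docs" + (f"?{query}" if query else "")


print(all_docs_path("somedatabase"))
print(all_docs_path("somedatabase", startkey="doc2", count=2, descending=True))
```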

Working With Documents Over HTTP

GET

To retrieve a document, simply perform a GET operation at the document's URL:

GET /somedatabase/some_doc_id HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

Here is the server's response:

HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{
 "_id":"some_doc_id",
 "_rev":"946B7D1C",
 "Subject":"I like Plankton",
 "Author":"Rusty",
 "PostedDate":"2006-08-15T17:30:12-04:00",
 "Tags":["plankton", "baseball", "decisions"],
 "Body":"I decided today that I don't like baseball. I like plankton."
}

Accessing Previous Revisions

See DocumentRevisions for additional notes on revisions.

The above example gets the current revision. You can get a specific revision by using the following syntax:

GET /somedatabase/some_doc_id?rev=946B7D1C HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

To find out what revisions are available for a document, you can do:

GET /somedatabase/some_doc_id?revs=true HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

This returns the current revision of the document, but with an additional field, _revs, whose value is a list of the available revision IDs. Note, though, that not every one of those revisions is necessarily still stored on disk. For example, the content of an old revision may get removed by compacting the database, or it may only exist in a different database if it was replicated.

To get more detailed information about the available document revisions, use the revs_info parameter instead. In this case, the JSON result will contain a _revs_info property, which is an array of objects, for example:

{
  "_revs_info": [
    {"rev": "123456", "status": "disk"},
    {"rev": "234567", "status": "missing"},
    {"rev": "345678", "status": "deleted"}
  ]
}

Here, disk means the revision content is stored on disk and can still be retrieved. The other values indicate that the content of that revision is not available.
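For instance, a client could filter a _revs_info array down to the revisions whose content can still be fetched. This is a small Python sketch based only on the status values shown above; the function name retrievable_revisions is hypothetical.

```python
def retrievable_revisions(doc):
    """Return the revision IDs whose content is still stored on disk."""
    return [
        info["rev"]
        for info in doc.get("_revs_info", [])
        if info["status"] == "disk"
    ]


doc = {
    "_revs_info": [
        {"rev": "123456", "status": "disk"},
        {"rev": "234567", "status": "missing"},
        {"rev": "345678", "status": "deleted"},
    ]
}
print(retrievable_revisions(doc))
```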

PUT

To create a new document you can use either a POST operation or a PUT operation. To create or update a named document using the PUT operation, the URL must point to the document's location.

The following is an example HTTP PUT. It will cause the CouchDB server to generate a new revision ID and save the document with it.

PUT /somedatabase/some_doc_id HTTP/1.0
Content-Length: 245
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json

{
  "Subject":"I like Plankton",
  "Author":"Rusty",
  "PostedDate":"2006-08-15T17:30:12-04:00",
  "Tags":["plankton", "baseball", "decisions"],
  "Body":"I decided today that I don't like baseball. I like plankton."
}

Here is the server's response.

HTTP/1.1 201 Created
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok": true, "id": "some_doc_id", "rev": "946B7D1C"}

To update an existing document, you also issue a PUT request. In this case, the JSON body must contain a _rev property, which lets CouchDB know which revision the edits are based on. If the revision of the document currently stored in the database doesn't match, then a 409 conflict error is returned.

If the revision number does match what's in the database, a new revision number is generated and returned to the client.

For example:

PUT /somedatabase/some_doc_id HTTP/1.0
Content-Length: 245
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json

{
  "_id":"some_doc_id",
  "_rev":"946B7D1C",
  "Subject":"I like Plankton",
  "Author":"Rusty",
  "PostedDate":"2006-08-15T17:30:12-04:00",
  "Tags":["plankton", "baseball", "decisions"],
  "Body":"I decided today that I don't like baseball. I like plankton."
}

Here is the server's response if what is stored in the database is revision 946B7D1C of document some_doc_id.

HTTP/1.1 201 Created
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok":true, "id":"some_doc_id", "rev":"946B7D1C"}

And here is the server's response if there is an update conflict (what is currently stored in the database is not revision 946B7D1C of document some_doc_id).

HTTP/1.1 409 CONFLICT
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Length: 33
Connection: close

{"error":{"id":"conflict","reason":"3073715634"}}
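A common way to handle the 409 response is to re-read the document and retry the edit. The following Python sketch shows that loop against an in-memory stand-in rather than a real server: FakeStore, update_with_retry and the "revN" revision strings are all hypothetical, and only the 201/409 status codes and the _rev check come from the description above.

```python
class FakeStore:
    """In-memory stand-in for one document; mimics the _rev check on PUT."""

    def __init__(self, doc):
        self.doc = dict(doc)
        self.counter = 0

    def get(self, doc_id):
        return dict(self.doc)

    def put(self, doc_id, doc):
        if doc.get("_rev") != self.doc["_rev"]:
            return 409, {"error": "conflict"}
        self.counter += 1
        self.doc = dict(doc, _rev=f"rev{self.counter}")  # made-up revision IDs
        return 201, {"ok": True, "id": doc_id, "rev": self.doc["_rev"]}


def update_with_retry(get_doc, put_doc, doc_id, change, max_tries=3):
    """Fetch, edit and save a document, retrying when the save conflicts."""
    for _ in range(max_tries):
        doc = get_doc(doc_id)      # includes the current _rev
        change(doc)                # apply the edit in place
        status, body = put_doc(doc_id, doc)
        if status == 201:
            return body["rev"]
        if status != 409:
            raise RuntimeError(f"unexpected status {status}")
    raise RuntimeError("still conflicting after retries")


def rename(doc):
    doc["Author"] = "Jan"


store = FakeStore({"_id": "some_doc_id", "_rev": "946B7D1C", "Author": "Rusty"})
first_rev = update_with_retry(store.get, store.put, "some_doc_id", rename)

attempts = []


def stale_first_get(doc_id):
    doc = store.get(doc_id)
    if not attempts:
        doc["_rev"] = "outdated"   # simulate another writer saving first
    attempts.append(doc["_rev"])
    return doc


second_rev = update_with_retry(stale_first_get, store.put, "some_doc_id", rename)
print(first_rev, second_rev, attempts)
```

Passing the fetch and save steps in as functions keeps the retry logic independent of any particular HTTP client.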

POST

The POST operation can be used to create a new document with a server-generated DocID. To create a named document, use the PUT method instead. It is recommended that you avoid POST when possible, because proxies and other network intermediaries will occasionally resend POST requests, which can result in duplicate document creation. If your client software is not capable of generating cryptographically secure UUIDs, use a POST to /_uuids?count=100 to retrieve a list of unused document IDs for future PUT requests.

The following is an example HTTP POST. It will cause the CouchDB server to generate a new DocID and revision ID and save the document with it.

POST /somedatabase/ HTTP/1.0
Content-Length: 245
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json

{
  "Subject":"I like Plankton",
  "Author":"Rusty",
  "PostedDate":"2006-08-15T17:30:12-04:00",
  "Tags":["plankton", "baseball", "decisions"],
  "Body":"I decided today that I don't like baseball. I like plankton."
}

Here is the server's response:

HTTP/1.1 201 Created
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok":true, "id":"123BAC", "rev":"946B7D1C"}
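Following the recommendation to prefer PUT, a client can generate the DocID itself. This sketch uses Python's uuid.uuid4() (random UUIDs) and only prepares the request parts without sending them; the function name new_document_request and the choice of a 32-character hex ID are assumptions, not a CouchDB requirement.

```python
import json
import uuid


def new_document_request(db, doc):
    """Prepare a PUT that creates `doc` under a client-generated DocID."""
    doc_id = uuid.uuid4().hex          # 32 hex characters, no dashes
    body = json.dumps(doc).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Length": str(len(body)),
    }
    return "PUT", f"/{db}/{doc_id}", headers, body


method, path, headers, body = new_document_request(
    "somedatabase", {"Subject": "I like Plankton", "Author": "Rusty"}
)
print(method, path)
```

Because the ID is fixed before the request is sent, a resent PUT targets the same document instead of creating a duplicate.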

Modify Multiple Documents With a Single Request

CouchDB provides a bulk insert/update feature. To use this, you make a POST request to the URI /{dbname}/_bulk_docs, with the request body being a JSON document containing a list of new documents to be inserted or updated. The bulk post is a transactional operation - all updates/insertions succeed, or all fail.

Doc formats below are as per CouchDB 0.8.0.

{
  "docs": [
    {"_id": "0", "integer": 0, "string": "0"},
    {"_id": "1", "integer": 1, "string": "1"},
    {"_id": "2", "integer": 2, "string": "2"}
  ]
}

If you omit the per-document _id specification, CouchDB will generate unique IDs for you, as it does for regular POST requests to the database URI.

The response to such a bulk request would look as follows:

{
  "ok":true,
  "new_revs": [
    {"id": "0", "rev": "3682408536"},
    {"id": "1", "rev": "3206753266"},
    {"id": "2", "rev": "426742535"}
  ]
}

Updating existing documents requires setting the _rev member to the revision being updated. To delete a document, set the _deleted member to true.

{
  "docs": [
    {"_id": "0", "_rev": "3682408536", "_deleted": true},
    {"_id": "1", "_rev": "3206753266", "integer": 2, "string": "2"},
    {"_id": "2", "_rev": "426742535", "integer": 3, "string": "3"}
  ]
}

Note that CouchDB will return in the response an id and revision for every document passed as content to a bulk insert, even for those that were just deleted.
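A request body mixing inserts, updates and deletions can be assembled as sketched below. The helper name bulk_payload and its argument layout are hypothetical; the "docs", "_rev" and "_deleted" members follow the 0.8.0 format shown above.

```python
import json


def bulk_payload(inserts=(), updates=(), deletes=()):
    """Build the JSON body for a POST to /{dbname}/_bulk_docs."""
    docs = [dict(d) for d in inserts]          # _id is optional here
    for d in updates:
        if "_id" not in d or "_rev" not in d:
            raise ValueError("updates need both _id and _rev")
        docs.append(dict(d))
    for doc_id, rev in deletes:
        docs.append({"_id": doc_id, "_rev": rev, "_deleted": True})
    return json.dumps({"docs": docs})


payload = bulk_payload(
    inserts=[{"integer": 4, "string": "4"}],
    updates=[{"_id": "1", "_rev": "3206753266", "integer": 2, "string": "2"}],
    deletes=[("0", "3682408536")],
)
print(payload)
```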

DELETE

To delete a document, perform a DELETE operation at the document's location, passing the rev parameter with the document's current revision. If successful, it will return the revision id for the deletion stub.

DELETE /somedatabase/some_doc?rev=1582603387 HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT

As an alternative, you can supply the revision in the If-Match header field instead of the rev parameter.

DELETE /somedatabase/some_doc HTTP/1.0
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
If-Match: "1582603387"

And the response:

HTTP/1.1 200 OK
Date: Thu, 17 Aug 2006 05:39:28 +0000GMT
Content-Type: application/json
Connection: close

{"ok":true,"rev":"2839830636"}
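Both ways of passing the revision can be sketched as one helper that prepares the request parts. The function name delete_request is hypothetical and nothing is sent over the network.

```python
from urllib.parse import quote


def delete_request(db, doc_id, rev, use_if_match=False):
    """Prepare a DELETE, passing the revision as ?rev= or as If-Match."""
    path = f"/{db}/{quote(doc_id, safe='')}"
    if use_if_match:
        # The header value is the revision wrapped in double quotes.
        return "DELETE", path, {"If-Match": f'"{rev}"'}
    return "DELETE", f"{path}?rev={quote(rev, safe='')}", {}


print(delete_request("somedatabase", "some_doc", "1582603387"))
print(delete_request("somedatabase", "some_doc", "1582603387", use_if_match=True))
```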

Attachments

Documents can have attachments, just like email. There are two ways to use attachments. The first is inline with your document and is described first. The second is a separate REST API for attachments, described a little further down.

Inline Attachments

On creation, attachments go into a special _attachments attribute of the document. They are encoded in a JSON structure that holds the name, the content_type and the base64 encoded data of an attachment. A document can have any number of attachments.

When retrieving documents, the attachment's actual data is not included, only the metadata. The actual data has to be fetched separately, using a special URI.

Creating a document with an attachment:

{
  "_id":"attachment_doc",
  "_attachments":
  {
    "foo.txt":
    {
      "content_type":"text\/plain",
      "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
    }
  }
}

Please note that any base64 data you send has to be on a single line of characters, so pre-process your data to remove any carriage returns and newlines.

Requesting said document:

GET /database/attachment_doc

CouchDB replies:

{
  "_id":"attachment_doc",
  "_rev":"1589456116",
  "_attachments":
  {
    "foo.txt":
    {
      "stub":true,
      "content_type":"text\/plain",
      "length":29
    }
  }
}

Note that the "stub":true attribute denotes that this is not the complete attachment. Also, note the length attribute added automatically.

Requesting the attachment:

GET /database/attachment_doc/foo.txt

CouchDB returns:

This is a base64 encoded text

Automatically decoded!
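Given a retrieved document, the paths of its stubbed attachments can be derived as in this sketch. The function name attachment_paths is hypothetical; the /database/doc/attachment layout is the one used in the requests above.

```python
from urllib.parse import quote


def attachment_paths(db, doc):
    """Map each stub attachment name to the path its data is served from."""
    doc_id = quote(doc["_id"], safe="")
    return {
        name: f"/{db}/{doc_id}/{quote(name, safe='')}"
        for name, meta in doc.get("_attachments", {}).items()
        if meta.get("stub")
    }


doc = {
    "_id": "attachment_doc",
    "_rev": "1589456116",
    "_attachments": {
        "foo.txt": {"stub": True, "content_type": "text/plain", "length": 29}
    },
}
print(attachment_paths("database", doc))
```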

Multiple Attachments

Creating a document with multiple attachments:

{
  "_id":"attachment_doc",
  "_attachments":
  {
    "foo.txt":
    {
      "content_type":"text\/plain",
      "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
    },

    "bar.txt":
    {
      "content_type":"text\/plain",
      "data": "VGhpcyBpcyBhIGJhc2U2NCBlbmNvZGVkIHRleHQ="
    }
  }
}
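The _attachments structure can be generated from raw bytes as sketched below. Python's base64.b64encode returns the encoded data without line breaks, which satisfies the single-line requirement mentioned earlier. The helper name with_attachments and its argument layout are hypothetical.

```python
import base64
import json


def with_attachments(doc, files):
    """Return a copy of `doc` with inline attachments added.

    `files` maps an attachment name to a (content_type, raw_bytes) pair.
    """
    out = dict(doc)
    out["_attachments"] = {
        name: {
            "content_type": content_type,
            "data": base64.b64encode(raw).decode("ascii"),
        }
        for name, (content_type, raw) in files.items()
    }
    return out


text = b"This is a base64 encoded text"
doc = with_attachments(
    {"_id": "attachment_doc"},
    {"foo.txt": ("text/plain", text), "bar.txt": ("text/plain", text)},
)
print(json.dumps(doc, indent=2))
```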

Standalone Attachments

CouchDB allows you to create, change and delete attachments without touching the actual document. As a bonus, you do not have to base64 encode your data. This can significantly speed up requests, since neither CouchDB nor your client has to do the base64 conversion.

You need to specify a MIME type using the Content-Type header. CouchDB will serve the attachment with the specified Content-Type when asked.

To create an attachment:

PUT /somedatabase/document/attachment?rev=123 HTTP/1.0
Content-Length: 245
Content-Type: image/jpeg

<JPEG data>

CouchDB replies:

{"ok": true, "id": "document", "rev": "765B7D1C"}

Note that you can do this on a non-existing document. The document and attachment will be created implicitly for you. A revision id must not be specified in this case.

To change an attachment:

PUT /somedatabase/document/attachment?rev=765B7D1C HTTP/1.0
Content-Length: 245
Content-Type: image/jpeg

<JPEG data>

CouchDB replies:

{"ok": true, "id": "document", "rev": "766FC88G"}

To delete an attachment:

DELETE /somedatabase/document/attachment?rev=765B7D1C HTTP/1.0

CouchDB replies:

{"ok":true,"id":"document","rev":"519558700"}
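The create, change and delete requests above differ only in method, body and whether a rev is given. A minimal Python sketch that prepares them, without sending anything, might look like this; the function names attachment_put and attachment_delete are hypothetical.

```python
from urllib.parse import quote


def _attachment_path(db, doc_id, name, rev=None):
    path = f"/{db}/{quote(doc_id, safe='')}/{quote(name, safe='')}"
    # rev is omitted when the document does not exist yet.
    return path if rev is None else f"{path}?rev={quote(rev, safe='')}"


def attachment_put(db, doc_id, name, data, content_type, rev=None):
    """Prepare a PUT that creates or replaces a standalone attachment."""
    headers = {"Content-Type": content_type, "Content-Length": str(len(data))}
    return "PUT", _attachment_path(db, doc_id, name, rev), headers, data


def attachment_delete(db, doc_id, name, rev):
    """Prepare a DELETE for a standalone attachment."""
    return "DELETE", _attachment_path(db, doc_id, name, rev), {}, b""


raw = b"\xff\xd8\xff\xe0 raw image bytes, sent without base64"
print(attachment_put("somedatabase", "document", "attachment", raw, "image/jpeg")[:2])
print(attachment_delete("somedatabase", "document", "attachment", "765B7D1C")[:2])
```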

To retrieve an attachment:

GET /somedatabase/document/attachment HTTP/1.0

CouchDB replies:

Content-Type: image/jpeg

<JPEG data>

ETags/Caching

CouchDB sends an ETag Header for document requests. The ETag Header is simply the document's revision in quotes.

For example, a GET request:

GET /database/123182719287

Results in a reply with the following headers:

cache-control: no-cache
pragma: no-cache
expires: Tue, 13 Nov 2007 23:09:50 GMT
transfer-encoding: chunked
content-type: text/plain;charset=utf-8
etag: "615790463"

POST requests also return an ETag header for either newly created or updated documents.
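Since the ETag is the quoted revision, a client can recover the revision from the response headers and reuse it in a conditional request. The sketch below is illustrative only: the function names are hypothetical, and sending the value back in If-None-Match is standard HTTP practice whose handling by the server is assumed here rather than stated in this page.

```python
def rev_from_headers(headers):
    """Extract the document revision from an ETag header, if present."""
    for name, value in headers.items():
        if name.lower() == "etag":          # header names are case-insensitive
            return value.strip().strip('"')
    return None


def conditional_headers(cached_rev):
    """Headers for a GET that asks only for changes since `cached_rev`."""
    # Assumption: the server compares this against the current ETag.
    return {"If-None-Match": f'"{cached_rev}"'}


reply_headers = {
    "content-type": "text/plain;charset=utf-8",
    "etag": '"615790463"',
}
rev = rev_from_headers(reply_headers)
print(rev, conditional_headers(rev))
```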

HttpDocumentApi (last edited 2009-09-20 21:45:16 by localhost)