Differences between revisions 7 and 8
Revision 7 as of 2013-03-27 01:03:30
Size: 7005
Editor: MarkHahn
Comment:
Revision 8 as of 2013-03-27 01:16:42
Size: 7010
Editor: MarkHahn
Comment: Fixed extra slash in missing object feature example

Future versions of CouchDB are expected to have a built-in partial update feature. However, partial updates can be accomplished now with current versions of CouchDB using the existing update handler feature. While one may write their own update handler for this purpose, an example is given here that anyone can use.

What is a partial update?

A partial update is a single HTTP request to CouchDB that is similar to a normal update (PUT). However the partial update request contains only information for updating (or deleting) one or more fields (or sub-fields) of a doc.

Why is a partial update useful?

  • A partial update is more efficient than a normal full update. Only the change information needs to be sent over HTTP, not the entire doc. In the general case of a full update, changing a single field in a doc requires reading the doc, changing it, and then putting the doc back to the DB. A partial update only needs the ID of the doc in order to make the field change.
  • Partial updates allow the code in an app, or multiple apps, to be partitioned into multiple pieces where each piece of code only knows about one part of the doc. In many cases this allows for a better separation of concerns. As an example, a routine may be called with only the doc ID, and the routine may then update part of the doc at any time without ever having a full copy of it. This is especially important when accessing a single DB from multiple apps (or workers) where a single copy of a doc can't be shared.
  • A partial update, like any other update using the update handler, has less chance of a 409 collision than the sequence of getting a doc, modifying it, and then putting it back. This is because the internal update is faster than getting and putting over a TCP link. Also, when different non-overlapping parts of a doc are being updated at once, collisions can usually be ignored.

Example update handler

While this is just an example, it can be used by anyone for accomplishing any partial update. I, the author (Mark Hahn), am making this code available to everyone under the standard Apache 2 license.

CoffeeScript version

  partialUpdate: (doc, req) ->
    if not doc then return [null, JSON.stringify status: 'nodoc']
    for k, v of JSON.parse req.body
      if k[0] is '/'
        nestedDoc = doc
        nestedKeys = k.split '/'
        for nestedKey in nestedKeys[1..-2]
          nestedDoc = (nestedDoc[nestedKey] ?= {})
        k = nestedKeys[-1..-1][0]
        if v is '__delete__' then delete nestedDoc[k]
        else nestedDoc[k] = v
        continue
      if v is '__delete__' then delete doc[k]
      else doc[k] = v
    [doc, JSON.stringify {doc, status: 'updated'}]

JavaScript version

partialUpdate: function(doc, req) {
  if (!doc) {
    return [
      null, JSON.stringify({
        status: 'nodoc'
      })
    ];
  }
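  // Declare the locals so they do not leak as implicit globals.
  var _ref, _ref1, _ref2, _i, _len, k, v, nestedDoc, nestedKeys, nestedKey;
  _ref = JSON.parse(req.body);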
  for (k in _ref) {
    v = _ref[k];
    if (k[0] === '/') {
      nestedDoc = doc;
      nestedKeys = k.split('/');
      _ref1 = nestedKeys.slice(1, -1);
      for (_i = 0, _len = _ref1.length; _i < _len; _i++) {
        nestedKey = _ref1[_i];
        nestedDoc = ((_ref2 = nestedDoc[nestedKey]) != null ? _ref2 : nestedDoc[nestedKey] = {});
      }
      k = nestedKeys.slice(-1)[0];
      if (v === '__delete__') {
        delete nestedDoc[k];
      } else {
        nestedDoc[k] = v;
      }
      continue;
    }
    if (v === '__delete__') {
      delete doc[k];
    } else {
      doc[k] = v;
    }
  }
  return [
    doc, JSON.stringify({
      doc: doc,
      status: 'updated'
    })
  ];
}

Installing the update handler

To install the update handler, add the JavaScript version of the code above to the updates property of a design doc.

See Document_Update_Handlers for general information about update handlers.
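
As a rough sketch of what that might look like: CouchDB stores each update function as a string of source code under the updates property. The design doc name _design/app is only an assumption made for this example, and the function body below is a stand-in for the full JavaScript version listed above.

```javascript
// Stand-in for the handler; replace the body with the full JavaScript version above.
var partialUpdate = function(doc, req) {
  return [doc, JSON.stringify({ status: 'updated' })];
};

// "_design/app" is a made-up name used only for this sketch.
var designDoc = {
  _id: '_design/app',
  updates: {
    // Design docs hold functions as source strings, not as live functions.
    partialUpdate: partialUpdate.toString()
  }
};

// This JSON is what would be saved to the database as the design doc.
console.log(JSON.stringify(designDoc, null, 2));
```

The key under updates (partialUpdate here) becomes the function name used in the handler URL.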

Usage instructions

To quote the instructions at Document_Update_Handlers ...

  • To invoke a handler, use a PUT request against the handler function with a document id: /<database>/_design/<design>/_update/<function>/<docid>

The update document should be contained in the HTTP request body in JSON format. When using the partial update handler listed above, the update document must use a special format. The JSON doc should consist of one hash object where each property of the object is one update command.

The property key of an update command specifies which field is to be updated. It can be a simple top-level property key, or it can be a path into an object with nested objects or arrays. A path key is indicated by a leading slash / and consists of multiple parts separated by slashes.
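
To make the request shape concrete, here is a small sketch that only assembles the pieces of such a PUT request. The helper name buildPartialUpdateRequest, the server address, the database name mydb and the design doc name app are all assumptions made for this example. The handler is assumed to be installed under the name partialUpdate.

```javascript
// Hypothetical helper: builds the URL, method, headers and body for the handler call.
function buildPartialUpdateRequest(baseUrl, db, design, docId, commands) {
  var url = baseUrl + '/' + encodeURIComponent(db) +
    '/_design/' + encodeURIComponent(design) +
    '/_update/partialUpdate/' + encodeURIComponent(docId);
  return {
    url: url,
    method: 'PUT',
    headers: { 'Content-Type': 'application/json' },
    // Each property of the body is one update command.
    body: JSON.stringify(commands)
  };
}

var request = buildPartialUpdateRequest(
  'http://localhost:5984', 'mydb', 'app', 'xxx',
  {
    field_one: 'AAA',                        // simple top-level key
    '/topLevelObject/nestedField_two': 99    // path key with a leading slash
  }
);

console.log(request.method + ' ' + request.url);
console.log(request.body);
```

The resulting object can be handed to whatever HTTP client the app already uses.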

Consider this original doc ...

{
  "_id": "xxx",
  "_rev": "1-something",
  "field_one": "zzz",
  "field_two": "Don't bother me.",
  "topLevelObject": {
    "nestedField_one": "I'm doomed",
    "nestedField_two": 42
  }
}

A command key of field_one is a command to replace "zzz" with the value of the update command property. A command key of /topLevelObject/nestedField_two will replace 42 with the associated property value.

An update command that has the magic property value of __delete__ will cause the corresponding original field to be deleted. The command property /topLevelObject/nestedField_one: "__delete__" will delete the entire nestedField_one property.

The update handler will automatically create any objects missing from any part of a path. For example, the following update command /topLevelObject/nestedField_three/dblNest: 73 will add the missing object nestedField_three and new property dblNest to the doc.

This update document ...

{
  "field_one": "AAA",
  "/topLevelObject/nestedField_one": "__delete__",
  "/topLevelObject/nestedField_two": 99,
  "/topLevelObject/nestedField_three/dblNest": 73
}

will change the original document to ...

{
  "_id": "xxx",
  "_rev": "2-somethingElse",
  "field_one": "AAA",
  "field_two": "Don't bother me.",
  "topLevelObject": {
    "nestedField_two": 99,
    "nestedField_three": {
      "dblNest": 73
    }
  }
}
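
The transformation above can be tried locally without a running CouchDB. The sketch below is a standalone rewrite of the handler logic, not the exact code listed earlier, and it fakes the req object with only a body property. Note that the _rev change is made by CouchDB when it saves the doc, so it does not show up here.

```javascript
// Standalone version of the handler logic, for local experimentation only.
function partialUpdate(doc, req) {
  if (!doc) return [null, JSON.stringify({ status: 'nodoc' })];
  var commands = JSON.parse(req.body);
  Object.keys(commands).forEach(function (key) {
    var value = commands[key];
    var target = doc;
    var last = key;
    if (key[0] === '/') {
      var parts = key.split('/');
      // Walk, and create where missing, every object between the root and the last part.
      parts.slice(1, -1).forEach(function (part) {
        if (target[part] == null) target[part] = {};
        target = target[part];
      });
      last = parts[parts.length - 1];
    }
    if (value === '__delete__') delete target[last];
    else target[last] = value;
  });
  return [doc, JSON.stringify({ doc: doc, status: 'updated' })];
}

var original = {
  _id: 'xxx',
  _rev: '1-something',
  field_one: 'zzz',
  field_two: "Don't bother me.",
  topLevelObject: { nestedField_one: "I'm doomed", nestedField_two: 42 }
};

// Fake request carrying the update document from the example above.
var req = { body: JSON.stringify({
  field_one: 'AAA',
  '/topLevelObject/nestedField_one': '__delete__',
  '/topLevelObject/nestedField_two': 99,
  '/topLevelObject/nestedField_three/dblNest': 73
}) };

var result = partialUpdate(original, req);
console.log(JSON.stringify(result[0], null, 2));
```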

Reserved syntax

Note from the usage instructions above that a doc key that starts with a slash, or a doc value that is exactly the string __delete__, cannot be handled by the update handler given above: the key would be read as a path and the value as a delete command. This could be solved by adding some escape mechanism to the handler. Fixing this is left to the reader.

HTTP 409 error

Even though an update with an update handler has less chance of colliding, it is still possible for the update request to return an HTTP error 409. This is caused by some other update changing the revision of the doc while the update handler is executing. You will need to implement a retry mechanism and/or conflict resolution in the code making the HTTP request.
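
One possible shape for such a retry mechanism is sketched below. The names updateWithRetry and sendUpdate are made up for this example: sendUpdate stands for any caller-supplied function that performs the PUT and returns a promise for the HTTP status code. Because the handler runs again against the latest revision of the doc, the same update document can simply be resent.

```javascript
// Sketch of a retry loop; sendUpdate is a hypothetical caller-supplied function
// returning a promise that resolves to the HTTP status code of the PUT.
function updateWithRetry(sendUpdate, maxAttempts) {
  function attempt(n) {
    return sendUpdate().then(function (status) {
      if (status === 409 && n < maxAttempts) {
        return attempt(n + 1); // conflict: resend the same commands
      }
      return status; // success, a non-conflict error, or attempts exhausted
    });
  }
  return attempt(1);
}

// Usage with a fake sender that conflicts twice before succeeding.
var calls = 0;
function fakeSend() {
  calls += 1;
  return Promise.resolve(calls < 3 ? 409 : 201);
}

var outcome = updateWithRetry(fakeSend, 5);
outcome.then(function (status) {
  console.log('final status ' + status + ' after ' + calls + ' attempts');
});
```

A real app would probably also want a short delay between attempts and a decision about what to do when the attempts run out.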

Partial_Updates (last edited 2013-03-27 01:16:42 by MarkHahn)