Updating a Solr Index with JSON

Solr accepts index updates in JSON format.

<!> Solr3.1

Requirements

<!> Solr3.1 is the first version with JSON support for updates.

The JSON request handler must be configured in solrconfig.xml. It should already be present in the example solrconfig.xml:

  <requestHandler name="/update/json" class="solr.JsonUpdateRequestHandler"/>

<!> In Solr4.0, JSON support is included in the standard UpdateRequestHandler

  <requestHandler name="/update" class="solr.UpdateRequestHandler"/>

Note: requests must include the header Content-type:application/json or Content-type:text/json.

Methods of sending JSON

JSON formatted update requests may be sent to Solr via the /solr/update/json URL. All of the normal methods for uploading content are supported.

Example

There is a sample JSON file at example/exampledocs/books.json that may be used to add documents to the Solr example server.

Example of using HTTP-POST to index the JSON:

cd example/exampledocs
curl 'http://localhost:8983/solr/update/json?commit=true' --data-binary @books.json -H 'Content-type:application/json'

Note that we added "commit=true" to the URL so that the documents would be immediately searchable.
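The same request can also be built from a script. The following is a minimal Python sketch using only the standard library, not something shipped with Solr. The helper name build_update_request and its parameters are hypothetical, and the URL is the example server's address assumed throughout this page. It only constructs the request, so it can be inspected before being sent.

```python
import json
import urllib.request

# Assumed default: the example server's JSON update URL used on this page.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update/json"


def build_update_request(docs, url=SOLR_UPDATE_URL, commit=True):
    """Build (but do not send) a POST request that indexes `docs`.

    `docs` is a list of dicts, one per document (hypothetical helper).
    """
    if commit:
        url += "?commit=true"  # make the documents searchable immediately
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


request = build_update_request(
    [{"id": "MyTestDocument", "title": "This is just a test"}]
)
# To send it to a running Solr instance:
#   urllib.request.urlopen(request)
```

Keeping construction separate from sending makes it easy to check the URL, header and body in a test before any document reaches the index.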

You should now be able to query for the newly added documents: http://localhost:8983/solr/select?q=name:monsters&wt=json&indent=true

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "indent":"true",
      "wt":"json",
      "q":"name:monsters"}},
  "response":{"numFound":1,"start":0,"docs":[
      {
        "id":"978-1423103349",
        "author":"Rick Riordan",
        "series_t":"Percy Jackson and the Olympians",
        "sequence_i":2,
        "genre_s":"fantasy",
        "inStock":true,
        "price":6.49,
        "pages_i":304,
        "name":[
          "The Sea of Monsters"],
        "cat":["book","paperback"]}]
  }}
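For scripted checks, the query URL and the response shown above can be handled with a few lines of Python. This is only a sketch: build_select_url and doc_ids are hypothetical helper names, and the sample string is a trimmed copy of the response above rather than live server output.

```python
import json
from urllib.parse import urlencode


def build_select_url(query, base="http://localhost:8983/solr/select"):
    """Return a select URL that asks for an indented JSON response."""
    params = {"q": query, "wt": "json", "indent": "true"}
    return base + "?" + urlencode(params)


def doc_ids(response_text):
    """Return the ids of the documents in a JSON select response."""
    data = json.loads(response_text)
    return [doc["id"] for doc in data["response"]["docs"]]


# Trimmed copy of the response shown above (not fetched from a server).
sample = '{"response":{"numFound":1,"start":0,"docs":[{"id":"978-1423103349"}]}}'
ids = doc_ids(sample)
```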

It's also easy to specify JSON documents from the command line for testing purposes and scripts (assumes a UNIX environment):

URL=http://localhost:8983/solr/update/json
curl $URL -H 'Content-type:application/json' -d '
[
  {
    "id" : "MyTestDocument",
    "title" : "This is just a test"
  }
]'
curl "$URL?commit=true"

Here's a simple example of adding more than one document at once:

curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
[
 {"id" : "TestDoc1", "title" : "test1"},
 {"id" : "TestDoc2", "title" : "another test"}
]'

Update Commands

The JSON update handler accepts all of the types of update commands that the XML update handler supports, through a straightforward mapping. Please see the documentation on XML updates for detailed descriptions of the commands.

Multiple commands may be contained in one message. Here is an example JSON update message demonstrating multiple update commands. The comments are for illustration only, since comments are not legal JSON; duplicate names, however, are legal.

{ 
"add": {
  "doc": {
    "id": "DOC1",
    "my_boosted_field": {        /* use a map with boost/value for a boosted field */
      "boost": 2.3,
      "value": "test"
    },
    "my_multivalued_field": [ "aaa", "bbb" ]   /* use an array for a multi-valued field */
  }
},
"add": {
  "commitWithin": 5000,          /* commit this document within 5 seconds */
  "overwrite": false,            /* don't check for existing documents with the same uniqueKey */
  "boost": 3.45,                 /* a document boost */
  "doc": {
    "f1": "v1",
    "f1": "v2"
  }
},

"commit": {},
"optimize": { "waitFlush":false, "waitSearcher":false },

"delete": { "id":"ID" },                               /* delete by ID */
"delete": { "query":"QUERY" },                         /* delete by query */
"delete": { "query":"QUERY", "commitWithin":500 }      /* delete by query, commit within 500ms */
}

Just as in the other update handlers, parameters such as commit, commitWithin, optimize, and overwrite may be specified in the URL instead of in the body of the message.
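A small Python sketch of moving such parameters into the URL follows. The helper name update_url is hypothetical, and the base URL is the example server's address assumed on this page.

```python
from urllib.parse import urlencode


def update_url(base="http://localhost:8983/solr/update/json", **params):
    """Append update parameters such as commit or commitWithin to the URL."""
    if not params:
        return base
    # Python booleans print as True/False, so convert them to the
    # lowercase true/false form used in the examples on this page.
    query = {
        name: str(value).lower() if isinstance(value, bool) else value
        for name, value in params.items()
    }
    return base + "?" + urlencode(query)


url = update_url(commitWithin=5000, overwrite=False)
```

With the parameters in the URL, the message body can stay a plain array of documents.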

Solr 3.1 Example

Solr 3.2 was the first version to support the array-of-JSONObject syntax, so in Solr 3.1 one needs to use duplicate names (the "add" tag) to add more than one document at once. It is legal in JSON to have duplicate names. Example:

curl http://localhost:8983/solr/update/json -H 'Content-type:application/json' -d '
{
 "add": {"doc": {"id" : "TestDoc1", "title" : "test1"} },
 "add": {"doc": {"id" : "TestDoc2", "title" : "another test"} }
}'
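Generating a message with duplicate names from a script needs some care, because most JSON libraries build objects from maps that cannot hold the same key twice. The following Python sketch assembles the object member by member instead. The helper name encode_commands is hypothetical.

```python
import json


def encode_commands(commands):
    """Serialize (name, body) pairs into one JSON object, keeping duplicates.

    A Python dict cannot hold the same key twice, so each member is encoded
    on its own and the members are joined by hand.
    """
    members = [
        json.dumps(name) + ": " + json.dumps(body) for name, body in commands
    ]
    return "{" + ", ".join(members) + "}"


message = encode_commands([
    ("add", {"doc": {"id": "TestDoc1", "title": "test1"}}),
    ("add", {"doc": {"id": "TestDoc2", "title": "another test"}}),
    ("commit", {}),
])
```

Only the top-level object is joined by hand. Each command body is still encoded by json.dumps, so quoting and escaping inside the documents stay correct.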

Solr 4.0 Example

Atomic Updates

Solr 4.0 supports Atomic Updates, which allow you to add, set and inc (increment a numeric field) individual field values. Example:

curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
 {
  "id"        : "TestDoc1",
  "title"     : {"set":"test1"},
  "revision"  : {"inc":3},
  "publisher" : {"add":"TestPublisher"}
 },
 {
  "id"        : "TestDoc2",
  "publisher" : {"add":"TestPublisher"}
 }
]'
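When such payloads are generated by a program, a small helper can keep the operation names consistent. The following Python sketch is one possible shape, assuming "id" is the uniqueKey field as in the example above. The helper name atomic_update is hypothetical, and it only accepts the three operations named on this page.

```python
import json


def atomic_update(doc_id, **operations):
    """Build one atomic-update document.

    Each keyword is a field name mapped to an (operation, value) pair,
    where operation is one of "set", "add" or "inc".
    """
    doc = {"id": doc_id}  # assumed uniqueKey field name
    for field, (operation, value) in operations.items():
        if operation not in ("set", "add", "inc"):
            raise ValueError("unsupported atomic operation: " + operation)
        doc[field] = {operation: value}
    return doc


payload = json.dumps([
    atomic_update("TestDoc1", title=("set", "test1"), revision=("inc", 3)),
    atomic_update("TestDoc2", publisher=("add", "TestPublisher")),
])
```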

Atomic Updates with Optimistic Concurrency

Solr 4.0 comes with a built-in _version_ field that Solr adds automatically and that allows you to use Optimistic Concurrency with Atomic Updates. Example:

curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
 {
  "id"        : "TestDoc1",
  "title"     : {"set":"test1"},
  "revision"  : {"inc":3},
  "publisher" : {"add":"TestPublisher"},
  "_version_" : 12345
 }
 }
]'
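A client typically pairs the _version_ check with a retry: read the current version, send the update, and try again if another writer got there first. The Python sketch below illustrates that loop under stated assumptions. fetch_version and send_update are hypothetical callables supplied by the caller, and the sketch assumes a version mismatch is reported as HTTP 409 (Conflict), which should be verified against the Solr release in use.

```python
def update_with_retry(fetch_version, send_update, changes, max_attempts=3):
    """Retry an atomic update until its _version_ check passes.

    fetch_version() -> int  : hypothetical; returns the current _version_
    send_update(doc) -> int : hypothetical; posts `doc`, returns HTTP status
    Returns the number of attempts that were needed.
    """
    for attempt in range(1, max_attempts + 1):
        doc = dict(changes)
        # The update is applied only if this version still matches.
        doc["_version_"] = fetch_version()
        status = send_update(doc)
        if status == 200:
            return attempt
        if status != 409:  # assumed: 409 signals a version mismatch
            raise RuntimeError("update failed with HTTP status %d" % status)
    raise RuntimeError("gave up after %d version conflicts" % max_attempts)
```

Passing the two callables in keeps the loop independent of any particular HTTP client and lets it be exercised without a running server.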

UpdateJSON (last edited 2013-02-22 18:27:44 by HossMan)