XML Messages for Updating a Solr Index
Solr accepts XML messages POSTed to the URL /update. These messages can add or update documents, commit, delete by ID, and delete by query. Here is the XML syntax that Solr expects to see:
The Update Schema
(Not to be confused with schema.xml.)
add/update
Example:
<add> <doc> <field name="employeeId">05991</field> <field name="office">Bridgewater</field> <field name="skills">Perl</field> <field name="skills">Java</field> </doc> [<doc> ... </doc>[<doc> ... </doc>]] </add>
Subversion contains many complex examples of <add> document messages.
Note: multiple documents may be specified in a single <add> command.
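As a rough illustration of the message shape above, the following Python sketch assembles an <add> message with the standard library. The helper name build_add and the second sample document are hypothetical and not part of Solr; a list value is written as repeated <field> elements, as in the skills example.

```python
import xml.etree.ElementTree as ET


def build_add(docs):
    """Build an <add> message from a list of dicts (hypothetical helper).

    Each dict maps a field name to a single value or to a list of values.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # A list value becomes one <field> element per entry (multi-valued field).
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)
    # ElementTree escapes &, < and > in field text automatically.
    return ET.tostring(add, encoding="unicode")


message = build_add([
    {"employeeId": "05991", "office": "Bridgewater", "skills": ["Perl", "Java"]},
    {"employeeId": "06000", "office": "Osaka"},  # illustrative second document
])
print(message)
```

Building the message with an XML library rather than string concatenation avoids malformed messages when field values contain characters such as & or <.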
Optional attributes for "add"
overwrite = "true" | "false" — default is "true", meaning newer documents will replace previously added documents with the same uniqueKey.
commitWithin = "(milliseconds)": if this attribute is present, the document will be committed within that many milliseconds. (Solr 1.4)
(deprecated) allowDups = "true" | "false" — default is "false"
(deprecated) overwritePending = "true" | "false" — default is negation of allowDups
(deprecated) overwriteCommitted = "true"|"false" — default is negation of allowDups
Optional attributes on "doc"
boost = <float> — default is 1.0 (See Lucene docs for definition of boost.)
- NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml) for any fields where the index-time boost should be stored.
Optional attributes for "field"
boost = <float> — default is 1.0 (See Lucene docs for definition of boost.)
- NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml) for any fields where the index-time boost should be stored.
Example of "add" with optional attributes:
<add> <doc boost="2.5"> <field name="employeeId">05991</field> <field name="office" boost="2.0">Bridgewater</field> </doc> </add>
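The optional attributes can be set the same way when a message is generated programmatically. This is only a sketch using Python's standard library; the commitWithin value of 10000 is an arbitrary illustrative choice, and the boost values are copied from the example above.

```python
import xml.etree.ElementTree as ET

# Attributes on <add>: overwrite and commitWithin (10000 ms is an assumed example value).
add = ET.Element("add", overwrite="true", commitWithin="10000")

# Attribute on <doc>: a document-level boost.
doc = ET.SubElement(add, "doc", boost="2.5")

employee_id = ET.SubElement(doc, "field", name="employeeId")
employee_id.text = "05991"

# Attribute on <field>: a field-level boost.
office = ET.SubElement(doc, "field", name="office", boost="2.0")
office.text = "Bridgewater"

boosted_message = ET.tostring(add, encoding="unicode")
print(boosted_message)
```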
"commit" and "optimize"
Example:
<commit/>
<optimize/>
Optional attributes for "commit" and "optimize"
maxSegments = N — default is '1' — optimizes down to at most this number of segments (Solr 1.3)
waitFlush = "true" | "false" — default is true — block until index changes are flushed to disk
waitSearcher = "true" | "false" — default is true — block until a new searcher is opened and registered as the main query searcher, making the changes visible.
expungeDeletes = "true" | "false" — default is false — merge away segments that contain deletes. (Solr 1.4)
Example of "commit" and "optimize" with optional attributes:
<commit waitFlush="false" waitSearcher="false"/>
<commit waitFlush="false" waitSearcher="false" expungeDeletes="true"/>
<optimize waitFlush="false" waitSearcher="false"/>
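A small helper can render these commands from keyword arguments. The function name build_command is hypothetical, and the sketch does not check whether a given attribute is valid for the chosen command; it only turns Python booleans into the "true"/"false" strings used above.

```python
import xml.etree.ElementTree as ET


def build_command(name, **options):
    """Build a <commit/> or <optimize/> message (hypothetical helper)."""
    if name not in ("commit", "optimize"):
        raise ValueError("expected 'commit' or 'optimize', got %r" % name)
    attrs = {}
    for key, value in options.items():
        if isinstance(value, bool):
            # Solr expects lowercase "true"/"false", not Python's "True"/"False".
            attrs[key] = "true" if value else "false"
        else:
            attrs[key] = str(value)
    return ET.tostring(ET.Element(name, attrs), encoding="unicode")


print(build_command("commit", waitFlush=False, waitSearcher=False))
print(build_command("optimize", maxSegments=10, waitFlush=False))
```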
Passing commit parameters as part of the URL
Update handlers can also get commit related parameters as part of the update URL. This example adds a small test document and causes a commit to happen after:
curl 'http://localhost:8983/solr/update?commit=true' -H "Content-Type: text/xml" --data-binary '<add><doc><field name="id">testdoc</field></doc></add>'
This example will cause the index to be optimized down to at most 10 segments, but won't wait around until it's done (waitFlush=false):
curl 'http://localhost:8983/solr/update?optimize=true&maxSegments=10&waitFlush=false'
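When such URLs are built in code, the query string can be assembled with a URL library instead of by hand. The sketch below assumes the same localhost URL as the examples above; the helper name update_url is hypothetical.

```python
from urllib.parse import urlencode


def update_url(base="http://localhost:8983/solr/update", **params):
    """Append commit-related parameters to the update URL (hypothetical helper)."""
    normalized = {}
    for key, value in params.items():
        if isinstance(value, bool):
            # Render booleans the way the examples above do: "true" / "false".
            normalized[key] = "true" if value else "false"
        else:
            normalized[key] = value
    query = urlencode(normalized)
    return base + "?" + query if query else base


print(update_url(commit=True))
print(update_url(optimize=True, maxSegments=10, waitFlush=False))
```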
"rollback"
Example:
<rollback/>
The rollback command rolls back all adds and deletes made to the index since the last commit. It neither calls any event listeners nor creates a new searcher.
"delete" by ID and by Query
Delete by id deletes the document with the specified ID. (ID here means the value of the uniqueKey field declared in the schema; in these examples, employeeId.)
Delete by query deletes all the documents that match the specified query.
Example:
<delete><id>05991</id></delete>
<delete><query>office:Bridgewater</query></delete>
In Solr 1.2, delete by query is much less efficient than delete by id, because Solr has to do much of the commit logic each time it receives a delete by query request. In Solr 1.3, however, most of that overhead has been removed.
As of Solr 1.4, both delete by id and delete by query can be specified at the same time.
Example:
<delete> <id>05991</id><id>06000</id> <query>office:Bridgewater</query> <query>office:Osaka</query> </delete>
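A combined delete message like the one above could be generated as follows. This is a sketch with a hypothetical helper name, build_delete; it simply emits one <id> element per ID followed by one <query> element per query.

```python
import xml.etree.ElementTree as ET


def build_delete(ids=(), queries=()):
    """Build a <delete> message from uniqueKey values and queries (hypothetical helper)."""
    delete = ET.Element("delete")
    for doc_id in ids:
        ET.SubElement(delete, "id").text = str(doc_id)
    for query in queries:
        ET.SubElement(delete, "query").text = query
    return ET.tostring(delete, encoding="unicode")


delete_message = build_delete(
    ids=["05991", "06000"],
    queries=["office:Bridgewater", "office:Osaka"],
)
print(delete_message)
```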
Optional attributes for "delete"
(deprecated) fromPending = "true" | "false" — default is "true"
(deprecated) fromCommitted = "true" | "false" — default is "true"
Updating a Data Record via curl
You can use curl to send any of the above commands. For example:
curl http://<hostname>:<port>/solr/update -H "Content-Type: text/xml" --data-binary '<add> <doc boost="2.5"> <field name="employeeId">05991</field> <field name="office" boost="2.0">Bridgewater</field> </doc> </add>'
curl http://<hostname>:<port>/solr/update -H "Content-Type: text/xml" --data-binary '<commit waitFlush="false" waitSearcher="false"/>'
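The same POST can be prepared from Python's standard library. In this sketch the default URL, the helper name make_update_request, and the explicit charset in the Content-Type header are assumptions rather than anything the curl examples require; the request is only constructed here, and the commented lines show how it might be sent to a running Solr instance.

```python
import urllib.request


def make_update_request(xml_message, url="http://localhost:8983/solr/update"):
    """Prepare a POST of an XML update message (hypothetical helper)."""
    return urllib.request.Request(
        url,
        data=xml_message.encode("utf-8"),
        # The curl examples send "text/xml"; the charset parameter is an added assumption.
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


request = make_update_request('<commit waitFlush="false" waitSearcher="false"/>')
print(request.get_method(), request.full_url)

# To send it (requires a running Solr instance at the URL above):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.read().decode("utf-8"))
```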
Until a commit has been issued, you will not see any of the data in searches either on the master or the slave. After a commit has been issued, you will see the results on the master, then after a snapshot has been pulled by the slave, you will see it there also.
Updating a Data Record via GET
Short update requests can also be sent as a GET request, with the URL-encoded XML message passed in the stream.body parameter:
http://localhost:8983/solr/update?stream.body=%3Cdelete%3E%3Cquery%3Eoffice:Bridgewater%3C/query%3E%3C/delete%3E
http://localhost:8983/solr/update?stream.body=%3Ccommit/%3E
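The URL encoding can be done with a library call rather than by hand. The following sketch uses a hypothetical helper name, stream_body_url, and leaves ':' and '/' unescaped so that its output matches the URLs shown above.

```python
from urllib.parse import quote


def stream_body_url(xml_message, base="http://localhost:8983/solr/update"):
    """Build a GET URL that carries an XML message in stream.body (hypothetical helper)."""
    # ':' and '/' are kept as-is to mirror the examples; '<', '>', '&', '=' and spaces are escaped.
    return base + "?stream.body=" + quote(xml_message, safe=":/")


print(stream_body_url("<delete><query>office:Bridgewater</query></delete>"))
print(stream_body_url("<commit/>"))
```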