XML Messages for Updating a Solr Index
Solr accepts XML messages POSTed to the URL /update. These messages add or update documents, commit changes, delete documents by ID, and delete documents by query. Here is the XML syntax that Solr expects to see:
The Update Schema
(Not to be confused with schema.xml.)
add/update
Example:
<add>
  <doc>
    <field name="employeeId">05991</field>
    <field name="office">Bridgewater</field>
    <field name="skills">Perl</field>
    <field name="skills">Java</field>
  </doc>
  [<doc> ... </doc>[<doc> ... </doc>]]
</add>
Subversion contains many complex examples of <add> document messages.
Note: multiple documents may be specified in a single <add> command.
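As a rough illustration of the message shape described above, the following Python sketch assembles an <add> message containing several documents using the standard library's xml.etree.ElementTree. The helper name build_add_message and the second document's values are hypothetical and not part of Solr; the sketch only builds the XML string and does not send it anywhere.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build an <add> message from a list of dicts.

    Each dict maps a field name to a single value or to a list of values.
    (Hypothetical helper for illustration; not part of Solr.)
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            # A multi-valued field is written as repeated <field> elements.
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(add, encoding="unicode")


message = build_add_message([
    {"employeeId": "05991", "office": "Bridgewater", "skills": ["Perl", "Java"]},
    # Second document is made up purely to show a multi-document <add>.
    {"employeeId": "05992", "office": "Bridgewater"},
])
print(message)
```

Building the message with an XML library rather than string concatenation means characters such as & and < in field values are escaped automatically.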
Optional attributes for "add"
allowDups = "true" | "false" — default is "false"
(deprecated) overwritePending = "true" | "false" — default is negation of allowDups
(deprecated) overwriteCommitted = "true"|"false" — default is negation of allowDups
The defaults for overwritePending and overwriteCommitted are derived from allowDups so that the three attributes stay consistent with one another:
If allowDups is false (overwrite any duplicates), it implies that overwritePending and overwriteCommitted are true by default.
If allowDups is true (allow addition of duplicates), it implies that overwritePending and overwriteCommitted are false by default.
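The defaulting rule above can be modelled in a few lines of Python. This is only a sketch of the documented behaviour, not Solr's actual code, and the function name effective_overwrite_flags is hypothetical.

```python
def effective_overwrite_flags(allow_dups=False,
                              overwrite_pending=None,
                              overwrite_committed=None):
    """Return the effective attribute values for an <add> command.

    None means "attribute not specified"; in that case the default is the
    negation of allowDups, as described in the text above.
    (Illustrative model only; not taken from Solr's source.)
    """
    default = not allow_dups
    return {
        "allowDups": allow_dups,
        "overwritePending": default if overwrite_pending is None else overwrite_pending,
        "overwriteCommitted": default if overwrite_committed is None else overwrite_committed,
    }


# No attributes given: duplicates are overwritten.
print(effective_overwrite_flags())
# allowDups="true": both overwrite flags default to false.
print(effective_overwrite_flags(allow_dups=True))
```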
Optional attributes on "doc"
boost = <float> — default is 1.0 (See Lucene docs for definition of boost.)
NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml) for any fields where the index-time boost should be stored.
Optional attributes for "field"
boost = <float> — default is 1.0 (See Lucene docs for definition of boost.)
NOTE: make sure norms are enabled (omitNorms="false" in the schema.xml) for any fields where the index-time boost should be stored.
Example of "add" with optional attributes:
<add>
  <doc boost="2.5">
    <field name="employeeId">05991</field>
    <field name="office" boost="2.0">Bridgewater</field>
  </doc>
</add>
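For readers generating such messages programmatically, here is a minimal Python sketch that attaches the optional boost attributes to the <doc> and <field> elements. The helper build_boosted_add and its parameters are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_boosted_add(fields, doc_boost=None, field_boosts=None):
    """Build a single-document <add> message with optional boosts.

    fields:       list of (name, value) pairs
    doc_boost:    float for the <doc> boost attribute, or None to omit it
    field_boosts: dict mapping field name -> float boost
    (Hypothetical helper; not part of Solr.)
    """
    field_boosts = field_boosts or {}
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    if doc_boost is not None:
        doc.set("boost", str(doc_boost))
    for name, value in fields:
        field = ET.SubElement(doc, "field", name=name)
        if name in field_boosts:
            # Only boosted fields carry the attribute; others use the default 1.0.
            field.set("boost", str(field_boosts[name]))
        field.text = value
    return ET.tostring(add, encoding="unicode")


message = build_boosted_add(
    [("employeeId", "05991"), ("office", "Bridgewater")],
    doc_boost=2.5,
    field_boosts={"office": 2.0},
)
print(message)
```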
"commit" and "optimize"
Example:
<commit/>
<optimize/>
Optional attributes for "commit" and "optimize"
waitFlush = "true" | "false" — default is true — block until index changes are flushed to disk
waitSearcher = "true" | "false" — default is true — block until a new searcher is opened and registered as the main query searcher, making the changes visible.
Example of "commit" and "optimize" with optional attributes:
<commit waitFlush="false" waitSearcher="false"/>
<optimize waitFlush="false" waitSearcher="false"/>
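The commit and optimize messages differ only in their element name, so one small helper can produce both. The Python sketch below is one possible way to do that; build_index_command is a hypothetical name, and the choice to omit attributes that equal the documented default of true is a design assumption, not a Solr requirement.

```python
import xml.etree.ElementTree as ET


def build_index_command(command, wait_flush=True, wait_searcher=True):
    """Build a <commit/> or <optimize/> message as a string.

    Attributes are only written when they differ from the documented
    default ("true"). (Hypothetical helper; not part of Solr.)
    """
    if command not in ("commit", "optimize"):
        raise ValueError("command must be 'commit' or 'optimize'")
    el = ET.Element(command)
    if not wait_flush:
        el.set("waitFlush", "false")
    if not wait_searcher:
        el.set("waitSearcher", "false")
    return ET.tostring(el, encoding="unicode")


print(build_index_command("commit"))
print(build_index_command("optimize", wait_flush=False, wait_searcher=False))
```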
"delete" by ID and by Query
Delete by id deletes the document with the specified ID. (ID here means the value of the uniqueKey field declared in the schema; in these examples, employeeId.)
Delete by query deletes all the documents that match the specified query.
Example:
<delete><id>05991</id></delete>
<delete><query>office:Bridgewater</query></delete>
In Solr 1.2, delete by query is much less efficient than delete by id, because Solr has to perform much of the commit logic each time it receives a delete by query request. In Solr 1.3, however, most of that overhead will have been removed.
Optional attributes for "delete"
(deprecated) fromPending = "true" | "false" — default is "true"
(deprecated) fromCommitted = "true" | "false" — default is "true"
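The two delete forms can be generated in the same way as the other messages. The following Python sketch uses hypothetical helper names (build_delete_by_id, build_delete_by_query) and leaves out the deprecated attributes; it relies on ElementTree to escape any XML special characters that appear in a query.

```python
import xml.etree.ElementTree as ET


def _build_delete(child_tag, text):
    """Build <delete><child_tag>text</child_tag></delete> as a string."""
    delete = ET.Element("delete")
    child = ET.SubElement(delete, child_tag)
    child.text = text
    return ET.tostring(delete, encoding="unicode")


def build_delete_by_id(doc_id):
    """Delete the document whose uniqueKey value equals doc_id."""
    return _build_delete("id", str(doc_id))


def build_delete_by_query(query):
    """Delete every document matching the given query string."""
    return _build_delete("query", query)


print(build_delete_by_id("05991"))
print(build_delete_by_query("office:Bridgewater"))
```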
Updating a Data Record via curl
You can use curl to send any of the above commands. For example:
curl http://<hostname>:<port>/update -H "Content-Type: text/xml" --data-binary '<add> <doc boost="2.5"> <field name="employeeId">05991</field> <field name="office" boost="2.0">Bridgewater</field> </doc> </add>'
curl http://<hostname>:<port>/update -H "Content-Type: text/xml" --data-binary '<commit waitFlush="false" waitSearcher="false"/>'
Until a commit has been issued, none of the new data appears in searches on either the master or the slave. After a commit has been issued, the results are visible on the master; they become visible on the slave once it has pulled a snapshot.
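The same add-then-commit sequence can be sketched in Python with the standard library's urllib.request instead of curl. The helper names below are hypothetical, and the base URL in the commented usage (http://localhost:8983/solr) is an assumption about a local installation; substitute whatever host, port, and path your Solr instance actually uses.

```python
import urllib.request


def make_update_request(base_url, xml_message):
    """Prepare (but do not send) a POST of xml_message to <base_url>/update.

    Mirrors the curl examples: the body is sent as-is with a
    Content-Type of text/xml. (Hypothetical helper; not part of Solr.)
    """
    return urllib.request.Request(
        base_url.rstrip("/") + "/update",
        data=xml_message.encode("utf-8"),
        headers={"Content-Type": "text/xml"},
        method="POST",
    )


def post_update(base_url, xml_message, timeout=10):
    """Send the message and return (HTTP status, response body)."""
    request = make_update_request(base_url, xml_message)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.read().decode("utf-8")


# Usage against a running server (base URL is an assumption):
# base = "http://localhost:8983/solr"
# post_update(base, '<add><doc><field name="employeeId">05991</field></doc></add>')
# post_update(base, '<commit/>')  # the added data is not searchable until this commit
```

Separating request construction from sending makes it possible to inspect the URL, headers, and body without a live server.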