SimplePostTool, also called post.jar, is a simple, self-contained command-line tool for indexing data into Solr. It is not meant for production use; it is a quick way to get up to speed.
post.jar resides inside the Solr distribution, in the folder "example/exampledocs". It is written as a single .java file (see SVN) without dependencies, so it deliberately does not use SolrJ.
The tool can index structured XML, JSON and CSV files as well as a file tree of rich text documents. It also includes a simple web crawler.
Note that you do not *need* to use this tool to index data into Solr. Solr uses the standard HTTP protocol, so you can use any tool or library capable of communicating over HTTP GET/POST, such as the popular curl tool.
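As a rough illustration of the curl alternative, the sketch below wraps a single HTTP POST in a small shell function. The function name solr_post, the DRY_RUN switch and the file name docs.xml are hypothetical, and the URL assumes a default local Solr install; adjust it to your own setup.

```shell
#!/bin/sh
# Assumption: Solr runs locally and exposes its update handler at this URL.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr/update}"

# Hypothetical helper: POST one file to Solr with curl and commit.
#   $1 = file to send, $2 = content type (defaults to application/xml)
solr_post() {
  file="$1"
  type="${2:-application/xml}"
  # Build the curl command line as the positional parameters
  set -- curl "$SOLR_URL?commit=true" -H "Content-Type: $type" --data-binary "@$file"
  if [ -n "${DRY_RUN:-}" ]; then
    # Dry run: only print the command (shown without shell quoting)
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, `DRY_RUN=1 solr_post docs.xml` prints the curl command without contacting Solr, while `solr_post docs.json application/json` would send a JSON file.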
Usage
java [SystemProperties] -jar post.jar [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]
Examples
Get full help:
cd solr/example/exampledocs
java -jar post.jar -h
Post all XML files in the current folder, which must be in Solr's Update XML format:
java -jar post.jar *.xml
Send XML instructions directly on the command line, e.g. to delete a document:
java -Ddata=args -jar post.jar '<delete><id>42</id></delete>'
Post JSON documents, specifying the content type:
java -Dtype=application/json -jar post.jar *.json
Post all CSV, XML, JSON and PDF documents using AUTO mode, which detects the type based on the file name:
java -Dauto -jar post.jar *.csv *.xml *.json *.pdf
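To make the idea of detecting the type from the file name concrete, here is a minimal sketch of extension-based detection. It is not the tool's actual lookup table; the function name guess_type, the chosen content types and the sample file names are assumptions for illustration only.

```shell
#!/bin/sh
# Illustrative only: guess a content type from the file extension,
# in the spirit of AUTO mode. The mapping is an assumption, not post.jar's own table.
# Matching is case-sensitive, so "REPORT.PDF" falls through to the default.
guess_type() {
  case "$1" in
    *.xml)  echo "application/xml" ;;
    *.json) echo "application/json" ;;
    *.csv)  echo "text/csv" ;;
    *.pdf)  echo "application/pdf" ;;
    *)      echo "application/octet-stream" ;;
  esac
}

# Hypothetical file names, one per supported extension
for f in books.csv hd.xml books.json manual.pdf; do
  printf '%s -> %s\n' "$f" "$(guess_type "$f")"
done
```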
Post all content of a folder recursively, with auto detection of the file type and selection of the correct handler:
java -Dauto -Drecursive -jar post.jar my-folder
Post a folder with auto detection, but only index PPT and HTML file types (add -Drecursive, as in the previous example, to descend into subfolders):
java -Dauto -Dfiletypes=ppt,html -jar post.jar my-folder
Send the contents of a URL:
java -Ddata=web -jar post.jar http://example.no/
Crawl a web site recursively (default 1 level):
java -Ddata=web -Drecursive -jar post.jar http://example.no/