Differences between revisions 8 and 9
Revision 8 as of 2013-05-21 15:12:13
Size: 4286
Editor: ShawnHeisey
Comment: Core discovery doesn't work right in 4.3. Fix will be in 4.4.
Revision 9 as of 2013-05-21 17:28:53
Size: 4824
Comment: Added solr.xml (new style) example.
Line 8: Line 8:
Essentially, cores are no longer defined in solr.xml. In fact, solr.xml is no longer necessary at all and will be obsoleted in Solr 5.x. As of Solr 4.3 the process is as follows:
 * If a solr.xml file is found in <SOLR_HOME>, then it is expected to be the old-style solr.xml that defines cores etc.
 * If there is no solr.xml but there ''is'' a solr.properties file, then exploration-based core enumeration is assumed.
 * If neither a solr.xml nor a solr.properties file is found, a default solr.xml file is assumed. NOTE: as of 5.0, this will not be true and an error will be thrown if no solr.properties file is found.
 * If a solr.properties file is found, all the information that used to be available in solr.xml is defined here, ''except'' that there should be no core information. It's a straightforward mapping from the old solr.xml <solr> and <cores> nodes to the properties file version. See [[Solr.xml (pre 5.x)]] for their meanings. <!> TODO: put the rest of the definitions here rather than reference an obsolete page!
  * persistent [true|false] Whether to persist any changes to the files on disk, both solr.properties and the individual core.properties files. Defaults to false.
  * sharedLib
  * zkHost
  * cores.hostPort
  * cores.adminPath
  * cores.defaultCoreName
  * cores.shareSchema - soon to be obsolete. Real Soon Now (see SOLR-4478), configSets will be implemented. This will implicitly support sharing schemas, solrconfig, etc in alignment with SolrCloud initiatives.
  * cores.hostContext
  * cores.zkClientTimeout
  * cores.transientCacheSize - The number of cores that can be loaded before closing older ones on an LRU basis. Only cores defined with transient=true (see below) are placed in this cache.
  * cores.adminHandler
Cores are no longer necessarily defined in solr.xml. For the near term, solr.xml can exist in either the new or the old style. Whether solr.xml is interpreted as new-style or old-style depends on the presence or absence of a <cores> tag.
 * If there is a <cores> element, it is assumed to be an old-style solr.xml.
 * If there is ''not'' a <cores> tag, then this is assumed to be a new-style solr.xml file and the cores are enumerated from SOLR_HOME (i.e. auto discovery).
 * In either case, if patterns from the other style are detected (say "coreLoadThreads" is found as an attribute of the <solr> tag in a new-style solr.xml file), errors are thrown.
Line 25: Line 13:
== Format for new-style solr.xml ==
The solr.xml file has a new format; here is a sample. NUMBERS MADE UP FOR NOW!
Line 26: Line 16:
{{{
<solr>
  <str name="adminHandler">handler</str>
  <int name="coreLoadThreads">3</int>
  <str name="coreRootDirectory">path to core root</str>
  <str name="managementPath">path</str>
  <str name="sharedLib">lib to share</str>
  <!-- NOTE: shareSchema may be replaced by using config sets -->
  <str name="shareSchema">true|false</str>
  <int name="transientCacheSize">25</int>

  <solrcloud>
    <int name="distribUpdateConnTimeout">30000</int>
    <int name="distribUpdateSoTimeout">15000</int>
    <str name="host">url for host</str>
    <str name="hostContext">solr</str>
    <int name="hostPort">7689</int>
    <int name="leaderVoteWait">15000</int>
    <int name="zkClientTimeout">30000</int>
    <str name="zkHost">url to host</str>
  </solrcloud>

  <logging>
    <str name="class">my.class.for.logging</str>
    <str name="enabled">true|false</str>
    <watcher>
      <int name="size">12</int>
      <int name="threshold">42</int>
    </watcher>
  </logging>

  <shardHandlerFactory class="qualified.path" name="handlerName">
    <int name="connTimeout">15000</int>
    <int name="socketTimeout">15000</int>
  </shardHandlerFactory>
</solr>
}}}
Line 27: Line 52:
== Finding cores ==
Line 30: Line 56:
 * instanceDir - the directory for this core. Usually you should just leave this out; it defaults to the directory that core.properties is in.
 * instanceDir - the directory for this core. Usually you should just leave this out; it defaults to the directory that core.properties is in. ''This is being debated: it may not be allowed, in which case the instanceDir would be the directory in which core.properties is found.''
Line 41: Line 67:
 * configName - ''tentative'': a name (expected to be in <SOLR_HOME>/configs) that contains all of the configuration files (schema.xml, solrconfig.xml and all supporting files, e.g. synonyms.txt). Multiple collections can use the same set of configuration files.

Discovery-based core enumeration

<!> Solr4.4 - there's a new way of defining cores. Core discovery was introduced in 4.3.0 by SOLR-4196, but it is fundamentally broken in all 4.3 versions and not fixed until 4.4. Using the old style solr.xml is recommended until Solr4.4 is released. If you want to experiment with it now, download the source code or a nightly build for branch_4x.

TODO: Lots to flesh out here, this is just barely a start for the docs, consider it a placeholder that will be expanded as time permits. Please ask any questions that come up on the user's list or whatever, and we'll expand this page. Or edit it yourself, all help appreciated!

Cores are no longer necessarily defined in solr.xml. For the near term, solr.xml can exist in either the new or the old style. Whether solr.xml is interpreted as new-style or old-style depends on the presence or absence of a <cores> tag.

  • If there is a <cores> element, it is assumed to be an old-style solr.xml.

  • If there is not a <cores> tag, then this is assumed to be a new-style solr.xml file and the cores are enumerated from SOLR_HOME (i.e. auto discovery).

  • In either case, if patterns from the other style are detected (say "coreLoadThreads" is found as an attribute of the <solr> tag in a new-style solr.xml file), errors are thrown.
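
The rules above can be sketched in a few lines of Python. This is only an illustration of the decision described here, not Solr's actual implementation: the function name solr_xml_style and the OLD_STYLE_ATTRIBUTES set are hypothetical, and the set contains only the one attribute that this page names.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of attributes that must not appear on <solr> in a
# new-style file. Only "coreLoadThreads" is named on this page.
OLD_STYLE_ATTRIBUTES = {"coreLoadThreads"}


def solr_xml_style(xml_text):
    """Return "old" if <solr> has a <cores> child, otherwise "new".

    Raises ValueError when a new-style file carries an attribute that
    belongs to the other style (the mixed-pattern case).
    """
    root = ET.fromstring(xml_text)
    if root.tag != "solr":
        raise ValueError("expected a <solr> root element")

    # The presence of a <cores> element selects the old style.
    style = "old" if root.find("cores") is not None else "new"

    if style == "new":
        misplaced = OLD_STYLE_ATTRIBUTES & set(root.attrib)
        if misplaced:
            raise ValueError(
                "attributes not allowed on <solr> in a new-style file: "
                + ", ".join(sorted(misplaced))
            )
    return style
```

For example, a file whose root contains a <cores> element is classified as "old", while a file with only <str>/<int> children is classified as "new" and its cores would then be found by discovery.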

Format for new-style solr.xml

The solr.xml file has a new format; here is a sample. NUMBERS MADE UP FOR NOW!

<solr>
  <str name="adminHandler">handler</str>
  <int name="coreLoadThreads">3</int>
  <str name="coreRootDirectory">path to core root</str>
  <str name="managementPath">path</str>
  <str name="sharedLib">lib to share</str>
  <!-- NOTE: shareSchema may be replaced by using config sets -->
  <str name="shareSchema">true|false</str>
  <int name="transientCacheSize">25</int>

  <solrcloud>
    <int name="distribUpdateConnTimeout">30000</int>
    <int name="distribUpdateSoTimeout">15000</int>
    <str name="host">url for host</str>
    <str name="hostContext">solr</str>
    <int name="hostPort">7689</int>
    <int name="leaderVoteWait">15000</int>
    <int name="zkClientTimeout">30000</int>
    <str name="zkHost">url to host</str>
  </solrcloud>

  <logging>
    <str name="class">my.class.for.logging</str>
    <str name="enabled">true|false</str>
    <watcher>
      <int name="size">12</int>
      <int name="threshold">42</int>
    </watcher>
  </logging>

  <shardHandlerFactory class="qualified.path" name="handlerName">
    <int name="connTimeout">15000</int>
    <int name="socketTimeout">15000</int>
  </shardHandlerFactory>
</solr>
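
To show how the typed <str> and <int> elements in this format can be read, here is a small Python sketch. It is not how Solr itself loads the file; the helper name read_settings, the CONVERTERS table, and the trimmed SAMPLE document are all assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET

# A trimmed, well-formed excerpt of the new-style format (values made up).
SAMPLE = """<solr>
  <int name="coreLoadThreads">3</int>
  <str name="sharedLib">lib</str>
  <solrcloud>
    <int name="hostPort">7689</int>
    <str name="hostContext">solr</str>
  </solrcloud>
</solr>"""

# Map each element tag to the Python type used for its text content.
CONVERTERS = {"str": str, "int": int}


def read_settings(element):
    """Collect the direct <str>/<int> children of element into a dict."""
    settings = {}
    for child in element:
        name = child.get("name")
        if child.tag in CONVERTERS and name is not None:
            settings[name] = CONVERTERS[child.tag]((child.text or "").strip())
    return settings


root = ET.fromstring(SAMPLE)
top_level = read_settings(root)                # settings directly under <solr>
cloud = read_settings(root.find("solrcloud"))  # settings under <solrcloud>
```

Each section (<solrcloud>, <logging>, <shardHandlerFactory>) can be read the same way by passing its element to the helper.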

Finding cores

Exploration of the core tree terminates when a file named core.properties is encountered. Discovery of a file of that name is assumed to define the root of a core. There is no a-priori limit on the depth of the tree. That is, the directories under <SOLR_HOME> are explored until a core.properties file is encountered, and that directory is assumed to be the instanceDir for that core. Nested cores are NOT supported. The core.properties file presently recognizes the following entries:

  • name - the name of the core. Required.
  • config - the configuration file. Defaults to solrconfig.xml
  • instanceDir - the directory for this core. Usually you should just leave this out; it defaults to the directory that core.properties is in. This is being debated: it may not be allowed, in which case the instanceDir would be the directory in which core.properties is found.

  • dataDir - the directory where the index, tlog, etc. are stored. Again, since this is discovery-based, omit this unless you have special needs.
  • ulogDir - where the transaction log resides. It may be advantageous to put the transaction log on a different disk than the index.
  • schema - the schema file. Defaults to schema.xml
  • shard - the shard ID.
  • collection - the collection to which this core belongs
  • roles - SolrCloud role definition

  • properties - properties file to override core definitions. TBD: This is probably obsolete since we're reading a properties file in the first place. Is there a use case for supporting this now?
  • loadOnStartup - [true|false] this core should be loaded and a searcher opened when Solr starts.
  • transient - [true|false] this core may be unloaded if the core cache exceeds transientCacheSize (defined in solr.properties)
  • coreNodeName - SolrCloud core node name

  • configName - tentative: a name (expected to be in <SOLR_HOME>/configs) that contains all of the configuration files (schema.xml, solrconfig.xml and all supporting files, e.g. synonyms.txt). Multiple collections can use the same set of configuration files.
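
The discovery walk described in this section can be sketched in Python as follows. This is a rough model of the behavior as documented on this page, not Solr's code: the function names discover_cores and read_properties are hypothetical, the properties parser handles only plain key=value lines, and treating a missing name as an error simply follows the "Required" note above.

```python
import os


def read_properties(path):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            key, _, value = line.partition("=")
            props[key.strip()] = value.strip()
    return props


def discover_cores(solr_home):
    """Walk solr_home and return one dict of properties per core found."""
    cores = []
    for dirpath, dirnames, filenames in os.walk(solr_home):
        dirnames.sort()  # deterministic traversal order
        if "core.properties" not in filenames:
            continue
        props = read_properties(os.path.join(dirpath, "core.properties"))
        if "name" not in props:
            raise ValueError("core.properties in %s has no name" % dirpath)
        # Defaults as listed above.
        props.setdefault("instanceDir", dirpath)
        props.setdefault("config", "solrconfig.xml")
        props.setdefault("schema", "schema.xml")
        cores.append(props)
        # Exploration stops at a core root, so nested cores are never seen.
        dirnames[:] = []
    return cores
```

Clearing dirnames in place is what stops os.walk from descending below a directory that contains core.properties, which mirrors the rule that nested cores are not supported.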

Core Discovery (4.4 and beyond) (last edited 2018-03-02 22:21:05 by ShawnHeisey)