Differences between revisions 4 and 5
Revision 4 as of 2006-11-14 20:33:09
Size: 4484
Editor: HossMan
Comment:
Revision 5 as of 2006-11-14 22:45:43
Size: 7753
Editor: HossMan
Comment:
Line 60: Line 60:
The [http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html Similarity] is a native Lucene concept that determines how much of the score calculations for the various types of queries are executed. For more information on how the methods in the Similarity class are used, consult the [http://lucene.apache.org/java/docs/scoring.html Lucene scoring documentation]. If you wish to override the !DefaultSimilarity provided by Lucene, you can specify your own subclass in your [wiki:SchemaXml schema.xml]... The [http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html Similarity] class is a native Lucene concept that determines how much of the score calculations for the various types of queries are executed. For more information on how the methods in the Similarity class are used, consult the [http://lucene.apache.org/java/docs/scoring.html Lucene scoring documentation]. If you wish to override the !DefaultSimilarity provided by Lucene, you can specify your own subclass in your [wiki:SchemaXml schema.xml]...
Line 69: Line 69:
The [http://incubator.apache.org/solr/docs/api/org/apache/solr/search/CacheRegenerator.html CacheRegenerator] API allows people who are writing custom !SolrRequestHandlers which utilize custom [wiki:SolrCaching User Caches] to specify how those caches should be populated during autowarming. A regenerator class can be specified hen the cache is declared in your [wiki:SolrConfigXml solrconfig.xml]... The [http://incubator.apache.org/solr/docs/api/org/apache/solr/search/CacheRegenerator.html CacheRegenerator] API allows people who are writing custom !SolrRequestHandlers which utilize custom [wiki:SolrCaching User Caches] to specify how those caches should be populated during autowarming. A regenerator class can be specified when the cache is declared in your [wiki:SolrConfigXml solrconfig.xml]...
Line 86: Line 86:
=== TokenizerFactory ===
=== TokenFilterFactory ===

The [http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/Analyzer.html Analyzer] class is a native Lucene concept that determines how tokens are produced from a piece of text. Solr allows Analyzers to be specified for each fieldtype in your [wiki:SchemaXml schema.xml] that uses the !TextField class -- and even allows different Analyzers to be specified for indexing text as documents are added and for parsing text specified in a query...

{{{
    <fieldtype name="text_foo" class="solr.TextField">
      <analyzer class="my.package.CustomAnalyzer"/>
    </fieldtype>
    <fieldtype name="text_bar" class="solr.TextField">
      <analyzer type="index" class="my.package.CustomAnalyzerForIndexing"/>
      <analyzer type="query" class="my.package.CustomAnalyzerForQuerying"/>
    </fieldtype>
}}}

Solr also provides a [http://incubator.apache.org/solr/docs/api/org/apache/solr/analysis/SolrAnalyzer.html SolrAnalyzer] base class which can be used if you want to write your own Analyzer and configure the "positionIncrementGap" in your schema.xml...

{{{
    <fieldtype name="text_baz" class="solr.TextField" positionIncrementGap="100">
      <analyzer class="my.package.CustomSolrAnalyzer" />
    </fieldtype>
}}}

Specifying an Analyzer class in your schema.xml makes a lot of sense if you already have an existing Analyzer you wish to use as is. If you are planning to write analysis code from scratch that you would like to use in Solr, keep reading the following sections...

=== Tokenizers and TokenFilters ===

In addition to accepting specific Analyzer classes, Solr can construct Analyzers on the fly for each field type using a [http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/Tokenizer.html Tokenizer] and any number of [http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/TokenFilter.html TokenFilters]. To take advantage of this functionality with any Tokenizers or !TokenFilters you may have (or may want to implement), you'll need to provide a [http://incubator.apache.org/solr/docs/api/org/apache/solr/analysis/TokenizerFactory.html TokenizerFactory] and [http://incubator.apache.org/solr/docs/api/org/apache/solr/analysis/TokenFilterFactory.html TokenFilterFactory] which take care of any initialization and configuration, and specify these Factories in your [wiki:SchemaXml schema.xml]...

{{{
    <fieldtype name="text_zop" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
          <tokenizer class="my.package.CustomTokenizerFactory"/>
          <!-- this TokenFilterFactory has custom options -->
          <filter class="my.package.CustomTokenFilterFactory" optA="yes" optB="maybe" optC="42.5"/>
          <!-- Solr has many existing FilterFactories that you can reuse -->
          <filter class="solr.StopFilterFactory" ignoreCase="true"/>
      </analyzer>
    </fieldtype>
}}}
Line 89: Line 126:

If you have very specialized data type needs, you can specify your own [http://incubator.apache.org/solr/docs/api/org/apache/solr/schema/FieldType.html FieldType] class for each `<fieldtype>` you declare in your [wiki:SchemaXml schema.xml], to control how the values for those fields are encoded in your index...

{{{
    <fieldtype name="wacko" class="my.package.CustomFieldType" />
}}}

Solr Plugins

Solr allows you to load custom code to perform a variety of tasks within Solr -- from custom Request Handlers to process your searches, to custom Analyzers and Token Filters for your text fields, even custom Field Types.

How to Load Plugins

Plugin code can be loaded into Solr by putting Jars containing your classes in a lib directory in your Solr Home directory prior to starting your servlet container.

This is a relatively new feature (as of 2006-11-13) which uses a custom Class Loader. It's not yet clear how well this approach works on the multitude of servlet containers available in the wild.
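
To make the idea of a lib directory class loader concrete, here is a minimal sketch in plain Java. It is not Solr's actual loader code: the class name `LibDirLoaderSketch`, its methods, and the default path `solr/lib` are all assumptions made for illustration. It only shows the general technique of collecting every jar in a directory and handing the list to a `URLClassLoader` whose parent is the web application's own loader.

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hypothetical helper, not Solr's actual class loader code.
class LibDirLoaderSketch {

    // Collect a URL for every *.jar file directly inside libDir, sorted by path.
    static List<URL> jarUrls(File libDir) {
        List<URL> urls = new ArrayList<>();
        File[] files = libDir.listFiles();
        if (files == null) {
            return urls; // directory missing or unreadable: nothing to load
        }
        Arrays.sort(files);
        for (File f : files) {
            if (f.isFile() && f.getName().endsWith(".jar")) {
                try {
                    urls.add(f.toURI().toURL());
                } catch (MalformedURLException e) {
                    throw new UncheckedIOException(e);
                }
            }
        }
        return urls;
    }

    // Build a loader that asks the parent first, then searches the collected jars.
    static URLClassLoader loaderFor(File libDir, ClassLoader parent) {
        return new URLClassLoader(jarUrls(libDir).toArray(new URL[0]), parent);
    }

    public static void main(String[] args) throws Exception {
        // "solr/lib" is an assumed location; pass the real lib directory as an argument.
        File libDir = new File(args.length > 0 ? args[0] : "solr/lib");
        try (URLClassLoader loader = loaderFor(libDir, LibDirLoaderSketch.class.getClassLoader())) {
            System.out.println("Jars visible to the plugin loader: " + Arrays.toString(loader.getURLs()));
        }
    }
}
```

Because the parent loader is consulted first, classes already shipped with the web application win over same-named classes in the lib directory. That delegation order is one reason behaviour can differ between servlet containers.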

The Old Way

Another method that works consistently on any servlet container is to:

  1. unpack the solr.war
  2. add a jar containing your custom classes to the WEB-INF/lib directory
  3. repack your new, customized solr.war and use it.

List of Classes that are 'Pluggable'

The following is a complete list of every API that can be treated as a plugin in Solr, with information on how to configure your Solr instance to use an instance of that class.

/!\ :TODO: /!\ for each class, link to javadocs and show example of where/how to configure usage

Request Processing

SolrRequestHandler

Instances of [http://incubator.apache.org/solr/docs/api/org/apache/solr/request/SolrRequestHandler.html SolrRequestHandler] define the logic that is executed for any request. Multiple handlers (including multiple instances of the same SolrRequestHandler class with different configurations) can be specified in your [wiki:SolrConfigXml solrconfig.xml]...

  <requestHandler name="foo" class="my.package.CustomRequestHandler" />
  <requestHandler name="bar" class="my.package.AnotherCustomRequestHandler" />
  <requestHandler name="baz" class="my.package.AnotherCustomRequestHandler">
    <!-- initialization args may optionally be defined here -->
     <lst name="defaults">
       <int name="rows">10</int>
       <str name="fl">*</str>
       <str name="version">2.1</str>
     </lst>
     <int name="someConfigValue">42</int>
  </requestHandler>

QueryResponseWriter

Instances of [http://incubator.apache.org/solr/docs/api/org/apache/solr/request/QueryResponseWriter.html QueryResponseWriter] define the formatting used to output the results of a request. Multiple writers (including multiple instances of the same QueryResponseWriter class with different configurations) can be specified in your [wiki:SolrConfigXml solrconfig.xml]...

  <queryResponseWriter name="wow" class="my.package.CustomResponseWriter" />
  <queryResponseWriter name="woz" class="my.package.AnotherCustomResponseWriter" />
  <queryResponseWriter name="wiz" class="my.package.AnotherCustomResponseWriter">
    <!-- initialization args may optionally be defined here -->
    <int name="someConfigValue">42</int>
  </queryResponseWriter> 

Similarity

The [http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Similarity.html Similarity] class is a native Lucene concept that determines how much of the score calculation for the various types of queries is carried out. For more information on how the methods in the Similarity class are used, consult the [http://lucene.apache.org/java/docs/scoring.html Lucene scoring documentation]. If you wish to override the DefaultSimilarity provided by Lucene, you can specify your own subclass in your [wiki:SchemaXml schema.xml]...

  <similarity class="my.package.CustomSimilarity"/>
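
As a rough illustration of what overriding a Similarity means, the sketch below uses stand-in classes rather than the real Lucene ones, so that it compiles on its own. `BaseSimilaritySketch` models just two scoring factors, loosely following the square-root formulas commonly described for DefaultSimilarity. `FlatTfSimilaritySketch` is a hypothetical subclass that ignores repeated occurrences of a term. A real plugin would extend Lucene's class and override the corresponding methods there.

```java
// Stand-in for a default similarity; only two scoring factors are modelled here.
class BaseSimilaritySketch {
    // Term-frequency factor: grows with the square root of the raw count.
    float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Length normalization: a match in a shorter field counts for more.
    float lengthNorm(String fieldName, int numTokens) {
        return (float) (1.0 / Math.sqrt(numTokens));
    }
}

// Hypothetical override: a term either occurs or it does not; repeats add nothing.
class FlatTfSimilaritySketch extends BaseSimilaritySketch {
    @Override
    float tf(float freq) {
        return freq > 0 ? 1.0f : 0.0f;
    }
}

class SimilaritySketchDemo {
    public static void main(String[] args) {
        BaseSimilaritySketch base = new BaseSimilaritySketch();
        BaseSimilaritySketch flat = new FlatTfSimilaritySketch();
        System.out.println("base tf(9) = " + base.tf(9f));
        System.out.println("flat tf(9) = " + flat.tf(9f));
    }
}
```

Only the overridden factor changes; every other factor keeps the inherited behaviour, which is why subclassing the default is usually less work than writing a Similarity from scratch.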

CacheRegenerator

The [http://incubator.apache.org/solr/docs/api/org/apache/solr/search/CacheRegenerator.html CacheRegenerator] API allows people who are writing custom SolrRequestHandlers which utilize custom [wiki:SolrCaching User Caches] to specify how those caches should be populated during autowarming. A regenerator class can be specified when the cache is declared in your [wiki:SolrConfigXml solrconfig.xml]...

    <cache name="myCustomCacheInstance"
      class="solr.LRUCache"
      size="4096"
      initialSize="1024"
      autowarmCount="1024"
      regenerator="my.package.CustomCacheRegenerator"
      />
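
The following sketch shows the idea behind autowarming with a regenerator, using only `java.util` types. The `Regenerator` interface and the `autowarm` method are hypothetical stand-ins, and the real CacheRegenerator interface has a different signature (see its javadocs). The sketch assumes the old cache iterates from least to most recently used, as a `LinkedHashMap` created with `accessOrder=true` does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Stand-in types only; the real Solr interfaces have different signatures.
class AutowarmSketch {

    // Hypothetical regenerator: decides how one old entry is rebuilt for the new cache.
    interface Regenerator<K, V> {
        // Return false to stop autowarming early.
        boolean regenerateItem(Map<K, V> newCache, K oldKey, V oldVal);
    }

    // Re-populate newCache from the autowarmCount most recently used entries of oldCache.
    // Assumes oldCache iterates from least to most recently used.
    static <K, V> void autowarm(LinkedHashMap<K, V> oldCache, Map<K, V> newCache,
                                int autowarmCount, Regenerator<K, V> regenerator) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(oldCache.entrySet());
        int start = Math.max(0, entries.size() - autowarmCount);
        for (Map.Entry<K, V> e : entries.subList(start, entries.size())) {
            if (!regenerator.regenerateItem(newCache, e.getKey(), e.getValue())) {
                break;
            }
        }
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Integer> oldCache = new LinkedHashMap<>(16, 0.75f, true);
        oldCache.put("q1", 1);
        oldCache.put("q2", 2);
        oldCache.put("q3", 3);
        Map<String, Integer> newCache = new LinkedHashMap<>();
        // This regenerator simply copies the old value; a real one would usually
        // recompute it against the new index.
        autowarm(oldCache, newCache, 2, (cache, key, val) -> {
            cache.put(key, val);
            return true;
        });
        System.out.println("Warmed keys: " + newCache.keySet());
    }
}
```

The regenerator receives the old key and value so it can choose between reusing the old value and recomputing it. Recomputing is the safer default when cached values depend on the contents of the index.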

Fields

Analyzer

The [http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/Analyzer.html Analyzer] class is a native Lucene concept that determines how tokens are produced from a piece of text. Solr allows Analyzers to be specified for each fieldtype in your [wiki:SchemaXml schema.xml] that uses the TextField class -- and even allows different Analyzers to be specified for indexing text as documents are added and for parsing text specified in a query...

    <fieldtype name="text_foo" class="solr.TextField">
      <analyzer class="my.package.CustomAnalyzer"/>
    </fieldtype>
    <fieldtype name="text_bar" class="solr.TextField">
      <analyzer type="index" class="my.package.CustomAnalyzerForIndexing"/>
      <analyzer type="query" class="my.package.CustomAnalyzerForQuerying"/>
    </fieldtype>

Solr also provides a [http://incubator.apache.org/solr/docs/api/org/apache/solr/analysis/SolrAnalyzer.html SolrAnalyzer] base class which can be used if you want to write your own Analyzer and configure the "positionIncrementGap" in your schema.xml...

    <fieldtype name="text_baz" class="solr.TextField" positionIncrementGap="100">
      <analyzer class="my.package.CustomSolrAnalyzer" />
    </fieldtype>

Specifying an Analyzer class in your schema.xml makes a lot of sense if you already have an existing Analyzer you wish to use as is. If you are planning to write analysis code from scratch that you would like to use in Solr, keep reading the following sections...

Tokenizers and TokenFilters

In addition to accepting specific Analyzer classes, Solr can construct Analyzers on the fly for each field type using a [http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/Tokenizer.html Tokenizer] and any number of [http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/TokenFilter.html TokenFilters]. To take advantage of this functionality with any Tokenizers or TokenFilters you may have (or may want to implement), you'll need to provide a [http://incubator.apache.org/solr/docs/api/org/apache/solr/analysis/TokenizerFactory.html TokenizerFactory] and [http://incubator.apache.org/solr/docs/api/org/apache/solr/analysis/TokenFilterFactory.html TokenFilterFactory] which take care of any initialization and configuration, and specify these Factories in your [wiki:SchemaXml schema.xml]...

    <fieldtype name="text_zop" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
          <tokenizer class="my.package.CustomTokenizerFactory"/>
          <!-- this TokenFilterFactory has custom options -->
          <filter class="my.package.CustomTokenFilterFactory" optA="yes" optB="maybe" optC="42.5"/>
          <!-- Solr has many existing FilterFactories that you can reuse -->
          <filter class="solr.StopFilterFactory" ignoreCase="true"/>
      </analyzer>
    </fieldtype>
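
The factory pattern described above can be sketched without any Lucene or Solr classes. Everything in this block is a stand-in invented for illustration: `TokenFilterStandIn`, `MinLengthFilterFactory`, and the `minLength` and `lowercase` options are not part of Solr. A real factory implements the TokenFilterFactory interface and wraps a Lucene TokenStream. What the sketch does show is that attributes from the `<filter>` element reach the factory as strings, so parsing and defaults belong in the factory's initialization.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Stand-in types only: none of these are the real Lucene or Solr classes.
class FilterFactorySketch {

    // Hypothetical filter: maps one list of tokens to another.
    interface TokenFilterStandIn {
        List<String> apply(List<String> tokens);
    }

    // Hypothetical factory, configured once from the attributes of a <filter> element.
    static class MinLengthFilterFactory {
        private int minLength = 1;
        private boolean lowercase = false;

        // Attribute values arrive as Strings, so parsing and defaults live here.
        void init(Map<String, String> args) {
            minLength = Integer.parseInt(args.getOrDefault("minLength", "1"));
            lowercase = Boolean.parseBoolean(args.getOrDefault("lowercase", "false"));
        }

        // Called whenever an analyzer chain is built; returns a configured filter.
        TokenFilterStandIn create() {
            final int min = minLength;
            final boolean lower = lowercase;
            return tokens -> {
                List<String> out = new ArrayList<>();
                for (String t : tokens) {
                    if (t.length() >= min) {
                        out.add(lower ? t.toLowerCase(Locale.ROOT) : t);
                    }
                }
                return out;
            };
        }
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new HashMap<>();
        attrs.put("minLength", "3");
        attrs.put("lowercase", "true");
        MinLengthFilterFactory factory = new MinLengthFilterFactory();
        factory.init(attrs);
        System.out.println(factory.create().apply(Arrays.asList("A", "Solr", "to", "Plugins")));
    }
}
```

Keeping configuration in the factory and per-stream work in the filter means one configured factory can produce a fresh filter each time an analyzer chain is built.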

FieldType

If you have very specialized data type needs, you can specify your own [http://incubator.apache.org/solr/docs/api/org/apache/solr/schema/FieldType.html FieldType] class for each <fieldtype> you declare in your [wiki:SchemaXml schema.xml], to control how the values for those fields are encoded in your index...

    <fieldtype name="wacko" class="my.package.CustomFieldType" />

Internals

SolrCache

SolrEventListener

UpdateHandler

SolrPlugins (last edited 2017-01-03 19:18:41 by ShawnHeisey)