Solr4.8

This page is outdated. Read about the Complex Phrase Query Parser in the Solr Reference Guide instead: https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-ComplexPhraseQueryParser

Overview

The Complex Phrase Query Parser plugin provides support for wildcards, ORs, etc. inside phrase queries.

From the Complex Phrase Query Parser javadocs:

QueryParser which permits complex phrase query syntax e.g. "(john jon jonathan~) peters*"

After indexing the example documents under example/exampledocs with the SimplePostTool utility ('java -jar post.jar *.xml'), the query string

q=manu:"a* c*"&defType=complexphrase

or

q={!complexphrase inOrder=true}manu:"a* c*"

returns the following (shown here as a full request URL and its response):

http://localhost:8983/solr/collection1/select?q=manu:%22a*%20c*%22&defType=complexphrase&fl=manu

<doc>
  <str name="manu">Apple Computer Inc.</str>
</doc>
<doc>
  <str name="manu">ASUS Computer Inc.</str>
</doc>
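
The request URL above needs the quotes and the space in the phrase percent-encoded. As a minimal sketch, assuming the same host, port, and collection1 core as the example, the URL can be built with Python's standard library. The helper name build_select_url is hypothetical and not part of Solr.

```python
from urllib.parse import urlencode, quote

# Assumed base URL, copied from the example request above.
SOLR_SELECT = "http://localhost:8983/solr/collection1/select"


def build_select_url(q, **params):
    """Return a select URL with q and any extra parameters percent-encoded.

    Hypothetical helper for illustration only.
    """
    query = {"q": q, **params}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return SOLR_SELECT + "?" + urlencode(query, quote_via=quote)


url = build_select_url('manu:"a* c*"', defType="complexphrase", fl="manu")
print(url)
```

This sketch also encodes ':' and '*', which the hand-written URL leaves as-is. Both forms decode to the same parameter values.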

The inOrder parameter can be set in two ways.

1) Its default value is true. To set it to false permanently, register the query parser under a different name in solrconfig.xml:

<!-- Un-ordered complex phrase query parser -->
<queryParser name="unorderedcomplexphrase" class="org.apache.solr.search.ComplexPhraseQParserPlugin">
  <bool name="inOrder">false</bool>
</queryParser>

2) At query time via LocalParams.

q={!complexphrase inOrder=false df=name}"bla* pla*"
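
Both ways of setting inOrder end up as a LocalParams prefix on the query string. The following sketch assembles such a prefix in Python. The function local_params_query is a hypothetical helper, and it only handles simple values that contain no spaces or quotes.

```python
def _format_value(value):
    """Render Python booleans the way the examples above write them."""
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def local_params_query(parser, body, **params):
    """Prefix body with a {!parser key=value ...} LocalParams block.

    Hypothetical helper: values must be simple tokens (no spaces or quotes).
    """
    parts = [parser] + [f"{k}={_format_value(v)}" for k, v in params.items()]
    return "{!" + " ".join(parts) + "}" + body


# Query-time override, matching the example above.
per_query = local_params_query("complexphrase", '"bla* pla*"', inOrder=False, df="name")

# Using the parser registered as "unorderedcomplexphrase" in solrconfig.xml.
registered = local_params_query("unorderedcomplexphrase", 'manu:"a* c*"')

print(per_query)
print(registered)
```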

To mix ordered and unordered clauses in the same query, wrap each clause in a nested _query_:

+_query_:"{!complexphrase inOrder=true}manu:\"a* c*\""  +_query_:"{!complexphrase inOrder=false df=name}\"bla* pla*\""  
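
The nested form requires escaping the inner double quotes, which is easy to get wrong by hand. A small sketch of that escaping step in Python follows. The helper nested_query is hypothetical and only covers backslashes and double quotes.

```python
def nested_query(sub_query):
    """Wrap sub_query in _query_:"..." and escape backslashes and double quotes.

    Hypothetical helper for illustration only.
    """
    escaped = sub_query.replace("\\", "\\\\").replace('"', '\\"')
    return '_query_:"' + escaped + '"'


ordered = nested_query('{!complexphrase inOrder=true}manu:"a* c*"')
unordered = nested_query('{!complexphrase inOrder=false df=name}"bla* pla*"')

# Both clauses are required, as in the example above.
q = "+" + ordered + " +" + unordered
print(q)
```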

This parser was added to Solr through SOLR-1604.

Limitations

maxBooleanClauses

You may need to increase

<maxBooleanClauses>1024</maxBooleanClauses>

in solrconfig.xml, depending on the size of your index, because

"a* c*"

is expanded into a SpanNearQuery with one clause per matching indexed term:

spanNear([spanOr([manu:a, manu:america, manu:apache, manu:apple, manu:asus, manu:ati]), spanOr([manu:canon, manu:co, manu:computer, manu:corp, manu:corsair])], 0, false)
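
To see why the clause count grows with the index, here is a toy Python model of the expansion. It is not Lucene's implementation: the term list is simply copied from the expansion above, and how Solr counts clauses against maxBooleanClauses is not modeled.

```python
# Toy list of indexed terms in the manu field, taken from the expansion above.
MANU_TERMS = [
    "a", "america", "apache", "apple", "asus", "ati",
    "canon", "co", "computer", "corp", "corsair",
]


def expand_prefix(pattern, terms):
    """Return the sorted terms that start with the prefix before the trailing '*'."""
    prefix = pattern.rstrip("*")
    return sorted(t for t in terms if t.startswith(prefix))


def expand_phrase(phrase, terms):
    """Expand each whitespace-separated prefix pattern into its matching terms."""
    return [expand_prefix(p, terms) for p in phrase.split()]


clauses = expand_phrase("a* c*", MANU_TERMS)
total = sum(len(c) for c in clauses)

for group in clauses:
    print(group)
print("total expanded terms:", total)
```

Every additional distinct term that starts with "a" or "c" adds one more clause, so a larger index produces a larger expansion for the same query.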

Stopwords

Let's say we add the stopwords "the", "up", and "to" to the collection1/conf/stopwords.txt file and re-index the example docs. While

q=features:"Stores up to 15,000"

returns "Stores up to 15,000 songs, 25,000 photos, or 150 hours of video",

q=features:"sto* up to 15*"&defType=complexphrase

does not return that document, because SpanNearQuery has no good way to handle stopwords in a way analogous to PhraseQuery. Using stopword elimination with this query parser is therefore not recommended. If you really have to remove stopwords, a workaround is to use a custom filter factory that reduces the given stopwords to some impossible token.
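
The effect can be illustrated with a toy position model in Python. This is a sketch under stated assumptions, not how Lucene works internally: it assumes that removing stopwords leaves position gaps in the indexed document while the remaining query terms must sit at adjacent positions, and the placeholder token is hypothetical.

```python
STOPWORDS = {"the", "up", "to"}
PLACEHOLDER = "\u0000"  # hypothetical "impossible" token that no real text produces


def remove_stopwords(tokens):
    """Drop stopwords but keep the original positions, leaving gaps."""
    return [(pos, tok) for pos, tok in enumerate(tokens) if tok not in STOPWORDS]


def replace_stopwords(tokens):
    """Keep every position, replacing each stopword with the placeholder."""
    return [(pos, PLACEHOLDER if tok in STOPWORDS else tok)
            for pos, tok in enumerate(tokens)]


def _matches(token, pattern):
    """Match a token against an exact term or a trailing-'*' prefix pattern."""
    if token is None:
        return False
    if pattern.endswith("*"):
        return token.startswith(pattern[:-1])
    return token == pattern


def matches_adjacent(doc, patterns):
    """True if the patterns match doc tokens at consecutive positions (slop 0)."""
    by_pos = dict(doc)
    return any(
        all(_matches(by_pos.get(start + i), p) for i, p in enumerate(patterns))
        for start in by_pos
    )


doc_tokens = ["stores", "up", "to", "15,000", "songs"]
query_tokens = ["sto*", "up", "to", "15*"]

# Stopword removal: the document keeps gaps, the query terms become adjacent.
removed = matches_adjacent(remove_stopwords(doc_tokens),
                           [t for _, t in remove_stopwords(query_tokens)])

# Placeholder workaround: both sides keep a token at every position.
replaced = matches_adjacent(replace_stopwords(doc_tokens),
                            [t for _, t in replace_stopwords(query_tokens)])

print("with stopword removal:", removed)
print("with placeholder tokens:", replaced)
```

In this model the placeholder keeps a token at every position on both the index and query side, so the adjacency check lines up again.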

ComplexPhraseQueryParser (last edited 2014-08-31 17:31:22 by ErickErickson)