FunctionQuery allows one to use the actual value of a numeric field, and functions of such fields, in a relevancy score.
Using FunctionQuery
There are a few ways to use FunctionQuery from Solr's HTTP interface:
Embed a FunctionQuery in a regular query expressed in SolrQuerySyntax via the _val_ hook
Use the FunctionQParserPlugin, e.g. q={!func}log(foo)
Use a parameter that has an explicit type of FunctionQuery, such as DisMaxRequestHandler's bf (boost function) parameter.
NOTE: the bf parameter actually takes a list of function queries separated by whitespace and each with an optional boost. Make sure to eliminate any internal whitespace in single function queries when using bf. Example: q=dismax&bf="ord(popularity)^0.5 recip(rord(price),1,1000,1000)^0.3"
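As a rough sketch of the whitespace rule above, the following Python snippet joins individual boost functions into a single bf value and URL-encodes it. The helper name build_bf is hypothetical, and the snippet assumes the quotes in the example only delimit the value and are not part of it.

```python
from urllib.parse import urlencode, parse_qs

def build_bf(functions):
    """Join boost functions with single spaces (hypothetical helper).

    Each function may carry an optional ^boost suffix, but must not
    contain internal whitespace, because bf splits on whitespace.
    """
    for fn in functions:
        if any(ch.isspace() for ch in fn):
            raise ValueError("internal whitespace in function query: %r" % fn)
    return " ".join(functions)

bf_value = build_bf([
    "ord(popularity)^0.5",
    "recip(rord(price),1,1000,1000)^0.3",
])

# urlencode escapes the separating space and the special characters,
# so the list survives transport as one parameter value.
query_string = urlencode({"bf": bf_value})
print(query_string)
```

A function written as recip(rord(price), 1, 1000, 1000) would be rejected by this helper, since bf would otherwise split it into several broken pieces.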
Function Query Syntax
There is currently no infix parser; functions must be expressed as function calls (e.g. sum(a,b) instead of a+b).
Available Functions
constant
Solr1.3 Floating point constants.
Example Syntax: 1.5
SolrQuerySyntax Example: _val_:1.5
fieldvalue
This function returns the numeric field value of an indexed field with a maximum of one value per document (not multiValued). The syntax is simply the field name by itself. 0 is returned for documents without a value in the field.
Example Syntax: myFloatField
SolrQuerySyntax Example: _val_:myFloatField
ord
ord(myfield) returns the ordinal of the indexed field value within the indexed list of terms for that field in Lucene index order (lexicographically ordered by Unicode value), starting at 1. In other words, for a given field, all values are ordered lexicographically; this function then returns the offset of a particular value in that ordering. The field must have a maximum of one value per document (not multiValued). 0 is returned for documents without a value in the field.
Example: If there were only three values for a particular field: "apple","banana","pear", then ord("apple")=1, ord("banana")=2, ord("pear")=3
Example Syntax: ord(myIndexedField)
Example SolrQuerySyntax: _val_:"ord(myIndexedField)"
WARNING: ord() depends on the position in an index and can thus change when other documents are inserted or deleted, or if a MultiSearcher is used.
rord
The reverse ordering of what ord provides.
Example Syntax: rord(myIndexedField)
Example: rord(myDateField) is a metric for how old a document is: the youngest document will return 1, the oldest document will return the total number of documents.
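The ordinals returned by ord and rord can be imitated outside Solr with a minimal Python sketch. It assumes that Python's string ordering matches the index order for the values involved (true for the plain ASCII terms used here); the function names ord_table and rord_table are hypothetical.

```python
def ord_table(values):
    """Map each distinct term to its 1-based position in ascending order."""
    terms = sorted(set(values))
    return {term: i for i, term in enumerate(terms, start=1)}

def rord_table(values):
    """Map each distinct term to its 1-based position in descending order."""
    terms = sorted(set(values), reverse=True)
    return {term: i for i, term in enumerate(terms, start=1)}

fruits = ["apple", "banana", "pear"]
ords = ord_table(fruits)
rords = rord_table(fruits)

print(ords["apple"], ords["banana"], ords["pear"])     # 1 2 3
print(rords["apple"], rords["banana"], rords["pear"])  # 3 2 1

# Documents without a value in the field get 0.
print(ords.get("kiwi", 0))  # 0
```

Adding a new term such as "avocado" shifts the ordinals of "banana" and "pear", which is the instability the WARNING above describes.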
sum
Solr1.3 sum(x,y,...) returns the sum of multiple functions.
Example Syntax: sum(x,1)
Example Syntax: sum(x,y)
Example Syntax: sum(sqrt(x),log(y),z,0.5)
product
Solr1.3 product(x,y,...) returns the product of multiple functions.
Example Syntax: product(x,2)
Example Syntax: product(x,y)
div
Solr1.3 div(x,y) divides the function x by the function y.
Example Syntax: div(1,x)
Example Syntax: div(sum(x,100),max(y,1))
pow
Solr1.3 pow(x,y) raises the base x to the power y.
Example Syntax: pow(x,0.5) (same as sqrt(x))
Example Syntax: pow(x,log(y))
abs
Solr1.3 abs(x) returns the absolute value of a function.
Example Syntax: abs(-5)
Example Syntax: abs(x)
log
Solr1.3 log(x) returns log base 10 of the function x.
Example Syntax: log(x)
Example Syntax: log(sum(x,100))
sqrt
Solr1.3 sqrt(x) returns the square root of the function x.
Example Syntax: sqrt(2)
Example Syntax: sqrt(sum(x,100))
map
Solr1.3 map(x,min,max,target) maps any values of the function x that fall within min and max inclusive to target. min,max,target are constants. It outputs the field's value if it does not fall between min and max.
Example Syntax 1: map(x,0,0,1) changes any values of 0 to 1. This is useful in handling default 0 values.
Example Syntax 2: map(x,0,0,1,0) changes any values of 0 to 1; values that fall outside the range are set to the 5th argument (here 0) instead of defaulting to the field's value.
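The behaviour of both forms can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not Solr's implementation; the name map_value and the use of None to mean "no 5th argument" are assumptions.

```python
def map_value(x, min_value, max_value, target, default=None):
    """Return target when min_value <= x <= max_value (inclusive).

    Otherwise return default if one was given, else x unchanged.
    """
    if min_value <= x <= max_value:
        return target
    return x if default is None else default

print(map_value(0, 0, 0, 1))     # 1: a 0 is changed to 1
print(map_value(7, 0, 0, 1))     # 7: outside the range, value kept
print(map_value(7, 0, 0, 1, 0))  # 0: outside the range, 5th argument used
```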
scale
Solr1.3 scale(x,minTarget,maxTarget) scales values of the function x such that they fall between minTarget and maxTarget inclusive.
Example Syntax: scale(x,1,2) scales all values to fall between 1 and 2 inclusive.
NOTE: The current implementation traverses all of the function values to obtain the min and max so it can pick the correct scale.
NOTE: This implementation currently cannot distinguish deleted documents or documents that have no value, and 0.0 values will be used for these cases. This means that if values are normally all greater than 0.0, one can still end up with 0.0 as the min value to map from. In these cases, an appropriate map() function could be used as a workaround to change 0.0 to a value in the real range. Example: scale(map(x,0,0,5),1,2)
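To see why the map() workaround matters, here is a small Python sketch. It assumes scale is an ordinary linear min-max rescaling, which is consistent with the description above but is not taken from the Solr source; the sample values and function names are hypothetical.

```python
def scale(values, min_target, max_target):
    """Linearly rescale values so min(values) -> min_target, max(values) -> max_target."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption for this sketch: a constant input maps to min_target.
        return [float(min_target) for _ in values]
    factor = (max_target - min_target) / (hi - lo)
    return [(v - lo) * factor + min_target for v in values]

def map_zero_to(values, replacement):
    """Equivalent of map(x,0,0,replacement) applied to every value."""
    return [replacement if v == 0 else v for v in values]

# Real values are 5, 10, 15; the 0.0 stands for a document with no value.
raw = [0.0, 5.0, 10.0, 15.0]

# Without the workaround the missing value becomes the minimum,
# so the real minimum (5.0) no longer maps to 1.
print(scale(raw, 1, 2))

# With scale(map(x,0,0,5),1,2) the real range 5..15 spans 1..2.
print(scale(map_zero_to(raw, 5.0), 1, 2))
```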
query
Solr1.4 query(subquery, default) returns the score for the given subquery, or the default value for documents not matching the query. Any type of subquery is supported through either parameter dereferencing $otherparam or direct specification of the query string in the LocalParams via "v".
Here "v" is the LocalParams key that holds the query string itself, so {!dismax v='solr rocks'} is equivalent to {!dismax}solr rocks.
Example Syntax: q=product(popularity, query({!dismax v='solr rocks'})) returns the product of the popularity and the score of the dismax query.
Example Syntax: q=product(popularity, query($qq))&qq={!dismax}solr rocks is equivalent to the previous query, using param dereferencing.
Example Syntax: q=product(popularity, query($qq,0.1))&qq={!dismax}solr rocks specifies a default score of 0.1 for documents that don't match the dismax query.
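Since these requests contain braces, dollar signs and spaces, a client normally has to URL-encode them. The Python sketch below builds the parameter-dereferencing variant; the {!func} prefix is an assumption based on the FunctionQParserPlugin usage shown under "Using FunctionQuery", and may be unnecessary depending on the default query parser.

```python
from urllib.parse import urlencode, parse_qs

params = {
    # The function query refers to the subquery through $qq.
    # The {!func} prefix is an assumption of this sketch.
    "q": "{!func}product(popularity,query($qq,0.1))",
    # The subquery itself, parsed by dismax.
    "qq": "{!dismax}solr rocks",
}

query_string = urlencode(params)
print(query_string)

# Decoding gives back the original values, including the space in qq.
decoded = parse_qs(query_string)
print(decoded["qq"][0])  # {!dismax}solr rocks
```

Keeping the subquery in its own parameter avoids nesting quotes inside the function call, which is the main practical benefit of dereferencing.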
linear
linear(x,m,c) implements m*x+c where m and c are constants and x is an arbitrary function. This is equivalent to sum(product(m,x),c), but slightly more efficient as it is implemented as a single function.
Example Syntax: linear(x,2,4) returns 2*x+4
recip
A reciprocal function with recip(x,m,a,b) implementing a/(m*x+b). m,a,b are constants, x is any arbitrarily complex function.
When a and b are equal, and x>=0, this function has a maximum value of 1 that drops as x increases. Increasing the value of a and b together results in a movement of the entire function to a flatter part of the curve. These properties can make this an ideal function for boosting more recent documents when x is rord(datefield).
Example Syntax: recip(rord(creationDate),1,1000,1000)
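The shape of the curve is easy to check numerically. The Python sketch below evaluates a/(m*x+b) directly; the sample values of x stand in for rord(creationDate) and are hypothetical.

```python
def recip(x, m, a, b):
    """Evaluate a / (m*x + b), the formula given for recip(x,m,a,b)."""
    return a / (m * x + b)

# With a == b the value is 1 at x = 0 and falls as x grows.
print(recip(0, 1, 1000, 1000))     # 1.0
print(recip(1000, 1, 1000, 1000))  # 0.5

# Larger a and b flatten the curve: the same step in x costs less.
print(recip(100, 1, 10, 10))
print(recip(100, 1, 1000, 1000))
```

With a = b = 1000, a document needs an rord of 1000 before its boost halves, whereas with a = b = 10 the boost halves after only 10 documents.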
max
max(x,c) returns the max of another function and a constant. Useful for "bottoming out" another function at some constant.
Example Syntax: max(myfield,0)
top
Solr1.4 Causes its function query argument to derive its values from the top-level IndexReader containing all parts of an index. For example, the ordinal of a value in a single segment will be different from the ordinal of that same value in the complete index. The ord() and rord() functions implicitly use top() and hence ord(foo) is equivalent to top(ord(foo)).
General Example
To illustrate the use of function queries, suppose an index stores the dimensions in meters x, y, z of some hypothetical boxes, with arbitrary names stored in the field boxname. Suppose we want to search for boxes matching the name findbox, ranked according to the volume of each box. The query params would be:
q=boxname:findbox _val_:"product(product(x,y),z)"
This will rank the results based on volume, but in order to get the computed volume you will need to add the parameter:
&fl=*,score where score will contain the resultant volume.
Suppose you also have a field containing the weight of the box as 'weight'. To sort by the density of the box and return the value of the density in score, your query should be:
http://localhost:8983/solr/select/?q=boxname:findbox _val_:"div(weight,product(product(x,y),z))"&fl=boxname x y z weight score
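The URL above is written unescaped for readability. A minimal Python sketch for producing an encoded version follows; it assumes the same host, port and path as the example and only constructs the URL, without sending a request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

base_url = "http://localhost:8983/solr/select/"

params = {
    # Text match on the box name plus a density function query.
    "q": 'boxname:findbox _val_:"div(weight,product(product(x,y),z))"',
    # Fields to return, including the score.
    "fl": "boxname x y z weight score",
}

url = base_url + "?" + urlencode(params)
print(url)

# Sanity check: decoding the query string restores the original values.
decoded = parse_qs(urlsplit(url).query)
print(decoded["q"][0])
```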