MoreLikeThis

There are two ways to access MoreLikeThis from Solr: through the MoreLikeThisHandler, or through the MoreLikeThisComponent in a SearchHandler.

Solr1.3

TermVectors, Analyzers and MoreLikeThis

MoreLikeThis constructs a Lucene query based on terms within a document. For best results, store TermVectors in schema.xml for the fields you will use for similarity.

 <field name="cat" ... termVectors="true" />

If termVectors are not stored, MoreLikeThis will generate terms from stored fields.

Common Parameters

param	description	defaults (from Solr3.6 MoreLikeThis.java)
mlt.fl	The fields to use for similarity. NOTE: if possible, these should have a stored TermVector.	DEFAULT_FIELD_NAMES = new String[] {"contents"}
mlt.mintf	Minimum Term Frequency: terms that occur fewer than this many times in the source doc are ignored.	DEFAULT_MIN_TERM_FREQ = 2
mlt.mindf	Minimum Document Frequency: words that do not occur in at least this many docs are ignored.	DEFAULT_MIN_DOC_FREQ = 5
mlt.minwl	Minimum word length: shorter words are ignored.	DEFAULT_MIN_WORD_LENGTH = 0
mlt.maxwl	Maximum word length: longer words are ignored.	DEFAULT_MAX_WORD_LENGTH = 0
mlt.maxqt	Maximum number of query terms that will be included in any generated query.	DEFAULT_MAX_QUERY_TERMS = 25
mlt.maxntp	Maximum number of tokens to parse in each example doc field that is not stored with TermVector support.	DEFAULT_MAX_NUM_TOKENS_PARSED = 5000
mlt.boost	true/false: whether the query will be boosted by the interesting term relevance.	DEFAULT_BOOST = false
mlt.qf	Query fields and their boosts using the same format as that used in DisMaxQParserPlugin. These fields must also be specified in mlt.fl.
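
To make the interplay of the term filters concrete, here is a simplified Python sketch of how a list of "interesting terms" might be chosen from one source document. It is not Lucene's implementation: the function name select_terms, the plain tf * idf score, the doc_freq mapping, and the toy inputs are assumptions made for illustration. Only the parameter meanings and default values come from the table above. The sketch treats a word length of 0 as "no limit".

```python
import math
from collections import Counter


def select_terms(tokens, doc_freq, num_docs,
                 mintf=2, mindf=5, minwl=0, maxwl=0, maxqt=25):
    """Pick candidate query terms from one document's tokens (illustrative only).

    tokens   -- analyzed tokens of the source document field
    doc_freq -- hypothetical mapping: term -> number of docs containing it
    num_docs -- total number of documents in the index
    """
    scored = []
    for term, tf in Counter(tokens).items():
        # mlt.minwl / mlt.maxwl: drop words that are too short or too long
        # (assumption in this sketch: 0 disables the check).
        if minwl > 0 and len(term) < minwl:
            continue
        if maxwl > 0 and len(term) > maxwl:
            continue
        # mlt.mintf: drop terms that are too rare in the source document.
        if tf < mintf:
            continue
        # mlt.mindf: drop terms that occur in too few documents overall.
        df = doc_freq.get(term, 0)
        if df < mindf:
            continue
        # Simplified tf * idf weight; a real scorer may use a different formula.
        idf = math.log(num_docs / (df + 1)) + 1.0
        scored.append((tf * idf, term))
    # mlt.maxqt: keep only the highest-scoring terms (ties broken alphabetically).
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [term for _, term in scored[:maxqt]]
```

With the defaults, a term must appear at least twice in the source document and in at least five documents overall. This is why small test indexes often return no similar documents until mlt.mintf and mlt.mindf are lowered, as the example request in the MoreLikeThisComponent section does.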

MoreLikeThisComponent

This method returns similar documents for each document in the response set. Perhaps it should be called "MoreLikeThese".

param	description
mlt	'true' to enable MoreLikeThis results
mlt.count	The number of similar documents to return for each result.

Examples

http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score
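
The request above can also be assembled programmatically. The following is a minimal sketch using only the Python standard library; the helper name mlt_component_url is hypothetical, and the base URL is simply the one from the example. It only builds the URL and does not send the request.

```python
from urllib.parse import urlencode


def mlt_component_url(query, fields, extra=None,
                      base="http://localhost:8983/solr/select"):
    """Build a SearchHandler URL with the MoreLikeThisComponent enabled.

    query  -- the main query (q)
    fields -- list of field names to use for similarity (mlt.fl)
    extra  -- optional dict of further parameters, e.g. mlt.count or fl
    """
    params = {"q": query, "mlt": "true", "mlt.fl": ",".join(fields)}
    if extra:
        params.update(extra)
    # safe="," keeps comma-separated field lists readable in the URL.
    return base + "?" + urlencode(params, safe=",")


url = mlt_component_url(
    "apache",
    ["manu", "cat"],
    {"mlt.mindf": 1, "mlt.mintf": 1, "fl": "id,score"},
)
print(url)
```

Adding "mlt.count" to the extra dict would limit the number of similar documents returned for each result.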

MoreLikeThisHandler

When you specifically want information about similar documents, you can use the MoreLikeThisHandler.

If you want to filter the similar results returned by MoreLikeThis, you have to use the MoreLikeThisHandler. It treats the set of similar documents as the main result set, so the specified filters (fq) are applied to it. If you use the MoreLikeThisComponent with filter queries, they are applied to the result set returned by the main query (QueryComponent), not to the one returned by the MoreLikeThisComponent.
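
As an illustration of filtering through the handler, here is a minimal Python sketch that builds such a request. Several things are assumptions: that the MoreLikeThisHandler is registered at the path /mlt in solrconfig.xml, the helper name mlt_handler_url, and the placeholder document id and filter field in the usage lines. The sketch only constructs the URL.

```python
from urllib.parse import urlencode


def mlt_handler_url(source_query, fields, filters=(),
                    base="http://localhost:8983/solr/mlt"):
    """Build a MoreLikeThisHandler URL (assumes the handler is mapped to /mlt).

    source_query -- query that selects the source document, e.g. "id:..."
    fields       -- list of field names to use for similarity (mlt.fl)
    filters      -- filter queries (fq) applied to the similar documents
    """
    params = [("q", source_query), ("mlt.fl", ",".join(fields))]
    # Each filter query becomes its own fq parameter.
    params += [("fq", fq) for fq in filters]
    # Keep "," and ":" unescaped so the URL stays readable.
    return base + "?" + urlencode(params, safe=",:")


# Placeholder id and filter field, chosen only for illustration.
handler_url = mlt_handler_url("id:SOME_DOC_ID", ["manu", "cat"], ["inStock:true"])
print(handler_url)
```

Because the handler treats the similar documents as the main result set, the fq values here restrict the similar documents themselves, which the component-based request cannot do.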
