Reports on Search Quality Experiments with Lucene

This page is for Lucene users and developers to report on experiments of measuring or improving Lucene search quality.

Search Quality?

The first question is how to define search quality. While each experiment reported here may define its own measures, a few standard ones are:

  • MAP - Mean Average Precision.
  • MRR - Mean Reciprocal Rank.
  • P@n - Precision at n, where sometimes interesting n values are 1, 5, 10, and 20.
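As a toy illustration (not code from the Lucene codebase), the per-query building blocks of these measures can be computed from a ranked result list and a set of judged-relevant document ids; MAP and MRR are then just the means of average precision and reciprocal rank over all queries:

```java
import java.util.*;

public class QualityMetrics {
    // P@n: fraction of the top n results that are relevant.
    static double precisionAt(List<String> ranked, Set<String> relevant, int n) {
        int hits = 0;
        for (int i = 0; i < Math.min(n, ranked.size()); i++) {
            if (relevant.contains(ranked.get(i))) hits++;
        }
        return (double) hits / n;
    }

    // Average precision for one query: mean of P@k taken at each rank k
    // where a relevant document appears, divided by the number of relevant docs.
    static double averagePrecision(List<String> ranked, Set<String> relevant) {
        double sum = 0;
        int hits = 0;
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
                sum += (double) hits / (i + 1);
            }
        }
        return relevant.isEmpty() ? 0 : sum / relevant.size();
    }

    // Reciprocal rank for one query: 1 / rank of the first relevant result, 0 if none.
    static double reciprocalRank(List<String> ranked, Set<String> relevant) {
        for (int i = 0; i < ranked.size(); i++) {
            if (relevant.contains(ranked.get(i))) return 1.0 / (i + 1);
        }
        return 0;
    }

    public static void main(String[] args) {
        // Hypothetical result list: relevant docs d1 and d2 appear at ranks 2 and 4.
        List<String> ranked = Arrays.asList("d3", "d1", "d7", "d2");
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d2"));
        System.out.printf("P@2=%.2f AP=%.2f RR=%.2f%n",
            precisionAt(ranked, relevant, 2),
            averagePrecision(ranked, relevant),
            reciprocalRank(ranked, relevant));
    }
}
```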

See also the performance measures section of the Wikipedia article on information retrieval: http://en.wikipedia.org/wiki/Information_retrieval#Performance_measures

How to Measure?

In Lucene's contrib benchmark, the search quality package (http://hudson.zones.apache.org/hudson/job/Lucene-trunk/javadoc//contrib-benchmark/org/apache/lucene/benchmark/quality/package-summary.html) can be used for quality tests. The package comes with ready-to-use TREC evaluation and query-parsing code, as well as creation of submission reports for submitting to TREC, and it is open for extension to any other evaluation data and queries.
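For orientation, the two TREC input formats involved look roughly like this: a topic in the classic TREC SGML-style layout, and one judgment line per (query, document) pair in the qrels file, of the form `queryID iteration docName relevance`. These are standard TREC conventions; the exact fields the package accepts are documented in its javadocs.

```
<top>
<num> Number: 301
<title> International Organized Crime
<desc> Description:
Identify organizations that participate in international criminal activity.
</top>
```

A matching qrels line marking document FBIS3-10082 as relevant to topic 301:

```
301 0 FBIS3-10082 1
```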

The experiments

These are the experiments reported so far. Please add yours!
  • TREC 2007 Million Queries Track - IBM Haifa Team.

SearchQualityReports (last edited 2009-09-20 21:47:40 by localhost)