Solr Performance Data

Solr users are encouraged to update this page to share any information they can about how they use Solr and what kind of performance they have observed.

Please try to give as many specifics as you can regarding:

See also: SolrPerformanceFactors

See also: Lucene's benchmark page and this page on hardware considerations from Summa (which is also based on Lucene)

CNET Shopper.com

The numbers below are from testing done by CNET prior to launching a Solr-powered Shopper.com search page. Shopper.com uses a modified version of the DisMaxRequestHandler which also does some faceted searching to pick categories for the page navigation options. On a typical request, the handler fetches the DocSets for 1500-2000 queries and intersects each with the DocSet for the main search results.
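As a rough illustration of the intersection step described above, the following Python sketch uses plain sets in place of Solr's DocSets. The function name `facet_counts`, the document ids, and the sample category queries are hypothetical and not taken from the Shopper.com plugin, which works on Solr's Java DocSet objects.

```python
# Hypothetical illustration: Python sets of document ids stand in for DocSets.

def facet_counts(main_results, facet_docsets):
    """Count, for each facet query, how many main-result docs it matches."""
    counts = {}
    for name, docset in facet_docsets.items():
        overlap = len(main_results & docset)  # set intersection
        if overlap:
            counts[name] = overlap  # keep only facets that match something
    return counts


# DocSet for the main search results (made-up document ids).
main_results = {1, 2, 3, 5, 8}

# One DocSet per facet query; the real handler fetches 1500-2000 of these.
facet_docsets = {
    "category:laptops": {1, 2, 9},
    "category:cameras": {3, 4},
    "category:printers": {6, 7},
}

counts = facet_counts(main_results, facet_docsets)
print(counts)  # {'category:laptops': 2, 'category:cameras': 1}
```

Facets with an empty intersection are dropped, so only categories that can actually narrow the current results would be offered as navigation options.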

The plugin itself uses configuration nearly identical to the DisMaxRequestHandler. To give you an idea of the types of queries that it generates:

The index used in these tests contained ~400K records, took up ~900MB of disk, and was fully optimized.

During the tests, a cron job forcibly triggered a commit (even though the index hadn't changed) every 15 minutes to force a new searcher to be opened and autowarmed while the queries were being processed.
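The original cron job is not shown. As a minimal sketch of what such a job might run, the Python below builds an HTTP POST of a `<commit/>` message for Solr's XML update handler. The URL `http://localhost:8983/solr/update` and the helper names are assumptions for illustration, not the setup CNET used.

```python
import urllib.request

# Assumed URL: adjust host, port, and path for your own deployment.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_commit_request(update_url=SOLR_UPDATE_URL):
    """Build (but do not send) a POST carrying an XML <commit/> message."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def trigger_commit(update_url=SOLR_UPDATE_URL, timeout=60):
    """Send the commit and return the HTTP status code."""
    request = build_commit_request(update_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Building the request needs no running server; trigger_commit() does.
request = build_commit_request()
print(request.get_method(), request.full_url)
```

A script that calls `trigger_commit()` could then be scheduled with a cron expression such as `*/15 * * * *` to run every 15 minutes.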

Solr was running on a dual 2.4GHz Opteron server (DL385) with 16GB of memory, under Linux (2.6.9) and Resin 3.0.??. (I don't know the specific Resin release or JVM options used.)

Each remote client queried the server continuously using randomly selected input from a dictionary built using live log files.

 Number of Concurrent Clients:   1       2       4       6
       
     Throughput (queries/sec):  33.9    49.2    58.2    60.1 
     Avg Response Time (secs):   0.030   0.041   0.069   0.100
    
     99.9th percentile (secs):   0.456   0.695   1.015   1.418
       99th percentile (secs):   0.245   0.301   0.496   0.661
       98th percentile (secs):   0.173   0.225   0.367   0.486
       95th percentile (secs):   0.095   0.124   0.220   0.323
       75th percentile (secs):   0.027   0.040   0.072   0.108
       50th percentile (secs):   0.017   0.024   0.042   0.063
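Because each client issued queries back to back, the table can be sanity-checked with Little's law (concurrency = throughput × average response time). The sketch below assumes no think time between requests and negligible client and network overhead, neither of which is stated explicitly in the test description.

```python
# Figures copied from the table above.
clients = [1, 2, 4, 6]
throughput_qps = [33.9, 49.2, 58.2, 60.1]
avg_response_secs = [0.030, 0.041, 0.069, 0.100]

# Little's law for a closed loop with no think time:
#   throughput ~= concurrent clients / average response time
predicted_qps = [n / secs for n, secs in zip(clients, avg_response_secs)]

for n, predicted, measured in zip(clients, predicted_qps, throughput_qps):
    error_pct = abs(predicted - measured) / measured * 100
    print(f"{n} clients: predicted {predicted:.1f} q/s, "
          f"measured {measured:.1f} q/s ({error_pct:.1f}% apart)")
```

For every client count the predicted throughput is within about 2% of the measured value, so the throughput and response time rows are consistent with each other. Throughput barely rises between 4 and 6 clients while response time keeps growing, which suggests the server was close to saturation at around 60 queries/sec.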

The mailing list post "Two Solr Announcements: CNET Product Search and DisMax" describes Solr's use at CNET in a little more detail.

Netflix

Walter Underwood reports that Netflix's site search switched to being powered by Solr the week of September 17, 2007:

(See http://www.nabble.com/forum/ViewPost.jtp?post=13009485&framed=y)

Walter also reported some figures from their testing phase:

At least for these test figures, they were not using fuzzy search, facets, or highlighting.

(See http://www.nabble.com/forum/ViewPost.jtp?post=12906462&framed=y)

Discogs.com

Solr powers keyword search on Discogs.com. From the email archive (alternate copy on nabble)...

I've been using Solr for keyword search on Discogs.com for a few
months with great results.

As of today Solr is running under Tomcat on a single dedicated box.
It's a 2.66Ghz P4, with 1 gig ram. The index has about 1.2 million
documents and is 1.2 gigs in size. This machine handles 250,000
queries per day with no problem. CPU load stays around 0.15 most of
the time.
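To compare this with the per-second figures elsewhere on this page, the daily total can be converted to an average rate. This is only an average over a full day; the email gives no peak traffic figures.

```python
QUERIES_PER_DAY = 250_000          # figure quoted in the email above
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400

average_qps = QUERIES_PER_DAY / SECONDS_PER_DAY
print(f"{average_qps:.1f} queries/sec on average")  # 2.9 queries/sec on average
```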

HathiTrust Large Scale Solr Benchmarking

HathiTrust makes the digitized collections of some of the nation’s great research libraries available for all. We are planning to index 20 million full-text books in Solr. Our current index for 1 million full-text books is about 225GB, and we are getting average response times of about half a second, but the slowest 0.5% of queries take between 10 seconds and 2 minutes. We are working on strategies to improve overall response time.

Our benchmarking efforts to date are reported in

SolrPerformanceData (last edited 2009-09-20 22:04:59 by localhost)