Solr Performance Data
Solr users are encouraged to update this page to share any information they can about how they use Solr and what kind of performance they have observed.
Please try to give as many specifics as you can regarding:
The Hardware and OS you used
The version of Solr you used
The Servlet Container and JVM you used
Your index
The types of operations you tested (e.g. updates, commits, optimizes, searches, the RequestHandler used, etc.)
What's your greatest performance bottleneck: CPU? Disk speed? RAM?
See also: SolrPerformanceFactors
See also: Lucene's benchmark page
CNET Shopper.com
The numbers below are from testing done by CNET prior to launching a Solr-powered Shopper.com search page. Shopper.com uses a modified version of the DisMaxRequestHandler which also does some faceted searching to pick categories for the page navigation options. On a typical request, the handler fetches the DocSets for 1500-2000 queries and intersects each with the DocSet for the main search results.
The plugin itself uses configuration nearly identical to the DisMaxRequestHandler. To give you an idea of the types of queries that it generates:
The qf param is used to search across 10-15 fields with various boosts.
The pf param is used to phrase search across 10-15 fields with various boosts.
The bq param contains a fairly complex BooleanQuery containing ~20 terms.
The bf param contains two separate boosting functions, one of which contains two nested functions.
The fq param is used to filter out ~15% of the records that we don't want to ever surface.
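To make the parameter list above concrete, here is a small sketch of how such a request URL could be assembled. Every field name, boost, host, and path is a hypothetical placeholder, and the real Shopper.com queries were far larger (10-15 fields and ~20 boost terms). The parameter that selects the handler is left out because it depends on the Solr version and configuration.

```python
from urllib.parse import urlencode

# All values below are illustrative assumptions, not CNET's configuration.
params = {
    "q": "digital camera",                          # the user's search terms
    "qf": "name^2.0 description^0.5",               # fields to search, with boosts
    "pf": "name^3.0 description^1.0",               # fields for phrase matching
    "bq": "in_stock:true^1.5",                      # extra boosting query
    "bf": "recip(rord(release_date),1,1000,1000)",  # boosting function
    "fq": "-hidden:true",                           # filter out unwanted records
}

query_string = urlencode(params)
url = "http://localhost:8983/solr/select?" + query_string  # assumed host and path
print(url)
```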
The index used in these tests contained ~400K records, took up ~900MB of disk, and was fully optimized.
During the tests, a cron job forcibly triggered a commit (even though the index hadn't changed) every 15 minutes to force a new searcher to be opened and autowarmed while the queries were being processed.
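A forced commit of this kind can be issued by posting Solr's `<commit/>` XML message to the update handler. The sketch below shows one way a scheduled script might do it. The URL is an assumed default, and the script CNET actually ran is not described in the source.

```python
import urllib.request


def build_commit_request(update_url):
    """Build (but do not send) an HTTP POST carrying Solr's <commit/> message."""
    return urllib.request.Request(
        update_url,
        data=b"<commit/>",
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


def force_commit(update_url, timeout=30):
    """Send the commit and return the HTTP status code."""
    request = build_commit_request(update_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


# Assumed URL; adjust host, port, and path to the actual installation.
request = build_commit_request("http://localhost:8983/solr/update")
print(request.get_method(), request.full_url)
```

Calling `force_commit(...)` from a job scheduled every 15 minutes would reproduce the pattern used in the test: each commit opens a new searcher, which is then autowarmed while queries continue to arrive.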
Solr was running on a 2.4GHz dual Opteron (DL385) with 16GB of memory, under Linux (2.6.9), using Resin 3.0.??. (I don't know the specific Resin release or JVM options used.)
Each remote client queried the server continuously using randomly selected input from a dictionary built using live log files.
Number of Concurrent Clients    1       2       4       6
Throughput (queries/sec)        33.9    49.2    58.2    60.1
Avg Response Time (secs)        0.030   0.041   0.069   0.100
99.9th percentile (secs)        0.456   0.695   1.015   1.418
99th percentile (secs)          0.245   0.301   0.496   0.661
98th percentile (secs)          0.173   0.225   0.367   0.486
95th percentile (secs)          0.095   0.124   0.220   0.323
75th percentile (secs)          0.027   0.040   0.072   0.108
50th percentile (secs)          0.017   0.024   0.042   0.063
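One quick sanity check on these figures: if each client sends its next query as soon as the previous one returns (an assumption, since the source only says the clients queried "continuously"), throughput should be roughly the number of clients divided by the average response time. The sketch below applies that relation to the numbers in the table.

```python
clients = [1, 2, 4, 6]
reported_qps = [33.9, 49.2, 58.2, 60.1]
avg_response_s = [0.030, 0.041, 0.069, 0.100]

# Closed-loop estimate: N clients with no think time complete about
# N / (average response time) queries per second in total.
estimated_qps = [n / t for n, t in zip(clients, avg_response_s)]

for n, est, rep in zip(clients, estimated_qps, reported_qps):
    diff_pct = 100 * abs(est - rep) / rep
    print(f"{n} clients: estimated {est:.1f} q/s, reported {rep} q/s ({diff_pct:.1f}% apart)")
```

The estimates land within about 2% of the reported throughput. Throughput gains flatten between 4 and 6 clients while response times keep rising, which suggests the server was close to saturation at that point.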
The mailing list post "Two Solr Announcements: CNET Product Search and DisMax" describes a little more about Solr and CNET.
Netflix
Walter Underwood reports that Netflix's site search switched to being powered by Solr the week of 9/17/07:
Here at Netflix, we switched over our site search to Solr two weeks ago. We've seen zero problems with the server. We average 1.2 million queries/day on a 250K item index. We're running four Solr servers with simple round-robin HTTP load-sharing.
This is all on 1.1. I've been too busy tuning to upgrade.
(See http://www.nabble.com/forum/ViewPost.jtp?post=13009485&framed=y)
Walter also reported some figures from their testing phase:
We are searching a much smaller collection, about 250K docs, with great success. We see 80 queries/sec on each of four servers, and response times under 100ms. Each query searches against seven fields.
At least for these test figures, they were not using fuzzy search, facets, or highlighting.
(See http://www.nabble.com/forum/ViewPost.jtp?post=12906462&framed=y)
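To relate the two sets of Netflix figures, the daily query volume can be converted into an average rate and compared with the throughput seen in testing. This is back-of-the-envelope arithmetic on the quoted numbers only. A daily average hides peak-hour load, so the real headroom at peak would be smaller.

```python
SECONDS_PER_DAY = 24 * 60 * 60

queries_per_day = 1_200_000   # production figure quoted above
servers = 4
tested_qps_per_server = 80    # figure from the testing phase

average_qps = queries_per_day / SECONDS_PER_DAY
average_qps_per_server = average_qps / servers
tested_capacity_qps = tested_qps_per_server * servers
headroom = tested_capacity_qps / average_qps

print(f"average load: {average_qps:.1f} q/s ({average_qps_per_server:.1f} per server)")
print(f"tested capacity: {tested_capacity_qps} q/s, about {headroom:.0f}x the average load")
```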
Discogs.com
Solr powers keyword search on Discogs.com. From the email archive (alternate copy on nabble)...
I've been using Solr for keyword search on Discogs.com for a few months with great results. As of today Solr is running under Tomcat on a single dedicated box. It's a 2.66Ghz P4, with 1 gig ram. The index has about 1.2 million documents and is 1.2 gigs in size. This machine handles 250,000 queries per day with no problem. CPU load stays around 0.15 most of the time.