Assuming the user has a relational DB, why use Solr? If your use case requires a person to type words into a search box, you want a text search engine like Solr.

Databases and Solr have complementary strengths and weaknesses. SQL supports only simple wildcard-based text search (for example, LIKE patterns) with some basic normalization, such as case-insensitive matching. The problem is that these searches typically require full table scans. Solr instead stores all searchable words in an "inverted index", which makes searches orders of magnitude faster. For a more complete description of Solr's features, see http://lucene.apache.org/solr/features.html.
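To make the difference concrete, here is a minimal Python sketch, not Solr's or Lucene's actual implementation. The sample rows and the helper names (tokenize, scan_search, build_inverted_index, index_search) are hypothetical and exist only for this illustration. It contrasts a full scan, which inspects every row, with a lookup in an inverted index, which maps each word to the rows that contain it.

```python
import re
from collections import defaultdict

# Hypothetical sample rows standing in for a DB table: (id, text).
ROWS = [
    (1, "Solr is a search server built on Lucene"),
    (2, "A relational database stores rows in tables"),
    (3, "Lucene builds an index over searchable words"),
]


def tokenize(text):
    """Lower-case the text and split it into words (a simple normalization)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def scan_search(rows, word):
    """Full scan: examine every row, much like SQL LIKE '%word%'."""
    word = word.lower()
    return [row_id for row_id, text in rows if word in text.lower()]


def build_inverted_index(rows):
    """Map each word to the set of row ids whose text contains it."""
    index = defaultdict(set)
    for row_id, text in rows:
        for token in tokenize(text):
            index[token].add(row_id)
    return index


def index_search(index, word):
    """A single dictionary lookup; no row is re-read at query time."""
    return sorted(index.get(word.lower(), set()))


INDEX = build_inverted_index(ROWS)
print(scan_search(ROWS, "lucene"))    # [1, 3]
print(index_search(INDEX, "lucene"))  # [1, 3]
```

The scan's cost grows with the number of rows, while the index lookup does not. The price is that the index must be built ahead of time and kept up to date. Note also that the scan matches substrings whereas the index matches whole words, so the two can differ for partial terms.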

Solr vs. Relational Database

|| ||Lucene / Solr||Relational DB||
||Text Search||Fast and sophisticated||Minimal and slow||
||Features||Few, targeted to text search||Many||
||Deployment Complexity||Medium||Medium||
||Administration Tools||Minimal open source projects||Many open source & commercial||
||Monitoring Tools||Weak||Very Strong||
||Scaling Tools||Automated, medium scale||Large scale||
||Support Availability||Weak||Strong||
||Schema Flexibility||Must in general rebuild||Changes immediately visible||
||Indexing Speed||Slow||Faster and adjustable||
||Query Speed||Text search is fast & predictable||Very dependent on design & use case||
||Row Addition/Extraction Speed||Slow||Fast||
||Partial Record Modification||No||Yes||
||Time to visibility after addition||Slow||Immediate||
||Access to internal data structures||High||None||
||Technical knowledge required||Java (minimal), web server deployment, IT||SQL, DB-specific factors, IT||
||Regular maintenance tasks|| || ||

From a database perspective, a Lucene index can be thought of as one DB table with very fast lookups and interesting enhancements for text search. This index is relatively expensive in space and creation time. Solr wraps the Lucene API with a full-featured front end, providing these additions:

  • schema design and text processing features that match most Lucene deployments
  • clean deployment as a web service for indexing and searching
  • convenient scalability across multiple servers
  • a learning curve and adoption effort reduced by roughly two orders of magnitude
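As an illustration of the "web service" point above, the following Python sketch builds a search URL for Solr's HTTP select handler. It makes no network call. The base URL http://localhost:8983/solr, the field name title, and the helper build_select_url are assumptions made for this example. The exact path depends on the Solr version and deployment, and newer versions include a core or collection name in it.

```python
from urllib.parse import urlencode


def build_select_url(base_url, query, rows=10):
    """Build a URL for Solr's select handler (hypothetical helper).

    q, rows and wt are standard Solr query parameters: the query string,
    the maximum number of results, and the response format.
    """
    params = {"q": query, "rows": rows, "wt": "json"}
    return f"{base_url}/select?{urlencode(params)}"


# Assumed local Solr instance and an assumed "title" field in the schema.
url = build_select_url("http://localhost:8983/solr", "title:lucene")
print(url)
```

A client would then fetch this URL with any HTTP library and parse the JSON response. No Java code or direct use of the Lucene API is needed on the client side.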

WhyUseSolr (last edited 2009-09-20 22:05:02 by localhost)