Revision 5 as of 2009-09-20 21:47:55
It can be confusing to work out how to use IndexReader, IndexWriter, and IndexSearcher together. The semantics of some of the methods are tricky because, in a sense, creating one of these objects starts a transaction: the object is isolated from changes made through other objects.
If you know you are going to update multiple documents, the fastest approach is to batch the work, e.g.:
- Open reader;
- Delete all old documents;
- Close reader;
- Open writer;
- Add all new documents;
- Close writer.
If, before step one, you open another IndexReader, then you can continue to use it for searches while the update is in progress. If you then, after step six, open a new IndexReader to use for searches, then no searches will ever see the intermediate state when documents have been deleted but not yet re-added.
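The isolation described above can be illustrated with a small in-memory model. This is only a sketch and does not use Lucene's API: `ToyIndex`, `openReader`, `deleteBatch`, and `addBatch` are hypothetical names, a list of strings stands in for the index, and "opening a reader" is assumed to behave like taking a copy of the index at that moment.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Toy model only: a list of strings stands in for the index on disk, and
// "opening a reader" copies it. None of these names are Lucene classes.
class ToyIndex {
    private final List<String> docs = new ArrayList<>();

    // A reader sees the index as it was when the reader was opened.
    List<String> openReader() {
        return Collections.unmodifiableList(new ArrayList<>(docs));
    }

    // Stands in for steps 1-3: open reader, delete all old documents, close reader.
    void deleteBatch(List<String> oldDocs) {
        docs.removeAll(oldDocs);
    }

    // Stands in for steps 4-6: open writer, add all new documents, close writer.
    void addBatch(List<String> newDocs) {
        docs.addAll(newDocs);
    }
}

class BatchUpdateDemo {
    // Returns what searches observe: [view during the update, view after it].
    static List<List<String>> run() {
        ToyIndex index = new ToyIndex();
        index.addBatch(Arrays.asList("a-v1", "b-v1"));

        // Opened before step one; searches keep using it during the update.
        List<String> searchReader = index.openReader();

        index.deleteBatch(Arrays.asList("a-v1", "b-v1"));
        // Intermediate state: the index is empty, but the old reader is unaffected.
        List<String> duringUpdate = new ArrayList<>(searchReader);
        index.addBatch(Arrays.asList("a-v2", "b-v2"));

        // Opened after step six; only now do searches see the new documents.
        searchReader = index.openReader();
        return Arrays.asList(duringUpdate, searchReader);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [[a-v1, b-v1], [a-v2, b-v2]]
    }
}
```

At no point does a search see the empty state between the delete and the add, which is the property the batch procedure is meant to provide.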
If you're doing updates (as opposed to just additions) then you probably want to do something like:
- Keep a single open IndexReader used by all searches.
- Every few minutes, process updates as follows:
  - Open a second IndexReader.
  - Delete all documents that will be updated.
  - Close this IndexReader, to flush deletions.
  - Open an IndexWriter.
  - Add all documents that are updated.
  - Close the IndexWriter.
  - Replace the IndexReader used for searches (the one from the first item above).
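The final step, replacing the reader that searches use, could be sketched roughly as follows. Again this is a model rather than Lucene code: `SharedSearchReader`, `view`, and `applyUpdates` are hypothetical names, and an immutable map from document id to text stands in for an open IndexReader. It assumes a single updater thread, which matches the "every few minutes" batch job described above.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical model: an immutable Map<id, text> stands in for an open IndexReader.
class SharedSearchReader {
    private final AtomicReference<Map<String, String>> current;

    SharedSearchReader(Map<String, String> initial) {
        current = new AtomicReference<>(
                Collections.unmodifiableMap(new HashMap<>(initial)));
    }

    // Searches call this; each keeps the view it got even if a swap happens mid-search.
    Map<String, String> view() {
        return current.get();
    }

    // One update cycle, assuming only one thread ever calls this method:
    // delete the documents being updated, add their new versions, then publish
    // the result to searches in a single step. Returns the previous view, which
    // a real application would close once in-flight searches are done with it.
    Map<String, String> applyUpdates(Map<String, String> updates) {
        Map<String, String> next = new HashMap<>(current.get());
        next.keySet().removeAll(updates.keySet()); // delete old versions
        next.putAll(updates);                      // add new versions
        return current.getAndSet(Collections.unmodifiableMap(next));
    }
}

class SharedSearchReaderDemo {
    public static void main(String[] args) {
        Map<String, String> initial = new HashMap<>();
        initial.put("1", "old text");
        initial.put("2", "unchanged");
        SharedSearchReader shared = new SharedSearchReader(initial);

        Map<String, String> inFlight = shared.view(); // a search already running
        shared.applyUpdates(Collections.singletonMap("1", "new text"));

        System.out.println(inFlight.get("1"));      // old text
        System.out.println(shared.view().get("1")); // new text
    }
}
```

Holding the current reader in one atomically swapped reference means searches never need to lock: each search picks up whichever reader is current when it starts and finishes against that same reader.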
Here are some links where these ideas came from:
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgNo=7191
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgId=1190557
http://nagoya.apache.org/eyebrowse/ReadMsg?listName=lucene-user@jakarta.apache.org&msgNo=3206