Federated Search Design

Follow the basic Lucene design for MultiSearcher/RemoteSearcher as a template.

Areas that will need to change:

Network Transports


High Availability

How can High Availability be obtained on the query side?


How should the collection be updated? Having clients partition the data themselves would be complex, since they would have to ensure that a given document always goes to the same server. User-controlled partitioning should be possible, but there should be an easier default.
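One possible "easier default" is to route each document by a stable hash of its unique id, so that the same document always lands on the same server without the client tracking anything. The following is only a sketch under assumptions: the function name shard_for, the use of a string document id, and the choice of CRC32 are all illustrative and not part of any existing design.

```python
import zlib


def shard_for(doc_id: str, num_shards: int) -> int:
    """Return the index of the server that should hold doc_id.

    Hypothetical helper: assumes every document has a unique string id
    and that the number of servers is fixed and known to the caller.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # CRC32 is used instead of the built-in hash(), which is randomized
    # per process for strings and so would not be stable across restarts.
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards
```

A drawback of plain modulo routing is that changing the number of servers remaps most documents, so adding a server would require repartitioning.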

Single Master

A single master could partition the data into multiple local indices... subsearchers would pull only the local index they are configured to have.
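The single-master idea can be sketched roughly as follows. Everything here is an assumption made for illustration: the class names Master and SubSearcher, the in-memory dictionaries standing in for real indices, and the CRC32-based assignment of documents to partitions. A real implementation would ship index files rather than copy dictionaries.

```python
import zlib


class Master:
    """Hypothetical master that splits incoming documents into local partitions."""

    def __init__(self, num_partitions: int):
        self.num_partitions = num_partitions
        # One in-memory "index" per partition: {doc_id: document}.
        self.partitions = [dict() for _ in range(num_partitions)]

    def add(self, doc_id: str, doc: dict) -> int:
        """Store doc in its partition and return the partition number."""
        p = zlib.crc32(doc_id.encode("utf-8")) % self.num_partitions
        self.partitions[p][doc_id] = doc
        return p

    def snapshot(self, partition: int) -> dict:
        """Return a copy of one partition, standing in for an index transfer."""
        return dict(self.partitions[partition])


class SubSearcher:
    """Hypothetical subsearcher configured to serve exactly one partition."""

    def __init__(self, partition: int):
        self.partition = partition
        self.index = {}

    def pull(self, master: Master) -> None:
        # Pull only the partition this subsearcher is configured to have.
        self.index = master.snapshot(self.partition)
```

In this sketch clients send every update to the master, so they never need to know how the data is partitioned.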


How should commits be synchronized across subsearchers and top-level searchers?