To understand the fundamental ideas behind Lucene, you should first get familiar with InformationRetrieval. This page tries to collect links to resources that explain some advanced topics.
In addition to VInt encoding, Lucene supports (or plans to support) other postings list encoding formats (FOR-delta, PFOR-delta, Simple9, ...).
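As a point of reference for the formats above, VInt stores an integer in one to five bytes: each byte carries seven data bits, low-order bits first, and the high bit signals that another byte follows. The following is a minimal standalone sketch of that scheme; the class and method names are hypothetical and this is not Lucene's own code.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical illustration of variable-length int encoding (not Lucene's classes).
final class VIntSketch {

    // Writes 7 data bits per byte, low-order group first.
    // The high bit (0x80) is set on every byte except the last.
    static byte[] encode(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Reads bytes until one without the continuation bit is seen.
    static int decode(byte[] bytes) {
        int value = 0;
        int shift = 0;
        for (byte b : bytes) {
            value |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return value;
    }
}
```

Small values, such as the gaps between consecutive document IDs in a postings list, fit in a single byte, which is why this encoding pairs well with delta-encoded postings.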
An optimized codec for fields that have lots of rare terms.
Improved concurrency of index updates.
In addition to its binary-search based terms dictionary, Lucene has a "block tree" terms dictionary, inspired by burst tries.
Lucene has an optimized range query implementation for numeric types:
BKD trees have been implemented to support geo capabilities in Lucene and have superseded NumericRangeQuery for one-dimensional data.
Lucene 4.0 supports an improved fuzzy query implementation that is based on Levenshtein automata.
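A Levenshtein automaton accepts exactly the terms that lie within a given edit distance of the query term. To make that acceptance criterion concrete, here is a plain dynamic-programming edit distance in Java. This is only a brute-force reference for what the automaton matches, not how the automaton is built or how Lucene intersects it with the terms dictionary; the class name is hypothetical, and transpositions are not treated as a single edit here.

```java
// Hypothetical reference implementation of classic Levenshtein distance.
final class EditDistanceSketch {

    // Returns the minimum number of single-character insertions,
    // deletions, and substitutions needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // distance to the empty prefix of b
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }
}
```

Running this check against every term in the dictionary costs time proportional to the number of terms; the automaton-based approach avoids that by enumerating only the terms the automaton can accept.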
In addition to its default TF-IDF scoring algorithm, Lucene supports other scoring models such as Okapi BM25 and models based on language models.
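For orientation, the textbook form of the Okapi BM25 score for a single term can be sketched as follows. The class and parameter names are hypothetical, and this follows the formula as usually stated in the literature; Lucene's own implementation may differ in details such as constant factors and how document length is stored. The values k1 = 1.2 and b = 0.75 are the commonly used defaults.

```java
// Hypothetical single-term BM25 sketch (textbook formula, not Lucene's code).
final class Bm25Sketch {

    // Inverse document frequency: rare terms get a higher weight.
    static double idf(long docCount, long docFreq) {
        return Math.log(1.0 + (docCount - docFreq + 0.5) / (docFreq + 0.5));
    }

    // tf        : occurrences of the term in the document
    // docLen    : length of the document (in terms)
    // avgDocLen : average document length in the collection
    // k1        : controls term-frequency saturation
    // b         : controls the strength of length normalization (0..1)
    static double score(double tf, double docLen, double avgDocLen,
                        long docCount, long docFreq, double k1, double b) {
        double lengthNorm = k1 * (1.0 - b + b * docLen / avgDocLen);
        return idf(docCount, docFreq) * tf * (k1 + 1.0) / (tf + lengthNorm);
    }
}
```

Unlike classic TF-IDF, the term-frequency component saturates: as tf grows, the score approaches idf * (k1 + 1) rather than growing without bound, and documents longer than average are penalized in proportion to b.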
The paper below describes the implementation ideas behind Lucene's FeatureField, which folds non-textual static signals such as PageRank or URL length into the final score.
Block-Max WAND is a refinement of WAND that efficiently skips scoring documents that cannot compete for the top hits.
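One way to fold a static signal into a score is a saturation function, which maps a non-negative feature value S to weight * S / (S + pivot). The sketch below is an assumption-laden illustration in plain Java: the class and method names are hypothetical, and adding the result to a text relevance score is shown only as one possible way to combine the two.

```java
// Hypothetical illustration of a saturation-style static feature score.
final class SaturationSketch {

    // Maps featureValue (>= 0) into [0, weight).
    // The result is weight / 2 when featureValue equals pivot.
    static float saturation(float weight, float pivot, float featureValue) {
        return weight * featureValue / (featureValue + pivot);
    }

    // Combines a text relevance score with the static feature by addition.
    static float combined(float textScore, float weight, float pivot, float featureValue) {
        return textScore + saturation(weight, pivot, featureValue);
    }
}
```

Because the function is monotonic and bounded by the weight, a very large feature value cannot swamp the text score, and an upper bound on the contribution is known in advance, which matters for techniques that skip non-competitive documents.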
Lucene uses FSTs a lot, so their in-memory size is important.
Modifications that Twitter made to Lucene to support lock-free updates and efficient early query termination for time-based relevance.