Super-speedy scoring in Lucene 8
Lucene 8 will bring some remarkable speed-ups when querying large datasets. In this talk I will describe how these have been implemented, from new data structures through to changes in the scoring API, and the trade-offs required to make them possible.
- Lightning overview of the structure of an inverted index, showing how queries are currently executed and documents scored
- Extension of an inverted index to include Impacts (single level, multi-level)
- Given impacts, show how a scorer can skip large blocks of documents
- Restrictions on Similarity implementations to make this possible (scores must not decrease as term frequency increases, must not be negative, etc)
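
As a rough illustration of the block-skipping idea in the outline above, here is a minimal Python sketch. It is not Lucene's implementation. The names `Block`, `build_blocks`, `score` and `top_k` are hypothetical, and the "impact" is simplified to the maximum term frequency per block, ignoring norms and multi-level structures. It assumes a score function that is non-negative and never decreases as term frequency grows, so a block's maximum frequency gives an upper bound on the score of any document in that block.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Block:
    postings: list  # (doc_id, term_freq) pairs, in doc_id order
    max_freq: int   # simplified "impact": highest term_freq in this block


def build_blocks(postings, block_size):
    """Split a postings list into fixed-size blocks, recording each block's max freq."""
    blocks = []
    for i in range(0, len(postings), block_size):
        chunk = postings[i:i + block_size]
        blocks.append(Block(chunk, max(freq for _, freq in chunk)))
    return blocks


def score(freq):
    """Toy saturating score: non-negative and non-decreasing in freq (an assumption)."""
    return freq / (freq + 1.0)


def top_k(blocks, k):
    """Return the k best (score, doc_id) pairs and how many documents were scored."""
    heap = []   # min-heap holding the current top k; heap[0] is the weakest hit
    scored = 0
    for block in blocks:
        # Once k hits are collected, a block whose best possible score cannot
        # beat the weakest hit is skipped without scoring any of its documents.
        if len(heap) == k and score(block.max_freq) <= heap[0][0]:
            continue
        for doc_id, freq in block.postings:
            scored += 1
            s = score(freq)
            if len(heap) < k:
                heapq.heappush(heap, (s, doc_id))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, doc_id))
    return sorted(heap, reverse=True), scored
```

The skip test is only safe because of the monotonicity assumption on `score`: if a higher frequency could produce a lower score, the block maximum would no longer bound the scores inside the block, which is why restrictions on Similarity implementations are needed.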
Speakers:
Alan Woodward