Super-speedy scoring in Lucene 8

FOSDEM 2019

Lucene 8 will have some remarkable speed-ups when it comes to querying across large datasets. In this talk I will describe how this has been implemented, from new data structures through to changes in the scoring API, and the trade-offs required to make them possible.

Lightning overview of the structure of an inverted index, showing how current queries are executed and documents scored
Extension of an inverted index to include Impacts (single level, multi-level)
Given impacts, show how a scorer can skip large blocks of documents
Restrictions on Similarity implementations to make this possible (scores must not decrease as term frequency increases, must not be negative, etc.)
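
To make the block-skipping idea in the outline above concrete, here is a minimal toy sketch in Python. It is not Lucene's implementation: the block size, the `score` function, and the names `build_blocks` and `top_k` are all hypothetical, and each block records only its maximum term frequency, whereas real impacts also account for document length norms. The sketch assumes a similarity that is non-negative and never decreases as term frequency grows, which is what allows a block's maximum frequency to give an upper bound on the score of every document in that block.

```python
import heapq

BLOCK_SIZE = 4  # hypothetical value, kept small for illustration


def score(freq):
    """Toy similarity: non-negative and non-decreasing in term frequency."""
    return freq / (freq + 1.0)


def build_blocks(postings, block_size=BLOCK_SIZE):
    """Split (doc_id, freq) postings into blocks tagged with their max freq."""
    blocks = []
    for i in range(0, len(postings), block_size):
        chunk = postings[i:i + block_size]
        max_freq = max(freq for _, freq in chunk)
        blocks.append((max_freq, chunk))
    return blocks


def top_k(blocks, k):
    """Return the k best (score, doc_id) pairs and how many docs were scored."""
    heap = []  # min-heap, so heap[0] is the weakest of the current top k
    scored = 0
    for max_freq, chunk in blocks:
        # score(max_freq) bounds every score in this block from above.
        # If that bound cannot beat the weakest hit, skip the whole block.
        if len(heap) == k and score(max_freq) <= heap[0][0]:
            continue
        for doc_id, freq in chunk:
            scored += 1
            s = score(freq)
            if len(heap) < k:
                heapq.heappush(heap, (s, doc_id))
            elif s > heap[0][0]:
                heapq.heapreplace(heap, (s, doc_id))
    return sorted(heap, key=lambda hit: (-hit[0], hit[1])), scored


# Made-up postings for a single term: (doc_id, term frequency) pairs.
postings = [(1, 5), (2, 1), (3, 9), (4, 2),
            (5, 1), (6, 1), (7, 2), (8, 1),
            (9, 3), (10, 1), (11, 12), (12, 1)]
hits, scored = top_k(build_blocks(postings), k=2)
```

With these made-up postings the middle block is never scored, because its maximum frequency cannot produce a score above the weakest of the two hits already collected. If the similarity could decrease as frequency increased, the per-block bound would no longer be valid and the skipped block might contain a better hit.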

Speakers: Alan Woodward