In a number of workloads, primarily analytical ones, several parts of PostgreSQL's query execution are limited by CPU speed. One way to address that is to allow a query to be processed by more than one core at a time, which the project started to enable in 9.6 and extended considerably in version 10, with significant additional features expected in 11. But parallelism does not address the efficiency, and hence the economics, of overall query execution.
This talk will focus on increasing the efficiency of query execution by just-in-time (JIT) compiling parts of queries. That is, instead of executing queries with roughly interpreter-like code, we emit native code that does precisely the work needed for a specific query.
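As a rough illustration of that difference, the sketch below evaluates the expression `col0 + 5 < col1` twice: once with a small generic interpreter, and once with a function specialized for that one expression. The opcodes, the `Step` struct and all function names are hypothetical and chosen for this example only; they are not PostgreSQL's actual expression evaluation code, and the specialized function is hand-written here, whereas a JIT would generate its equivalent at run time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for a toy expression interpreter (not PostgreSQL's). */
typedef enum { OP_COLUMN, OP_CONST, OP_ADD, OP_LT, OP_DONE } OpCode;

typedef struct
{
    OpCode  op;
    int64_t arg;    /* column index for OP_COLUMN, literal for OP_CONST */
} Step;

/* Generic path: dispatches on every step, for every row. */
int64_t
eval_interpreted(const Step *steps, const int64_t *row)
{
    int64_t stack[16];
    size_t  sp = 0;

    for (const Step *s = steps; s->op != OP_DONE; s++)
    {
        switch (s->op)
        {
            case OP_COLUMN:
                assert(sp < 16);
                stack[sp++] = row[s->arg];
                break;
            case OP_CONST:
                assert(sp < 16);
                stack[sp++] = s->arg;
                break;
            case OP_ADD:
                assert(sp >= 2);
                stack[sp - 2] = stack[sp - 2] + stack[sp - 1];
                sp--;
                break;
            case OP_LT:
                assert(sp >= 2);
                stack[sp - 2] = stack[sp - 2] < stack[sp - 1];
                sp--;
                break;
            case OP_DONE:
                break;
        }
    }
    assert(sp == 1);
    return stack[0];
}

/* Specialized path for "col0 + 5 < col1": no dispatch, no operand stack. */
bool
eval_specialized(const int64_t *row)
{
    return row[0] + 5 < row[1];
}

/* "col0 + 5 < col1" as a postfix program for the interpreter. */
const Step example_program[] = {
    {OP_COLUMN, 0}, {OP_CONST, 5}, {OP_ADD, 0},
    {OP_COLUMN, 1}, {OP_LT, 0}, {OP_DONE, 0}
};
```

Both functions return the same result for any row. The specialized one simply avoids the per-step branching and stack traffic, which is the kind of overhead JIT compilation aims to remove.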
The current goal of the project is to use LLVM to JIT compile two main parts of query execution:
- expression evaluation (WHERE clauses, aggregates, GROUP BY clauses, etc.)
- tuple deforming (converting on-disk tuples into a more efficiently accessible in-memory representation)
and to integrate as much of that work as possible into PostgreSQL 11.
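To make the tuple deforming case more concrete, here is a minimal sketch under heavy simplifying assumptions: columns are fixed-width integers only, there are no NULLs and no variable-length values, and the `AttrDesc` struct and function names are invented for this example rather than taken from PostgreSQL. The generic routine walks a column descriptor array and computes alignment and offsets at run time. The specialized routine, written by hand here to stand in for what generated code could look like, has the offsets for one known layout (int4, int8, int4) baked in.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical column descriptor: width and alignment in bytes. */
typedef struct
{
    size_t len;     /* 4 or 8 in this sketch */
    size_t align;   /* power of two */
} AttrDesc;

/* Generic path: offsets and widths are discovered per column, per tuple. */
void
deform_generic(const AttrDesc *attrs, int natts,
               const unsigned char *data, int64_t *values)
{
    size_t off = 0;

    for (int i = 0; i < natts; i++)
    {
        /* round the offset up to the column's alignment */
        off = (off + attrs[i].align - 1) & ~(attrs[i].align - 1);

        if (attrs[i].len == 4)
        {
            int32_t v;

            memcpy(&v, data + off, sizeof(v));
            values[i] = v;
        }
        else
        {
            int64_t v;

            assert(attrs[i].len == 8);
            memcpy(&v, data + off, sizeof(v));
            values[i] = v;
        }
        off += attrs[i].len;
    }
}

/* Specialized path for the layout (int4, int8, int4): constant offsets. */
void
deform_int4_int8_int4(const unsigned char *data, int64_t *values)
{
    int32_t a, c;
    int64_t b;

    memcpy(&a, data + 0, sizeof(a));    /* column 0 at offset 0 */
    memcpy(&b, data + 8, sizeof(b));    /* column 1 aligned up to offset 8 */
    memcpy(&c, data + 16, sizeof(c));   /* column 2 at offset 16 */
    values[0] = a;
    values[1] = b;
    values[2] = c;
}

/* Descriptor array matching the specialized layout above. */
const AttrDesc example_layout[] = { {4, 4}, {8, 8}, {4, 4} };
```

For a buffer laid out as (int4, int8, int4), both routines produce the same values. The specialized version has no loop, no alignment arithmetic and no width checks, because all of that was resolved when the layout became known.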
We'll discuss:
- how the above parts of the system were identified as the primary bottlenecks
- how JIT compilation can help resolve them
- which structural changes to PostgreSQL were required to allow for JIT compilation
- how JIT compilation is currently implemented, including the current development state
- the problems faced when using LLVM for this task, and which improvements to LLVM would make it a better fit for this use case
Depending on the available time, we'll also discuss possible future avenues for optimization.