Ensuring that top-of-trunk consistently generates high-quality code
remains harder than it should be. Continuous integration (CI) setups
that track the correctness of top-of-trunk work well today, since they
automatically report correctness regressions to committers with a low
false-positive rate. By comparison, the output of CI setups that track
performance requires far more human effort to interpret.
In this talk, I’ll describe why I think effective performance tracking
is hard and what problems need solving, with a focus on our real-world
experiences and observations.
As part of the bring-up of one of the public performance-tracking
bots, I’ve done an in-depth analysis of its performance and noise
characteristics. The insights gained from this analysis drove a number
of improvements to LNT and the test-suite over the past year. I hope
that sharing these insights will help others set up low-noise
performance-tracking bots.
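To make the flavor of such noise analysis concrete, here is a minimal
sketch of one common noise-suppression technique: flag a benchmark as
regressed only when the slowdown clears both a noise-floor threshold
and a statistical-significance test. This is an illustrative example,
not LNT's actual implementation; the function name, the 2% threshold,
and the use of scipy's Mann-Whitney U test are my own assumptions.

    from statistics import median
    from scipy.stats import mannwhitneyu

    def is_regression(baseline, candidate, min_pct=0.02, alpha=0.01):
        # baseline/candidate: lists of execution-time samples (seconds)
        # from repeated runs of one benchmark before/after a commit.
        base_med = median(baseline)
        cand_med = median(candidate)
        # Ignore changes smaller than an assumed 2% noise floor.
        if (cand_med - base_med) / base_med < min_pct:
            return False
        # Require statistical significance to keep false positives low:
        # 'less' tests whether baseline times are stochastically smaller,
        # i.e. whether the candidate is genuinely slower.
        _, p_value = mannwhitneyu(baseline, candidate, alternative='less')
        return p_value < alpha

    # A noisy benchmark with a real ~5% slowdown gets flagged.
    before = [1.00, 1.01, 0.99, 1.02, 1.00]
    after = [1.05, 1.06, 1.04, 1.07, 1.05]
    print(is_regression(before, after))  # True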
I’ll conclude by summarizing what seem to be the most important
missing pieces of CI functionality to make the performance-tracking
infrastructure as effective as the correctness-tracking
infrastructure.
This presentation was previously given at the US LLVM dev meeting
in San Jose. Given the interest there and the largely non-overlapping
audiences of the FOSDEM LLVM dev room and the US dev meeting, I
think it's worthwhile to repeat the presentation at FOSDEM.