Python is a beautiful language for fast prototyping and sketching out ideas. People often struggle to get their code into production, though, for various reasons. Besides the security and safety concerns that usually are not addressed from the very beginning when playing around with an algorithmic idea, performance concerns are quite frequently a reason for not taking the Python code to the next level. We will look at the "missing performance" worries using a simple numerical problem and show how to speed up the corresponding Python code to top-notch performance.
We all know how much fun it is to play around with an algorithmic idea in Python. It's very satisfying to see the idea develop and do what it's supposed to do, and to see how simple and elegant the code finally looks. Python's feature-complete standard library and its universe of third-party libraries and packages make development very quick. And we're all very grateful to be able to focus on the problem itself, rather than on language specifics, to solve it. But when we arrive at the point where everything just works, there is one last step that needs to be mastered: getting it into production, to finally let it do what it was supposed to do and make life easier for all of us. At that stage there are final hurdles, which usually feel giant, and they raise unpleasant questions. Will the algorithm really do what it was supposed to do under all circumstances? Will it be safe? What if it fails? Will it actually be fast enough for all the data it needs to process in production? Will it still be capable of doing its job in the future, when the amount of work grows? Whilst the first worries can usually be addressed well using established software engineering habits and patterns, the performance issue is often seen as the killer on the way to production use, as Python is still considered slow simply because it is an interpreted language. For this very reason, code is quite often rewritten after the prototyping phase in another language considered to be fast, such as C++. We'll look at exactly this point and explore ways to accelerate Python code through simple modifications and with the support of third-party libraries. To do that, we will look at some code that solves a simple numerical problem, calculating the Mandelbrot Set, as it is well suited for this and quite simple to follow. Yet it generates stunning and beautiful results that will entertain us throughout the presentation.
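To make the numerical problem concrete, here is a minimal pure-Python sketch of the usual escape-time calculation for the Mandelbrot Set. It is not taken from the talk: the names `escape_time` and `mandelbrot_grid`, the default region and the iteration limit are all assumptions chosen for illustration.

```python
def escape_time(c: complex, max_iter: int = 100) -> int:
    """Count iterations of z -> z*z + c until |z| exceeds 2.

    Returns max_iter if the point has not escaped by then, in which
    case it is treated as belonging to the set.
    """
    z = 0j
    for n in range(max_iter):
        if abs(z) > 2.0:
            return n
        z = z * z + c
    return max_iter


def mandelbrot_grid(width, height,
                    x_min=-2.0, x_max=1.0,
                    y_min=-1.5, y_max=1.5,
                    max_iter=100):
    """Evaluate escape_time on a width x height grid (both must be >= 2).

    The region and max_iter defaults are illustrative assumptions.
    """
    dx = (x_max - x_min) / (width - 1)
    dy = (y_max - y_min) / (height - 1)
    return [
        [escape_time(complex(x_min + col * dx, y_min + row * dy), max_iter)
         for col in range(width)]
        for row in range(height)
    ]
```

Calling `mandelbrot_grid(300, 300)` yields a nested list of iteration counts that can be handed to any plotting tool. The nested loops over pixels and iterations are the kind of code that tends to be slow in a plain interpreter, which is what makes the problem a convenient benchmark.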
The strategies shown to accelerate the code are based on concepts taken from the standard library, PyPy, numpy, numba and dask, and they are transferable to other algorithmic problems as well. We will analyse the advantages as well as the drawbacks of each concept to see the overall effect and where else the solution might apply.
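As one illustration of what such a strategy can look like, the sketch below applies numba's `njit` decorator to an escape-time loop written with plain floats. Whether the talk uses numba in this way is an assumption, and the function name `escape_time_fast` is hypothetical. If numba is not installed, the sketch falls back to a no-op decorator, so the same code still runs as ordinary Python.

```python
try:
    from numba import njit  # compiles the decorated function on first call
except ImportError:
    def njit(func):
        """Fallback when numba is unavailable: leave the function as is."""
        return func


@njit
def escape_time_fast(c_re, c_im, max_iter=100):
    """Escape-time count for the point c_re + c_im*i.

    Uses real arithmetic and compares the squared magnitude against 4,
    which avoids the square root hidden in abs() on a complex number.
    """
    z_re = 0.0
    z_im = 0.0
    for n in range(max_iter):
        if z_re * z_re + z_im * z_im > 4.0:
            return n
        # (z_re + z_im*i)**2 + c, written out in real and imaginary parts
        z_re, z_im = z_re * z_re - z_im * z_im + c_re, 2.0 * z_re * z_im + c_im
    return max_iter
```

The appeal of this approach is that the algorithm itself is unchanged; only a decorator is added. The drawbacks are an extra dependency and a compilation delay on the first call, and the speed-up actually achieved depends on the machine and the workload, so it has to be measured rather than assumed.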
Speakers: Jens Nie