
Introduction to Async Programming

PyCon DE & PyData Berlin 2023

Asynchronous programming is a form of parallel programming in which a unit of work runs separately from the primary application thread and, after execution, notifies the main thread of its completion or failure. Its benefits include improved application performance, better responsiveness, and more effective use of the CPU. Asynchronicity is a big reason why Node.js became so popular for server-side programming. Most of the code we write, especially in I/O-heavy applications such as websites, depends on external resources: anything from a remote database call to a POST API request. As soon as you ask for any of these resources, your code sits idle waiting for the response, with nothing to do. With asynchronous programming, your code can handle other tasks during that wait. In this session, we are going to talk about asynchronous programming in Python: its benefits and the multiple ways to implement it.

How Do We Implement Asynchronicity in Python?

1. Multiple Processes: The most obvious way is to use multiple processes. From the terminal, you can start several scripts, and all of them will run independently and at the same time; the operating system underneath takes care of sharing the CPU among those instances. Alternatively, you can use the multiprocessing library, which supports spawning processes from within a single program.

2. Multiple Threads: The next way to run multiple things at once is to use threads. A thread is a line of execution, much like a process, but you can have multiple threads within one process and they all share access to common resources. Because of this shared state, threaded code is harder to write correctly. The operating system again does the heavy lifting of scheduling, but in CPython the global interpreter lock (GIL) allows only one thread to execute Python code at a time, even when several threads are running. The GIL therefore prevents multi-core parallelism: you are effectively running on a single core even though the machine may have two, four, or more.

3. Coroutines using yield: Coroutines are generalizations of subroutines. They are used for cooperative multitasking, where a task voluntarily yields (gives away) control periodically, or when idle, so that multiple tasks can make progress concurrently.

4. Asynchronous Programming: The fourth way, in which the operating system does not participate in the scheduling, is asyncio, the concurrency module introduced in Python 3.4. It is designed around coroutines and futures to simplify asynchronous code and make it almost as readable as synchronous code, since there are no callbacks.

5. Using Redis and Redis Queue (RQ): Using asyncio and aiohttp may not always be an option, especially if you are using older versions of Python. There are also scenarios where you want to distribute your tasks across different servers. In that case, we can leverage RQ (Redis Queue). It is a simple Python library for queueing jobs and processing them in the background with workers. It is backed by Redis, a key/value data store.

A practical definition of async is that it is a style of concurrent programming in which tasks release the CPU during waiting periods so that other tasks can use it. Python offers several ways to achieve concurrency; based on our requirements, code flow, data manipulation, architecture design, and use case, we can select any of these methods.
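Approach 1 can be sketched with the stdlib multiprocessing module; the squaring job here is purely illustrative:

```python
# Minimal multiprocessing sketch: each worker runs in its own OS process,
# with its own interpreter and its own GIL.
from multiprocessing import Process, Queue

def work(task_id, results):
    # The "job": push (input, input squared) onto a shared queue.
    results.put((task_id, task_id * task_id))

if __name__ == "__main__":
    results = Queue()
    procs = [Process(target=work, args=(i, results)) for i in range(3)]
    for p in procs:
        p.start()   # the operating system schedules these across cores
    for p in procs:
        p.join()    # wait for every worker to finish
    print(sorted(results.get() for _ in range(3)))
```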
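Approach 2, threads sharing common resources, can be sketched as below; note that the shared counter needs an explicit lock precisely because all threads see the same memory:

```python
# Minimal threading sketch: four threads increment one shared counter.
import threading

counter = 0
lock = threading.Lock()

def bump(n):
    global counter
    for _ in range(n):
        with lock:        # shared state must be synchronized explicitly
            counter += 1

threads = [threading.Thread(target=bump, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4000: the lock makes the result deterministic
```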
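Approach 3 can be illustrated with a generator-based coroutine that yields control back to its caller after each step (a running-average example, chosen only for illustration):

```python
# A coroutine built on yield: it suspends itself after each value it
# produces, handing control back to the caller (cooperative multitasking).
def averager():
    total, count, avg = 0.0, 0, None
    while True:
        value = yield avg   # suspend here until the caller sends a value
        total += value
        count += 1
        avg = total / count

coro = averager()
next(coro)            # prime the coroutine: advance to the first yield
print(coro.send(10))  # 10.0
print(coro.send(20))  # 15.0
```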
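Approach 4 can be sketched with asyncio, where asyncio.sleep stands in for real network I/O:

```python
# Minimal asyncio sketch: two simulated requests overlap their waits
# on a single thread, with no OS scheduling involved.
import asyncio

async def fetch(name, delay):
    # "await" hands control back to the event loop while we wait,
    # so other coroutines can run in the meantime.
    await asyncio.sleep(delay)
    return name

async def main():
    # Both simulated requests run concurrently: total time is about
    # 0.2s rather than 0.3s, because the waiting periods overlap.
    return await asyncio.gather(fetch("a", 0.2), fetch("b", 0.1))

print(asyncio.run(main()))  # ['a', 'b'] (gather keeps argument order)
```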
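Approach 5 can be sketched with RQ. This assumes `pip install rq` and a Redis server on localhost; the job function and its input are illustrative, and the job itself is plain Python either way:

```python
# RQ sketch: enqueue a job for a background worker backed by Redis.
def count_words(text):
    """The background job: a plain Python function."""
    return len(text.split())

if __name__ == "__main__":
    try:
        from redis import Redis
        from rq import Queue

        q = Queue(connection=Redis())  # connect to a local Redis server
        job = q.enqueue(count_words, "hello async world")
        # A separate `rq worker` process picks the job up and runs it,
        # possibly on a different server pointed at the same Redis.
    except Exception:
        # Redis/RQ not available here: run the job directly instead.
        print(count_words("hello async world"))
```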

Speakers: Dishant Sethi