Algorithm for fair multithreaded task scheduler for languages like C, C++, C#, Rust, Java. C++ version is open-sourced. Features: (1) formally verified in TLA+, (2) even CPU usage across worker threads, (3) coroutine-like functionality, (4) almost entirely lock-free, (5) up to 10 million RPS per thread.
"Task scheduling" essentially means asynchronous execution of callbacks, functions. Some kind of a "scheduler" is omnipresent in most services - an event loop; a thread-pool for blocking requests; a coroutine engine - you name it. Scheduler is an important basis on top of which the service’s logic can be built.
Gamedev is no exception. I work at Ubisoft - we have miles of code used in thousands of servers, mostly C++. There is a vast deal of task types to execute: download a save, send a chat message, join a clan, etc. Often they compose one multi-step task: (1) take a profile lock, (2) download a save, (3) free the lock, (4) respond to the player. Between each step there is a waiting time until the operation is done.
One of the game engines’ backend code had a simple scheduler generic enough to be used for every async job in all servers. It juggled tasks across several internal worker threads. But it had the following typical issues:
Unfairness. Tasks were distributed to worker threads in a round-robin. If tasks differ in duration, some threads can appear choking while others are idle.
Polling. In a naive scheduler multi-step task execution works via periodic wakeup of the task. When awake, the task checks if the current step is done and if it can go to the next one. With many thousands of tasks this polling eats notably more CPU than the actual workload.
The talk presents a new highly efficient general purpose threaded task scheduler algorithm, which solves these problems and achieves even more:
- Complete fairness - even CPU usage across worker threads and no task pinning;
- Coroutine-like - API to wake a task up on a deadline and for immediate wakeup;
- No contention - operation is mostly built on lock-free algorithms;
- Formal correctness - the scheduler is formally verified in TLA+;
After the scheduler was implemented in C++ and embedded into several highly loaded servers, it gave N-fold improvement of both RPS and latency (more than x10 speed up for one server).
At the same time the talk is not C++-specific. It is rather a presentation of several algorithms combined in the scheduler. They can be implemented in many languages: at least C, C++, Rust, Java, C#. Only need support of atomics and threads.
All that is completely open source - https://github.com/ubisoft/task-scheduler.