This presentation describes the new read (aka primary) balancer added in the upcoming Ceph release (Reef) and explains how the framework developed as part of this balancer can serve more sophisticated use cases. Specifically, it shows how you can use this framework to create a policy that changes the software-defined storage (SDS) load dynamically, mitigating effects such as noisy neighbors and faulty network devices (NICs or ToR switches) without moving data around. This is especially useful when the effects are temporary (for example, a noisy neighbor in a hyper-converged environment).
The new balancer consists of a policy that defines the desired primary configuration and an engine that changes the current configuration to match it (or to come as close to it as possible). The engine is fast and involves no data movement (only metadata changes), so it can be run periodically at relatively short intervals. As a result, anyone can write policies that react to changes in cluster behavior in near real time. We will show some use cases and the logic for building policies that maximize cluster performance for those use cases.
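The policy/engine split can be illustrated with a toy model. The sketch below is not Ceph's implementation. It assumes a hypothetical cluster in which each placement group (PG) lists its replica OSDs with the current primary first, a policy expressed as a per-OSD weight, and a simple greedy engine. All function and variable names are invented for illustration. The engine only re-picks the primary among replicas that already hold the data, which is why no data moves.

```python
from collections import Counter

def desired_counts(pgs, weights):
    """Policy (hypothetical): target primaries per OSD, proportional to its weight."""
    total = sum(weights.values())
    return {osd: len(pgs) * w / total for osd, w in weights.items()}

def rebalance_primaries(pgs, weights):
    """Engine (toy greedy): re-pick each PG's primary among its existing replicas."""
    target = desired_counts(pgs, weights)
    primary = {pg: osds[0] for pg, osds in pgs.items()}  # first replica is primary
    load = Counter(primary.values())
    changed = True
    while changed:
        changed = False
        for pg, osds in pgs.items():
            cur = primary[pg]
            # Replica that is furthest below its target number of primaries
            best = min(osds, key=lambda o: load[o] - target[o])
            # Switch only when it strictly reduces the imbalance, so the loop ends
            if (load[cur] - target[cur]) - (load[best] - target[best]) > 1:
                load[cur] -= 1
                load[best] += 1
                primary[pg] = best
                changed = True
    return primary

# Six PGs, three OSDs, every PG initially led by OSD 0
pgs = {0: [0, 1, 2], 1: [0, 1, 2], 2: [0, 2, 1],
       3: [0, 1, 2], 4: [0, 2, 1], 5: [0, 1, 2]}

even = rebalance_primaries(pgs, {0: 1, 1: 1, 2: 1})
# Policy reacting to a noisy neighbor on OSD 0: give it no primaries
drained = rebalance_primaries(pgs, {0: 0, 1: 1, 2: 1})
print(Counter(even.values()), Counter(drained.values()))
```

In this model, reacting to a temporary problem amounts to changing the weights and rerunning the engine. The replica sets in `pgs` are never modified, only the choice of primary within each set.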
Speakers: Josh Salomon