In oVirt datacenter virtualization environments, a manager directs hosts to perform operations on shared storage. These operations create or remove volumes, copy data between volumes, create or merge snapshots, and carry out various other actions related to virtual machine storage. For efficiency and balance, these operations should be distributed across multiple hosts and run in parallel when possible. Maintaining reliability under real-world conditions requires careful management and resilient algorithms. This talk will introduce some of the problems that can arise, including dropped communications, scheduling conflicts, and host or storage array failures. Next, a solution to these problems will be presented that uses shared storage locking, atomic operations, volume generations, and forensic analysis of the storage. Through step-by-step examples, the audience will see how the proposed solution can solve each of the outlined problems.
Speakers: Adam Litke