The talk will begin with an overview of the Synnefo architecture (Python,
Django, Ganeti, and Xen or KVM). Then, it will introduce Archipelago, a
software-defined, distributed storage layer that decouples Volume and File
operations and logic from the underlying storage technology that holds the
actual data. Archipelago provides a unified way to provision, handle and
present Files, Images, Volumes and Snapshots independently of the storage
backend. We will focus on using Ceph/RADOS as a backend for Archipelago.
With Archipelago and its integration with Synnefo and Ganeti, one can:
* maintain commonly-used data sets as snapshots and attach them to VMs
running the application, as read-only disks or writeable clones,
* begin with a base OS image, bundle all application and supporting
library code inside it, and upload it to the Archipelago backend in a
syncing, Dropbox-like manner,
* start a parallel HPC application in hundreds of VMs, thinly provisioned
from this Image,
* modify the I/O processing pipeline to enable aggressive client-side
caching for higher IOPS,
* create point-in-time snapshots of running VM disks,
* share snapshots with other users, with fine-grained Access Control Lists,
* and sync them back to a local PC for further processing.
The talk will include a live demonstration of a workflow that exercises
this functionality in the context of a large-scale public cloud.
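To make the thin-provisioning and snapshot steps in the workflow above more concrete, here is a minimal copy-on-write sketch in Python. It is only a conceptual illustration and not Archipelago's actual implementation or API: the `BlockStore` and `Volume` classes, their methods, and the block-map layout are all hypothetical names chosen for this example.

```python
class BlockStore:
    """Hypothetical backend: maps block IDs to immutable byte strings."""

    def __init__(self):
        self._blocks = {}
        self._next_id = 0

    def put(self, data: bytes) -> int:
        # Every write allocates a fresh block; existing blocks never change.
        block_id = self._next_id
        self._next_id += 1
        self._blocks[block_id] = data
        return block_id

    def get(self, block_id: int) -> bytes:
        return self._blocks[block_id]

    def __len__(self) -> int:
        return len(self._blocks)


class Volume:
    """Hypothetical volume: a map from block index to block ID in the store."""

    def __init__(self, store, block_map=None, read_only=False):
        self.store = store
        self.block_map = dict(block_map or {})  # private copy of the map only
        self.read_only = read_only

    def write(self, index: int, data: bytes) -> None:
        if self.read_only:
            raise PermissionError("volume is read-only")
        # Copy-on-write: point this volume's map at a newly allocated block.
        self.block_map[index] = self.store.put(data)

    def read(self, index: int) -> bytes:
        block_id = self.block_map.get(index)
        return self.store.get(block_id) if block_id is not None else b""

    def snapshot(self) -> "Volume":
        # Point-in-time, read-only view that shares all existing blocks.
        return Volume(self.store, self.block_map, read_only=True)

    def clone(self) -> "Volume":
        # Writeable, thinly provisioned copy: no data is duplicated up front.
        return Volume(self.store, self.block_map, read_only=False)


store = BlockStore()
base = Volume(store)
base.write(0, b"base OS")
base.write(1, b"application and libraries")

image = base.snapshot()                     # the bundled Image
vms = [image.clone() for _ in range(100)]   # one thin clone per VM
vms[0].write(1, b"scratch data")            # only this clone diverges
```

In this sketch, creating the hundred clones copies only the block maps, so the store still holds two data blocks until a clone writes; the single write by `vms[0]` adds a third block and leaves the image and the other clones untouched.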
The intended audience ranges from enterprise users comparing cloud platforms to developers who wish to explore a different design approach to IaaS clouds and storage virtualization.
Please see the abstract above for a rough sketch of the proposed presentation.
The presentation will consist mostly of a live demonstration and a small deck of slides,
meant to describe the main features of Archipelago and its integration with Synnefo,
as well as to provoke discussion in the Q&A session.
The main workflow in the demonstration will be (repeated from the abstract):
* begin with a base OS image, bundle all application and supporting
library code inside it, and upload it to the Archipelago backend in a
syncing, Dropbox-like manner,
* start a parallel HPC application in hundreds of VMs, thinly provisioned
from this Image,
* modify the I/O processing pipeline to enable aggressive client-side
caching for higher IOPS,
* create point-in-time snapshots of running VM disks,
* share snapshots with other users, with fine-grained Access Control Lists,
* and sync them back to a local PC for further processing.