Unikernels aim to improve the way single-purpose systems are built: the user's application is compiled together with a minimal kernel. The resulting deployments require less memory, less disk, less CPU, and less time to be up and running. Moreover, the whole system spends most of its time in the user application, or doing I/O on its behalf, so CPU time is used more efficiently.

In this presentation, we talk about the use of unikernels for High Performance Computing. We present a work in progress that aims at implementing the MPI standard on top of Toro, an open-source non-POSIX unikernel. We implement a library whose interface is compatible with the Open MPI implementation and which relies on the Toro API to implement the MPI functions. In particular, the library leverages Toro features such as per-CPU memory allocation, a cooperative scheduler, thread migration, and inter-core communication based on Virtio.

During initialization, Toro creates one instance of the MPI application per core. Each instance is a thread that is migrated to its core and then executes without any interference. When an instance allocates memory, the request is served from its own core's memory pool. This keeps allocations local, thus improving cache usage (see the allocator sketch below). Likewise, primitives like MPI_Gather() or MPI_Scatter() that require communication between instances are implemented on top of a new Virtio device named virtio-bus, which provides core-to-core communication without locking (see the ring-buffer sketch below).

At the moment, we have implemented the following APIs:
- MPI_Gather()
- MPI_Scatter()
- MPI_Reduce()
- MPI_Barrier()

The goal of this PoC is to port benchmarks from the OSU Micro-Benchmarks suite (http://mvapich.cse.ohio-state.edu/benchmarks/) in order to compare with existing implementations. During the presentation, we explain how this is implemented and we demonstrate the current implementation by executing different MPI applications on top of Toro.
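To make the per-core memory pools concrete, here is a minimal C sketch of how such an allocator can work. It assumes one statically sized pool per core and a hypothetical GetCoreId() helper; none of these names come from the actual Toro API.

```c
#include <stddef.h>

#define MAX_CORES 64
#define POOL_SIZE (16 * 1024 * 1024)

/* Hypothetical helper: returns the id of the core the caller runs on. */
extern unsigned GetCoreId(void);

typedef struct {
    unsigned char heap[POOL_SIZE];
    size_t next;                   /* bump pointer into the local heap */
} core_pool_t;

static core_pool_t pools[MAX_CORES];

void *core_alloc(size_t bytes)
{
    /* Each thread stays pinned to its core, so the pool is private:
     * allocation is a simple bump of the local pointer, with no lock
     * and no cache-line sharing between cores. */
    core_pool_t *p = &pools[GetCoreId()];
    if (p->next + bytes > POOL_SIZE)
        return NULL;               /* pool exhausted */
    void *ptr = &p->heap[p->next];
    p->next += bytes;
    return ptr;
}
```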
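Lock-free core-to-core communication is typically built on single-producer/single-consumer queues. The following C sketch shows one such ring buffer; it only illustrates the kind of channel a device like virtio-bus could expose, and every name in it (ring_t, ring_push, ring_pop) is illustrative rather than the actual virtio-bus interface.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SLOTS 256

typedef struct {
    int buf[RING_SLOTS];
    _Atomic unsigned head;         /* advanced by the consumer core only */
    _Atomic unsigned tail;         /* advanced by the producer core only */
} ring_t;

/* Called on the sending core; returns false if the ring is full. */
bool ring_push(ring_t *r, int v)
{
    unsigned t = atomic_load_explicit(&r->tail, memory_order_relaxed);
    if (t - atomic_load_explicit(&r->head, memory_order_acquire) == RING_SLOTS)
        return false;
    r->buf[t % RING_SLOTS] = v;
    atomic_store_explicit(&r->tail, t + 1, memory_order_release);
    return true;
}

/* Called on the receiving core; returns false if the ring is empty. */
bool ring_pop(ring_t *r, int *v)
{
    unsigned h = atomic_load_explicit(&r->head, memory_order_relaxed);
    if (h == atomic_load_explicit(&r->tail, memory_order_acquire))
        return false;
    *v = r->buf[h % RING_SLOTS];
    atomic_store_explicit(&r->head, h + 1, memory_order_release);
    return true;
}
```

With one producer and one consumer per ring, each index is written by exactly one core, so a gather can be implemented by the root draining one such ring per peer core without ever taking a lock.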
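For reference, the kind of program this PoC targets is an ordinary MPI C program restricted to the collectives listed above. The example below is standard MPI code, not Toro-specific; it assumes that the usual MPI_Init()/MPI_Comm_rank()/MPI_Finalize() entry points are also available in the library.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int value = rank + 1;          /* per-instance contribution */
    int sum = 0;

    /* Combine one value from each per-core instance on the root. */
    MPI_Reduce(&value, &sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);

    /* Synchronize all per-core instances before printing. */
    MPI_Barrier(MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum over %d cores: %d\n", size, sum);

    MPI_Finalize();
    return 0;
}
```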
Speakers: Matias Vara