Please note that this talk has been cancelled as the speaker is unfortunately unable to attend FOSDEM.
Containers are widely used in clouds thanks to their light weight and scalability. GPUs offer powerful parallel processing capabilities and are widely adopted to accelerate application execution. In a cloud environment, a container may require one or more GPUs to meet the resource demands of its applications, yet giving a container exclusive access to a GPU usually leaves the resource underutilized. How to share GPUs among containers is therefore an attractive problem for cloud providers. In this presentation, we propose an approach, called vCUDA, for sharing GPU memory and computing resources among containers. vCUDA partitions physical GPUs into multiple virtual GPUs and assigns the virtual GPUs to containers on request. Elastic and dynamic resource allocation are adopted to improve resource utilization. The experimental results show that vCUDA incurs only 1.015% overhead on average while effectively allocating and isolating GPU resources among containers.
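To give a flavour of what "partitioning a physical GPU into virtual GPUs" can mean in practice, the sketch below shows one common way a GPU-sharing layer can cap a container's device-memory usage: interposing on the CUDA runtime allocation calls (e.g. via LD_PRELOAD, assuming the application links libcudart dynamically) and accounting allocations against a per-container quota. This is only an illustrative sketch, not vCUDA's actual implementation; the environment variable and bookkeeping variables are hypothetical.

```c
/* Illustrative sketch only: enforce a per-container GPU memory quota by
 * interposing on cudaMalloc/cudaFree. NOT vCUDA's actual implementation;
 * the VGPU_MEM_LIMIT_BYTES variable and quota bookkeeping are hypothetical. */
#define _GNU_SOURCE
#include <cuda_runtime.h>
#include <dlfcn.h>
#include <stdlib.h>

static size_t quota_bytes;   /* this container's virtual-GPU memory share */
static size_t used_bytes;    /* bytes currently allocated through this shim */

static cudaError_t (*real_cudaMalloc)(void **, size_t);
static cudaError_t (*real_cudaFree)(void *);

__attribute__((constructor))
static void init(void) {
    /* Hypothetical env var set by the container runtime for this vGPU. */
    const char *q = getenv("VGPU_MEM_LIMIT_BYTES");
    quota_bytes = q ? (size_t)strtoull(q, NULL, 10) : (size_t)-1;
    real_cudaMalloc = dlsym(RTLD_NEXT, "cudaMalloc");
    real_cudaFree   = dlsym(RTLD_NEXT, "cudaFree");
}

cudaError_t cudaMalloc(void **devPtr, size_t size) {
    if (used_bytes + size > quota_bytes)
        return cudaErrorMemoryAllocation;   /* quota exceeded: refuse */
    cudaError_t err = real_cudaMalloc(devPtr, size);
    if (err == cudaSuccess)
        used_bytes += size;                 /* charge against the quota */
    return err;
}

cudaError_t cudaFree(void *devPtr) {
    /* A full interposer would track per-pointer sizes and decrement
     * used_bytes here; omitted for brevity. */
    return real_cudaFree(devPtr);
}
```

Compute sharing can be handled along similar lines, for example by limiting how many kernels or streaming-multiprocessor shares each container's virtual GPU may use at a time; the talk's elastic and dynamic allocation policies then adjust these shares as container demand changes.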