Commonly used CI systems operate as SaaS solutions, where the user is not
running the CI stack locally. This creates a debugging pitfall, as
developers cannot easily reproduce the problem locally and cannot interactively
examine it. This talk proposes an inverted design, where a self-operated CI tool
can be used both in the cloud and locally, supporting interactive
debugging sessions.
The recent surge of CI systems has created an interesting new problem, where a
failure occurs in a specific test environment but does not appear in the
familiar environment used by the developer.
This problem is compounded by the batch nature of such systems, where a
developer can only push additional patches to a branch to trigger an
asynchronous execution process.
During the development of the Ubuntu Core operating system, this problem was
amplified by the fact that building and testing a full OS image is a
time-consuming process, leading to cycles that spanned hours and led to
frustration. Snapd developers created the spread program to solve this problem,
among others.
The SaaS solution became a thin wrapper around spread, which allocates,
provisions, uses and finally discards the test environment. Crucially, almost
all errors can be reproduced locally, as spread runs as a standalone tool, using
QEMU, LXD, Google Compute Engine or Linode as executors.
This allows anyone to run spread locally, in interactive mode, and explore the
problem without putting additional load on the centralized CI system, greatly
improving the debugging process.