Build recorder: a system to capture detailed information
A problem facing many people who work with SBOMs (Software Bills of Materials) is that, given a binary artifact generated by a project, it is hard, and sometimes impossible, to trace it back to the exact files that were used to create it.
In a typical setup, a project consists of a number of source files written in a programming language, and a build process turns them into a binary executable. In most cases, however, only a subset of those files is actually processed (others are test cases, for example), additional files from outside the project are also used (such as standard header files residing elsewhere on the system), and a number of tools are invoked, each introducing another dependency.
In this presentation we introduce build_recorder, a command-line tool that records complete and detailed information about every file that affects the build process in any way. The tool works transparently while the software is being built, without requiring any change to the source code or build system. For each file used, it saves a number of attributes, such as name, full path, checksum and exact use. We will cover the information recorded, the basics of operation, the generated output format, and planned future enhancements.
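As a rough illustration of the per-file attributes mentioned above (name, full path, checksum and use), the following Python sketch shows what such a record might look like. It is not build_recorder's implementation or output format: the `FileRecord` type, its field names, the `describe_file` helper, the choice of SHA-256 and the "read"/"written" labels are all assumptions made for this example.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical per-file record; field names are illustrative only."""
    name: str       # base name of the file
    full_path: str  # absolute path with symbolic links resolved
    checksum: str   # SHA-256 hex digest of the contents (algorithm is an assumption)
    use: str        # how the build used the file, e.g. "read" or "written"


def describe_file(path: str, use: str) -> FileRecord:
    """Collect the attributes of one file that a build step touched."""
    full_path = os.path.realpath(path)
    digest = hashlib.sha256()
    # Hash in fixed-size chunks so large artifacts are not loaded into memory at once.
    with open(full_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return FileRecord(
        name=os.path.basename(full_path),
        full_path=full_path,
        checksum=digest.hexdigest(),
        use=use,
    )
```

Calling `describe_file` on a header with the use "read", for instance, would yield one record for a file that lives outside the project tree yet still influences the resulting binary.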
This work is the result of a 2022 Google Summer of Code project for the GFOSS organization.