Code instrumentation is the primary means of extracting fine-grained performance data from programs. However, special care has to be taken with regard to overhead management, as full instrumentation can increase the runtime by orders of magnitude. Careful selection of the instrumentation configuration (IC), typically via filter lists, is therefore crucial to retain the performance characteristics of the original application. To give the user better control over what is measured, we have developed CaPI, an open-source tool for the creation of low-overhead, user-defined ICs. CaPI relies on a statically constructed whole-program call graph as its central data structure, enabling the user to select functions based on the context they are called in, in addition to function-level metrics. Currently, this call graph is generated externally by tools running on the source level. This can be cumbersome, especially when targeting large-scale scientific software. To mitigate this issue, we are developing an approach that runs the analysis on the LLVM intermediate representation during link-time optimization. Running at link time also allows us to embed the result into the binary, improving the workflow and usability of CaPI. In this talk, we will discuss the advantages and shortcomings of call graphs generated at link time compared to call graphs generated at the source level, and show how statically generated information can be augmented dynamically at runtime.
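To make the idea of context-based selection concrete, the following is a minimal sketch of how a whole-program call graph can be used to derive an IC. It is not CaPI's actual interface or selection language: the call graph, the function names, and the helpers `reachable`, `reverse` and `select_on_call_path` are all hypothetical and chosen for illustration only. The sketch selects every function that lies on some call path from an entry point to an MPI call, so that functions that never lead to communication are left uninstrumented.

```python
from collections import deque


def reachable(graph, start):
    """Return all functions reachable from `start` (inclusive) via BFS."""
    seen = {start}
    queue = deque([start])
    while queue:
        fn = queue.popleft()
        for callee in graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def reverse(graph):
    """Build the reversed call graph (callee -> set of callers)."""
    rev = {}
    for caller, callees in graph.items():
        rev.setdefault(caller, set())
        for callee in callees:
            rev.setdefault(callee, set()).add(caller)
    return rev


def select_on_call_path(graph, entry, targets):
    """Select functions on at least one call path from `entry` to a target."""
    from_entry = reachable(graph, entry)
    rev = reverse(graph)
    to_target = set()
    for target in targets:
        if target in rev:
            to_target |= reachable(rev, target)
    return from_entry & to_target


# Hypothetical call graph of a small MPI application (caller -> callees).
CALL_GRAPH = {
    "main": ["init", "solve", "log"],
    "init": ["MPI_Init"],
    "solve": ["exchange_halo", "compute_kernel"],
    "exchange_halo": ["MPI_Sendrecv"],
    "compute_kernel": ["dot"],
    "log": ["format_msg"],
}

# Targets: every function reachable from main whose name starts with "MPI_".
mpi_calls = {fn for fn in reachable(CALL_GRAPH, "main") if fn.startswith("MPI_")}
selection = select_on_call_path(CALL_GRAPH, "main", mpi_calls)

# Emit the selection as a simple include list, one function per line.
print("\n".join(sorted(selection)))
```

In this toy graph, `log`, `format_msg`, `compute_kernel` and `dot` are excluded because none of them can reach an MPI call, which is the kind of context-dependent filtering that function-level metrics alone cannot express.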
Speakers: Tim Heldmann, Sebastian Kreutzer