Analyzing the Software Development Life-Cycle using Data-Mining Techniques

FOSDEM 2017

One of the major challenges for certification in the SIL2LinuxMP project is to show that Linux not only defines a development process but also follows it. To this end (and far beyond!), the meta-data of commits to the Linux kernel is analyzed. We hope to get several outputs from this analysis, for example:

    - Competence of persons involved (IEC 61508-1, 6.2.13/6.2.14)
    - Dependencies amongst developers (Independence of persons doing code
      reviews)
    - Identification of patches that did not get enough review (based on
      patch complexity, experience of author, reviews, etc.)
    - Automatic notification of patches in our configuration
    - Bug analysis (based on Fixes: tag)
    - Subsystem dependencies and conflicts
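
As a rough illustration of the kind of commit meta-data such an analysis can draw on, the sketch below extracts the tag lines the Linux kernel conventionally places at the end of a commit message (Fixes:, Signed-off-by:, Reviewed-by:, Acked-by:, Tested-by:). It is not taken from DLCDM, whose implementation is not described here; the function names, the example commit message and the idea of counting Reviewed-by plus Acked-by as a crude review indicator are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Tag lines as conventionally used in Linux kernel commit messages.
# The selection of tags is an assumption made for this sketch.
TAG_RE = re.compile(
    r"^(Fixes|Signed-off-by|Reviewed-by|Acked-by|Tested-by):[ \t]*(.+)$",
    re.MULTILINE,
)

# A Fixes: value conventionally starts with an abbreviated commit hash,
# followed by the subject of the fixed commit in parentheses.
HASH_RE = re.compile(r"^([0-9a-f]{7,40})\b")


def extract_tags(message):
    """Return a dict mapping each tag name to the list of its values."""
    tags = defaultdict(list)
    for name, value in TAG_RE.findall(message):
        tags[name].append(value.strip())
    return dict(tags)


def fixed_commits(message):
    """Return the commit hashes referenced by Fixes: tags."""
    hashes = []
    for value in extract_tags(message).get("Fixes", []):
        match = HASH_RE.match(value)
        if match:
            hashes.append(match.group(1))
    return hashes


def review_count(message):
    """Count Reviewed-by and Acked-by tags (a crude, hypothetical metric)."""
    tags = extract_tags(message)
    return len(tags.get("Reviewed-by", [])) + len(tags.get("Acked-by", []))


# Hypothetical commit message, invented for this example.
EXAMPLE_MESSAGE = """\
foo: fix off-by-one in buffer length check

The length was compared with <= instead of <.

Fixes: 1a2b3c4d5e6f ("foo: add buffer length check")
Signed-off-by: Alice Example <alice@example.org>
Reviewed-by: Bob Example <bob@example.org>
"""
```

Calling fixed_commits(EXAMPLE_MESSAGE) yields the hash of the commit that introduced the bug, which is the starting point for a Fixes:-based bug analysis. The reviewer and author addresses returned by extract_tags could likewise feed an analysis of dependencies amongst developers.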

The talk covers everything from gathering the data, to distributing it to everyone in the project while keeping it up to date, to our first analysis results. Each of these phases comes with its own set of problems that had to be considered, leading to a framework called DLCDM (Development Life-Cycle Data-Mining), which is introduced for the first time in this talk.

Speakers: Andreas Platschek