Building the world’s first free open source database of FOSS and their vulnerabilities.

FOSDEM 2021

VulnerableCode is a free and open source database of vulnerabilities and the FOSS packages they impact. It is made by the FOSS community to improve the security of the open source software ecosystem. Its design addresses several problems with existing solutions, such as licensing, data complexity and usability.

Introduction

Using software with known vulnerabilities is one of the OWASP Top 10 security risks. This is becoming more important as more and more software is built on top of existing free and open source software. From the perspective of software composition analysis (SCA), it is therefore important to know which vulnerable components are in use, and this naturally requires a database that maps packages to their vulnerabilities. Below are some of the problems with existing solutions and how VulnerableCode addresses them.

The CPE problem:-

The National Vulnerability Database, maintained by the US government, uses the CPE data format to map vulnerabilities to affected components. CPEs were invented before the mass adoption of FOSS and have a vendor-centric design. They are therefore ineffective at mapping vulnerabilities to FOSS packages, because FOSS is developed in a distributed way and is shipped through many different packaging systems.

Instead of CPEs, VulnerableCode uses Package URLs to map vulnerabilities to packages. Package URL is a data format undergoing rapid adoption; it was designed at ScanCode to represent packages from different ecosystems.
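To make the contrast with CPEs concrete, here is a minimal sketch of how a Package URL can be taken apart. A Package URL has the general shape `pkg:type/namespace/name@version?qualifiers#subpath`, so the ecosystem (the type) is part of the identifier itself. The `SimplePurl` class and `parse_purl` function are hypothetical names used only for illustration; they cover a simplified subset of the syntax and are not VulnerableCode's own code.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import unquote


@dataclass(frozen=True)
class SimplePurl:
    """Illustrative container for the main Package URL components."""
    type: str                 # ecosystem, e.g. "pypi", "deb", "maven"
    namespace: Optional[str]  # e.g. a distro or a Maven group id
    name: str
    version: Optional[str]


def parse_purl(purl: str) -> SimplePurl:
    """Parse a simplified subset of the Package URL syntax.

    Qualifiers (?key=value) and the subpath (#...) are dropped here;
    a real parser would keep and validate them.
    """
    scheme, sep, rest = purl.partition(":")
    if scheme != "pkg" or not sep:
        raise ValueError(f"not a package URL: {purl!r}")

    # Drop subpath and qualifiers, then any leading or trailing slashes.
    rest = rest.split("#", 1)[0].split("?", 1)[0].strip("/")

    # The version, if present, follows the last "@".
    if "@" in rest:
        path, version = rest.rsplit("@", 1)
    else:
        path, version = rest, None

    segments = path.split("/")
    if len(segments) < 2:
        raise ValueError(f"missing type or name: {purl!r}")

    pkg_type = segments[0].lower()
    name = unquote(segments[-1])
    namespace = "/".join(unquote(s) for s in segments[1:-1]) or None
    return SimplePurl(pkg_type, namespace, name,
                      unquote(version) if version else None)


print(parse_purl("pkg:pypi/django@3.1.5"))
print(parse_purl("pkg:deb/debian/curl@7.50.3-1?arch=i386"))
```

The same package name can appear under different types (for example a PyPI package and its Debian repackaging) without colliding, which is the property a vendor-centric identifier lacks.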

The License problem:-

Most of the remaining solutions are commercial vulnerability databases. This is a problem because data about vulnerabilities affecting FOSS must also be free and open source. Keeping such data behind a paywall limits the usage of FOSS. Commercial vulnerability databases also have an incentive to bloat their data to outshine the competition, and since the data is not open, it cannot be audited for fake vulnerabilities.

VulnerableCode has a libre license and its data is fully open, which removes the paywall problem. Bloating the data is not an option either: because the project is open source, the data is always available for public scrutiny.

The Data problem:-

Almost every major FOSS distributor provides some form of public vulnerability disclosure, but they use different data formats to do so. Some use machine-readable formats with diverse schemas; others provide only human-readable disclosures. This reduces the usability of such data in SCA tools.

VulnerableCode instead parses data from these sources and converts it into a single Package URL based data format, which other SCA tools can easily leverage. The open source nature of the project also allows community curation, so the data can be further enriched and cleaned up by the community.
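The idea of converting heterogeneous sources into one Package URL based format can be sketched as follows. Everything here is an assumption made for illustration: the `Advisory` record, the two input shapes (a structured feed entry and a free-text disclosure line), the field names, and the placeholder identifiers such as `CVE-0000-0000` are hypothetical and do not reflect VulnerableCode's actual importers or any real advisory.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Advisory:
    """Unified record: one vulnerability mapped to one affected package."""
    vulnerability_id: str
    affected_purl: str


def from_structured(record: dict) -> Advisory:
    """Convert an entry from a hypothetical machine-readable feed."""
    purl = f"pkg:{record['ecosystem']}/{record['package']}@{record['version']}"
    return Advisory(record["cve"], purl)


# Hypothetical sentence layout for a human-readable disclosure.
_TEXT_PATTERN = re.compile(
    r"(?P<cve>CVE-\d{4}-\d{4,}) affects "
    r"(?P<name>[\w.-]+) (?P<version>[\w.+~-]+) on (?P<distro>\w+)"
)


def from_text(line: str) -> Advisory:
    """Convert a line from a hypothetical human-readable disclosure."""
    match = _TEXT_PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized disclosure line: {line!r}")
    # Assumes the distro ships deb packages; a real importer would
    # pick the Package URL type per source.
    purl = f"pkg:deb/{match['distro'].lower()}/{match['name']}@{match['version']}"
    return Advisory(match["cve"], purl)


# Placeholder inputs, not real advisories.
advisories = [
    from_structured({"cve": "CVE-0000-0000", "ecosystem": "pypi",
                     "package": "example", "version": "1.0"}),
    from_text("CVE-0000-0001 affects example 1.0-2 on Debian"),
]
for advisory in advisories:
    print(advisory.vulnerability_id, "->", advisory.affected_purl)
```

Once every source is reduced to the same record shape, a consuming SCA tool only needs to compare Package URLs, regardless of where the advisory originated.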

Speakers: Shivam Sandbhor