Open source software and hardware power incredible scientific and research discoveries, from images of a black hole in 2019, to visualising proteins in 3D virtual reality, to open source hardware that can affordably save lives during the COVID-19 pandemic. Despite this, the academic credit and metrics system does not incentivise or reward code sharing, instead relying on research papers and citations for structured reward and promotion systems.
In this short talk, we will share the vision of the Wellcome Trust Data for Science and Health team, which is working over the next five years to incentivise the foundational tools, trust, and talent required to recognise and fund open research software sustainably.
Picture Sam, a researcher in a white coat: they spend hours in the lab with their beakers, test tubes, and agar plates. Once they’ve gathered results they think the world will want to know, they write up their conclusions, publish in a journal, and repeat. While this type of research has existed for a long time, and will continue to do so, this model doesn’t effectively acknowledge newer scenarios, where machines and mass data collection result in what is often referred to by the now-buzzword “big data” – data at a scale where it must be managed and analysed with the use of computers.
This type of research gap is often addressed by Joy, the computational researcher. Joy may identify as a coder, a researcher, an engineer, a data scientist, support staff, a lab assistant, or some other title. The main theme is that Joy writes computer code in order to make their job easier, or perhaps even to make it feasible at all. Given that this computer code feeds into and creates the end results, it is reasonable to consider it a research method, albeit a less traditional research method or output than the research paper.
In the current world, Joy is typically less supported than a more traditional academic like Sam. Fewer funding calls exist for code work, and even fewer recognise or reward good open source behaviour. This is despite the fact that an incredible number of scientific analyses themselves rely on open source software, and despite the peril of not sharing research code: completely wrong scientific results that don't look wrong, because no one can see inside the black box of proprietary code. Things have gotten a little better: there are scientific journals that accept software papers, and even journals devoted exclusively to open source software papers, but they're far from the norm. We still have a long way to go.
In order to make real progress, research code must be open so that others can verify, re-use, and build upon it. Progress also requires recognising and funding underfunded areas such as open source community management, accessibility, software maintenance, and user experience. The Wellcome Trust has dedicated £75 million over the next five years towards making trustworthy data science a first-class citizen in the research ecosystem. By sharing our vision with others in the open source world, we hope to encourage this attitude, meet others working towards the same goals, and meet prominent contributors to the open source research ecosystem who might be looking for funding opportunities.
Speakers: Yo Yehudi