We live in a world that relies more and more on data. The emergence and progression of computers has made data harvesting and storage far larger, both in volume and in computational demand. However, with this mass of data comes the question: "How can we extract meaningful semantics from this (usually unstructured) data to gain logical insight, and perhaps to build some sort of knowledge base?" This is where the umbrella term "Data Science" comes in, jointly covering disciplines such as Data Analytics, Data Mining and Machine Learning that are used to answer this (quite vague and abstract) question. This talk will cover a few fundamental points. Firstly, I will introduce the field of Data Science in more depth, including different examples and complications. Following this, I will talk about the common tools, techniques and languages used (such as libraries and programs), and give an analysis of the Data Science tools that currently exist in Debian: their 'health', Debian's strong (and weak) points in this field, and prospective packages that should be added. This will be a pre-recorded talk. This description is very much the bare minimum, and more content may be added.
Speakers: Shayan Doust