Live broadcast: https://www.youtube.com/watch?v=j90tdZyK6FA
The move to cloud has opened a world of new possibilities in software development.
It's so easy to spin up resources in the cloud, and together with the adoption of DevOps, software developers are more empowered than ever before. Of course, this also puts more demands on developers: they need to take control of, and understand, the complete cycle from deploying infrastructure to developing and deploying code. Luckily this process has clear benefits and makes teams less reliant on the skills of key persons: if infrastructure can be deployed as code, it can also be automated with different tools.
The end goal is to be able to deploy more code enhancements and at the same time benefit from the rapid pace of hardware and cloud improvements.
For the compute-heavy Data Science practice, the adoption of cloud and the flexibility of deploying environments has become a vital success factor. You can have a "supercomputer" at your fingertips for a short time and then decommission it again when your work is done.
But to be able to reuse this approach again and again with the same configuration, the actual infrastructure needs to be described in code.
Of course, this can be done with the help of Python, and ideally it should be automated in a true DevOps and CI/CD manner.
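As a minimal sketch of what this can look like, the snippet below keeps the specification of a short-lived compute instance in plain Python, assuming AWS and the boto3 SDK as the target platform. The function name, AMI id, and instance type are illustrative placeholders, not recommendations.

```python
# Sketch of "infrastructure as code" for a Data Science workload,
# assuming AWS and the boto3 SDK. All specific values are placeholders.

def gpu_instance_spec(ami_id: str, instance_type: str = "p3.2xlarge") -> dict:
    """Return launch parameters for a short-lived compute instance.

    Keeping the specification in a plain, version-controlled function
    means the exact same configuration can be reused on every run.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "TagSpecifications": [{
            "ResourceType": "instance",
            "Tags": [{"Key": "project", "Value": "data-science"}],
        }],
    }

# The actual deployment would then be a single call (requires credentials):
#   import boto3
#   ec2 = boto3.client("ec2")
#   response = ec2.run_instances(**gpu_instance_spec("ami-..."))
# ...and when the work is done, the "supercomputer" is decommissioned:
#   ec2.terminate_instances(
#       InstanceIds=[i["InstanceId"] for i in response["Instances"]])
```

Because the specification is ordinary code, it can be tested, reviewed, and wired into a CI/CD pipeline like any other part of the project.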
I will walk through some of my key takeaways from working with infrastructure as code for Data Science projects.