
Airflow 2.0 for ML pipelines – design, implementation and management

PyCon Sweden 2021

Live Stream: https://youtu.be/qWvJSIgOcPU

Airflow 2.0 brings many changes under the hood. This workshop gives an overview of the major updates from Airflow 1.x to 2.0, walks through Airflow's main components and how they work, and includes a hands-on demo of implementing and managing an end-to-end machine learning pipeline. Without a pipeline in place, managing multiple machine learning stages in production can be difficult. The workshop shows how Airflow simplifies the process and management of Python-based ML projects.

## Prerequisites

1. Install [Docker Desktop](https://www.docker.com/get-started) (with minimum 3GB memory allocated)
2. Start Docker engine
3. Clone the workshop repo with `git clone https://github.com/pycon-ml/airflow_workshop.git`
4. Run `docker-compose pull` inside repo folder `airflow_workshop`

## Agenda

- 05 min: Introduction
- 05 min: Major changes in Airflow 2.0
- 05 min: Pre-requisites setup overview
- 10 min: Walkthrough of different backend components
- 10 min: Different stages of a DAG file – steps and operators
- 10 min: Dynamic DAG creation to improve parallelism
- 15 min: How to trigger Airflow DAG runs
- 15 min: Debug and clear Airflow task errors
- 10 min: Overview of production-level Airflow-based architecture
- 05 min: Wrap up questions
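As a rough illustration of the agenda item on triggering DAG runs, the sketch below builds a request for the stable REST API introduced in Airflow 2.0 (`POST /api/v1/dags/{dag_id}/dagRuns`). It is not taken from the workshop material, which may use the UI or CLI instead. The DAG id `example_ml_pipeline`, the `airflow`/`airflow` credentials, the `localhost:8080` address and the helper name `build_trigger_request` are all assumptions, and the webserver would need basic authentication enabled for the API.

```python
import base64
import json
import urllib.request


def build_trigger_request(base_url, dag_id, conf=None,
                          username="airflow", password="airflow"):
    """Build (but do not send) a POST request that would trigger one DAG run.

    The credentials and any values passed in are placeholders (assumptions),
    not settings taken from the workshop repository.
    """
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # "conf" carries optional run-level parameters for the DAG.
    body = json.dumps({"conf": conf or {}}).encode("utf-8")
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # Hypothetical DAG id and local webserver address.
    req = build_trigger_request(
        "http://localhost:8080", "example_ml_pipeline", {"model": "baseline"}
    )
    print(req.get_method(), req.full_url)
    # To send it against a running webserver:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.status)
```

Building the request separately from sending it keeps the example runnable without a live Airflow instance. Passing the result to `urllib.request.urlopen` would submit the run.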

Speakers: Alen Jacob, Scott Zhou, Lini Jose, Nitin Bisht