Large Scale Feature Engineering and Datascience with Python & Snowflake

PyCon DE & PyData Berlin 2023

[Snowflake](https://www.snowflake.com/en/) is the core data repository of many large organizations. With the introduction of Snowflake's [Snowpark for Python](https://github.com/snowflakedb/snowpark-python), Python developers can now collaborate and build on one platform with a secure Python sandbox that offers dynamic scalability and elasticity as well as security and compliance. In this talk I'll explain the core concepts of Snowpark for Python and how they can be used for large-scale feature engineering and data science.

This talk is for technical people who would like a deep dive into how Snowflake enables large-scale feature engineering and data science via Snowpark for Python. During the talk we'll explore Snowflake's Python capabilities using a simple machine learning use case. After this talk you will:

* know how Snowpark avoids data movement and keeps existing security & governance intact,
* understand the concept of the Snowpark DataFrame API and how it enables accelerated performance compared to standard pandas DataFrames,
* know how to distribute hyperparameter tuning and the training of multiple models,
* understand the concept of vectorized user-defined functions (UDFs) and how they can be used to perform large-scale model inference.
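
The idea of a DataFrame API that avoids data movement can be sketched in plain Python: transformations are only recorded, and the whole pipeline is turned into a single SQL statement that the database would run where the data lives. This is a toy illustration of that general technique, not Snowpark's implementation or API. The `LazyFrame` class, the `ORDERS` table, and the column names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyFrame:
    """Toy lazily evaluated frame (hypothetical; not the Snowpark API)."""

    table: str
    columns: tuple = ("*",)
    predicates: tuple = ()

    def select(self, *columns):
        # Record the projection; no data is read here.
        return LazyFrame(self.table, tuple(columns), self.predicates)

    def filter(self, predicate):
        # Record the filter condition; no data is read here.
        return LazyFrame(self.table, self.columns, self.predicates + (predicate,))

    def to_sql(self):
        # Compile all recorded steps into one SQL statement.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(f"({p})" for p in self.predicates)
        return sql


# Hypothetical table and columns, chosen only for illustration.
features = LazyFrame("ORDERS").filter("AMOUNT > 100").select("CUSTOMER_ID", "AMOUNT")
query = features.to_sql()
print(query)
```

In a design like this, only the generated query travels to the database and only the result comes back, which is one way a client-side DataFrame API can leave the data, and the access controls around it, in place.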

Speakers: Michael Gorkow