AutoGluon: AutoML for Tabular, Multimodal and Time Series Data

PyCon DE & PyData Berlin 2023

AutoML, or automated machine learning, promises to transform raw data into accurate predictions with minimal human intervention, expertise, and manual experimentation. In this talk, we will introduce AutoGluon, a cutting-edge toolkit that enables AutoML for tabular, multimodal and time series data. AutoGluon emphasizes usability, supporting a wide variety of tasks, from regression to time series forecasting and image classification, through a unified and intuitive API. We will focus on tabular and time series tasks, where AutoGluon is the current state of the art, and demonstrate how it can be used to achieve competitive performance on tabular and time series competition data sets. We will also peek under the hood of AutoGluon and discuss the techniques it uses to automatically build and train these models.

[AutoGluon](http://auto.gluon.ai) is a Python machine learning library that offers cutting-edge accuracy and value-for-compute on a wide variety of tasks. These include regression, classification and quantile regression on tabular data, as well as multimodal tasks such as image classification, image-to-text and text-to-text similarity. A recent addition is AutoGluon-TimeSeries, the library's module for time series forecasting. AutoGluon is organized into modules for Tabular, Multimodal and Time Series tasks, all of which share an intuitive scikit-learn-like API for fitting cutting-edge machine learning models and performing inference with them in as little as three lines of code, without requiring an in-depth understanding of ML. AutoGluon is widely considered the state of the art in tabular tasks, as confirmed by the independent [AutoML Benchmark](https://openml.github.io/automlbenchmark/papers.html), and is the current top performer on multimodal tasks on the RAFT leaderboard. In this talk, we will focus on the tabular and time series modules and showcase how the library can be used to get competitive results on competition platforms such as Kaggle. AutoGluon also differs significantly under the hood from other AutoML frameworks. The library does not take AutoML to mean primarily hyperparameter optimization; instead, it leans heavily on building (stack) ensembles of strong but varied learning algorithms to achieve superior results. We will also showcase some of the theory and building blocks of AutoGluon, describing how we built an AutoML system that takes model ensembling as a central element.
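To make the idea of a stack ensemble concrete, here is a minimal sketch of two-level stacking in plain Python. It is not AutoGluon's code: the two toy base learners, the round-robin fold assignment, the least-squares meta-learner and every function name below are assumptions chosen for illustration only. The point it shows is that the combiner is trained on out-of-fold predictions, so it learns how much to trust each varied base model on data that model has not seen.

```python
import math
import random

def fit_linear(xs, ys):
    """Ordinary least squares for y = a*x + b; returns a predict function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    b = my - a * mx
    return lambda x: a * x + b

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regressor; returns a predict function."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

# Hypothetical base learners: deliberately different model families.
BASE_LEARNERS = [fit_linear, fit_knn]

def out_of_fold_predictions(xs, ys, n_folds=5):
    """Predict every row with a model that was trained without that row."""
    n = len(xs)
    oof = [[0.0] * n for _ in BASE_LEARNERS]
    for fold in range(n_folds):
        held_out = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        for m, fit in enumerate(BASE_LEARNERS):
            model = fit([xs[i] for i in train], [ys[i] for i in train])
            for i in held_out:
                oof[m][i] = model(xs[i])
    return oof

def fit_meta_weights(oof, ys):
    """Least-squares weights for two base learners (2x2 normal equations)."""
    p, q = oof
    spp = sum(a * a for a in p)
    sqq = sum(b * b for b in q)
    spq = sum(a * b for a, b in zip(p, q))
    spy = sum(a * y for a, y in zip(p, ys))
    sqy = sum(b * y for b, y in zip(q, ys))
    det = spp * sqq - spq * spq
    if abs(det) < 1e-12:  # base predictions are collinear: fall back to averaging
        return 0.5, 0.5
    return (spy * sqq - sqy * spq) / det, (sqy * spp - spy * spq) / det

def fit_stack(xs, ys):
    """Level 1: base models. Level 2: weights learned on out-of-fold predictions."""
    weights = fit_meta_weights(out_of_fold_predictions(xs, ys), ys)
    models = [fit(xs, ys) for fit in BASE_LEARNERS]  # refit on all data
    return lambda x: sum(w * m(x) for w, m in zip(weights, models))

# Toy data: a linear trend plus a smooth nonlinearity and a little noise.
rng = random.Random(42)
train_x = [rng.uniform(0, 10) for _ in range(100)]
train_y = [2 * x + 3 * math.sin(x) + rng.gauss(0, 0.2) for x in train_x]

stack = fit_stack(train_x, train_y)
print("stacked prediction at x=5.0:", stack(5.0))
```

In this sketch the linear model captures the trend and the nearest-neighbour model captures the local wiggle; the meta-learner only decides how to blend them. Swapping in stronger base learners changes nothing about the structure, which is why the sketch involves no hyperparameter search at all.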

Speakers: Caner Turkmen, Oleksandr Shchur