Time Series Forecasting with scikit-learn's Quantile Gradient Boosted Regression Trees

EuroSciPy 2022

This tutorial introduces scikit-learn's **histogram-based gradient boosted regression trees** and applies them, with various loss functions (least squares, **Poisson** and the **pinball loss** for quantile estimation), to a time series forecasting problem. We will see how to use pandas to build **lag and windowing features**, and how to use the time-series cross-validation and other model evaluation tools provided by [scikit-learn](https://scikit-learn.org).

This tutorial is intended for an audience with some familiarity with data science tools and machine learning concepts. It will start from practical considerations on how to manipulate the data and fit simple yet powerful models, and progressively move to more advanced considerations on model evaluation. The main focus is to show how to **cast a time series forecasting problem into a supervised machine learning problem** (non-linear regression) using basic pandas-based feature engineering and time-series aware cross-validation, and to highlight the impact of the choice of loss function for gradient boosting models.

We will compare this forecasting approach to a baseline that only uses instantaneous contextual variables as predictors. The baseline relies on scikit-learn's feature engineering tools (column transformers and pipelines), with a particular emphasis on how to build **cyclic time-derived features** and encode categorical variables. We will then dive deeper into model evaluation by assessing various performance metrics with time-series aware cross-validation. We will compare uncertainty bounds from quantile regression with conformal prediction methods from MAPIE. Finally, we will explore how to deal with the auto-regressive setting to produce forecasts over a multi-step horizon with `sktime`.

The tutorial will be available as a Jupyter notebook, and the audience will be encouraged to develop their own intuitions by experimenting interactively with the teaching material. If time allows, we will also compare this approach with alternative solutions based on neural networks or linear models trained on rich spline-based features.

Speakers: Olivier Grisel