This tutorial shows how to use scikit-learn's
**histogram-based gradient boosted regression trees** with various loss functions
(least squares, **Poisson** and the **pinball loss** for quantile estimation) on a time
series forecasting problem. We will use pandas to build **lag and
windowing features**, and [scikit-learn](https://scikit-learn.org)'s time-series cross-validation and other
model evaluation tools to assess the resulting models.
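
As a rough illustration of what lag and windowing features can look like, the sketch below builds a few of them with `shift` and `rolling`. The synthetic hourly series, the column names and the chosen lags are assumptions made for this example, not the tutorial's actual dataset or feature set.

```python
import numpy as np
import pandas as pd

# Synthetic hourly count series standing in for real data (an assumption).
rng = np.random.default_rng(0)
index = pd.date_range("2024-01-01", periods=200, freq="h")
y = pd.Series(rng.poisson(lam=20, size=len(index)), index=index, name="count")

# Lag features look at single past values; window features aggregate a
# span of past values. shift(1) before rolling() keeps the current target
# out of its own features.
features = pd.DataFrame(
    {
        "lag_1h": y.shift(1),
        "lag_24h": y.shift(24),
        "rolling_mean_24h": y.shift(1).rolling(24).mean(),
        "rolling_max_24h": y.shift(1).rolling(24).max(),
    }
)

# The first rows have incomplete lags or windows, so they are dropped.
dataset = pd.concat([features, y], axis=1).dropna()
X, target = dataset.drop(columns="count"), dataset["count"]
```

Each row of `X` now only contains information available strictly before the time step of its target, which is what allows a standard regressor to be used for forecasting.
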
This tutorial is intended for an audience with some familiarity with data
science tools and machine learning concepts. It will start from practical
considerations on how to manipulate the data and fit simple yet powerful
models and progressively move to more advanced considerations on model
evaluation.
The main focus is to show how to **cast a time series forecasting problem into a
supervised machine learning problem** (non-linear regression) using basic
pandas-based feature engineering and time-series-aware cross-validation, and to
highlight the impact of the choice of loss function for gradient boosting
models.
We will compare this forecasting approach to a baseline that only uses instantaneous
contextual variables as predictors, built with scikit-learn's feature engineering tools
(column transformers and pipelines), with a particular emphasis on how to build
**cyclic time-derived features** and encode categorical variables.
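
One common way to build cyclic features is to map a periodic variable such as the hour of day onto a circle with sine and cosine transforms. The sketch below does this inside a `ColumnTransformer`; the helper names `sin_transformer` and `cos_transformer`, the toy data frame and the use of `OrdinalEncoder` for the categorical column are all assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import FunctionTransformer, OrdinalEncoder


def sin_transformer(period):
    # Hypothetical helper: sine encoding of a variable with the given period.
    return FunctionTransformer(lambda x: np.sin(x / period * 2 * np.pi))


def cos_transformer(period):
    # Hypothetical helper: cosine encoding of a variable with the given period.
    return FunctionTransformer(lambda x: np.cos(x / period * 2 * np.pi))


# Toy contextual variables (an assumption, not the tutorial's dataset).
df = pd.DataFrame(
    {
        "hour": [0, 6, 12, 18, 23],
        "weather": ["clear", "rain", "clear", "misty", "rain"],
    }
)

preprocessor = ColumnTransformer(
    [
        ("hour_sin", sin_transformer(24), ["hour"]),
        ("hour_cos", cos_transformer(24), ["hour"]),
        ("weather", OrdinalEncoder(), ["weather"]),
    ]
)

# Columns of the result: sin(hour), cos(hour), weather category code.
encoded = preprocessor.fit_transform(df)
```

With this encoding, 23:00 and 00:00 end up close to each other in feature space, which a raw integer hour would not express.
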
We will then dive deeper into model evaluation, assessing various performance metrics with time-series-aware cross-validation.
We will compare uncertainty bounds from quantile regression with those produced by conformal prediction methods from MAPIE.
Finally, we will explore how to handle the auto-regressive setting to produce forecasts over a multi-step horizon with `sktime`.
The tutorial will be available as a Jupyter notebook, and the audience will be
encouraged to develop their own intuitions by experimenting interactively with
the teaching material.
If time allows, we will also compare this approach with alternative solutions
based on neural networks or linear models trained on rich spline-based features.