Statistical forecasting with exogenous factors

SFScon 2020

Ideas on how to improve time series models

Human beings have been fascinated by what the future holds since the beginning of time. In data science and statistics, forecasting is based on time series data, which can be defined as a sequence of observations taken sequentially in time. Statistical forecasting tries to predict future observations based on past ones. A clear example of how this is beneficial in practice is sales forecasting: companies use it to understand customer demand, improve inventory management, and cut production costs.

Many statistical methods can be used for forecasting. Stochastic process models such as ARIMA are among the best known. Recent advances in machine learning, and particularly deep learning, have created new possibilities for forecasting, mainly based on Recurrent Neural Networks (RNNs). These neural architectures can model sequential data, which makes them well suited to time series forecasting.

Another important task in the forecasting field is the analysis of indicators, i.e., identifying explanatory, causative factors from which it is possible to draw conclusions about the direction or values of the focal signal. In simpler words, this means finding data that is external to our current data, hence exogenous, and checking whether it improves forecast accuracy. This is something that human beings often do on their own at a subconscious level. Going back to the sales example, we already know that if we run an umbrella shop, we are going to sell more products when it rains. In the field of data science, however, indicator analysis and multivariate time series forecasting are often neglected.
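To make the umbrella example concrete, here is a minimal sketch in Python. It is only an illustration, not the method presented in the talk: the sales figures and the rain indicator are made-up numbers, and the two "models" are deliberately simple averages. It also assumes the rain indicator is known for the days being forecast, for instance from a weather forecast.

```python
from statistics import mean

# Hypothetical data, invented for illustration only:
# daily umbrella sales and a rain indicator (1 = rainy day, 0 = dry day).
sales = [5, 21, 6, 19, 4, 22, 20, 5, 6, 18, 23, 4]
rain = [0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 0]

# Use the first 8 days as history and the last 4 days as the period to forecast.
train_n = 8
train_sales, test_sales = sales[:train_n], sales[train_n:]
train_rain, test_rain = rain[:train_n], rain[train_n:]


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Baseline without exogenous data: always predict the historical mean of sales.
baseline_level = mean(train_sales)
baseline_forecast = [baseline_level] * len(test_sales)

# With the exogenous indicator: predict the historical mean of sales
# observed under the same weather condition.
level_by_rain = {
    r: mean(s for s, x in zip(train_sales, train_rain) if x == r)
    for r in (0, 1)
}
exog_forecast = [level_by_rain[r] for r in test_rain]

baseline_mae = mae(test_sales, baseline_forecast)
exog_mae = mae(test_sales, exog_forecast)

print(f"MAE without rain indicator: {baseline_mae:.2f}")
print(f"MAE with rain indicator:    {exog_mae:.2f}")
```

On this toy data the forecast that uses the rain indicator has a lower error than the one that ignores it. Real indicators are rarely this clean, so whether an exogenous signal helps has to be checked on held-out data.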

This talk will bring these topics together and help listeners understand what forecasting is and which methods are most commonly used for it. Finally, it will demonstrate how Long Short-Term Memory (LSTM) neural networks can learn non-linear relationships in time series augmented with exogenous indicators and thereby improve forecasts. The research project was carried out under the supervision of Prof. Marco Cristani and will be presented here through a concrete, practical application.

Speakers: Geri Skenderi