Data-Driven Thresholding for Extreme Event Detection in Geosciences

EuroSciPy 2022

Extreme weather events are a well-known source of human suffering, loss of life, and financial hardship. Among these, tropical cyclones are notoriously impactful, which has led to significant interest in predicting the genesis, tracks, and intensity of these storms, a task that continues to present significant challenges. In particular, tropical cyclogenesis (TCG) can be described as a "needle in a haystack" problem, and steps must be taken to make predictions tractable. Previous work has described filtering out non-genesis points by thresholding predictive variables, with thresholds selected to limit the number of discarded TCG cases. In the literature, this thresholding has often been carried out empirically, an approach that is effective but relies on domain knowledge. This talk instead proposes a systematic, machine-learning-based approach implemented in Python. The method is designed to be interpretable to the point of being transparent machine learning. For each predictive variable, a threshold value is found that minimizes the false-alarm rate while maintaining a high recall, and the resulting thresholds are then combined in a forward selection algorithm. Because other extreme events in the geosciences are also considered needle-in-a-haystack problems, the described approach can help reduce the variable space in which these events are studied and predicted. Finally, the transparent nature of the proposed approach can provide simple insight into the conditions under which these events occur.
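
The abstract gives no implementation details, so the following is only a minimal sketch of how threshold search and forward selection could fit together, not the speaker's code. It rests on several assumptions: the false-alarm rate is taken to be FP / (FP + TN), each rule has the form `x >= t` or `x <= t`, rules are combined with a logical AND, and the data are synthetic. The names `recall_and_far`, `best_rule`, and `forward_select` are hypothetical.

```python
import numpy as np


def recall_and_far(keep, y):
    """Recall and false-alarm rate of a boolean filter `keep` against labels `y`.

    Assumption: false-alarm rate is FP / (FP + TN), i.e. the share of
    non-events that the filter fails to discard.
    """
    tp = np.sum(keep & y)
    fp = np.sum(keep & ~y)
    recall = tp / max(np.sum(y), 1)
    far = fp / max(np.sum(~y), 1)
    return recall, far


def best_rule(x, y, base, min_recall):
    """Search one variable for the rule with the lowest false-alarm rate.

    Every observed value is tried as a threshold in both directions, and
    each rule is evaluated jointly with the already selected filter `base`.
    Rules whose combined recall falls below `min_recall` are rejected.
    """
    best = None
    for t in np.unique(x):
        for name, op in ((">=", np.greater_equal), ("<=", np.less_equal)):
            keep = base & op(x, t)
            recall, far = recall_and_far(keep, y)
            if recall >= min_recall and (best is None or far < best["far"]):
                best = {"far": far, "op": name, "threshold": float(t), "keep": keep}
    return best


def forward_select(X, y, min_recall=0.95):
    """Greedily add the variable whose best rule lowers the false-alarm rate most."""
    y = np.asarray(y).astype(bool)
    keep = np.ones(len(y), dtype=bool)  # start by keeping every point
    _, far = recall_and_far(keep, y)
    rules = []
    remaining = list(range(X.shape[1]))
    while remaining:
        candidates = [(j, best_rule(X[:, j], y, keep, min_recall)) for j in remaining]
        candidates = [(j, r) for j, r in candidates if r is not None]
        if not candidates:
            break
        j, rule = min(candidates, key=lambda c: c[1]["far"])
        if rule["far"] >= far:  # stop when no variable improves the filter
            break
        keep, far = rule["keep"], rule["far"]
        rules.append((j, rule["op"], rule["threshold"]))
        remaining.remove(j)
    return rules, keep


# Synthetic stand-in data (not real TCG data): rare events, two informative
# variables (columns 0 and 1) and one pure-noise variable (column 2).
rng = np.random.default_rng(0)
n = 2000
y = rng.random(n) < 0.05
X = rng.normal(size=(n, 3))
X[y, 0] += 3.0
X[y, 1] -= 3.0

rules, keep = forward_select(X, y, min_recall=0.95)
for column, op, threshold in rules:
    print(f"keep points where x{column} {op} {threshold:.2f}")
print("recall, false-alarm rate:", recall_and_far(keep, y))
```

The result is a short list of human-readable rules, which is what makes a filter of this kind transparent. One limitation of this greedy sketch is that the first rule tends to use up the whole recall budget, so later rules can only discard points without losing further events.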


Speakers: Milton Gomez