Assessing Machine Learning Probabilistic Forecast Utility for Severe Weather Forecasting

Ian D. Shank and Aaron J. Hill

What is already known:

AI and machine learning are already being used to support operational forecasters, such as those at the Storm Prediction Center (SPC)
The operational Global Ensemble Forecast System Machine Learning Probabilities (GEFS-MLP) medium-range (day 4-8 lead time) forecasts are more skillful than SPC convective outlooks, but skill decreases with increasing lead time
GEFS-MLP forecasts mimic SPC forecasts by predicting the occurrence of severe weather

What this study adds:

Increases our understanding of how unskillful probabilistic forecasts from the GEFS-MLP can still highlight high-impact severe weather events
Allows operational forecasters at the SPC to increase lead times for high-impact events
Creates new pathways for operational forecasters to use the GEFS-MLP to improve operational forecast products and warn the public of severe weather events at longer lead times

Abstract:

This work associated probabilities produced by an artificial intelligence (AI) weather prediction system with historical high-impact severe weather events so that operational forecasters can use AI to help predict such events in the continental United States (CONUS). The study examined how medium-range (day 4-8) machine learning probabilistic forecasts compare to the Storm Prediction Center (SPC) Day 1 Convective Outlook and to observed severe weather reports. Five years of data from 2020-2025 were used to compare probabilities from the Global Ensemble Forecast System Machine Learning Probabilities (GEFS-MLP) at different medium-range lead times with forecasts from the SPC Day 1 Convective Outlook. This approach showed which medium-range probabilities were better able to predict a high-impact severe weather event in the CONUS. Higher probabilities at shorter lead times tend to correspond with higher SPC categorical forecasts for high-impact severe weather events. GEFS-MLP probabilities at the day 5 lead time showed the best ability to predict high-impact severe weather events, specifically at the 60% probability threshold. Longer lead times, such as days 7 and 8, performed better at lower probability thresholds, while shorter lead times, such as days 4 and 5, performed better at higher probability thresholds. Recognizing which lead times and probabilities correspond best with severe weather occurrence can enhance operational forecast products at longer lead times.
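The comparison of lead times and probability thresholds described in the abstract can be illustrated with a minimal sketch. The abstract does not name the verification metric used, so this example assumes two common contingency-table scores, probability of detection (POD) and false alarm ratio (FAR). The function name `threshold_scores`, the variable names, and all numbers are hypothetical placeholders, not data or results from the study.

```python
from typing import Dict, List, Tuple


def threshold_scores(
    probs: List[float], events: List[bool], threshold: float
) -> Tuple[float, float]:
    """Return (POD, FAR) when a forecast counts as 'yes' if prob >= threshold.

    POD = hits / (hits + misses)
    FAR = false alarms / (hits + false alarms)
    Either score is NaN when its denominator is zero.
    """
    hits = sum(1 for p, e in zip(probs, events) if p >= threshold and e)
    misses = sum(1 for p, e in zip(probs, events) if p < threshold and e)
    false_alarms = sum(1 for p, e in zip(probs, events) if p >= threshold and not e)

    pod = hits / (hits + misses) if (hits + misses) else float("nan")
    far = (
        false_alarms / (hits + false_alarms)
        if (hits + false_alarms)
        else float("nan")
    )
    return pod, far


# Hypothetical toy data (NOT from the study): one entry per forecast day.
# True marks a day assumed to be a high-impact severe weather event.
events = [True, False, True, False, True, False]

# Hypothetical forecast probabilities valid on those days, keyed by lead day.
forecasts: Dict[int, List[float]] = {
    4: [0.60, 0.15, 0.45, 0.05, 0.30, 0.45],
    8: [0.15, 0.05, 0.15, 0.05, 0.05, 0.15],
}

# Score every (lead day, threshold) pair so the combinations can be compared.
scores: Dict[Tuple[int, float], Tuple[float, float]] = {}
for lead_day, probs in forecasts.items():
    for threshold in (0.15, 0.45):
        scores[(lead_day, threshold)] = threshold_scores(probs, events, threshold)

for (lead_day, threshold), (pod, far) in sorted(scores.items()):
    print(f"day {lead_day}, threshold {threshold:.0%}: POD={pod:.2f}, FAR={far:.2f}")
```

Tabulating scores by lead time and threshold in this way is one possible route to statements such as "longer lead times perform better at lower thresholds". The metric and event definition used in the paper itself may differ.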

Full Paper [PDF]