Is there a high variance in weather forecasts for today? Traditional numerical weather prediction (NWP) models are limited in their ability to integrate new observations. AIREN-NWP, an AI-powered NWP post-processing solution, has been designed to fuse forecasts with real-time measurements for higher-resolution and more accurate weather predictions up to six hours ahead. We have observed up to 2.5x lower error than the input forecasts for precipitation, temperature and wind gusts. Get ready for forecasts that react fast, stay sharp and keep you ahead of the weather.
Remember the chaos of last summer’s flash floods and festivals being disrupted by unpredictable weather? These are just two examples of severe weather events where quick decision making is crucial to ensuring the safety of people and property. However, the high unpredictability of these types of weather leaves traditional forecasts inadequate. As these forecasts give only the general likelihood of risks in a larger region, issuing actionable warnings based on them is not viable. At Meteopress, we have already developed AIREN-Nowcasting for more granular predictions and real-time, high-resolution, accurate and specific nowcasts. Still, a longer prediction horizon may be necessary to give adequate information to the interested parties. What’s more, variables other than precipitation can also be important.
AIREN-NWP is a novel solution combining the strengths of rapidly updated AI nowcasts with the meteorological expertise and robustness of NWP. It uses data such as synoptic-scale meteorological station measurements (SYNOP), radar imagery and satellite imagery to post-process NWP forecasts, delivering: enhanced accuracy, by correcting biases in NWP forecasts and integrating the latest real-time data from diverse sources; more detailed predictions, such as a one-hour time step instead of the three-hour step of the input GFS data; and the capability to compute new, more accurate predictions as often as new real-world data become available.
AIREN-NWP can be tailored to use any NWP model and any combination of relevant input data, empowering the recipients of its predictions to navigate even the most unpredictable weather scenarios with greater confidence and readiness. In the following sections, we will delve more deeply into the technical details of AIREN-NWP’s architecture and data sources used in this study, shedding light on the mechanisms behind these advancements and providing detailed performance analysis.
Problem background
While NWP models offer valuable insights into long-term weather patterns, their limitations become particularly evident in highly unpredictable scenarios. High computational power requirements restrict the forecasts’ spatial and temporal resolution and limit the frequency of model updates. Moreover, the first hours of the forecast suffer from the ‘spin-up problem’: accuracy is reduced in the initial period until the inconsistency between the meteorological inputs and the numerical model tapers out. Thus, it is challenging for NWP models to use all available data effectively, potentially leaving us guessing about the weather development in the hours immediately ahead.
On the opposite end of the spectrum lie AI-powered solutions like AIREN-Nowcasting. They excel at short-term, high-resolution predictions by leveraging real-time radar data. While the highly accurate and granular radar data allows AIREN-Nowcasting to deliver precipitation forecasts for the next 90 minutes with unprecedented accuracy, this approach has drawbacks, too. Relying solely on extrapolation from a single data source leads to a deterioration of prediction performance beyond the near time horizon, roughly two hours. These forecasts are great for immediate decision making (e.g. taking shelter during storm events or choosing the right time to depart from work to avoid an unwanted shower). However, the 90-minute prediction horizon can be limiting. Additionally, the focus on radar data restricts AIREN-Nowcasting’s scope to precipitation, excluding other important weather variables like temperature and wind gusts.
Data and solution
As discussed, the NWP models and purely AI-based nowcasting approaches have limitations that hinder their ability to provide comprehensive and reliable weather forecasts. AIREN-NWP addresses these limitations by combining the strengths of both approaches, leveraging a recurrent neural network to enhance the original NWP forecasts. The real-time measurements are used to initialize the hidden state of the recurrent cell, priming it for step-by-step post-processing of input forecasts and metadata. The objective is to post-process the NWP forecasts to match the ground truth weather reanalysis as closely as possible. With the data set spanning from October 2015 to December 2022, we reserve two full years (2019 and 2022) for validation, ensuring robust and fair performance evaluation.
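The data flow described above can be illustrated with a minimal sketch. It is not the AIREN-NWP architecture: the tanh cell, the additive correction of the NWP value, the function names and every weight in `params` are assumptions made purely for illustration. It only shows the two stages named in the text, namely priming a hidden state from real-time measurements and then post-processing the input forecast step by step together with lead-time metadata.

```python
import math

def init_hidden(observations, w_obs):
    """Prime the hidden state from real-time measurements (one value per hidden unit)."""
    return [math.tanh(sum(w * o for w, o in zip(row, observations))) for row in w_obs]

def recurrent_step(hidden, nwp_value, lead_hours, params):
    """Update the hidden state from the NWP input and lead-time metadata,
    then apply an additive correction to the NWP value (assumed output form)."""
    new_hidden = [
        math.tanh(params["w_h"][i] * h
                  + params["w_x"][i] * nwp_value
                  + params["w_t"][i] * lead_hours)
        for i, h in enumerate(hidden)
    ]
    correction = sum(w * h for w, h in zip(params["w_out"], new_hidden))
    return new_hidden, nwp_value + correction

def post_process(observations, nwp_forecast, params):
    """Post-process a sequence of NWP values, one recurrent step per lead time."""
    hidden = init_hidden(observations, params["w_obs"])
    outputs = []
    for lead_hours, nwp_value in enumerate(nwp_forecast, start=1):
        hidden, corrected = recurrent_step(hidden, nwp_value, lead_hours, params)
        outputs.append(corrected)
    return outputs

# Hypothetical, untrained weights for two hidden units and two observed values.
demo_params = {
    "w_obs": [[0.1, -0.2], [0.05, 0.3]],
    "w_h": [0.5, 0.5],
    "w_x": [0.1, -0.1],
    "w_t": [0.01, 0.02],
    "w_out": [0.3, -0.2],
}
corrected = post_process([1.0, 2.0], [10.0, 11.0, 12.0], demo_params)
```

In a trained model the weights would be fitted so that the corrected values match the reanalysis; here they are arbitrary numbers.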
As the starting point, AIREN-NWP uses Global Forecast System (GFS) forecasts for the Central Europe area. These forecasts are issued four times daily at six-hour intervals, with a 0.25° spatial and three-hour temporal resolution. ERA5 reanalysis serves as the ground truth, matching the spatial resolution but providing hourly data. To achieve this one-hour output resolution, each recurrent step combines the next nearest three-hour GFS forecast with metadata reflecting how far into the future the current prediction is. We currently focus on three target variables and predict them separately: accumulated precipitation (APCP), temperature 2 m above the surface (TMP) and velocity of wind gusts (GUST).
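One way to picture the pairing of hourly output steps with three-hourly GFS fields is sketched below. It reads "next nearest" as the first GFS step at or after the predicted hour, which is an interpretation rather than something the study states. The function name, the dictionary keys and the normalized `lead_fraction` metadata are likewise hypothetical.

```python
GFS_STEP_HOURS = 3  # temporal resolution of the input GFS forecasts

def build_step_inputs(horizon_hours, gfs_step=GFS_STEP_HOURS):
    """For each hourly lead time, pick the GFS lead time to feed the recurrent step.

    Assumption: "next nearest" means the first GFS step at or after the lead time.
    """
    steps = []
    for lead in range(1, horizon_hours + 1):
        # Round the lead time up to the next multiple of the GFS step.
        gfs_lead = ((lead + gfs_step - 1) // gfs_step) * gfs_step
        steps.append({
            "lead_hours": lead,                     # hour being predicted
            "gfs_lead_hours": gfs_lead,             # GFS field used as input
            "lead_fraction": lead / horizon_hours,  # hypothetical metadata feature
        })
    return steps

six_hour_plan = build_step_inputs(6)
```

For a six-hour horizon, hours 1 to 3 would all draw on the three-hour GFS field and hours 4 to 6 on the six-hour field, with the metadata telling the model which hour within that window it is predicting.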
The first of the three real-time data sources that initialize the recurrent cell are SYNOP station measurements. They are available in one-hour intervals and the last three are used. SYNOP measurements, originally local point data, are interpolated to a more continuous raster-like representation. The measured variables are air pressure, humidity, temperature, wind speed and precipitation. It should be noted that the SYNOP stations, even inside just the central European area, provide their data at somewhat arbitrary intervals and locally specific availability. Yet, AIREN-NWP demonstrates high robustness even in the case of random outages of the individual stations.
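The interpolation method is not specified in this study. As a stand-in, the sketch below uses inverse-distance weighting (IDW) to turn station points into a raster and simply skips stations that report no value, which mimics a random outage. The function name, the planar distance in degrees and the `power` parameter are illustrative assumptions.

```python
def idw_to_grid(stations, grid_lats, grid_lons, power=2.0):
    """Interpolate station values onto a regular grid with inverse-distance weighting.

    stations: list of (lat, lon, value); value is None when the station is offline.
    Distances are taken in plain degrees, a simplification for illustration only.
    """
    available = [(lat, lon, v) for lat, lon, v in stations if v is not None]
    if not available:
        raise ValueError("no station reported a value")
    grid = []
    for glat in grid_lats:
        row = []
        for glon in grid_lons:
            numerator = 0.0
            denominator = 0.0
            exact = None
            for lat, lon, value in available:
                dist_sq = (glat - lat) ** 2 + (glon - lon) ** 2
                if dist_sq == 0.0:
                    exact = value  # grid point coincides with a station
                    break
                weight = 1.0 / dist_sq ** (power / 2.0)
                numerator += weight * value
                denominator += weight
            row.append(exact if exact is not None else numerator / denominator)
        grid.append(row)
    return grid

# Two reporting stations and one outage (value None).
demo_stations = [(50.0, 14.0, 10.0), (50.0, 16.0, 20.0), (51.0, 15.0, None)]
demo_raster = idw_to_grid(demo_stations, [50.0], [14.0, 15.0, 16.0])
```

Because missing stations only drop out of the weighted sum, the raster is still defined everywhere as long as at least one station reports.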
Radar reflectivity data spans the last hour in 10-minute intervals. Its spatial resolution, eight times that of GFS, is reduced using a convolutional encoder. Unfortunately, this experiment’s radar data set only contains sufficiently reliable data for the area of France and the Czech Republic, making it necessary to inform the model about these local limitations in the metadata. Training loss is weighted toward radar-covered areas to motivate the use of radar for predicting APCP, and evaluation is performed solely for these areas.
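The weighting and the coverage-restricted evaluation can be sketched as follows. The study does not state the loss function or the weights, so the absolute-error loss, the function names and the `covered_weight` value of 4.0 are assumptions chosen only to make the idea concrete.

```python
def weighted_mae(pred, target, radar_mask, covered_weight=4.0, uncovered_weight=1.0):
    """Training-style loss: absolute errors weighted more heavily under radar coverage.

    The weights are hypothetical; the study only says the loss is weighted
    toward radar-covered areas.
    """
    total = 0.0
    weight_sum = 0.0
    for p, t, covered in zip(pred, target, radar_mask):
        weight = covered_weight if covered else uncovered_weight
        total += weight * abs(p - t)
        weight_sum += weight
    return total / weight_sum

def masked_mae(pred, target, radar_mask):
    """Evaluation-style metric: mean absolute error over radar-covered cells only."""
    errors = [abs(p - t) for p, t, covered in zip(pred, target, radar_mask) if covered]
    return sum(errors) / len(errors)

demo_pred = [1.0, 2.0, 3.0, 4.0]
demo_target = [0.0, 2.0, 5.0, 4.0]
demo_mask = [True, False, True, False]  # True where radar data is reliable
train_loss = weighted_mae(demo_pred, demo_target, demo_mask)
eval_mae = masked_mae(demo_pred, demo_target, demo_mask)
```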
Similar to radar, satellite data covers the last hour in 10-minute intervals. We have selected seven infrared channels based on a statistical quality check. This data has four times the spatial resolution of GFS, requiring another convolutional encoder to reduce it.
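To show what bringing a finer grid onto a coarser one involves, the sketch below averages non-overlapping blocks of cells. This is a deliberately simple stand-in: it equals a strided convolution with fixed uniform weights, whereas the encoders in AIREN-NWP are learned. Reading "four times" and "eight times" the resolution as a per-axis factor is also an assumption.

```python
def block_average(field, factor):
    """Reduce a 2D field by averaging non-overlapping factor x factor blocks.

    Equivalent to a strided convolution with a uniform kernel; a learned
    convolutional encoder would replace these fixed weights with trained ones.
    """
    rows, cols = len(field), len(field[0])
    if rows % factor or cols % factor:
        raise ValueError("field dimensions must be divisible by the factor")
    coarse = []
    for r in range(0, rows, factor):
        coarse_row = []
        for c in range(0, cols, factor):
            block = [field[i][j]
                     for i in range(r, r + factor)
                     for j in range(c, c + factor)]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse

# A 4x4 fine grid holding the values 0..15.
fine_field = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
coarse_by_2 = block_average(fine_field, 2)
coarse_by_4 = block_average(fine_field, 4)
```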
We have decided on this data selection based on extensive experimentation (detailed in the following section). Each of the three variables is predicted separately, allowing for different configurations and straightforward inclusion of other predicted outputs in the future. Notably, APCP is forecast for three hours into the future while TMP and GUST forecasts reach six hours. This tailored approach leverages the strengths of each available data source to optimize prediction accuracy for different weather variables.
Results
Following the objective of AIREN-NWP to post-process NWP forecasts, we focus on the relative improvement of mean absolute error (MAE). This metric indicates by what factor the MAE of AIREN-NWP forecasts is lower than that of a baseline. ERA5 reanalysis is used as the reference for MAE computation of both the examined model and the baseline. The relative improvement is expected to grow as the time at which the forecast is issued approaches the predicted event.
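The metric can be written down in a few lines. In the sketch below, `improvement_factor` follows the definition given in the text (baseline MAE divided by model MAE, both against the same reference). `improvement_percent` expresses the same comparison as a percentage reduction; that form is an assumption about how percentage figures could be derived, not a definition taken from the study.

```python
def mae(pred, reference):
    """Mean absolute error of a prediction against the reference (e.g. reanalysis)."""
    return sum(abs(p - r) for p, r in zip(pred, reference)) / len(reference)

def improvement_factor(model_pred, baseline_pred, reference):
    """By what factor the model MAE is lower than the baseline MAE."""
    return mae(baseline_pred, reference) / mae(model_pred, reference)

def improvement_percent(model_pred, baseline_pred, reference):
    """The same comparison as a percentage reduction of the baseline MAE (assumed form)."""
    return (1.0 - mae(model_pred, reference) / mae(baseline_pred, reference)) * 100.0

# Toy values: the baseline is off by 2 everywhere, the model by 1.
demo_reference = [0.0, 0.0, 0.0, 0.0]
demo_baseline = [2.0, -2.0, 2.0, -2.0]
demo_model = [1.0, -1.0, 1.0, -1.0]
factor = improvement_factor(demo_model, demo_baseline, demo_reference)
percent = improvement_percent(demo_model, demo_baseline, demo_reference)
```

With these toy values the factor is 2.0 and the percentage reduction is 50.0, since the model halves the baseline error.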
The most essential baseline is the input GFS, which has a three-hour prediction step, in contrast to the one-hour step of AIREN-NWP. The most recent AIREN-NWP predictions are aggregated for these comparisons to match the GFS ones. It must be noted that this approach is disadvantageous to the AIREN-NWP, as the evaluated variable differs from the one used in training. This disadvantage may be seen in the following evaluations, where the improvement of APCP prediction stagnates against the GFS but increases toward the event against our GFS post-processing baseline.
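For accumulated precipitation, matching the three-hour GFS step amounts to summing hourly accumulations over each three-hour window, as sketched below. This assumes each hourly value is the precipitation accumulated during that hour. The aggregation rules for temperature and wind gusts are not described in this study and are not covered here. The function name is hypothetical.

```python
def aggregate_accumulated(hourly_values, window_hours=3):
    """Sum hourly accumulations into consecutive windows to match a coarser time step.

    Assumes each hourly value is the amount accumulated during that single hour.
    """
    if len(hourly_values) % window_hours:
        raise ValueError("number of hourly values must be a multiple of the window")
    return [
        sum(hourly_values[start:start + window_hours])
        for start in range(0, len(hourly_values), window_hours)
    ]

# Six hourly precipitation amounts (mm) collapsed into two three-hour totals.
hourly_apcp = [0.5, 1.0, 0.0, 2.0, 0.0, 1.0]
three_hour_apcp = aggregate_accumulated(hourly_apcp)
```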
We have created three models, one for each target variable, with performance summarized in the following plot. Precipitation (APCP) is predicted three hours into the future, and the model is updated every 30 minutes, achieving up to 1.4x lower MAE than that of the GFS. The temperature (TMP) and wind gusts (GUST) are predicted six hours into the future, with the model updated every 60 minutes. They are respectively achieving up to 2.1x and 2.5x improvement in MAE, and the improvement is increasing toward the event.
To explore the benefits of using real-time measurements and how AIREN-NWP models perform in their native one-hour output granularity, we compare them to our GFS post-processing baselines. These models are trained with the same training setup and using the same input samples, just a different combination of the real-time measurements (SYNOP, radar and satellite) on the input.
The plots above show a clear correlation between the number of input data sources and the improvement in prediction performance. Moreover, the relative MAE improvements always increase toward the event, as is expected of AIREN-NWP. For the last prediction before the event, the models with all inputs available achieve respectively 4.7%, 30.8% and 6.7% improvements over our GFS post-processing baseline.
AIREN-NWP has the potential to rapidly update its predictions due to the availability of real-time data sources like radar and satellite measurements. To explore this capability, we tested different update intervals for precipitation (APCP) predictions, specifically 10, 30 and 60 minutes. The achieved errors are very close, within a 1% margin of the base 60-minute updates, with the 10-minute updates falling slightly behind. As a result, we consider the 30-minute update interval optimal for the current generation of AIREN-NWP models.
We are considering two hypotheses as to why more frequent 10-minute updates are not beneficial. Since the prediction step is one hour, the new information available in every 10-minute measurement might be negligible for improving the lower-resolution one-hour forecast. Alternatively, the current architecture might need some tweaks to more effectively use information updates for one-hour predictions at such a rapid pace. Further investigation into this aspect will be done in future work to enable forecast updates with every new measurement available.
The choice of a three-hour prediction horizon for APCP and a six-hour horizon for TMP and GUST was the result of a combination of data analysis and practical considerations. We initially experimented with models using only SYNOP data as real-time input, a 60-minute update step and a 14-hour prediction horizon. For temperature, this exploration revealed that the improvement gained with each model update becomes more significant as the predicted event approaches. Based on this observation, a six-hour horizon was chosen to focus on maximizing accuracy within this timeframe, which is also the most interesting from the product point of view. The GUST model exhibits a more linear behavior, but the same six-hour horizon was adopted for consistency and ease of use.
For precipitation (APCP), however, the analysis indicated smaller improvement as the predicted event draws nearer, and storm prediction remains a well-known challenge. Furthermore, accurate precipitation measurements are among the less readily available data, with synoptic stations often only offering longer timespan aggregates. Consequently, a three-hour horizon was selected, with a future focus on enhancing the resolution of precipitation forecasts rather than extending this prediction horizon.
Prediction Examples
This section showcases sample AIREN-NWP predictions for each of the three target variables from the forecasting point of view. The ERA5 reanalysis is used as ground truth in this section as well.
APCP
On December 23, 2023, a waving cold front was over Central Europe, and precipitation was moving from the west along this boundary. In the three-hour aggregated AIREN-NWP forecasts, the peak precipitation amounts of the GFS model are correctly adjusted.
Moreover, AIREN-NWP increases output granularity to one hour, delivering a more detailed view of weather development in the studied situation.
TMP
While spring weather prevailed in Central Europe on February 28, 2024, a ridge of high pressure began to move in from the west, behind a dissipating cold front over eastern Central Europe. The warmer air flowing into Hungary and Slovakia contrasted with colder air penetrating the regions of France and Germany.
The AIREN-NWP forecast is similar to the ERA5 reanalysis, the only major error being the overly high temperatures predicted in southern Poland at 12 UTC. AIREN-NWP correctly lowered the temperatures of the input GFS forecast in the area over Germany and France and increased temperatures in Hungary, closing the gap to ERA5.
GUST
The typically calm spring weather that prevailed over Central Europe on February 24 is one of those situations in which accurate forecasts are very important, for example, in the energy sector. At 12 UTC, the AIREN-NWP model better estimated the higher wind speed in France and correctly reduced the speed in the Baltic region and northeastern Poland. The input GFS model predicted almost no wind in Central Europe, while AIREN-NWP increased the speeds toward the actual moderate winds present in ERA5. While the situation at 15 UTC is very similar, it has to be noted that AIREN-NWP slightly overshoots the speeds in France. At 18 UTC, AIREN-NWP corrects the wind speeds that are too high in France and the Baltic region while missing the gusts in the Alpine region in the same way that GFS does.
Conclusion
This work successfully demonstrates the potential of fusing AI and NWP models to enhance weather forecasts for the immediate future by incorporating real-time data. While AIREN-NWP has been developed to predict precipitation, temperature and wind gusts based on GFS forecasts, its core approach can be tailored to any combination of the NWP model and real-time weather measurements.
AIREN-NWP reduces the MAE of the input GFS forecasts by factors of up to 1.4x, 2.1x and 2.5x for APCP, TMP and GUST prediction, respectively. It successfully combines diverse data sources with varying spatial and temporal resolutions, leveraging their strengths to improve forecast accuracy. It offers predictions at a one-hour step, refining the three-hour step of GFS. The model updates its predictions with each recompute, incorporating newly acquired weather data so that the forecasts stay aligned with the latest observations.
Looking ahead, the development roadmap of AIREN-NWP includes integration of orographic data, model updates with every new measurement, adaptation to a regional NWP input, and architecture optimization to use the input data to its maximum.