Authors: Bernardo Perrone Ribeiro, Jana Faganeli Pucer (University of Ljubljana)
This repo contains the official implementation of the paper "FlowCast: Advancing Precipitation Nowcasting with Conditional Flow Matching" accepted to ICLR 2026.
Radar-based precipitation nowcasting, the forecasting of short-term precipitation fields from radar imagery, is a critical challenge for flood risk management. While diffusion models have recently shown promise in generating sharp, reliable forecasts, their iterative sampling process is often computationally prohibitive for time-critical applications.
We introduce FlowCast, the first end-to-end probabilistic model leveraging Conditional Flow Matching (CFM) as a direct noise-to-data generative framework for precipitation nowcasting. Unlike hybrid approaches or standard diffusion models, FlowCast learns a direct noise-to-data mapping in a compressed latent space.
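As a rough illustration of what a direct noise-to-data mapping means in flow matching, the sketch below uses NumPy with a straight-line path between a noise sample and a data sample. The linear path, the Euler solver, the step count, and all function names are assumptions made for illustration. This README does not specify FlowCast's actual probability path, solver, or network, and the stand-in velocity below replaces what would be a learned model.

```python
import numpy as np


def cfm_training_pair(x0, x1, t):
    """Build one regression example for a straight-line (assumed) probability path."""
    x_t = (1.0 - t) * x0 + t * x1  # point on the line from noise x0 to data x1
    target_velocity = x1 - x0      # velocity of that line, constant in t
    return x_t, target_velocity


def euler_sample(velocity_fn, x0, num_steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x


rng = np.random.default_rng(0)
noise = rng.standard_normal((2, 4))  # stand-in for Gaussian noise in latent space
data = np.ones((2, 4))               # stand-in for an encoded radar latent

x_t, v_target = cfm_training_pair(noise, data, t=0.5)

# Stand-in for a trained network: it returns the known target velocity directly.
sample = euler_sample(lambda x, t: v_target, noise, num_steps=10)
```

During training, a network would be fit to regress `target_velocity` from `x_t` and `t`. Sampling then amounts to integrating the learned velocity field with an ODE solver, starting from noise.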
This codebase is designed to be reproducible using Conda.
1. Create the environment:
conda create -n nowcasting python=3.10.16
conda activate nowcasting
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements.txt
We utilize the SEVIR dataset. To prepare the data:
- Install AWS CLI:
Follow the instructions in https://docs.aws.amazon.com/cli/latest/userguide/getting-started-install.html
- Download VIL data:
mkdir -p datasets/sevir/data/sevir_complete/data/vil
aws s3 cp --no-sign-request s3://sevir/CATALOG.csv datasets/sevir/data/sevir_complete/CATALOG.csv
aws s3 sync --no-sign-request s3://sevir/data/vil datasets/sevir/data/sevir_complete/data/vil
- Preprocess:
python datasets/sevir/sevir_preprocessing.py \
--catalog_csv_path datasets/sevir/data/sevir_complete/CATALOG.csv \
--data_dir datasets/sevir/data/sevir_complete/data \
--val_cutoff "2019-01-01 00:00:00" \
--test_cutoff "2019-06-01 00:00:00" \
--output_dir datasets/sevir/data/sevir_full \
--img_type vil \
--keep_dtype True \
--downsample_factor 1
The ARSO dataset mentioned in the paper is currently not publicly available.
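The `--val_cutoff` and `--test_cutoff` values in the preprocessing command suggest a chronological train/validation/test split. The sketch below shows one plausible reading of those flags. The function name `assign_split` and the boundary handling (events exactly at a cutoff go to the later split) are assumptions, not taken from `sevir_preprocessing.py`.

```python
from datetime import datetime

# Cutoffs copied from the preprocessing command above.
VAL_CUTOFF = datetime(2019, 1, 1, 0, 0, 0)
TEST_CUTOFF = datetime(2019, 6, 1, 0, 0, 0)


def assign_split(event_time, val_cutoff=VAL_CUTOFF, test_cutoff=TEST_CUTOFF):
    """Assign an event to a split by its start time (assumed semantics)."""
    if event_time < val_cutoff:
        return "train"  # everything before the validation cutoff
    if event_time < test_cutoff:
        return "val"    # between the two cutoffs
    return "test"       # at or after the test cutoff


splits = [
    assign_split(datetime(2018, 7, 4)),
    assign_split(datetime(2019, 3, 15)),
    assign_split(datetime(2019, 8, 1)),
]
```

Splitting by time rather than at random keeps later events out of the training set, which matters when forecasts are evaluated on future data.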
The training pipeline consists of two main stages: training the Autoencoder to compress the data, and then training the FlowCast CFM model in the latent space.
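The first stage uses a KL-regularized autoencoder. As a hedged illustration of what such a regularizer usually looks like, the NumPy sketch below computes the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior, together with the reparameterized sample. The function names, tensor shapes, and reduction (sum over latent dimensions, mean over the batch) are assumptions. The repo's actual loss terms and weights are defined by its config and training script, not by this sketch.

```python
import numpy as np


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed per sample, averaged over the batch."""
    per_dim = 0.5 * (mu**2 + np.exp(logvar) - 1.0 - logvar)
    return per_dim.reshape(per_dim.shape[0], -1).sum(axis=1).mean()


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps


mu = np.ones((3, 4))       # hypothetical encoder means: batch of 3, latent size 4
logvar = np.zeros((3, 4))  # hypothetical encoder log-variances (unit variance)

kl = kl_to_standard_normal(mu, logvar)
z = reparameterize(mu, logvar, np.random.default_rng(0))
```

In a typical setup this KL term is added to the reconstruction loss with a small weight, so the latent space stays close to Gaussian without sacrificing reconstruction quality.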
Train the KL-regularized Autoencoder to compress high-dimensional radar data.
torchrun --nnodes=1 --nproc_per_node=4 \
experiments/sevir/autoencoder/dist_train_autoencoder_kl.py \
--config experiments/sevir/autoencoder/autoencoder_kl_config.yaml \
--train_file datasets/sevir/data/sevir_full/nowcast_training_full.h5 \
--train_meta datasets/sevir/data/sevir_full/nowcast_training_full_META.csv \
--val_file datasets/sevir/data/sevir_full/nowcast_validation_full.h5 \
--val_meta datasets/sevir/data/sevir_full/nowcast_validation_full_META.csv
Once the autoencoder is trained, generate the static latent dataset to speed up FlowCast training.
Note: Replace path/to/auto_encoder_model.pt with your actual checkpoint.
python experiments/sevir/autoencoder/generate_static_dataset.py \
--config experiments/sevir/autoencoder/autoencoder_kl_config.yaml \
--preload_model path/to/auto_encoder_model.pt \
--train_file datasets/sevir/data/sevir_full/nowcast_training_full.h5 \
--train_meta datasets/sevir/data/sevir_full/nowcast_training_full_META.csv \
--val_file datasets/sevir/data/sevir_full/nowcast_validation_full.h5 \
--val_meta datasets/sevir/data/sevir_full/nowcast_validation_full_META.csv \
--out_dir datasets/sevir/data/sevir_latent_vae
Train the Conditional Flow Matching model on the latent data.
torchrun --nnodes=1 --nproc_per_node=4 \
experiments/sevir/runner/flowcast/dist_train_flowcast.py \
--config experiments/sevir/runner/flowcast/flowcast_config.yaml \
--train_file datasets/sevir/data/sevir_latent_vae/nowcast_training_full.h5 \
--train_meta datasets/sevir/data/sevir_latent_vae/nowcast_training_full_META.csv \
--val_file datasets/sevir/data/sevir_latent_vae/nowcast_validation_full.h5 \
--val_meta datasets/sevir/data/sevir_latent_vae/nowcast_validation_full_META.csv \
--partial_evaluation_file datasets/sevir/data/sevir_full/nowcast_validation_full.h5 \
--partial_evaluation_meta datasets/sevir/data/sevir_full/nowcast_validation_full_META.csv
Evaluate the trained model. Point --artifacts_folder to the output directory of the previous step.
python experiments/sevir/runner/flowcast/test_flowcast.py \
--artifacts_folder saved_models/sevir/flowcast \
--config experiments/sevir/runner/flowcast/flowcast_config.yaml \
--test_file datasets/sevir/data/sevir_full/nowcast_testing_full.h5 \
--test_meta datasets/sevir/data/sevir_full/nowcast_testing_full_META.csv
Model checkpoints for the SEVIR dataset (autoencoder and FlowCast) can be downloaded from the Hugging Face Hub. To set them up, create the following folder structure and rename the downloaded files to match it:
saved_models/
└── sevir/
    ├── autoencoder/
    │   └── models/
    │       └── early_stopping_model.pt
    └── flowcast/
        └── models/
            └── early_stopping_model.pt
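The folder structure above can be created and filled with a few lines of Python. This is a minimal sketch using only the standard library. The helper names `expected_checkpoint_paths` and `place_checkpoint` are hypothetical, and the names of the files you download from the Hugging Face Hub are not assumed here, so pass whatever path your download produced.

```python
from pathlib import Path
import shutil


def expected_checkpoint_paths(root="saved_models"):
    """Return the checkpoint locations shown in the folder structure above."""
    base = Path(root) / "sevir"
    return {
        "autoencoder": base / "autoencoder" / "models" / "early_stopping_model.pt",
        "flowcast": base / "flowcast" / "models" / "early_stopping_model.pt",
    }


def place_checkpoint(downloaded_file, component, root="saved_models"):
    """Copy a downloaded checkpoint to its expected folder under its expected name."""
    target = expected_checkpoint_paths(root)[component]
    target.parent.mkdir(parents=True, exist_ok=True)  # create sevir/<component>/models/
    shutil.copy2(downloaded_file, target)             # copy and rename in one step
    return target


def missing_checkpoints(root="saved_models"):
    """List the components whose checkpoint file is not in place yet."""
    return [name for name, path in expected_checkpoint_paths(root).items()
            if not path.is_file()]
```

For example, `place_checkpoint("path/to/downloaded_file.pt", "flowcast")` copies the file into `saved_models/sevir/flowcast/models/`, and `missing_checkpoints()` reports which components still need a checkpoint.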
Training times are estimated based on a single node with 4 x H100 GPUs. Inference times are estimated for a single node with 2 x H100 GPUs.
| Model Component | Script | Step | Approx. Time |
|---|---|---|---|
| Autoencoder | dist_train_autoencoder_kl.py | Training | ~24 hours |
| FlowCast (CFM) | dist_train_flowcast.py | Training | ~150 hours |
| FlowCast (CFM) | test_flowcast.py | Inference | ~36 hours |
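To budget compute, the wall-clock estimates above can be converted to GPU-hours by multiplying by the number of GPUs stated for each step (4 for training, 2 for inference). The short calculation below does only that arithmetic. The figures are the approximate ones from the table, so the results are rough estimates as well.

```python
# (step, approximate wall-clock hours, GPUs used), taken from the table above
runs = [
    ("Autoencoder training", 24, 4),
    ("FlowCast training", 150, 4),
    ("FlowCast inference", 36, 2),
]

# GPU-hours = wall-clock hours x number of GPUs
gpu_hours = {name: hours * gpus for name, hours, gpus in runs}

total_training = gpu_hours["Autoencoder training"] + gpu_hours["FlowCast training"]

for name, value in gpu_hours.items():
    print(f"{name}: ~{value} GPU-hours")
print(f"Total training: ~{total_training} GPU-hours")
```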
@inproceedings{
ribeiro2026flowcast,
title={FlowCast: Advancing Precipitation Nowcasting with Conditional Flow Matching},
author={Bernardo Perrone Ribeiro and Jana Faganeli Pucer},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=47ToW7T1iU}
}