FUN-DDPS: Function-space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage
This repository contains the official implementation of FUN-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates for forward and inverse modeling of CO2 storage.
FUN-DDPS decouples the joint diffusion model into two components:
- A single-channel diffusion model trained on geomodel (permeability) fields
- A neural operator (LocalNO) that maps the geomodel to the dynamics (saturation)
This decoupled approach enables efficient posterior sampling for both forward (geomodel -> dynamics) and inverse (dynamics -> geomodel) problems, with improved scalability over joint 2-channel models. With only 25% of observations in forward modeling, FUN-DDPS achieves ~7.7% relative error versus 86.9% for standard approaches (11x improvement). The inverse solver produces physically realistic results validated against rejection sampling posteriors (JSD < 0.06) with 4x better sample efficiency.
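The README does not define the relative error it quotes. As a minimal sketch, assuming a relative L2 norm over the full field, it could be computed as follows; `relative_l2_error` is a hypothetical helper name and the arrays are synthetic stand-ins, not repository data.

```python
import numpy as np


def relative_l2_error(pred: np.ndarray, truth: np.ndarray) -> float:
    """Relative L2 error ||pred - truth|| / ||truth|| over all grid points."""
    denom = np.linalg.norm(truth)
    if denom == 0.0:
        raise ValueError("reference field has zero norm")
    return float(np.linalg.norm(pred - truth) / denom)


# Synthetic 64x64 field standing in for a saturation map (assumption).
rng = np.random.default_rng(0)
truth = rng.random((64, 64))
pred = 1.1 * truth  # a uniform 10% overestimate
err = relative_l2_error(pred, truth)
print(f"relative L2 error: {err:.3f}")
```

If the paper averages the error per sample or restricts it to unobserved grid points, the helper would need to be adapted accordingly.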
| Method | Description |
|---|---|
| FUN-DDPS | Single-channel DM + surrogate LocalNO (decoupled) |
| FUN-DPS | Joint 2-channel DM (geomodel + dynamics) |
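To illustrate why a differentiable surrogate matters in the decoupled setting, the toy sketch below computes a data-misfit gradient with respect to the geomodel by differentiating through a forward map. It is not the repository's sampler: the linear map `W` stands in for LocalNO, the diffusion prior and denoising steps are omitted entirely, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 16  # toy number of grid cells

# Linear map standing in for the surrogate (the real LocalNO is nonlinear).
W = rng.normal(size=(n, n)) / np.sqrt(n)


def surrogate(geomodel: np.ndarray) -> np.ndarray:
    """Toy forward model: geomodel -> dynamics."""
    return W @ geomodel


def misfit_and_grad(geomodel, y_obs, mask):
    """Masked squared misfit and its gradient with respect to the geomodel."""
    residual = mask * (surrogate(geomodel) - y_obs)
    loss = float(residual @ residual)
    grad = 2.0 * W.T @ residual  # chain rule through the linear surrogate
    return loss, grad


# Inverse setting: observe 25% of the dynamics, update the geomodel estimate.
x_true = rng.normal(size=n)
mask = np.zeros(n)
mask[rng.choice(n, size=n // 4, replace=False)] = 1.0
y_obs = mask * surrogate(x_true)

x = rng.normal(size=n)  # stand-in for a partially denoised sample
step = 1.0 / (2.0 * np.linalg.norm(W, 2) ** 2)  # 1 / Lipschitz bound of the gradient
loss_start, _ = misfit_and_grad(x, y_obs, mask)
for _ in range(200):
    _, grad = misfit_and_grad(x, y_obs, mask)
    x = x - step * grad
loss_end, _ = misfit_and_grad(x, y_obs, mask)
print(f"misfit: {loss_start:.4f} -> {loss_end:.4f}")
```

In a posterior sampler, a gradient of this kind would typically be combined with the prior's denoising update at each step rather than applied on its own as plain gradient descent.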
conda env create -f environment.yml
conda activate fun-ddps
pip install -e .
# For surrogate training:
pip install -e ".[surrogate]"

pip install neuraloperator
# Or install from source for latest features:
# git clone https://github.com/neuraloperator/neuraloperator.git
# cd neuraloperator && pip install -e .

All data and pre-trained weights are available for download from Google Drive:
Download FUN-DDPS-release.tar (5.9 GB)
After downloading, extract and place the contents at the repository root:
tar -xf FUN-DDPS-release.tar
cp -r FUN-DDPS-release/data ./
cp -r FUN-DDPS-release/pretrained ./data/
data/
  eclipse_14k_baseline/          # Raw Eclipse simulation outputs (.pt files)
  ccs2d_geomodel_only/           # Single-channel processed data
    train/                       (12,492 samples)
    test/                        (1,390 samples)
  ccs2d_2channel/                # Joint 2-channel processed data
    train/                       (12,492 samples)
    test/                        (1,390 samples)
  pretrained/
    geomodel_dm_best.pt          # Single-channel diffusion model (253 MB)
    joint_dm_best.pt             # Joint 2-channel diffusion model (253 MB)
    surrogate/
      LocalNO_best.pt            # LocalNO surrogate model (33 MB)
      normalization_stats.json   # Input/output normalization statistics
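After extraction, a quick check that the expected files are in place can save a failed run later. The sketch below assumes the paths from the listing above live under `data/`, with the surrogate weights and normalization statistics nested under `pretrained/surrogate/`; that nesting and the helper name `missing_paths` are assumptions, so adjust the list if your layout differs.

```python
from pathlib import Path

# Relative paths assumed from the listing above (nesting is an assumption).
EXPECTED = [
    "eclipse_14k_baseline",
    "ccs2d_geomodel_only/train",
    "ccs2d_geomodel_only/test",
    "ccs2d_2channel/train",
    "ccs2d_2channel/test",
    "pretrained/geomodel_dm_best.pt",
    "pretrained/joint_dm_best.pt",
    "pretrained/surrogate/LocalNO_best.pt",
    "pretrained/surrogate/normalization_stats.json",
]


def missing_paths(data_root, expected=EXPECTED):
    """Return the expected relative paths that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).exists()]


missing = missing_paths("data")
if missing:
    print("Missing under data/:")
    for rel in missing:
        print(f"  {rel}")
else:
    print("All expected files and directories found.")
```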
The datasets are derived from 14,000 Eclipse reservoir simulations of CO2 injection into heterogeneous geological formations.
| Dataset | Description | Channels | Size |
|---|---|---|---|
| eclipse_14k_baseline | Raw simulation outputs containing permeability fields and CO2 saturation maps as PyTorch tensors | -- | 1.4 GB |
| ccs2d_geomodel_only | Processed single-channel .npy arrays of log-permeability fields (64x64), used for training the single-channel diffusion model (FUN-DDPS) | 1 | 1.4 GB |
| ccs2d_2channel | Processed two-channel .npy arrays stacking log-permeability and CO2 saturation (2x64x64), used for training the joint diffusion model (FUN-DPS) | 2 | 2.0 GB |
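The relationship between the two processed formats can be sketched as below. This is only an illustration on synthetic arrays: the natural logarithm, the float32 dtype, and the exact single-channel array shape are assumptions, and the actual preprocessing lives in `merge_ccs2d_data.py`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for one simulation (not repository data).
permeability = rng.lognormal(mean=3.0, sigma=1.0, size=(64, 64))  # strictly positive
saturation = rng.random((64, 64))                                 # values in [0, 1)

log_perm = np.log(permeability)  # natural log assumed; the base is not stated

# Single-channel sample: log-permeability only.
geomodel_only = log_perm.astype(np.float32)

# Two-channel sample: log-permeability stacked with saturation along axis 0.
two_channel = np.stack([log_perm, saturation]).astype(np.float32)

print(geomodel_only.shape, two_channel.shape)
```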
Note: Rejection sampling data (rs_data/, ~287 GB) is not included due to size. See experiments/rs_benchmark/ for scripts to regenerate it.

RS posteriors: Pre-computed rejection sampling posterior samples (posterior.pt and gt.pt) are available for download from Google Drive. These contain 26,082 accepted samples (from 2M proposals) for the columns observation case and are used to validate the DPS/DDPS approximate posteriors in the paper.
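The README does not say how the Jensen-Shannon divergence (JSD) between the rejection sampling posterior and the approximate posterior is estimated. One common approach, shown here as a sketch under assumptions (histogram binning of a scalar quantity, base-2 logarithm so the value lies in [0, 1]), is the following; `js_divergence` is a hypothetical helper.

```python
import numpy as np


def js_divergence(p_samples: np.ndarray, q_samples: np.ndarray, bins: int = 50) -> float:
    """Histogram-based Jensen-Shannon divergence (base 2) between two sample sets."""
    lo = min(p_samples.min(), q_samples.min())
    hi = max(p_samples.max(), q_samples.max())
    p, _ = np.histogram(p_samples, bins=bins, range=(lo, hi))
    q, _ = np.histogram(q_samples, bins=bins, range=(lo, hi))
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)

    def kl(a, b):
        nz = a > 0  # terms with a == 0 contribute nothing
        return np.sum(a[nz] * np.log2(a[nz] / b[nz]))

    return float(0.5 * kl(p, m) + 0.5 * kl(q, m))


# Two synthetic sample sets with slightly different means (not repository data).
rng = np.random.default_rng(0)
reference = rng.normal(0.0, 1.0, size=20000)
approx = rng.normal(0.1, 1.0, size=20000)
print(f"JSD: {js_divergence(reference, approx):.4f}")
```

The estimate depends on the number of bins and on which scalar quantity is compared, so values from this sketch are not directly comparable to the paper's threshold.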
| Model | File | Description |
|---|---|---|
| Single-channel DM | geomodel_dm_best.pt | UNO-based diffusion model trained on permeability fields only. Used as the prior in FUN-DDPS. |
| Joint 2-channel DM | joint_dm_best.pt | UNO-based diffusion model trained on joint permeability + saturation fields. Used as the prior in FUN-DPS. |
| LocalNO surrogate | LocalNO_best.pt | Local neural operator mapping permeability to CO2 saturation. Used as the differentiable forward model in FUN-DDPS for posterior sampling. |
| Normalization stats | normalization_stats.json | Per-channel mean and standard deviation computed from training data, used to normalize surrogate inputs/outputs. |
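As a sketch of how per-channel statistics are typically applied around a surrogate call, the example below normalizes an input field and maps a normalized output back to physical units. The JSON keys (`input`, `output`, `mean`, `std`) and the numeric values are hypothetical; inspect `normalization_stats.json` for the actual schema.

```python
import json

import numpy as np

# Hypothetical schema and values; the real file may be organized differently.
stats = json.loads(
    '{"input": {"mean": 3.0, "std": 1.5}, "output": {"mean": 0.2, "std": 0.25}}'
)


def normalize(x: np.ndarray, s: dict) -> np.ndarray:
    """Standardize with the stored mean and standard deviation."""
    return (x - s["mean"]) / s["std"]


def denormalize(z: np.ndarray, s: dict) -> np.ndarray:
    """Invert normalize()."""
    return z * s["std"] + s["mean"]


rng = np.random.default_rng(0)
log_perm = rng.normal(3.0, 1.5, size=(64, 64))  # synthetic input field
z_in = normalize(log_perm, stats["input"])

# A surrogate would map z_in to a normalized output; zeros stand in for it here.
z_out = np.zeros_like(z_in)
saturation = denormalize(z_out, stats["output"])
print(float(saturation.mean()))
```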
If you want to re-process the data from the raw Eclipse outputs:
# Process 2-channel data (geomodel + dynamics)
python merge_ccs2d_data.py train --input_dir data/eclipse_14k_baseline
python merge_ccs2d_data.py test --input_dir data/eclipse_14k_baseline
# Process single-channel data (geomodel only)
python merge_ccs2d_data.py train --geomodel-only --input_dir data/eclipse_14k_baseline
python merge_ccs2d_data.py test --geomodel-only --input_dir data/eclipse_14k_baseline

# Train the LocalNO surrogate
python surrogate/train_surrogate.py \
--output_dir ./outputs/surrogate \
--models localno \
--task forward \
--epochs 100 \
--batch_size 32

# Train the single-channel diffusion model (geomodel only)
accelerate launch \
--config_file configs/accelerate/accelerate_config_uno.yaml \
train_acc.py \
--config configs/training/ccs2d_geomodel_only.yml

# Train the joint 2-channel diffusion model
accelerate launch \
--config_file configs/accelerate/accelerate_config_uno.yaml \
train_acc.py \
--config configs/training/ccs2d_uno_2channel.yml

# Forward problem (FUN-DPS, joint model)
python generate_pde.py --config configs/inference/ccs2d-forward-ensemble-ulno.yaml
# Single-channel inference
python generate_pde.py --config configs/inference/ccs2d_single_dps.yaml

# FUN-DDPS forward problem
python experiments/scripts/run_experiment.py \
--method ddps --exp forward --obs full \
--num_samples 10 --iterations 500 --batch_size 16 \
--batched --save_samples
# FUN-DDPS inverse problem
python experiments/scripts/run_experiment.py \
--method ddps --exp inverse --obs full \
--num_samples 10 --iterations 500 --batch_size 16 \
--batched --save_samples
# FUN-DPS forward problem
python experiments/scripts/run_experiment.py \
--method dps --exp forward --obs full \
--num_samples 10 --iterations 500 --batch_size 16 \
--batched --save_samples
# FUN-DPS inverse problem
python experiments/scripts/run_experiment.py \
--method dps --exp inverse --obs full \
--num_samples 10 --iterations 500 --batch_size 16 \
--batched --save_samples
# Surrogate baseline (forward only)
python experiments/scripts/run_experiment.py \
--method surrogate --exp forward --obs full \
--num_samples 10 --save_samples

| Mode | Flag | Description |
|---|---|---|
| Full | --obs full | Observe entire field |
| Random 50% | --obs random_50 | Random 50% of grid points |
| Random 25% | --obs random_25 | Random 25% of grid points |
| Column | --obs columns | Observe specific column indices |
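The observation modes above can be pictured as boolean masks over the 64x64 grid. The sketch below is one plausible construction, not the repository's implementation: the helper name `make_mask`, the seeding, and the example column indices are all assumptions.

```python
import numpy as np


def make_mask(mode: str, shape=(64, 64), columns=(16, 32, 48), seed=0) -> np.ndarray:
    """Boolean observation mask; True marks an observed grid point."""
    h, w = shape
    if mode == "full":
        return np.ones(shape, dtype=bool)
    if mode in ("random_50", "random_25"):
        frac = int(mode.split("_")[1]) / 100.0
        n_obs = round(frac * h * w)
        rng = np.random.default_rng(seed)
        flat = np.zeros(h * w, dtype=bool)
        flat[rng.choice(h * w, size=n_obs, replace=False)] = True
        return flat.reshape(shape)
    if mode == "columns":
        mask = np.zeros(shape, dtype=bool)
        mask[:, list(columns)] = True  # hypothetical column indices
        return mask
    raise ValueError(f"unknown observation mode: {mode}")


for mode in ("full", "random_50", "random_25", "columns"):
    m = make_mask(mode)
    print(f"{mode:10s} observed fraction = {m.mean():.3f}")
```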
# Run rejection sampling
python experiments/rs_benchmark/run_rs.py \
--config experiments/rs_benchmark/configs/rs_config.yaml
# Compare RS vs DPS/DDPS
python experiments/rs_benchmark/compare_rs_dps.py \
--rs_dir experiments/rs_benchmark/results/ \
--dps_dir experiments/results/

See experiments/rs_benchmark/RS_BENCHMARK.md for details.
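For readers unfamiliar with rejection sampling as a reference posterior, here is a toy sketch under stated assumptions: proposals are drawn from a prior, pushed through a forward model, and kept only when the prediction falls within a tolerance of the observation. The tolerance-based acceptance rule, the two-parameter forward model, and all names are illustrative assumptions; the repository's actual criterion is described in RS_BENCHMARK.md. For scale, 26,082 accepted samples out of 2M proposals corresponds to an acceptance rate of about 1.3%.

```python
import numpy as np


def sample_prior(n: int, rng: np.random.Generator) -> np.ndarray:
    """Toy prior: two independent standard normal parameters."""
    return rng.normal(size=(n, 2))


def forward(params: np.ndarray) -> np.ndarray:
    """Toy forward model: the observation is the sum of the two parameters."""
    return params[:, :1] + params[:, 1:]


def rejection_sample(y_obs, tol, n_proposals, rng):
    """Keep proposals whose prediction lies within tol of the observation."""
    proposals = sample_prior(n_proposals, rng)
    dist = np.linalg.norm(forward(proposals) - y_obs, axis=-1)
    return proposals[dist <= tol]


rng = np.random.default_rng(0)
y_obs = np.array([1.0])
n_proposals = 20000
accepted = rejection_sample(y_obs, tol=0.1, n_proposals=n_proposals, rng=rng)
print(f"accepted {len(accepted)} of {n_proposals} "
      f"({100.0 * len(accepted) / n_proposals:.2f}%)")
```

Tightening the tolerance brings the accepted set closer to the true posterior at the cost of a lower acceptance rate, which is why the full benchmark needs millions of proposals.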
python experiments/scripts/visualize_results.py \
--results_dir experiments/results/ \
--exp forward \
--obs full \
--save_dir results/visualization/

FUN-DDPS/
  train_acc.py            # Diffusion model training
  generate_pde.py         # Inference entry point
  merge_ccs2d_data.py     # Data preparation
  configs/
    training/             # Training configs
    accelerate/           # Distributed training configs
    inference/            # Inference configs
  training/               # Training infrastructure
  generation/
    ccs2d/                # Joint 2-channel (FUN-DPS)
    ccs2d_single/         # Single-channel + surrogate (FUN-DDPS)
    scripts/              # Generation scripts
  surrogate/              # Surrogate model training
  experiments/
    scripts/              # Experiment runner & visualization
    configs/              # Experiment configs
    rs_benchmark/         # Rejection sampling benchmark
  bash_scripts/           # Shell scripts for training/inference
  torch_utils/            # Utility functions
  dnnlib/                 # Deep learning utilities
If you find this work useful, please cite:
@article{ju2026function,
title={Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage},
author={Ju, Xin and Yao, Jiachen and Anandkumar, Anima and Benson, Sally M and Wen, Gege},
journal={arXiv preprint arXiv:2602.12274},
year={2026}
}

See LICENSE for details.
