FUN-DDPS: Function-space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage

This repository contains the official implementation of FUN-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates for forward and inverse modeling of CO2 storage.

Overview

FUN-DDPS decouples the joint diffusion model into:

A single-channel diffusion model trained on geomodel (permeability) fields
A neural operator (LocalNO) that maps the geomodel to the dynamics (CO2 saturation)

This decoupled approach enables efficient posterior sampling for both forward (geomodel -> dynamics) and inverse (dynamics -> geomodel) problems, and scales better than joint 2-channel models. In forward modeling with only 25% of the field observed, FUN-DDPS achieves ~7.7% relative error versus 86.9% for standard approaches (an 11x improvement). The inverse solver produces physically realistic results, validated against rejection sampling posteriors (JSD < 0.06), with 4x better sample efficiency.
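The relative error quoted above is not defined in this README. A common choice, assumed here, is the relative L2 norm of the difference between predicted and reference fields. The helper name `relative_l2_error` and the toy values are hypothetical and only illustrate the metric; the last line checks the ratio of the two quoted figures.

```python
import math

def relative_l2_error(pred, true):
    """Relative L2 error ||pred - true||_2 / ||true||_2 over flattened fields."""
    if len(pred) != len(true):
        raise ValueError("pred and true must have the same length")
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, true)))
    den = math.sqrt(sum(t ** 2 for t in true))
    if den == 0.0:
        raise ValueError("reference field has zero norm")
    return num / den

# Toy saturation values, made up for illustration only.
true = [0.0, 0.2, 0.4, 0.8]
pred = [0.0, 0.2, 0.4, 0.7]
err = relative_l2_error(pred, true)
print(f"relative L2 error: {err:.3f}")

# Ratio of the two error figures quoted in the overview (86.9% vs 7.7%).
improvement = 86.9 / 7.7
print(f"improvement factor: {improvement:.1f}x")
```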

Methods

Method	Description
FUN-DDPS	Single-channel DM + surrogate LocalNO (decoupled)
FUN-DPS	Joint 2-channel DM (geomodel + dynamics)
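The "DM + surrogate" pairing in the table relies on a forward map that can be differentiated, so that a mismatch with observed dynamics can be pushed back onto the geomodel. The sketch below shows only that ingredient, under strong assumptions: a random linear matrix `A` stands in for the LocalNO surrogate, the field has 8 values instead of 64x64, and the reverse-diffusion (denoising) steps that a real sampler interleaves with these updates are omitted. None of the names come from this repository.

```python
import numpy as np

rng = np.random.default_rng(0)

n = 8                                      # toy geomodel size (real fields are 64x64)
A = rng.normal(size=(n, n)) / np.sqrt(n)   # ASSUMPTION: linear stand-in for the surrogate
x_true = rng.normal(size=n)                # "true" geomodel
mask = np.zeros(n)
mask[::2] = 1.0                            # observe every other dynamics value
y_obs = mask * (A @ x_true)                # sparse, noise-free observations

def misfit(x):
    """Masked data misfit 0.5 * ||mask * A x - y_obs||^2."""
    r = mask * (A @ x) - y_obs
    return 0.5 * float(r @ r)

def misfit_grad(x):
    """Gradient of the masked misfit, back-propagated through the linear map."""
    return A.T @ (mask * (mask * (A @ x) - y_obs))

x = rng.normal(size=n)                     # stand-in for a denoised geomodel estimate
initial = misfit(x)
step = 0.1
for _ in range(200):
    x = x - step * misfit_grad(x)          # guidance update only; no denoising here
final = misfit(x)
print(f"misfit: {initial:.4f} -> {final:.6f}")
```

With a neural surrogate the analytic gradient would be replaced by automatic differentiation; the structure of the update stays the same.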

Installation

Option 1: Conda environment

conda env create -f environment.yml
conda activate fun-ddps
pip install -e .

Option 2: pip only

pip install -e .
# For surrogate training:
pip install -e ".[surrogate]"

Neural operator dependency

pip install neuraloperator
# Or install from source for latest features:
# git clone https://github.com/neuraloperator/neuraloperator.git
# cd neuraloperator && pip install -e .
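After installing, a quick check that the key packages are visible to the active interpreter can save a failed training launch. The sketch below uses only the standard library; note that the `neuraloperator` distribution is imported under the module name `neuralop`. The helper name `is_installed` is hypothetical.

```python
import importlib.util

def is_installed(module_name):
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None

# PyTorch and the neural operator library (imported as `neuralop`).
status = {name: is_installed(name) for name in ("torch", "neuralop")}
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```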

Data and Pre-trained Models

All data and pre-trained weights are available for download from Google Drive:

Download FUN-DDPS-release.tar (5.9 GB)

After downloading, extract and place the contents at the repository root:

tar -xf FUN-DDPS-release.tar
cp -r FUN-DDPS-release/data ./
cp -r FUN-DDPS-release/pretrained ./

Directory structure

data/
  eclipse_14k_baseline/             # Raw Eclipse simulation outputs (.pt files)
  ccs2d_geomodel_only/              # Single-channel processed data
    train/  (12,492 samples)
    test/   (1,390 samples)
  ccs2d_2channel/                   # Joint 2-channel processed data
    train/  (12,492 samples)
    test/   (1,390 samples)

pretrained/
  geomodel_dm_best.pt               # Single-channel diffusion model (253 MB)
  joint_dm_best.pt                  # Joint 2-channel diffusion model (253 MB)
  surrogate/
    LocalNO_best.pt                 # LocalNO surrogate model (33 MB)
    normalization_stats.json        # Input/output normalization statistics
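To confirm that the archive was extracted into the right place, the layout above can be checked with a short script run from the repository root. This is a minimal sketch: the paths are copied from the listing, while the helper name `missing_paths` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Paths copied from the directory listing above.
EXPECTED = [
    "data/eclipse_14k_baseline",
    "data/ccs2d_geomodel_only/train",
    "data/ccs2d_geomodel_only/test",
    "data/ccs2d_2channel/train",
    "data/ccs2d_2channel/test",
    "pretrained/geomodel_dm_best.pt",
    "pretrained/joint_dm_best.pt",
    "pretrained/surrogate/LocalNO_best.pt",
    "pretrained/surrogate/normalization_stats.json",
]

def missing_paths(root="."):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED if not (root / p).exists()]

if __name__ == "__main__":
    missing = missing_paths()
    print("layout OK" if not missing else "missing:\n  " + "\n  ".join(missing))
```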

Dataset details

The datasets are derived from 14,000 Eclipse reservoir simulations of CO2 injection into heterogeneous geological formations.

Dataset	Description	Channels	Size
`eclipse_14k_baseline`	Raw simulation outputs containing permeability fields and CO2 saturation maps as PyTorch tensors	--	1.4 GB
`ccs2d_geomodel_only`	Processed single-channel `.npy` arrays of log-permeability fields (64x64), used for training the single-channel diffusion model (FUN-DDPS)	1	1.4 GB
`ccs2d_2channel`	Processed two-channel `.npy` arrays stacking log-permeability and CO2 saturation (2x64x64), used for training the joint diffusion model (FUN-DPS)	2	2.0 GB
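The shapes in the table can be turned into a quick sanity check for loaded arrays. The sketch below assumes NumPy is installed and uses synthetic zero-filled fields, because the file names inside `train/` and `test/` are not listed here. It also assumes that log-permeability is channel 0 and saturation is channel 1, and it only checks trailing dimensions so that a leading batch dimension would also pass. `check_array` is a hypothetical helper.

```python
import numpy as np

# Trailing shapes taken from the dataset table.
EXPECTED_TRAILING_SHAPE = {
    "ccs2d_geomodel_only": (64, 64),
    "ccs2d_2channel": (2, 64, 64),
}

def check_array(arr, dataset):
    """Validate trailing shape and finiteness; return the full shape."""
    expected = EXPECTED_TRAILING_SHAPE[dataset]
    if arr.shape[-len(expected):] != expected:
        raise ValueError(f"{dataset}: expected trailing shape {expected}, got {arr.shape}")
    if not np.isfinite(arr).all():
        raise ValueError(f"{dataset}: array contains NaN or inf")
    return arr.shape

# Synthetic stand-ins for one sample (not real data).
log_perm = np.zeros((64, 64), dtype=np.float32)
saturation = np.zeros((64, 64), dtype=np.float32)
joint = np.stack([log_perm, saturation])   # ASSUMED channel order: permeability, saturation

print(check_array(log_perm, "ccs2d_geomodel_only"))
print(check_array(joint, "ccs2d_2channel"))
```

For a real file, the same check would be applied to the result of `np.load(path)`.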

Note: Rejection sampling data (rs_data/, ~287 GB) is not included due to size. See experiments/rs_benchmark/ for scripts to regenerate it.

RS posteriors: Pre-computed rejection sampling posterior samples (posterior.pt and gt.pt) are available for download from Google Drive. These contain 26,082 accepted samples (from 2M proposals, an acceptance rate of about 1.3%) for the `columns` observation case, and are used to validate the DPS/DDPS approximate posteriors in the paper.
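For readers unfamiliar with rejection sampling, the idea is to draw proposals from the prior and keep each one with a probability tied to how well it explains the observation. The toy below is a sketch under stated assumptions, not the benchmark code: it uses a scalar parameter, a linear forward map, and a Gaussian likelihood scaled to a maximum of 1 as the acceptance probability. The actual acceptance rule used for `rs_data/` is not described in this README. The last lines reproduce the acceptance rate implied by the sample counts above.

```python
import math
import random

def rejection_sample(prior_sampler, forward, y_obs, sigma, n_proposals, rng):
    """Keep prior draws with probability exp(-0.5 * ((y_obs - f(x)) / sigma)^2)."""
    accepted = []
    for _ in range(n_proposals):
        x = prior_sampler(rng)
        residual = y_obs - forward(x)
        # Gaussian likelihood scaled so that its maximum is 1 (ASSUMED rule).
        accept_prob = math.exp(-0.5 * (residual / sigma) ** 2)
        if rng.random() < accept_prob:
            accepted.append(x)
    return accepted

rng = random.Random(0)
n_proposals = 20_000
samples = rejection_sample(
    prior_sampler=lambda r: r.gauss(0.0, 1.0),   # standard normal prior
    forward=lambda x: 2.0 * x,                   # toy linear forward map
    y_obs=1.0,
    sigma=0.5,
    n_proposals=n_proposals,
    rng=rng,
)
toy_rate = len(samples) / n_proposals
posterior_mean = sum(samples) / len(samples)
print(f"toy acceptance rate: {toy_rate:.3f}, posterior mean: {posterior_mean:.3f}")

# Acceptance rate implied by the counts quoted above.
reported_rate = 26_082 / 2_000_000
print(f"reported acceptance rate: {reported_rate:.2%}")
```

For this linear-Gaussian toy the exact posterior mean is 8/17 (about 0.47), which gives a reference value for the accepted samples.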

Pre-trained model details

Model	File	Description
Single-channel DM	`geomodel_dm_best.pt`	UNO-based diffusion model trained on permeability fields only. Used as the prior in FUN-DDPS.
Joint 2-channel DM	`joint_dm_best.pt`	UNO-based diffusion model trained on joint permeability + saturation fields. Used as the prior in FUN-DPS.
LocalNO surrogate	`LocalNO_best.pt`	Local neural operator mapping permeability to CO2 saturation. Used as the differentiable forward model in FUN-DDPS for posterior sampling.
Normalization stats	`normalization_stats.json`	Per-channel mean and standard deviation computed from training data, used to normalize surrogate inputs/outputs.
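The normalization statistics are applied before and after the surrogate call: inputs are standardized, and outputs are mapped back to physical units. The sketch below illustrates that round trip with the standard library only. The JSON layout (`input`/`output` keys with `mean` and `std`) and all numeric values are assumptions made for illustration; the actual schema of `normalization_stats.json` may differ.

```python
import json

# HYPOTHETICAL schema and values; inspect the real file before relying on key names.
stats_json = '{"input": {"mean": 1.5, "std": 2.0}, "output": {"mean": 0.1, "std": 0.25}}'
stats = json.loads(stats_json)

def normalize(values, mean, std):
    """Standardize values with a per-channel mean and standard deviation."""
    return [(v - mean) / std for v in values]

def denormalize(values, mean, std):
    """Invert `normalize`."""
    return [v * std + mean for v in values]

log_perm = [-0.5, 1.5, 3.5]                       # toy log-permeability values
x_norm = normalize(log_perm, **stats["input"])    # what the surrogate would receive
x_back = denormalize(x_norm, **stats["input"])    # round trip back to the input scale

y_norm = [0.0, 2.0]                               # pretend surrogate output (normalized)
saturation = denormalize(y_norm, **stats["output"])
print(x_norm, saturation)
```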

Data preparation (from raw data)

If you want to re-process the data from the raw Eclipse outputs:

# Process 2-channel data (geomodel + dynamics)
python merge_ccs2d_data.py train --input_dir data/eclipse_14k_baseline
python merge_ccs2d_data.py test --input_dir data/eclipse_14k_baseline

# Process single-channel data (geomodel only)
python merge_ccs2d_data.py train --geomodel-only --input_dir data/eclipse_14k_baseline
python merge_ccs2d_data.py test --geomodel-only --input_dir data/eclipse_14k_baseline
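The four commands above differ only in the split and the `--geomodel-only` flag, so they can be generated in a loop. The following is a minimal sketch that uses only the arguments shown above; the wrapper functions `build_commands` and `run_all` are hypothetical, and the script defaults to a dry run that prints the commands without executing them.

```python
import shlex
import subprocess
import sys

RAW_DIR = "data/eclipse_14k_baseline"

def build_commands(raw_dir=RAW_DIR):
    """Build the 2-channel and geomodel-only commands for both splits."""
    commands = []
    for geomodel_only in (False, True):
        for split in ("train", "test"):
            cmd = [sys.executable, "merge_ccs2d_data.py", split]
            if geomodel_only:
                cmd.append("--geomodel-only")
            cmd += ["--input_dir", raw_dir]
            commands.append(cmd)
    return commands

def run_all(dry_run=True):
    """Print each command and, unless dry_run is set, execute it."""
    for cmd in build_commands():
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)   # stop at the first failure

run_all(dry_run=True)
```

Calling `run_all(dry_run=False)` from the repository root would execute the four steps in order.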

Training

1. Train the surrogate neural operator

python surrogate/train_surrogate.py \
    --output_dir ./outputs/surrogate \
    --models localno \
    --task forward \
    --epochs 100 \
    --batch_size 32

2. Train the single-channel diffusion model (FUN-DDPS)

accelerate launch \
    --config_file configs/accelerate/accelerate_config_uno.yaml \
    train_acc.py \
    --config configs/training/ccs2d_geomodel_only.yml

3. Train the joint 2-channel diffusion model (FUN-DPS)

accelerate launch \
    --config_file configs/accelerate/accelerate_config_uno.yaml \
    train_acc.py \
    --config configs/training/ccs2d_uno_2channel.yml

Inference

Config-based inference

# Forward problem (FUN-DPS, joint model)
python generate_pde.py --config configs/inference/ccs2d-forward-ensemble-ulno.yaml

# Single-channel inference
python generate_pde.py --config configs/inference/ccs2d_single_dps.yaml

Experiment runner (recommended)

# FUN-DDPS forward problem
python experiments/scripts/run_experiment.py \
    --method ddps --exp forward --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# FUN-DDPS inverse problem
python experiments/scripts/run_experiment.py \
    --method ddps --exp inverse --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# FUN-DPS forward problem
python experiments/scripts/run_experiment.py \
    --method dps --exp forward --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# FUN-DPS inverse problem
python experiments/scripts/run_experiment.py \
    --method dps --exp inverse --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# Surrogate baseline (forward only)
python experiments/scripts/run_experiment.py \
    --method surrogate --exp forward --obs full \
    --num_samples 10 --save_samples
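To run the full grid of methods, problems, and observation modes, the commands above can be generated rather than typed by hand. This sketch reuses only the flags shown above and the four `--obs` values documented in this README. It assumes that every diffusion-based combination is supported by `run_experiment.py`, which is not confirmed here, and it keeps the surrogate baseline at `--obs full` as in the example. The helper names are hypothetical.

```python
import itertools
import shlex

RUNNER = "experiments/scripts/run_experiment.py"
OBS_MODES = ["full", "random_50", "random_25", "columns"]

def build_command(method, exp, obs):
    """Assemble one run_experiment.py invocation from the flags shown above."""
    cmd = ["python", RUNNER, "--method", method, "--exp", exp, "--obs", obs,
           "--num_samples", "10"]
    if method in ("ddps", "dps"):
        # Sampler settings are used only for the diffusion-based methods.
        cmd += ["--iterations", "500", "--batch_size", "16", "--batched"]
    cmd.append("--save_samples")
    return cmd

def build_sweep():
    """All ddps/dps combinations plus the forward-only surrogate baseline."""
    runs = list(itertools.product(["ddps", "dps"], ["forward", "inverse"], OBS_MODES))
    runs.append(("surrogate", "forward", "full"))
    return [build_command(*run) for run in runs]

for cmd in build_sweep():
    print(shlex.join(cmd))
```

The script only prints the commands; piping the output to a shell or a job scheduler is left to the user.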

Observation modes

Mode	Flag	Description
Full	`--obs full`	Observe entire field
Random 50%	`--obs random_50`	Random 50% of grid points
Random 25%	`--obs random_25`	Random 25% of grid points
Column	`--obs columns`	Observe specific column indices
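Each observation mode corresponds to a binary mask over the 64x64 grid. The sketch below shows one way such masks could be built with NumPy. It is not the repository's implementation: the function name `make_mask`, the default column indices, and the seeding scheme are all assumptions for illustration.

```python
import numpy as np

def make_mask(mode, shape=(64, 64), columns=(16, 32, 48), seed=0):
    """Return a boolean mask that is True where the field is observed."""
    rng = np.random.default_rng(seed)
    mask = np.zeros(shape, dtype=bool)
    if mode == "full":
        mask[:] = True
    elif mode.startswith("random_"):
        fraction = int(mode.split("_")[1]) / 100.0
        n_obs = round(fraction * mask.size)
        idx = rng.choice(mask.size, size=n_obs, replace=False)
        mask.reshape(-1)[idx] = True          # reshape gives a view, so this edits mask
    elif mode == "columns":
        mask[:, list(columns)] = True         # HYPOTHETICAL column indices
    else:
        raise ValueError(f"unknown observation mode: {mode}")
    return mask

for mode in ("full", "random_50", "random_25", "columns"):
    m = make_mask(mode)
    print(f"{mode:10s} observed fraction = {m.mean():.3f}")
```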

Rejection Sampling Benchmark

# Run rejection sampling
python experiments/rs_benchmark/run_rs.py \
    --config experiments/rs_benchmark/configs/rs_config.yaml

# Compare RS vs DPS/DDPS
python experiments/rs_benchmark/compare_rs_dps.py \
    --rs_dir experiments/rs_benchmark/results/ \
    --dps_dir experiments/results/

See experiments/rs_benchmark/RS_BENCHMARK.md for details.
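The comparison between rejection sampling and DPS/DDPS posteriors is summarized in the overview by a Jensen-Shannon divergence (JSD). As a reference for that metric, the following sketch computes the JSD between two histograms using the standard library. The log base (2, which bounds the value in [0, 1]) and the use of pre-binned histograms are assumptions; `compare_rs_dps.py` may bin or normalize differently.

```python
import math

def jensen_shannon_divergence(p, q, base=2.0):
    """JSD between two non-negative histograms defined on the same bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    p_sum, q_sum = sum(p), sum(q)
    if p_sum <= 0 or q_sum <= 0:
        raise ValueError("histograms must have positive total mass")
    p = [v / p_sum for v in p]
    q = [v / q_sum for v in q]
    m = [(a + b) / 2.0 for a, b in zip(p, q)]   # mixture distribution

    def kl(a, b):
        # Terms with a == 0 contribute nothing; b > 0 wherever a > 0.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Toy histograms (counts per bin), made up for illustration.
rs_hist = [5, 20, 50, 20, 5]
dps_hist = [6, 22, 46, 21, 5]
print(f"JSD = {jensen_shannon_divergence(rs_hist, dps_hist):.4f}")
```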

Visualization

python experiments/scripts/visualize_results.py \
    --results_dir experiments/results/ \
    --exp forward \
    --obs full \
    --save_dir results/visualization/

Project Structure

FUN-DDPS/
  train_acc.py                      # Diffusion model training
  generate_pde.py                   # Inference entry point
  merge_ccs2d_data.py               # Data preparation
  configs/
    training/                       # Training configs
    accelerate/                     # Distributed training configs
    inference/                      # Inference configs
  training/                         # Training infrastructure
  generation/
    ccs2d/                          # Joint 2-channel (FUN-DPS)
    ccs2d_single/                   # Single-channel + surrogate (FUN-DDPS)
  scripts/                          # Generation scripts
  surrogate/                        # Surrogate model training
  experiments/
    scripts/                        # Experiment runner & visualization
    configs/                        # Experiment configs
    rs_benchmark/                   # Rejection sampling benchmark
  bash_scripts/                     # Shell scripts for training/inference
  torch_utils/                      # Utility functions
  dnnlib/                           # Deep learning utilities

Citation

If you find this work useful, please cite:

@article{ju2026function,
  title={Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage},
  author={Ju, Xin and Yao, Jiachen and Anandkumar, Anima and Benson, Sally M and Wen, Gege},
  journal={arXiv preprint arXiv:2602.12274},
  year={2026}
}

License

See LICENSE for details.
