neuraloperator/Fun-DDPS

FUN-DDPS: Function-space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage

[Paper]

This repository contains the official implementation of FUN-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates for forward and inverse modeling of CO2 storage.

Overview

[Figure: FUN-DDPS schematic]

FUN-DDPS decouples the joint diffusion model into:

  • Single-channel diffusion model trained on geomodel (permeability) fields
  • Neural operator (LocalNO) mapping geomodel to dynamics (saturation)

This decoupled approach enables efficient posterior sampling for both forward (geomodel -> dynamics) and inverse (dynamics -> geomodel) problems, and scales better than joint 2-channel models. In forward modeling with only 25% of the field observed, FUN-DDPS achieves about 7.7% relative error versus 86.9% for standard approaches, roughly an 11x improvement. The inverse solver produces physically realistic results that are validated against rejection sampling posteriors (JSD < 0.06), with 4x better sample efficiency.
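
The improvement factor quoted above follows directly from the two error figures. The sketch below reproduces that arithmetic and shows one common definition of relative error, the relative L2 norm. Whether the paper uses exactly this definition is an assumption, and `relative_l2_error` is an illustrative helper, not a function from this repository.

```python
import math

def relative_l2_error(pred, truth):
    """Relative L2 error ||pred - truth|| / ||truth|| over flat sequences.

    Assumption: the README does not define its error metric; this is one
    common choice, used here for illustration only.
    """
    num = math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)))
    den = math.sqrt(sum(t ** 2 for t in truth))
    return num / den

# Error figures quoted in the overview (percent).
fun_ddps_error = 7.7
baseline_error = 86.9
improvement = baseline_error / fun_ddps_error  # about 11.3, quoted as 11x

# Toy usage on a made-up 4-point "field".
toy_error = relative_l2_error([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(f"improvement factor: {improvement:.1f}x, toy error: {toy_error:.3f}")
```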

Methods

| Method | Description |
| --- | --- |
| FUN-DDPS | Single-channel DM + surrogate LocalNO (decoupled) |
| FUN-DPS | Joint 2-channel DM (geomodel + dynamics) |
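
To make the decoupled row concrete: in FUN-DDPS the observation lives in dynamics space, so the data misfit has to be pushed back to the geomodel through the differentiable surrogate. The toy sketch below shows only that chain-rule step on a 1-D field, with a hand-written elementwise function standing in for LocalNO. Everything in it (`toy_surrogate`, the step size, the field values) is hypothetical, and the diffusion prior and denoising steps of the actual sampler are omitted.

```python
import math

def toy_surrogate(geomodel):
    """Hypothetical stand-in for LocalNO: an elementwise sigmoid."""
    return [1.0 / (1.0 + math.exp(-k)) for k in geomodel]

def misfit_and_grad(geomodel, observed, mask):
    """Masked squared misfit and its gradient w.r.t. the geomodel.

    For s = sigmoid(k), ds/dk = s * (1 - s), so by the chain rule
    d/dk (s - y)^2 = 2 * (s - y) * s * (1 - s) at observed points.
    """
    sat = toy_surrogate(geomodel)
    loss, grad = 0.0, []
    for s, y, m in zip(sat, observed, mask):
        r = m * (s - y)                      # residual, zero where unobserved
        loss += r * r
        grad.append(2.0 * r * s * (1.0 - s))
    return loss, grad

true_geomodel = [0.5, -1.0, 2.0, 0.0]
observed = toy_surrogate(true_geomodel)      # synthetic "saturation" data
mask = [1, 0, 1, 1]                          # second point is unobserved

estimate = [0.0, 0.0, 0.0, 0.0]              # initial geomodel guess
initial_loss, _ = misfit_and_grad(estimate, observed, mask)
for _ in range(200):
    _, grad = misfit_and_grad(estimate, observed, mask)
    estimate = [k - 1.0 * g for k, g in zip(estimate, grad)]
final_loss, _ = misfit_and_grad(estimate, observed, mask)
print(f"misfit: {initial_loss:.4f} -> {final_loss:.6f}")
```

In this toy the unobserved point receives zero gradient and never moves from its initial guess, which illustrates why a prior over geomodels is needed alongside the data term.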

Installation

Option 1: Conda environment

conda env create -f environment.yml
conda activate fun-ddps
pip install -e .

Option 2: pip only

pip install -e .
# For surrogate training:
pip install -e ".[surrogate]"

Neural operator dependency

pip install neuraloperator
# Or install from source for latest features:
# git clone https://github.com/neuraloperator/neuraloperator.git
# cd neuraloperator && pip install -e .

Data and Pre-trained Models

All data and pre-trained weights are available for download from Google Drive:

Download FUN-DDPS-release.tar (5.9 GB)

After downloading, extract and place the contents at the repository root:

tar -xf FUN-DDPS-release.tar
cp -r FUN-DDPS-release/data ./
cp -r FUN-DDPS-release/pretrained ./

Directory structure

data/
  eclipse_14k_baseline/             # Raw Eclipse simulation outputs (.pt files)
  ccs2d_geomodel_only/             # Single-channel processed data
    train/  (12,492 samples)
    test/   (1,390 samples)
  ccs2d_2channel/                   # Joint 2-channel processed data
    train/  (12,492 samples)
    test/   (1,390 samples)

pretrained/
  geomodel_dm_best.pt              # Single-channel diffusion model (253 MB)
  joint_dm_best.pt                  # Joint 2-channel diffusion model (253 MB)
  surrogate/
    LocalNO_best.pt                 # LocalNO surrogate model (33 MB)
    normalization_stats.json        # Input/output normalization statistics
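
After extraction it can help to confirm that everything landed where the scripts expect it. The following is a minimal sketch that checks the paths from the directory structure above; `find_missing` is an illustrative helper, not part of the repository, and it assumes it is run from the repository root.

```python
from pathlib import Path

# Paths taken from the directory structure listed above.
EXPECTED_PATHS = [
    "data/eclipse_14k_baseline",
    "data/ccs2d_geomodel_only/train",
    "data/ccs2d_geomodel_only/test",
    "data/ccs2d_2channel/train",
    "data/ccs2d_2channel/test",
    "pretrained/geomodel_dm_best.pt",
    "pretrained/joint_dm_best.pt",
    "pretrained/surrogate/LocalNO_best.pt",
    "pretrained/surrogate/normalization_stats.json",
]

def find_missing(repo_root="."):
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]

if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing after extraction:")
        for path in missing:
            print(f"  {path}")
    else:
        print("All expected data and checkpoint paths are present.")
```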

Dataset details

The datasets are derived from 14,000 Eclipse reservoir simulations of CO2 injection into heterogeneous geological formations.

| Dataset | Description | Channels | Size |
| --- | --- | --- | --- |
| eclipse_14k_baseline | Raw simulation outputs containing permeability fields and CO2 saturation maps as PyTorch tensors | -- | 1.4 GB |
| ccs2d_geomodel_only | Processed single-channel .npy arrays of log-permeability fields (64x64), used for training the single-channel diffusion model (FUN-DDPS) | 1 | 1.4 GB |
| ccs2d_2channel | Processed two-channel .npy arrays stacking log-permeability and CO2 saturation (2x64x64), used for training the joint diffusion model (FUN-DPS) | 2 | 2.0 GB |

Note: Rejection sampling data (rs_data/, ~287 GB) is not included due to size. See experiments/rs_benchmark/ for scripts to regenerate it.

RS posteriors: Pre-computed rejection sampling posterior samples (posterior.pt and gt.pt) are available for download from Google Drive. These contain 26,082 accepted samples (from 2M proposals) for the columns observation case, used to validate DPS/DDPS approximate posteriors in the paper.
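
For orientation, the figures above imply an acceptance rate of about 1.3% (26,082 of 2M proposals). The sketch below computes that rate and shows a plain Jensen-Shannon divergence between two histograms, the kind of quantity behind the JSD comparison. The log base, the binning, and the histogram counts are assumptions for illustration; the repository's comparison script may compute JSD differently.

```python
import math

def jensen_shannon_divergence(p, q):
    """JSD between two discrete distributions given as equal-length lists.

    Inputs are normalized to sum to 1. Log base 2 is used, so the value
    lies in [0, 1]; the base is an assumption, not taken from the paper.
    """
    p_sum, q_sum = sum(p), sum(q)
    p = [x / p_sum for x in p]
    q = [x / q_sum for x in q]
    m = [(a + b) / 2.0 for a, b in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute 0 by convention.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Acceptance rate implied by the numbers quoted above.
acceptance_rate = 26_082 / 2_000_000

# Made-up histogram counts standing in for RS and DDPS posterior marginals.
rs_hist = [5, 20, 50, 20, 5]
ddps_hist = [4, 22, 48, 21, 5]
jsd = jensen_shannon_divergence(rs_hist, ddps_hist)
print(f"acceptance rate: {acceptance_rate:.2%}, toy JSD: {jsd:.4f}")
```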

Pre-trained model details

| Model | File | Description |
| --- | --- | --- |
| Single-channel DM | geomodel_dm_best.pt | UNO-based diffusion model trained on permeability fields only. Used as the prior in FUN-DDPS. |
| Joint 2-channel DM | joint_dm_best.pt | UNO-based diffusion model trained on joint permeability + saturation fields. Used as the prior in FUN-DPS. |
| LocalNO surrogate | LocalNO_best.pt | Local neural operator mapping permeability to CO2 saturation. Used as the differentiable forward model in FUN-DDPS for posterior sampling. |
| Normalization stats | normalization_stats.json | Per-channel mean and standard deviation computed from training data, used to normalize surrogate inputs/outputs. |
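
The normalization statistics are applied as a standardization before the surrogate and inverted after it. Here is a minimal sketch of that round trip. The JSON layout (`input`/`output` keys with `mean` and `std` lists) and all numeric values are hypothetical; inspect the actual normalization_stats.json before relying on any key names.

```python
import json

# Hypothetical layout; the real normalization_stats.json may use other keys.
EXAMPLE_STATS = json.loads("""
{
  "input":  {"mean": [2.0], "std": [1.5]},
  "output": {"mean": [0.2], "std": [0.1]}
}
""")

def normalize(values, mean, std):
    """Standardize raw values with the given mean and standard deviation."""
    return [(v - mean) / std for v in values]

def denormalize(values, mean, std):
    """Invert normalize(), mapping values back to physical units."""
    return [v * std + mean for v in values]

in_mean = EXAMPLE_STATS["input"]["mean"][0]
in_std = EXAMPLE_STATS["input"]["std"][0]
out_mean = EXAMPLE_STATS["output"]["mean"][0]
out_std = EXAMPLE_STATS["output"]["std"][0]

log_perm = [0.5, 2.0, 3.5]                        # made-up log-permeability values
surrogate_input = normalize(log_perm, in_mean, in_std)

surrogate_output = [0.0, 1.0]                     # pretend normalized predictions
saturation = denormalize(surrogate_output, out_mean, out_std)
print(surrogate_input, saturation)
```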

Data preparation (from raw data)

If you want to re-process the data from the raw Eclipse outputs:

# Process 2-channel data (geomodel + dynamics)
python merge_ccs2d_data.py train --input_dir data/eclipse_14k_baseline
python merge_ccs2d_data.py test --input_dir data/eclipse_14k_baseline

# Process single-channel data (geomodel only)
python merge_ccs2d_data.py train --geomodel-only --input_dir data/eclipse_14k_baseline
python merge_ccs2d_data.py test --geomodel-only --input_dir data/eclipse_14k_baseline

Training

1. Train the surrogate neural operator

python surrogate/train_surrogate.py \
    --output_dir ./outputs/surrogate \
    --models localno \
    --task forward \
    --epochs 100 \
    --batch_size 32

2. Train the single-channel diffusion model (FUN-DDPS)

accelerate launch \
    --config_file configs/accelerate/accelerate_config_uno.yaml \
    train_acc.py \
    --config configs/training/ccs2d_geomodel_only.yml

3. Train the joint 2-channel diffusion model (FUN-DPS)

accelerate launch \
    --config_file configs/accelerate/accelerate_config_uno.yaml \
    train_acc.py \
    --config configs/training/ccs2d_uno_2channel.yml

Inference

Config-based inference

# Forward problem (FUN-DPS, joint model)
python generate_pde.py --config configs/inference/ccs2d-forward-ensemble-ulno.yaml

# Single-channel inference
python generate_pde.py --config configs/inference/ccs2d_single_dps.yaml

Experiment runner (recommended)

# FUN-DDPS forward problem
python experiments/scripts/run_experiment.py \
    --method ddps --exp forward --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# FUN-DDPS inverse problem
python experiments/scripts/run_experiment.py \
    --method ddps --exp inverse --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# FUN-DPS forward problem
python experiments/scripts/run_experiment.py \
    --method dps --exp forward --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# FUN-DPS inverse problem
python experiments/scripts/run_experiment.py \
    --method dps --exp inverse --obs full \
    --num_samples 10 --iterations 500 --batch_size 16 \
    --batched --save_samples

# Surrogate baseline (forward only)
python experiments/scripts/run_experiment.py \
    --method surrogate --exp forward --obs full \
    --num_samples 10 --save_samples
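
When several of the runs above are needed, the invocations can be generated rather than typed out. The sketch below builds the command lines for every method, experiment, and observation mode using only the flags shown in this README (the `--obs` values are those listed under Observation modes). Whether the runner supports every combination, or needs extra arguments for some modes, is not confirmed here.

```python
import itertools
import shlex

RUNNER = "experiments/scripts/run_experiment.py"
METHODS = ["ddps", "dps"]
EXPERIMENTS = ["forward", "inverse"]
OBS_MODES = ["full", "random_50", "random_25", "columns"]

def build_command(method, exp, obs):
    """Assemble one runner invocation using only flags shown in this README."""
    return [
        "python", RUNNER,
        "--method", method, "--exp", exp, "--obs", obs,
        "--num_samples", "10", "--iterations", "500", "--batch_size", "16",
        "--batched", "--save_samples",
    ]

commands = [
    build_command(m, e, o)
    for m, e, o in itertools.product(METHODS, EXPERIMENTS, OBS_MODES)
]

for cmd in commands:
    # Printing only; swap for subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```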

Observation modes

| Mode | Flag | Description |
| --- | --- | --- |
| Full | --obs full | Observe entire field |
| Random 50% | --obs random_50 | Random 50% of grid points |
| Random 25% | --obs random_25 | Random 25% of grid points |
| Column | --obs columns | Observe specific column indices |
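
Each mode amounts to a binary mask over the 64x64 grid. The sketch below shows one way such masks could be built. It is not the repository's implementation: the column indices, the seed, and the choice of sampling an exact count of points (rather than, say, independent coin flips per point) are all assumptions.

```python
import random

def make_mask(mode, height=64, width=64, columns=(8, 24, 40, 56), seed=0):
    """Return a height x width 0/1 mask for one observation mode.

    The column indices and the seed are hypothetical defaults.
    """
    if mode == "full":
        return [[1] * width for _ in range(height)]
    if mode in ("random_50", "random_25"):
        fraction = int(mode.split("_")[1]) / 100.0
        n_total = height * width
        rng = random.Random(seed)
        chosen = set(rng.sample(range(n_total), int(n_total * fraction)))
        return [[1 if r * width + c in chosen else 0 for c in range(width)]
                for r in range(height)]
    if mode == "columns":
        cols = set(columns)
        return [[1 if c in cols else 0 for c in range(width)]
                for _ in range(height)]
    raise ValueError(f"unknown observation mode: {mode}")

for mode in ("full", "random_50", "random_25", "columns"):
    mask = make_mask(mode)
    observed = sum(map(sum, mask))
    print(f"{mode:10s} observes {observed} of {64 * 64} points")
```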

Rejection Sampling Benchmark

# Run rejection sampling
python experiments/rs_benchmark/run_rs.py \
    --config experiments/rs_benchmark/configs/rs_config.yaml

# Compare RS vs DPS/DDPS
python experiments/rs_benchmark/compare_rs_dps.py \
    --rs_dir experiments/rs_benchmark/results/ \
    --dps_dir experiments/results/

See experiments/rs_benchmark/RS_BENCHMARK.md for details.

Visualization

python experiments/scripts/visualize_results.py \
    --results_dir experiments/results/ \
    --exp forward \
    --obs full \
    --save_dir results/visualization/

Project Structure

FUN-DDPS/
  train_acc.py                      # Diffusion model training
  generate_pde.py                   # Inference entry point
  merge_ccs2d_data.py               # Data preparation
  configs/
    training/                       # Training configs
    accelerate/                     # Distributed training configs
    inference/                      # Inference configs
  training/                         # Training infrastructure
  generation/
    ccs2d/                          # Joint 2-channel (FUN-DPS)
    ccs2d_single/                   # Single-channel + surrogate (FUN-DDPS)
  scripts/                          # Generation scripts
  surrogate/                        # Surrogate model training
  experiments/
    scripts/                        # Experiment runner & visualization
    configs/                        # Experiment configs
    rs_benchmark/                   # Rejection sampling benchmark
  bash_scripts/                     # Shell scripts for training/inference
  torch_utils/                      # Utility functions
  dnnlib/                           # Deep learning utilities

Citation

If you find this work useful, please cite:

@article{ju2026function,
  title={Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage},
  author={Ju, Xin and Yao, Jiachen and Anandkumar, Anima and Benson, Sally M and Wen, Gege},
  journal={arXiv preprint arXiv:2602.12274},
  year={2026}
}

License

See LICENSE for details.

About

Official PyTorch repository for "Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage"
