SURGIN integrates a U-Net enhanced Fourier Neural Operator (UFNO) surrogate with a score-based generative model (SGM), framing conditional generation as a surrogate prediction-guidance process from a Bayesian perspective. Instead of directly learning the conditional generation of geological parameters, an unconditional SGM is first pretrained in a self-supervised manner to capture the geological prior. Posterior sampling is then performed with a differentiable UFNO surrogate, which enables efficient forward evaluations conditioned on unseen observations.
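The Bayesian framing can be written out explicitly. The following is a sketch in generic notation, not copied from the paper: $\mathbf{m}$ is the permeability field, $\mathbf{d}$ the observations, $\mathcal{M}$ the forward function (here the UFNO surrogate, possibly composed with an observation mask), $\mathbf{m}_t$ the noisy sample at diffusion time $t$, and $\hat{\mathbf{m}}_0(\mathbf{m}_t)$ the denoised estimate. The last line is the likelihood approximation used by Diffusion Posterior Sampling (DPS) under an assumed Gaussian noise level $\sigma$; that SURGIN uses exactly this form is an assumption.

```latex
\begin{align}
p(\mathbf{m} \mid \mathbf{d}) &\propto p(\mathbf{d} \mid \mathbf{m})\, p(\mathbf{m}), \\
\nabla_{\mathbf{m}_t} \log p_t(\mathbf{m}_t \mid \mathbf{d})
  &= \nabla_{\mathbf{m}_t} \log p_t(\mathbf{m}_t)
   + \nabla_{\mathbf{m}_t} \log p_t(\mathbf{d} \mid \mathbf{m}_t), \\
\nabla_{\mathbf{m}_t} \log p_t(\mathbf{d} \mid \mathbf{m}_t)
  &\approx -\frac{1}{2\sigma^2}\, \nabla_{\mathbf{m}_t}
     \bigl\| \mathbf{d} - \mathcal{M}\bigl(\hat{\mathbf{m}}_0(\mathbf{m}_t)\bigr) \bigr\|_2^2 .
\end{align}
```

The prior score in the second line comes from the pretrained unconditional SGM. The likelihood term requires gradients through $\mathcal{M}$, which is why the surrogate has to be differentiable.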
Key capabilities:
- Unconditional generation of realistic permeability fields via diffusion prior
- Surrogate approximation from permeability fields to spatiotemporal pressure and saturation fields
- Conditional generation guided by a pre-trained UFNO surrogate, solving inverse problems from pressure/saturation/permeability measurements
This section describes how to use the pre-trained diffusion prior and UFNO surrogate to solve inverse problems from observations without retraining.
# Create and activate conda environment
conda create -n surgin python=3.10
conda activate surgin
# Install dependencies
pip install -r requirements.txt

Download the compressed archive and extract it in the repository root:
# After downloading surgin_pretrained_data.tar.gz to the repo root
tar -xzvf surgin_pretrained_data.tar.gz

This extracts the following files into their expected directories (create the folders first if they do not exist):
| File | Location |
|---|---|
| `ufno_pre.pth` | `checkpoint/` |
| `ufno_sat.pth` | `checkpoint/` |
| `ufno_pre_vertical.pth` | `checkpoint/` |
| `ufno_sat_vertical.pth` | `checkpoint/` |
| `ema_0.9999_160000.pt` | `UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_dit/` |
| `ema_0.9999_350000.pt` | `UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_vertical_dit/` |
| `K_gstools_1w.npy` | `dataset/` |
| `K_vertical.npy` | `dataset/` |
| `Multi_Cartesian_Gaussian.hdf5` | `dataset/` |
| `Multi_Cartesian_Gaussian_vertical.hdf5` | `dataset/` |
Download links: surgin_pretrained_data.tar.gz on Hugging Face
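After extraction, a quick check that every file landed where the table says can save a failed notebook run later. The helper below is a hypothetical convenience script, not part of the repository; the file names and folders are copied from the table above.

```python
from pathlib import Path

# Expected locations, copied from the table above (relative to the repo root).
EXPECTED_FILES = {
    "ufno_pre.pth": "checkpoint",
    "ufno_sat.pth": "checkpoint",
    "ufno_pre_vertical.pth": "checkpoint",
    "ufno_sat_vertical.pth": "checkpoint",
    "ema_0.9999_160000.pt": "UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_dit",
    "ema_0.9999_350000.pt": "UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_vertical_dit",
    "K_gstools_1w.npy": "dataset",
    "K_vertical.npy": "dataset",
    "Multi_Cartesian_Gaussian.hdf5": "dataset",
    "Multi_Cartesian_Gaussian_vertical.hdf5": "dataset",
}


def missing_files(repo_root="."):
    """Return the expected files that are absent under repo_root, as relative paths."""
    root = Path(repo_root)
    return [
        f"{folder}/{name}"
        for name, folder in EXPECTED_FILES.items()
        if not (root / folder / name).is_file()
    ]


if __name__ == "__main__":
    missing = missing_files()
    print("All files found." if not missing else "Missing:\n" + "\n".join(missing))
```

Run it from the repository root; an empty result means all ten files are in place.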
Conditional generation (surrogate-guided diffusion) is performed via Jupyter notebooks in ConditionalDiffusionGeneration/inference_scripts/Case/Generation/:
| Notebook | Description |
|---|---|
| `Cartesian_inverse_sparse.ipynb` | Horizontal inverse with sparse well data |
| `Vertical_inverse_3wells_injection.ipynb` | Vertical case with 3-well injection observations |
- To condition on other kinds of observations, define your own forward function $\mathcal{M}$ in `ConditionalDiffusionGeneration/src/guided_diffusion/measurements.py`.
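As an illustration of what such a forward function might look like, here is a minimal NumPy sketch of a sparse-well observation operator: it samples a 2D field at a few grid cells, and its least-squares misfit has a closed-form gradient. The class name `SparseWellOperator`, its methods, and the well coordinates are hypothetical and do not reflect the interface in `measurements.py`; the repository's operators presumably act on PyTorch tensors and differentiate through the UFNO with autograd.

```python
import numpy as np


class SparseWellOperator:
    """Hypothetical linear forward function M: full 2D field -> values at well cells."""

    def __init__(self, shape, wells):
        self.shape = shape
        self.rows = np.array([r for r, _ in wells])
        self.cols = np.array([c for _, c in wells])

    def forward(self, field):
        # M(x): pick the field values at the well locations.
        return field[self.rows, self.cols]

    def misfit_grad(self, field, observed):
        # Gradient of 0.5 * ||observed - M(field)||^2 with respect to the field.
        # For a sampling operator this is the residual scattered back to the wells.
        grad = np.zeros(self.shape)
        residual = self.forward(field) - observed
        np.add.at(grad, (self.rows, self.cols), residual)
        return grad


# Usage with made-up well positions on a 64 x 64 grid.
op = SparseWellOperator((64, 64), wells=[(10, 12), (32, 32), (50, 7)])
field = np.ones((64, 64))
observed = np.array([1.0, 2.0, 0.5])
grad = op.misfit_grad(field, observed)  # nonzero only at mismatched wells
```

A nonlinear $\mathcal{M}$, such as the surrogate followed by this sampling step, has no closed-form gradient, so automatic differentiation would take the place of `misfit_grad`.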
This project uses Weights & Biases for experiment tracking. Set the following environment variables before training:
export WANDB_API_KEY="YOUR_WANDB_API_KEY"
export WANDB_PROJECT="YOUR_WANDB_PROJECT_NAME"
export WANDB_ENTITY="YOUR_WANDB_ENTITY"

Tip: You can get your API key from https://wandb.ai/authorize. Create a free account and a new project at https://wandb.ai. If the `WANDB_API_KEY` environment variable is not set, training will proceed without W&B logging.
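To confirm what a training run will see before launching it, the variables can be inspected from Python. The helper `wandb_status` below is hypothetical and not part of the repository; it only mirrors the behavior described above, where the presence of `WANDB_API_KEY` decides whether logging happens.

```python
import os

WANDB_VARS = ("WANDB_API_KEY", "WANDB_PROJECT", "WANDB_ENTITY")


def wandb_status(env=None):
    """Report which W&B variables are set and whether logging would be enabled."""
    env = os.environ if env is None else env
    present = {name: bool(env.get(name)) for name in WANDB_VARS}
    # Per the note above, only WANDB_API_KEY decides whether logging is on.
    return {"logging_enabled": present["WANDB_API_KEY"], "present": present}


if __name__ == "__main__":
    print(wandb_status())
```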
The UFNO learns the forward mapping from permeability fields to pressure/saturation solutions.
python train_ufno.py

Training configuration is defined directly in the `Config` class within `train_ufno.py`. Key parameters:
| Parameter | Default | Description |
|---|---|---|
| `data_path` | `dataset/Multi_Cartesian_Gaussian_vertical.hdf5` | Training HDF5 dataset |
| `batch_size` | 50 | Batch size |
| `num_epochs` | 150 | Number of training epochs |
| `learning_rate` | 0.001 | Initial learning rate |
| `mode1`, `mode2`, `mode3` | 10, 10, 8 | Fourier modes per dimension |
| `width` | 36 | Hidden channel width |
Checkpoints are saved to checkpoint/.
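For orientation, the documented defaults can be pictured as a small dataclass. This is only a sketch that mirrors the table: the real `Config` class in `train_ufno.py` may be structured differently and likely holds more fields, and assigning 10, 10, 8 to `mode1`, `mode2`, `mode3` in that order is an assumption based on the table's ordering.

```python
from dataclasses import dataclass, replace


@dataclass
class Config:
    """Illustrative mirror of the documented defaults, not the repository's class."""
    data_path: str = "dataset/Multi_Cartesian_Gaussian_vertical.hdf5"
    batch_size: int = 50
    num_epochs: int = 150
    learning_rate: float = 0.001
    mode1: int = 10  # Fourier modes, first dimension (assumed order)
    mode2: int = 10  # Fourier modes, second dimension (assumed order)
    mode3: int = 8   # Fourier modes, third dimension (assumed order)
    width: int = 36  # hidden channel width


# Point at the horizontal dataset from the file table; other defaults unchanged.
horizontal = replace(Config(), data_path="dataset/Multi_Cartesian_Gaussian.hdf5")
```

Note that the default `data_path` targets the vertical dataset, so training the horizontal surrogate requires changing it.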
Train the unconditional DiT diffusion model. For the horizontal case:

cd UnconditionalDiffusionTraining_and_Generation
python scripts/train.py training_recipes/Multi_Cartesian.yml

For the vertical case:

python scripts/train.py training_recipes/Multi_Cartesian_vertical.yml

Training recipe configuration (YAML):
| Parameter | Default | Description |
|---|---|---|
| `image_size` | 64 | Spatial resolution |
| `patch_size` | 2 | DiT patch size |
| `hidden_size` | 256 | Transformer hidden dimension |
| `depth` | 12 | Number of transformer blocks |
| `num_heads` | 8 | Attention heads |
| `steps` | 1000 | Diffusion timesteps |
| `noise_schedule` | cosine | Noise schedule type |
| `lr` | 5e-5 | Learning rate |
| `ema_rate` | 0.9999 | EMA decay rate |
| `save_interval` | 10000 | Checkpoint save frequency |
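The transformer shapes implied by these defaults can be worked out directly. The sketch below assumes square 2D inputs of `image_size` × `image_size` and the usual DiT patchify step with non-overlapping `patch_size` × `patch_size` patches; the function name `dit_shapes` is made up for illustration.

```python
def dit_shapes(image_size=64, patch_size=2, hidden_size=256, num_heads=8):
    """Return (tokens per sample, channels per attention head) for a square 2D DiT."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if hidden_size % num_heads != 0:
        raise ValueError("hidden_size must be divisible by num_heads")
    patches_per_side = image_size // patch_size
    num_tokens = patches_per_side ** 2
    head_dim = hidden_size // num_heads
    return num_tokens, head_dim


print(dit_shapes())  # (1024, 32)
```

With the defaults, each sample becomes a sequence of 1024 tokens and each of the 8 heads attends over 32 channels. Doubling `patch_size` to 4 would cut the sequence length to 256.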
Model checkpoints (including EMA weights) are saved to the log_path directory specified in the YAML config.
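The EMA weights are a running average of the trained parameters, controlled by `ema_rate`. A minimal sketch of the standard update rule follows, using plain Python floats in place of tensors; the training loop in `train_util.py` presumably applies the same rule per parameter tensor, which is an assumption here.

```python
def update_ema(ema_params, params, rate=0.9999):
    """In-place EMA update: ema <- rate * ema + (1 - rate) * param."""
    for i, p in enumerate(params):
        ema_params[i] = rate * ema_params[i] + (1.0 - rate) * p
    return ema_params


# Toy example: the average drifts slowly toward the current parameter value.
ema = [0.0]
for _ in range(3):
    update_ema(ema, [1.0], rate=0.9)
```

With a rate of 0.9999 the average has an effective horizon of roughly 1 / (1 − 0.9999) = 10,000 steps, so the EMA checkpoint changes much more smoothly than the raw weights.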
Generate permeability field samples from the diffusion prior:
cd UnconditionalDiffusionTraining_and_Generation
python scripts/inference.py training_recipes/Multi_Cartesian.yml

For vertical cross-sections:

python scripts/inference.py training_recipes/Multi_Cartesian_vertical.yml

Generated samples are saved to the path specified by `save_gen_path` in the YAML config.
SURGIN/
├── README.md
├── train_ufno.py                         # Train UFNO surrogate model
├── basicutility/
│   └── ReadInput.py                      # YAML config loader
├── Surrogate/
│   ├── ufno.py                           # U-shaped Fourier Neural Operator (UFNO)
│   ├── lploss.py                         # Relative Lp loss function
│   └── utility.py                        # Dataset loaders & HDF5 utilities
├── UnconditionalDiffusionTraining_and_Generation/
│   ├── scripts/
│   │   ├── train.py                      # Train unconditional DiT diffusion model
│   │   └── inference.py                  # Generate unconditional samples
│   ├── training_recipes/
│   │   ├── Multi_Cartesian.yml           # Config for horizontal case
│   │   └── Multi_Cartesian_vertical.yml  # Config for vertical case
│   ├── src/                              # Diffusion model source code
│   │   ├── dit.py                        # Diffusion Transformer (DiT) architecture
│   │   ├── gaussian_diffusion.py         # DDPM forward/reverse diffusion
│   │   ├── train_util.py                 # Training loop with EMA & DDP
│   │   ├── script_util.py                # Model/diffusion factory functions
│   │   ├── logger.py                     # Logging backends (stdout, W&B, etc.)
│   │   └── ...                           # fp16, losses, resampling, etc.
│   └── latents/
│       └── create_dataset.py             # Dataset class for diffusion training
└── ConditionalDiffusionGeneration/
    ├── inference_scripts/
    │   └── Case/Generation/              # Jupyter notebooks for conditional inference
    │       ├── Cartesian_inverse_sparse.ipynb
    │       └── Vertical_inverse_3wells_injection.ipynb
    └── src/guided_diffusion/
        ├── gaussian_diffusion.py         # Guided diffusion with conditioning hooks
        ├── condition_methods.py          # Conditioning strategies
        ├── measurements.py               # Measurement operators & UFNO integration
        ├── dit.py                        # DiT for conditional generation
        ├── posterior_mean_variance.py    # Posterior computation utilities
        └── ...
If you find this work useful, please cite:
@article{feng2025surgin,
title={SURGIN: SURrogate-guided Generative INversion for subsurface multiphase flow with quantified uncertainty},
author={Feng, Zhao and Yan, Bicheng and Zhao, Luanxiao and Shen, Xianda and Zhao, Renyu and Wang, Wenhao and Zhang, Fengshou},
journal={arXiv preprint arXiv:2509.13189},
year={2025}
}

Related projects:

- UFNO: U-Net enhanced Fourier Neural Operator
- DiT: Scalable Diffusion Models with Transformers
- Guided Diffusion: OpenAI's guided diffusion codebase
- DPS: Diffusion Posterior Sampling
