fengzhao1239/SURGIN

SURGIN: SURrogate-guided Generative INversion

SURGIN integrates a U-Net enhanced Fourier Neural Operator (UFNO) surrogate with a score-based generative model (SGM), framing conditional generation as a surrogate-prediction guidance process from a Bayesian perspective. Rather than learning the conditional generation of geological parameters directly, an unconditional SGM is first pretrained in a self-supervised manner to capture the geological prior. Posterior sampling is then performed with a differentiable UFNO surrogate, which provides efficient forward evaluations and allows conditioning on unseen observations.
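
The Bayesian framing can be written compactly. What follows is a sketch of the standard score decomposition used in diffusion posterior sampling (DPS, listed in the acknowledgements), and it is not necessarily the exact formulation used in the SURGIN paper. The symbols are assumptions introduced here: $\mathbf{x}_t$ is the noisy permeability field at diffusion time $t$, $\mathbf{y}$ the observations, $\hat{\mathbf{x}}_0(\mathbf{x}_t)$ the denoised estimate, $\mathcal{M}$ the forward function evaluated through the UFNO surrogate, and $\sigma_y$ an assumed Gaussian observation-noise level.

```latex
\nabla_{\mathbf{x}_t} \log p(\mathbf{x}_t \mid \mathbf{y})
  = \nabla_{\mathbf{x}_t} \log p(\mathbf{x}_t)
  + \nabla_{\mathbf{x}_t} \log p(\mathbf{y} \mid \mathbf{x}_t),
\qquad
\nabla_{\mathbf{x}_t} \log p(\mathbf{y} \mid \mathbf{x}_t)
  \approx -\frac{1}{2\sigma_y^2}\,
  \nabla_{\mathbf{x}_t}
  \bigl\lVert \mathbf{y} - \mathcal{M}\bigl(\hat{\mathbf{x}}_0(\mathbf{x}_t)\bigr) \bigr\rVert_2^2 .
```

The first term is supplied by the pretrained unconditional prior. The second requires a gradient through $\mathcal{M}$, which is why the surrogate has to be differentiable.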

[Figure: SURGIN framework]

Key capabilities:

  • Unconditional generation of realistic permeability fields via diffusion prior
  • Surrogate approximation from permeability fields to spatiotemporal pressure and saturation fields
  • Conditional generation guided by a pre-trained UFNO surrogate, solving inverse problems from pressure/saturation/permeability measurements

🚀 Quick Start: Using Pre-trained Models for Conditional Inference

This section describes how to use the pre-trained diffusion prior and UFNO surrogate to solve inverse problems from observations without retraining.

1. Environment Setup

# Create and activate conda environment
conda create -n surgin python=3.10
conda activate surgin

# Install dependencies
pip install -r requirements.txt
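
After installation, a quick sanity check can confirm that the interpreter and key packages are visible. This is a minimal sketch: the package names in `ASSUMED_PACKAGES` are guesses at what a PyTorch/HDF5 project typically needs, and `requirements.txt` remains the authoritative list.

```python
import importlib.util
import sys

# Hypothetical package list; the authoritative one is requirements.txt.
ASSUMED_PACKAGES = ["torch", "numpy", "h5py"]


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_version_ok(expected=(3, 10)):
    """True when the running interpreter matches the expected major.minor."""
    return sys.version_info[:2] == tuple(expected)


if __name__ == "__main__":
    print("Python 3.10:", python_version_ok())
    print("Missing packages:", missing_packages(ASSUMED_PACKAGES))
```

An empty "Missing packages" list means every assumed package resolved inside the `surgin` environment.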

2. Download Pre-trained Checkpoints & Datasets

Download the compressed archive and extract it in the repository root:

# After downloading surgin_pretrained_data.tar.gz to the repo root
tar -xzvf surgin_pretrained_data.tar.gz

This extracts the following files into their expected directories (create the folders first if they do not exist):

| File | Location |
| --- | --- |
| ufno_pre.pth | checkpoint/ |
| ufno_sat.pth | checkpoint/ |
| ufno_pre_vertical.pth | checkpoint/ |
| ufno_sat_vertical.pth | checkpoint/ |
| ema_0.9999_160000.pt | UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_dit/ |
| ema_0.9999_350000.pt | UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_vertical_dit/ |
| K_gstools_1w.npy | dataset/ |
| K_vertical.npy | dataset/ |
| Multi_Cartesian_Gaussian.hdf5 | dataset/ |
| Multi_Cartesian_Gaussian_vertical.hdf5 | dataset/ |

Download link: surgin_pretrained_data.tar.gz on Hugging Face
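
To confirm that the archive unpacked into the right places, the file list can be checked programmatically. This is a small sketch using only the standard library. The paths are the ones listed in this section, while the helper name `find_missing` is made up for illustration.

```python
from pathlib import Path

# Expected locations from this section, relative to the repository root.
EXPECTED_FILES = {
    "checkpoint": [
        "ufno_pre.pth",
        "ufno_sat.pth",
        "ufno_pre_vertical.pth",
        "ufno_sat_vertical.pth",
    ],
    "UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_dit": [
        "ema_0.9999_160000.pt",
    ],
    "UnconditionalDiffusionTraining_and_Generation/output/logs_gaussian_vertical_dit": [
        "ema_0.9999_350000.pt",
    ],
    "dataset": [
        "K_gstools_1w.npy",
        "K_vertical.npy",
        "Multi_Cartesian_Gaussian.hdf5",
        "Multi_Cartesian_Gaussian_vertical.hdf5",
    ],
}


def find_missing(repo_root="."):
    """Return the relative paths of expected files that are not present."""
    root = Path(repo_root)
    return [
        str(Path(folder) / name)
        for folder, names in EXPECTED_FILES.items()
        for name in names
        if not (root / folder / name).is_file()
    ]


if __name__ == "__main__":
    for path in find_missing():
        print("missing:", path)
```

Running it from the repository root prints one line per missing file and nothing when all ten files are in place.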

3. Run Conditional Inverse Modeling

Conditional generation (surrogate-guided diffusion) is performed via Jupyter notebooks in ConditionalDiffusionGeneration/inference_scripts/Case/Generation/:

| Notebook | Description |
| --- | --- |
| Cartesian_inverse_sparse.ipynb | Horizontal inverse with sparse well data |
| Vertical_inverse_3wells_injection.ipynb | Vertical case with 3-well injection observations |

  • Note: to condition on other observation types, define your own forward function $\mathcal{M}$ in ConditionalDiffusionGeneration/src/guided_diffusion/measurements.py.
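
As an illustration of what a forward function $\mathcal{M}$ does, the sketch below samples a 2D field at a few well locations and computes a data misfit. It is deliberately framework-free. The class name `SparseWellOperator`, its `forward` method, and the well indices are hypothetical, and the interface expected by measurements.py is not shown in this README and may differ. In the actual pipeline the operator would typically act on tensors and include the UFNO surrogate so that gradients can flow back to the permeability field.

```python
class SparseWellOperator:
    """Hypothetical forward function M: sample a 2D field at well locations."""

    def __init__(self, well_indices):
        # (row, col) grid indices of the observation wells; illustrative only.
        self.well_indices = list(well_indices)

    def forward(self, field):
        # Works on nested lists or any 2D array-like supporting field[i][j].
        return [field[i][j] for i, j in self.well_indices]


def data_misfit(operator, field, observations):
    """Sum of squared residuals between M(field) and the observations."""
    predicted = operator.forward(field)
    return sum((p - y) ** 2 for p, y in zip(predicted, observations))


if __name__ == "__main__":
    field = [
        [0.0, 1.0, 2.0],
        [3.0, 4.0, 5.0],
        [6.0, 7.0, 8.0],
    ]
    wells = SparseWellOperator([(0, 0), (1, 2), (2, 1)])
    print(wells.forward(field))                        # [0.0, 5.0, 7.0]
    print(data_misfit(wells, field, [0.0, 5.0, 6.0]))  # 1.0
```

The misfit is the quantity whose gradient steers sampling toward fields consistent with the observations.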

🔥 Training from Scratch

Step 1: Configure Weights & Biases (W&B)

This project uses Weights & Biases for experiment tracking. Set the following environment variables before training:

export WANDB_API_KEY="YOUR_WANDB_API_KEY"
export WANDB_PROJECT="YOUR_WANDB_PROJECT_NAME"
export WANDB_ENTITY="YOUR_WANDB_ENTITY"

Tip: You can get your API key from https://wandb.ai/authorize. Create a free account and a new project at https://wandb.ai. If the WANDB_API_KEY environment variable is not set, training will proceed without W&B logging.
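
The fallback behaviour described in the tip can be sketched as follows. The helper `wandb_settings` is hypothetical and only mirrors the documented rule (no API key, no W&B logging). How the training scripts actually read these variables is not shown here.

```python
import os


def wandb_settings(env=None):
    """Collect W&B settings from the environment; None means logging is off."""
    env = os.environ if env is None else env
    api_key = env.get("WANDB_API_KEY")
    if not api_key:
        # Mirrors the documented fallback: train without W&B logging.
        return None
    return {
        "project": env.get("WANDB_PROJECT"),
        "entity": env.get("WANDB_ENTITY"),
    }


if __name__ == "__main__":
    settings = wandb_settings()
    print("W&B logging enabled" if settings else "W&B logging disabled")
```

Running this before training is a quick way to see whether the exported variables are visible to Python in the current shell.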

Step 2: Train the UFNO Surrogate Model

The UFNO learns the forward mapping from permeability fields to pressure/saturation solutions.

python train_ufno.py

Training configuration is defined directly in the Config class within train_ufno.py. Key parameters:

| Parameter | Default | Description |
| --- | --- | --- |
| data_path | dataset/Multi_Cartesian_Gaussian_vertical.hdf5 | Training HDF5 dataset |
| batch_size | 50 | Batch size |
| num_epochs | 150 | Number of training epochs |
| learning_rate | 0.001 | Initial learning rate |
| mode1, mode2, mode3 | 10, 10, 8 | Fourier modes per dimension |
| width | 36 | Hidden channel width |

Checkpoints are saved to checkpoint/.
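
For orientation, the defaults listed above could be expressed as a dataclass like the one below. This is only a sketch: the real Config class in train_ufno.py may be structured differently and may contain more fields, and there the values are changed by editing the file directly. The `replace` call shows an illustrative override that points at the horizontal dataset from the download step.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Config:
    # Defaults as listed in this section; the real class may hold more fields.
    data_path: str = "dataset/Multi_Cartesian_Gaussian_vertical.hdf5"
    batch_size: int = 50
    num_epochs: int = 150
    learning_rate: float = 1e-3
    mode1: int = 10
    mode2: int = 10
    mode3: int = 8
    width: int = 36


# Illustrative override: same hyperparameters, horizontal dataset instead.
horizontal = replace(Config(), data_path="dataset/Multi_Cartesian_Gaussian.hdf5")

if __name__ == "__main__":
    print(Config())
    print(horizontal.data_path)
```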

Step 3: Train the Unconditional Diffusion Prior (DiT)

cd UnconditionalDiffusionTraining_and_Generation
python scripts/train.py training_recipes/Multi_Cartesian.yml

For the vertical case:

python scripts/train.py training_recipes/Multi_Cartesian_vertical.yml

Training recipe configuration (YAML):

| Parameter | Default | Description |
| --- | --- | --- |
| image_size | 64 | Spatial resolution |
| patch_size | 2 | DiT patch size |
| hidden_size | 256 | Transformer hidden dim |
| depth | 12 | Number of transformer blocks |
| num_heads | 8 | Attention heads |
| steps | 1000 | Diffusion timesteps |
| noise_schedule | cosine | Noise schedule type |
| lr | 5e-5 | Learning rate |
| ema_rate | 0.9999 | EMA decay rate |
| save_interval | 10000 | Checkpoint save frequency |

Model checkpoints (including EMA weights) are saved to the log_path directory specified in the YAML config.
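
A few quantities implied by these defaults can be worked out directly. The sketch below assumes square, non-overlapping patches on a 64 x 64 input, as in the DiT paper, and a cosine schedule in the form used by the guided-diffusion codebase (offset `s = 0.008`, betas clipped at 0.999). Whether this repository uses exactly these constants is an assumption.

```python
import math

# Defaults from the recipe parameters in this section.
image_size, patch_size = 64, 2
hidden_size, num_heads = 256, 8
steps = 1000

# Assuming square, non-overlapping patches on a 2D input.
num_tokens = (image_size // patch_size) ** 2  # 32 * 32 = 1024 tokens
head_dim = hidden_size // num_heads           # 256 / 8 = 32 per head


def cosine_betas(num_steps, s=0.008, max_beta=0.999):
    """Cosine noise schedule: betas derived from ratios of alpha_bar(t)."""

    def alpha_bar(t):
        return math.cos((t + s) / (1 + s) * math.pi / 2) ** 2

    return [
        min(1 - alpha_bar((i + 1) / num_steps) / alpha_bar(i / num_steps), max_beta)
        for i in range(num_steps)
    ]


betas = cosine_betas(steps)

if __name__ == "__main__":
    print("tokens per sample:", num_tokens)
    print("attention head dim:", head_dim)
    print("first/last beta:", betas[0], betas[-1])
```

The token count matters for memory: self-attention cost grows quadratically with it, so halving patch_size roughly multiplies attention cost by sixteen.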

Step 4: Run Unconditional Generation

Generate permeability field samples from the diffusion prior:

cd UnconditionalDiffusionTraining_and_Generation
python scripts/inference.py training_recipes/Multi_Cartesian.yml

For vertical cross-sections:

python scripts/inference.py training_recipes/Multi_Cartesian_vertical.yml

Generated samples are saved to the path specified by save_gen_path in the YAML config.


🗂️ Repository Structure

SURGIN/
├── README.md
├── train_ufno.py                          # Train UFNO surrogate model
├── basicutility/
│   └── ReadInput.py                       # YAML config loader
├── Surrogate/
│   ├── ufno.py                            # U-shaped Fourier Neural Operator (UFNO)
│   ├── lploss.py                          # Relative Lp loss function
│   └── utility.py                         # Dataset loaders & HDF5 utilities
├── UnconditionalDiffusionTraining_and_Generation/
│   ├── scripts/
│   │   ├── train.py                       # Train unconditional DiT diffusion model
│   │   └── inference.py                   # Generate unconditional samples
│   ├── training_recipes/
│   │   ├── Multi_Cartesian.yml            # Config for horizontal case
│   │   └── Multi_Cartesian_vertical.yml   # Config for vertical case
│   ├── src/                               # Diffusion model source code
│   │   ├── dit.py                         # Diffusion Transformer (DiT) architecture
│   │   ├── gaussian_diffusion.py          # DDPM forward/reverse diffusion
│   │   ├── train_util.py                  # Training loop with EMA & DDP
│   │   ├── script_util.py                 # Model/diffusion factory functions
│   │   ├── logger.py                      # Logging backends (stdout, W&B, etc.)
│   │   └── ...                            # fp16, losses, resampling, etc.
│   └── latents/
│       └── create_dataset.py              # Dataset class for diffusion training
└── ConditionalDiffusionGeneration/
    ├── inference_scripts/
    │   └── Case/Generation/               # Jupyter notebooks for conditional inference
    │       ├── Cartesian_inverse_sparse.ipynb
    │       └── Vertical_inverse_3wells_injection.ipynb
    └── src/guided_diffusion/
        ├── gaussian_diffusion.py          # Guided diffusion with conditioning hooks
        ├── condition_methods.py           # Conditioning strategies
        ├── measurements.py                # Measurement operators & UFNO integration
        ├── dit.py                         # DiT for conditional generation
        ├── posterior_mean_variance.py     # Posterior computation utilities
        └── ...

📄 Citation

If you find this work useful, please cite:

@article{feng2025surgin,
  title={SURGIN: SURrogate-guided Generative INversion for subsurface multiphase flow with quantified uncertainty},
  author={Feng, Zhao and Yan, Bicheng and Zhao, Luanxiao and Shen, Xianda and Zhao, Renyu and Wang, Wenhao and Zhang, Fengshou},
  journal={arXiv preprint arXiv:2509.13189},
  year={2025}
}

💡 Acknowledgement

  • UFNO: U-Net enhanced Fourier Neural Operator
  • DiT: Scalable Diffusion Models with Transformers
  • Guided Diffusion: OpenAI's guided diffusion codebase
  • DPS: Diffusion Posterior Sampling
