Subsurface multiphase flow plays a fundamental role in geo-energy applications such as geological carbon sequestration. Recent advances in deep learning have shown the potential of deep neural networks as surrogate models for characterizing the spatiotemporal dynamics of multiphase flow. However, existing studies have yet to fully explore state-of-the-art deep learning architectures. In addition, the challenge of handling multiple data modalities, which frequently arise in practical flow problems, remains largely unaddressed. To this end, we propose the conditional Swin Transformer network (CoSwinNet), a novel multimodal surrogate model tailored to simulating realistic subsurface multiphase flow scenarios. CoSwinNet combines a multiscale vision Transformer (Swin Transformer) with Feature-wise Linear Modulation (FiLM), enabling fine-grained representation learning and lightweight condition embedding. Two representative numerical experiments are conducted to evaluate the performance of CoSwinNet: a 2D Cartesian system for CO2 injection with brine production, and a 3D radial system for CO2 injection followed by a post-injection period. The results demonstrate the advantages of the proposed method in terms of accuracy, computational overhead, training cost, and inference cost. These findings underscore CoSwinNet’s potential as a versatile and computationally efficient tool for real-time multiphase flow optimization and uncertainty quantification.
Figure 1. Overall architecture of the proposed model.
Figure 2. Detailed design of the shifted window attention.
Figure 3. Multimodal feature fusion using FiLM conditioning strategy.
Figure 4. Representative prediction results of the proposed method.
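The FiLM conditioning named in the abstract and in Figure 3 can be illustrated with a minimal sketch. FiLM scales and shifts each feature channel using parameters predicted from a conditioning vector. The code below uses NumPy rather than the repository's framework, and the function name `film`, the array shapes, and the linear projection weights are all assumptions made for illustration; they are not taken from `Swin.py`.

```python
import numpy as np


def film(features, condition, w_gamma, b_gamma, w_beta, b_beta):
    """Feature-wise linear modulation: out = gamma(c) * features + beta(c).

    features:  (batch, tokens, channels) feature tokens
    condition: (batch, cond_dim) conditioning vector (hypothetical, e.g. scalar controls)
    w_*, b_*:  parameters of two linear maps from cond_dim to channels
    """
    gamma = condition @ w_gamma + b_gamma  # per-channel scale, (batch, channels)
    beta = condition @ w_beta + b_beta     # per-channel shift, (batch, channels)
    # Broadcast over the token axis so every token in a sample shares gamma and beta.
    return gamma[:, None, :] * features + beta[:, None, :]


rng = np.random.default_rng(0)
batch, tokens, channels, cond_dim = 2, 16, 8, 3
features = rng.standard_normal((batch, tokens, channels))
condition = rng.standard_normal((batch, cond_dim))

# Assumed initialisation: gamma = 1 and beta = 0, so the layer starts as an identity map.
w_gamma = np.zeros((cond_dim, channels))
b_gamma = np.ones(channels)
w_beta = np.zeros((cond_dim, channels))
b_beta = np.zeros(channels)

out = film(features, condition, w_gamma, b_gamma, w_beta, b_beta)
```

Because only two small linear maps are added per modulated layer, this style of conditioning adds few parameters, which is consistent with the "lightweight condition embedding" described above.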
- Swin.py: Swin Transformer layer implementation with FiLM conditioning
- ViT: Swin Transformer and Vision Transformer blocks
- Whole.py: neural network assembly (i.e., UNet with different blocks)
- dataset.py: the dataset reader
- train.py: training script
- configuration.py: hyperparameters
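
As a rough illustration of the window mechanics behind a Swin Transformer layer, the sketch below partitions a 2D feature map into non-overlapping windows and then applies the cyclic shift used by shifted-window attention. It is written in NumPy for brevity, the function names are hypothetical, and it omits the attention computation and the attention mask that a full Swin block applies to shifted windows; it is not the code in `Swin.py`.

```python
import numpy as np


def window_partition(x, window_size):
    """Split an (H, W, C) map into (num_windows, window_size, window_size, C)."""
    h, w, c = x.shape
    x = x.reshape(h // window_size, window_size, w // window_size, window_size, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, window_size, window_size, c)


def window_reverse(windows, window_size, h, w):
    """Inverse of window_partition: reassemble windows into an (H, W, C) map."""
    c = windows.shape[-1]
    x = windows.reshape(h // window_size, w // window_size, window_size, window_size, c)
    return x.transpose(0, 2, 1, 3, 4).reshape(h, w, c)


def shifted_windows(x, window_size):
    """Cyclically shift the map by half a window, then partition it."""
    shift = window_size // 2
    rolled = np.roll(x, shift=(-shift, -shift), axis=(0, 1))
    return window_partition(rolled, window_size)


# An 8x8 single-channel grid whose values encode their flat index.
grid = np.arange(8 * 8, dtype=float).reshape(8, 8, 1)
regular = window_partition(grid, 4)        # four 4x4 windows
shifted = shifted_windows(grid, 4)         # windows straddling the old boundaries
restored = window_reverse(regular, 4, 8, 8)
```

Alternating regular and shifted partitions between consecutive layers lets information flow across window boundaries while attention itself stays local to each window.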
@article{feng2026coswinnet,
title={CoSwinNet: A conditional Swin Transformer multimodal surrogate model for subsurface multiphase flow},
author={Feng, Zhao and Tariq, Zeeshan and Zhang, Zhong and Zhao, Peilin and Zhao, Renyu and Wang, Wenhao and Huang, Xinwo and Yan, Bicheng and Shen, Xianda and Zhang, Fengshou},
journal={Fuel},
volume={411},
pages={138067},
year={2026},
publisher={Elsevier}
}
The Swin Transformer block was modified from https://github.com/HuCaoFighting/Swin-Unet.