Peiyan Hu*†1,3, Haodong Feng*1, Hongyuan Liu*1, Tongtong Yan2, Wenhao Deng1, Tianrun Gao†1,4, Rong Zheng†1,5, Haoren Zheng†1,2, Chenglei Yu1, Chuanrui Wang1, Kaiwen Li†1,2, Zhi-Ming Ma3, Dezhi Zhou2, Xingcai Lu6, Dixia Fan1, Tailin Wu†1.
1School of Engineering, Westlake University;
2Global College, Shanghai Jiao Tong University;
3Academy of Mathematics and Systems Science, Chinese Academy of Sciences;
4Department of Geotechnical Engineering, Tongji University;
5School of Physics, Peking University;
6Key Laboratory for Power Machinery and Engineering of M. O. E., Shanghai Jiao Tong University
*Equal contribution, †Work done as an intern at Westlake University, †Corresponding authors
RealPDEBench is the first scientific ML benchmark with paired real-world measurements and matched numerical simulations for complex physical systems, designed for spatiotemporal forecasting and sim-to-real transfer.
At a glance 👀
- 5 Datasets: cylinder, fsi, controlled_cylinder, foil, combustion
- 700+ Trajectories
- 10 Baseline models: U-Net, FNO, CNO, WDNO, DeepONet, MWT, GK-Transformer, Transolver, DPOT, DMD
- 9 Evaluation metrics: RMSE, MAE, Rel L₂, R², Update Ratio, fRMSE, FE, KE, MVPE
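As a rough illustration of the four standard metrics in the list above, the sketch below computes RMSE, MAE, relative L₂ error, and R² with NumPy. It uses textbook definitions and assumes predictions and targets are arrays of the same shape; the function name `basic_metrics` is hypothetical, and the benchmark's own implementation (reduction axes, normalization) may differ. The remaining metrics (Update Ratio, fRMSE, FE, KE, MVPE) are benchmark-specific and are not reproduced here.

```python
import numpy as np


def basic_metrics(pred, target):
    """Textbook RMSE, MAE, relative L2 error, and R^2 over all elements.

    Hypothetical helper for illustration only; not the RealPDEBench code.
    """
    pred = np.asarray(pred, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    err = pred - target
    rmse = float(np.sqrt(np.mean(err ** 2)))
    mae = float(np.mean(np.abs(err)))
    # Relative L2: ||pred - target||_2 / ||target||_2 on the flattened arrays.
    rel_l2 = float(np.linalg.norm(err.ravel()) / np.linalg.norm(target.ravel()))
    # R^2: 1 - residual sum of squares / total sum of squares.
    ss_res = float(np.sum(err ** 2))
    ss_tot = float(np.sum((target - target.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "rel_l2": rel_l2, "r2": r2}
```

Here the metrics are reduced over every element at once; a per-trajectory or per-time-step reduction would need an explicit `axis` argument.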
This repo is packaged with pyproject.toml and can be installed via pip (requires Python ≥ 3.10):
git clone https://github.com/AI4Science-WestlakeU/RealPDEBench.git
cd RealPDEBench
pip install -e .
The repo id is AI4Science-WestlakeU/RealPDEBench.
We provide a small pattern-based downloader:
# safe default: download metadata JSONs only
realpdebench download --dataset-root /path/to/data --scenario cylinder --what metadata
# to download Arrow shards (LARGE), explicitly set --what=hf_dataset or --what=all
# splits are stored in index JSONs under hf_dataset/ (no split directories)
realpdebench download --dataset-root /path/to/data --scenario cylinder --what hf_dataset --dataset-type real
Tips:
- Set --endpoint https://hf-mirror.com (or env HF_ENDPOINT) to use a mirror if the default endpoint is not accessible.
- If you hit rate limits (HTTP 429) or need auth, log in and set env HF_TOKEN=....
- We recommend setting env HF_HUB_DISABLE_XET=1.
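For scripted downloads, the command line and environment described above can be assembled in Python. This is a minimal sketch that only uses the flags and environment variables shown in this section; the helper names (`build_download_command`, `build_download_env`, `run_download`) are hypothetical and not part of the package.

```python
import os
import subprocess


def build_download_command(dataset_root, scenario, what="metadata",
                           dataset_type=None, endpoint=None):
    """Build the `realpdebench download` argument list (hypothetical helper)."""
    cmd = ["realpdebench", "download",
           "--dataset-root", str(dataset_root),
           "--scenario", scenario,
           "--what", what]
    if dataset_type is not None:
        cmd += ["--dataset-type", dataset_type]
    if endpoint is not None:
        cmd += ["--endpoint", endpoint]
    return cmd


def build_download_env(token=None, disable_xet=True, base_env=None):
    """Copy the environment and apply the settings from the tips above."""
    env = dict(os.environ if base_env is None else base_env)
    if token:
        env["HF_TOKEN"] = token
    if disable_xet:
        env["HF_HUB_DISABLE_XET"] = "1"
    return env


def run_download(cmd, env):
    """Run the CLI; requires the package to be installed (see pip install above)."""
    return subprocess.run(cmd, env=env, check=True)
```

Calling `run_download(build_download_command("/path/to/data", "cylinder"), build_download_env())` would then correspond to the metadata-only default shown earlier.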
Coming soon!
# Simulated training (train on numerical data)
python -m realpdebench.train --config configs/cylinder/fno.yaml --train_data_type numerical
# Real-world training (train on real data from scratch)
python -m realpdebench.train --config configs/cylinder/fno.yaml --train_data_type real
# Real-world finetuning (finetune on real data)
python -m realpdebench.train --config configs/cylinder/fno.yaml --train_data_type real --is_finetune
HF Arrow datasets are stored under {dataset_root}/{scenario}/hf_dataset/{real,numerical}/ with split index files {split}_index_{type}.json. To use them, enable:
- --use_hf_dataset: load Arrow trajectories + index files (lazy slicing, dynamic N_autoregressive)
- --hf_auto_download: download missing artifacts from HF automatically (use --hf_endpoint https://hf-mirror.com for easier access)
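The directory layout described above can be sketched with `pathlib`. This is only an illustration of the path pattern: the helper names are hypothetical, the split name `train` and the reading of `{type}` as `real`/`numerical` are assumptions, and because the exact folder holding the index JSONs under hf_dataset/ is not spelled out here, the lookup searches recursively rather than guessing.

```python
from pathlib import Path

DATA_TYPES = ("real", "numerical")


def hf_dataset_dir(dataset_root, scenario, data_type):
    """Return {dataset_root}/{scenario}/hf_dataset/{data_type} (hypothetical helper)."""
    if data_type not in DATA_TYPES:
        raise ValueError(f"data_type must be one of {DATA_TYPES}")
    return Path(dataset_root) / scenario / "hf_dataset" / data_type


def split_index_name(split, data_type):
    """Return the index file name {split}_index_{type}.json.

    Assumes {type} is 'real' or 'numerical'; split names are not listed above.
    """
    return f"{split}_index_{data_type}.json"


def find_split_index(dataset_root, scenario, split, data_type):
    """Search under hf_dataset/ for the index file; returns a sorted list of matches."""
    base = Path(dataset_root) / scenario / "hf_dataset"
    if not base.is_dir():
        return []
    return sorted(base.rglob(split_index_name(split, data_type)))
```

For example, `hf_dataset_dir("/path/to/data", "cylinder", "real")` gives `/path/to/data/cylinder/hf_dataset/real`, which is where the Arrow shards for the real-world cylinder data would be expected.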
Example:
python -m realpdebench.train --config configs/cylinder/fno.yaml --use_hf_dataset --hf_auto_download --hf_endpoint https://hf-mirror.com
# Evaluate a trained checkpoint
python -m realpdebench.eval --config configs/cylinder/fno.yaml --checkpoint_path /path/to/checkpoint.pth
# Evaluate a trained checkpoint using the HF Arrow datasets
python -m realpdebench.eval --config configs/cylinder/fno.yaml --checkpoint_path /path/to/checkpoint.pth --use_hf_dataset
Coming soon!
We welcome contributions from the community! Please feel free to:
- Add your models
- Contact us to submit to the leaderboard
- Contribute code improvements
- Improve documentation
If you find our work and/or our code useful, please cite us via:
@misc{hu2026realpdebenchbenchmarkcomplexphysical,
title={RealPDEBench: A Benchmark for Complex Physical Systems with Real-World Data},
author={Peiyan Hu and Haodong Feng and Hongyuan Liu and Tongtong Yan and Wenhao Deng and Tianrun Gao and Rong Zheng and Haoren Zheng and Chenglei Yu and Chuanrui Wang and Kaiwen Li and Zhi-Ming Ma and Dezhi Zhou and Xingcai Lu and Dixia Fan and Tailin Wu},
year={2026},
eprint={2601.01829},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2601.01829},
}
- AI for Scientific Simulation and Discovery Lab: https://github.com/AI4Science-WestlakeU
- REALM: https://github.com/deepflame-ai/REALM/tree/main
- PDEBench: https://github.com/pdebench/PDEBench

