News • Links • Getting Started • Evaluation • Citation • Acknowledgement
## News

- [2025/05/20] 🎉 We released SkyRL-SQL: a multi-turn RL training pipeline for Text-to-SQL, along with SkyRL-SQL-7B — a model trained on just 653 samples that outperforms both GPT-4o and o4-mini!
- [2025/05/06] 🎉 We released SkyRL-v0: our open RL training pipeline for multi-turn tool-use LLMs, optimized for long-horizon, real-environment tasks like SWE-Bench!
This repository contains the training code for the SkyRL-v0 release. Our implementation is a fork of VeRL.

The repo currently uses the SGLang async rollout feature introduced to VeRL in this draft PR, based on this commit. We will refactor the code soon so that the codebase can easily track the VeRL main branch.
## Getting Started

The first step is to clone our repository:

```bash
git clone --recurse-submodules https://github.com/NovaSky-AI/SkyRL
```

For detailed installation instructions, please refer to INSTALL.md.
To reproduce our results for SkyRL-Agent-14B-v0, SkyRL-Agent-8B-v0, and SkyRL-Agent-7B-v0, refer to `examples/sky/swebench`.

To reproduce our results for SkyRL-SQL-7B, refer to `examples/sky/sql`.
## Evaluation

We report evaluation results on different downstream tasks below, starting with SWE-Bench-Verified.
| Model | Base Model | Base Performance | Trained Performance | Training Time |
|---|---|---|---|---|
| SkyRL-Agent-7B-v0 | OpenHands-7B-Agent | 11% | 14.6% | 16 hrs on 8× H100 |
| SkyRL-Agent-8B-v0 | Qwen3-8B (thinking disabled) | 3.6% | 9.4% | 27 hrs on 8× H200 |
| SkyRL-Agent-14B-v0 | Qwen3-14B (thinking enabled) | 18% | 21.6% | 20 hrs on 8× H200 |
We report evaluation results on a range of Spider benchmarks (evaluated in 5 turns) below.
| Model | Spider-Dev | Spider-Test | Spider-Realistic | Spider-DK | Spider-Syn | Avg |
|---|---|---|---|---|---|---|
| Qwen-2.5-Coder-7B-Instruct | 77.1 | 79.6 | 74.2 | 62.8 | 66.2 | 72.0 |
| o4-mini | 80.6 | 81.8 | 81.2 | 70.8 | 72.1 | 77.3 |
| GPT-4o | 81.3 | 82.4 | 80.1 | 72.1 | 71.9 | 77.6 |
| SkyRL-SQL-7B | 83.9 (+6.8%) | 85.2 (+5.6%) | 81.1 (+6.9%) | 72.0 (+9.2%) | 73.7 (+7.5%) | 79.2 (+7.2%) |
## Acknowledgement

This work was done at the Berkeley Sky Computing Lab, with amazing compute support from Anyscale, Databricks, NVIDIA, Lambda Labs, and AMD.

Huge thanks to the contributors of the SGLang async rollout feature in VeRL: Hancheng Zhang, Rui Lu, and Haoran Wang from Tsinghua University, and Xiang Long from OpenBMB/ModelBest.

We would also like to thank Ying Sheng and Chenyang Zhao from the SGLang team for supporting the SGLang async rollout integration, and Kaichao You and Simon Mo from the vLLM team for supporting vLLM performance optimization.
## Citation

The code in this repository is mostly described in the posts below. Please consider citing this work if you find the repository helpful.
```bibtex
@misc{cao2025skyrl,
  title = {SkyRL-v0: Train Real-World Long-Horizon Agents via Reinforcement Learning},
  author = {Shiyi Cao and Sumanth Hegde and Dacheng Li and Tyler Griggs and Shu Liu and Eric Tang and Jiayi Pan and Xingyao Wang and Akshay Malik and Graham Neubig and Kourosh Hakhamaneshi and Richard Liaw and Philipp Moritz and Matei Zaharia and Joseph E. Gonzalez and Ion Stoica},
  year = {2025},
}

@misc{liu2025skyrlsql,
  title = {SkyRL-SQL: Matching GPT-4o and o4-mini on Text2SQL with Multi-Turn RL},
  author = {Shu Liu and Sumanth Hegde and Shiyi Cao and Alan Zhu and Dacheng Li and Tyler Griggs and Eric Tang and Akshay Malik and Kourosh Hakhamaneshi and Richard Liaw and Philipp Moritz and Matei Zaharia and Joseph E. Gonzalez and Ion Stoica},
  year = {2025},
}
```