This repository provides installation and usage scripts for TimeOmni-1.
🙋 Please let us know if you find a mistake or have any suggestions!
🌟 If you find this resource helpful, please consider starring this repository and citing our research:
🚩 News (Apr. 2026): We have released new post-trained versions based on Qwen3.5 on Hugging Face: TimeOmni-1-9B and TimeOmni-1-4B. These new versions further scale up model performance (inference code coming soon).
🚩 News (Feb. 2026): Please find the open-source model on Hugging Face: TimeOmni-1-7B; see also our online demo: https://huggingface.co/spaces/anton-hugging/TimeOmni-1
🚩 News (Jan. 2026): TimeOmni-1 has been accepted to ICLR 2026! 🎉
Table. Model Size Scaling Comparison
* Note: All metrics below are computed only on valid responses. A "-" indicates a success rate (SR) below 10%; in such cases the metric is omitted for lack of statistical support. For ACC, higher is better; for MAE, lower is better. Bold marks the best value in each ACC/MAE column.
| Model | Task1 ID (ACC↑/SR) | Task1 OOD (ACC↑/SR) | Task2 ID (ACC↑/SR) | Task2 OOD (ACC↑/SR) | Task3 ID (MAE↓/SR) | Task3 OOD (MAE↓/SR) | Task4 ID (ACC↑/SR) | Task4 OOD (ACC↑/SR) |
|---|---|---|---|---|---|---|---|---|
| 7B (Qwen2.5-Instruct) | | | | | | | | |
| Qwen2.5-Instruct-7B | 48.5/100.0 | 42.8/100.0 | 21.6/99.8 | 26.3/100.0 | 23.28/53.1 | 146.12/55.5 | 25.5/100.0 | 24.9/100.0 |
| TimeOmni-1-7B | 90.7/97.5 | 87.7/98.3 | 69.3/99.8 | 64.0/99.8 | 14.30/93.8 | 145.53/82.3 | 47.9/100.0 | 58.9/100.0 |
| 4B (Qwen3.5) | | | | | | | | |
| Qwen-3.5-4B | 0.0/16.5 | 5.9/17.0 | 28.3/12.4 | 35.4/12.0 | -/2.2 | -/9.0 | -/8.5 | -/9.2 |
| TimeOmni-1-4B | 91.5/99.5 | 91.2/98.4 | **71.1**/100.0 | 66.1/99.9 | 13.68/97.6 | 170.41/86.1 | 58.5/100.0 | 72.0/100.0 |
| 9B (Qwen3.5) | | | | | | | | |
| Qwen-3.5-9B | 91.2/51.0 | **93.5**/46.1 | 43.3/12.1 | 36.3/12.8 | 17.56/14.1 | -/0.8 | **64.2**/28.2 | 72.0/32.2 |
| TimeOmni-1-9B | **93.5**/100.0 | 92.8/99.8 | 70.9/100.0 | **66.2**/100.0 | **13.54**/97.8 | **140.06**/95.6 | 59.6/100.0 | **75.6**/99.6 |
```
conda create -n timeomni python=3.10
conda activate timeomni
pip install -r requirements.txt
```

Download the model:

```
python install/download_hf_model.py
```

Default model path: `~/.cache/huggingface/hub`.

Download the testbed:

```
python install/download_testbed.py
```

This creates:

```
data/timeomni1_id_test.json
data/timeomni1_ood_test.json
```
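The downloaded splits can be inspected with a few lines of Python. This is an illustrative sketch that assumes the files are plain JSON arrays of examples (suggested by the `.json` extension); adjust if the actual on-disk format differs (e.g. JSONL).

```python
import json

def load_testbed(path: str) -> list:
    """Load one test split, assumed to be a JSON array of examples."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)

# e.g. examples = load_testbed("data/timeomni1_id_test.json")
```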
Default system prompt:

```
Output Format:
<think>Your step-by-step reasoning process that justifies your answer</think>
<answer>Your final answer(Note: Only output a single uppercase letter of the correct option)</answer>
```
Run:

```
python inference/inference.py \
    --model_dir "Local Model Path /models--anton-hugging--TimeOmni-1-7B/snapshots/<hash>" \
    --question "Your Question" \
    --system_prompt "Output Format:\n<think>Your step-by-step reasoning process that justifies your answer</think>\n<answer>Your final answer(Note: Only output a single uppercase letter of the correct option)</answer>"
```

Run the evaluation:

```
bash eval/run-timeomini_test.sh
```

Optional env overrides:

```
MODEL_DIR=anton-hugging/TimeOmni-1-7B \
ANS_ID_PATH=answer/timeomni1_test/your_id_outputs.json \
RES_ID_PATH=answer/timeomni1_test/your_id_results.json \
ANS_OOD_PATH=answer/timeomni1_test/your_ood_outputs.json \
RES_OOD_PATH=answer/timeomni1_test/your_ood_results.json \
bash eval/run-timeomini_test.sh
```

We report Success Rate (SR), defined as the proportion of model outputs that yield a valid and extractable answer. All other metrics are computed on valid cases only.
- Tasks 1, 2, 4: the model outputs a single uppercase letter (A/B/C/D). Metric: Accuracy (ACC).
- Task 3: the model outputs a sequence (e.g., `[2, 20, 21, ..., 83]`). Metric: Mean Absolute Error (MAE).
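The metric definitions above can be sketched in a few lines of Python. This is an illustrative sketch, not the repository's evaluation code; in particular, how Task 3 aggregates element-wise errors across examples is an assumption here.

```python
def success_rate(parsed: list) -> float:
    """SR: percentage of outputs with a valid, extractable answer (None = invalid)."""
    return 100.0 * sum(p is not None for p in parsed) / len(parsed)

def accuracy(preds: list, golds: list) -> float:
    """ACC over valid cases only (Tasks 1, 2, 4)."""
    valid = [(p, g) for p, g in zip(preds, golds) if p is not None]
    return 100.0 * sum(p == g for p, g in valid) / len(valid)

def mae(pred_seqs: list, gold_seqs: list) -> float:
    """MAE over valid cases only (Task 3).

    Assumption: per-example mean absolute element-wise error,
    then averaged across valid examples.
    """
    errs = []
    for p, g in zip(pred_seqs, gold_seqs):
        if p is None or len(p) != len(g):
            continue  # unparseable output: excluded here, penalized via SR
        errs.append(sum(abs(a - b) for a, b in zip(p, g)) / len(g))
    return sum(errs) / len(errs)
```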
@inproceedings{guan2026timeomni,
title={TimeOmni-1: Incentivizing Complex Reasoning with Time Series in Large Language Models},
author={Tong Guan and Zijie Meng and Dianqi Li and Shiyu Wang and Chao-Han Huck Yang and Qingsong Wen and Zuozhu Liu and Sabato Marco Siniscalchi and Ming Jin and Shirui Pan},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=kOIclg7muL}
}