Overnight autoresearch adaptation beats TinyRecursiveModels on Sudoku-Extreme: 92.2% exact acc in 5 min (vs paper's 87% in 18h) #369
VihariKanukollu started this conversation in Show and tell
Replies: 2 comments
- Impressive results, nice work!
- Excellent, yes: I would like to test your modified train.py and prepare.py and get back to you; great interest. I'm not sure how code should be shared here, but you could email it to diestel.research@gmail.com, confidentially, and your IP is personally guaranteed.
-
I adapted the autoresearch loop to run on TinyRecursiveModels by @jm_alexia, targeting the Sudoku-Extreme benchmark.
Results: 92.2% exact accuracy after 5 minutes of training (274 total runs, 263 metric runs), versus the paper's 87% after 18 hours. The agent found five key interventions over the course of the run.
The most interesting finding was #5: the agent deliberately weakened the input reinjection path so the model had to rely more on its recurrent state, which is the opposite of the original TRM design choice.
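To make #5 concrete, here is a toy sketch of what a weakened input-reinjection path looks like in a recurrent refinement loop. Everything below (function names, the tanh update, the `inject_scale` knob) is illustrative, not the actual TinyRecursiveModels code:

```python
import numpy as np

def recurrent_refine(x, steps=6, inject_scale=0.25, seed=0):
    """Toy recurrent refinement with a weakened input-reinjection path.

    A TRM-style step re-adds the input embedding x at every iteration.
    Scaling that path by inject_scale < 1 (hypothetical knob) forces the
    model to carry information in its recurrent state instead of reading
    it back from the input each step.
    """
    rng = np.random.default_rng(seed)
    dim = x.shape[-1]
    W = rng.normal(scale=0.1, size=(dim, dim))  # stand-in for a learned update
    state = np.zeros_like(x)
    for _ in range(steps):
        # Weakened reinjection: inject_scale * x instead of the full x.
        state = state + np.tanh((state + inject_scale * x) @ W)
    return state
```

Note that with `inject_scale=0.0` the input is ignored entirely (the state starts at zero and stays there), so the scale interpolates between pure-state recurrence and the full-strength reinjection of the original design.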
[Progress chart, annotated with breakthrough points]
Full thread with details: https://x.com/VihariKanukollu/status/2035411680050778435
This is an adaptation of the autoresearch seed repo applied to a non-nanochat benchmark — happy to share the modified train.py and prepare.py if there's interest.
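For anyone reproducing the 92.2% figure: "exact accuracy" on Sudoku-Extreme counts a puzzle as solved only if all 81 cells match the solution, with no partial credit. A minimal sketch of the metric (hypothetical helper; the real implementation lives in the modified train.py):

```python
import numpy as np

def exact_accuracy(preds: np.ndarray, targets: np.ndarray) -> float:
    """Fraction of puzzles solved exactly: all 81 cells must match.

    preds, targets: integer arrays of shape (batch, 81) holding digits 1-9.
    """
    solved = (preds == targets).all(axis=1)  # per-puzzle all-cells-correct
    return float(solved.mean())

# Example: 3 puzzles, one corrupted in a single cell -> 2/3 exact accuracy.
targets = np.tile(np.arange(81) % 9 + 1, (3, 1))
preds = targets.copy()
preds[2, 40] = (preds[2, 40] % 9) + 1  # flip one cell to a different digit
print(exact_accuracy(preds, targets))
```

The all-or-nothing criterion is why exact accuracy is much harsher than per-cell accuracy: a single wrong cell zeroes out the whole puzzle.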