Make your AI agents learn from your feedback. Every correction becomes a permanent rule. Over weeks, the corrections file becomes a methodology.
This skill adds a persistent feedback loop to any Claude Code skill, prompt file, or conversational workflow:
- Bootstrap — Point it at any skill or prompt. It creates memory files, injects a feedback step, and shows you how to give your first 5 rounds of feedback.
- Feedback — After every run, rate the output and give corrections. Generalizable corrections get saved permanently. One-off fixes stay one-off.
- Compile — Review accumulated rules. Deduplicate, resolve contradictions, remove stale entries.
- Stats — See how your agent is improving. Rule growth, grade trends, category breakdown.
- Remove — Cleanly remove RL from a skill. Your correction data is preserved.
```
You correct an agent's output
        │
        ▼
Classify: one-off or generalizable?
        │
        ├─ One-off → fix this output, move on
        │
        └─ Generalizable → save to corrections.md
                │
                ▼
    Next run reads corrections.md FIRST
    ─── rules override defaults ───
                │
                ▼
    Output improves over time
```
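The loop above can be sketched in Python. This is an illustration, not the skill's actual implementation (the skill itself is prompt-driven); the file name and the Wrong/Right/Pattern fields match the layout this skill creates, while the function names are hypothetical:

```python
from pathlib import Path

CORRECTIONS = Path("corrections.md")

def save_correction(wrong: str, right: str, pattern: str) -> None:
    """Append a generalizable correction in the Wrong/Right/Pattern format."""
    entry = f"\n- **Wrong:** {wrong}\n  **Right:** {right}\n  **Pattern:** {pattern}\n"
    with CORRECTIONS.open("a", encoding="utf-8") as f:
        f.write(entry)

def build_prompt(base_prompt: str) -> str:
    """Read corrections.md FIRST so saved rules override the base prompt's defaults."""
    if CORRECTIONS.exists():
        rules = CORRECTIONS.read_text(encoding="utf-8")
        return f"Rules (override defaults):\n{rules}\n---\n{base_prompt}"
    return base_prompt

# A one-off fix is applied to the current output and never written to disk;
# only generalizable corrections are saved.
save_correction(
    wrong="opens with a formal greeting",
    right="open with the hook sentence",
    pattern="skip greetings in posts",
)
prompt = build_prompt("Write a LinkedIn post about launch week.")
```

The key design point is the ordering: the saved rules are prepended, so the next run reads them before the base prompt's defaults.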
- Claude Code skills — any SKILL.md in `.claude/skills/`
- Prompt files — any `.md` or `.txt` file used as an agent prompt
- Inline workflows — conversational patterns with no file (creates one for you)
| Channel | How it works | Best for |
|---|---|---|
| In Claude Code | Rate + correct inline after each run | Solo users |
| Feedback log | Entries appended to a markdown file | Async review |
| Notion DB | Each output gets a database row | Teams |

| Depth | What gets captured | Time |
|---|---|---|
| Grades only | 1-5 score tracked in grades-log.md | ~5 sec |
| Corrections | Wrong/Right/Pattern format | ~1-2 min |
| Full context | Corrections + voice samples + anti-patterns + vocabulary | ~3-5 min |
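At the Corrections depth, a saved entry might look like this (a hypothetical example; the Wrong/Right/Pattern fields are the format named in the table above, and the Why line reflects the principle that every rule records its reasoning):

```markdown
- **Wrong:** "We are pleased to announce our new feature."
- **Right:** "We shipped something new this week."
- **Pattern:** Drop formal announcement openers; lead with the plain statement.
- **Why:** Formal openers bury the hook and read as corporate.
```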
```sh
# Option 1: copy the skill into your user-level skills directory
cp -r .claude/skills/agent-rl ~/.claude/skills/agent-rl

# Option 2: clone the repo and run the installer
git clone https://github.com/Othmane-Khadri/agent-rl-skill.git
cd agent-rl-skill
chmod +x install.sh
./install.sh
```

Bootstrap: "add RL to my linkedin-post skill"
```
"How did this land? (1-5)" → you: "3"
"Any corrections?" → you: "Too formal, drop the hedging"
"Save as permanent rule?" → you: "yes"
```
- Compile: "review my RL rules for linkedin-post"
- Stats: "RL stats"
- Remove: "remove RL from linkedin-post"
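The grade-trend part of the stats could be computed along these lines (an illustrative Python sketch, not the skill's actual logic; the parsed list of 1-5 grades from grades-log.md is assumed):

```python
from statistics import mean

def grade_trend(grades: list[int], window: int = 5) -> str:
    """Compare the average of the last `window` grades to the average before it."""
    if len(grades) <= window:
        return "not enough data"
    recent = mean(grades[-window:])
    earlier = mean(grades[:-window])
    if recent > earlier:
        return "improving"
    if recent < earlier:
        return "declining"
    return "flat"

# Grades as they might be parsed from grades-log.md, one 1-5 score per run.
print(grade_trend([2, 3, 3, 3, 4, 4, 4, 5, 4, 5]))  # → improving
```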
When you add RL to a skill:
```
.claude/skills/your-skill/
├── SKILL.md          ← modified (2 steps added, backup saved)
├── rl-config.md      ← configuration and stats
├── corrections.md    ← the memory (grows over time)
├── grades-log.md     ← grade history (if grades-only depth)
├── voice-samples.md  ← what good looks like (if full-context)
├── anti-patterns.md  ← what to avoid (if full-context)
├── vocabulary.md     ← terms to use/avoid (if full-context)
└── *.pre-rl.bak      ← backup of original SKILL.md
```
Agent RL's own self-improving files:
```
.claude/skills/agent-rl/
├── SKILL.md        ← the skill itself
├── corrections.md  ← corrections on how agent-rl works (self-RL)
└── self-rl-log.md  ← usage and grade history
```
| Session | Focus | What to save |
|---|---|---|
| 1 | The obvious | The correction you'd make every time |
| 2 | The tone | How the output sounds (too formal? too casual?) |
| 3 | The structure | How it's organized (wrong order? missing section?) |
| 4 | The vocabulary | Specific words (jargon? missing terms?) |
| 5 | First compile | Run "review my RL rules" and see your methodology forming |
- Claude Code installed
- At least one existing skill, prompt file, or workflow to add RL to
Agent RL eats its own dog food. It ships with its own corrections.md and self-rl-log.md. After every mode you run, it asks for quick feedback on how it performed — and saves generalizable corrections to improve itself next time.
Your corrections make the skill better for you. Over time, agent-rl learns how you like to bootstrap, capture feedback, and compile rules.
- Corrections override defaults — the memory file outranks the base prompt
- Library, not checklist — apply with judgment, not mechanically
- Reasoning included — every rule says why, not just what
- One-off vs. generalizable — prevents rule bloat
- Confirm before saving — no silent writes
- Backups before edits — every modification creates a .bak file
- Data preserved on removal — removing RL keeps your corrections
- Eats its own dog food — the skill applies RL to itself
Built by Earleads — GTM Engineering as a Service.