
--dangerously-skip-permissions blocked on RunPod (root-only environments) #396

@dentity007

Description


--dangerously-skip-permissions blocked on RunPod (root-only environments) — autonomous loop impossible

Problem

Autoresearch requires claude --dangerously-skip-permissions for autonomous operation — the agent must run indefinitely without human approval. However, Claude Code blocks this flag when running as root:

--dangerously-skip-permissions cannot be used with root/sudo privileges for security reasons

RunPod (and most GPU cloud providers — Lambda, Vast.ai, etc.) run everything as root by default. There is no built-in way to switch to a non-root user. This makes autoresearch unusable on the most common ML research infrastructure out of the box.

Without this flag, Claude Code prompts for permission on every bash command, git operation, and file edit — the agent stops every 2-3 minutes waiting for human approval, making the autonomous loop impossible.
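A quick preflight check can predict the refusal before you burn time on a session. This is a heuristic that mirrors the error message above (checking for uid 0), not Claude Code's actual internal check:

```shell
# Heuristic preflight: Claude Code refuses --dangerously-skip-permissions
# when the effective uid is 0 (root). Check before launching a long session.
can_skip_permissions() {
  [ "$(id -u)" -ne 0 ]
}

if can_skip_permissions; then
  echo "non-root user: autonomous mode should work"
else
  echo "root: --dangerously-skip-permissions will be refused"
fi
```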

Reproduction

# 1. Spin up any RunPod GPU pod
# 2. SSH in
ssh root@<ip> -p <port>

# 3. Setup
git clone https://github.com/karpathy/autoresearch.git
cd autoresearch
uv sync && uv run prepare.py

# 4. Install Claude Code
npm install -g @anthropic-ai/claude-code

# 5. Try autonomous mode
claude --dangerously-skip-permissions
# ERROR: --dangerously-skip-permissions cannot be used with root/sudo privileges

What We Tried (16 hours, H100 80GB)

We spent a full overnight session attempting to get autoresearch running on RunPod. Here's what happened:

Attempt 1 — Claude Code as root (no flag):
Claude Code ran but stopped for permission prompts every 2-3 minutes. It completed a baseline (val_bpb=2.1851) but then got stuck in a cycle of running experiments → permission prompt → waiting for human → timeout.

Attempt 2 — Create non-root user:

useradd -m -s /bin/bash researcher
su - researcher -c "cd /path && claude --dangerously-skip-permissions"

The flag worked, but introduced:

  • Git dubious ownership errors (files owned by root, user is researcher)
  • Fragile su session handling
  • File permission conflicts between root-owned workspace and researcher user

Attempt 3 — Bash script replacement:
We wrote a bash script that systematically sweeps hyperparameters using environment variables (since train.py reads from os.environ). This finally worked but loses the adaptive/creative aspect of having an LLM agent choose experiments.
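A minimal sketch of that sweep. The `LEARNING_RATE` and `WEIGHT_DECAY` names are assumptions for illustration; match them to whatever your train.py actually reads from os.environ:

```shell
# Sweep hyperparameters via environment variables; no file edits, so no
# sed-corruption risk on reverts. Variable names here are assumptions.
sweep() {
  local run_cmd="$1"   # e.g. "uv run python train.py"
  for lr in 1e-3 3e-3 1e-2; do
    for wd in 0.0 0.1; do
      echo "=== lr=$lr wd=$wd ==="
      LEARNING_RATE="$lr" WEIGHT_DECAY="$wd" $run_cmd \
        || echo "run lr=$lr wd=$wd failed, continuing"
    done
  done
}

sweep "uv run python train.py"
```

The `|| echo` keeps the sweep moving past individual crashed runs instead of aborting the overnight session.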

Results

| Attempt | Method | Experiments Completed | Hours Spent |
|---------|--------|-----------------------|-------------|
| 1 | Claude Code as root | 1 (baseline only) | 3 |
| 2 | Claude Code as researcher | ~5 (with manual approvals) | 2 |
| 3 | Bash script (no AI) | 2 successful + 80 crashed* | 11 |
| **Total** | | ~88 attempts, 3 real results | 16 |

*The 80 crashes were from a separate bug (sed commands corrupting train_gpt.py during reverts — our mistake, not autoresearch's). Once we switched to env vars instead of file modification, experiments completed successfully.

Additional Discovery: Eval Time on Custom Codebases

When adapting autoresearch for a different training script (our case: the OpenAI Parameter Golf contest), the sliding window eval on the full FineWeb validation set takes 21 minutes per experiment — not the ~10 seconds that autoresearch's built-in eval takes. This changes the math dramatically:

| Setup | Train | Eval | Total | Experiments/Hour |
|-------|-------|------|-------|------------------|
| Karpathy's autoresearch | 5 min | ~10 sec | ~6 min | ~10 |
| Custom codebase (our case) | 5 min | 21 min | 26 min | ~2 |

The README's estimate of "100 experiments while you sleep" assumes the built-in eval. Anyone adapting autoresearch for a different codebase should benchmark their eval time first.
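The throughput arithmetic from the table can be sketched as a small helper for back-of-envelope planning (integer minutes, so the built-in eval's ~10 seconds rounds up to 1 minute here):

```shell
# Rough experiments-per-hour estimate from measured train and eval times.
# Uses integer minutes, so a ~10-second eval is rounded up to 1 minute.
throughput() {
  local train_min=$1 eval_min=$2
  echo $(( 60 / (train_min + eval_min) ))
}

throughput 5 1    # built-in eval: ~10 experiments/hour
throughput 5 21   # 21-minute custom eval: ~2 experiments/hour
```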

Suggested Fixes

1. Add a setup-cloud.sh script that handles root environments:

#!/bin/bash
# setup-cloud.sh — Run autoresearch on root-only cloud GPUs (RunPod, Lambda, etc.)

set -e
WORK_DIR="${1:-/workspace/autoresearch}"
RUN_USER="researcher"   # don't reuse $USER, which login shells already set

echo "Setting up non-root user for autonomous Claude Code..."

# Create user if needed
id "$RUN_USER" &>/dev/null || useradd -m -s /bin/bash "$RUN_USER"

# Copy workspace and hand it to the new user
cp -r "$WORK_DIR" "/home/$RUN_USER/autoresearch"
chown -R "$RUN_USER:$RUN_USER" "/home/$RUN_USER/autoresearch"

# Fix git "dubious ownership" errors
su - "$RUN_USER" -c "git config --global --add safe.directory /home/$RUN_USER/autoresearch"

# Copy prepared data from root's cache if the new user doesn't have it yet
if [ -d /root/.cache/autoresearch/data ] && [ ! -d "/home/$RUN_USER/.cache/autoresearch/data" ]; then
    mkdir -p "/home/$RUN_USER/.cache/autoresearch"
    cp -r /root/.cache/autoresearch/data "/home/$RUN_USER/.cache/autoresearch/data"
    chown -R "$RUN_USER:$RUN_USER" "/home/$RUN_USER/.cache"
fi

# Verify data
su - "$RUN_USER" -c "ls ~/.cache/autoresearch/data/ > /dev/null 2>&1" || {
    echo "Data not found. Run 'uv run prepare.py' first as root, then re-run this script."
    exit 1
}

echo ""
echo "Setup complete. Launch with:"
echo "  su - $RUN_USER -c 'cd ~/autoresearch && claude --dangerously-skip-permissions'"

2. Document the eval time caveat in the README's "Platform support" section:

Custom codebases: If you're adapting autoresearch for a different training script, benchmark your eval time first. The default autoresearch eval takes seconds, but custom eval functions (e.g., sliding window eval on large validation sets) can take 20+ minutes, reducing throughput from ~10 experiments/hour to ~2.

3. Document the env-var approach as a fallback for environments where Claude Code can't run:

No Claude Code available? If you can't run Claude Code with --dangerously-skip-permissions (root-only environments, no Node.js, etc.), you can still use autoresearch's training infrastructure with a bash script that sweeps hyperparameters via environment variables. Since train.py reads all hyperparameters from os.environ, you never need to modify the file.

Environment

  • RunPod community cloud, H100 80GB HBM3
  • Ubuntu 24.04 (root user only)
  • Claude Code 2.1.81
  • Node.js 22.22.1
  • PyTorch 2.9.1+cu128, CUDA 12.8
  • Flash Attention 2.8.3 + FA3 Hopper kernels
