Skip to content

Create a new workflow to run self_hosted_large tests#2252

Open
Dreamsorcerer wants to merge 20 commits into
mainfrom
sam/self-hosted-large
Open

Create a new workflow to run self_hosted_large tests#2252
Dreamsorcerer wants to merge 20 commits into
mainfrom
sam/self-hosted-large

Conversation

@Dreamsorcerer
Copy link
Copy Markdown
Collaborator

No description provided.

@codecov
Copy link
Copy Markdown

codecov Bot commented May 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

📢 Thoughts on this report? Let us know!

Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml Outdated
Comment thread .github/workflows/ci.yml Outdated
@Dreamsorcerer Dreamsorcerer marked this pull request as ready for review June 3, 2026 13:29
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 3, 2026

Greptile Summary

This PR replaces the dimsim pytest marker with self_hosted_large and introduces a dedicated self-hosted-large-tests CI job that runs on a GPU-capable [self-hosted, Linux, large] runner, finally wiring up the three dimsim e2e tests that were previously marked but never executed in CI.

  • The dimsim_process.py render-mode logic is refactored so that an explicit DIMSIM_RENDER env var takes priority over the CI auto-detection, allowing the new job to opt into GPU rendering while keeping other CI jobs on CPU.
  • Explicit timeouts (120 s / 180 s) are added to wait_until_odom_position calls that previously relied on the 60-second default — appropriate given the heavier scenes.
  • The self-hosted-tests matrix job filter does not exclude self_hosted_large, which is fine today but could silently run large tests on the base runner if both markers are ever combined on one test.

Confidence Score: 4/5

Safe to merge; the new CI job is well-guarded against fork abuse and the marker migration is complete across all files.

The self-hosted-tests matrix job does not exclude self_hosted_large from its marker filter, which is harmless today but leaves a small gap if both markers are ever combined on a test — it would run on the base runner without the required GPU and likely fail silently on that runner. Everything else — the render-mode refactor, timeout additions, and codecov wiring — is straightforward and correct.

.github/workflows/ci.yml — the self-hosted-tests run step could benefit from adding self_hosted_large to its exclusion filter for future safety.

Important Files Changed

Filename Overview
.github/workflows/ci.yml New self-hosted-large-tests job added for GPU-capable runner; dimsim marker exclusion replaced with self_hosted_large in regular test job; actions/checkout bumped to v6. The self-hosted-tests job filter does not explicitly exclude self_hosted_large, which could cause duplicate runs if both markers appear on one test.
dimos/simulation/dimsim/dimsim_process.py Render mode selection refactored to respect DIMSIM_RENDER env var when explicitly set, falling back to CPU in CI and GPU otherwise — correctly enables the new large runner to opt into GPU rendering.
dimos/conftest.py dimsim marker replaced with self_hosted_large; description updated to reflect high-memory runner requirement.
dimos/e2e_tests/test_dimsim_walk_forward.py Marker updated to self_hosted_large; explicit timeout=120 added to wait_until_odom_position to prevent silent hangs under the default 60-second timeout.
dimos/e2e_tests/test_dimsim_spatial_memory.py Marker updated to self_hosted_large; explicit timeout=180 added to wait_until_odom_position for the longer spatial-memory scenario.
pyproject.toml Default addopts exclusion filter updated from dimsim to self_hosted_large, keeping local dev runs free of GPU-heavy simulation tests.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CI Triggered] --> B{Event type}
    B -->|push / same-repo PR| C[self-hosted-large-tests job starts]
    B -->|fork PR| D[self-hosted-large-tests SKIPPED]
    C --> E[Checkout + Fix permissions]
    E --> F[Install uv + apt deps]
    F --> G[uv sync --group tests-self-hosted]
    G --> H[Build C++ extensions]
    H --> I{DIMSIM_RENDER set?}
    I -->|Yes - gpu from job env| J[render = gpu]
    I -->|No| K{CI env set?}
    K -->|Yes| L[render = cpu]
    K -->|No| M[render = gpu]
    J --> N[Run pytest -m self_hosted_large with DISPLAY=:0]
    N -->|pass| O[Upload coverage to Codecov flag: SelfHosted-Large]
    N -->|fail| P[Re-run --lf with -vvvvv]
    P --> O
    O --> Q[ci-complete watcher allows skip of self-hosted-large-tests]
Loading

Comments Outside Diff (1)

  1. .github/workflows/ci.yml, line 298-305 (link)

    P2 The self-hosted-tests filter doesn't exclude self_hosted_large, so any future test decorated with both self_hosted and self_hosted_large would run on the base runner (which lacks the required GPU/memory) and on the large runner — duplicating the run and likely failing on the smaller machine. Adding an explicit exclusion here is a small, cheap guard against that.

Reviews (1): Last reviewed commit: "Merge branch 'main' into sam/self-hosted..." | Re-trigger Greptile

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant