
Add changelog-draft Oz skill with Python scripts and GHA workflow#10280

Merged
vikvang merged 10 commits into master from oz/changelog-draft-skill on May 13, 2026
Conversation

Contributor

@vikvang vikvang commented May 6, 2026

Description

Adds an Oz agent skill that generates reviewable changelog drafts from PRs merged in a release range. Replaces the previous TypeScript orchestrator approach with a skill-first architecture following the pattern established in resolve-merge-conflicts and Buzz repo skills.

What it does:

  • Extracts explicit CHANGELOG-* markers from PR bodies
  • Classifies unmarked PRs inline using LLM judgment (with reference guidance)
  • Identifies external contributors via GitHub org membership checks
  • Cross-references feature flags to determine channel visibility
  • Produces a human-reviewable markdown draft + machine-readable JSON audit artifact
  • Does not mutate channel_versions.json or publish release notes

Files added:

  • .agents/skills/changelog-draft/SKILL.md — main skill (7-step workflow)
  • .agents/skills/changelog-draft/scripts/fetch_prs.py — PR collection + marker extraction
  • .agents/skills/changelog-draft/scripts/classify_contributors.py — org membership classification
  • .agents/skills/changelog-draft/scripts/extract_feature_flags.py — flag gate parser
  • .agents/skills/classify-changelog-pr/SKILL.md — inline classification guidance
  • .github/workflows/changelog_draft.yml: workflow_dispatch workflow using oz-agent-action

All Python scripts are stdlib-only (no pip deps), use gh CLI for GitHub API access, and output JSON to stdout.
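As a rough illustration of that script shape, here is a minimal sketch. The `run` and `emit` helpers are hypothetical names chosen for this example and are not confirmed to match the code in this PR. The idea is to shell out with `subprocess` and print a single JSON document to stdout.

```python
import json
import subprocess
import sys


def run(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run a command (the real scripts would pass a gh command) and capture text output."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False)


def emit(payload: dict) -> str:
    """Serialize the result as JSON, write it to stdout, and return the text."""
    text = json.dumps(payload, indent=2, sort_keys=True)
    print(text)
    return text


if __name__ == "__main__":
    # Stand-in command so the sketch runs without gh installed.
    result = run([sys.executable, "-c", "print('ok')"])
    emit({"returncode": result.returncode, "stdout": result.stdout.strip()})
```

Using `check=False` keeps failures visible to the caller as a return code plus stderr instead of raising, which matters for the membership checks discussed in the review.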

Oz conversation | Plan

Testing

  • All three Python scripts compile without errors (py_compile)
  • extract_feature_flags.py validated against live crates/warp_features/src/lib.rs — correctly extracts all RELEASE/PREVIEW/DOGFOOD flags
  • Full end-to-end validation planned as a manual workflow_dispatch run against a recent stable release after merge

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

Co-Authored-By: Oz <oz-agent@warp.dev>

@cla-bot cla-bot Bot added the cla-signed label May 6, 2026
@vikvang vikvang requested a review from petradonka May 6, 2026 18:20
Contributor Author

vikvang commented May 6, 2026

/oz-review

Contributor

oz-for-oss Bot commented May 6, 2026

@vikvang

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Contributor

@oz-for-oss oz-for-oss Bot left a comment


Overview

This PR adds an Oz changelog-draft skill, helper Python scripts, reference classification guidance, and a manual GitHub Actions workflow for generating changelog draft artifacts from a release range.

Concerns

  • The PR collection script only scans merge commits, so squash/rebase-merged PRs can be omitted from the release audit.
  • The classification workflow claims to apply feature-flag visibility rules but never gathers per-PR diffs/files needed to detect FeatureFlag usage.
  • Contributor classification treats failed/private org membership checks as external, which can misattribute internal private members.

Security

  • The workflow executes warpdotdev/oz-agent-action@main with WARP_API_KEY; pin the action to a reviewed SHA to avoid a mutable-action supply-chain risk.

Verdict

Found: 0 critical, 4 important, 0 suggestions

Request changes


[
"git",
"log",
"--merges",
Contributor

⚠️ [IMPORTANT] Limiting the range scan to merge commits misses squash/rebase-merged PRs, so the draft can omit PRs and fail the audit requirement; scan first-parent commits and extract PR numbers from all subjects instead.


### Step 5 — Classify unmarked PRs

For each PR that has no explicit `CHANGELOG-*` entries, decide whether to include it and under which category.
Contributor

⚠️ [IMPORTANT] Step 5 applies feature-flag visibility rules, but the workflow only collected PR title/body/labels and never fetches each PR diff; add a diff/file fetch before classification or stable drafts can include preview/dogfood-gated changes.

def is_org_member(org: str, username: str) -> bool:
    """Check if a user is a member of the given GitHub org via gh api."""
    result = run(
        ["gh", "api", f"orgs/{org}/members/{username}", "--silent"],
Contributor

⚠️ [IMPORTANT] GITHUB_TOKEN in this workflow is not a read:org user token, so private org membership checks will fail and be treated as external; either provide an appropriate token or return an unknown bucket instead of marking failures external.

Member

Is this valid? IIUC, this comment is saying the skill won't work properly if anyone external to our org tries to use it? That seems reasonable to me.

Comment thread .github/workflows/changelog_draft.yml Outdated
fetch-depth: 0

- name: Generate changelog draft
uses: warpdotdev/oz-agent-action@main
Contributor

⚠️ [IMPORTANT] [SECURITY] warpdotdev/oz-agent-action@main is mutable while receiving WARP_API_KEY; pin the action to a reviewed commit SHA like the other workflow actions.

@vikvang vikvang requested a review from vorporeal May 6, 2026 19:20
---

## New Features
- You can now drag tabs out of a window into their own window, or between windows, similar to Chrome. (#9275)
Member

Can we actually link to the PR here? Our changelog is formatted as markdown, so we can write something like [#9275](https://github.com/warpdotdev/warp/pull/9275).
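One way to apply that suggestion mechanically is sketched below. The `link_pr_refs` helper is a hypothetical name, and keeping the surrounding parentheses is an assumption about the desired output; only the URL pattern comes from the comment above.

```python
import re

# Repository URL as written in the review comment above.
REPO_URL = "https://github.com/warpdotdev/warp"


def link_pr_refs(line: str, repo_url: str = REPO_URL) -> str:
    """Rewrite bare '(#1234)' references as markdown links to the PR."""
    return re.sub(
        r"\(#(\d+)\)",
        lambda m: f"([#{m.group(1)}]({repo_url}/pull/{m.group(1)}))",
        line,
    )


if __name__ == "__main__":
    print(link_pr_refs("You can now drag tabs out of a window. (#9275)"))
```

Only parenthesized references are rewritten, so a bare `#123` inside the bullet text is left untouched.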

Contributor Author

Good call, would be much cleaner!

- Fix the terminal pane background appearing darker in horizontal tabs mode with background image or custom opacity. (#9474)
- AI code blocks tagged `vue`, `xml`, `dockerfile`, `jsx`, `tsx`, etc. now render with syntax highlighting. (#9471)
- Reopen Closed Session is now reachable from the new-session menu on Linux and Windows. (#9347)
- Fixed missing syntax highlighting for C++ header files using `.hpp`, `.hxx`, or `.H` extensions. (#9388)
Member

Some of the bullets (using this one as an example) differ from the draft you put in Slack:

- Fixed missing syntax highlighting for C++ header files using `.hpp`, `.hxx`, or `.H` extensions. (#9388) — @princepal9120 :sparkles:

Which one is the latest version? If the one in Slack is the most up-to-date, then we need to update to actually include the unicode emoji, i.e. ✨. We aren't able to render :sparkles:.
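A small post-processing pass could enforce this. The sketch below assumes a hypothetical `replace_shortcodes` helper and a mapping that covers only the one shortcode mentioned in this thread.

```python
# Mapping from Slack/GitHub-style shortcodes to the unicode emoji the changelog can render.
# Only :sparkles: appears in this thread; any further entries would be assumptions.
SHORTCODES = {":sparkles:": "\u2728"}


def replace_shortcodes(text: str) -> str:
    """Replace known emoji shortcodes with their unicode characters."""
    for code, emoji in SHORTCODES.items():
        text = text.replace(code, emoji)
    return text


if __name__ == "__main__":
    print(replace_shortcodes("(#9388) @princepal9120 :sparkles:"))
```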

Contributor Author

Strange. The latest version is the draft in Slack. I'll enforce the unicode emojis.

Tries merge commits first (for repos using merge-commit strategy).
Falls back to all first-parent commits (for repos using squash merges).
"""
# Try merge commits first
Member

We don't use merge commits in this repo, so I think this is redundant.

Contributor Author

ACK

name: Changelog Draft

on:
workflow_dispatch:
Member

just checking, is there anything we need to specify in the workflow to prevent external contributors from running it? this might be a repo-wide setting, but wanted to validate.

Contributor Author

I'm unsure, this might be a @vorporeal question.

share: team

- name: Upload changelog artifacts
uses: namespace-actions/upload-artifact@f6ccaacc655aec41b93af180d1d7eef21af862d2 # v1.0.3
Member

if we upload via the namespace upload artifact action, are the artifacts available from the github workflow UI?

contents: read
pull-requests: read

jobs:
Member

Not blocking, but we should integrate this with our existing changelog generation flow (the one that currently runs after a release is created and posts the changelog to #release). I think that's in the create release or create release candidate workflow.

Ideally there's minimal change to how the oncalls handle the changelog, even if the internals of how we generate it changes.

@@ -0,0 +1,54 @@
---
name: classify-changelog-pr
Member

I think if we're going to have an agent classify whether unmarked PRs should appear in the changelog, then we need a way for a contributor to explicitly mark that a PR should not have a changelog entry.

I intentionally don't include a changelog entry for the vast majority of my PRs. I wouldn't want to accidentally have them be included into the changelog.

- **IMPROVEMENT** — Enhances an existing feature in a way users would notice (performance, UX, new options).
- **BUG-FIX** — Fixes a user-visible bug or regression.
- **OZ** — Changes to Oz / AI agent capabilities. At most 4 per release in the stable changelog.
- **IMAGE** — A GCP-hosted image URL for the release. Rare; only include if explicitly provided.
Member

I don't think image is something we need to support here? That's something we configure manually, not via PRs.

Comment thread .agents/skills/changelog-draft/SKILL.md Outdated

### Step 1 — Determine the release range

Infer the previous release tag for comparison. Release tags follow the pattern `v0.YYYY.MM.DD.HH.MM.<channel>_NN`. The previous tag is the most recent tag on the same channel before the given `release_tag`.
Member

I don't think this is quite the right condition. We can have multiple versions of a single release, e.g.

v0.2026.04.29.08.57.stable_00
v0.2026.04.29.08.57.stable_01
v0.2026.04.29.08.57.stable_02

We want to diff off of the previous release cut, not the most recent tag. i.e. if we have versions:

v0.2026.04.22.08.57.stable_00
v0.2026.04.29.08.57.stable_00
v0.2026.04.29.08.57.stable_01

and we're generating a changelog for v0.2026.04.29.08.57.stable_01, we want to diff against v0.2026.04.22.08.57.stable_00, not v0.2026.04.29.08.57.stable_00.

Getting the right release tag to diff from should probably be done programmatically, there should be some prior art in our existing workflows.
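For illustration, the selection rule described above could be sketched as follows. This is not the repo's existing workflow logic. `parse_tag` and `previous_cut_tag` are hypothetical names, the channel is assumed to be lowercase letters, and the baseline is assumed to be the lowest-numbered tag of the previous cut.

```python
import re
from typing import NamedTuple, Optional

# Tag pattern from Step 1: v0.YYYY.MM.DD.HH.MM.<channel>_NN
TAG_RE = re.compile(
    r"^v0\.(\d{4})\.(\d{2})\.(\d{2})\.(\d{2})\.(\d{2})\.([a-z]+)_(\d+)$"
)


class ReleaseTag(NamedTuple):
    cut: tuple  # (year, month, day, hour, minute) identifies the release cut
    channel: str
    version: int  # the _NN suffix
    raw: str


def parse_tag(tag: str) -> Optional[ReleaseTag]:
    """Parse a release tag, returning None for tags that do not match the pattern."""
    m = TAG_RE.match(tag)
    if not m:
        return None
    *stamp, channel, version = m.groups()
    return ReleaseTag(tuple(int(p) for p in stamp), channel, int(version), tag)


def previous_cut_tag(release_tag: str, all_tags: list[str]) -> Optional[str]:
    """Return the first tag of the most recent earlier cut on the same channel."""
    target = parse_tag(release_tag)
    if target is None:
        raise ValueError(f"unrecognized release tag: {release_tag}")
    earlier = [
        t for t in map(parse_tag, all_tags)
        if t and t.channel == target.channel and t.cut < target.cut
    ]
    if not earlier:
        return None
    latest_cut = max(t.cut for t in earlier)
    # Tags sharing the target's timestamp are never candidates, so _01 and _02
    # of the same cut diff against the previous cut rather than against _00.
    return min((t for t in earlier if t.cut == latest_cut), key=lambda t: t.version).raw
```

With the three tags from the example above, asking for the baseline of `v0.2026.04.29.08.57.stable_01` yields `v0.2026.04.22.08.57.stable_00`.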

Contributor Author

Ah, this makes sense. I'll look for any existing workflows that might have something that can help here.

vikvang and others added 5 commits May 7, 2026 10:24
Replaces the TypeScript orchestrator in services/changelog-draft/ with a
skill-first architecture:

- .agents/skills/changelog-draft/SKILL.md: 7-step workflow for generating
  reviewable changelog drafts from release PRs
- scripts/fetch_prs.py: collects PRs in a release range, extracts CHANGELOG
  markers from PR bodies
- scripts/classify_contributors.py: checks warpdotdev org membership via gh api
- scripts/extract_feature_flags.py: parses RELEASE/PREVIEW/DOGFOOD flag lists
  from warp_features/src/lib.rs
- .agents/skills/classify-changelog-pr/SKILL.md: inline classification guidance
  for unmarked PRs
- .github/workflows/changelog_draft.yml: workflow_dispatch workflow using
  oz-agent-action

Co-Authored-By: Oz <oz-agent@warp.dev>
- fetch_prs.py now falls back to all first-parent commits when no merge
  commits are found (handles squash-merge workflow used by warp repo)
- Added examples/changelog-draft-example.md generated from real data:
  v0.2026.04.29.08.56.stable_00 → v0.2026.05.06.09.12.stable_00
  (211 PRs, 57 with explicit markers, 154 unmarked)

Co-Authored-By: Oz <oz-agent@warp.dev>
- fetch_prs.py: include changed_files per PR so the agent can detect
  FeatureFlag references and apply channel visibility rules
- classify_contributors.py: add 'unknown' bucket for auth failures
  instead of misclassifying as external
- SKILL.md: add guidance for feature-flag detection via changed_files
  and handling unknown contributors
- changelog_draft.yml: pin oz-agent-action to commit SHA
  ce1621abf6a8ed8afdd4e4cc994545ede8fe1c6f

Co-Authored-By: Oz <oz-agent@warp.dev>
- fetch_prs.py: remove merge-commit fallback (warp uses squash merges
  only); add CHANGELOG-NONE opt-out marker support
- SKILL.md: fix release range logic to diff against previous release
  *cut* (different date _00 tag), not just previous tag — handles RC
  tags like _01, _02 correctly
- SKILL.md: use markdown PR links [#N](url) in output examples;
  remove IMAGE from category list; add CHANGELOG-NONE to skipped output
- classify-changelog-pr: remove IMAGE category (configured manually);
  add CHANGELOG-NONE as first exclusion rule
- Example: convert all PR refs to markdown links; add ✨ unicode emoji
  to external contributor attribution
- Workflow: add comment clarifying workflow_dispatch access restriction

Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
@vikvang vikvang force-pushed the oz/changelog-draft-skill branch from 5a919ce to 1b43ac6 Compare May 7, 2026 14:49
@vikvang vikvang requested a review from danielpeng2 May 12, 2026 18:24
vikvang and others added 2 commits May 12, 2026 16:57
…an output

- Add CHANGELOG-NONE to PR template for deterministic changelog exclusion
- Remove 'Needs Review' and 'Skipped PRs' from markdown output (kept in JSON)
- Add 'Community' section with Contributors and Issue Reporters subsections
- New fetch_issue_reporters.py script to credit users who reported fixed issues
- Update fetch_prs.py to extract linked issues from PR bodies (Fixes/Closes)
- Update SKILL.md workflow: new Step 5 for issue reporters, 8 steps total

Co-Authored-By: Oz <oz-agent@warp.dev>
For commits that aren't squash-merges (no #PR in subject), walk the
merge commit's second-parent lineage to find PRs brought in via merge
commit (e.g. security-patches branch). Each discovered PR gets its own
fetch_pr_data call so none are missed.

Co-Authored-By: Oz <oz-agent@warp.dev>
@vikvang vikvang marked this pull request as ready for review May 12, 2026 23:42
Contributor

oz-for-oss Bot commented May 12, 2026

@vikvang

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.


Contributor

@oz-for-oss oz-for-oss Bot left a comment


Overview

This PR adds an Oz changelog-draft skill, helper scripts, PR-template marker guidance, and a manual GitHub Actions workflow for generating release changelog drafts.

Concerns

  • The new CHANGELOG-NONE template line is parsed as a real opt-out, causing unchanged PR templates to suppress changelog entries.
  • PR number extraction can fetch issue numbers instead of PR numbers when squash titles mention issues before the GitHub (#PR) suffix.
  • Contributor classification treats many operational failures as confirmed external contributors.
  • The workflow checks out the release tag before running the newly added skill, which can fail for existing release tags that do not contain these files.

Found: 0 critical, 4 important, 1 suggestion

Verdict

Request changes


CHANGELOG-BUG-FIX: {{more text goes here...}}
CHANGELOG-IMAGE: {{GCP-hosted URL goes here...}}
CHANGELOG-OZ: {{text goes here...}}
CHANGELOG-NONE
Contributor

⚠️ [IMPORTANT] This marker sits inside the template's HTML comment, but fetch_prs.py scans the raw body and treats any CHANGELOG-NONE as opt-out before placeholder filtering, so every PR that leaves the template unchanged will be skipped. Either remove this literal marker from the commented template or make marker extraction ignore HTML comments/template placeholders.
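A minimal sketch of the second option (ignoring HTML comments and template placeholders) might look like the following. The `extract_markers` name, the returned dictionary shape, and the exact marker regex are assumptions made for this example, not the actual fetch_prs.py code.

```python
import re

HTML_COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# A marker line is 'CHANGELOG-<CATEGORY>' with an optional ': text' payload.
MARKER_RE = re.compile(r"^CHANGELOG-([A-Z-]+)(?::[ \t]*(.*))?$", re.MULTILINE)
# Unfilled template text such as '{{text goes here...}}'.
PLACEHOLDER_RE = re.compile(r"\{\{.*\}\}")


def extract_markers(body: str) -> dict:
    """Collect CHANGELOG markers from a PR body, ignoring commented-out template text."""
    # Normalize CRLF line endings, then drop HTML comments before scanning.
    visible = HTML_COMMENT_RE.sub("", body.replace("\r\n", "\n"))
    entries = []
    opt_out = False
    for category, text in MARKER_RE.findall(visible):
        text = text.strip()
        if category == "NONE":
            opt_out = True
        elif text and not PLACEHOLDER_RE.search(text):
            entries.append({"category": category, "text": text})
    return {"opt_out": opt_out, "entries": entries}
```

Stripping comments first means an untouched template contributes neither entries nor an opt-out, while a marker written in the visible part of the body is still honored.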

    Expects the GitHub squash format: 'feat: something (#1234)'.
    """
    msg = run(["git", "log", "-1", "--format=%s", sha])
    m = re.search(r"#(\d+)", msg)
Contributor

⚠️ [IMPORTANT] This takes the first #123 in the squash subject, so titles like Fixes #123 (#456) fetch issue 123 instead of PR 456. Match the trailing (#<pr>) GitHub suffix or take the last parenthesized PR number.
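The suggested fix can be sketched as a regex anchored to the end of the subject. `pr_number_from_subject` is a hypothetical helper name, and the sketch assumes the subject has already been read from `git log`.

```python
import re
from typing import Optional

# Match only the '(#N)' suffix that GitHub appends to squash-merge subjects.
TRAILING_PR_RE = re.compile(r"\(#(\d+)\)\s*$")


def pr_number_from_subject(subject: str) -> Optional[int]:
    """Return the PR number from a squash subject, or None if there is no trailing (#N)."""
    m = TRAILING_PR_RE.search(subject)
    return int(m.group(1)) if m else None


if __name__ == "__main__":
    print(pr_number_from_subject("Fixes #123 (#456)"))
```

Anchoring to the end of the line means an issue reference earlier in the title is never mistaken for the PR number, and a subject without the suffix returns `None` rather than a guess.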

    stderr = result.stderr.lower()
    if "403" in stderr or "401" in stderr or "saml" in stderr:
        return "unknown"
    return "external"
Contributor

⚠️ [IMPORTANT] Any non-401/403 failure (network errors, gh missing, rate limits with unexpected stderr) is classified as external, which can publicly credit internal or unknown users. Only return external on an explicit 404 and treat other failures as unknown.
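The stricter rule could be sketched as a pure function over the command result. Two things here are assumptions: the "internal" bucket name, and that the gh CLI reports a missing member with "HTTP 404" in stderr. `classify_membership` is a hypothetical helper, not the code in classify_contributors.py.

```python
def classify_membership(returncode: int, stderr: str) -> str:
    """Map the result of an org-membership lookup to internal, external, or unknown."""
    if returncode == 0:
        return "internal"
    # Assumption: gh reports a missing member as '... (HTTP 404)' on stderr.
    if "http 404" in stderr.lower():
        return "external"
    # Auth errors, rate limits, network failures, or a missing gh binary
    # all land here rather than being reported as external.
    return "unknown"


if __name__ == "__main__":
    print(classify_membership(1, "gh: Not Found (HTTP 404)"))
```

Defaulting to `unknown` keeps an operational failure from publicly crediting someone as an external contributor.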

Comment thread .github/workflows/changelog_draft.yml Outdated
- name: Check out code
uses: namespacelabs/nscloud-checkout-action@938f5d2d403d6224d9a0c0dc559b1dae09c2ede4 # v8.1.1
with:
ref: ${{ inputs.release_tag }}
Contributor

⚠️ [IMPORTANT] Checking out the requested release tag means the workspace may not contain this newly added skill/scripts for existing or older tags, so the Oz step can fail before it can generate a draft. Run the workflow from the current repository revision and use release_tag only as the git range head, or copy the skill before checking out the tag.


---

## Needs Review
Contributor

💡 [SUGGESTION] The example contradicts the skill's output contract that says Needs Review and Skipped PRs belong only in the JSON artifact. Keep the example aligned so the agent does not include internal audit sections in the human markdown.

vikvang and others added 2 commits May 13, 2026 10:46
fetch_issue_reporters.py now accepts --org flag to check each reporter's
org membership via the GitHub API, filtering out internal members so they
aren't misattributed as external community reporters.

Co-Authored-By: Oz <oz-agent@warp.dev>
- fetch_prs.py: strip HTML comments before scanning for CHANGELOG
  markers so template placeholders aren't parsed as real opt-outs
- fetch_prs.py: match trailing (#N) for PR number extraction to
  avoid grabbing issue numbers from squash titles
- classify_contributors.py: only return external on explicit 404,
  treat all other failures as unknown
- changelog_draft.yml: check out default branch instead of release
  tag so skill files are always available for older releases
- example: remove Needs Review and Skipped sections to match output
  contract; add NONE to classify skill categories, remove IMAGE

Co-Authored-By: Oz <oz-agent@warp.dev>
@vikvang vikvang enabled auto-merge (squash) May 13, 2026 17:57
@vikvang vikvang merged commit 3c22e42 into master May 13, 2026
36 of 41 checks passed
@vikvang vikvang deleted the oz/changelog-draft-skill branch May 13, 2026 18:16
vikvang added a commit that referenced this pull request May 14, 2026
## Description
Replace the `warpdotdev/generate-changelog` GitHub Action in the
`generate_changelogs` job with the Oz changelog-draft skill
(`oz-agent-action`). This makes the release changelog generation
AI-powered, using the same skill that was landed in #10280.

### What changed
- **SKILL.md**: Step 8 now produces a third output file
`changelog-release.json` with the `{newFeatures, improvements, bugFixes,
images, oz_updates}` schema expected by the Slack payload builder and
in-app changelog.json steps.
- **create_release.yml**: The `generate_changelogs` job now:
1. Runs `oz-agent-action` with a prompt to follow the changelog-draft
skill workflow
2. A bridge step reads `changelog-release.json` and sets
`outputs.changelog` so all downstream steps (Slack post, GCS upload)
work unchanged
3. Fixes `.image` to `.images` key in the changelog.json builder for
schema consistency

### How it works
The Oz agent runs the full skill workflow (fetch PRs, classify
contributors, extract feature flags, classify unmarked PRs, assemble
draft) and writes three files:
- `changelog-draft.md` — human-reviewable markdown
- `changelog-draft.json` — machine-readable audit artifact
- `changelog-release.json` — release-pipeline-compatible JSON (new)

The bridge step loads `changelog-release.json` into
`steps.generate_changelog.outputs.changelog`, maintaining the same
interface the downstream Slack and GCS steps expect.

## Linked Issue
- N/A (continuation of #10280)

## Testing
- Validated YAML structure (1788 lines, all key references intact)
- Generated a real `changelog-release.json` from the latest stable
release range (v0.2026.05.06 → v0.2026.05.13): 2 new features, 10
improvements, 27 bug fixes, 1 image
- Ran the exact downstream jq transforms (Slack payload builder +
changelog.json builder) against the generated JSON — all produce valid
output
- Full end-to-end test will occur on the next release cut

- [ ] I have manually tested my changes locally with `./script/run`

## Agent Mode
- [x] Warp Agent Mode - This PR was created via Warp's AI Agent Mode

Co-Authored-By: Oz <oz-agent@warp.dev>

<!--
CHANGELOG-NONE
-->

---------

Co-authored-by: Oz <oz-agent@warp.dev>