A lightweight AI agent that grabs fresh AI-related headlines and posts a daily digest to GitHub Issues.
🔔 Watch this repository to receive the daily AI news digest email delivered straight to your inbox.
Scheduled runs check for today's digest issue before calling the LLM, so fallback CI skips duplicate builds.
Push and pull request CI runs pytest and mypy.
Scheduled agent runs generate the digest locally and dispatch the final publish through GitHub Actions. Direct `--publish-issue` remains a manual fallback.
```mermaid
flowchart LR
  subgraph Trigger[Triggers]
    GH[GitHub Actions<br/>schedule or manual dispatch]
    AGENT[Codex / Claude Code<br/>automation]
    LOCAL[Local CLI run<br/>uv run python src/main.py]
  end
  subgraph Guard[Issue Guard]
    CHECK[Check today's GitHub issue]
  end
  subgraph App[Application]
    C[Collect]
    F[Filter]
    G[Group candidates]
    K[Categorize]
    R[Render]
    C --> F --> G --> K --> R
  end
  subgraph In[Inputs]
    FEEDS[feeds.json]
    RSS[RSS feed endpoints]
    CONF[.env + config.py]
    OAI[OpenAI API<br/>GitHub fallback]
    MODEL[Codex / Claude model<br/>agent mode]
  end
  subgraph Out[Outputs]
    JSON[digest-candidates.json<br/>digest-decisions.json]
    MD[news.md]
    ISSUE[GitHub Issue<br/>label: ai-digest]
  end
  GH --> CHECK --> C
  AGENT --> CHECK
  LOCAL --> C
  FEEDS --> C
  RSS --> C
  CONF --> C
  CONF --> K
  OAI --> K
  MODEL --> K
  G --> JSON
  JSON --> K
  R --> MD
  MD --> ISSUE
  classDef io fill:#eef7ff,stroke:#1f6feb,stroke-width:1px,color:#0b1f3a;
  classDef proc fill:#f7f7f7,stroke:#555,stroke-width:1px,color:#111;
  class FEEDS,RSS,CONF,OAI,MODEL,JSON,MD,ISSUE io;
  class GH,AGENT,LOCAL,CHECK,C,F,G,K,R proc;
```
```mermaid
flowchart LR
  GH[GitHub Actions] --> T{Today's issue<br/>already open?}
  AGENT[Codex / Claude] --> T
  T -- Yes --> S[Stop]
  T -- No --> C[Collect + filter + build candidate groups]
  C --> P{Path}
  P -- GitHub Actions --> K1{OPENAI_API_KEY available?}
  K1 -- Yes --> L[OpenAI dedupe + categorize]
  K1 -- No --> R[Local duplicate resolution + fallback categorization]
  P -- Codex / Claude --> X[Write digest-candidates.json]
  X --> Y[Agent writes digest-decisions.json]
  Y --> Z[Apply decisions]
  L --> W[Render + write news.md]
  R --> W
  Z --> W
```
- Python 3.12+ with pip

```bash
pip install uv
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
# Placeholder values such as sk-... or your_api_key_here are treated as missing
uv run python src/main.py
```

An agent-driven run keeps feed collection and filtering in Python but lets Codex or Claude Code handle dedupe/categorization without OPENAI_API_KEY.
For the full daily-agent runbook, see AGENTS.md.
```bash
uv run python src/main.py --check-issue --issue-status-file digest-issue-status.json
uv run python src/main.py --candidates-only
# agent reads digest-candidates.json and writes digest-decisions.json
uv run python src/main.py --apply-decisions digest-decisions.json
uv run python src/main.py --dispatch-publish
```

`--check-issue` writes `digest-issue-status.json` by default. `--candidates-only` writes `digest-candidates.json` and `digest-run-status.json` by default. Use `--candidates-file <path>`, `--status-file <path>`, and `--issue-status-file <path>` to override these artifacts.
Daily agent runs:
- Stop on `ok: true` plus `exists: true`, or on `ok: false` plus `retryable: false`.
- Continue only on `ok: false` plus `retryable: true`.
- Run candidate export with `RSS_MAX_WORKERS=2 RSS_TIMEOUT=15`; retry once with `RSS_MAX_WORKERS=1 RSS_TIMEOUT=20` on `feed_fetch_failed` or `empty_snapshot_with_feed_errors`.
- Write `digest-decisions.json`, run `--apply-decisions`, then run `--dispatch-publish`.
- After dispatch, wait briefly and re-run `--check-issue` to confirm the issue exists.
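As a rough illustration, the stop/continue rules can be expressed against the status artifact's fields (`ok`, `exists`, `retryable`). The `proceed` branch, for a successful check with no existing issue, is an assumption about what happens outside the stated rules:

```python
# Sketch of the runbook's stop/continue decision; not the repo's actual code.
def next_action(status: dict) -> str:
    if status.get("ok") and status.get("exists"):
        return "stop"      # today's digest issue already exists
    if not status.get("ok"):
        # transient GitHub failure -> retry; hard auth/config error -> stop
        return "continue" if status.get("retryable") else "stop"
    return "proceed"       # check succeeded and no issue yet: build the digest
```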
`--dispatch-publish` prefers `GITHUB_TOKEN` or `GH_TOKEN` with workflow-dispatch access and falls back to an authenticated local `gh`. Direct `--publish-issue` is still available as a manual fallback.
Agent decisions should use this JSON shape:
```json
{
  "executive_summary": "2-3 sentence overview of today's AI news.",
  "top_stories": ["g1i1"],
  "groups": [
    {
      "group_id": "g1",
      "off_topic_ids": [],
      "clusters": [
        {
          "keep_id": "g1i1",
          "duplicate_ids": ["g1i2"],
          "category": "Tools & Applications",
          "short_title": "OpenAI launches coding assistant",
          "summary_line": "Why this matters in one sentence.",
          "tier": "high"
        }
      ]
    }
  ]
}
```

Use `off_topic_ids` to drop low-signal or off-topic items from a group. For singleton groups, set `clusters` to `[]` and list the item id in `off_topic_ids`.
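A consumer of this shape might derive the kept and dropped item ids as follows (illustrative only; `kept_and_dropped` is not the repo's actual API):

```python
def kept_and_dropped(decisions: dict) -> tuple[set[str], set[str]]:
    """Split decision JSON into kept item ids and dropped item ids."""
    kept, dropped = set(), set()
    for group in decisions.get("groups", []):
        # off_topic_ids removes low-signal items from the group entirely
        dropped.update(group.get("off_topic_ids", []))
        for cluster in group.get("clusters", []):
            kept.add(cluster["keep_id"])
            dropped.update(cluster.get("duplicate_ids", []))
    return kept, dropped
```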
--dispatch-publish sends the rendered digest to the publish-only GitHub Actions workflow so the final issue author is app/github-actions, which is friendlier to watch-email notifications than publishing through your own local GitHub identity.
The default digest output is compact and title-first. `executive_summary` and `summary_line` are optional enrichment fields; the renderer keeps top stories and category sections skimmable even when those fields are present.
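As an illustration of that title-first behavior, a story line can treat `summary_line` as optional enrichment (a sketch only, not the repo's actual renderer):

```python
def render_story_line(cluster: dict) -> str:
    """Render one kept story, title first, with an optional one-line summary."""
    line = f"- **{cluster['short_title']}**"
    summary = cluster.get("summary_line")
    if summary:
        line += f": {summary}"
    return line
```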
The collector reads RSS feed URLs from feeds.json in the project root. The
file should contain a JSON object where each key is a feed URL and each value
specifies the category and human-readable source name.
Optional fields:
- `type`: source-specific handling such as paper limits.
- `source_role`: source authority for duplicate tie-breaks and ranking. Supported values: `primary`, `independent_reporting`, `commentary`, `community`.
- `feed_mode`: whether a feed is part of the main digest or supporting discovery only. Supported values: `core`, `discovery_only`.
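Assuming this schema, a minimal loader that keeps only `core` feeds could look like the following sketch (treating a missing `feed_mode` as `core` is an assumption, not confirmed repo behavior):

```python
import json
from pathlib import Path

def load_core_feeds(path: str = "feeds.json") -> dict[str, dict]:
    """Return {feed_url: metadata} for feeds that belong in the main digest."""
    feeds = json.loads(Path(path).read_text())
    return {
        url: meta
        for url, meta in feeds.items()
        if meta.get("feed_mode", "core") == "core"  # assumed default
    }
```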
```json
{
  "https://example.com/feed.xml": {
    "source": "Example Feed",
    "category": "All",
    "type": "news",
    "source_role": "independent_reporting",
    "feed_mode": "core"
  }
}
```

Pipeline notes:
- Exact duplicates are removed by normalized URL before any LLM call.
- Source-specific low-signal items such as webinars, sponsored posts, Academy tutorials, and event promos are dropped before grouping.
- Podcast-style discussion posts from broad news feeds are also filtered before grouping.
- Broad mixed-source feeds can also be gated by source-specific title rules before grouping.
- A per-source cap is applied before LLM dedupe for diversity and lower cost.
- The collector preserves `original_title` and RSS `summary` for duplicate resolution.
- Candidate export also writes `digest-run-status.json` with feed health, group counts, and sample `feed_errors` for automation use.
- `--check-issue` writes `digest-issue-status.json` through the same repo-local GitHub path used for publishing, preferring `GITHUB_TOKEN`/`GH_TOKEN` and falling back to local `gh` auth. On failure it still writes a status artifact with `ok: false`, a `reason`, an `error_kind`, and a `retryable` flag so automation can distinguish transient GitHub failures from hard auth/config errors.
- `--candidates-only` exits nonzero only when feed health is bad enough to make the snapshot unreliable. Healthy empty days are reported as `reason: "no_fresh_items"` without failing.
- `discovery_only` feeds can still merge into a core story and contribute coverage context, but standalone discovery-only items are dropped before final render.
- When fallback top stories are auto-selected, the digest prefers category diversity before repeating the same lane.
- The LLM receives candidate groups and returns structured duplicate clusters instead of line-based `SKIP` output.
- `--dispatch-publish` triggers `.github/workflows/publish-digest.yml` with a compressed digest payload, and that workflow runs the repo-local `--publish-issue` path on GitHub Actions. Direct `--publish-issue` remains a manual fallback.
- Short display titles are generated only for kept items after duplicates are resolved.
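The normalized-URL exact dedupe described in the pipeline notes can be sketched as follows (the collector's real normalization rules may be stricter; the specific normalizations here, lowercased host, stripped `www.`, and trailing slash, are assumptions):

```python
from urllib.parse import urlsplit

def normalize_url(url: str) -> str:
    """Collapse scheme, www. prefix, and trailing-slash variants to one key."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower().removeprefix("www.")
    path = parts.path.rstrip("/")
    return f"{host}{path}"

def dedupe_by_url(items: list[dict]) -> list[dict]:
    """Keep the first item seen for each normalized URL, preserving order."""
    seen: set[str] = set()
    unique = []
    for item in items:
        key = normalize_url(item["url"])
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique
```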