Architecture

AgenticMail edited this page May 18, 2026 · 1 revision

The webhook function is a single Netlify Function in agenticmail/frontend/netlify/functions/github-webhook.mts. ~1000 lines of TypeScript, no separate service, no database.


Request lifecycle

1. GitHub POSTs to /api/github/webhook with the event body
2. Function reads raw body (must be exact bytes for HMAC)
3. Verifies signature against GITHUB_WEBHOOK_SECRET (constant-time)
4. Checks delivery UUID against dedup store, writes if new
5. Parses JSON, returns 202 Accepted in <100ms
6. context.waitUntil keeps function alive for async work:
     a. Branch by event type
     b. Get installation token (60-min lifetime, fresh each call)
     c. For user-triggered mentions: rate-limit check + plan gate
     d. Fetch thread context (comments, PR files if applicable)
     e. Call Anthropic (claude-haiku-4-5, OAuth or API key)
     f. Write usage record (tokens + estimated cost)
     g. Perform action (comment, close, merge, review)
     h. Write audit entry (status, latency, errors)
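
The steps above can be sketched as a small handler. This is a minimal illustration rather than the actual function: the `WebhookDeps`, `BackgroundContext`, and `Ack` shapes are hypothetical stand-ins for the real Netlify request and context types, and the injected functions stand for the signature check, dedup store, and async processing described on this page.

```typescript
// Hypothetical dependency shapes, so the flow can be read (and tested) in isolation.
interface WebhookDeps {
  verify(rawBody: string, signature: string | null): boolean;
  seenBefore(deliveryId: string): Promise<boolean>; // true if the UUID was already recorded
  process(event: string, payload: unknown): Promise<void>;
}
interface BackgroundContext {
  waitUntil(work: Promise<unknown>): void;
}
interface Ack {
  status: number;
  body: Record<string, unknown>;
}

async function handleDelivery(
  rawBody: string,
  headers: Map<string, string>,
  ctx: BackgroundContext,
  deps: WebhookDeps,
): Promise<Ack> {
  // Step 3: verify the signature over the exact raw body.
  const signature = headers.get("x-hub-signature-256") ?? null;
  if (!deps.verify(rawBody, signature)) {
    return { status: 401, body: { error: "bad_signature" } };
  }

  // Step 4: short-circuit repeat deliveries.
  const deliveryId = headers.get("x-github-delivery") ?? "";
  if (await deps.seenBefore(deliveryId)) {
    return { status: 202, body: { deduped: true } };
  }

  // Steps 5-6: parse, schedule the slow work without awaiting it, ack immediately.
  const payload: unknown = JSON.parse(rawBody);
  const event = headers.get("x-github-event") ?? "unknown";
  ctx.waitUntil(
    deps.process(event, payload).catch((err: unknown) => {
      console.error("webhook processing failed", err);
    }),
  );
  return { status: 202, body: { accepted: true } };
}
```

The point of the shape is that nothing slow sits between the signature check and the `return`: token exchange, the Anthropic call, and the GitHub write all happen inside the promise handed to `waitUntil`.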

The fast 202 is non-negotiable: GitHub expects a response within 10 seconds and treats a slower delivery as failed, and any repeat of that delivery would cause duplicate processing. The dedup store protects against that anyway, but fast acks keep the system clean.


HMAC verification

The signature check is a constant-time compare against the SHA-256 HMAC of the raw request bytes. We can't re-serialize the JSON before hashing: a parse/stringify round trip can change whitespace, escaping, or key order, so the hash would no longer match what GitHub signed. So we call req.text() once, hash that exact string, and only then JSON.parse() it for actual use.

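// Read the body exactly once, as text, so the hashed bytes match what GitHub signed.
const rawBody = await req.text();
const sig = req.headers.get("x-hub-signature-256");
const ok = await verifySignature(rawBody, sig, secret);
if (!ok) return new Response("invalid signature", { status: 401 });
// Only parse after the signature has been verified.
const payload = JSON.parse(rawBody);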

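The comparison XORs the expected and received digests byte by byte and accumulates the result, so its running time doesn't reveal where the first mismatch is. A bad-signature delivery writes an audit entry with status: "bad_signature" so probes show up in /api/github/audit.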
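
A minimal sketch of what a helper like verifySignature could look like, assuming Node's built-in `node:crypto` module. The real function may well use Web Crypto instead (the call above is awaited), and `signBody` and `constantTimeEqual` are hypothetical helper names introduced here for illustration.

```typescript
import { createHmac } from "node:crypto";

// Hex-encoded SHA-256 HMAC of the raw body, in the "sha256=<hex>" form
// that GitHub sends in the X-Hub-Signature-256 header.
function signBody(rawBody: string, secret: string): string {
  return "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// XOR each character pair and OR the results together, so the loop does the
// same amount of work wherever the first difference happens to be.
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false; // the length is not secret
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

function verifySignature(rawBody: string, header: string | null, secret: string): boolean {
  if (!header) return false; // a missing header is treated as a bad signature
  return constantTimeEqual(signBody(rawBody, secret), header);
}
```

Because the function is synchronous here, `await verifySignature(...)` at the call site still works; awaiting a plain boolean simply yields it.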


Deduplication

The same delivery can arrive more than once: a delivery that fails or times out can be redelivered, and the same X-GitHub-Delivery UUID arrives each time.

We write <UUID> → timestamp into the github-webhook-dedup blob store on first receipt. On subsequent deliveries we see the UUID is present and short-circuit with 202 + deduped: true.

A 5-minute TTL is enough: it covers repeats that arrive close together, and a redelivery requested after the entry has expired is processed as a fresh event.
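
As a rough sketch of the dedup check, assuming the TTL is enforced by comparing the stored timestamp with the current time (the page does not say how expiry is implemented). `DedupStore` is a hypothetical minimal interface standing in for the blob store, and the function names are illustrative.

```typescript
const DEDUP_TTL_MS = 5 * 60 * 1000;

// Hypothetical minimal store shape; any async key-value store fits.
interface DedupStore {
  get(key: string): Promise<string | null>;
  set(key: string, value: string): Promise<void>;
}

// Pure decision: a stored timestamp only counts while it is younger than the TTL.
function isFreshDuplicate(stored: string | null, now: number): boolean {
  if (stored === null) return false;
  const seenAt = Number(stored);
  return Number.isFinite(seenAt) && now - seenAt < DEDUP_TTL_MS;
}

// Returns true if this delivery was already seen; otherwise records it.
async function checkAndRecord(
  store: DedupStore,
  deliveryId: string,
  now: number = Date.now(),
): Promise<boolean> {
  if (isFreshDuplicate(await store.get(deliveryId), now)) return true;
  await store.set(deliveryId, String(now));
  return false;
}
```

Note that a get-then-set like this is not atomic: two copies of the same delivery landing at the same instant could both pass the check. Whether that matters depends on how close together repeats actually arrive.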


Rate limiting

Per-installation fixed-window counter in github-rate-limit:

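const HOUR_MS = 60 * 60 * 1000;
const now = Date.now();
let bucket = { count, windowStart };   // loaded from the blob store
// Start a fresh window once the previous one is more than an hour old.
if (now - bucket.windowStart > HOUR_MS) bucket = { count: 0, windowStart: now };
if (bucket.count >= 60) return "rate_limited";
bucket.count++;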

60 user mentions per installation per one-hour window. Generous for genuine use, hostile to spam. Auto-triage / auto-summary on issues.opened / pull_request.opened are not rate-limited, because they fire once per object.

When tripped, the bot posts a 🚦 cooldown comment with the ETA. Audit entry records status: "rate_limited".
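
One way to get the ETA for the cooldown comment is to have the limiter return how long is left in the current window. The sketch below is an illustration under that assumption: `checkRateLimit`, `Decision`, and `retryAfterMs` are hypothetical names, and only the 60-per-hour limit and the `{ count, windowStart }` shape come from this page.

```typescript
const WINDOW_MS = 60 * 60 * 1000;
const LIMIT = 60;

interface Bucket {
  count: number;
  windowStart: number; // epoch milliseconds
}
interface Decision {
  allowed: boolean;
  bucket: Bucket;       // the bucket to persist after this check
  retryAfterMs: number; // 0 when allowed
}

function checkRateLimit(prev: Bucket | null, now: number): Decision {
  // A missing or expired bucket starts a new window at `now`.
  const bucket: Bucket =
    prev !== null && now - prev.windowStart <= WINDOW_MS
      ? { ...prev }
      : { count: 0, windowStart: now };

  if (bucket.count >= LIMIT) {
    // Time remaining until the current window ends.
    return { allowed: false, bucket, retryAfterMs: bucket.windowStart + WINDOW_MS - now };
  }
  bucket.count++;
  return { allowed: true, bucket, retryAfterMs: 0 };
}
```

A caller could turn the result into the comment's ETA with `new Date(now + decision.retryAfterMs)`.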


Plan gating

Paid verbs (close, merge, review) check the installation's plan before executing:

if (PAID_VERB_SET.has(cmd.verb)) {
  const plan = await getPlan(payload.repository.owner.login);
  if (plan !== "paid") return upgradeRequiredComment();
}

Plan records live in github-billing, written by the marketplace_purchase webhook handler on purchased | changed | pending_change and deleted on cancelled. Anything other than planName: "Free" is treated as paid. Single-tier v1 — we don't differentiate Pro / Team / Enterprise yet.
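
The billing rules above can be sketched as two small functions. This is an illustration, not the actual handler: an in-memory `Map` stands in for the github-billing store, `applyMarketplaceEvent` is a hypothetical name, and treating a missing record as free is an assumption (the page only says that any plan other than "Free" counts as paid).

```typescript
type Plan = "free" | "paid";

// The subset of the marketplace_purchase payload this sketch reads.
interface MarketplaceEvent {
  action: "purchased" | "changed" | "pending_change" | "pending_change_cancelled" | "cancelled";
  marketplace_purchase: {
    account: { login: string };
    plan: { name: string };
  };
}

type BillingStore = Map<string, { planName: string }>;

function applyMarketplaceEvent(billing: BillingStore, ev: MarketplaceEvent): void {
  const login = ev.marketplace_purchase.account.login;
  switch (ev.action) {
    case "purchased":
    case "changed":
    case "pending_change":
      billing.set(login, { planName: ev.marketplace_purchase.plan.name });
      break;
    case "cancelled":
      billing.delete(login);
      break;
    default:
      break; // pending_change_cancelled: left untouched in this sketch
  }
}

function getPlan(billing: BillingStore, login: string): Plan {
  const record = billing.get(login);
  // Assumption: no record (never purchased, or cancelled) means free.
  if (!record || record.planName === "Free") return "free";
  return "paid";
}
```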


Audit log

Every delivery writes one JSON blob to github-webhook-audit, keyed by <YYYY-MM-DD>/<delivery-uuid>:

interface AuditEntry {
  ts: string;
  deliveryId: string;
  event: string;
  action?: string;
  installationId?: number;
  repo?: string;
  status: "accepted" | "deduped" | "bad_signature" | "rate_limited"
        | "processed" | "failed" | "ignored";
  latencyMs?: number;
  error?: string;
  extra?: Record<string, unknown>;
}

Read via GET /api/github/audit?day=YYYY-MM-DD. Failures never throw — the logger is best-effort so a Blob outage doesn't bring down webhook processing.
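
A best-effort writer along these lines might look like the sketch below. It assumes `ts` is an ISO 8601 UTC timestamp (so its first ten characters are the day prefix) and a store with a `setJSON(key, value)` method; `BlobWriter`, `auditKey`, and `writeAudit` are hypothetical names.

```typescript
// Hypothetical minimal store shape with a JSON write method.
interface BlobWriter {
  setJSON(key: string, value: unknown): Promise<void>;
}

interface AuditRecord {
  ts: string;         // assumed ISO 8601, e.g. "2026-05-18T09:30:00.000Z"
  deliveryId: string;
  status: string;
}

// <YYYY-MM-DD>/<delivery-uuid>
function auditKey(entry: { ts: string; deliveryId: string }): string {
  return `${entry.ts.slice(0, 10)}/${entry.deliveryId}`;
}

// Never throws: a failed write is logged and reported as `false`.
async function writeAudit(store: BlobWriter, entry: AuditRecord): Promise<boolean> {
  try {
    await store.setJSON(auditKey(entry), entry);
    return true;
  } catch (err) {
    console.warn("audit write failed", err);
    return false;
  }
}
```

Keying by day first means a read for one day is a single prefix listing rather than a scan of every delivery.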


Cost telemetry

Every Anthropic call writes a UsageEntry to github-usage:

interface UsageEntry {
  ts: string;
  account: string;       // for aggregating by Marketplace customer
  installationId?: number;
  deliveryId?: string;
  verb: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
  costCents: number;     // estimated, see HAIKU_INPUT/OUTPUT_CENTS_PER_TOKEN
}

The cost estimate uses constants at the top of the file (HAIKU_INPUT_CENTS_PER_TOKEN, HAIKU_OUTPUT_CENTS_PER_TOKEN), so there is one place to update when Anthropic publishes new rates. Aggregate via GET /api/github/usage?account=<login> for per-customer rollups.

This is what makes pricing decisions defensible: Anthropic charges per token, customers pay flat rate, and you need to know the delta per customer before launch.
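
To make the arithmetic concrete, here is a small sketch of the estimate and a per-account rollup. The two rate values are placeholders chosen for illustration, not the constants from the file; check Anthropic's current price list before relying on them. `estimateCostCents` and `rollupByAccount` are hypothetical names.

```typescript
// Placeholder rates (assumed for this example only).
const HAIKU_INPUT_CENTS_PER_TOKEN = 0.0001;  // i.e. $1 per million input tokens
const HAIKU_OUTPUT_CENTS_PER_TOKEN = 0.0005; // i.e. $5 per million output tokens

function estimateCostCents(inputTokens: number, outputTokens: number): number {
  return (
    inputTokens * HAIKU_INPUT_CENTS_PER_TOKEN +
    outputTokens * HAIKU_OUTPUT_CENTS_PER_TOKEN
  );
}

interface UsageRow {
  account: string;
  costCents: number;
}

// Total estimated cost per account, in cents.
function rollupByAccount(rows: UsageRow[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    totals.set(row.account, (totals.get(row.account) ?? 0) + row.costCents);
  }
  return totals;
}
```

With these placeholder rates, a call that uses 1,000 input tokens and 200 output tokens is estimated at about 0.2 cents. Subtracting an account's rollup from its flat monthly price gives the per-customer delta the paragraph above refers to.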


LLM context

generateReply() constructs the thread context as plain text, then sends it to Claude with a verb-specific instruction. Context budget:

  • Original post body (full)
  • Up to 20 most recent comments (oldest → newest)
  • On PRs: up to 20 changed files with 40 lines of patch each (the highest-signal context for summarize/review/reply)
  • Trigger user, verb, args
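
The budget above could be applied with a builder like the following. It is a sketch only: the `ThreadInput` shape and the section labels are invented for illustration, and the real generateReply() may lay out its text differently. The three limits are the ones listed above.

```typescript
const MAX_COMMENTS = 20;
const MAX_FILES = 20;
const MAX_PATCH_LINES = 40;

// Hypothetical input shape for this sketch.
interface ThreadInput {
  body: string;
  comments: { author: string; body: string }[];   // ordered oldest to newest
  files?: { filename: string; patch?: string }[]; // present on PRs only
  trigger: { user: string; verb: string; args: string };
}

function buildThreadContext(t: ThreadInput): string {
  const parts: string[] = [`Original post:\n${t.body}`];

  // Keep the most recent comments, still in oldest-to-newest order.
  for (const c of t.comments.slice(-MAX_COMMENTS)) {
    parts.push(`Comment by ${c.author}:\n${c.body}`);
  }

  // Cap both the number of files and the patch lines per file.
  for (const f of (t.files ?? []).slice(0, MAX_FILES)) {
    const patch = (f.patch ?? "").split("\n").slice(0, MAX_PATCH_LINES).join("\n");
    parts.push(`File ${f.filename}:\n${patch}`);
  }

  parts.push(`Request from ${t.trigger.user}: ${t.trigger.verb} ${t.trigger.args}`.trimEnd());
  return parts.join("\n\n");
}
```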

Model: claude-haiku-4-5, max_tokens: 1024. System prompt instructs the model to reply in Markdown, be concise, not fabricate links or code, and always end with the AgenticMail footer.

OAuth tokens require anthropic-beta: oauth-2025-04-20 header. The SDK doesn't add it automatically (not all OAuth-token holders are on this beta surface), so we set it explicitly when the auth token starts with sk-ant-oat01-.
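
The credential switch can be sketched as a function that builds the client options. The option names (`apiKey`, `authToken`, `defaultHeaders`) follow what the Anthropic TypeScript SDK constructor accepts, but the local `AnthropicClientOptions` interface and the `clientOptionsFor` name are stand-ins defined here so the example has no dependencies.

```typescript
// Local stand-in for the subset of SDK constructor options used here.
interface AnthropicClientOptions {
  apiKey?: string;
  authToken?: string;
  defaultHeaders?: Record<string, string>;
}

const OAUTH_TOKEN_PREFIX = "sk-ant-oat01-";

function clientOptionsFor(credential: string): AnthropicClientOptions {
  if (credential.startsWith(OAUTH_TOKEN_PREFIX)) {
    // OAuth tokens are sent as a bearer token and need the beta header set explicitly.
    return {
      authToken: credential,
      defaultHeaders: { "anthropic-beta": "oauth-2025-04-20" },
    };
  }
  // Anything else is treated as a regular API key.
  return { apiKey: credential };
}
```

The result would then be passed to the SDK constructor, for example `new Anthropic(clientOptionsFor(token))`.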


Why one big file

We could split this into a dozen modules. We don't, because:

  • Single deploy unit on Netlify → no inter-function call latency
  • Cold start hits one bundle, not many → faster first response
  • Auditing the whole flow means reading one file top to bottom, not navigating a chain of imports
  • ~1000 lines is fine for a v1 surface; we'll split when it hurts

The only extracted module is _lib/parse-mention.ts — pure functions that the unit-test suite imports without dragging in Anthropic / Octokit / Netlify.
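
To show why a pure parsing module is easy to test in isolation, here is a sketch of what a mention parser could look like. Everything in it is an assumption: the actual exports of _lib/parse-mention.ts are not documented on this page, the bot handle is passed in as a parameter, the verb list is collected from the verbs named on this page, and falling back to "reply" for unrecognized text is a guess at sensible behaviour.

```typescript
// Verbs mentioned on this page; the real list may differ.
const VERBS = ["reply", "summarize", "review", "close", "merge"] as const;
type Verb = (typeof VERBS)[number];

interface Command {
  verb: Verb;
  args: string;
}

function parseMention(body: string, handle: string): Command | null {
  // Escape regex metacharacters in the handle before building the pattern.
  const escaped = handle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // The mention must start a word and must not continue into a longer handle.
  const pattern = new RegExp(`(?:^|\\s)@${escaped}(?![\\w-])[ \\t]*([^\\n]*)`, "i");
  const match = pattern.exec(body);
  if (!match) return null;

  const text = (match[1] ?? "").trim();
  const [first = "", ...rest] = text.split(/\s+/);
  const verb = VERBS.find((v) => v === first.toLowerCase());
  // Assumed fallback: free-form text after the mention is treated as a reply request.
  if (!verb) return { verb: "reply", args: text };
  return { verb, args: rest.join(" ") };
}
```

Because the function takes strings and returns plain objects, a unit test can import it without pulling in Anthropic, Octokit, or Netlify.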


What's deliberately not here

  • No persistent comment storage. Comment text leaves the function once, over TLS, to Anthropic. We don't write it to any blob store. Audit log records delivery metadata only — latencyMs, status, error — never the comment body.
  • No analytics SDK. Netlify function logs + our own audit blob are the entire observability stack.
  • No SDK telemetry to Anthropic. We rely on standard Anthropic SDK behaviour and don't set metadata.user_id or any other extra metadata on requests.
  • No cross-installation correlation. Each install is its own rate-limit bucket, billing record, audit prefix. No "all installs from the same actor" detection in v1.
