promotion UX: support partial / edited promote (extract knowledge from hybrid noise+signal rows) #30

@tznthou

Context

The v0.3.0 promotion path (POST /journal/promote/:id) copies the entire session_journal.content field into memories.content, so promotion is row-grained. After the 5/7 dogfood run, a cross-provider review (Codex, 2026-05-08) flagged row granularity as a hard limit on journal yield.

Finding

Journal entry id=21 (5/7 capture) demonstrates the hybrid row problem:

```
Content opening (visible in pending preview, first ~200 chars):
Dual-write in sync, diff empty, 139 lines (within the ~150 soft limit).
[wrap-up summary]
Updated files: ...
Archived this session: ...
Promote candidates (pending 子超's confirmation):

Content middle (chars ~800-1500):
ImageMagick inline chroma key + G despill, single magick command version
(fuzz 25% transparent + ( -clone 0 ... ) parens stack + CopyGreen composite)
zsh reports "no matches" for +clone[0], needs a bash subshell +
ImageMagick's [N] works only on input filenames, not on +clone[N]
```

This row is noise wrapper + signal core:

  • Wrapper layer (lines 1-30): wrap-up summary, file diff, archive list, promote candidate list — all transient
  • Core layer (lines 30-50): ImageMagick chroma key command, zsh syntax pitfall, +clone[N] limitation — durable technical knowledge

Current options under v0.3.0:

  • ccmem promote 21 → memories is polluted with wrap-up text (defeats the trust split)
  • ccmem reject 21 → ImageMagick / zsh lessons silently lost

Why this matters

  • Hybrid rows are not edge cases: any session that records a real outcome and then summarizes will produce one. On 5/7, 1 of 16 rows captured in 24h was hybrid.
  • The trust split (v0.3.0 P1) is structurally vulnerable: pending → promoted → memories preserves the pollution if promotion is row-grained
  • Codex framing: "promotion UX must support extract / edit promote, otherwise journal can isolate but cannot transform"

Proposed fix

Mode A (minimum viable): ccmem promote <id> --content "..."

Accept stdin or --content flag with edited body; the new memory's content is the supplied text, but source_session_id and source_journal_id (new column) preserve traceability back to the original journal row.
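
A minimal sketch of how Mode A could pick the memory body, written in Python for illustration only (this issue does not show ccmem's CLI internals). `build_parser` and `resolve_content` are hypothetical names, and the precedence (flag, then piped stdin, then the raw row) is an assumption rather than a decided behavior.

```python
import argparse
import io
from typing import Optional, TextIO


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser; only the `promote <id> --content` shape is modeled.
    parser = argparse.ArgumentParser(prog="ccmem-promote-sketch")
    parser.add_argument("id", type=int, help="journal row id")
    parser.add_argument("--content", default=None, help="edited memory body")
    return parser


def resolve_content(flag_value: Optional[str], stdin: TextIO, row_content: str) -> str:
    """Pick the memory body: --content, then piped stdin, then the raw row."""
    if flag_value is not None and flag_value.strip():
        return flag_value.strip()
    # Only read stdin when it is piped; an interactive terminal would block.
    if not stdin.isatty():
        piped = stdin.read().strip()
        if piped:
            return piped
    # No edited body supplied: keep the current whole-row behavior.
    return row_content


# Simulates `echo "core only" | ccmem promote 21` with an in-memory stream.
example = resolve_content(None, io.StringIO("core only\n"), "raw row text")
```

In a real entry point the call would look like `resolve_content(args.content, sys.stdin, row_content)`, with `row_content` loaded from the journal row.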

Mode B (better UX): ccmem promote <id> --extract

Spawn $EDITOR with the journal row pre-loaded; user edits to keep only the durable signal; on save, the edited buffer becomes the memory content.
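
One way the --extract flow could work, sketched in Python under assumptions: `edit_in_editor` and `can_spawn_editor` are hypothetical helpers, the fallback to `vi` when $EDITOR is unset is an assumption, and the journal text is round-tripped through a temporary file. The TTY check corresponds to the concern listed under Risk.

```python
import os
import shlex
import subprocess
import sys
import tempfile
from typing import Optional


def can_spawn_editor() -> bool:
    # Interactive editors need a terminal on both stdin and stdout.
    return sys.stdin.isatty() and sys.stdout.isatty()


def edit_in_editor(initial: str, editor: Optional[str] = None) -> str:
    """Open `initial` in an editor and return the saved buffer, stripped."""
    # Assumption: fall back to `vi` when $EDITOR is unset.
    command = shlex.split(editor or os.environ.get("EDITOR", "vi"))
    fd, path = tempfile.mkstemp(suffix=".md", text=True)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(initial)
        # Blocks until the editor exits; a non-zero exit raises CalledProcessError.
        subprocess.run([*command, path], check=True)
        with open(path, encoding="utf-8") as handle:
            return handle.read().strip()
    finally:
        os.unlink(path)
```

A caller would check `can_spawn_editor()` first and fall back to --content (or exit with an error) when it returns False. What to do with an empty saved buffer is left to the caller in this sketch.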

Both modes share:

  • memories.source_journal_id foreign key to journal entry (new column, nullable for legacy memories)
  • Original journal row retains status='promoted' and references the resulting memory id
  • content_hash on memory recomputed from edited content (avoid hash collision with raw row)
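
The three shared points above can be sketched against an in-memory SQLite database. Several things here are assumptions: that the store is SQLite, the reduced table shapes, the `session_id` and `promoted_memory_id` column names, and SHA-256 as the hash. Only `content`, `status`, `content_hash`, `source_session_id` and `source_journal_id` come from this issue.

```python
import hashlib
import sqlite3
from typing import Optional

# Reduced, hypothetical schema; the real tables have more columns.
SCHEMA = """
CREATE TABLE session_journal (
    id INTEGER PRIMARY KEY,
    session_id TEXT NOT NULL,
    content TEXT NOT NULL,
    status TEXT NOT NULL DEFAULT 'pending',
    promoted_memory_id INTEGER
);
CREATE TABLE memories (
    id INTEGER PRIMARY KEY,
    content TEXT NOT NULL,
    content_hash TEXT NOT NULL,
    source_session_id TEXT,
    source_journal_id INTEGER REFERENCES session_journal(id)
);
"""


def promote(conn: sqlite3.Connection, journal_id: int, content: Optional[str] = None) -> int:
    """Promote a journal row; `content` overrides the raw row when given."""
    row = conn.execute(
        "SELECT session_id, content FROM session_journal WHERE id = ?", (journal_id,)
    ).fetchone()
    if row is None:
        raise LookupError(f"journal row {journal_id} not found")
    session_id, raw = row
    body = content if content is not None else raw
    # Hash the text that is actually stored, not the raw journal row.
    digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
    with conn:  # commit the insert and the status update together
        cur = conn.execute(
            "INSERT INTO memories"
            " (content, content_hash, source_session_id, source_journal_id)"
            " VALUES (?, ?, ?, ?)",
            (body, digest, session_id, journal_id),
        )
        memory_id = cur.lastrowid
        conn.execute(
            "UPDATE session_journal SET status = 'promoted', promoted_memory_id = ?"
            " WHERE id = ?",
            (memory_id, journal_id),
        )
    return memory_id
```

Calling `promote(conn, 21)` with no content keeps the whole-row copy, while `promote(conn, 21, content="...")` stores only the edited text and still links back to row 21.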

Acceptance criteria

  • POST /journal/promote/:id accepts optional body.content parameter; if present, memory.content = body.content (else current behavior preserved)
  • ccmem promote <id> --content "edited text" CLI path
  • ccmem promote <id> --extract opens $EDITOR, captures saved buffer
  • memories.source_journal_id migration (schema v23)
  • Test cases: hybrid row id=21-equivalent → extract → memory content is core only, journal row marked promoted with reference
  • Backward compat: existing ccmem promote <id> (no flags) still copies whole row
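
The first and last criteria reduce to one decision on the server side. The sketch below assumes the request body has already been parsed into a mapping; `memory_content_for` is a hypothetical helper, and rejecting an empty or non-string `content` is an assumption that this issue does not specify.

```python
from typing import Any, Mapping, Optional


def memory_content_for(row_content: str, body: Optional[Mapping[str, Any]]) -> str:
    """Return the text to store in memories.content for a promote request."""
    if not body or "content" not in body:
        # Backward compatible path: copy the whole journal row.
        return row_content
    content = body["content"]
    # Assumption: an empty or non-string override is a client error.
    if not isinstance(content, str) or not content.strip():
        raise ValueError("body.content must be a non-empty string")
    return content
```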

Risk

  • $EDITOR spawning in CLI requires TTY detection — not all environments have it (CI, automation). Fallback: --content mode or error message
  • Trace integrity: if the user edits content beyond recognition, source provenance becomes fuzzy. Accepted trade-off: source_journal_id records provenance, it does not validate content
  • Schema migration v22 → v23 needs same care as v22 (backup before, integrity check after)

Related

Cross-provider review trail

Codex review (2026-05-08): row granularity is a structural limit on journal yield; promotion UX without extract / edit is the bottleneck after v0.3.0 fixes the persistence gate.
