✨(schema-bench): Add unified CLI with MVP1 wrapper implementation #3461

NoritakaIkeda · 2025-09-12T09:45:30Z

Issue

resolve: route06/liam-internal#5652

Why is this change needed?

This implements MVP1 of the schema-bench unified CLI roadmap, providing a single entry point for execution, evaluation, and comparison operations while maintaining full backward compatibility with existing scripts.

The unified CLI addresses the need for a consistent interface across different executors (OpenAI, Liam) and operations, setting the foundation for future enhancements like model comparison and reasoning effort configuration.

This is preparation to enable configuration of reasoning effort from the command line.

Key Changes

Unified CLI Interface

New schema-bench command with subcommands:
- execute --executor <liam|openai>: Unified execution interface
- evaluate: Multi-dataset evaluation wrapper
- compare: MVP1 placeholder for future implementation

Non-breaking Implementation

Child process wrapper pattern: Uses spawn() to call existing TypeScript scripts
Full argument pass-through: Unknown arguments forwarded to underlying scripts unchanged
Complete backward compatibility: All existing package.json scripts continue to work

Enhanced OpenAI Executor

Flexible workspace detection: Auto-detects workspace vs internal dataset directories
Improved error handling: Better messaging for missing workspace setup
Debugger conflict resolution: Prevents nested NODE_OPTIONS inspector issues

Documentation Updates

Dual interface documentation: Shows both unified CLI (recommended) and legacy scripts
Clear migration guidance: Explains transition from legacy to unified interface
Command examples: Comprehensive usage examples for all interfaces

Technical Implementation

The MVP1 implementation follows the "wrapper + pass-through" pattern to minimize risk:

CLI Entry Point (src/cli/index.ts):
- Uses commander.js pattern for argument parsing
- Maps executors to existing script files
- Handles exit code propagation correctly
Child Process Execution:
- Spawns existing scripts via tsx runtime
- Inherits stdio for seamless user experience
- Prevents debugger conflicts in nested execution
Workspace Resolution:
- Enhanced executeOpenai.ts with flexible base directory detection
- Supports both workspace and internal dataset execution modes

Acceptance Criteria

✅ Full Compatibility: All existing scripts work unchanged
✅ Unified Interface: New CLI provides consistent interface across executors
✅ Non-breaking: Legacy workflows continue without modification
✅ Extensible: Foundation for future MVP2/3 enhancements

Testing

Verified unified CLI execution with both OpenAI and Liam executors
Confirmed legacy scripts remain functional
Tested argument pass-through behavior
Validated workspace detection logic

This sets the foundation for future enhancements while ensuring zero disruption to existing workflows.

Summary by CodeRabbit

New Features
- Unified CLI: schema-bench subcommands for execute, evaluate, and a compare placeholder.
- Package now exposes a public CLI entry so schema-bench can be run directly.
Improvements
- Execution flows accept an explicit executor (liam|openai) and pass-through options.
- OpenAI execution falls back to a bundled default dataset when no workspace dataset is present.
Documentation
- Updated guides with unified CLI examples and note that legacy commands remain supported.
- Added OpenAI environment note (OPENAI_API_KEY).

Add unified schema-bench CLI with execute, evaluate, and compare subcommands: - **Unified CLI Interface**: New `schema-bench` command with subcommands - `execute --executor <liam|openai>`: Wraps existing executor scripts - `evaluate`: Wraps evaluateSchemaMulti.ts for multi-dataset evaluation - `compare`: MVP1 placeholder with guidance message - **Non-breaking Implementation**: All existing scripts remain unchanged - Legacy scripts continue to work via package.json scripts - New CLI uses child_process.spawn to call existing implementations - Full argument pass-through to underlying scripts - **Enhanced OpenAI Executor**: Improved workspace detection - Auto-detects workspace vs internal dataset directories - Flexible execution base directory resolution - Better error handling for missing workspace setup - **Updated Documentation**: README and command docs reflect both interfaces - Recommends unified CLI while preserving legacy documentation - Clear migration path from legacy to unified interface This implements MVP1 from the schema-bench roadmap, providing a single entry point while maintaining full backward compatibility. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

vercel · 2025-09-12T09:45:32Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Preview	Comments	Updated (UTC)
liam-app	Ready	Preview	Comment	Sep 12, 2025 0:57am
liam-assets	Ready	Preview	Comment	Sep 12, 2025 0:57am
liam-erd-sample	Ready	Preview	Comment	Sep 12, 2025 0:57am
liam-storybook	Ready	Preview	Comment	Sep 12, 2025 0:57am

1 Skipped Deployment

Project	Deployment	Preview	Comments	Updated (UTC)
liam-docs	Ignored	Preview		Sep 12, 2025 0:57am

changeset-bot · 2025-09-12T09:45:34Z

⚠️ No Changeset found

Latest commit: 3c8fad2

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2025-09-12T09:45:37Z

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a unified schema-bench CLI with a Node ESM bin shim and TypeScript entrypoint, routes execute/evaluate/compare subcommands to TS scripts via tsx, introduces a package bin/script, refactors executeOpenai to use a computed baseExecutionDir, and updates docs while preserving legacy scripts.

Changes

Cohort / File(s)	Summary
CLI entry + shim & packaging `frontend/internal-packages/schema-bench/src/cli/index.ts`, `frontend/internal-packages/schema-bench/bin/schema-bench.mjs`, `frontend/internal-packages/schema-bench/package.json`	Adds a versioned top-level TypeScript CLI with subcommands (execute/evaluate/compare/help/version), a Node ESM shim to run the TS CLI via tsx, a runTsCli helper, --executor parsing, and registers `schema-bench` as a package bin and npm script.
Executor: OpenAI IO refactor `frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts`	Centralizes baseExecutionDir resolution (workspace or internal default), refactors input/output helpers and executeCase to use base dir, preserves concurrency and OPENAI_API_KEY handling.
Docs & examples `frontend/internal-packages/schema-bench/README.md`, `.claude/commands/benchmark-execute.md`	Rewrites docs to document the unified CLI examples for execute/evaluate, moves some OpenAI examples to legacy section, and updates notes (e.g., OPENAI_API_KEY).
Tooling ignore `knip.jsonc`	Adds CLI entry to knip ignore list with explanatory comment.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant Bin as bin/schema-bench.mjs
  participant CLI as src/cli/index.ts (TS CLI)
  participant TSX as tsx runner
  participant Exec as Executor script (execute*.ts)
  participant Eval as evaluateSchemaMulti.ts

  U->>Bin: schema-bench <subcommand> [--executor/-x args]
  Note right of Bin: resolve TS path, sanitize NODE_OPTIONS, spawn
  Bin->>CLI: node --import tsx index.ts [args]
  CLI->>CLI: parse subcommand & flags
  alt execute (with --executor liam|openai)
    CLI->>TSX: node --import tsx execute*.ts [passthrough args]
    TSX->>Exec: run executor (liam/openai)
    Note over Exec: openai uses computed baseExecutionDir for IO
    Exec-->>TSX: exit code
  else evaluate
    CLI->>TSX: node --import tsx evaluateSchemaMulti.ts [args]
    TSX->>Eval: run evaluation
    Eval-->>TSX: exit code
  else compare
    CLI-->>Bin: print placeholder, exit 0
  end
  TSX-->>CLI: propagate exit code
  CLI-->>Bin: propagate exit code
  Bin-->>U: process exit

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

✨ Add LiamDB executor to schema-bench package #2771 — Related changes adding/using the LiamDB executor and executeLiamDb entrypoint that integrate with the unified CLI.
feat: add OpenAI executor to schema-bench #2580 — Prior modifications to executeOpenai.ts; directly relevant to the OpenAI baseExecutionDir refactor and wiring.
✨(schema-bench): Add recall metrics and entity extraction benchmark #3237 — Overlapping work on schema-bench CLI, packaging, and documentation.

Suggested labels

Review effort 4/5

Suggested reviewers

hoshinotsuyoshi
FunamaYukina
junkisai

Poem

Hoppity hop, I ship a single name,
Shim wakes tsx, and nothing looks the same.
Executors hum, outputs tucked in rows,
Legacy paths linger where the old wind blows.
A rabbit cheers — unified CLI, here we go! 🐇✨

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)

Failed to retrieve linked issues from the platform client.

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title Check	✅ Passed	The title succinctly and accurately summarizes the primary change — adding a unified CLI and an MVP1 wrapper for schema-bench — which aligns with the PR objectives and modified files. It includes the affected package scope (schema-bench) and the implementation stage (MVP1), making the intent clear to reviewers. The phrasing is specific and not generic, so it describes the main change rather than unrelated details. The leading emoji is cosmetic but does not render the title misleading.
Docstring Coverage	✅ Passed	No functions found in the changes. Docstring coverage check skipped.
Description Check	✅ Passed	The PR description follows the repository template: it includes an Issue section with a resolve reference (route06/liam-internal#5652) and a populated "Why is this change needed?" section, and it also provides comprehensive Key Changes, Technical Implementation, Acceptance Criteria, and Testing details that make the intent and scope clear to reviewers. This exceeds the template's minimum requirements and supplies actionable context for review. Therefore the description is complete and clear.

✨ Finishing touches

📝 Generate Docstrings

🧪 Generate unit tests

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch feat/schema-bench-unified-cli-mvp1

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

supabase · 2025-09-12T09:45:43Z

Updates to Preview Branch (feat/schema-bench-unified-cli-mvp1) ↗︎

Deployments	Status	Updated
Database	✅	Fri, 12 Sep 2025 12:54:43 UTC
Services	✅	Fri, 12 Sep 2025 12:54:43 UTC
APIs	✅	Fri, 12 Sep 2025 12:54:43 UTC

Tasks are run on every commit but only new migration files are pushed.
Close and reopen this PR if you want to apply changes from existing seed or migration files.

Tasks	Status	Updated
Configurations	✅	Fri, 12 Sep 2025 12:54:46 UTC
Migrations	✅	Fri, 12 Sep 2025 12:54:46 UTC
Seeding	✅	Fri, 12 Sep 2025 12:54:46 UTC
Edge Functions	✅	Fri, 12 Sep 2025 12:54:47 UTC

View logs for this Workflow Run ↗︎.
Learn more about Supabase for Git ↗︎.

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (1)
45-50: CLI hint typo: script name is setupWorkspace

Message recommends setup-workspace (hyphenated) which doesn’t exist; scripts define setupWorkspace.
-        `Input directory not found: ${inputDir}. Please run setup-workspace first.`,
+        `Input directory not found: ${inputDir}. Please run setupWorkspace first.`,

🧹 Nitpick comments (6)

frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json (1)

472-480: Unique constraint references a non-existent column

uq_cps_active_one_per_plan includes cancellation_time_is_null_partial_index_placeholder, which is not a defined column. Either model partial uniqueness explicitly in your constraint format (e.g., with a where/filter field) or approximate with a CHECK + UNIQUE that only references existing columns.

If the evaluator requires valid column names, this will cause errors; please adjust the representation.
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (1)
181-183: Make concurrency configurable via env

Allow tuning without code changes; keep 5 as default.
-  const MAX_CONCURRENT = 5
+  const MAX_CONCURRENT = Number(process.env.SCHEMABENCH_MAX_CONCURRENCY) || 5
frontend/internal-packages/schema-bench/package.json (1)

40-40: Optional: drop unused “main” or add exports to avoid confusion

This package is CLI-oriented; main: "src/index.ts" may be misleading. Consider removing it or adding proper exports if you intend runtime imports.
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (1)
17-25: Avoid nested inspector conflicts (strip --inspect from NODE_OPTIONS for the child)

Running tsx under an inspected parent can fail. Sanitize NODE_OPTIONS before spawn.
-  const child = spawn(
+  const env = { ...process.env }
+  if (env.NODE_OPTIONS && /--inspect/.test(env.NODE_OPTIONS)) {
+    env.NODE_OPTIONS = env.NODE_OPTIONS.replace(/--inspect(-brk)?(=[^\s]+)?/g, '').trim() || undefined
+  }
+  const child = spawn(
     process.execPath,
     [tsxCli, entryTs, ...process.argv.slice(2)],
     {
       stdio: 'inherit',
-      env: process.env,
+      env,
       cwd: process.cwd(),
     },
   )
frontend/internal-packages/schema-bench/src/cli/index.ts (2)
24-34: Propagate sanitized NODE_OPTIONS to sub-scripts to prevent nested inspector issues

Mirror the shim’s sanitization here to cover direct script execution.
-const runTsCli = (tsFile: string, args: string[]): Promise<number> => {
+const runTsCli = (tsFile: string, args: string[]): Promise<number> => {
   return new Promise((resolve) => {
-    const child = spawn(process.execPath, [tsxPath(), tsFile, ...args], {
+    const env = { ...process.env }
+    if (env.NODE_OPTIONS && /--inspect/.test(env.NODE_OPTIONS)) {
+      env.NODE_OPTIONS = env.NODE_OPTIONS.replace(/--inspect(-brk)?(=[^\s]+)?/g, '').trim() || undefined
+    }
+    const child = spawn(process.execPath, [tsxPath(), tsFile, ...args], {
       stdio: 'inherit',
-      env: process.env,
+      env,
       cwd: process.cwd(),
     })
108-121: Helpful error output for unknown executor

Consider echoing the received value to speed up user troubleshooting.
-    if (!target) {
-      console.error('Unknown executor. Use one of: liam | openai')
+    if (!target) {
+      console.error(`Unknown executor: ${String(executor)}. Use one of: liam | openai`)
       process.exit(1)
     }

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b76a7ea and 5cdf3a5.

📒 Files selected for processing (8)

frontend/internal-packages/schema-bench/README.md (2 hunks)
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json (1 hunks)
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json (1 hunks)
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json (1 hunks)
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (1 hunks)
frontend/internal-packages/schema-bench/package.json (2 hunks)
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (7 hunks)
frontend/internal-packages/schema-bench/src/cli/index.ts (1 hunks)

🧰 Additional context used

📓 Path-based instructions (4)

frontend/internal-packages/**

📄 CodeRabbit inference engine (AGENTS.md)

Infra and tooling (e2e, configs, storybook, agent) live under frontend/internal-packages

Files:

frontend/internal-packages/schema-bench/bin/schema-bench.mjs
frontend/internal-packages/schema-bench/package.json
frontend/internal-packages/schema-bench/src/cli/index.ts
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
frontend/internal-packages/schema-bench/README.md

**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Use runtime type validation with valibot for external data validation
Use early returns for readability
Use named exports only (no default exports)
Follow existing import patterns and tsconfig path aliases
Prefer const arrow functions for simple utilities (e.g., const toggle = () => {})

Use TypeScript/TSX across the codebase

Files:

frontend/internal-packages/schema-bench/src/cli/index.ts
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts

frontend/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

frontend/**/*.{ts,tsx}: Import UI components from @liam-hq/ui when available
Import icons from @liam-hq/ui

Files:

frontend/internal-packages/schema-bench/src/cli/index.ts
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts

**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

Name utility files in camelCase (e.g., mergeSchema.ts)

Files:

frontend/internal-packages/schema-bench/src/cli/index.ts
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts

🧠 Learnings (2)

📓 Common learnings

Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.

📚 Learning: 2025-07-30T05:52:56.270Z

Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.

Applied to files:

frontend/internal-packages/schema-bench/bin/schema-bench.mjs
frontend/internal-packages/schema-bench/package.json
frontend/internal-packages/schema-bench/src/cli/index.ts
frontend/internal-packages/schema-bench/README.md

🧬 Code graph analysis (2)

frontend/internal-packages/schema-bench/src/cli/index.ts (1)

frontend/internal-packages/schema-bench/bin/schema-bench.mjs (4)

require (9-9)

__filename (13-13)

__dirname (14-14)

child (17-25)

frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (2)

frontend/internal-packages/schema-bench/src/cli/utils/workspace.ts (1)

getWorkspaceSubPath (21-23)

frontend/internal-packages/schema-bench/bin/schema-bench.mjs (2)

__filename (13-13)

__dirname (14-14)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: frontend-lint
GitHub Check: frontend-ci
GitHub Check: security-review
GitHub Check: Supabase Preview

🔇 Additional comments (3)

frontend/internal-packages/schema-bench/package.json (1)

6-8: Bin mapping looks good

The schema-bench binary correctly points to the shim.

frontend/internal-packages/schema-bench/README.md (2)

22-31: Docs align with unified CLI—nice

Clear examples for Liam/OpenAI with pass-through flags.

46-51: Good: unified evaluate path with legacy fallback

This matches the new CLI entry while preserving old scripts.

coderabbitai · 2025-09-12T09:56:16Z

...l-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json

+        "foreignKeys':[{": ":",
+        "name": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_employee",
+          "columnNames": [
+            "employee_code"
+          ],
+          "targetTableName": "employee",
+          "targetColumnNames": [
+            "employee_code"
+          ],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "CASCADE"
+        },
+        "targetTableName": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_org",
+          "columnNames": [
+            "org_code"
+          ],
+          "targetTableName": "organization",
+          "targetColumnNames": [
+            "org_code"
+          ],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "CASCADE"
+        },
+        "targetColumnNames": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_position",
+          "columnNames": [
+            "position_code"
+          ],
+          "targetTableName": "position",
+          "targetColumnNames": [
+            "position_code"
+          ],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "SET_NULL"
+        }
+      }


⚠️ Potential issue

Broken JSON and malformed constraints block in employee_org_membership.constraints

This section is syntactically invalid JSON and mixes object keys/values incorrectly. It will fail to parse and breaks downstream tooling.

Replace the malformed block with a valid, consistent constraints map (mirroring the convention used elsewhere), and fold the “reports_to” rules here (so you can drop the separate “checks” table below).

"primaryKey": { "type": "PRIMARY KEY", "name": "pk_emp_org_membership", "columnNames": [ "employee_code", "org_code" ] }, - "foreignKeys':[{": ":", - "name": { - "type": "FOREIGN KEY", - "name": "fk_eom_employee", - "columnNames": [ - "employee_code" - ], - "targetTableName": "employee", - "targetColumnNames": [ - "employee_code" - ], - "updateConstraint": "NO_ACTION", - "deleteConstraint": "CASCADE" - }, - "targetTableName": { - "type": "FOREIGN KEY", - "name": "fk_eom_org", - "columnNames": [ - "org_code" - ], - "targetTableName": "organization", - "targetColumnNames": [ - "org_code" - ], - "updateConstraint": "NO_ACTION", - "deleteConstraint": "CASCADE" - }, - "targetColumnNames": { - "type": "FOREIGN KEY", - "name": "fk_eom_position", - "columnNames": [ - "position_code" - ], - "targetTableName": "position", - "targetColumnNames": [ - "position_code" - ], - "updateConstraint": "NO_ACTION", - "deleteConstraint": "SET_NULL" - } + "fk_eom_employee": { + "type": "FOREIGN KEY", + "name": "fk_eom_employee", + "columnNames": ["employee_code"], + "targetTableName": "employee", + "targetColumnNames": ["employee_code"], + "updateConstraint": "NO_ACTION", + "deleteConstraint": "CASCADE" + }, + "fk_eom_org": { + "type": "FOREIGN KEY", + "name": "fk_eom_org", + "columnNames": ["org_code"], + "targetTableName": "organization", + "targetColumnNames": ["org_code"], + "updateConstraint": "NO_ACTION", + "deleteConstraint": "CASCADE" + }, + "fk_eom_position": { + "type": "FOREIGN KEY", + "name": "fk_eom_position", + "columnNames": ["position_code"], + "targetTableName": "position", + "targetColumnNames": ["position_code"], + "updateConstraint": "NO_ACTION", + "deleteConstraint": "SET_NULL" + }, + "ck_eom_reports_to_not_self": { + "type": "CHECK", + "name": "ck_eom_reports_to_not_self", + "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code" + }, + "fk_eom_reports_to_same_org": { + "type": "FOREIGN KEY", + "name": "fk_eom_reports_to_same_org", + "columnNames": ["reports_to_employee_code", "org_code"], + "targetTableName": "employee_org_membership", + "targetColumnNames": ["employee_code", "org_code"], + "updateConstraint": "NO_ACTION", + "deleteConstraint": "SET_NULL" + } }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"foreignKeys':[{": ":",

"name": {

"type": "FOREIGN KEY",

"name": "fk_eom_employee",

"columnNames": [

"employee_code"

],

"targetTableName": "employee",

"targetColumnNames": [

"employee_code"

],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "CASCADE"

},

"targetTableName": {

"type": "FOREIGN KEY",

"name": "fk_eom_org",

"columnNames": [

"org_code"

],

"targetTableName": "organization",

"targetColumnNames": [

"org_code"

],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "CASCADE"

},

"targetColumnNames": {

"type": "FOREIGN KEY",

"name": "fk_eom_position",

"columnNames": [

"position_code"

],

"targetTableName": "position",

"targetColumnNames": [

"position_code"

],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "SET_NULL"

}

}

"primaryKey": {

"type": "PRIMARY KEY",

"name": "pk_emp_org_membership",

"columnNames": [

"employee_code",

"org_code"

]

},

"fk_eom_employee": {

"type": "FOREIGN KEY",

"name": "fk_eom_employee",

"columnNames": ["employee_code"],

"targetTableName": "employee",

"targetColumnNames": ["employee_code"],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "CASCADE"

},

"fk_eom_org": {

"type": "FOREIGN KEY",

"name": "fk_eom_org",

"columnNames": ["org_code"],

"targetTableName": "organization",

"targetColumnNames": ["org_code"],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "CASCADE"

},

"fk_eom_position": {

"type": "FOREIGN KEY",

"name": "fk_eom_position",

"columnNames": ["position_code"],

"targetTableName": "position",

"targetColumnNames": ["position_code"],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "SET_NULL"

},

"ck_eom_reports_to_not_self": {

"type": "CHECK",

"name": "ck_eom_reports_to_not_self",

"detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"

},

"fk_eom_reports_to_same_org": {

"type": "FOREIGN KEY",

"name": "fk_eom_reports_to_same_org",

"columnNames": ["reports_to_employee_code", "org_code"],

"targetTableName": "employee_org_membership",

"targetColumnNames": ["employee_code", "org_code"],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "SET_NULL"

}

coderabbitai · 2025-09-12T09:56:16Z

...l-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json

+    "checks": {
+      "name": "ck_eom_reports_to_not_self",
+      "columns": {},
+      "comment": null,
+      "indexes": {},
+      "constraints": {
+        "type": {
+          "type": "CHECK",
+          "name": "ck_eom_reports_to_not_self",
+          "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
+        },
+        "type2": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_reports_to_same_org",
+          "columnNames": [
+            "reports_to_employee_code",
+            "org_code"
+          ],
+          "targetTableName": "employee_org_membership",
+          "targetColumnNames": [
+            "employee_code",
+            "org_code"
+          ],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "SET_NULL"
+        }
+      }
+    }


🛠️ Refactor suggestion

Remove the synthetic “checks” table; these belong as constraints on employee_org_membership

The “checks” pseudo-table is not a real table and violates the schema shape described in README. After folding the check/FK above, delete this block.

- }, - "checks": { - "name": "ck_eom_reports_to_not_self", - "columns": {}, - "comment": null, - "indexes": {}, - "constraints": { - "type": { - "type": "CHECK", - "name": "ck_eom_reports_to_not_self", - "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code" - }, - "type2": { - "type": "FOREIGN KEY", - "name": "fk_eom_reports_to_same_org", - "columnNames": [ - "reports_to_employee_code", - "org_code" - ], - "targetTableName": "employee_org_membership", - "targetColumnNames": [ - "employee_code", - "org_code" - ], - "updateConstraint": "NO_ACTION", - "deleteConstraint": "SET_NULL" - } - } - } + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"checks": {

"name": "ck_eom_reports_to_not_self",

"columns": {},

"comment": null,

"indexes": {},

"constraints": {

"type": {

"type": "CHECK",

"name": "ck_eom_reports_to_not_self",

"detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"

},

"type2": {

"type": "FOREIGN KEY",

"name": "fk_eom_reports_to_same_org",

"columnNames": [

"reports_to_employee_code",

"org_code"

],

"targetTableName": "employee_org_membership",

"targetColumnNames": [

"employee_code",

"org_code"

],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "SET_NULL"

}

}

}

}

🤖 Prompt for AI Agents

frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json lines 228-255: the "checks" pseudo-table block should be removed and its constraints folded into the real employee_org_membership table; delete the entire "checks" object and move the CHECK constraint (ck_eom_reports_to_not_self) and the FOREIGN KEY constraint (fk_eom_reports_to_same_org) into the "constraints" section of the employee_org_membership table entry, ensuring the CHECK uses the same detail text and the FK references targetTableName/targetColumnNames and update/delete actions are preserved.

coderabbitai · 2025-09-12T09:56:16Z

...nd/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json

+        "foreign_keys": {
+          "type": "FOREIGN KEY",
+          "name": "",
+          "columnNames": [
+            ""
+          ],
+          "targetTableName": "",
+          "targetColumnNames": [
+            ""
+          ],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "NO_ACTION"
+        },


⚠️ Potential issue

Remove placeholder FOREIGN KEY with empty identifiers in position.constraints

This FK has empty name, columnNames, targetTableName, and targetColumnNames, which is invalid and may break consumers.

"primary_key": { "type": "PRIMARY KEY", "name": "pk_position", "columnNames": [ "position_code" ] }, - "foreign_keys": { - "type": "FOREIGN KEY", - "name": "", - "columnNames": [ - "" - ], - "targetTableName": "", - "targetColumnNames": [ - "" - ], - "updateConstraint": "NO_ACTION", - "deleteConstraint": "NO_ACTION" - }, "uniques": { "type": "UNIQUE", "name": "uq_position_name", "columnNames": [ "position_name" ] },

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

"foreign_keys": {

"type": "FOREIGN KEY",

"name": "",

"columnNames": [

""

],

"targetTableName": "",

"targetColumnNames": [

""

],

"updateConstraint": "NO_ACTION",

"deleteConstraint": "NO_ACTION"

},

"primary_key": {

"type": "PRIMARY KEY",

"name": "pk_position",

"columnNames": [

"position_code"

]

},

"uniques": {

"type": "UNIQUE",

"name": "uq_position_name",

"columnNames": [

"position_name"

]

},

🤖 Prompt for AI Agents

In frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json around lines 98–110, there is a placeholder FOREIGN KEY object with empty name, columnNames, targetTableName, and targetColumnNames which is invalid; remove this placeholder foreign_keys entry (or replace it with a valid FK object) so position.constraints does not contain entries with empty identifiers, and ensure the producing code validates FK objects and omits them when required fields are empty before writing the JSON.

coderabbitai · 2025-09-12T09:56:17Z

frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts

+const getExecutionBaseDir = (): string => {
+  // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset
+  const workspaceExecution = getWorkspaceSubPath('execution')
+  if (existsSync(workspaceExecution)) return workspaceExecution
+  const __filename = fileURLToPath(import.meta.url)
+  const __dirname = dirname(__filename)
+  return resolve(__dirname, '../../benchmark-workspace-default/execution')
+}


⚠️ Potential issue

Workspace detection likely misses actual workspace layout

Workspace structure in README nests datasets under benchmark-workspace/<dataset>/execution/.... Checking only <workspace>/execution will often fail and force the internal fallback.

Use a smarter probe (prefer default dataset if present, then legacy single-level):

-const getExecutionBaseDir = (): string => { - // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset - const workspaceExecution = getWorkspaceSubPath('execution') - if (existsSync(workspaceExecution)) return workspaceExecution - const __filename = fileURLToPath(import.meta.url) - const __dirname = dirname(__filename) - return resolve(__dirname, '../../benchmark-workspace-default/execution') -} +const getExecutionBaseDir = (): string => { + // Prefer workspace default dataset, fall back to legacy flat layout, then internal default dataset + const workspaceDefault = getWorkspaceSubPath('default', 'execution') + if (existsSync(workspaceDefault)) return workspaceDefault + const workspaceFlat = getWorkspaceSubPath('execution') + if (existsSync(workspaceFlat)) return workspaceFlat + const __filename = fileURLToPath(import.meta.url) + const __dirname = dirname(__filename) + return resolve(__dirname, '../../benchmark-workspace-default/execution') +}

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const getExecutionBaseDir = (): string => {

// Prefer top-level benchmark workspace execution dir, fallback to internal default dataset

const workspaceExecution = getWorkspaceSubPath('execution')

if (existsSync(workspaceExecution)) return workspaceExecution

const __filename = fileURLToPath(import.meta.url)

const __dirname = dirname(__filename)

return resolve(__dirname, '../../benchmark-workspace-default/execution')

}

const getExecutionBaseDir = (): string => {

// Prefer workspace default dataset, fall back to legacy flat layout, then internal default dataset

const workspaceDefault = getWorkspaceSubPath('default', 'execution')

if (existsSync(workspaceDefault)) return workspaceDefault

const workspaceFlat = getWorkspaceSubPath('execution')

if (existsSync(workspaceFlat)) return workspaceFlat

const __filename = fileURLToPath(import.meta.url)

const __dirname = dirname(__filename)

return resolve(__dirname, '../../benchmark-workspace-default/execution')

}

🤖 Prompt for AI Agents

In frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts around lines 29-36, the current workspace probe only checks for <workspace>/execution and therefore misses layouts where datasets are nested under benchmark-workspace/<dataset>/execution; update getExecutionBaseDir to first resolve the top-level benchmark-workspace path, look for a subdirectory whose execution subpath exists (prefer a "default" dataset if present, otherwise pick the first matching dataset), return that dataset's execution path if found, then fall back to the legacy <workspace>/execution check, and finally fall back to the internal default execution path; implement this by listing entries under the workspace directory (or using a glob) and testing each candidate's execution subpath before returning.

- Remove benchmark output files from version control (should not be committed) - Fix NODE_OPTIONS debugger conflict in unified CLI spawn process - Update benchmark-execute command documentation with unified CLI examples - Improve CLI error handling and environment variable cleanup 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

- Switch to Node --import tsx for better TypeScript execution compatibility - Remove createRequire dependency for cleaner ESM implementation - Update binary script to use consistent tsx execution pattern - Enhanced benchmark-execute documentation with unified CLI examples - Improved NODE_OPTIONS handling to prevent debugger conflicts This ensures the unified CLI works reliably across different environments while maintaining full backward compatibility with existing scripts. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

frontend/internal-packages/schema-bench/bin/schema-bench.mjs (2)
21-29: Small DX polish: inherit stdio already; add windowsHide false (optional).

Not required, but adding windowsHide: false prevents stray console windows on Windows.
   const child = spawn(
     process.execPath,
     ['--import', 'tsx', entryTs, ...process.argv.slice(2)],
     {
       stdio: 'inherit',
       env,
       cwd: process.cwd(),
+      windowsHide: false,
     },
   )
9-10: Fail fast if tsx isn't resolvable; add Node engine or fallback

File: frontend/internal-packages/schema-bench/bin/schema-bench.mjs

frontend/internal-packages/schema-bench/package.json declares [email protected] and has no "engines" field.

Add a runtime check (fail-fast with a clear error) before spawning; apply the suggested diff:
-const _require = createRequire(import.meta.url)
+const _require = createRequire(import.meta.url)
@@
-  const child = spawn(
+  // Verify that 'tsx' is available; give a helpful error if not.
+  try {
+    _require.resolve('tsx')
+  } catch {
+    console.error('[schema-bench] Missing dependency: tsx\n' +
+      'Please add "tsx" to dependencies of @liam-hq/schema-bench (or ensure it is hoisted and resolvable).')
+    process.exit(1)
+  }
+
+  const child = spawn(
     process.execPath,
     ['--import', 'tsx', entryTs, ...process.argv.slice(2)],
Because "engines" is unset, either add "engines.node": ">=20.6.0" to package.json to require Node versions that support --import, or implement a fallback path using --loader tsx for older Node releases.
.claude/commands/benchmark-execute.md (1)
81-82: Add hint about workspace setup for OpenAI (optional).

OpenAI can run against the default dataset or a prepared workspace. A short note helps first-time users who skipped setupWorkspace.
-Note: For OpenAI execution, ensure `OPENAI_API_KEY` is set in your environment.
+Note: For OpenAI execution, ensure `OPENAI_API_KEY` is set. If running against the internal datasets, run the workspace setup first:
+`pnpm -F @liam-hq/schema-bench setupWorkspace`

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5cdf3a5 and f03cedc.

📒 Files selected for processing (4)

.claude/commands/benchmark-execute.md (2 hunks)
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (1 hunks)
frontend/internal-packages/schema-bench/package.json (2 hunks)
frontend/internal-packages/schema-bench/src/cli/index.ts (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (2)

frontend/internal-packages/schema-bench/package.json
frontend/internal-packages/schema-bench/src/cli/index.ts

🧰 Additional context used

📓 Path-based instructions (1)

frontend/internal-packages/**

📄 CodeRabbit inference engine (AGENTS.md)

Infra and tooling (e2e, configs, storybook, agent) live under frontend/internal-packages

Files:

frontend/internal-packages/schema-bench/bin/schema-bench.mjs

🧠 Learnings (2)

📓 Common learnings

Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.

📚 Learning: 2025-07-30T05:52:56.270Z

Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.

Applied to files:

frontend/internal-packages/schema-bench/bin/schema-bench.mjs

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: Supabase Preview
GitHub Check: Supabase Preview

🔇 Additional comments (3)

.claude/commands/benchmark-execute.md (3)
41-52: LGTM: Unified CLI examples for Liam executor are clear and backward-compatible.

The pass-through dataset flags (-all, -entity-extraction, -ambiguous-recall) align with the wrapper’s “forward unknown args” approach.

If helpful, add one line showing --help to discover flags:
pnpm -F @liam-hq/schema-bench schema-bench execute --executor liam --help
61-66: LGTM: OpenAI unified CLI and legacy command are correctly documented.

Clear that unified defaults to the default dataset; legacy remains intact.

74-79: LGTM: Evaluation flow mirrors the unified/legacy pattern consistently.

Examples are concise and correct.

coderabbitai · 2025-09-12T11:32:52Z

frontend/internal-packages/schema-bench/bin/schema-bench.mjs

+  const env = { ...process.env }
+  if (env.NODE_OPTIONS?.includes('--inspect')) {
+    delete env.NODE_OPTIONS
+  }


🛠️ Refactor suggestion

Filter out only inspector flags from NODE_OPTIONS (don’t drop unrelated flags).

Deleting NODE_OPTIONS wholesale can discard non-inspector flags (e.g., memory limits). Strip only --inspect/--inspect-brk.

- const env = { ...process.env } - if (env.NODE_OPTIONS?.includes('--inspect')) { - delete env.NODE_OPTIONS - } + const env = { ...process.env } + if (env.NODE_OPTIONS) { + const filtered = env.NODE_OPTIONS + .split(/\s+/) + .filter((opt) => !/^--inspect(?:-brk)?(?:=|$)/.test(opt)) + .join(' ') + if (filtered) env.NODE_OPTIONS = filtered + else delete env.NODE_OPTIONS + }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

const env = { ...process.env }

if (env.NODE_OPTIONS?.includes('--inspect')) {

delete env.NODE_OPTIONS

}

const env = { ...process.env }

if (env.NODE_OPTIONS) {

const filtered = env.NODE_OPTIONS

.split(/\s+/)

.filter((opt) => !/^--inspect(?:-brk)?(?:=|$)/.test(opt))

.join(' ')

if (filtered) env.NODE_OPTIONS = filtered

else delete env.NODE_OPTIONS

}

🤖 Prompt for AI Agents

In frontend/internal-packages/schema-bench/bin/schema-bench.mjs around lines 17 to 20, instead of deleting NODE_OPTIONS entirely, parse env.NODE_OPTIONS into space-separated tokens, filter out only inspector-related flags (--inspect, --inspect-brk and their =host:port variants), and rejoin the remaining tokens; if none remain delete env.NODE_OPTIONS, otherwise set it to the filtered string so non-inspector flags (e.g., memory limits) are preserved.

coderabbitai · 2025-09-12T11:32:52Z

frontend/internal-packages/schema-bench/bin/schema-bench.mjs

+  child.on('exit', (code) => process.exit(code ?? 0))
+  child.on('error', () => process.exit(1))
+}


🛠️ Refactor suggestion

Propagate child signals and preserve true exit semantics.

To fully “preserve exit-code semantics,” forward signals (SIGINT, SIGTERM, SIGHUP) and re-emit them when the child exits via signal.

- child.on('exit', (code) => process.exit(code ?? 0)) - child.on('error', () => process.exit(1)) + // Forward common signals to child so Ctrl+C etc. behave correctly + for (const sig of ['SIGINT', 'SIGTERM', 'SIGHUP']) { + process.on(sig, () => { + try { child.kill(sig as NodeJS.Signals) } catch {} + }) + } + child.on('exit', (code, signal) => { + if (signal) { + // Re-emit same signal to preserve semantics + try { process.kill(process.pid, signal as NodeJS.Signals) } catch {} + } else { + process.exit(code ?? 0) + } + }) + child.on('error', (err) => { + console.error('[schema-bench] Failed to start TS CLI:', err) + process.exit(1) + })

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents

In frontend/internal-packages/schema-bench/bin/schema-bench.mjs around lines 31–33, the child process exit handlers don't forward incoming signals or re-emit child exit signals to the parent; add listeners for SIGINT, SIGTERM, and SIGHUP that forward the same signal to the child (child.kill(signal)), and update the child.on('exit', ...) handler so that if the child exited due to a signal you re-emit that signal on the parent (e.g. process.kill(process.pid, signal)) instead of exiting with a code, otherwise call process.exit(code ?? 0); keep the existing child.on('error', ...) behavior.

- Added CLI entry point to knip ignore list with comment explaining why - CLI entry is launched via bin shim and knip cannot detect it automatically 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>

NoritakaIkeda · 2025-09-12T12:40:50Z

.claude/commands/benchmark-execute.md

 - **Input Standardization**: Entity-extraction inputs are automatically wrapped in `{"input": "..."}` format

-Next, I'll execute the specified model with dataset selection:
+Next, I'll execute the specified model with dataset selection. The unified CLI is recommended; legacy scripts remain available and unchanged:


I’m making these changes so the commands can also be executed via the CLI, which I think is simpler.
However, removing the existing commands right away would create a larger-than-expected impact, so in the short term I’ll keep the old commands as well.
Once the CLI commands are running stably, the legacy commands will be removed.

NoritakaIkeda · 2025-09-12T12:52:34Z

frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts

+const getExecutionBaseDir = (): string => {
+  // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset
+  const workspaceExecution = getWorkspaceSubPath('execution')
+  if (existsSync(workspaceExecution)) return workspaceExecution
+  const __filename = fileURLToPath(import.meta.url)
+  const __dirname = dirname(__filename)
+  return resolve(__dirname, '../../benchmark-workspace-default/execution')
+}


I fixed the outdated directory specification. This isn't a core change in this PR.

NoritakaIkeda · 2025-09-12T12:55:19Z

frontend/internal-packages/schema-bench/src/cli/index.ts

frontend/internal-packages/schema-bench/src/cli/index.ts is responsible for parsing the command-line options and invoking the appropriate function(s) to execute the CLI logic.

NoritakaIkeda · 2025-09-12T12:56:26Z

knip.jsonc

-    "frontend/apps/app/features/public-share/actions.ts"
+    "frontend/apps/app/features/public-share/actions.ts",
+    // - CLI entry is launched via bin shim; knip can't detect it
+    "frontend/internal-packages/schema-bench/src/cli/index.ts"


I’m making these changes so the commands can also be executed via the CLI, which I think is simpler.
However, removing the existing commands right away would create a larger-than-expected impact, so in the short term I’ll keep the old commands as well.

Once the CLI commands are running stably, the legacy commands will be removed.

This is a temporary measure due to the impact of this change. It will be removed during refactoring.

Copilot

Pull Request Overview

This PR implements MVP1 of a unified CLI for schema-bench, providing a single entry point that wraps existing TypeScript scripts while maintaining full backward compatibility. The unified CLI supports execution, evaluation, and comparison operations with consistent interface across different executors.

Introduces a new schema-bench command with subcommands for execute (with executor selection), evaluate, and compare (placeholder)
Implements child process wrapper pattern that forwards arguments to existing scripts unchanged
Enhanced OpenAI executor with flexible workspace detection and improved error handling

Reviewed Changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
src/cli/index.ts	New unified CLI entry point with command parsing and child process execution
src/cli/executeOpenai.ts	Enhanced with flexible workspace detection and base directory parameter threading
package.json	Adds CLI binary entry point and npm script
README.md	Updated documentation showing both unified CLI and legacy script usage
knip.jsonc	Excludes CLI entry from unused code detection
.claude/commands/benchmark-execute.md	Updated command examples with unified CLI recommendations

_{Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.}

Copilot · 2025-09-12T12:59:22Z

frontend/internal-packages/schema-bench/src/cli/index.ts

+
+const VERSION = '0.1.0'


Hardcoded version number creates maintenance burden. Consider reading from package.json to ensure consistency and avoid manual updates.

Suggested change

const VERSION = '0.1.0'

import fs from 'node:fs'

// Dynamically read version from package.json

const __filename = fileURLToPath(import.meta.url)

const __dirname = path.dirname(__filename)

// Adjust the path to package.json as needed (here: three levels up)

const pkgPath = path.resolve(__dirname, '../../../package.json')

let VERSION = 'unknown'

try {

const pkgRaw = fs.readFileSync(pkgPath, 'utf8')

const pkg = JSON.parse(pkgRaw)

VERSION = pkg.version || VERSION

} catch (e) {

// fallback: VERSION remains 'unknown'

}

Copilot · 2025-09-12T12:59:23Z

frontend/internal-packages/schema-bench/src/cli/index.ts

+  for (const tok of args) {
+    if (expectValue) {
+      if (!tok.startsWith('-')) {
+        executor = tok
+      } else {
+        pass.push(tok)
+      }
+      expectValue = false
+      continue
+    }


When expectValue is true but tok starts with '-', the code adds tok to pass but doesn't handle the missing executor value. This could silently ignore missing values after --executor flag.

Copilot · 2025-09-12T12:59:23Z

frontend/internal-packages/schema-bench/src/cli/index.ts

+    const map: Record<string, string> = {
+      openai: resolveCli('./executeOpenai.ts'),
+      liam: resolveCli('./executeLiamDbUnified.ts'),
+    }
+    const target = map[String(executor).toLowerCase()]


The executor mapping logic is hardcoded and repeated validation occurs. Consider extracting valid executors to a constant array for reuse in validation messages.

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

knip.jsonc (1)
33-35: Declare schema-bench CLI as a knip workspace entry (remove from ignore)

frontend/internal-packages/schema-bench contains CLI sources (src/cli/**/*) and a bin in package.json—remove the CLI from knip ignores and add a workspace entry so knip can traverse the CLI and spawned TS scripts.
-    "frontend/apps/app/features/public-share/actions.ts",
-    // - CLI entry is launched via bin shim; knip can't detect it
-    "frontend/internal-packages/schema-bench/src/cli/index.ts"
+    "frontend/apps/app/features/public-share/actions.ts"
Add this workspace config (in your workspaces section):
{
  "workspaces": {
    "frontend/internal-packages/schema-bench": {
      "entry": [
        "src/cli/index.ts",
        "src/cli/**/*.ts"
      ]
    }
  }
}

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f03cedc and 3c8fad2.

📒 Files selected for processing (1)

knip.jsonc (1 hunks)

🧰 Additional context used

🧠 Learnings (2)

📓 Common learnings

Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.

📚 Learning: 2025-08-25T01:34:20.255Z

Learnt from: CR
PR: liam-hq/liam#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-25T01:34:20.255Z
Learning: Applies to frontend/apps/@(app|docs)/app/**/*.{ts,tsx} : Prefer Server Components for server-side data fetching

Applied to files:

knip.jsonc

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)

GitHub Check: Supabase Preview
GitHub Check: frontend-ci
GitHub Check: frontend-lint
GitHub Check: Supabase Preview

MH4GF · 2025-09-12T13:02:31Z

Thank you for working late at night, I'll check it on Monday!

MH4GF

@NoritakaIkeda
I couldn't understand why this change was necessary even after reading the description.
Why should they be unified? Is it impossible to configure the reasoning effort without unifying them?

It also seemed odd that the mjs script was calling the tx process. Since we're using tsx, building shouldn't be necessary in the first place.

hoshinotsuyoshi · 2025-09-16T07:43:49Z

.claude/commands/benchmark-execute.md

+# Unified CLI (recommended)
+pnpm -F @liam-hq/schema-bench schema-bench evaluate


(👀 I'm testing now)

prepare

rm -rf benchmark-workspace pnpm --filter @liam-hq/schema-bench setupWorkspace

execute(openai)

# success pnpm -F @liam-hq/schema-bench schema-bench execute --executor openai

evaluation

Actually, an error occurred, but it seems to be a schema discrepancy error. The directory seems to be fine. 👍

pnpm -F @liam-hq/schema-bench schema-bench > @liam-hq/[email protected] schema-bench /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/frontend/internal-packages/schema-bench > node bin/schema-bench.mjs evaluate ⚠️ Failed to load 3 file(s): - case-001-difficult.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001-difficult.json: Invalid schema format: Invalid key: Expected "enums" but received undefined - case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "enums" but received undefined - case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "enums" but received undefined ❌ Failed to evaluate dataset: { "type": "JSON_PARSE_ERROR", "path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001-difficult.json", "cause": "Invalid schema format: Invalid key: Expected \"enums\" but received undefined" } ⚠️ Failed to load 5 file(s): - case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "comment" but received undefined - case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "comment" but received undefined - case-003.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-003.json: Invalid schema format: Invalid key: Expected "comment" but received undefined - case-004.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-004.json: Invalid schema format: Invalid key: Expected "comment" but received undefined - case-005.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-005.json: Invalid schema format: Invalid key: Expected "comment" but received undefined ❌ Failed to evaluate dataset: { "type": "JSON_PARSE_ERROR", "path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-001.json", "cause": "Invalid schema format: Invalid key: Expected \"comment\" but received undefined" } ⚠️ Failed to load 3 file(s): - case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "comment" but received undefined - case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "comment" but received undefined - case-003.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-003.json: Invalid schema format: Invalid key: Expected "comment" but received undefined ❌ Failed to evaluate dataset: { "type": "JSON_PARSE_ERROR", "path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-001.json", "cause": "Invalid schema format: Invalid key: Expected \"comment\" but received undefined" } ❌ Evaluation failed for 3 dataset(s) /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/frontend/internal-packages/schema-bench: ERR_PNPM_RECURSIVE_RUN_FIRST_FAIL @liam-hq/[email protected] schema-bench: `node bin/schema-bench.mjs evaluate` Exit status 1

hoshinotsuyoshi · 2025-09-16T07:52:34Z

.claude/commands/benchmark-execute.md

nits

Actually, since .claude/commands/benchmark-execute.md is a command file specifically for Claude, I don't think we need to include the legacy usage patterns. Unless there's some benefit to keeping them, what do you think?

MH4GF · 2025-10-06T08:46:05Z

@NoritakaIkeda Is it okay to close this PR?

NoritakaIkeda · 2025-10-06T08:47:12Z

@MH4GF
Sorry, but I'm closing this PR since the policy has already changed. 🙏

NoritakaIkeda and others added 2 commits September 12, 2025 18:44

Merge remote-tracking branch 'origin/main'

5cdf3a5

vercel bot deployed to Preview – liam-storybook September 12, 2025 09:45 View deployment

github-actions bot assigned NoritakaIkeda Sep 12, 2025

vercel bot deployed to Preview – liam-erd-sample September 12, 2025 09:45 View deployment

vercel bot deployed to Preview – liam-assets September 12, 2025 09:45 View deployment

vercel bot deployed to Preview – liam-app September 12, 2025 09:50 View deployment

coderabbitai bot reviewed Sep 12, 2025

View reviewed changes

vercel bot temporarily deployed to Preview – liam-erd-sample September 12, 2025 11:06 Inactive

vercel bot deployed to Preview – liam-assets September 12, 2025 11:06 View deployment

vercel bot deployed to Preview – liam-storybook September 12, 2025 11:06 View deployment

vercel bot deployed to Preview – liam-app September 12, 2025 11:10 View deployment

vercel bot deployed to Preview – liam-storybook September 12, 2025 11:11 View deployment

vercel bot deployed to Preview – liam-assets September 12, 2025 11:11 View deployment

vercel bot deployed to Preview – liam-erd-sample September 12, 2025 11:11 View deployment

vercel bot deployed to Preview – liam-app September 12, 2025 11:15 View deployment

coderabbitai bot reviewed Sep 12, 2025

View reviewed changes

vercel bot deployed to Preview – liam-storybook September 12, 2025 12:54 View deployment

vercel bot deployed to Preview – liam-assets September 12, 2025 12:54 View deployment

vercel bot deployed to Preview – liam-erd-sample September 12, 2025 12:54 View deployment

NoritakaIkeda commented Sep 12, 2025

View reviewed changes

vercel bot deployed to Preview – liam-app September 12, 2025 12:57 View deployment

NoritakaIkeda marked this pull request as ready for review September 12, 2025 12:58

NoritakaIkeda requested a review from a team as a code owner September 12, 2025 12:58

NoritakaIkeda requested review from FunamaYukina, MH4GF, Copilot, hoshinotsuyoshi and junkisai and removed request for a team September 12, 2025 12:58

Copilot AI reviewed Sep 12, 2025

View reviewed changes

coderabbitai bot reviewed Sep 12, 2025

View reviewed changes

MH4GF reviewed Sep 16, 2025

View reviewed changes

hoshinotsuyoshi reviewed Sep 16, 2025

View reviewed changes

NoritakaIkeda closed this Oct 6, 2025

-  const env = { ...process.env }
-  if (env.NODE_OPTIONS?.includes('--inspect')) {
-    delete env.NODE_OPTIONS
-  }
+  const env = { ...process.env }
+  if (env.NODE_OPTIONS) {
+    const filtered = env.NODE_OPTIONS
+      .split(/\s+/)
+      .filter((opt) => !/^--inspect(?:-brk)?(?:=|$)/.test(opt))
+      .join(' ')
+    if (filtered) env.NODE_OPTIONS = filtered
+    else delete env.NODE_OPTIONS
+  }

-const VERSION = '0.1.0'
+import fs from 'node:fs'
+// Dynamically read version from package.json
+const __filename = fileURLToPath(import.meta.url)
+const __dirname = path.dirname(__filename)
+// Adjust the path to package.json as needed (here: three levels up)
+const pkgPath = path.resolve(__dirname, '../../../package.json')
+let VERSION = 'unknown'
+try {
+  const pkgRaw = fs.readFileSync(pkgPath, 'utf8')
+  const pkg = JSON.parse(pkgRaw)
+  VERSION = pkg.version || VERSION
+} catch (e) {
+  // fallback: VERSION remains 'unknown'
+}

		# Unified CLI (recommended)
		pnpm -F @liam-hq/schema-bench schema-bench evaluate

✨(schema-bench): Add unified CLI with MVP1 wrapper implementation #3461

✨(schema-bench): Add unified CLI with MVP1 wrapper implementation #3461

Uh oh!

Conversation

NoritakaIkeda commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Issue

Why is this change needed?

Key Changes

Unified CLI Interface

Non-breaking Implementation

Enhanced OpenAI Executor

Documentation Updates

Technical Implementation

Acceptance Criteria

Testing

Summary by CodeRabbit

Uh oh!

vercel bot commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot bot commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai bot commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Other AI code review bot(s) detected

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related PRs

Suggested labels

Suggested reviewers

Poem

Review ran into problems

Pre-merge checks (3 passed)

Uh oh!

supabase bot commented Sep 12, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

NoritakaIkeda Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

NoritakaIkeda Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

NoritakaIkeda Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

NoritakaIkeda Sep 12, 2025

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

NoritakaIkeda commented Sep 12, 2025 •

edited

Loading

vercel bot commented Sep 12, 2025 •

edited

Loading

changeset-bot bot commented Sep 12, 2025 •

edited

Loading

coderabbitai bot commented Sep 12, 2025 •

edited

Loading

supabase bot commented Sep 12, 2025 •

edited

Loading

hoshinotsuyoshi Sep 16, 2025 •

edited

Loading