Skip to content

Conversation

@NoritakaIkeda
Copy link
Member

@NoritakaIkeda NoritakaIkeda commented Sep 12, 2025

Issue

  • resolve: route06/liam-internal#5652

Why is this change needed?

This implements MVP1 of the schema-bench unified CLI roadmap, providing a single entry point for execution, evaluation, and comparison operations while maintaining full backward compatibility with existing scripts.

The unified CLI addresses the need for a consistent interface across different executors (OpenAI, Liam) and operations, setting the foundation for future enhancements like model comparison and reasoning effort configuration.

This is preparation to enable configuration of reasoning effort from the command line.

Key Changes

Unified CLI Interface

  • New schema-bench command with subcommands:
    • execute --executor <liam|openai>: Unified execution interface
    • evaluate: Multi-dataset evaluation wrapper
    • compare: MVP1 placeholder for future implementation

Non-breaking Implementation

  • Child process wrapper pattern: Uses spawn() to call existing TypeScript scripts
  • Full argument pass-through: Unknown arguments forwarded to underlying scripts unchanged
  • Complete backward compatibility: All existing package.json scripts continue to work

Enhanced OpenAI Executor

  • Flexible workspace detection: Auto-detects workspace vs internal dataset directories
  • Improved error handling: Better messaging for missing workspace setup
  • Debugger conflict resolution: Prevents nested NODE_OPTIONS inspector issues

Documentation Updates

  • Dual interface documentation: Shows both unified CLI (recommended) and legacy scripts
  • Clear migration guidance: Explains transition from legacy to unified interface
  • Command examples: Comprehensive usage examples for all interfaces

Technical Implementation

The MVP1 implementation follows the "wrapper + pass-through" pattern to minimize risk:

  1. CLI Entry Point (src/cli/index.ts):

    • Uses commander.js pattern for argument parsing
    • Maps executors to existing script files
    • Handles exit code propagation correctly
  2. Child Process Execution:

    • Spawns existing scripts via tsx runtime
    • Inherits stdio for seamless user experience
    • Prevents debugger conflicts in nested execution
  3. Workspace Resolution:

    • Enhanced executeOpenai.ts with flexible base directory detection
    • Supports both workspace and internal dataset execution modes

Acceptance Criteria

Full Compatibility: All existing scripts work unchanged
Unified Interface: New CLI provides consistent interface across executors
Non-breaking: Legacy workflows continue without modification
Extensible: Foundation for future MVP2/3 enhancements

Testing

  • Verified unified CLI execution with both OpenAI and Liam executors
  • Confirmed legacy scripts remain functional
  • Tested argument pass-through behavior
  • Validated workspace detection logic

This sets the foundation for future enhancements while ensuring zero disruption to existing workflows.

Summary by CodeRabbit

  • New Features

    • Unified CLI: schema-bench subcommands for execute, evaluate, and a compare placeholder.
    • Package now exposes a public CLI entry so schema-bench can be run directly.
  • Improvements

    • Execution flows accept an explicit executor (liam|openai) and pass-through options.
    • OpenAI execution falls back to a bundled default dataset when no workspace dataset is present.
  • Documentation

    • Updated guides with unified CLI examples and note that legacy commands remain supported.
    • Added OpenAI environment note (OPENAI_API_KEY).

NoritakaIkeda and others added 2 commits September 12, 2025 18:44
Add unified schema-bench CLI with execute, evaluate, and compare subcommands:

- **Unified CLI Interface**: New `schema-bench` command with subcommands
  - `execute --executor <liam|openai>`: Wraps existing executor scripts
  - `evaluate`: Wraps evaluateSchemaMulti.ts for multi-dataset evaluation
  - `compare`: MVP1 placeholder with guidance message

- **Non-breaking Implementation**: All existing scripts remain unchanged
  - Legacy scripts continue to work via package.json scripts
  - New CLI uses child_process.spawn to call existing implementations
  - Full argument pass-through to underlying scripts

- **Enhanced OpenAI Executor**: Improved workspace detection
  - Auto-detects workspace vs internal dataset directories
  - Flexible execution base directory resolution
  - Better error handling for missing workspace setup

- **Updated Documentation**: README and command docs reflect both interfaces
  - Recommends unified CLI while preserving legacy documentation
  - Clear migration path from legacy to unified interface

This implements MVP1 from the schema-bench roadmap, providing a single entry point while maintaining full backward compatibility.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
@vercel
Copy link

vercel bot commented Sep 12, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
liam-app Ready Ready Preview Comment Sep 12, 2025 0:57am
liam-assets Ready Ready Preview Comment Sep 12, 2025 0:57am
liam-erd-sample Ready Ready Preview Comment Sep 12, 2025 0:57am
liam-storybook Ready Ready Preview Comment Sep 12, 2025 0:57am
1 Skipped Deployment
Project Deployment Preview Comments Updated (UTC)
liam-docs Ignored Ignored Preview Sep 12, 2025 0:57am

@changeset-bot
Copy link

changeset-bot bot commented Sep 12, 2025

⚠️ No Changeset found

Latest commit: 3c8fad2

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Sep 12, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds a unified schema-bench CLI with a Node ESM bin shim and TypeScript entrypoint, routes execute/evaluate/compare subcommands to TS scripts via tsx, introduces a package bin/script, refactors executeOpenai to use a computed baseExecutionDir, and updates docs while preserving legacy scripts.

Changes

Cohort / File(s) Summary
CLI entry + shim & packaging
frontend/internal-packages/schema-bench/src/cli/index.ts, frontend/internal-packages/schema-bench/bin/schema-bench.mjs, frontend/internal-packages/schema-bench/package.json
Adds a versioned top-level TypeScript CLI with subcommands (execute/evaluate/compare/help/version), a Node ESM shim to run the TS CLI via tsx, a runTsCli helper, --executor parsing, and registers schema-bench as a package bin and npm script.
Executor: OpenAI IO refactor
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
Centralizes baseExecutionDir resolution (workspace or internal default), refactors input/output helpers and executeCase to use base dir, preserves concurrency and OPENAI_API_KEY handling.
Docs & examples
frontend/internal-packages/schema-bench/README.md, .claude/commands/benchmark-execute.md
Rewrites docs to document the unified CLI examples for execute/evaluate, moves some OpenAI examples to legacy section, and updates notes (e.g., OPENAI_API_KEY).
Tooling ignore
knip.jsonc
Adds CLI entry to knip ignore list with explanatory comment.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor U as User
  participant Bin as bin/schema-bench.mjs
  participant CLI as src/cli/index.ts (TS CLI)
  participant TSX as tsx runner
  participant Exec as Executor script (execute*.ts)
  participant Eval as evaluateSchemaMulti.ts

  U->>Bin: schema-bench <subcommand> [--executor/-x args]
  Note right of Bin: resolve TS path, sanitize NODE_OPTIONS, spawn
  Bin->>CLI: node --import tsx index.ts [args]
  CLI->>CLI: parse subcommand & flags
  alt execute (with --executor liam|openai)
    CLI->>TSX: node --import tsx execute*.ts [passthrough args]
    TSX->>Exec: run executor (liam/openai)
    Note over Exec: openai uses computed baseExecutionDir for IO
    Exec-->>TSX: exit code
  else evaluate
    CLI->>TSX: node --import tsx evaluateSchemaMulti.ts [args]
    TSX->>Eval: run evaluation
    Eval-->>TSX: exit code
  else compare
    CLI-->>Bin: print placeholder, exit 0
  end
  TSX-->>CLI: propagate exit code
  CLI-->>Bin: propagate exit code
  Bin-->>U: process exit
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

Review effort 4/5

Suggested reviewers

  • hoshinotsuyoshi
  • FunamaYukina
  • junkisai

Poem

Hoppity hop, I ship a single name,
Shim wakes tsx, and nothing looks the same.
Executors hum, outputs tucked in rows,
Legacy paths linger where the old wind blows.
A rabbit cheers — unified CLI, here we go! 🐇✨

Warning

Review ran into problems

🔥 Problems

Errors were encountered while retrieving linked issues.

Errors (1)
  • Failed to retrieve linked issues from the platform client.

Pre-merge checks (3 passed)

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly and accurately summarizes the primary change — adding a unified CLI and an MVP1 wrapper for schema-bench — which aligns with the PR objectives and modified files. It includes the affected package scope (schema-bench) and the implementation stage (MVP1), making the intent clear to reviewers. The phrasing is specific and not generic, so it describes the main change rather than unrelated details. The leading emoji is cosmetic but does not render the title misleading.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.
Description Check ✅ Passed The PR description follows the repository template: it includes an Issue section with a resolve reference (route06/liam-internal#5652) and a populated "Why is this change needed?" section, and it also provides comprehensive Key Changes, Technical Implementation, Acceptance Criteria, and Testing details that make the intent and scope clear to reviewers. This exceeds the template's minimum requirements and supplies actionable context for review. Therefore the description is complete and clear.
✨ Finishing touches
  • 📝 Generate Docstrings
🧪 Generate unit tests
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/schema-bench-unified-cli-mvp1

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@supabase
Copy link

supabase bot commented Sep 12, 2025

Updates to Preview Branch (feat/schema-bench-unified-cli-mvp1) ↗︎

Deployments Status Updated
Database Fri, 12 Sep 2025 12:54:43 UTC
Services Fri, 12 Sep 2025 12:54:43 UTC
APIs Fri, 12 Sep 2025 12:54:43 UTC

Tasks are run on every commit but only new migration files are pushed.
Close and reopen this PR if you want to apply changes from existing seed or migration files.

Tasks Status Updated
Configurations Fri, 12 Sep 2025 12:54:46 UTC
Migrations Fri, 12 Sep 2025 12:54:46 UTC
Seeding Fri, 12 Sep 2025 12:54:46 UTC
Edge Functions Fri, 12 Sep 2025 12:54:47 UTC

View logs for this Workflow Run ↗︎.
Learn more about Supabase for Git ↗︎.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (1)

45-50: CLI hint typo: script name is setupWorkspace

Message recommends setup-workspace (hyphenated) which doesn’t exist; scripts define setupWorkspace.

-        `Input directory not found: ${inputDir}. Please run setup-workspace first.`,
+        `Input directory not found: ${inputDir}. Please run setupWorkspace first.`,
🧹 Nitpick comments (6)
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json (1)

472-480: Unique constraint references a non-existent column

uq_cps_active_one_per_plan includes cancellation_time_is_null_partial_index_placeholder, which is not a defined column. Either model partial uniqueness explicitly in your constraint format (e.g., with a where/filter field) or approximate with a CHECK + UNIQUE that only references existing columns.

If the evaluator requires valid column names, this will cause errors; please adjust the representation.

frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (1)

181-183: Make concurrency configurable via env

Allow tuning without code changes; keep 5 as default.

-  const MAX_CONCURRENT = 5
+  const MAX_CONCURRENT = Number(process.env.SCHEMABENCH_MAX_CONCURRENCY) || 5
frontend/internal-packages/schema-bench/package.json (1)

40-40: Optional: drop unused “main” or add exports to avoid confusion

This package is CLI-oriented; main: "src/index.ts" may be misleading. Consider removing it or adding proper exports if you intend runtime imports.

frontend/internal-packages/schema-bench/bin/schema-bench.mjs (1)

17-25: Avoid nested inspector conflicts (strip --inspect from NODE_OPTIONS for the child)

Running tsx under an inspected parent can fail. Sanitize NODE_OPTIONS before spawn.

-  const child = spawn(
+  const env = { ...process.env }
+  if (env.NODE_OPTIONS && /--inspect/.test(env.NODE_OPTIONS)) {
+    env.NODE_OPTIONS = env.NODE_OPTIONS.replace(/--inspect(-brk)?(=[^\s]+)?/g, '').trim() || undefined
+  }
+  const child = spawn(
     process.execPath,
     [tsxCli, entryTs, ...process.argv.slice(2)],
     {
       stdio: 'inherit',
-      env: process.env,
+      env,
       cwd: process.cwd(),
     },
   )
frontend/internal-packages/schema-bench/src/cli/index.ts (2)

24-34: Propagate sanitized NODE_OPTIONS to sub-scripts to prevent nested inspector issues

Mirror the shim’s sanitization here to cover direct script execution.

-const runTsCli = (tsFile: string, args: string[]): Promise<number> => {
+const runTsCli = (tsFile: string, args: string[]): Promise<number> => {
   return new Promise((resolve) => {
-    const child = spawn(process.execPath, [tsxPath(), tsFile, ...args], {
+    const env = { ...process.env }
+    if (env.NODE_OPTIONS && /--inspect/.test(env.NODE_OPTIONS)) {
+      env.NODE_OPTIONS = env.NODE_OPTIONS.replace(/--inspect(-brk)?(=[^\s]+)?/g, '').trim() || undefined
+    }
+    const child = spawn(process.execPath, [tsxPath(), tsFile, ...args], {
       stdio: 'inherit',
-      env: process.env,
+      env,
       cwd: process.cwd(),
     })

108-121: Helpful error output for unknown executor

Consider echoing the received value to speed up user troubleshooting.

-    if (!target) {
-      console.error('Unknown executor. Use one of: liam | openai')
+    if (!target) {
+      console.error(`Unknown executor: ${String(executor)}. Use one of: liam | openai`)
       process.exit(1)
     }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b76a7ea and 5cdf3a5.

📒 Files selected for processing (8)
  • frontend/internal-packages/schema-bench/README.md (2 hunks)
  • frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json (1 hunks)
  • frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json (1 hunks)
  • frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json (1 hunks)
  • frontend/internal-packages/schema-bench/bin/schema-bench.mjs (1 hunks)
  • frontend/internal-packages/schema-bench/package.json (2 hunks)
  • frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (7 hunks)
  • frontend/internal-packages/schema-bench/src/cli/index.ts (1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
frontend/internal-packages/**

📄 CodeRabbit inference engine (AGENTS.md)

Infra and tooling (e2e, configs, storybook, agent) live under frontend/internal-packages

Files:

  • frontend/internal-packages/schema-bench/bin/schema-bench.mjs
  • frontend/internal-packages/schema-bench/package.json
  • frontend/internal-packages/schema-bench/src/cli/index.ts
  • frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json
  • frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json
  • frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json
  • frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
  • frontend/internal-packages/schema-bench/README.md
**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.{ts,tsx}: Use runtime type validation with valibot for external data validation
Use early returns for readability
Use named exports only (no default exports)
Follow existing import patterns and tsconfig path aliases
Prefer const arrow functions for simple utilities (e.g., const toggle = () => {})

Use TypeScript/TSX across the codebase

Files:

  • frontend/internal-packages/schema-bench/src/cli/index.ts
  • frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
frontend/**/*.{ts,tsx}

📄 CodeRabbit inference engine (CLAUDE.md)

frontend/**/*.{ts,tsx}: Import UI components from @liam-hq/ui when available
Import icons from @liam-hq/ui

Files:

  • frontend/internal-packages/schema-bench/src/cli/index.ts
  • frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
**/*.ts

📄 CodeRabbit inference engine (AGENTS.md)

Name utility files in camelCase (e.g., mergeSchema.ts)

Files:

  • frontend/internal-packages/schema-bench/src/cli/index.ts
  • frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
📚 Learning: 2025-07-30T05:52:56.270Z
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.

Applied to files:

  • frontend/internal-packages/schema-bench/bin/schema-bench.mjs
  • frontend/internal-packages/schema-bench/package.json
  • frontend/internal-packages/schema-bench/src/cli/index.ts
  • frontend/internal-packages/schema-bench/README.md
🧬 Code graph analysis (2)
frontend/internal-packages/schema-bench/src/cli/index.ts (1)
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (4)
  • require (9-9)
  • __filename (13-13)
  • __dirname (14-14)
  • child (17-25)
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (2)
frontend/internal-packages/schema-bench/src/cli/utils/workspace.ts (1)
  • getWorkspaceSubPath (21-23)
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (2)
  • __filename (13-13)
  • __dirname (14-14)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: frontend-lint
  • GitHub Check: frontend-ci
  • GitHub Check: security-review
  • GitHub Check: Supabase Preview
🔇 Additional comments (3)
frontend/internal-packages/schema-bench/package.json (1)

6-8: Bin mapping looks good

The schema-bench binary correctly points to the shim.

frontend/internal-packages/schema-bench/README.md (2)

22-31: Docs align with unified CLI—nice

Clear examples for Liam/OpenAI with pass-through flags.


46-51: Good: unified evaluate path with legacy fallback

This matches the new CLI entry while preserving old scripts.

Comment on lines 186 to 226
"foreignKeys':[{": ":",
"name": {
"type": "FOREIGN KEY",
"name": "fk_eom_employee",
"columnNames": [
"employee_code"
],
"targetTableName": "employee",
"targetColumnNames": [
"employee_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "CASCADE"
},
"targetTableName": {
"type": "FOREIGN KEY",
"name": "fk_eom_org",
"columnNames": [
"org_code"
],
"targetTableName": "organization",
"targetColumnNames": [
"org_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "CASCADE"
},
"targetColumnNames": {
"type": "FOREIGN KEY",
"name": "fk_eom_position",
"columnNames": [
"position_code"
],
"targetTableName": "position",
"targetColumnNames": [
"position_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "SET_NULL"
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Broken JSON and malformed constraints block in employee_org_membership.constraints

This section is syntactically invalid JSON and mixes object keys/values incorrectly. It will fail to parse and breaks downstream tooling.

Replace the malformed block with a valid, consistent constraints map (mirroring the convention used elsewhere), and fold the “reports_to” rules here (so you can drop the separate “checks” table below).

         "primaryKey": {
           "type": "PRIMARY KEY",
           "name": "pk_emp_org_membership",
           "columnNames": [
             "employee_code",
             "org_code"
           ]
         },
-        "foreignKeys':[{": ":",
-        "name": {
-          "type": "FOREIGN KEY",
-          "name": "fk_eom_employee",
-          "columnNames": [
-            "employee_code"
-          ],
-          "targetTableName": "employee",
-          "targetColumnNames": [
-            "employee_code"
-          ],
-          "updateConstraint": "NO_ACTION",
-          "deleteConstraint": "CASCADE"
-        },
-        "targetTableName": {
-          "type": "FOREIGN KEY",
-          "name": "fk_eom_org",
-          "columnNames": [
-            "org_code"
-          ],
-          "targetTableName": "organization",
-          "targetColumnNames": [
-            "org_code"
-          ],
-          "updateConstraint": "NO_ACTION",
-          "deleteConstraint": "CASCADE"
-        },
-        "targetColumnNames": {
-          "type": "FOREIGN KEY",
-          "name": "fk_eom_position",
-          "columnNames": [
-            "position_code"
-          ],
-          "targetTableName": "position",
-          "targetColumnNames": [
-            "position_code"
-          ],
-          "updateConstraint": "NO_ACTION",
-          "deleteConstraint": "SET_NULL"
-        }
+        "fk_eom_employee": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_employee",
+          "columnNames": ["employee_code"],
+          "targetTableName": "employee",
+          "targetColumnNames": ["employee_code"],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "CASCADE"
+        },
+        "fk_eom_org": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_org",
+          "columnNames": ["org_code"],
+          "targetTableName": "organization",
+          "targetColumnNames": ["org_code"],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "CASCADE"
+        },
+        "fk_eom_position": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_position",
+          "columnNames": ["position_code"],
+          "targetTableName": "position",
+          "targetColumnNames": ["position_code"],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "SET_NULL"
+        },
+        "ck_eom_reports_to_not_self": {
+          "type": "CHECK",
+          "name": "ck_eom_reports_to_not_self",
+          "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
+        },
+        "fk_eom_reports_to_same_org": {
+          "type": "FOREIGN KEY",
+          "name": "fk_eom_reports_to_same_org",
+          "columnNames": ["reports_to_employee_code", "org_code"],
+          "targetTableName": "employee_org_membership",
+          "targetColumnNames": ["employee_code", "org_code"],
+          "updateConstraint": "NO_ACTION",
+          "deleteConstraint": "SET_NULL"
+        }
       }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"foreignKeys':[{": ":",
"name": {
"type": "FOREIGN KEY",
"name": "fk_eom_employee",
"columnNames": [
"employee_code"
],
"targetTableName": "employee",
"targetColumnNames": [
"employee_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "CASCADE"
},
"targetTableName": {
"type": "FOREIGN KEY",
"name": "fk_eom_org",
"columnNames": [
"org_code"
],
"targetTableName": "organization",
"targetColumnNames": [
"org_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "CASCADE"
},
"targetColumnNames": {
"type": "FOREIGN KEY",
"name": "fk_eom_position",
"columnNames": [
"position_code"
],
"targetTableName": "position",
"targetColumnNames": [
"position_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "SET_NULL"
}
}
"primaryKey": {
"type": "PRIMARY KEY",
"name": "pk_emp_org_membership",
"columnNames": [
"employee_code",
"org_code"
]
},
"fk_eom_employee": {
"type": "FOREIGN KEY",
"name": "fk_eom_employee",
"columnNames": ["employee_code"],
"targetTableName": "employee",
"targetColumnNames": ["employee_code"],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "CASCADE"
},
"fk_eom_org": {
"type": "FOREIGN KEY",
"name": "fk_eom_org",
"columnNames": ["org_code"],
"targetTableName": "organization",
"targetColumnNames": ["org_code"],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "CASCADE"
},
"fk_eom_position": {
"type": "FOREIGN KEY",
"name": "fk_eom_position",
"columnNames": ["position_code"],
"targetTableName": "position",
"targetColumnNames": ["position_code"],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "SET_NULL"
},
"ck_eom_reports_to_not_self": {
"type": "CHECK",
"name": "ck_eom_reports_to_not_self",
"detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
},
"fk_eom_reports_to_same_org": {
"type": "FOREIGN KEY",
"name": "fk_eom_reports_to_same_org",
"columnNames": ["reports_to_employee_code", "org_code"],
"targetTableName": "employee_org_membership",
"targetColumnNames": ["employee_code", "org_code"],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "SET_NULL"
}

Comment on lines 228 to 255
"checks": {
"name": "ck_eom_reports_to_not_self",
"columns": {},
"comment": null,
"indexes": {},
"constraints": {
"type": {
"type": "CHECK",
"name": "ck_eom_reports_to_not_self",
"detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
},
"type2": {
"type": "FOREIGN KEY",
"name": "fk_eom_reports_to_same_org",
"columnNames": [
"reports_to_employee_code",
"org_code"
],
"targetTableName": "employee_org_membership",
"targetColumnNames": [
"employee_code",
"org_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "SET_NULL"
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Remove the synthetic “checks” table; these belong as constraints on employee_org_membership

The “checks” pseudo-table is not a real table and violates the schema shape described in README. After folding the check/FK above, delete this block.

-    },
-    "checks": {
-      "name": "ck_eom_reports_to_not_self",
-      "columns": {},
-      "comment": null,
-      "indexes": {},
-      "constraints": {
-        "type": {
-          "type": "CHECK",
-          "name": "ck_eom_reports_to_not_self",
-          "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
-        },
-        "type2": {
-          "type": "FOREIGN KEY",
-          "name": "fk_eom_reports_to_same_org",
-          "columnNames": [
-            "reports_to_employee_code",
-            "org_code"
-          ],
-          "targetTableName": "employee_org_membership",
-          "targetColumnNames": [
-            "employee_code",
-            "org_code"
-          ],
-          "updateConstraint": "NO_ACTION",
-          "deleteConstraint": "SET_NULL"
-        }
-      }
-    }
+    }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"checks": {
"name": "ck_eom_reports_to_not_self",
"columns": {},
"comment": null,
"indexes": {},
"constraints": {
"type": {
"type": "CHECK",
"name": "ck_eom_reports_to_not_self",
"detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
},
"type2": {
"type": "FOREIGN KEY",
"name": "fk_eom_reports_to_same_org",
"columnNames": [
"reports_to_employee_code",
"org_code"
],
"targetTableName": "employee_org_membership",
"targetColumnNames": [
"employee_code",
"org_code"
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "SET_NULL"
}
}
}
}
🤖 Prompt for AI Agents
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json
lines 228-255: the "checks" pseudo-table block should be removed and its
constraints folded into the real employee_org_membership table; delete the
entire "checks" object and move the CHECK constraint
(ck_eom_reports_to_not_self) and the FOREIGN KEY constraint
(fk_eom_reports_to_same_org) into the "constraints" section of the
employee_org_membership table entry, ensuring the CHECK uses the same detail
text and the FK references targetTableName/targetColumnNames and update/delete
actions are preserved.

Comment on lines 98 to 110
"foreign_keys": {
"type": "FOREIGN KEY",
"name": "",
"columnNames": [
""
],
"targetTableName": "",
"targetColumnNames": [
""
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "NO_ACTION"
},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove placeholder FOREIGN KEY with empty identifiers in position.constraints

This FK has empty name, columnNames, targetTableName, and targetColumnNames, which is invalid and may break consumers.

         "primary_key": {
           "type": "PRIMARY KEY",
           "name": "pk_position",
           "columnNames": [
             "position_code"
           ]
         },
-        "foreign_keys": {
-          "type": "FOREIGN KEY",
-          "name": "",
-          "columnNames": [
-            ""
-          ],
-          "targetTableName": "",
-          "targetColumnNames": [
-            ""
-          ],
-          "updateConstraint": "NO_ACTION",
-          "deleteConstraint": "NO_ACTION"
-        },
         "uniques": {
           "type": "UNIQUE",
           "name": "uq_position_name",
           "columnNames": [
             "position_name"
           ]
         },
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
"foreign_keys": {
"type": "FOREIGN KEY",
"name": "",
"columnNames": [
""
],
"targetTableName": "",
"targetColumnNames": [
""
],
"updateConstraint": "NO_ACTION",
"deleteConstraint": "NO_ACTION"
},
"primary_key": {
"type": "PRIMARY KEY",
"name": "pk_position",
"columnNames": [
"position_code"
]
},
"uniques": {
"type": "UNIQUE",
"name": "uq_position_name",
"columnNames": [
"position_name"
]
},
🤖 Prompt for AI Agents
In
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json
around lines 98–110, there is a placeholder FOREIGN KEY object with empty name,
columnNames, targetTableName, and targetColumnNames which is invalid; remove
this placeholder foreign_keys entry (or replace it with a valid FK object) so
position.constraints does not contain entries with empty identifiers, and ensure
the producing code validates FK objects and omits them when required fields are
empty before writing the JSON.

Comment on lines +29 to +36
const getExecutionBaseDir = (): string => {
// Prefer top-level benchmark workspace execution dir, fallback to internal default dataset
const workspaceExecution = getWorkspaceSubPath('execution')
if (existsSync(workspaceExecution)) return workspaceExecution
const __filename = fileURLToPath(import.meta.url)
const __dirname = dirname(__filename)
return resolve(__dirname, '../../benchmark-workspace-default/execution')
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Workspace detection likely misses actual workspace layout

Workspace structure in README nests datasets under benchmark-workspace/<dataset>/execution/.... Checking only <workspace>/execution will often fail and force the internal fallback.

Use a smarter probe (prefer default dataset if present, then legacy single-level):

-const getExecutionBaseDir = (): string => {
-  // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset
-  const workspaceExecution = getWorkspaceSubPath('execution')
-  if (existsSync(workspaceExecution)) return workspaceExecution
-  const __filename = fileURLToPath(import.meta.url)
-  const __dirname = dirname(__filename)
-  return resolve(__dirname, '../../benchmark-workspace-default/execution')
-}
+const getExecutionBaseDir = (): string => {
+  // Prefer workspace default dataset, fall back to legacy flat layout, then internal default dataset
+  const workspaceDefault = getWorkspaceSubPath('default', 'execution')
+  if (existsSync(workspaceDefault)) return workspaceDefault
+  const workspaceFlat = getWorkspaceSubPath('execution')
+  if (existsSync(workspaceFlat)) return workspaceFlat
+  const __filename = fileURLToPath(import.meta.url)
+  const __dirname = dirname(__filename)
+  return resolve(__dirname, '../../benchmark-workspace-default/execution')
+}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const getExecutionBaseDir = (): string => {
// Prefer top-level benchmark workspace execution dir, fallback to internal default dataset
const workspaceExecution = getWorkspaceSubPath('execution')
if (existsSync(workspaceExecution)) return workspaceExecution
const __filename = fileURLToPath(import.meta.url)
const __dirname = dirname(__filename)
return resolve(__dirname, '../../benchmark-workspace-default/execution')
}
const getExecutionBaseDir = (): string => {
// Prefer workspace default dataset, fall back to legacy flat layout, then internal default dataset
const workspaceDefault = getWorkspaceSubPath('default', 'execution')
if (existsSync(workspaceDefault)) return workspaceDefault
const workspaceFlat = getWorkspaceSubPath('execution')
if (existsSync(workspaceFlat)) return workspaceFlat
const __filename = fileURLToPath(import.meta.url)
const __dirname = dirname(__filename)
return resolve(__dirname, '../../benchmark-workspace-default/execution')
}
🤖 Prompt for AI Agents
In frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts around lines
29-36, the current workspace probe only checks for <workspace>/execution and
therefore misses layouts where datasets are nested under
benchmark-workspace/<dataset>/execution; update getExecutionBaseDir to first
resolve the top-level benchmark-workspace path, look for a subdirectory whose
execution subpath exists (prefer a "default" dataset if present, otherwise pick
the first matching dataset), return that dataset's execution path if found, then
fall back to the legacy <workspace>/execution check, and finally fall back to
the internal default execution path; implement this by listing entries under the
workspace directory (or using a glob) and testing each candidate's execution
subpath before returning.

- Remove benchmark output files from version control (should not be committed)
- Fix NODE_OPTIONS debugger conflict in unified CLI spawn process
- Update benchmark-execute command documentation with unified CLI examples
- Improve CLI error handling and environment variable cleanup

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- Switch to Node --import tsx for better TypeScript execution compatibility
- Remove createRequire dependency for cleaner ESM implementation
- Update binary script to use consistent tsx execution pattern
- Enhanced benchmark-execute documentation with unified CLI examples
- Improved NODE_OPTIONS handling to prevent debugger conflicts

This ensures the unified CLI works reliably across different environments
while maintaining full backward compatibility with existing scripts.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (2)

21-29: Small DX polish: inherit stdio already; add windowsHide false (optional).

Not required, but adding windowsHide: false prevents stray console windows on Windows.

   const child = spawn(
     process.execPath,
     ['--import', 'tsx', entryTs, ...process.argv.slice(2)],
     {
       stdio: 'inherit',
       env,
       cwd: process.cwd(),
+      windowsHide: false,
     },
   )

9-10: Fail fast if tsx isn't resolvable; add Node engine or fallback

File: frontend/internal-packages/schema-bench/bin/schema-bench.mjs

  • frontend/internal-packages/schema-bench/package.json declares [email protected] and has no "engines" field.
  • Add a runtime check (fail-fast with a clear error) before spawning; apply the suggested diff:
-const _require = createRequire(import.meta.url)
+const _require = createRequire(import.meta.url)
@@
-  const child = spawn(
+  // Verify that 'tsx' is available; give a helpful error if not.
+  try {
+    _require.resolve('tsx')
+  } catch {
+    console.error('[schema-bench] Missing dependency: tsx\n' +
+      'Please add "tsx" to dependencies of @liam-hq/schema-bench (or ensure it is hoisted and resolvable).')
+    process.exit(1)
+  }
+
+  const child = spawn(
     process.execPath,
     ['--import', 'tsx', entryTs, ...process.argv.slice(2)],
  • Because "engines" is unset, either add "engines.node": ">=20.6.0" to package.json to require Node versions that support --import, or implement a fallback path using --loader tsx for older Node releases.
.claude/commands/benchmark-execute.md (1)

81-82: Add hint about workspace setup for OpenAI (optional).

OpenAI can run against the default dataset or a prepared workspace. A short note helps first-time users who skipped setupWorkspace.

-Note: For OpenAI execution, ensure `OPENAI_API_KEY` is set in your environment.
+Note: For OpenAI execution, ensure `OPENAI_API_KEY` is set. If running against the internal datasets, run the workspace setup first:
+`pnpm -F @liam-hq/schema-bench setupWorkspace`
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5cdf3a5 and f03cedc.

📒 Files selected for processing (4)
  • .claude/commands/benchmark-execute.md (2 hunks)
  • frontend/internal-packages/schema-bench/bin/schema-bench.mjs (1 hunks)
  • frontend/internal-packages/schema-bench/package.json (2 hunks)
  • frontend/internal-packages/schema-bench/src/cli/index.ts (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • frontend/internal-packages/schema-bench/package.json
  • frontend/internal-packages/schema-bench/src/cli/index.ts
🧰 Additional context used
📓 Path-based instructions (1)
frontend/internal-packages/**

📄 CodeRabbit inference engine (AGENTS.md)

Infra and tooling (e2e, configs, storybook, agent) live under frontend/internal-packages

Files:

  • frontend/internal-packages/schema-bench/bin/schema-bench.mjs
🧠 Learnings (2)
📓 Common learnings
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
📚 Learning: 2025-07-30T05:52:56.270Z
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.

Applied to files:

  • frontend/internal-packages/schema-bench/bin/schema-bench.mjs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Supabase Preview
  • GitHub Check: Supabase Preview
🔇 Additional comments (3)
.claude/commands/benchmark-execute.md (3)

41-52: LGTM: Unified CLI examples for Liam executor are clear and backward-compatible.

The pass-through dataset flags (-all, -entity-extraction, -ambiguous-recall) align with the wrapper’s “forward unknown args” approach.

If helpful, add one line showing --help to discover flags:

pnpm -F @liam-hq/schema-bench schema-bench execute --executor liam --help

61-66: LGTM: OpenAI unified CLI and legacy command are correctly documented.

Clear that unified defaults to the default dataset; legacy remains intact.


74-79: LGTM: Evaluation flow mirrors the unified/legacy pattern consistently.

Examples are concise and correct.

Comment on lines +17 to +20
const env = { ...process.env }
if (env.NODE_OPTIONS?.includes('--inspect')) {
delete env.NODE_OPTIONS
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Filter out only inspector flags from NODE_OPTIONS (don’t drop unrelated flags).

Deleting NODE_OPTIONS wholesale can discard non-inspector flags (e.g., memory limits). Strip only --inspect/--inspect-brk.

-  const env = { ...process.env }
-  if (env.NODE_OPTIONS?.includes('--inspect')) {
-    delete env.NODE_OPTIONS
-  }
+  const env = { ...process.env }
+  if (env.NODE_OPTIONS) {
+    const filtered = env.NODE_OPTIONS
+      .split(/\s+/)
+      .filter((opt) => !/^--inspect(?:-brk)?(?:=|$)/.test(opt))
+      .join(' ')
+    if (filtered) env.NODE_OPTIONS = filtered
+    else delete env.NODE_OPTIONS
+  }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const env = { ...process.env }
if (env.NODE_OPTIONS?.includes('--inspect')) {
delete env.NODE_OPTIONS
}
const env = { ...process.env }
if (env.NODE_OPTIONS) {
const filtered = env.NODE_OPTIONS
.split(/\s+/)
.filter((opt) => !/^--inspect(?:-brk)?(?:=|$)/.test(opt))
.join(' ')
if (filtered) env.NODE_OPTIONS = filtered
else delete env.NODE_OPTIONS
}
🤖 Prompt for AI Agents
In frontend/internal-packages/schema-bench/bin/schema-bench.mjs around lines 17
to 20, instead of deleting NODE_OPTIONS entirely, parse env.NODE_OPTIONS into
space-separated tokens, filter out only inspector-related flags (--inspect,
--inspect-brk and their =host:port variants), and rejoin the remaining tokens;
if none remain delete env.NODE_OPTIONS, otherwise set it to the filtered string
so non-inspector flags (e.g., memory limits) are preserved.

Comment on lines +31 to +33
child.on('exit', (code) => process.exit(code ?? 0))
child.on('error', () => process.exit(1))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Propagate child signals and preserve true exit semantics.

To fully “preserve exit-code semantics,” forward signals (SIGINT, SIGTERM, SIGHUP) and re-emit them when the child exits via signal.

-  child.on('exit', (code) => process.exit(code ?? 0))
-  child.on('error', () => process.exit(1))
+  // Forward common signals to child so Ctrl+C etc. behave correctly
+  for (const sig of ['SIGINT', 'SIGTERM', 'SIGHUP']) {
+    process.on(sig, () => {
+      try { child.kill(sig as NodeJS.Signals) } catch {}
+    })
+  }
+  child.on('exit', (code, signal) => {
+    if (signal) {
+      // Re-emit same signal to preserve semantics
+      try { process.kill(process.pid, signal as NodeJS.Signals) } catch {}
+    } else {
+      process.exit(code ?? 0)
+    }
+  })
+  child.on('error', (err) => {
+    console.error('[schema-bench] Failed to start TS CLI:', err)
+    process.exit(1)
+  })

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In frontend/internal-packages/schema-bench/bin/schema-bench.mjs around lines
31–33, the child process exit handlers don't forward incoming signals or re-emit
child exit signals to the parent; add listeners for SIGINT, SIGTERM, and SIGHUP
that forward the same signal to the child (child.kill(signal)), and update the
child.on('exit', ...) handler so that if the child exited due to a signal you
re-emit that signal on the parent (e.g. process.kill(process.pid, signal))
instead of exiting with a code, otherwise call process.exit(code ?? 0); keep the
existing child.on('error', ...) behavior.

- Added CLI entry point to knip ignore list with comment explaining why
- CLI entry is launched via bin shim and knip cannot detect it automatically

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <[email protected]>
- **Input Standardization**: Entity-extraction inputs are automatically wrapped in `{"input": "..."}` format

Next, I'll execute the specified model with dataset selection:
Next, I'll execute the specified model with dataset selection. The unified CLI is recommended; legacy scripts remain available and unchanged:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m making these changes so the commands can also be executed via the CLI, which I think is simpler.
However, removing the existing commands right away would create a larger-than-expected impact, so in the short term I’ll keep the old commands as well.
Once the CLI commands are running stably, the legacy commands will be removed.

Comment on lines +29 to +36
const getExecutionBaseDir = (): string => {
// Prefer top-level benchmark workspace execution dir, fallback to internal default dataset
const workspaceExecution = getWorkspaceSubPath('execution')
if (existsSync(workspaceExecution)) return workspaceExecution
const __filename = fileURLToPath(import.meta.url)
const __dirname = dirname(__filename)
return resolve(__dirname, '../../benchmark-workspace-default/execution')
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed the outdated directory specification. This isn't a core change in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

frontend/internal-packages/schema-bench/src/cli/index.ts is responsible for parsing the command-line options and invoking the appropriate function(s) to execute the CLI logic.

Comment on lines -33 to +35
"frontend/apps/app/features/public-share/actions.ts"
"frontend/apps/app/features/public-share/actions.ts",
// - CLI entry is launched via bin shim; knip can't detect it
"frontend/internal-packages/schema-bench/src/cli/index.ts"
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m making these changes so the commands can also be executed via the CLI, which I think is simpler.
However, removing the existing commands right away would create a larger-than-expected impact, so in the short term I’ll keep the old commands as well.

Once the CLI commands are running stably, the legacy commands will be removed.

This is a temporary measure due to the impact of this change. It will be removed during refactoring.

@NoritakaIkeda NoritakaIkeda requested a review from a team as a code owner September 12, 2025 12:58
@NoritakaIkeda NoritakaIkeda requested review from FunamaYukina, MH4GF, Copilot, hoshinotsuyoshi and junkisai and removed request for a team September 12, 2025 12:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements MVP1 of a unified CLI for schema-bench, providing a single entry point that wraps existing TypeScript scripts while maintaining full backward compatibility. The unified CLI supports execution, evaluation, and comparison operations with consistent interface across different executors.

  • Introduces a new schema-bench command with subcommands for execute (with executor selection), evaluate, and compare (placeholder)
  • Implements child process wrapper pattern that forwards arguments to existing scripts unchanged
  • Enhanced OpenAI executor with flexible workspace detection and improved error handling

Reviewed Changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/cli/index.ts New unified CLI entry point with command parsing and child process execution
src/cli/executeOpenai.ts Enhanced with flexible workspace detection and base directory parameter threading
package.json Adds CLI binary entry point and npm script
README.md Updated documentation showing both unified CLI and legacy script usage
knip.jsonc Excludes CLI entry from unused code detection
.claude/commands/benchmark-execute.md Updated command examples with unified CLI recommendations

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

Comment on lines +6 to +7

const VERSION = '0.1.0'
Copy link

Copilot AI Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded version number creates maintenance burden. Consider reading from package.json to ensure consistency and avoid manual updates.

Suggested change
const VERSION = '0.1.0'
import fs from 'node:fs'
// Dynamically read version from package.json
const __filename = fileURLToPath(import.meta.url)
const __dirname = path.dirname(__filename)
// Adjust the path to package.json as needed (here: three levels up)
const pkgPath = path.resolve(__dirname, '../../../package.json')
let VERSION = 'unknown'
try {
const pkgRaw = fs.readFileSync(pkgPath, 'utf8')
const pkg = JSON.parse(pkgRaw)
VERSION = pkg.version || VERSION
} catch (e) {
// fallback: VERSION remains 'unknown'
}

Copilot uses AI. Check for mistakes.
Comment on lines +74 to +83
for (const tok of args) {
if (expectValue) {
if (!tok.startsWith('-')) {
executor = tok
} else {
pass.push(tok)
}
expectValue = false
continue
}
Copy link

Copilot AI Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When expectValue is true but tok starts with '-', the code adds tok to pass but doesn't handle the missing executor value. This could silently ignore missing values after --executor flag.

Copilot uses AI. Check for mistakes.
Comment on lines +114 to +118
const map: Record<string, string> = {
openai: resolveCli('./executeOpenai.ts'),
liam: resolveCli('./executeLiamDbUnified.ts'),
}
const target = map[String(executor).toLowerCase()]
Copy link

Copilot AI Sep 12, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The executor mapping logic is hardcoded and repeated validation occurs. Consider extracting valid executors to a constant array for reuse in validation messages.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
knip.jsonc (1)

33-35: Declare schema-bench CLI as a knip workspace entry (remove from ignore)

frontend/internal-packages/schema-bench contains CLI sources (src/cli/**/*) and a bin in package.json—remove the CLI from knip ignores and add a workspace entry so knip can traverse the CLI and spawned TS scripts.

-    "frontend/apps/app/features/public-share/actions.ts",
-    // - CLI entry is launched via bin shim; knip can't detect it
-    "frontend/internal-packages/schema-bench/src/cli/index.ts"
+    "frontend/apps/app/features/public-share/actions.ts"

Add this workspace config (in your workspaces section):

{
  "workspaces": {
    "frontend/internal-packages/schema-bench": {
      "entry": [
        "src/cli/index.ts",
        "src/cli/**/*.ts"
      ]
    }
  }
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f03cedc and 3c8fad2.

📒 Files selected for processing (1)
  • knip.jsonc (1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
📚 Learning: 2025-08-25T01:34:20.255Z
Learnt from: CR
PR: liam-hq/liam#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-25T01:34:20.255Z
Learning: Applies to frontend/apps/@(app|docs)/app/**/*.{ts,tsx} : Prefer Server Components for server-side data fetching

Applied to files:

  • knip.jsonc
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: Supabase Preview
  • GitHub Check: frontend-ci
  • GitHub Check: frontend-lint
  • GitHub Check: Supabase Preview

@MH4GF
Copy link
Member

MH4GF commented Sep 12, 2025

Thank you for working late at night, I'll check it on Monday!

Copy link
Member

@MH4GF MH4GF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@NoritakaIkeda
I couldn't understand why this change was necessary even after reading the description.
Why should they be unified? Is it impossible to configure the reasoning effort without unifying them?

It also seemed odd that the mjs script was calling the tx process. Since we're using tsx, building shouldn't be necessary in the first place.

Comment on lines +74 to +75
# Unified CLI (recommended)
pnpm -F @liam-hq/schema-bench schema-bench evaluate
Copy link
Member

@hoshinotsuyoshi hoshinotsuyoshi Sep 16, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(👀 I'm testing now)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prepare

rm -rf benchmark-workspace
pnpm --filter @liam-hq/schema-bench setupWorkspace

execute(openai)

# success
pnpm -F @liam-hq/schema-bench schema-bench execute --executor openai

evaluation

Actually, an error occurred, but it seems to be a schema discrepancy error. The directory seems to be fine. 👍

pnpm -F @liam-hq/schema-bench schema-bench


> @liam-hq/[email protected] schema-bench /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/frontend/internal-packages/schema-bench
> node bin/schema-bench.mjs evaluate

⚠️  Failed to load 3 file(s):
   - case-001-difficult.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001-difficult.json: Invalid schema format: Invalid key: Expected "enums" but received undefined
   - case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "enums" but received undefined
   - case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "enums" but received undefined
   ❌ Failed to evaluate dataset: {
  "type": "JSON_PARSE_ERROR",
  "path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001-difficult.json",
  "cause": "Invalid schema format: Invalid key: Expected \"enums\" but received undefined"
}
⚠️  Failed to load 5 file(s):
   - case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   - case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   - case-003.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-003.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   - case-004.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-004.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   - case-005.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-005.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   ❌ Failed to evaluate dataset: {
  "type": "JSON_PARSE_ERROR",
  "path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-001.json",
  "cause": "Invalid schema format: Invalid key: Expected \"comment\" but received undefined"
}
⚠️  Failed to load 3 file(s):
   - case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   - case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   - case-003.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-003.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
   ❌ Failed to evaluate dataset: {
  "type": "JSON_PARSE_ERROR",
  "path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-001.json",
  "cause": "Invalid schema format: Invalid key: Expected \"comment\" but received undefined"
}
❌ Evaluation failed for 3 dataset(s)
/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/frontend/internal-packages/schema-bench:
 ERR_PNPM_RECURSIVE_RUN_FIRST_FAIL  @liam-hq/[email protected] schema-bench: `node bin/schema-bench.mjs evaluate`
Exit status 1

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits

Actually, since .claude/commands/benchmark-execute.md is a command file specifically for Claude, I don't think we need to include the legacy usage patterns. Unless there's some benefit to keeping them, what do you think?

@MH4GF
Copy link
Member

MH4GF commented Oct 6, 2025

@NoritakaIkeda Is it okay to close this PR?

@NoritakaIkeda
Copy link
Member Author

@MH4GF
Sorry, but I'm closing this PR since the policy has already changed. 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants