-
Notifications
You must be signed in to change notification settings - Fork 180
✨(schema-bench): Add unified CLI with MVP1 wrapper implementation #3461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Add unified schema-bench CLI with execute, evaluate, and compare subcommands: - **Unified CLI Interface**: New `schema-bench` command with subcommands - `execute --executor <liam|openai>`: Wraps existing executor scripts - `evaluate`: Wraps evaluateSchemaMulti.ts for multi-dataset evaluation - `compare`: MVP1 placeholder with guidance message - **Non-breaking Implementation**: All existing scripts remain unchanged - Legacy scripts continue to work via package.json scripts - New CLI uses child_process.spawn to call existing implementations - Full argument pass-through to underlying scripts - **Enhanced OpenAI Executor**: Improved workspace detection - Auto-detects workspace vs internal dataset directories - Flexible execution base directory resolution - Better error handling for missing workspace setup - **Updated Documentation**: README and command docs reflect both interfaces - Recommends unified CLI while preserving legacy documentation - Clear migration path from legacy to unified interface This implements MVP1 from the schema-bench roadmap, providing a single entry point while maintaining full backward compatibility. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
|
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds a unified schema-bench CLI with a Node ESM bin shim and TypeScript entrypoint, routes execute/evaluate/compare subcommands to TS scripts via tsx, introduces a package bin/script, refactors executeOpenai to use a computed baseExecutionDir, and updates docs while preserving legacy scripts. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor U as User
participant Bin as bin/schema-bench.mjs
participant CLI as src/cli/index.ts (TS CLI)
participant TSX as tsx runner
participant Exec as Executor script (execute*.ts)
participant Eval as evaluateSchemaMulti.ts
U->>Bin: schema-bench <subcommand> [--executor/-x args]
Note right of Bin: resolve TS path, sanitize NODE_OPTIONS, spawn
Bin->>CLI: node --import tsx index.ts [args]
CLI->>CLI: parse subcommand & flags
alt execute (with --executor liam|openai)
CLI->>TSX: node --import tsx execute*.ts [passthrough args]
TSX->>Exec: run executor (liam/openai)
Note over Exec: openai uses computed baseExecutionDir for IO
Exec-->>TSX: exit code
else evaluate
CLI->>TSX: node --import tsx evaluateSchemaMulti.ts [args]
TSX->>Eval: run evaluation
Eval-->>TSX: exit code
else compare
CLI-->>Bin: print placeholder, exit 0
end
TSX-->>CLI: propagate exit code
CLI-->>Bin: propagate exit code
Bin-->>U: process exit
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Warning Review ran into problems🔥 ProblemsErrors were encountered while retrieving linked issues. Errors (1)
Pre-merge checks (3 passed)✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Updates to Preview Branch (feat/schema-bench-unified-cli-mvp1) ↗︎
Tasks are run on every commit but only new migration files are pushed.
View logs for this Workflow Run ↗︎. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (1)
45-50: CLI hint typo: script name is setupWorkspaceMessage recommends
setup-workspace(hyphenated) which doesn’t exist; scripts definesetupWorkspace.- `Input directory not found: ${inputDir}. Please run setup-workspace first.`, + `Input directory not found: ${inputDir}. Please run setupWorkspace first.`,
🧹 Nitpick comments (6)
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json (1)
472-480: Unique constraint references a non-existent column
uq_cps_active_one_per_planincludescancellation_time_is_null_partial_index_placeholder, which is not a defined column. Either model partial uniqueness explicitly in your constraint format (e.g., with awhere/filterfield) or approximate with a CHECK + UNIQUE that only references existing columns.If the evaluator requires valid column names, this will cause errors; please adjust the representation.
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (1)
181-183: Make concurrency configurable via envAllow tuning without code changes; keep 5 as default.
- const MAX_CONCURRENT = 5 + const MAX_CONCURRENT = Number(process.env.SCHEMABENCH_MAX_CONCURRENCY) || 5frontend/internal-packages/schema-bench/package.json (1)
40-40: Optional: drop unused “main” or add exports to avoid confusionThis package is CLI-oriented;
main: "src/index.ts"may be misleading. Consider removing it or adding properexportsif you intend runtime imports.frontend/internal-packages/schema-bench/bin/schema-bench.mjs (1)
17-25: Avoid nested inspector conflicts (strip --inspect from NODE_OPTIONS for the child)Running tsx under an inspected parent can fail. Sanitize NODE_OPTIONS before spawn.
- const child = spawn( + const env = { ...process.env } + if (env.NODE_OPTIONS && /--inspect/.test(env.NODE_OPTIONS)) { + env.NODE_OPTIONS = env.NODE_OPTIONS.replace(/--inspect(-brk)?(=[^\s]+)?/g, '').trim() || undefined + } + const child = spawn( process.execPath, [tsxCli, entryTs, ...process.argv.slice(2)], { stdio: 'inherit', - env: process.env, + env, cwd: process.cwd(), }, )frontend/internal-packages/schema-bench/src/cli/index.ts (2)
24-34: Propagate sanitized NODE_OPTIONS to sub-scripts to prevent nested inspector issuesMirror the shim’s sanitization here to cover direct script execution.
-const runTsCli = (tsFile: string, args: string[]): Promise<number> => { +const runTsCli = (tsFile: string, args: string[]): Promise<number> => { return new Promise((resolve) => { - const child = spawn(process.execPath, [tsxPath(), tsFile, ...args], { + const env = { ...process.env } + if (env.NODE_OPTIONS && /--inspect/.test(env.NODE_OPTIONS)) { + env.NODE_OPTIONS = env.NODE_OPTIONS.replace(/--inspect(-brk)?(=[^\s]+)?/g, '').trim() || undefined + } + const child = spawn(process.execPath, [tsxPath(), tsFile, ...args], { stdio: 'inherit', - env: process.env, + env, cwd: process.cwd(), })
108-121: Helpful error output for unknown executorConsider echoing the received value to speed up user troubleshooting.
- if (!target) { - console.error('Unknown executor. Use one of: liam | openai') + if (!target) { + console.error(`Unknown executor: ${String(executor)}. Use one of: liam | openai`) process.exit(1) }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
frontend/internal-packages/schema-bench/README.md(2 hunks)frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json(1 hunks)frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json(1 hunks)frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.json(1 hunks)frontend/internal-packages/schema-bench/bin/schema-bench.mjs(1 hunks)frontend/internal-packages/schema-bench/package.json(2 hunks)frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts(7 hunks)frontend/internal-packages/schema-bench/src/cli/index.ts(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
frontend/internal-packages/**
📄 CodeRabbit inference engine (AGENTS.md)
Infra and tooling (e2e, configs, storybook, agent) live under frontend/internal-packages
Files:
frontend/internal-packages/schema-bench/bin/schema-bench.mjsfrontend/internal-packages/schema-bench/package.jsonfrontend/internal-packages/schema-bench/src/cli/index.tsfrontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.jsonfrontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.jsonfrontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-002.jsonfrontend/internal-packages/schema-bench/src/cli/executeOpenai.tsfrontend/internal-packages/schema-bench/README.md
**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
**/*.{ts,tsx}: Use runtime type validation with valibot for external data validation
Use early returns for readability
Use named exports only (no default exports)
Follow existing import patterns and tsconfig path aliases
Prefer const arrow functions for simple utilities (e.g., const toggle = () => {})Use TypeScript/TSX across the codebase
Files:
frontend/internal-packages/schema-bench/src/cli/index.tsfrontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
frontend/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
frontend/**/*.{ts,tsx}: Import UI components from @liam-hq/ui when available
Import icons from @liam-hq/ui
Files:
frontend/internal-packages/schema-bench/src/cli/index.tsfrontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
**/*.ts
📄 CodeRabbit inference engine (AGENTS.md)
Name utility files in camelCase (e.g., mergeSchema.ts)
Files:
frontend/internal-packages/schema-bench/src/cli/index.tsfrontend/internal-packages/schema-bench/src/cli/executeOpenai.ts
🧠 Learnings (2)
📓 Common learnings
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
📚 Learning: 2025-07-30T05:52:56.270Z
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
Applied to files:
frontend/internal-packages/schema-bench/bin/schema-bench.mjsfrontend/internal-packages/schema-bench/package.jsonfrontend/internal-packages/schema-bench/src/cli/index.tsfrontend/internal-packages/schema-bench/README.md
🧬 Code graph analysis (2)
frontend/internal-packages/schema-bench/src/cli/index.ts (1)
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (4)
require(9-9)__filename(13-13)__dirname(14-14)child(17-25)
frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts (2)
frontend/internal-packages/schema-bench/src/cli/utils/workspace.ts (1)
getWorkspaceSubPath(21-23)frontend/internal-packages/schema-bench/bin/schema-bench.mjs (2)
__filename(13-13)__dirname(14-14)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: frontend-lint
- GitHub Check: frontend-ci
- GitHub Check: security-review
- GitHub Check: Supabase Preview
🔇 Additional comments (3)
frontend/internal-packages/schema-bench/package.json (1)
6-8: Bin mapping looks goodThe
schema-benchbinary correctly points to the shim.frontend/internal-packages/schema-bench/README.md (2)
22-31: Docs align with unified CLI—niceClear examples for Liam/OpenAI with pass-through flags.
46-51: Good: unified evaluate path with legacy fallbackThis matches the new CLI entry while preserving old scripts.
| "foreignKeys':[{": ":", | ||
| "name": { | ||
| "type": "FOREIGN KEY", | ||
| "name": "fk_eom_employee", | ||
| "columnNames": [ | ||
| "employee_code" | ||
| ], | ||
| "targetTableName": "employee", | ||
| "targetColumnNames": [ | ||
| "employee_code" | ||
| ], | ||
| "updateConstraint": "NO_ACTION", | ||
| "deleteConstraint": "CASCADE" | ||
| }, | ||
| "targetTableName": { | ||
| "type": "FOREIGN KEY", | ||
| "name": "fk_eom_org", | ||
| "columnNames": [ | ||
| "org_code" | ||
| ], | ||
| "targetTableName": "organization", | ||
| "targetColumnNames": [ | ||
| "org_code" | ||
| ], | ||
| "updateConstraint": "NO_ACTION", | ||
| "deleteConstraint": "CASCADE" | ||
| }, | ||
| "targetColumnNames": { | ||
| "type": "FOREIGN KEY", | ||
| "name": "fk_eom_position", | ||
| "columnNames": [ | ||
| "position_code" | ||
| ], | ||
| "targetTableName": "position", | ||
| "targetColumnNames": [ | ||
| "position_code" | ||
| ], | ||
| "updateConstraint": "NO_ACTION", | ||
| "deleteConstraint": "SET_NULL" | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Broken JSON and malformed constraints block in employee_org_membership.constraints
This section is syntactically invalid JSON and mixes object keys/values incorrectly. It will fail to parse and breaks downstream tooling.
Replace the malformed block with a valid, consistent constraints map (mirroring the convention used elsewhere), and fold the “reports_to” rules here (so you can drop the separate “checks” table below).
"primaryKey": {
"type": "PRIMARY KEY",
"name": "pk_emp_org_membership",
"columnNames": [
"employee_code",
"org_code"
]
},
- "foreignKeys':[{": ":",
- "name": {
- "type": "FOREIGN KEY",
- "name": "fk_eom_employee",
- "columnNames": [
- "employee_code"
- ],
- "targetTableName": "employee",
- "targetColumnNames": [
- "employee_code"
- ],
- "updateConstraint": "NO_ACTION",
- "deleteConstraint": "CASCADE"
- },
- "targetTableName": {
- "type": "FOREIGN KEY",
- "name": "fk_eom_org",
- "columnNames": [
- "org_code"
- ],
- "targetTableName": "organization",
- "targetColumnNames": [
- "org_code"
- ],
- "updateConstraint": "NO_ACTION",
- "deleteConstraint": "CASCADE"
- },
- "targetColumnNames": {
- "type": "FOREIGN KEY",
- "name": "fk_eom_position",
- "columnNames": [
- "position_code"
- ],
- "targetTableName": "position",
- "targetColumnNames": [
- "position_code"
- ],
- "updateConstraint": "NO_ACTION",
- "deleteConstraint": "SET_NULL"
- }
+ "fk_eom_employee": {
+ "type": "FOREIGN KEY",
+ "name": "fk_eom_employee",
+ "columnNames": ["employee_code"],
+ "targetTableName": "employee",
+ "targetColumnNames": ["employee_code"],
+ "updateConstraint": "NO_ACTION",
+ "deleteConstraint": "CASCADE"
+ },
+ "fk_eom_org": {
+ "type": "FOREIGN KEY",
+ "name": "fk_eom_org",
+ "columnNames": ["org_code"],
+ "targetTableName": "organization",
+ "targetColumnNames": ["org_code"],
+ "updateConstraint": "NO_ACTION",
+ "deleteConstraint": "CASCADE"
+ },
+ "fk_eom_position": {
+ "type": "FOREIGN KEY",
+ "name": "fk_eom_position",
+ "columnNames": ["position_code"],
+ "targetTableName": "position",
+ "targetColumnNames": ["position_code"],
+ "updateConstraint": "NO_ACTION",
+ "deleteConstraint": "SET_NULL"
+ },
+ "ck_eom_reports_to_not_self": {
+ "type": "CHECK",
+ "name": "ck_eom_reports_to_not_self",
+ "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
+ },
+ "fk_eom_reports_to_same_org": {
+ "type": "FOREIGN KEY",
+ "name": "fk_eom_reports_to_same_org",
+ "columnNames": ["reports_to_employee_code", "org_code"],
+ "targetTableName": "employee_org_membership",
+ "targetColumnNames": ["employee_code", "org_code"],
+ "updateConstraint": "NO_ACTION",
+ "deleteConstraint": "SET_NULL"
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "foreignKeys':[{": ":", | |
| "name": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_employee", | |
| "columnNames": [ | |
| "employee_code" | |
| ], | |
| "targetTableName": "employee", | |
| "targetColumnNames": [ | |
| "employee_code" | |
| ], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "CASCADE" | |
| }, | |
| "targetTableName": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_org", | |
| "columnNames": [ | |
| "org_code" | |
| ], | |
| "targetTableName": "organization", | |
| "targetColumnNames": [ | |
| "org_code" | |
| ], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "CASCADE" | |
| }, | |
| "targetColumnNames": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_position", | |
| "columnNames": [ | |
| "position_code" | |
| ], | |
| "targetTableName": "position", | |
| "targetColumnNames": [ | |
| "position_code" | |
| ], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "SET_NULL" | |
| } | |
| } | |
| "primaryKey": { | |
| "type": "PRIMARY KEY", | |
| "name": "pk_emp_org_membership", | |
| "columnNames": [ | |
| "employee_code", | |
| "org_code" | |
| ] | |
| }, | |
| "fk_eom_employee": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_employee", | |
| "columnNames": ["employee_code"], | |
| "targetTableName": "employee", | |
| "targetColumnNames": ["employee_code"], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "CASCADE" | |
| }, | |
| "fk_eom_org": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_org", | |
| "columnNames": ["org_code"], | |
| "targetTableName": "organization", | |
| "targetColumnNames": ["org_code"], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "CASCADE" | |
| }, | |
| "fk_eom_position": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_position", | |
| "columnNames": ["position_code"], | |
| "targetTableName": "position", | |
| "targetColumnNames": ["position_code"], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "SET_NULL" | |
| }, | |
| "ck_eom_reports_to_not_self": { | |
| "type": "CHECK", | |
| "name": "ck_eom_reports_to_not_self", | |
| "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code" | |
| }, | |
| "fk_eom_reports_to_same_org": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_reports_to_same_org", | |
| "columnNames": ["reports_to_employee_code", "org_code"], | |
| "targetTableName": "employee_org_membership", | |
| "targetColumnNames": ["employee_code", "org_code"], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "SET_NULL" | |
| } |
| "checks": { | ||
| "name": "ck_eom_reports_to_not_self", | ||
| "columns": {}, | ||
| "comment": null, | ||
| "indexes": {}, | ||
| "constraints": { | ||
| "type": { | ||
| "type": "CHECK", | ||
| "name": "ck_eom_reports_to_not_self", | ||
| "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code" | ||
| }, | ||
| "type2": { | ||
| "type": "FOREIGN KEY", | ||
| "name": "fk_eom_reports_to_same_org", | ||
| "columnNames": [ | ||
| "reports_to_employee_code", | ||
| "org_code" | ||
| ], | ||
| "targetTableName": "employee_org_membership", | ||
| "targetColumnNames": [ | ||
| "employee_code", | ||
| "org_code" | ||
| ], | ||
| "updateConstraint": "NO_ACTION", | ||
| "deleteConstraint": "SET_NULL" | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Remove the synthetic “checks” table; these belong as constraints on employee_org_membership
The “checks” pseudo-table is not a real table and violates the schema shape described in README. After folding the check/FK above, delete this block.
- },
- "checks": {
- "name": "ck_eom_reports_to_not_self",
- "columns": {},
- "comment": null,
- "indexes": {},
- "constraints": {
- "type": {
- "type": "CHECK",
- "name": "ck_eom_reports_to_not_self",
- "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code"
- },
- "type2": {
- "type": "FOREIGN KEY",
- "name": "fk_eom_reports_to_same_org",
- "columnNames": [
- "reports_to_employee_code",
- "org_code"
- ],
- "targetTableName": "employee_org_membership",
- "targetColumnNames": [
- "employee_code",
- "org_code"
- ],
- "updateConstraint": "NO_ACTION",
- "deleteConstraint": "SET_NULL"
- }
- }
- }
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "checks": { | |
| "name": "ck_eom_reports_to_not_self", | |
| "columns": {}, | |
| "comment": null, | |
| "indexes": {}, | |
| "constraints": { | |
| "type": { | |
| "type": "CHECK", | |
| "name": "ck_eom_reports_to_not_self", | |
| "detail": "reports_to_employee_code IS NULL OR reports_to_employee_code <> employee_code" | |
| }, | |
| "type2": { | |
| "type": "FOREIGN KEY", | |
| "name": "fk_eom_reports_to_same_org", | |
| "columnNames": [ | |
| "reports_to_employee_code", | |
| "org_code" | |
| ], | |
| "targetTableName": "employee_org_membership", | |
| "targetColumnNames": [ | |
| "employee_code", | |
| "org_code" | |
| ], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "SET_NULL" | |
| } | |
| } | |
| } | |
| } |
🤖 Prompt for AI Agents
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001-difficult.json
lines 228-255: the "checks" pseudo-table block should be removed and its
constraints folded into the real employee_org_membership table; delete the
entire "checks" object and move the CHECK constraint
(ck_eom_reports_to_not_self) and the FOREIGN KEY constraint
(fk_eom_reports_to_same_org) into the "constraints" section of the
employee_org_membership table entry, ensuring the CHECK uses the same detail
text and the FK references targetTableName/targetColumnNames and update/delete
actions are preserved.
| "foreign_keys": { | ||
| "type": "FOREIGN KEY", | ||
| "name": "", | ||
| "columnNames": [ | ||
| "" | ||
| ], | ||
| "targetTableName": "", | ||
| "targetColumnNames": [ | ||
| "" | ||
| ], | ||
| "updateConstraint": "NO_ACTION", | ||
| "deleteConstraint": "NO_ACTION" | ||
| }, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove placeholder FOREIGN KEY with empty identifiers in position.constraints
This FK has empty name, columnNames, targetTableName, and targetColumnNames, which is invalid and may break consumers.
"primary_key": {
"type": "PRIMARY KEY",
"name": "pk_position",
"columnNames": [
"position_code"
]
},
- "foreign_keys": {
- "type": "FOREIGN KEY",
- "name": "",
- "columnNames": [
- ""
- ],
- "targetTableName": "",
- "targetColumnNames": [
- ""
- ],
- "updateConstraint": "NO_ACTION",
- "deleteConstraint": "NO_ACTION"
- },
"uniques": {
"type": "UNIQUE",
"name": "uq_position_name",
"columnNames": [
"position_name"
]
},📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "foreign_keys": { | |
| "type": "FOREIGN KEY", | |
| "name": "", | |
| "columnNames": [ | |
| "" | |
| ], | |
| "targetTableName": "", | |
| "targetColumnNames": [ | |
| "" | |
| ], | |
| "updateConstraint": "NO_ACTION", | |
| "deleteConstraint": "NO_ACTION" | |
| }, | |
| "primary_key": { | |
| "type": "PRIMARY KEY", | |
| "name": "pk_position", | |
| "columnNames": [ | |
| "position_code" | |
| ] | |
| }, | |
| "uniques": { | |
| "type": "UNIQUE", | |
| "name": "uq_position_name", | |
| "columnNames": [ | |
| "position_name" | |
| ] | |
| }, |
🤖 Prompt for AI Agents
In
frontend/internal-packages/schema-bench/benchmark-workspace-default/execution/output/case-001.json
around lines 98–110, there is a placeholder FOREIGN KEY object with empty name,
columnNames, targetTableName, and targetColumnNames which is invalid; remove
this placeholder foreign_keys entry (or replace it with a valid FK object) so
position.constraints does not contain entries with empty identifiers, and ensure
the producing code validates FK objects and omits them when required fields are
empty before writing the JSON.
| const getExecutionBaseDir = (): string => { | ||
| // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset | ||
| const workspaceExecution = getWorkspaceSubPath('execution') | ||
| if (existsSync(workspaceExecution)) return workspaceExecution | ||
| const __filename = fileURLToPath(import.meta.url) | ||
| const __dirname = dirname(__filename) | ||
| return resolve(__dirname, '../../benchmark-workspace-default/execution') | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Workspace detection likely misses actual workspace layout
Workspace structure in README nests datasets under benchmark-workspace/<dataset>/execution/.... Checking only <workspace>/execution will often fail and force the internal fallback.
Use a smarter probe (prefer default dataset if present, then legacy single-level):
-const getExecutionBaseDir = (): string => {
- // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset
- const workspaceExecution = getWorkspaceSubPath('execution')
- if (existsSync(workspaceExecution)) return workspaceExecution
- const __filename = fileURLToPath(import.meta.url)
- const __dirname = dirname(__filename)
- return resolve(__dirname, '../../benchmark-workspace-default/execution')
-}
+const getExecutionBaseDir = (): string => {
+ // Prefer workspace default dataset, fall back to legacy flat layout, then internal default dataset
+ const workspaceDefault = getWorkspaceSubPath('default', 'execution')
+ if (existsSync(workspaceDefault)) return workspaceDefault
+ const workspaceFlat = getWorkspaceSubPath('execution')
+ if (existsSync(workspaceFlat)) return workspaceFlat
+ const __filename = fileURLToPath(import.meta.url)
+ const __dirname = dirname(__filename)
+ return resolve(__dirname, '../../benchmark-workspace-default/execution')
+}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const getExecutionBaseDir = (): string => { | |
| // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset | |
| const workspaceExecution = getWorkspaceSubPath('execution') | |
| if (existsSync(workspaceExecution)) return workspaceExecution | |
| const __filename = fileURLToPath(import.meta.url) | |
| const __dirname = dirname(__filename) | |
| return resolve(__dirname, '../../benchmark-workspace-default/execution') | |
| } | |
| const getExecutionBaseDir = (): string => { | |
| // Prefer workspace default dataset, fall back to legacy flat layout, then internal default dataset | |
| const workspaceDefault = getWorkspaceSubPath('default', 'execution') | |
| if (existsSync(workspaceDefault)) return workspaceDefault | |
| const workspaceFlat = getWorkspaceSubPath('execution') | |
| if (existsSync(workspaceFlat)) return workspaceFlat | |
| const __filename = fileURLToPath(import.meta.url) | |
| const __dirname = dirname(__filename) | |
| return resolve(__dirname, '../../benchmark-workspace-default/execution') | |
| } |
🤖 Prompt for AI Agents
In frontend/internal-packages/schema-bench/src/cli/executeOpenai.ts around lines
29-36, the current workspace probe only checks for <workspace>/execution and
therefore misses layouts where datasets are nested under
benchmark-workspace/<dataset>/execution; update getExecutionBaseDir to first
resolve the top-level benchmark-workspace path, look for a subdirectory whose
execution subpath exists (prefer a "default" dataset if present, otherwise pick
the first matching dataset), return that dataset's execution path if found, then
fall back to the legacy <workspace>/execution check, and finally fall back to
the internal default execution path; implement this by listing entries under the
workspace directory (or using a glob) and testing each candidate's execution
subpath before returning.
- Remove benchmark output files from version control (should not be committed) - Fix NODE_OPTIONS debugger conflict in unified CLI spawn process - Update benchmark-execute command documentation with unified CLI examples - Improve CLI error handling and environment variable cleanup 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
- Switch to Node --import tsx for better TypeScript execution compatibility - Remove createRequire dependency for cleaner ESM implementation - Update binary script to use consistent tsx execution pattern - Enhanced benchmark-execute documentation with unified CLI examples - Improved NODE_OPTIONS handling to prevent debugger conflicts This ensures the unified CLI works reliably across different environments while maintaining full backward compatibility with existing scripts. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (3)
frontend/internal-packages/schema-bench/bin/schema-bench.mjs (2)
21-29: Small DX polish: inherit stdio already; add windowsHide false (optional).Not required, but adding windowsHide: false prevents stray console windows on Windows.
const child = spawn( process.execPath, ['--import', 'tsx', entryTs, ...process.argv.slice(2)], { stdio: 'inherit', env, cwd: process.cwd(), + windowsHide: false, }, )
9-10: Fail fast iftsxisn't resolvable; add Node engine or fallbackFile: frontend/internal-packages/schema-bench/bin/schema-bench.mjs
- frontend/internal-packages/schema-bench/package.json declares [email protected] and has no "engines" field.
- Add a runtime check (fail-fast with a clear error) before spawning; apply the suggested diff:
-const _require = createRequire(import.meta.url) +const _require = createRequire(import.meta.url) @@ - const child = spawn( + // Verify that 'tsx' is available; give a helpful error if not. + try { + _require.resolve('tsx') + } catch { + console.error('[schema-bench] Missing dependency: tsx\n' + + 'Please add "tsx" to dependencies of @liam-hq/schema-bench (or ensure it is hoisted and resolvable).') + process.exit(1) + } + + const child = spawn( process.execPath, ['--import', 'tsx', entryTs, ...process.argv.slice(2)],
- Because "engines" is unset, either add "engines.node": ">=20.6.0" to package.json to require Node versions that support --import, or implement a fallback path using
--loader tsxfor older Node releases..claude/commands/benchmark-execute.md (1)
81-82: Add hint about workspace setup for OpenAI (optional).OpenAI can run against the default dataset or a prepared workspace. A short note helps first-time users who skipped setupWorkspace.
-Note: For OpenAI execution, ensure `OPENAI_API_KEY` is set in your environment. +Note: For OpenAI execution, ensure `OPENAI_API_KEY` is set. If running against the internal datasets, run the workspace setup first: +`pnpm -F @liam-hq/schema-bench setupWorkspace`
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
.claude/commands/benchmark-execute.md(2 hunks)frontend/internal-packages/schema-bench/bin/schema-bench.mjs(1 hunks)frontend/internal-packages/schema-bench/package.json(2 hunks)frontend/internal-packages/schema-bench/src/cli/index.ts(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- frontend/internal-packages/schema-bench/package.json
- frontend/internal-packages/schema-bench/src/cli/index.ts
🧰 Additional context used
📓 Path-based instructions (1)
frontend/internal-packages/**
📄 CodeRabbit inference engine (AGENTS.md)
Infra and tooling (e2e, configs, storybook, agent) live under frontend/internal-packages
Files:
frontend/internal-packages/schema-bench/bin/schema-bench.mjs
🧠 Learnings (2)
📓 Common learnings
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
📚 Learning: 2025-07-30T05:52:56.270Z
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
Applied to files:
frontend/internal-packages/schema-bench/bin/schema-bench.mjs
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Supabase Preview
- GitHub Check: Supabase Preview
🔇 Additional comments (3)
.claude/commands/benchmark-execute.md (3)
41-52: LGTM: Unified CLI examples for Liam executor are clear and backward-compatible.The pass-through dataset flags (-all, -entity-extraction, -ambiguous-recall) align with the wrapper’s “forward unknown args” approach.
If helpful, add one line showing
--helpto discover flags:pnpm -F @liam-hq/schema-bench schema-bench execute --executor liam --help
61-66: LGTM: OpenAI unified CLI and legacy command are correctly documented.Clear that unified defaults to the default dataset; legacy remains intact.
74-79: LGTM: Evaluation flow mirrors the unified/legacy pattern consistently.Examples are concise and correct.
| const env = { ...process.env } | ||
| if (env.NODE_OPTIONS?.includes('--inspect')) { | ||
| delete env.NODE_OPTIONS | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Filter out only inspector flags from NODE_OPTIONS (don’t drop unrelated flags).
Deleting NODE_OPTIONS wholesale can discard non-inspector flags (e.g., memory limits). Strip only --inspect/--inspect-brk.
- const env = { ...process.env }
- if (env.NODE_OPTIONS?.includes('--inspect')) {
- delete env.NODE_OPTIONS
- }
+ const env = { ...process.env }
+ if (env.NODE_OPTIONS) {
+ const filtered = env.NODE_OPTIONS
+ .split(/\s+/)
+ .filter((opt) => !/^--inspect(?:-brk)?(?:=|$)/.test(opt))
+ .join(' ')
+ if (filtered) env.NODE_OPTIONS = filtered
+ else delete env.NODE_OPTIONS
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| const env = { ...process.env } | |
| if (env.NODE_OPTIONS?.includes('--inspect')) { | |
| delete env.NODE_OPTIONS | |
| } | |
| const env = { ...process.env } | |
| if (env.NODE_OPTIONS) { | |
| const filtered = env.NODE_OPTIONS | |
| .split(/\s+/) | |
| .filter((opt) => !/^--inspect(?:-brk)?(?:=|$)/.test(opt)) | |
| .join(' ') | |
| if (filtered) env.NODE_OPTIONS = filtered | |
| else delete env.NODE_OPTIONS | |
| } |
🤖 Prompt for AI Agents
In frontend/internal-packages/schema-bench/bin/schema-bench.mjs around lines 17
to 20, instead of deleting NODE_OPTIONS entirely, parse env.NODE_OPTIONS into
space-separated tokens, filter out only inspector-related flags (--inspect,
--inspect-brk and their =host:port variants), and rejoin the remaining tokens;
if none remain delete env.NODE_OPTIONS, otherwise set it to the filtered string
so non-inspector flags (e.g., memory limits) are preserved.
| child.on('exit', (code) => process.exit(code ?? 0)) | ||
| child.on('error', () => process.exit(1)) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Propagate child signals and preserve true exit semantics.
To fully “preserve exit-code semantics,” forward signals (SIGINT, SIGTERM, SIGHUP) and re-emit them when the child exits via signal.
- child.on('exit', (code) => process.exit(code ?? 0))
- child.on('error', () => process.exit(1))
+ // Forward common signals to child so Ctrl+C etc. behave correctly
+ for (const sig of ['SIGINT', 'SIGTERM', 'SIGHUP']) {
+ process.on(sig, () => {
+ try { child.kill(sig as NodeJS.Signals) } catch {}
+ })
+ }
+ child.on('exit', (code, signal) => {
+ if (signal) {
+ // Re-emit same signal to preserve semantics
+ try { process.kill(process.pid, signal as NodeJS.Signals) } catch {}
+ } else {
+ process.exit(code ?? 0)
+ }
+ })
+ child.on('error', (err) => {
+ console.error('[schema-bench] Failed to start TS CLI:', err)
+ process.exit(1)
+ })Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In frontend/internal-packages/schema-bench/bin/schema-bench.mjs around lines
31–33, the child process exit handlers don't forward incoming signals or re-emit
child exit signals to the parent; add listeners for SIGINT, SIGTERM, and SIGHUP
that forward the same signal to the child (child.kill(signal)), and update the
child.on('exit', ...) handler so that if the child exited due to a signal you
re-emit that signal on the parent (e.g. process.kill(process.pid, signal))
instead of exiting with a code, otherwise call process.exit(code ?? 0); keep the
existing child.on('error', ...) behavior.
- Added CLI entry point to knip ignore list with comment explaining why - CLI entry is launched via bin shim and knip cannot detect it automatically 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <[email protected]>
| - **Input Standardization**: Entity-extraction inputs are automatically wrapped in `{"input": "..."}` format | ||
|
|
||
| Next, I'll execute the specified model with dataset selection: | ||
| Next, I'll execute the specified model with dataset selection. The unified CLI is recommended; legacy scripts remain available and unchanged: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m making these changes so the commands can also be executed via the CLI, which I think is simpler.
However, removing the existing commands right away would create a larger-than-expected impact, so in the short term I’ll keep the old commands as well.
Once the CLI commands are running stably, the legacy commands will be removed.
| const getExecutionBaseDir = (): string => { | ||
| // Prefer top-level benchmark workspace execution dir, fallback to internal default dataset | ||
| const workspaceExecution = getWorkspaceSubPath('execution') | ||
| if (existsSync(workspaceExecution)) return workspaceExecution | ||
| const __filename = fileURLToPath(import.meta.url) | ||
| const __dirname = dirname(__filename) | ||
| return resolve(__dirname, '../../benchmark-workspace-default/execution') | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I fixed the outdated directory specification. This isn't a core change in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
frontend/internal-packages/schema-bench/src/cli/index.ts is responsible for parsing the command-line options and invoking the appropriate function(s) to execute the CLI logic.
| "frontend/apps/app/features/public-share/actions.ts" | ||
| "frontend/apps/app/features/public-share/actions.ts", | ||
| // - CLI entry is launched via bin shim; knip can't detect it | ||
| "frontend/internal-packages/schema-bench/src/cli/index.ts" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m making these changes so the commands can also be executed via the CLI, which I think is simpler.
However, removing the existing commands right away would create a larger-than-expected impact, so in the short term I’ll keep the old commands as well.
Once the CLI commands are running stably, the legacy commands will be removed.
This is a temporary measure due to the impact of this change. It will be removed during refactoring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements MVP1 of a unified CLI for schema-bench, providing a single entry point that wraps existing TypeScript scripts while maintaining full backward compatibility. The unified CLI supports execution, evaluation, and comparison operations with consistent interface across different executors.
- Introduces a new
schema-benchcommand with subcommands for execute (with executor selection), evaluate, and compare (placeholder) - Implements child process wrapper pattern that forwards arguments to existing scripts unchanged
- Enhanced OpenAI executor with flexible workspace detection and improved error handling
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/cli/index.ts | New unified CLI entry point with command parsing and child process execution |
| src/cli/executeOpenai.ts | Enhanced with flexible workspace detection and base directory parameter threading |
| package.json | Adds CLI binary entry point and npm script |
| README.md | Updated documentation showing both unified CLI and legacy script usage |
| knip.jsonc | Excludes CLI entry from unused code detection |
| .claude/commands/benchmark-execute.md | Updated command examples with unified CLI recommendations |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
|
||
| const VERSION = '0.1.0' |
Copilot
AI
Sep 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded version number creates maintenance burden. Consider reading from package.json to ensure consistency and avoid manual updates.
| const VERSION = '0.1.0' | |
| import fs from 'node:fs' | |
| // Dynamically read version from package.json | |
| const __filename = fileURLToPath(import.meta.url) | |
| const __dirname = path.dirname(__filename) | |
| // Adjust the path to package.json as needed (here: three levels up) | |
| const pkgPath = path.resolve(__dirname, '../../../package.json') | |
| let VERSION = 'unknown' | |
| try { | |
| const pkgRaw = fs.readFileSync(pkgPath, 'utf8') | |
| const pkg = JSON.parse(pkgRaw) | |
| VERSION = pkg.version || VERSION | |
| } catch (e) { | |
| // fallback: VERSION remains 'unknown' | |
| } |
| for (const tok of args) { | ||
| if (expectValue) { | ||
| if (!tok.startsWith('-')) { | ||
| executor = tok | ||
| } else { | ||
| pass.push(tok) | ||
| } | ||
| expectValue = false | ||
| continue | ||
| } |
Copilot
AI
Sep 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When expectValue is true but tok starts with '-', the code adds tok to pass but doesn't handle the missing executor value. This could silently ignore missing values after --executor flag.
| const map: Record<string, string> = { | ||
| openai: resolveCli('./executeOpenai.ts'), | ||
| liam: resolveCli('./executeLiamDbUnified.ts'), | ||
| } | ||
| const target = map[String(executor).toLowerCase()] |
Copilot
AI
Sep 12, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The executor mapping logic is hardcoded and repeated validation occurs. Consider extracting valid executors to a constant array for reuse in validation messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
knip.jsonc (1)
33-35: Declare schema-bench CLI as a knip workspace entry (remove from ignore)frontend/internal-packages/schema-bench contains CLI sources (src/cli/**/*) and a bin in package.json—remove the CLI from knip ignores and add a workspace entry so knip can traverse the CLI and spawned TS scripts.
- "frontend/apps/app/features/public-share/actions.ts", - // - CLI entry is launched via bin shim; knip can't detect it - "frontend/internal-packages/schema-bench/src/cli/index.ts" + "frontend/apps/app/features/public-share/actions.ts"Add this workspace config (in your workspaces section):
{ "workspaces": { "frontend/internal-packages/schema-bench": { "entry": [ "src/cli/index.ts", "src/cli/**/*.ts" ] } } }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
knip.jsonc(1 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: hoshinotsuyoshi
PR: liam-hq/liam#2771
File: frontend/internal-packages/schema-bench/src/cli/executeLiamDb.ts:22-22
Timestamp: 2025-07-30T05:52:56.270Z
Learning: The schema-bench package (frontend/internal-packages/schema-bench) has been converted from ESM to CommonJS mode by removing "type": "module" from package.json, making __dirname available and correct to use in TypeScript files within this package.
📚 Learning: 2025-08-25T01:34:20.255Z
Learnt from: CR
PR: liam-hq/liam#0
File: CLAUDE.md:0-0
Timestamp: 2025-08-25T01:34:20.255Z
Learning: Applies to frontend/apps/@(app|docs)/app/**/*.{ts,tsx} : Prefer Server Components for server-side data fetching
Applied to files:
knip.jsonc
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Supabase Preview
- GitHub Check: frontend-ci
- GitHub Check: frontend-lint
- GitHub Check: Supabase Preview
|
Thank you for working late at night, I'll check it on Monday! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@NoritakaIkeda
I couldn't understand why this change was necessary even after reading the description.
Why should they be unified? Is it impossible to configure the reasoning effort without unifying them?
It also seemed odd that the mjs script was calling the tx process. Since we're using tsx, building shouldn't be necessary in the first place.
| # Unified CLI (recommended) | ||
| pnpm -F @liam-hq/schema-bench schema-bench evaluate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(👀 I'm testing now)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prepare
rm -rf benchmark-workspace
pnpm --filter @liam-hq/schema-bench setupWorkspace
execute(openai)
# success
pnpm -F @liam-hq/schema-bench schema-bench execute --executor openai
evaluation
Actually, an error occurred, but it seems to be a schema discrepancy error. The directory seems to be fine. 👍
pnpm -F @liam-hq/schema-bench schema-bench
> @liam-hq/[email protected] schema-bench /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/frontend/internal-packages/schema-bench
> node bin/schema-bench.mjs evaluate
⚠️ Failed to load 3 file(s):
- case-001-difficult.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001-difficult.json: Invalid schema format: Invalid key: Expected "enums" but received undefined
- case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "enums" but received undefined
- case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "enums" but received undefined
❌ Failed to evaluate dataset: {
"type": "JSON_PARSE_ERROR",
"path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/default/execution/reference/case-001-difficult.json",
"cause": "Invalid schema format: Invalid key: Expected \"enums\" but received undefined"
}
⚠️ Failed to load 5 file(s):
- case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
- case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
- case-003.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-003.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
- case-004.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-004.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
- case-005.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-005.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
❌ Failed to evaluate dataset: {
"type": "JSON_PARSE_ERROR",
"path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/entity-extraction/execution/reference/case-001.json",
"cause": "Invalid schema format: Invalid key: Expected \"comment\" but received undefined"
}
⚠️ Failed to load 3 file(s):
- case-001.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-001.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
- case-002.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-002.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
- case-003.json: Failed to parse JSON at /Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-003.json: Invalid schema format: Invalid key: Expected "comment" but received undefined
❌ Failed to evaluate dataset: {
"type": "JSON_PARSE_ERROR",
"path": "/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/benchmark-workspace/ambiguous-recall/execution/reference/case-001.json",
"cause": "Invalid schema format: Invalid key: Expected \"comment\" but received undefined"
}
❌ Evaluation failed for 3 dataset(s)
/Users/hoshino/.local/share/go/src/github.com/hoshinotsuyoshi/liam-1/frontend/internal-packages/schema-bench:
ERR_PNPM_RECURSIVE_RUN_FIRST_FAIL @liam-hq/[email protected] schema-bench: `node bin/schema-bench.mjs evaluate`
Exit status 1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits
Actually, since .claude/commands/benchmark-execute.md is a command file specifically for Claude, I don't think we need to include the legacy usage patterns. Unless there's some benefit to keeping them, what do you think?
|
@NoritakaIkeda Is it okay to close this PR? |
|
@MH4GF |
Issue
Why is this change needed?
This implements MVP1 of the schema-bench unified CLI roadmap, providing a single entry point for execution, evaluation, and comparison operations while maintaining full backward compatibility with existing scripts.
The unified CLI addresses the need for a consistent interface across different executors (OpenAI, Liam) and operations, setting the foundation for future enhancements like model comparison and reasoning effort configuration.
This is preparation to enable configuration of reasoning effort from the command line.
Key Changes
Unified CLI Interface
schema-benchcommand with subcommands:execute --executor <liam|openai>: Unified execution interfaceevaluate: Multi-dataset evaluation wrappercompare: MVP1 placeholder for future implementationNon-breaking Implementation
spawn()to call existing TypeScript scriptspackage.jsonscripts continue to workEnhanced OpenAI Executor
Documentation Updates
Technical Implementation
The MVP1 implementation follows the "wrapper + pass-through" pattern to minimize risk:
CLI Entry Point (
src/cli/index.ts):commander.jspattern for argument parsingChild Process Execution:
tsxruntimeWorkspace Resolution:
executeOpenai.tswith flexible base directory detectionAcceptance Criteria
✅ Full Compatibility: All existing scripts work unchanged
✅ Unified Interface: New CLI provides consistent interface across executors
✅ Non-breaking: Legacy workflows continue without modification
✅ Extensible: Foundation for future MVP2/3 enhancements
Testing
This sets the foundation for future enhancements while ensuring zero disruption to existing workflows.
Summary by CodeRabbit
New Features
Improvements
Documentation