* Add --json-response flag for structured API responses
Adds a new CLI flag that enables JSON response formatting:
- Adds json_response field to RequestFuncInput model
- Modifies OpenAI backend to apply JSON formatting when flag is enabled
- Includes response_format and chat_template_kwargs settings
- Prompts model to avoid premature JSON closure
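The backend change described above can be sketched as follows. This is an illustrative assumption, not the actual implementation: `build_chat_payload` is a hypothetical helper, and the `chat_template_kwargs`/`enable_thinking` key follows the convention used by some vLLM chat templates.

```python
def build_chat_payload(model, prompt, json_response=False,
                       json_schema=None, disable_thinking=False):
    """Sketch of an OpenAI-style chat completions payload builder."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if json_schema is not None:
        # Structured output: constrain generation to the given schema.
        payload["response_format"] = {
            "type": "json_schema",
            "json_schema": {"name": "response", "schema": json_schema},
        }
    elif json_response:
        # Plain JSON mode: any valid JSON object is accepted.
        payload["response_format"] = {"type": "json_object"}
    if disable_thinking:
        # Assumed key name; some chat templates expose an enable_thinking toggle.
        payload["chat_template_kwargs"] = {"enable_thinking": False}
    return payload
```

Note that `json_schema` takes precedence over plain `json_response` here, mirroring the later commits that layer schema support on top of the original flag.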
* change prompt
* add --disable-thinking separately
* slight prompt change
* update README
* Implement JSON schema support for structured outputs
- Add --json-schema-file and --json-schema-inline CLI arguments
- Add --json-response-prompt for customizable JSON formatting messages
- Extend RequestFuncInput and Client classes with json_schema support
- Update OpenAI chat completions backend to use proper JSON schema format
- Add sample JSON schema files for testing
- Maintain backward compatibility with existing --json-response flag
* Enhance JSON schema system with flexible prompt handling
- Replace --json-response-prompt with unified --json-prompt argument
- Add @file syntax support for loading prompts from files
- Add --include-schema-in-prompt flag to include schema in prompt text
- Implement comprehensive input validation with clear error messages
- Simplify backend prompt logic with consistent schema formatting
- Add extensive README documentation with examples and usage patterns
- Remove deprecated --json-response-prompt for cleaner API
- Fix error handling for malformed JSON responses in streaming mode
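The streaming fix in the last bullet can be sketched like this; `parse_stream_chunk` is a hypothetical helper name, assuming the server streams SSE-style `data:` lines:

```python
import json

def parse_stream_chunk(line):
    """Parse one SSE data line from a streaming response.

    Returns the decoded chunk, or None for keepalives, the [DONE]
    sentinel, and malformed JSON (instead of aborting the stream).
    """
    if not line.startswith("data: "):
        return None
    body = line[len("data: "):].strip()
    if body == "[DONE]":
        return None
    try:
        return json.loads(body)
    except json.JSONDecodeError:
        # Skip malformed chunks rather than failing the whole request.
        return None
```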
* Fix overly general exception handling in main.py
- Replace broad Exception catches with specific exception types
- Use OSError, PermissionError for file operations
- Use json.JSONDecodeError for JSON parsing errors
- Improve error messages with more specific context
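A minimal sketch of the narrowed exception handling for file loading; `load_json_schema` is an illustrative name, not the function in main.py. Note that `PermissionError` is a subclass of `OSError`, so a single `OSError` handler covers both access failures:

```python
import json

def load_json_schema(path):
    """Load and parse a schema file with specific, actionable errors."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f)
    except OSError as e:
        # Covers missing files, permission errors, and other I/O failures.
        raise SystemExit(f"Cannot read schema file '{path}': {e}")
    except json.JSONDecodeError as e:
        raise SystemExit(f"Schema file '{path}' is not valid JSON: {e}")
```

Catching `OSError` and `json.JSONDecodeError` separately lets each error message name the actual failure mode, unlike a blanket `except Exception`.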
* Clean up sample schemas, keep only simple_schema.json
- Remove complex_schema.json and sample_response_schema.json
- Keep simple_schema.json as the primary example schema
- Update simple_schema.json with improved structure
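The actual contents of `tests/data/simple_schema.json` are not shown here; the following is a hypothetical example of what a simple object schema of this kind typically looks like:

```python
import json

# Hypothetical stand-in for a simple example schema; field names
# are illustrative, not taken from the repository.
SIMPLE_SCHEMA = {
    "type": "object",
    "properties": {
        "answer": {"type": "string"},
        "confidence": {"type": "number", "minimum": 0, "maximum": 1},
    },
    "required": ["answer"],
}

# Serialize as it would appear in a schema file.
print(json.dumps(SIMPLE_SCHEMA, indent=2))
```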
* Simplify JSON schema documentation in README
- Remove verbose examples and compatibility notes
- Keep only essential file-based and inline schema usage
- Reference tests/data/simple_schema.json for example schema
- Make documentation concise and focused
* Refactor JSON validation to parse_args function
- Move JSON argument validation from run_main() to parse_args()
- Create validate_json_args() function for better separation of concerns
- Process and validate JSON arguments early during argument parsing
- Store processed custom_prompt and json_schema in args namespace
- Maintain same validation logic but in proper location
- Follow pattern of other argument validations in parse_args()
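The refactor above can be sketched as follows. All names (`validate_json_args`, `_resolve_at_file`, the stored `custom_prompt` attribute) are assumptions for illustration; the sketch shows the pattern of validating and pre-processing JSON arguments inside `parse_args` via `parser.error`:

```python
import argparse
import json

def _resolve_at_file(value):
    """Return file contents for '@path' values, else the value itself."""
    if value.startswith("@"):
        with open(value[1:], "r", encoding="utf-8") as f:
            return f.read()
    return value

def validate_json_args(args, parser):
    """Validate and pre-process JSON arguments during argument parsing."""
    if args.include_schema_in_prompt and not args.json_schema:
        parser.error("--include-schema-in-prompt requires --json-schema")
    if args.json_prompt:
        args.custom_prompt = _resolve_at_file(args.json_prompt)
    if args.json_schema:
        raw = _resolve_at_file(args.json_schema)
        try:
            args.json_schema = json.loads(raw)
        except json.JSONDecodeError as e:
            parser.error(f"--json-schema is not valid JSON: {e}")

def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--json-response", action="store_true")
    parser.add_argument("--json-prompt")
    parser.add_argument("--json-schema")
    parser.add_argument("--include-schema-in-prompt", action="store_true")
    args = parser.parse_args(argv)
    args.custom_prompt = None
    # Validation happens here, so bad input fails fast with a usage error.
    validate_json_args(args, parser)
    return args
```

Using `parser.error` keeps the failure behavior consistent with argparse's other validations: a usage message plus a non-zero exit.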
* Consolidate JSON schema arguments into unified --json-schema flag
Replace separate --json-schema-file and --json-schema-inline arguments with single --json-schema that supports both inline JSON and @file syntax, matching the pattern established by --json-prompt.
* Clean up code to address comments
* Update README.md
Co-authored-by: Benjamin Chislett <[email protected]>
* Update README.md
Co-authored-by: Benjamin Chislett <[email protected]>
---------
Co-authored-by: Benjamin Chislett <[email protected]>
README.md: 19 additions & 1 deletion

````diff
@@ -59,6 +59,11 @@ After benchmarking, the results are saved to `output-file.json` (or specified by
 |`--disable-tqdm`| Specify to disable tqdm progress bar. |
 |`--best-of`| Number of best completions to return. |
 |`--use-beam-search`| Use beam search for completions. |
+|`--json-response`| Request responses in JSON object format from the API. |
+|`--json-prompt`| No additional context is included in the prompt. Use `--json-prompt` to add custom instructions (appended to end of original prompt) if desired when using one of the JSON modes. Supports inline text or file input with `@file` syntax (e.g., `--json-prompt @prompt.txt`). |
+|`--json-schema`| JSON schema for structured output validation. Supports inline JSON string or file input with `@file` syntax (e.g., `--json-schema @schema.json`). |
+|`--include-schema-in-prompt`| Include the JSON schema in the prompt text for better LLM comprehension. Requires `--json-schema` to be specified. |
+|`--disable-thinking`| Disable thinking mode in chat templates. |
 |`--output-file`| Output json file to save the results. |
 |`--debug`| Log debug messages. |
 |`--profile`| Use Torch Profiler. The endpoint must be launched with VLLM_TORCH_PROFILER_DIR to enable profiler. |
@@ -72,6 +77,18 @@ After benchmarking, the results are saved to `output-file.json` (or specified by
 |`--top-p`| Top-P to use for sampling. Defaults to None, or 1.0 for backends which require it to be specified. |
 |`--top-k`| Top-K to use for sampling. Defaults to None. |
 
+### JSON Schema Support
+
+For structured JSON outputs with schema validation:
+
+```bash
+# File-based schema (see tests/data/simple_schema.json for example)
 In addition to providing these arguments on the command-line, you can use `--config-file` to pre-define the parameters for your use case. Examples are provided in `examples/`
````