Conversation

@ansschh ansschh commented Oct 3, 2025

Adds unit tests for tokenizer encoding/decoding functionality.

Tests Added:

  • Basic encoding/decoding
  • Harmony special token handling
  • Round-trip consistency
  • Edge cases and error handling
  • Reserved token range validation
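The round-trip consistency check listed above can be sketched as a small property test. The `encode`/`decode` pair below is a hypothetical byte-level stand-in, not the project's actual tokenizer:

```python
# Round-trip property: decode(encode(s)) == s for any input string.
# This byte-level codec is an illustrative stand-in for the real tokenizer.

def encode(text: str) -> list[int]:
    """Encode a string as a list of UTF-8 byte values (toy tokenizer)."""
    return list(text.encode("utf-8"))

def decode(tokens: list[int]) -> str:
    """Decode a list of byte values back into a string."""
    return bytes(tokens).decode("utf-8")

def test_round_trip() -> None:
    cases = ["", "hello", "caf\u00e9", "emoji \U0001f600", "line\nbreaks\ttabs"]
    for text in cases:
        tokens = encode(text)
        assert all(isinstance(t, int) for t in tokens)
        assert decode(tokens) == text, f"round trip failed for {text!r}"

test_round_trip()
```

The same property-style structure applies to the real tokenizer: any string the tokenizer accepts should survive an encode/decode cycle unchanged, including non-ASCII and control characters.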

ansschh added 3 commits August 9, 2025 11:25
- Add return type hint to get_tokenizer()
- Add type hints and checkpoint validation to generate.py main()
- Add parameter type hints to suppress_output() in torch/utils.py

Improves IDE support and catches potential bugs early.

- Add CUDA availability check before device initialization
- Validate rank against available CUDA device count
- Add device accessibility testing with clear error messages
- Add error handling for distributed communication setup
- Add cleanup for failed distributed process group initialization
- Provide helpful error messages with troubleshooting guidance

This prevents cryptic CUDA errors and provides clear feedback when:
- CUDA is not available
- Invalid device rank is specified
- Device access fails
- Distributed communication fails
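The availability and rank checks described above can be sketched as a standalone preflight function. The function name and message text are illustrative, not the PR's actual code; in the real version the two flags would come from `torch.cuda.is_available()` and `torch.cuda.device_count()`:

```python
# Hypothetical sketch of the device preflight checks described above.
# In real code, cuda_available and device_count would be obtained from
# torch.cuda.is_available() and torch.cuda.device_count().

def validate_device(rank: int, cuda_available: bool, device_count: int) -> str:
    """Return the device string for `rank`, or raise with a clear message."""
    if not cuda_available:
        raise RuntimeError(
            "CUDA is not available. Check your GPU driver and PyTorch build."
        )
    if not 0 <= rank < device_count:
        raise ValueError(
            f"Invalid rank {rank}: only {device_count} CUDA device(s) visible."
        )
    return f"cuda:{rank}"
```

Failing fast here, before any distributed setup runs, is what replaces the cryptic downstream CUDA errors with a single actionable message.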

- Test basic encoding/decoding functionality
- Test Harmony special token handling
- Test round-trip consistency
- Test edge cases and error handling
- Verify reserved token range
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.


Comment on lines +13 to +17
def main(args: argparse.Namespace) -> None:
    # Validate checkpoint path exists
    checkpoint_path = Path(args.checkpoint)
    if not checkpoint_path.exists():
        raise FileNotFoundError(f"Checkpoint path does not exist: {args.checkpoint}")


P1: Allow non-local checkpoints for vLLM backend

The new preflight check unconditionally wraps args.checkpoint in Path and raises FileNotFoundError when the path does not exist. This happens before the backend switch, so it also triggers when using the vLLM backend, which previously accepted HuggingFace model identifiers or other non-local URIs and downloaded weights on demand. With the change, any string that is not an existing filesystem path now fails before vLLM can handle it, breaking valid use cases such as --backend vllm --checkpoint mistralai/Mixtral-8x7B. Consider moving the existence check inside the torch/triton branches or restricting it to local backends only.
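One way to apply this suggestion is to gate the existence check on the backend. A minimal sketch, assuming hypothetical backend names (`"torch"`, `"triton"`, `"vllm"`) and a helper function that is not part of the PR:

```python
from pathlib import Path

# Assumed backend names for illustration; the real CLI may differ.
LOCAL_BACKENDS = {"torch", "triton"}

def validate_checkpoint(checkpoint: str, backend: str) -> str:
    """Require an existing local path only for local backends.

    Non-local backends (e.g. vLLM) may receive HuggingFace model
    identifiers or remote URIs and resolve them on demand, so the
    filesystem check is skipped for them.
    """
    if backend in LOCAL_BACKENDS and not Path(checkpoint).exists():
        raise FileNotFoundError(f"Checkpoint path does not exist: {checkpoint}")
    return checkpoint
```

With this shape, `--backend vllm --checkpoint mistralai/Mixtral-8x7B` passes through untouched, while the torch/triton paths still fail fast on a missing local checkpoint.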
