add e2e test #9
base: up-ar
Conversation
Signed-off-by: yiliu30 <[email protected]>
/gemini review
Code Review
This pull request adds a new end-to-end test for AutoRound quantization. The changes include a new test configuration, a recipe for AutoRound, and modifications to the e2e testing utilities to support the auto_round_align dataset. Additionally, the test_lmeval.py file has been refactored to improve the evaluation flow.
My review focuses on improving code quality and robustness. I've pointed out some unused imports and hardcoded values in e2e_utils.py. I also suggested replacing debug print statements with proper logging in test_lmeval.py for consistency.
e2e_utils.py:

    ds = process_dataset(ds, processor, max_seq_length)
    if "auto_round_align" in dataset_id:
        from auto_round.calib_dataset import get_dataset
        from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
The imports AutoConfig and AutoModelForCausalLM are unused in this function and should be removed to keep the code clean. It is also a best practice to place imports at the top of the file for readability.
Suggested change:

    - from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
    + from transformers import AutoTokenizer
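As an illustration of the suggestion, and assuming nothing else in e2e_utils.py needs AutoConfig or AutoModelForCausalLM, the import could be trimmed and hoisted:

    # e2e_utils.py, top of file
    from transformers import AutoTokenizer

    # auto-round may be an optional test dependency; if so, leaving
    # `from auto_round.calib_dataset import get_dataset` lazy inside the
    # auto_round_align branch avoids a hard requirement for other tests.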
e2e_utils.py:

    seqlen=2048,
    nsamples=128,
The values for seqlen and nsamples are hardcoded. This function receives max_seq_length and num_calibration_samples as arguments, which should be used here instead. This will make the function more robust and prevent potential mismatches with the configuration.
Suggested change:

    - seqlen=2048,
    - nsamples=128,
    + seqlen=max_seq_length,
    + nsamples=num_calibration_samples,
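Combining both suggestions, the branch could read as the sketch below. The wrapper function, the model_id variable, and the way the tokenizer is obtained and passed to get_dataset are illustrative assumptions (the diff only confirms the seqlen and nsamples keywords); process_dataset is assumed to come from the surrounding module:

    def load_calibration_dataset(dataset_id, ds, processor, model_id,
                                 max_seq_length, num_calibration_samples):
        # Hypothetical wrapper mirroring the code under review.
        ds = process_dataset(ds, processor, max_seq_length)
        if "auto_round_align" in dataset_id:
            from auto_round.calib_dataset import get_dataset
            from transformers import AutoTokenizer

            tokenizer = AutoTokenizer.from_pretrained(model_id)
            ds = get_dataset(
                tokenizer,
                seqlen=max_seq_length,             # was hardcoded to 2048
                nsamples=num_calibration_samples,  # was hardcoded to 128
            )
        return ds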
test_lmeval.py:

    logger.info("================= Running LM Eval on COMPRESSED model ==========")
    - self._run_lm_eval()
    + results = self._run_lm_eval()
    + print(f"Compressed model results: {results}")
This print statement appears to be for debugging. It's better to use the logger instance for logging (e.g., logger.info(...)) to maintain consistency and allow for better log level management.
| print(f"Compressed model results: {results}") | |
| logger.info(f"Compressed model results: {results}") |
test_lmeval.py:

    # Always evaluate base model for recovery testing
    logger.info("================= Evaluating BASE model ======================")
    self.base_results = self._eval_base_model()
    print(f"Base model results: {self.base_results}")
This print statement appears to be for debugging. It's better to use the logger instance for logging (e.g., logger.info(...)) to maintain consistency and allow for better log level management.
| print(f"Base model results: {self.base_results}") | |
| logger.info(f"Base model results: {self.base_results}") |
Signed-off-by: yiliu30 <[email protected]>
Signed-off-by: yiliu30 <[email protected]>
SUMMARY:
Add an end-to-end test for AutoRound quantization: a new test configuration and AutoRound recipe, support for the auto_round_align calibration dataset in the e2e utilities, and a refactor of test_lmeval.py to capture and log evaluation results for both the base and compressed models.
TEST PLAN:
"please outline how the changes were tested"