Conversation


yiliu30 (Owner) commented Nov 4, 2025

SUMMARY:
"please provide a brief summary"

TEST PLAN:
"please outline how the changes were tested"

Signed-off-by: yiliu30 <[email protected]>

yiliu30 (Owner, Author) commented Nov 4, 2025

/gemini review

gemini-code-assist bot left a comment

Code Review

This pull request adds a new end-to-end test for AutoRound quantization. The changes include a new test configuration, a recipe for AutoRound, and modifications to the e2e testing utilities to support the auto_round_align dataset. Additionally, the test_lmeval.py file has been refactored to improve the evaluation flow.

My review focuses on improving code quality and robustness. I've pointed out some unused imports and hardcoded values in e2e_utils.py. I also suggested replacing debug print statements with proper logging in test_lmeval.py for consistency.

ds = process_dataset(ds, processor, max_seq_length)
if "auto_round_align" in dataset_id:
from auto_round.calib_dataset import get_dataset
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

Severity: high

The imports AutoConfig and AutoModelForCausalLM are unused in this function and should be removed to keep the code clean. It is also a best practice to place imports at the top of the file for readability.

Suggested change:
- from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer
+ from transformers import AutoTokenizer

Comment on lines +56 to +57
seqlen=2048,
nsamples=128,

Severity: medium

The values for seqlen and nsamples are hardcoded. This function receives max_seq_length and num_calibration_samples as arguments, which should be used here instead. This will make the function more robust and prevent potential mismatches with the configuration.

Suggested change:
- seqlen=2048,
- nsamples=128,
+ seqlen=max_seq_length,
+ nsamples=num_calibration_samples,
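A minimal sketch of the branch with both fixes applied, for illustration only. The real function in e2e_utils.py has more surrounding context, and every get_dataset argument other than seqlen and nsamples below (the tokenizer keyword, and the wrapper helper itself) is an assumption rather than something shown in the diff.

```python
from auto_round.calib_dataset import get_dataset
from transformers import AutoTokenizer  # AutoConfig / AutoModelForCausalLM dropped as unused

def build_auto_round_calibration(model_id, max_seq_length, num_calibration_samples):
    """Hypothetical helper mirroring the auto_round_align branch."""
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # Thread the configured values through instead of hardcoding 2048 / 128.
    return get_dataset(
        tokenizer=tokenizer,               # assumed keyword; not shown in the quoted snippet
        seqlen=max_seq_length,
        nsamples=num_calibration_samples,
    )
```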

logger.info("================= Running LM Eval on COMPRESSED model ==========")
self._run_lm_eval()
results = self._run_lm_eval()
print(f"Compressed model results: {results}")

Severity: medium

This print statement appears to be for debugging. It's better to use the logger instance for logging (e.g., logger.info(...)) to maintain consistency and allow for better log level management.

Suggested change:
- print(f"Compressed model results: {results}")
+ logger.info(f"Compressed model results: {results}")

# Always evaluate base model for recovery testing
logger.info("================= Evaluating BASE model ======================")
self.base_results = self._eval_base_model()
print(f"Base model results: {self.base_results}")

Severity: medium

This print statement appears to be for debugging. It's better to use the logger instance for logging (e.g., logger.info(...)) to maintain consistency and allow for better log level management.

Suggested change:
- print(f"Base model results: {self.base_results}")
+ logger.info(f"Base model results: {self.base_results}")
