Support pytorch checkpoints for Qwen3/Phi4 mini #13984
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13984
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure
As of commit 11ca5b3 with merge base 76a8906:
NEW FAILURE - The following job has failed:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
    index_path = os.path.join(input_dir, "pytorch_model.bin.index.json")
    if os.path.exists(index_path):
        # Sharded checkpoint.
        with open(index_path, "r") as f:
            index = json.load(f)
        weight_map = index["weight_map"]
        checkpoint_shards = sorted(set(weight_map.values()))

        # Load all the shards into memory
        shard_to_weights = {}
        for shard in checkpoint_shards:
            shard_to_weights[shard] = torch.load(
                os.path.join(input_dir, shard),
                weights_only=True,
                map_location=torch.device("cpu"),
            )

        # Merge tensors into consolidated state dict.
        merged_state_dict = {}
        for weight_name, shard in weight_map.items():
            tensor = shard_to_weights[shard][weight_name]
            merged_state_dict[weight_name] = tensor
        return merged_state_dict

    # Single checkpoint
    model_path = os.path.join(input_dir, "pytorch_model.bin")
    if os.path.exists(model_path):
        state_dict = torch.load(
            model_path, weights_only=True, map_location=torch.device("cpu")
        )
        return state_dict

    raise FileNotFoundError(f"Could not find pytorch_model checkpoint in {input_dir}")
Should you make this a utility function to deduplicate the logic?
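For illustration, here is a rough sketch of what such a shared helper could look like. The name `load_pytorch_checkpoint` and the optional `load_fn` parameter are hypothetical and not part of this PR; `load_fn` is only there so the helper can be exercised without real checkpoint files. The default path mirrors the `torch.load` call in the diff above.

```python
import json
import os
from typing import Any, Callable, Dict, Optional


def _torch_load_cpu(path: str) -> Dict[str, Any]:
    # Imported lazily so the helper can be tested without torch installed.
    import torch

    return torch.load(path, weights_only=True, map_location=torch.device("cpu"))


def load_pytorch_checkpoint(
    input_dir: str,
    load_fn: Optional[Callable[[str], Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Hypothetical shared helper: load a sharded or single pytorch_model checkpoint."""
    load_fn = load_fn or _torch_load_cpu

    index_path = os.path.join(input_dir, "pytorch_model.bin.index.json")
    if os.path.exists(index_path):
        # Sharded checkpoint: the index maps each weight name to its shard file.
        with open(index_path, "r") as f:
            weight_map = json.load(f)["weight_map"]
        # Load each distinct shard once.
        shards = {
            shard: load_fn(os.path.join(input_dir, shard))
            for shard in sorted(set(weight_map.values()))
        }
        # Merge tensors into a consolidated state dict.
        return {name: shards[shard][name] for name, shard in weight_map.items()}

    # Single checkpoint.
    model_path = os.path.join(input_dir, "pytorch_model.bin")
    if os.path.exists(model_path):
        return load_fn(model_path)

    raise FileNotFoundError(f"Could not find pytorch_model checkpoint in {input_dir}")
```

Each model-specific conversion script could then call `load_pytorch_checkpoint(input_dir)` instead of carrying its own copy of this logic, assuming the scripts really do share the same loading behavior.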