Commit 9a13adb

chaodengusc authored and facebook-github-bot committed
Check zch config compatibility for transfer (meta-pytorch#3405)
Summary: Added a check that validates whether the zch config is compatible between the source model and the target model. This update avoids an incorrect MaaS transfer. For example, if the bucket number in the source model is 16 and the number in the target model is 12, then even if the zch table size stays the same between the two models, we should apply an MPZCH transfer instead of a normal transfer, because each row in the source table could map to a different location in the target table.

In this update:
1. We added the bucket number to the state_dict in the checkpoint.
2. We compare the bucket numbers of the source model and the target model during transfer.
3. If the bucket number in the source is not divisible by the number in the target, we raise an exception.

Reviewed By: zlzhao1104
Differential Revision: D83580368
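The comparison described in the summary can be sketched roughly as follows. This is a minimal illustration, not the transfer code itself, which is not part of this diff. The function name `needs_bucket_aware_transfer` is hypothetical, plain integers stand in for the checkpointed tensors, and the rule "equal counts mean a normal transfer is safe" is an assumption drawn from the summary's example. Only the key `_hash_zch_bucket` comes from the commit.

```python
from typing import Mapping

# Buffer name registered by this commit; everything else here is illustrative.
BUCKET_KEY = "_hash_zch_bucket"


def needs_bucket_aware_transfer(
    source_state: Mapping[str, int],
    target_state: Mapping[str, int],
) -> bool:
    """Compare the bucket counts stored in two checkpoint state dicts.

    Returns False when the counts match (assumed: a normal transfer is safe)
    and True when they differ but are compatible. Raises ValueError when the
    source count is not divisible by the target count.
    """
    source_buckets = int(source_state[BUCKET_KEY])
    target_buckets = int(target_state[BUCKET_KEY])
    if source_buckets <= 0 or target_buckets <= 0:
        raise ValueError("bucket counts must be positive")
    if source_buckets % target_buckets != 0:
        raise ValueError(
            f"source bucket count {source_buckets} is not divisible by "
            f"target bucket count {target_buckets}"
        )
    return source_buckets != target_buckets
```

Under these assumptions, a 16-bucket source and an 8-bucket target would be flagged for a bucket-aware transfer, while 16 against 12 would be rejected by the divisibility check in step 3.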
1 parent 3faf5e5 commit 9a13adb

2 files changed: +2 -0 lines changed

torchrec/distributed/mc_modules.py

Lines changed: 1 addition & 0 deletions
@@ -827,6 +827,7 @@ def sharded_parameter_names(self, prefix: str = "") -> Iterator[str]:
     "_output_segments_tensor",
     "_current_iter_tensor",
     "_scalar_logger._scalar_logger_steps",
+    "_hash_zch_bucket",
 ]:
     continue
 if name in module._non_persistent_buffers_set:

torchrec/modules/hash_mc_modules.py

Lines changed: 1 addition & 0 deletions
@@ -305,6 +305,7 @@ def __init__(
 
         self._max_probe = max_probe
         self._buckets = total_num_buckets
+        self.register_buffer("_hash_zch_bucket", torch.tensor(total_num_buckets))
         # Do not need to store in buffer since this is created and consumed
         # at each step https://fburl.com/code/axzimmbx
         self._evicted_indices = []
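
To illustrate why registering the bucket count as a buffer makes it available at transfer time, here is a small standalone sketch. `TinyZch` is a hypothetical stand-in, not the torchrec module; only the buffer name `_hash_zch_bucket` and the `register_buffer` call mirror the diff. Persistent buffers are included in `state_dict()`, so the value travels with the checkpoint.

```python
import torch
from torch import nn


class TinyZch(nn.Module):
    """Hypothetical stand-in for a module that tracks a ZCH bucket count."""

    def __init__(self, total_num_buckets: int) -> None:
        super().__init__()
        self._buckets = total_num_buckets
        # Same call as in the diff: a persistent 0-dim integer buffer.
        self.register_buffer("_hash_zch_bucket", torch.tensor(total_num_buckets))


source = TinyZch(total_num_buckets=16)
state = source.state_dict()

# The buffer is saved under its registered name and can be read back as an int.
saved_buckets = int(state["_hash_zch_bucket"].item())
```

A target model built with a different bucket count would carry its own value under the same key, which is what lets the two counts be compared when a checkpoint is loaded.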
