-
-
Notifications
You must be signed in to change notification settings - Fork 308
Add base elements to support distributed comms. Add supports_distributed plugin flag. #1370
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add base elements to support distributed comms. Add supports_distributed plugin flag. #1370
Conversation
| def hash_benchmark(benchmark: 'DatasetScenario', *, | ||
| hash_engine=None, num_workers=0) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be a class method? (``hash`) same for the other classes in this file, except the classes defined outside of avalanche
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think I can move those elements to the appropriate classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems that the only avalanche-specific hash function in that file is hash_benchmark. Do you think it is still appropriate to move it to CLScenario?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to reuse the class __hash__ method if possible so that child classes can safely override its behavior if needed. Also, hash_dataset should work only on AvalancheDataset since we don't really support any other dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alas, __hash__ must return an int. It is designed to provide a coarse mechanism for populating hash maps. I think that we can just move those methods to the CLScenario and AvalancheDataset classes for the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, if it's different we can keep it as is. Maybe it will be more clear to me once I see how you use it for distributed training
|
Thanks @lrzpellegrini, I added a bunch of comments. I think there are some parts that can be simplified a bit hopefully. |
|
Ok, I think the |
…buted_training_pt2 Add base elements to support distributed comms. Add supports_distributed plugin flag.
This PR ports elements from the #996 PR by adding the following:
supports_distributedflag in BasePlugin. The method used to inherit such a flag in child classes is different from the one in Distributed support (rework) #996Unrelated to distributed training:
_obtain_common_dataloader_parametersin strategyPart of the effort described here: #1315