Open
Labels
bug (a bug or regression), low priority (developers should be aware of this issue, but it need not be addressed imminently)
Description
When running GST using MPI with a very large number of cores, we encounter what appears to be an edge case in the processor distribution heuristics: the resulting distribution of processors fails at the layout creation stage. Attached are a cleaned-up log, a script, and the related files needed to reproduce this error.
I was running on the feature-globally-germ-aware-fpr branch, but this should be reproducible on the tip of develop. Other relevant parameters:
20 nodes with 36 cores each, for a total of 720 processors.
Python 3.9.16
Manually specifying a 20x36 processor grid appears to avoid this error.
proc_dist_heuristic_failure.zip
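
For reference, here is a minimal sketch that enumerates the two-factor processor grids available for 720 processes and confirms that the 20x36 workaround is one of them. It is plain Python and does not reproduce or call the heuristic that fails here; the helper name `grid_shapes` is hypothetical and exists only for this illustration.

```python
def grid_shapes(nprocs):
    """Return all (rows, cols) pairs with rows * cols == nprocs and rows <= cols.

    Hypothetical helper for illustration only; this is not the library's
    processor distribution heuristic.
    """
    shapes = []
    r = 1
    while r * r <= nprocs:
        if nprocs % r == 0:
            shapes.append((r, nprocs // r))
        r += 1
    return shapes


nodes, cores_per_node = 20, 36          # values reported in this issue
nprocs = nodes * cores_per_node         # total number of MPI processes

shapes = grid_shapes(nprocs)
print(f"{len(shapes)} candidate 2D grids for {nprocs} processes:")
for rows, cols in shapes:
    print(f"  {rows} x {cols}")

# The manually specified grid from the workaround is among the candidates.
print("20 x 36 is a valid grid:", (nodes, cores_per_node) in shapes)
```

A listing like this may help when comparing the grid chosen automatically (as reported in the attached log) against the shapes that evenly divide the available processes.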