Processor Distribution Heuristic Failure #323

@coreyostrove

Description

When running GST using MPI with a very large number of cores, we encounter what appears to be an edge case in the processor distribution heuristics: the resulting distribution of processors fails at the layout creation stage. Attached are a cleaned-up log and a script, with related files, for reproducing this error.

I was running on feature-globally-germ-aware-fpr, but this should be reproducible on the tip of develop. Other relevant parameters:

20 nodes with 36 cores each, for a total of 720 processors.
Python 3.9.16

Manually specifying a 20x36 processor grid appears to avoid this error.
proc_dist_heuristic_failure.zip
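
For reference, here is a minimal, library-independent sketch of the arithmetic behind the manual grid. It does not call pyGSTi and is not the heuristic that fails; the helper names `grid_matches_world` and `two_factor_grids` are hypothetical and exist only for illustration. It checks that a candidate grid uses exactly the available processors and lists the two-factor grids one could try by hand.

```python
from math import prod


def grid_matches_world(grid, world_size):
    """Return True if every grid dimension is >= 1 and their product
    equals the total number of processors."""
    return all(d >= 1 for d in grid) and prod(grid) == world_size


def two_factor_grids(world_size):
    """List every (rows, cols) pair with rows * cols == world_size."""
    return [
        (r, world_size // r)
        for r in range(1, world_size + 1)
        if world_size % r == 0
    ]


# Values taken from the report: 20 nodes with 36 cores each.
nodes, cores_per_node = 20, 36
world_size = nodes * cores_per_node

# The manually specified grid that avoided the error.
manual_grid = (nodes, cores_per_node)

print(world_size, grid_matches_world(manual_grid, world_size))
```

A grid whose product differs from the processor count (for example `(20, 35)`) fails this check. Whether a given valid grid also passes pyGSTi's layout creation is a separate question that this sketch does not answer.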

Metadata

Assignees

No one assigned

    Labels

    bug: A bug or regression
    low priority: Developers should be aware of this issue, but it need not be addressed imminently
