SWP-aware WAWRegRewriter. #699

martien-de-jong · 2025-11-07T10:51:40Z

This adds an alternative to the latency-aware heuristic in WAWRegisterRewriter and selects between the two based on LoopClass.

andcarminati · 2025-11-07T13:15:00Z

llvm/lib/Target/AIE/AIEBaseSubtarget.cpp

 // The initial graph will have ordering edges induced by hasSideEffects of the
 // locks/DONE.
 class LockDelays : public ScheduleDAGMutation {
+  bool ExactLatencies = true;


If we do this:

 class LockDelays : public ScheduleDAGMutation {
-  bool ExactLatencies = true;
+  bool ExactLatencies;
   void apply(ScheduleDAGInstrs *DAG) override {
     const auto *TII = static_cast<const AIEBaseInstrInfo *>(DAG->TII);
     const int CoreStallCycle = TII->getCoreStallCycleAfterLock();
@@ -243,7 +243,7 @@ class LockDelays : public ScheduleDAGMutation {
   }
 public:
-  LockDelays(bool ExactLatencies) : ExactLatencies(ExactLatencies) {};
+  LockDelays(bool ExactLatencies = true) : ExactLatencies(ExactLatencies) {};
 };

We don't need any change in the current instantiation (getPostRAMutationsImpl) of the mutators.

I guess that's true. But I have a bad feeling about default parameters in constructors: they restrict future constructors, which would become ambiguous.
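To make the trade-off in this thread concrete, here is a minimal standalone sketch of the two constructor shapes. The class names `LockDelaysDefaulted` and `LockDelaysExplicit` are hypothetical stand-ins, not the class in AIEBaseSubtarget.cpp; only the constructor signatures matter.

```cpp
#include <cassert>

// Shape suggested in the review: one constructor with a defaulted parameter.
// It also acts as the default constructor.
class LockDelaysDefaulted {
  bool ExactLatencies;

public:
  LockDelaysDefaulted(bool ExactLatencies = true)
      : ExactLatencies(ExactLatencies) {}
  // Declaring an additional `LockDelaysDefaulted();` here would be accepted,
  // but every call without arguments would then be ambiguous between the two.
  bool exact() const { return ExactLatencies; }
};

// Alternative shape: a member initializer plus separate overloads, so later
// constructors can be added without colliding with a defaulted argument.
class LockDelaysExplicit {
  bool ExactLatencies = true;

public:
  LockDelaysExplicit() = default;
  LockDelaysExplicit(bool ExactLatencies) : ExactLatencies(ExactLatencies) {}
  bool exact() const { return ExactLatencies; }
};
```

Both shapes leave existing call sites without arguments unchanged; they differ only in how much room they leave for future overloads.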

andcarminati · 2025-11-07T13:19:17Z

llvm/lib/Target/AIE/AIEBaseSubtarget.cpp

        std::optional<int> MemLat = TII->getMemoryLatency(
            SrcMI.getDesc().getSchedClass(), MI.getDesc().getSchedClass());
-        if (!MemLat.has_value()) {
+        int Latency = 1;


CHECK: if we don't need exact latencies, and we don't have them, we use a very optimistic value instead.

Not very optimistic. For a RAW ST -> LD pair, the latency is actually 1, since the LOAD reads memory late. It's also the default that was assumed before Gaetan made it defined by a target hook.
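The fallback discussed here can be sketched in isolation as follows. This is an illustration under assumptions, not the code in the patch: `chooseMemEdgeLatency` is a hypothetical helper, `MemLat` models the result of the target hook, and returning `std::nullopt` stands in for whatever the real code does when exact latencies are required but unavailable.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper. MemLat is the latency reported by the target hook, or
// std::nullopt when the hook has no entry for this pair of sched classes.
std::optional<int> chooseMemEdgeLatency(std::optional<int> MemLat,
                                        bool ExactLatencies) {
  if (MemLat)
    return MemLat; // The hook knows the latency: always use it.
  if (ExactLatencies)
    return std::nullopt; // Exact mode: let the caller report the gap.
  return 1; // Approximate mode: fall back to the ST -> LD RAW latency of 1.
}
```

For example, `chooseMemEdgeLatency(std::nullopt, false)` yields 1, while the same call with `ExactLatencies = true` yields no value.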

llvm/lib/Target/AIE/AIEWawRegRewriter.cpp

llvm/lib/Target/AIE/AIEPostPipeliner.cpp

andcarminati · 2025-11-07T15:40:03Z

nit: check this commit message: [AIE][NFC] Use equivalant LoopUtils function.

andcarminati · 2025-11-10T08:24:24Z

llvm/lib/Target/AIE/AIEWawRegRewriter.cpp

 }

 bool AIEWawRegRewriter::renameMBBPhysRegs(const MachineBasicBlock *MBB) {
+  // We do this mainly for the postpipeliner, and it will want to see ZOL


I guess we could have some verifier problems by accepting ZOL loops before MachineBlockPlacement, right?

It is better to choose one direction here:

- Implement another hook, like isPreLayoutZOLBody.
- Extend isZOLBody and the target machine verifier.
- Clearly state in the comment why we are not relying on isZOLBody here.

Also, sometimes we can see some unexpected gains related to PreSWP + WAWRewriter. Are we preventing gains in some pipelined decrement/branch loop here? Could we guard this behavior with a command line option, to be able to run additional tests in the future?

Hmm. Under these assumptions, we should perhaps not constrain it to single-block loops at all, because the only advantage we can have from PreSWP loops is the scheduler, and for the scheduler anything might work.
I don't think isPreLayoutZOLBody is a very clear concept; it is so tied to the flow.
If you have evidence of the combination of PreSWP, CountedLoops and WAWRegRewriter paying off, I would suggest to just remove the check. My main motivation was that I saw it kick in on loops with calls inside, which is concisely prevented by the ZOL check.

Or even AIEBaseInstrInfo::isZOLBody(const MachineBasicBlock &MBB, bool OnlyCheckLastInstr = true)

Ok, I found evidence in AvgPool. Removed the new check

andcarminati · 2025-11-10T08:33:43Z

llvm/lib/Target/AIE/AIEWawRegRewriter.cpp


-static cl::opt<bool>
-    LatencyAware("aie-realloc-latencyaware", cl::Hidden, cl::init(true),
+static cl::opt<int>


I am trying to figure out the rationale behind this change...

I'm overloading here, I agree. For tuning, I want to select false, true, or autoselect.
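A standalone sketch of how such an integer flag could be decoded, using the convention from the commit message (0 means false, 1 means true, anything else means auto-select based on LoopClass). The enum values, the function names and the auto policy are all assumptions made for illustration; the real LoopClass values and tuning are not shown in this thread.

```cpp
#include <cassert>

// Hypothetical loop classification, standing in for the real LoopClass.
enum class LoopClass { Other, PipelinedCandidate };

enum class Heuristic { None, LatencyAware, SWPAware };

// Decode a tri-state integer flag: 0 = off, 1 = on, anything else = auto.
bool resolveFlag(int Flag, bool AutoChoice) {
  if (Flag == 0)
    return false;
  if (Flag == 1)
    return true;
  return AutoChoice;
}

Heuristic selectHeuristic(int LatencyAwareFlag, int SWPAwareFlag,
                          LoopClass LC) {
  // Assumed auto policy: prefer swp-aware on pipelining candidates and
  // latency-aware everywhere else.
  const bool AutoSWP = LC == LoopClass::PipelinedCandidate;
  if (resolveFlag(LatencyAwareFlag, !AutoSWP))
    return Heuristic::LatencyAware;
  // Only reached when latency-aware did not run, so the two never both apply.
  if (resolveFlag(SWPAwareFlag, AutoSWP))
    return Heuristic::SWPAware;
  return Heuristic::None;
}
```

With both flags set to an auto value such as -1, this sketch picks swp-aware for a `PipelinedCandidate` loop and latency-aware otherwise; forcing a flag to 0 or 1 overrides the auto choice for tuning runs.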

llvm/lib/Target/AIE/AIEWawRegRewriter.cpp

We pass some flags down which control the asserts that are meant to check completeness of the scheduler model for the postscheduler's sake. That allows us to create a more approximate DDG in earlier stages. Without the flags, these asserts would fire, mainly on occurrences of PseudoInstructions.


martien-de-jong requested review from F-Stuckmann, SagarMaheshwari99, abhinay-anubola, abnikant, andcarminati, katerynamuts, khallouh, konstantinschwarz, mludevid, niwinanto and stephenneuendorffer as code owners November 7, 2025 10:51


Martien de Jong added 4 commits November 10, 2025 13:56

[AIE][NFC] factor computeLRURegisters

4fd558d

[AIE][WAWRewriter] Simulate a pipeline schedule for the LRU renaming

d4a0620

Reorder the allocation order of the candidates based on an approximate pipeline schedule.

[AIE][NFC] Add slot indicator in schedule dump

e286289

martien-de-jong force-pushed the martien.waw-swpaware branch from 47d0673 to 61eed42 Compare November 10, 2025 12:57

Martien de Jong added 3 commits November 10, 2025 16:51

[AIE][NFC] Use equivalent LoopUtils function

2045da6

[AIE] Integrate LatencyAware and SWPAware

a5f9660

The enable flags are now integers, with 0 meaning false, 1 meaning true, anything else meaning auto-select based on LoopClass. This selection avoids running swpaware if latencyaware has run. Add some tuning based on LoopClass.

[AIE][WAWRewriter] Auto select latency/swpaware and swpaware bias

664f9ef

martien-de-jong force-pushed the martien.waw-swpaware branch from 61eed42 to 664f9ef Compare November 10, 2025 15:53
