[AIEX] Extend Staged 2D/3D regalloc to avoid spills #685
base: aie-public
Thank you very much, @andcarminati!
Also, do you think the following commit will help?
Maybe yes! As mentioned before, I prefer to keep just the minimal necessary changes. We can test it afterwards, on top of this PR.
31b7e71 to a27561f
a27561f to e124649
80e6f7c to 4c47705
MI.getMF()->getSubtarget().getInstrInfo());

// We should recognize both cases, with and without splitting. A 2D/3D
// instruction will always be split os splittable.
nit: as
Repaired as a fixup in the refactor commit.
MI.eraseFromParent();
// As we don't handle all registers now (selective LI filter),
// We should make sure that all LiveIntervals are correct.
// If we dont't repair, MI will compose the LIs of some registers,
nit: don't
What do you mean by compose? Will the dead MI block some LI?
A dead MI will have an invalid SlotIndex that still appears in some instruction's LI, which is wrong. The original copy should not appear in any LI.
const AIEBaseRegisterInfo &TRI,
std::set<Register> &VisitedVRegs);

SmallSet<int, 8> getRewritableSubRegs(Register Reg,
nit: can we have a comment describing what this function does?
Yes. It is a refactor, but it is always good to document.
}
Register SrcReg = RegOp.getParent()->getOperand(1).getReg();
if (!VisitedVRegs.count(SrcReg) &&
getRewritableSubRegs(SrcReg, MRI, TRI, VisitedVRegs).empty()) {
Does it ever happen that SrcReg has no SubRegs but DstReg has them, or vice versa?
Can they also have different SubRegs?
This part was a refactor, but as we are handling a full copy here, we can expect subregs on both sides.
By a full copy, I mean a 2D-to-2D copy or a 3D-to-3D copy.
// We should make sure that all LiveIntervals are correct.
// If we dont't repair, MI will compose the LIs of some registers,
// what is not correct because MI was deleted.
updateLRMAndLIS(RegistersToRepair, VRM, LRM, LIS);
nit: could we get a better name that does not have the abbreviations in it, e.g. updateLiveIntervals?
Also, since LIS is the most relevant data structure, pass it first, followed by the helper structures LRM and VRM.
Now we filter by register class and usage. Basically, we exclude instructions such as copies and non-2D/3D ones here. Co-Authored-By: Krishnam Tibrewala
…gisters Co-Authored-By: Krishnam Tibrewala
Co-Authored-By: Krishnam Tibrewala
…reedy run Co-Authored-By: Krishnam Tibrewala
If we don't need a full register, we can expand to individual lanes. Co-Authored-By: Krishnam Tibrewala
Co-Authored-By: Krishnam Tibrewala
Co-Authored-By: Krishnam Tibrewala
This avoids cycles in bundles that appear in VirtRegRewriter. We also update the LIs related to the src and dst operands of those expanded copies. Co-Authored-By: Krishnam Tibrewala
The goal of this test is to check that we properly insert the undef flag on the def side of an expanded full copy. On a sub-register def operand, the flag refers to the part of the register that isn't written. A sub-register def implicitly reads the other parts of the register being redefined unless the <undef> flag is set, and a missing flag can force the related register into the live-out set of the predecessor blocks, causing dominance problems. Co-Authored-By: Krishnam Tibrewala
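The undef rule in the last commit message can be illustrated with a toy model. This is only a sketch, not LLVM code: `LaneDef` and `isLiveIn` are hypothetical names invented for illustration, and the model assumes a single basic block whose only accesses to the wide register are lane-wise defs. It shows why a first lane def without the undef flag makes the register look live-in to the block, and therefore live-out of its predecessors.

```cpp
#include <vector>

// Toy stand-in for a sub-register def operand (hypothetical, not
// llvm::MachineOperand).
struct LaneDef {
  int Lane;     // lane (sub-register) written by this def
  bool IsUndef; // models the <undef> flag on the def
};

// Returns true if the wide register must be live-in to the block holding Defs.
// A lane def without <undef> implicitly reads the rest of the register, so if
// no earlier def in the block started the live range, the value has to come
// from the predecessors.
bool isLiveIn(const std::vector<LaneDef> &Defs) {
  bool DefinedInBlock = false;
  for (const LaneDef &D : Defs) {
    if (!DefinedInBlock && !D.IsUndef)
      return true; // implicit read with no local definition above it
    DefinedInBlock = true;
  }
  return false;
}
```

In this model, flagging only the first lane copy of an expanded full copy is enough to keep the register out of the predecessors' live-out sets; whether the PR flags only the first copy is an assumption here.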
4c47705 to a88a541
This properly handles uses of non-dominating definitions. We also change the handling of the destination registers in two parts:
* Copy expansion: we replace the original index by the index of the first lane copy to avoid the creation of LRs with just one instruction; in this way we keep the LI correct.
* Rewrite: reset dead flags if necessary.
In the test, MachineVerifier is still off, because some lane redefinitions are considered full register redefinitions, causing some inaccurate expectations around dead flags. The affected test is an extreme corner case where individual lanes are handled apart from the original 2D register. Co-Authored-By: Krishnam Tibrewala
a88a541 to ab8c08d
MachineInstr *FirstMI = nullptr;
SmallSet<Register, 8> RegistersToRepair;
bool IsFirstCopy = true;
nit: We can use FirstMI as a replacement for IsFirstCopy.
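A minimal sketch of that suggestion, outside of LLVM: `ToyInstr` and `expandCopy` are hypothetical stand-ins and the printed instruction text is made up. The point is only that a null `FirstMI` already encodes "no copy emitted yet", so a separate boolean is redundant.

```cpp
#include <list>
#include <string>

// Hypothetical stand-in for a MachineInstr; not the LLVM class.
struct ToyInstr {
  std::string Text;
};

// Expands a full copy into one copy per lane and appends them to Out.
// The first emitted copy is tracked only through FirstMI: nullptr means
// "nothing emitted yet", so no IsFirstCopy flag is needed.
// std::list is used so the pointer stays valid while more copies are appended.
void expandCopy(int NumLanes, std::list<ToyInstr> &Out, ToyInstr *&FirstMI) {
  FirstMI = nullptr;
  for (int Lane = 0; Lane < NumLanes; ++Lane) {
    const std::string Idx = std::to_string(Lane);
    Out.push_back({"dst.lane" + Idx + " = COPY src.lane" + Idx});
    if (!FirstMI)
      FirstMI = &Out.back(); // first-copy handling goes here
  }
}
```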
@@ -0,0 +1,172 @@
//===----- AIEUnallocatedSuperRegRewriter.cpp - Constrain tied sub-registers
Fix this.



This work is intended to avoid 2D/3D register spills when possible.
The idea and rationale behind this work are described in a previous draft PR: #442.
I recommend reviewing this PR commit by commit.
Credit also goes to the co-author @krishnamtibrewala.