-
Notifications
You must be signed in to change notification settings - Fork 14.7k
[VP][RISCV] Add a vp.load.ff intrinsic for fault only first load. #128593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
b4e4666
07cf13c
2fb476e
4f2fbff
93d6ab7
5f0f3cc
6720314
3a8289f
081ab46
c1fae0d
c6c237d
1b39f42
cbf0bb0
249b585
7e6a57e
248a78f
d6f4f53
78fd7e3
6a29350
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -587,6 +587,12 @@ VP_PROPERTY_FUNCTIONAL_OPC(Load) | |
VP_PROPERTY_FUNCTIONAL_INTRINSIC(masked_load) | ||
END_REGISTER_VP(vp_load, VP_LOAD) | ||
|
||
BEGIN_REGISTER_VP_INTRINSIC(vp_load_ff, 1, 2) | ||
// val,chain = VP_LOAD_FF chain,base,mask,evl | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I was going to ask why this doesn't have an offset operand like the others so it could inherit from SelectionDAGBuilder doesn't set it, I think it's always undef. Not for this PR but can we remove it? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I think we can probably remove Offset and Addressing mode until a target comes along that needs them. |
||
BEGIN_REGISTER_VP_SDNODE(VP_LOAD_FF, -1, vp_load_ff, 2, 3) | ||
HELPER_MAP_VPID_TO_VPSD(vp_load_ff, VP_LOAD_FF) | ||
VP_PROPERTY_NO_FUNCTIONAL | ||
END_REGISTER_VP(vp_load_ff, VP_LOAD_FF) | ||
// llvm.experimental.vp.strided.load(ptr,stride,mask,vlen) | ||
BEGIN_REGISTER_VP_INTRINSIC(experimental_vp_strided_load, 2, 3) | ||
// chain = EXPERIMENTAL_VP_STRIDED_LOAD chain,base,offset,stride,mask,evl | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8440,6 +8440,34 @@ void SelectionDAGBuilder::visitVPLoad( | |
setValue(&VPIntrin, LD); | ||
} | ||
|
||
void SelectionDAGBuilder::visitVPLoadFF( | ||
const VPIntrinsic &VPIntrin, EVT VT, EVT EVLVT, | ||
const SmallVectorImpl<SDValue> &OpValues) { | ||
assert(OpValues.size() == 3 && "Unexpected number of operands"); | ||
SDLoc DL = getCurSDLoc(); | ||
Value *PtrOperand = VPIntrin.getArgOperand(0); | ||
MaybeAlign Alignment = VPIntrin.getPointerAlignment(); | ||
AAMDNodes AAInfo = VPIntrin.getAAMetadata(); | ||
const MDNode *Ranges = VPIntrin.getMetadata(LLVMContext::MD_range); | ||
SDValue LD; | ||
// Do not serialize variable-length loads of constant memory with | ||
// anything. | ||
if (!Alignment) | ||
Alignment = DAG.getEVTAlign(VT); | ||
MemoryLocation ML = MemoryLocation::getAfter(PtrOperand, AAInfo); | ||
bool AddToChain = !BatchAA || !BatchAA->pointsToConstantMemory(ML); | ||
SDValue InChain = AddToChain ? DAG.getRoot() : DAG.getEntryNode(); | ||
MachineMemOperand *MMO = DAG.getMachineFunction().getMachineMemOperand( | ||
MachinePointerInfo(PtrOperand), MachineMemOperand::MOLoad, | ||
LocationSize::beforeOrAfterPointer(), *Alignment, AAInfo, Ranges); | ||
LD = DAG.getLoadFFVP(VT, DL, InChain, OpValues[0], OpValues[1], OpValues[2], | ||
MMO); | ||
SDValue Trunc = DAG.getNode(ISD::TRUNCATE, DL, EVLVT, LD.getValue(1)); | ||
Comment on lines
+8463
to
+8465
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we need the trunc? Are EVLVT and LD.getValue(1) not always i32? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we pre-promote the EVL argument to i64 in visitVectorPredicationIntrinsic |
||
if (AddToChain) | ||
PendingLoads.push_back(LD.getValue(2)); | ||
setValue(&VPIntrin, DAG.getMergeValues({LD.getValue(0), Trunc}, DL)); | ||
} | ||
|
||
void SelectionDAGBuilder::visitVPGather( | ||
const VPIntrinsic &VPIntrin, EVT VT, | ||
const SmallVectorImpl<SDValue> &OpValues) { | ||
|
@@ -8673,6 +8701,9 @@ void SelectionDAGBuilder::visitVectorPredicationIntrinsic( | |
case ISD::VP_LOAD: | ||
visitVPLoad(VPIntrin, ValueVTs[0], OpValues); | ||
break; | ||
case ISD::VP_LOAD_FF: | ||
visitVPLoadFF(VPIntrin, ValueVTs[0], ValueVTs[1], OpValues); | ||
break; | ||
case ISD::VP_GATHER: | ||
visitVPGather(VPIntrin, ValueVTs[0], OpValues); | ||
break; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may not match SVE because they just change the FFR predicate register. But I haven't come up with a representation that can represent two kinds of semantics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I think in this RFC, they mentioned that having two variants—one for RVV and one for SVE—is acceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But if so, it is still target-dependent. Anyway, I'd like to see this PR going forward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, where is the non-vp version? Or does it exist? :-)