[TTI] Consistently pass the pointer type to getAddressComputationCost. NFCI #152657

Merged: 6 commits, Aug 11, 2025
15 changes: 8 additions & 7 deletions llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -1675,13 +1675,14 @@ class TargetTransformInfo {

/// \returns The cost of the address computation. For most targets this can be
/// merged into the instruction indexing mode. Some targets might want to
-/// distinguish between address computation for memory operations on vector
-/// types and scalar types. Such targets should override this function.
-/// The 'SE' parameter holds pointer for the scalar evolution object which
-/// is used in order to get the Ptr step value in case of constant stride.
-/// The 'Ptr' parameter holds SCEV of the access pointer.
-LLVM_ABI InstructionCost getAddressComputationCost(
-    Type *Ty, ScalarEvolution *SE = nullptr, const SCEV *Ptr = nullptr) const;
+/// distinguish between address computation for memory operations with vector
+/// pointer types and scalar pointer types. Such targets should override this
+/// function. \p SE holds the pointer for the scalar evolution object which
+/// was used in order to get the Ptr step value. \p Ptr holds the SCEV of the
+/// access pointer.
+LLVM_ABI InstructionCost
+getAddressComputationCost(Type *PtrTy, ScalarEvolution *SE = nullptr,
Contributor:
If we're passing in the actual pointer type, I thought it can only ever be an opaque pointer type, or a vector of opaque pointer types. So can we make the interface even simpler by just passing a boolean saying whether it's a vector or not?

Contributor (Author):
I guess the pointer type also contains the address space, not that any target currently uses it, though. Long term, I think we probably want to change this API to not even pass the type or SCEV, so that we can use it properly from VPlan, which doesn't have access to SCEV. Just from looking at the various TTIs, we seem to have three costs:

  • Scalar address
  • Vector of strided addresses (either constant or variable)
  • Vector of gathered/scattered addresses

Maybe we could look at splitting this into three separate hooks?
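For illustration, here is a rough sketch of what that three-way split might look like; none of these hooks exist, and the names are invented:

  // Hypothetical sketch only; none of these hooks exist in TTI today.
  // Cost of computing a single scalar address.
  InstructionCost getScalarAddressComputationCost() const;
  // Cost of computing a vector of strided addresses; the flag distinguishes
  // strides known at compile time from variable (loop-invariant) strides.
  InstructionCost getStridedAddressComputationCost(bool IsConstantStride) const;
  // Cost of computing a vector of gathered/scattered addresses.
  InstructionCost getGatherScatterAddressComputationCost() const;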

Contributor:
Or a single hook that takes an enum parameter providing more context? i.e.

getAddressComputationCost(Type *PtrTy, Value *Stride)

where you'd have Stride == ConstantInt(X) for contiguous loads (X == 1) or strided loads (X > 1), Stride == a non-constant value for non-constant strides, or Stride == nullptr to indicate gathers/scatters. Given that most uses of the SCEV argument seem to relate to the stride, perhaps that allows you to kill off the SCEV argument entirely?
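As a minimal sketch of how a target might interpret that proposed signature (assuming the stride convention above; the concrete cost values are invented):

  // Hypothetical sketch; this is a review suggestion, not the API in this PR.
  // Assumes the usual LLVM headers, e.g. llvm/IR/Constants.h and
  // llvm/Support/InstructionCost.h.
  InstructionCost getAddressComputationCost(Type *PtrTy, Value *Stride) const {
    // PtrTy could still be consulted for the address space if a target cares.
    if (!Stride)
      return 4; // nullptr: gathered/scattered addresses (invented cost)
    if (auto *C = dyn_cast<ConstantInt>(Stride))
      return C->isOne() ? 0  // contiguous access (X == 1)
                        : 1; // constant stride X > 1 (invented cost)
    return 2; // stride not known at compile time (invented cost)
  }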

Contributor (Author):
Yeah, an enum would also work. I don't think we can directly pass a Value though if we want to call this from the VPlan cost model, where the value isn't materialized yet.

I presume this should be done in a separate PR anyway, after we fix up the existing call sites to be consistent?

Contributor:
Yeah sure, although I made a typo, as obviously Value *Stride is not an enum! :facepalm:
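Presumably the enum variant would look something along these lines (purely illustrative; the names are invented):

  // Hypothetical enum variant of the same idea; nothing here is real API.
  enum class AddressAccessKind {
    Contiguous,     // unit stride
    ConstantStride, // stride known at compile time, X > 1
    VariableStride, // stride only known at runtime
    GatherScatter   // no common stride at all
  };
  InstructionCost getAddressComputationCost(Type *PtrTy,
                                            AddressAccessKind Kind) const;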

+const SCEV *Ptr = nullptr) const;

/// \returns The cost, if any, of keeping values of the given types alive
/// over a callsite.
3 changes: 2 additions & 1 deletion llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -937,7 +937,8 @@ class TargetTransformInfoImplBase {
// Assume that we have a register of the right size for the type.
virtual unsigned getNumberOfParts(Type *Tp) const { return 1; }

-virtual InstructionCost getAddressComputationCost(Type *Tp, ScalarEvolution *,
+virtual InstructionCost getAddressComputationCost(Type *PtrTy,
+ScalarEvolution *,
const SCEV *) const {
return 0;
}
2 changes: 1 addition & 1 deletion llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -3026,7 +3026,7 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
return LT.first.getValue();
}

-InstructionCost getAddressComputationCost(Type *Ty, ScalarEvolution *,
+InstructionCost getAddressComputationCost(Type *PtrTy, ScalarEvolution *,
const SCEV *) const override {
return 0;
}
4 changes: 2 additions & 2 deletions llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -1231,9 +1231,9 @@ unsigned TargetTransformInfo::getNumberOfParts(Type *Tp) const {
}

InstructionCost
-TargetTransformInfo::getAddressComputationCost(Type *Tp, ScalarEvolution *SE,
+TargetTransformInfo::getAddressComputationCost(Type *PtrTy, ScalarEvolution *SE,
const SCEV *Ptr) const {
-InstructionCost Cost = TTIImpl->getAddressComputationCost(Tp, SE, Ptr);
+InstructionCost Cost = TTIImpl->getAddressComputationCost(PtrTy, SE, Ptr);
assert(Cost >= 0 && "TTI should not produce negative costs!");
return Cost;
}
4 changes: 2 additions & 2 deletions llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -4336,7 +4336,7 @@ InstructionCost AArch64TTIImpl::getArithmeticInstrCost(
}

InstructionCost
-AArch64TTIImpl::getAddressComputationCost(Type *Ty, ScalarEvolution *SE,
+AArch64TTIImpl::getAddressComputationCost(Type *PtrTy, ScalarEvolution *SE,
const SCEV *Ptr) const {
// Address computations in vectorized code with non-consecutive addresses will
// likely result in more instructions compared to scalar code where the
@@ -4345,7 +4345,7 @@ AArch64TTIImpl::getAddressComputationCost(Type *Ty, ScalarEvolution *SE,
unsigned NumVectorInstToHideOverhead = NeonNonConstStrideOverhead;
int MaxMergeDistance = 64;

-if (Ty->isVectorTy() && SE &&
+if (PtrTy->isVectorTy() && SE &&
!BaseT::isConstantStridedAccessLessThan(SE, Ptr, MaxMergeDistance + 1))
return NumVectorInstToHideOverhead;

2 changes: 1 addition & 1 deletion llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
@@ -238,7 +238,7 @@ class AArch64TTIImpl final : public BasicTTIImplBase<AArch64TTIImpl> {
ArrayRef<const Value *> Args = {},
const Instruction *CxtI = nullptr) const override;

-InstructionCost getAddressComputationCost(Type *Ty, ScalarEvolution *SE,
+InstructionCost getAddressComputationCost(Type *PtrTy, ScalarEvolution *SE,
const SCEV *Ptr) const override;

InstructionCost getCmpSelInstrCost(
6 changes: 3 additions & 3 deletions llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp
@@ -1084,7 +1084,7 @@ InstructionCost ARMTTIImpl::getCmpSelInstrCost(
CostKind, Op1Info, Op2Info, I);
}

-InstructionCost ARMTTIImpl::getAddressComputationCost(Type *Ty,
+InstructionCost ARMTTIImpl::getAddressComputationCost(Type *PtrTy,
ScalarEvolution *SE,
const SCEV *Ptr) const {
// Address computations in vectorized code with non-consecutive addresses will
@@ -1095,15 +1095,15 @@ InstructionCost ARMTTIImpl::getAddressComputationCost(Type *Ty,
int MaxMergeDistance = 64;

if (ST->hasNEON()) {
-if (Ty->isVectorTy() && SE &&
+if (PtrTy->isVectorTy() && SE &&
!BaseT::isConstantStridedAccessLessThan(SE, Ptr, MaxMergeDistance + 1))
return NumVectorInstToHideOverhead;

// In many cases the address computation is not merged into the instruction
// addressing mode.
return 1;
}
-return BaseT::getAddressComputationCost(Ty, SE, Ptr);
+return BaseT::getAddressComputationCost(PtrTy, SE, Ptr);
}

bool ARMTTIImpl::isProfitableLSRChainElement(Instruction *I) const {
2 changes: 1 addition & 1 deletion llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.cpp
@@ -156,7 +156,7 @@ HexagonTTIImpl::getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
return BaseT::getIntrinsicInstrCost(ICA, CostKind);
}

-InstructionCost HexagonTTIImpl::getAddressComputationCost(Type *Tp,
+InstructionCost HexagonTTIImpl::getAddressComputationCost(Type *PtrTy,
ScalarEvolution *SE,
const SCEV *S) const {
return 0;
2 changes: 1 addition & 1 deletion llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h
@@ -111,7 +111,7 @@ class HexagonTTIImpl final : public BasicTTIImplBase<HexagonTTIImpl> {
InstructionCost
getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
TTI::TargetCostKind CostKind) const override;
-InstructionCost getAddressComputationCost(Type *Tp, ScalarEvolution *SE,
+InstructionCost getAddressComputationCost(Type *PtrTy, ScalarEvolution *SE,
const SCEV *S) const override;
InstructionCost getMemoryOpCost(
unsigned Opcode, Type *Src, Align Alignment, unsigned AddressSpace,
6 changes: 3 additions & 3 deletions llvm/lib/Target/X86/X86TargetTransformInfo.cpp
@@ -5488,7 +5488,7 @@ InstructionCost X86TTIImpl::getPointersChainCost(
return BaseT::getPointersChainCost(Ptrs, Base, Info, AccessTy, CostKind);
}

-InstructionCost X86TTIImpl::getAddressComputationCost(Type *Ty,
+InstructionCost X86TTIImpl::getAddressComputationCost(Type *PtrTy,
ScalarEvolution *SE,
const SCEV *Ptr) const {
// Address computations in vectorized code with non-consecutive addresses will
@@ -5504,7 +5504,7 @@ InstructionCost X86TTIImpl::getAddressComputationCost(Type *Ty,
// Even in the case of (loop invariant) stride whose value is not known at
// compile time, the address computation will not incur more than one extra
// ADD instruction.
-if (Ty->isVectorTy() && SE && !ST->hasAVX2()) {
+if (PtrTy->isVectorTy() && SE && !ST->hasAVX2()) {
// TODO: AVX2 is the current cut-off because we don't have correct
// interleaving costs for prior ISA's.
if (!BaseT::isStridedAccess(Ptr))
@@ -5513,7 +5513,7 @@ InstructionCost X86TTIImpl::getAddressComputationCost(Type *Ty,
return 1;
}

-return BaseT::getAddressComputationCost(Ty, SE, Ptr);
+return BaseT::getAddressComputationCost(PtrTy, SE, Ptr);
}

InstructionCost
3 changes: 1 addition & 2 deletions llvm/lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
@@ -2309,8 +2309,7 @@ chainToBasePointerCost(SmallVectorImpl<Instruction *> &Chain,

} else if (GetElementPtrInst *GEP = dyn_cast<GetElementPtrInst>(Instr)) {
// Cost of the address calculation
-Type *ValTy = GEP->getSourceElementType();
-Cost += TTI.getAddressComputationCost(ValTy);
+Cost += TTI.getAddressComputationCost(GEP->getType());

// And cost of the GEP itself
// TODO: Use TTI->getGEPCost here (it exists, but appears to be not
11 changes: 7 additions & 4 deletions llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
@@ -5293,11 +5293,12 @@ LoopVectorizationCostModel::getUniformMemOpCost(Instruction *I,
assert(Legal->isUniformMemOp(*I, VF));

Type *ValTy = getLoadStoreType(I);
+Type *PtrTy = getLoadStorePointerOperand(I)->getType();
auto *VectorTy = cast<VectorType>(toVectorTy(ValTy, VF));
const Align Alignment = getLoadStoreAlignment(I);
unsigned AS = getLoadStoreAddressSpace(I);
if (isa<LoadInst>(I)) {
-return TTI.getAddressComputationCost(ValTy) +
+return TTI.getAddressComputationCost(PtrTy) +
TTI.getMemoryOpCost(Instruction::Load, ValTy, Alignment, AS,
CostKind) +
TTI.getShuffleCost(TargetTransformInfo::SK_Broadcast, VectorTy,
@@ -5310,7 +5311,7 @@ LoopVectorizationCostModel::getUniformMemOpCost(Instruction *I,
// VF.getKnownMinValue() - 1 from a scalable vector. This does not represent
// the actual generated code, which involves extracting the last element of
// a scalable vector where the lane to extract is unknown at compile time.
-return TTI.getAddressComputationCost(ValTy) +
+return TTI.getAddressComputationCost(PtrTy) +
TTI.getMemoryOpCost(Instruction::Store, ValTy, Alignment, AS,
CostKind) +
(IsLoopInvariantStoreValue
@@ -5326,8 +5327,9 @@ LoopVectorizationCostModel::getGatherScatterCost(Instruction *I,
auto *VectorTy = cast<VectorType>(toVectorTy(ValTy, VF));
const Align Alignment = getLoadStoreAlignment(I);
const Value *Ptr = getLoadStorePointerOperand(I);
Contributor:
Can't you just reuse the Ptr variable, like this?

  const Align Alignment = getLoadStoreAlignment(I);
  const Value *Ptr = getLoadStorePointerOperand(I);
  Type *PtrTy = toVectorTy(Ptr->getType(), VF);

Contributor (Author):
Whoops, yup, thanks.

+Type *PtrTy = toVectorTy(Ptr->getType(), VF);

-return TTI.getAddressComputationCost(VectorTy) +
+return TTI.getAddressComputationCost(PtrTy) +
TTI.getGatherScatterOpCost(I->getOpcode(), VectorTy, Ptr,
Legal->isMaskRequired(I), Alignment,
CostKind, I);
@@ -5562,11 +5564,12 @@ LoopVectorizationCostModel::getMemoryInstructionCost(Instruction *I,
// moment.
if (VF.isScalar()) {
Type *ValTy = getLoadStoreType(I);
+Type *PtrTy = getLoadStorePointerOperand(I)->getType();
const Align Alignment = getLoadStoreAlignment(I);
unsigned AS = getLoadStoreAddressSpace(I);

TTI::OperandValueInfo OpInfo = TTI::getOperandInfo(I->getOperand(0));
-return TTI.getAddressComputationCost(ValTy) +
+return TTI.getAddressComputationCost(PtrTy) +
TTI.getMemoryOpCost(I->getOpcode(), ValTy, Alignment, AS, CostKind,
OpInfo, I);
}
3 changes: 2 additions & 1 deletion llvm/lib/Transforms/Vectorize/VPlanRecipes.cpp
@@ -3104,9 +3104,10 @@ InstructionCost VPWidenMemoryRecipe::computeCost(ElementCount VF,
// Currently, ARM will use the underlying IR to calculate gather/scatter
// instruction cost.
const Value *Ptr = getLoadStorePointerOperand(&Ingredient);
+Type *PtrTy = toVectorTy(Ptr->getType(), VF);
assert(!Reverse &&
"Inconsecutive memory access should not have the order.");
-return Ctx.TTI.getAddressComputationCost(Ty) +
+return Ctx.TTI.getAddressComputationCost(PtrTy) +
Ctx.TTI.getGatherScatterOpCost(Opcode, Ty, Ptr, IsMasked, Alignment,
Ctx.CostKind, &Ingredient);
}
3 changes: 2 additions & 1 deletion llvm/lib/Transforms/Vectorize/VectorCombine.cpp
@@ -1790,7 +1790,8 @@ bool VectorCombine::scalarizeLoadExtract(Instruction &I) {
ScalarizedCost +=
TTI.getMemoryOpCost(Instruction::Load, VecTy->getElementType(),
Align(1), LI->getPointerAddressSpace(), CostKind);
-ScalarizedCost += TTI.getAddressComputationCost(VecTy->getElementType());
+ScalarizedCost +=
+TTI.getAddressComputationCost(LI->getPointerOperandType());
}

LLVM_DEBUG(dbgs() << "Found all extractions of a vector load: " << I