This is an archive of the discontinued LLVM Phabricator instance.

Differential D149889

[TTI] Use users of GEP to guess access type in getGEPCost
Closed, Public

Authored by luke on May 4 2023, 12:18 PM.

Details

Reviewers

ABataev
reames
craig.topper
RKSimon
nikic
asb

Commits

rGa68dcd09e808: [TTI] Use users of GEP to guess access type in getGEPCost

Summary

Currently getGEPCost uses the target type of the GEP as a heuristic for
the type that will be accessed, to pass onto isLegalAddressingMode.
Targets use this to work out if a GEP can then be folded into the
load/store instruction that uses the GEP.
For example, on RISC-V loads and stores can have an offset added to a
base register folded into a single instruction, so the following GEP is
free:

%p = getelementptr i32, ptr %base, i32 42 ; getInstructionCost = 0
%x = load i32, ptr %p ; getInstructionCost = 1
------------------------------------------------------------------------
lw t0, 42(a0)

However vector loads and stores cannot have an offset folded into them,
so the following GEP is costed:

%p = getelementptr <2 x i32>, ptr %base, i32 42 ; getInstructionCost = 1
%x = load <2 x i32>, ptr %p ; getInstructionCost = 1
------------------------------------------------------------------------
addi a0, a0, 42
vle32.v v8, (a0)

The issue arises whenever there is a mismatch between the target type of
the GEP and the type that is actually accessed:

%p = getelementptr i32, ptr %base, i32 42 ; getInstructionCost = 0
%x = load <2 x i32>, ptr %p ; getInstructionCost = 1
------------------------------------------------------------------------
addi a0, a0, 42
vle32.v v8, (a0)

Even though this GEP will result in an add instruction, because TTI
thinks it's loading an i32, it will think it can be folded and not
charge for it.
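
To make the mismatch concrete, here is a minimal, self-contained C++ sketch. It is a toy model and not LLVM code: `ToyType`, `isLegalAddressingModeToy`, `gepCostToy` and `checkSummaryExamples` are hypothetical names, and the folding rule (scalar accesses accept a signed 12-bit byte offset, vector accesses accept no offset) is an assumption modelled on the RISC-V behaviour described above. The index 42 on an i32 GEP is treated as a byte offset of 42 * 4 = 168.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-ins for LLVM types; none of these are real LLVM APIs.
enum class ToyType { ScalarI32, VectorV2I32 };

constexpr int TCC_Free = 0;  // mirrors TTI::TCC_Free
constexpr int TCC_Basic = 1; // mirrors TTI::TCC_Basic

// Assumed RISC-V-like rule: scalar accesses can fold a signed 12-bit
// immediate, vector accesses can only use a bare base register.
constexpr bool isLegalAddressingModeToy(ToyType AccessTy, int64_t BaseOffset) {
  if (AccessTy == ToyType::VectorV2I32)
    return BaseOffset == 0;
  return BaseOffset >= -2048 && BaseOffset <= 2047;
}

// The GEP is free only if its offset folds into the memory access whose
// type is passed in as TypeUsedForQuery.
constexpr int gepCostToy(ToyType TypeUsedForQuery, int64_t BaseOffset) {
  return isLegalAddressingModeToy(TypeUsedForQuery, BaseOffset) ? TCC_Free
                                                                : TCC_Basic;
}

inline void checkSummaryExamples() {
  // Querying with the GEP's own type (i32) says the GEP is free ...
  assert(gepCostToy(ToyType::ScalarI32, 168) == TCC_Free);
  // ... but querying with the type actually loaded (<2 x i32>) charges for it.
  assert(gepCostToy(ToyType::VectorV2I32, 168) == TCC_Basic);
}
```

In this model the third example from the summary is mis-costed exactly because the query uses `ScalarI32` where the user of the GEP performs a `VectorV2I32` access.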

The target type can become mismatched with the memory access during
transformations, notably during SLP vectorization, where a scalar base
pointer will be reused to perform a vector load or store.

This patch adds an optional AccessType argument to getGEPCost which
allows the type of memory accessed by users to be passed in as a hint,
so that we can more accurately determine if the GEP can be folded into
its users.

If AccessType is not provided, getGEPCost falls back to the old
behaviour of using the PointeeType to guess the memory access type. This
can be revisited in a later patch.

Also for now, only GEPs with exactly one user use the access type hint.
Whilst we could look through all users and use all access types to
determine if we can fold the GEP, this patch avoids doing so to prevent
O(N) behaviour.
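
The sole-user rule and the fallback described above can be sketched as follows. This is an illustrative toy under stated assumptions, not the code from the patch: `ToyUser`, `accessTypeHint`, `typeForAddressingQuery` and `checkSoleUserExamples` are hypothetical names, and a null pointer plays the role of a missing AccessType.

```cpp
#include <cassert>
#include <vector>

enum class ToyType { ScalarI32, VectorV2I32 };

// Hypothetical stand-in for a user of the GEP. AccessTy is nullptr for
// users that do not access memory.
struct ToyUser {
  const ToyType *AccessTy;
};

// Only consult the user list when there is exactly one user, so the query
// never walks an arbitrarily long list.
inline const ToyType *accessTypeHint(const std::vector<ToyUser> &Users) {
  if (Users.size() != 1)
    return nullptr;
  return Users.front().AccessTy;
}

// Without a hint, fall back to the type derived from the GEP itself.
inline ToyType typeForAddressingQuery(const ToyType *Hint, ToyType TargetTy) {
  return Hint ? *Hint : TargetTy;
}

inline void checkSoleUserExamples() {
  const ToyType Vec = ToyType::VectorV2I32;
  const std::vector<ToyUser> OneLoad = {ToyUser{&Vec}};
  const std::vector<ToyUser> TwoLoads = {ToyUser{&Vec}, ToyUser{&Vec}};
  // One user: its access type wins over the GEP's target type.
  assert(typeForAddressingQuery(accessTypeHint(OneLoad), ToyType::ScalarI32) ==
         ToyType::VectorV2I32);
  // Several users: no hint is passed, so the old behaviour is kept.
  assert(typeForAddressingQuery(accessTypeHint(TwoLoads), ToyType::ScalarI32) ==
         ToyType::ScalarI32);
}
```

Keeping the no-hint path identical to the previous behaviour is what limits the test churn of the change.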

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

luke created this revision. May 4 2023, 12:18 PM

Herald added a project: Restricted Project. May 4 2023, 12:18 PM

Herald added subscribers: kosarev, pmatos, StephenFan and 27 others.

luke requested review of this revision. May 4 2023, 12:18 PM

Herald added a project: Restricted Project. May 4 2023, 12:18 PM

Herald added subscribers: llvm-commits, pcwang-thead, MaskRay.

luke added a parent revision: D149888: [RISCV][SLP] Add tests for unprofitable SLP vectorization due to GEP. NFC. May 4 2023, 12:18 PM

luke edited the summary of this revision. May 4 2023, 12:21 PM

luke added inline comments. May 4 2023, 12:27 PM

llvm/include/llvm/Analysis/TargetTransformInfo.h
301	Not overly thrilled with introducing a struct just for this, nor with the constructors. Suggestions are very welcome here
llvm/lib/Analysis/TargetTransformInfo.cpp
229–233	I remember seeing this logic being duplicated in a few places across llvm, earlycse is the first one that comes to mind. It might be worthwhile addressing this in a follow up patch
llvm/test/Transforms/SLPVectorizer/RISCV/gep.ll
18 ↗	(On Diff #519600)	This is the main result from all the noise in the test cases.

nikic added a comment.

You should be able to avoid the LICM changes by adjusting the code at https://github.com/llvm/llvm-project/blob/bfb7c99f3aeab09236adf1f684f7144f384c6dd7/llvm/lib/Transforms/Scalar/LICM.cpp#L1346-L1361 to only pass the users in the loop (and drop the separate user iteration that code does).

Something I'm concerned about here is that your default implementation is going to scan all users of the instruction, which means that the cost query is now O(n) instead of O(1). This is probably not acceptable for compile-time reasons. Can we get away with only inspecting a single user in that case and assume it is representative?

This change seems to cause some pretty large code size changes on x86 at least (https://llvm-compile-time-tracker.com/compare.php?from=b77265711b390a6905e8901694d75f4b3812c71a&to=113f1989fa7e469407942249b4086e3a28da2bf4&stat=size-text), so it seems like a pretty high impact change. I can't say whether those changes are good or bad though :)

Harbormaster completed remote builds in B230058: Diff 519600. May 4 2023, 1:19 PM

Only account for users in loop in LICM, fix inliner test

In D149889#4319936, @nikic wrote:

> You should be able to avoid the LICM changes by adjusting the code at https://github.com/llvm/llvm-project/blob/bfb7c99f3aeab09236adf1f684f7144f384c6dd7/llvm/lib/Transforms/Scalar/LICM.cpp#L1346-L1361 to only pass the users in the loop (and drop the separate user iteration that code does).

Thanks, just tried it and it seems to work.

> Something I'm concerned about here is that your default implementation is going to scan all users of the instruction, which means that the cost query is now O(n) instead of O(1). This is probably not acceptable for compile-time reasons. Can we get away with only inspecting a single user in that case and assume it is representative?

Yeah I think that should still work for our SLP use case, sounds like a good compromise. Will try that and report back.

> This change seems to cause some pretty large code size changes on x86 at least (https://llvm-compile-time-tracker.com/compare.php?from=b77265711b390a6905e8901694d75f4b3812c71a&to=113f1989fa7e469407942249b4086e3a28da2bf4&stat=size-text), so it seems like a pretty high impact change. I can't say whether those changes are good or bad though :)

Indeed, from a quick look it looks like it might be affecting the inliner?

Harbormaster completed remote builds in B230209: Diff 519811. May 5 2023, 5:51 AM

luke added a parent revision: D150583: [IR] Add getAccessType to Instruction. May 15 2023, 9:04 AM

Only use the first user to avoid O(N) compile times.
Move the memory access type logic into Instruction.h in a dependent patch.

Remove redundant empty check

luke added inline comments. May 15 2023, 9:10 AM

llvm/lib/Transforms/Scalar/LICM.cpp
1346–1354 ↗	(On Diff #522223)	@nikic I kept the O(N) behaviour here because it looks like we're already iterating through all the users below. But this area seems sensitive. Do we want O(1) behaviour in this block too?

Harbormaster completed remote builds in B232033: Diff 522223. May 15 2023, 11:34 AM

asb added inline comments. May 17 2023, 3:49 AM

llvm/include/llvm/Analysis/TargetTransformInfo.h
286–296	Perhaps it's worth expanding the doc comment to explain how AccessTypes may be used? e.g. "Callers may pass in the access type of users of the GEP in order to allow a more accurate cost, for instance on targets where the legal addressing modes are different for different types." Of course personal tastes on the amount of doc comments to provide can vary....

luke added inline comments. May 17 2023, 5:38 AM

llvm/include/llvm/Analysis/TargetTransformInfo.h
286–296	It definitely needs some documentation, thanks for pointing that out. The usage of AccessTypes can be quite subtle when it comes to users that aren't loads/stores

Add documentation comments

ABataev added inline comments. May 17 2023, 5:47 AM

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
1018	Shall we use getArithmeticInstrCost(ADD) instead of TCC_Basic?
1080	Same question, cost of int add instead of basic?

luke added inline comments. May 17 2023, 6:37 AM

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
1080	I agree we should use getArithmeticInstrCost here, but I have some ideas about adding a new target hook to control this and was wondering if I could defer it to that patch? On RISC-V if `Scale != 1` then we will need an extra `mul` instruction, as opposed to aarch64 which can perform both an add and multiply in one. I think this would make sense to expose the address calculation cost as a hook since it's target dependent, and we already have all the information available here to pass onto TTI.
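
As a rough illustration of the hook idea floated in this comment, the sketch below models the cost of an address computation that could not be folded. Everything here is hypothetical: `ToyTarget`, `addressComputationCostToy` and `checkScaleExamples` are invented names, no such hook is part of this patch, and the unit costs are assumptions that only mirror the comment's claim that one kind of target needs a separate instruction for the scale while another can scale and add in one.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical target flavours, loosely following the comment above.
enum class ToyTarget { SeparateScale, FusedScaleAdd };

// Assumed cost, in instructions, of computing base + index * Scale when the
// address cannot be folded into the memory access. Scale == 0 means there
// is no scaled index register at all.
constexpr int addressComputationCostToy(ToyTarget Target, int64_t Scale) {
  const int AddCost = 1;
  if (Scale == 0 || Scale == 1)
    return AddCost; // a plain add is enough
  // A non-unit scale needs extra work unless the target can fuse it.
  return Target == ToyTarget::FusedScaleAdd ? AddCost : AddCost + 1;
}

inline void checkScaleExamples() {
  assert(addressComputationCostToy(ToyTarget::SeparateScale, 4) == 2);
  assert(addressComputationCostToy(ToyTarget::FusedScaleAdd, 4) == 1);
}
```

A flat TCC_Basic corresponds to always returning 1 here, which is why the comment suggests letting the target model the difference.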

Harbormaster completed remote builds in B232583: Diff 523014. May 17 2023, 6:49 AM

ABataev added inline comments. May 17 2023, 7:26 AM

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
1080	Add FIXME/TODOs here then

Add TODO about modelling with arithmetic instr

luke marked 2 inline comments as done. May 18 2023, 4:28 AM

ABataev added inline comments. May 18 2023, 5:51 AM

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
1018	return BaseGV ? TTI::TCC_Basic : TTI::TCC_Free;

Harbormaster completed remote builds in B232834: Diff 523338. May 18 2023, 6:12 AM

luke mentioned this in D149654: [SLP][RISCV] Account for offset folding in getPointersChainCost. May 19 2023, 5:16 AM

luke mentioned this in D150862: [RISCV][CodeGenPrepare] Select the optimal base offset for GEPs with large offset. May 19 2023, 7:02 AM

luke mentioned this in rGc27a0b21c578: [SLP][RISCV] Account for offset folding in getPointersChainCost. May 22 2023, 5:55 AM

nikic added inline comments. May 28 2023, 12:23 PM

llvm/include/llvm/Analysis/TargetTransformInfo.h
293	to determine
296	stray "which"?
309	ArrayRef should suffice?
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
1018	Why is the BaseGV case not free as well?
llvm/lib/Transforms/Scalar/LICM.cpp
1346–1354 ↗	(On Diff #522223)	It's okay, but you should drop the extra loop below. Its purpose is to essentially do the check you're doing here, just in a very crude way.

Rebase and address review comments

Herald added a subscriber: wangpc. Jun 21 2023, 5:58 AM

Harbormaster completed remote builds in B240219: Diff 533228. Jun 21 2023, 5:58 AM

@nikic @ABataev thanks for the review, and sorry for the long delay here. I've rebased this and have run some benchmarks now that SLP is enabled by default for RISC-V.
I'm still seeing a reduction in the number of small unprofitable VFs on RISC-V with this patch, and it also seems to make a few more VF=8s profitable.
As a rough overview, here's the number of vsetivlis in the benchmark before:

vsetivli : 467 total
1: 26
2: 247
3: 2
4: 169
5: 1
6: 0
7: 1
8: 17

And after:

vsetivli : 454 total
1: 24
2: 235
3: 2
4: 168
5: 1
6: 0
7: 1
8: 19
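
The per-bucket change between the two listings can be worked out mechanically. The sketch below copies the eight listed rows verbatim and computes the differences; `Before`, `After`, `deltaForBucket` and `totalListedDelta` are names invented for this illustration, and the bucket labels 1 to 8 are taken at face value from the listings.

```cpp
#include <array>
#include <cassert>

// Counts copied from the two listings above, for buckets 1..8.
constexpr std::array<int, 8> Before = {26, 247, 2, 169, 1, 0, 1, 17};
constexpr std::array<int, 8> After = {24, 235, 2, 168, 1, 0, 1, 19};

// Change in the number of vsetivlis for one listed bucket (1-based).
constexpr int deltaForBucket(int Bucket) {
  return After[Bucket - 1] - Before[Bucket - 1];
}

// Sum of the changes over all eight listed buckets.
constexpr int totalListedDelta() {
  int Sum = 0;
  for (int Bucket = 1; Bucket <= 8; ++Bucket)
    Sum += deltaForBucket(Bucket);
  return Sum;
}

inline void checkListedDeltas() {
  assert(deltaForBucket(2) == -12); // fewer 2-element cases
  assert(deltaForBucket(8) == 2);   // two more 8-element cases
  assert(totalListedDelta() == -13);
}
```

The listed rows account for a net change of -13, which matches the difference between the two reported totals (454 - 467).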

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
1018	This is just to match the behaviour of the previous `if (Operands.empty())` check. I'll refactor this so that they share this logic

reames added a comment.

A macro comment on the approach here. I think you're trying to do too much in a single change, and that makes it very hard to review.

I would advise doing the following:

1. Remove the array of users API. Instead, pass a single optional access type.
2. In TTIImpl, get the access type of the *sole user*. If there isn't a sole user, *do not pass one*.
3. As a first step, use the access type in the legal addressing mode if available, and *otherwise the pointee type*. This is slightly wrong, but is wrong in the "it's very close to what we used to do" kind of way which will reduce test deltas significantly. You can change this behavior in a follow up change if profitable.
4. Add tests *via cost model* which demonstrate the change above. You should be able to (on RISCV) demonstrate the difference between access type of vector vs scalar.
5. All of the other changes in this review - i.e. everything in lib/Transform - should be independent changes structured after the above.

Make AccessType just a single pointer
Isolate changes to just cost model for now: will follow up with SLP and other users later
Preserve original behaviour if accesstype isn't passed

In D149889#4438606, @reames wrote:

> A macro comment on the approach here. I think you're trying to do too much in a single change, and that makes it very hard to review.
>
> I would advise doing the following:
>
> 1. Remove the array of users API. Instead, pass a single optional access type.
> 2. In TTIImpl, get the access type of the *sole user*. If there isn't a sole user, *do not pass one*.
> 3. As a first step, use the access type in the legal addressing mode if available, and *otherwise the pointee type*. This is slightly wrong, but is wrong in the "it's very close to what we used to do" kind of way which will reduce test deltas significantly. You can change this behavior in a follow up change if profitable.
> 4. Add tests *via cost model* which demonstrate the change above. You should be able to (on RISCV) demonstrate the difference between access type of vector vs scalar.
> 5. All of the other changes in this review - i.e. everything in lib/Transform - should be independent changes structured after the above.

Agreed, thanks for the pointers. I've split it up and the diff is much smaller now. Will follow up with other patches to do the lib/Transform changes.

luke mentioned this in D153570: [SLP] Explicitly pass AccessTy to getGEPCost. Jun 22 2023, 9:28 AM

luke added a child revision: D153570: [SLP] Explicitly pass AccessTy to getGEPCost. Jun 22 2023, 9:29 AM

luke mentioned this in D153574: [CostModel] Use operands argument in getInstructionCost in more places. Jun 22 2023, 10:07 AM

Harbormaster completed remote builds in B240534: Diff 533642. Jun 22 2023, 11:46 AM

luke mentioned this in rG1c70c2bc2c7d: [CostModel] Use operands argument in getInstructionCost in more places. Jun 23 2023, 3:52 AM

LGTM w/required changes.

In addition to the inline code comment, make sure you update your commit message. The current review description is out of sync with the code and you must fix that before landing.

llvm/lib/Target/X86/X86TargetTransformInfo.cpp
4972	Pass nullptr here so that you don't change the x86 behavior.

This revision is now accepted and ready to land. Jun 28 2023, 10:37 AM

luke edited the summary of this revision. Jun 28 2023, 2:58 PM

luke edited the summary of this revision.

Closed by commit rGa68dcd09e808: [TTI] Use users of GEP to guess access type in getGEPCost (authored by luke). Jun 29 2023, 5:45 AM

This revision was automatically updated to reflect the committed changes.

luke mentioned this in rGcb941f9220d5: [RISCV][SLP] Add tests for GEP costs.

luke added a commit: rGa68dcd09e808: [TTI] Use users of GEP to guess access type in getGEPCost.

luke removed a parent revision: D149888: [RISCV][SLP] Add tests for unprofitable SLP vectorization due to GEP. NFC. Jun 29 2023, 5:45 AM

luke mentioned this in rGd0d864f6f482: [SLP] Explicitly pass AccessTy to getGEPCost. Jun 29 2023, 11:10 AM

luke mentioned this in D155960: [NaryReassociate] Use new access type aware getGEPCost. Jul 21 2023, 8:00 AM

Revision Contents

Path                                                           Size
llvm/include/llvm/Analysis/TargetTransformInfo.h               17 lines
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h           34 lines
llvm/include/llvm/CodeGen/BasicTTIImpl.h                       4 lines
llvm/lib/Analysis/TargetTransformInfo.cpp                      9 lines
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp             2 lines
llvm/lib/Target/X86/X86TargetTransformInfo.cpp                 3 lines
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp                3 lines
llvm/test/Analysis/CostModel/ARM/mve-gather-scatter-cost.ll    2 lines
llvm/test/Analysis/CostModel/RISCV/gep.ll                      20 lines

Diff 535750

llvm/include/llvm/Analysis/TargetTransformInfo.h

[277 unchanged lines hidden]
   /// skipped by renaming the registers in the CPU, but they still are encoded
   /// and thus wouldn't be considered 'free' here.
   enum TargetCostConstants {
     TCC_Free = 0, ///< Expected to fold away in lowering.
     TCC_Basic = 1, ///< The cost of a typical 'add' instruction.
     TCC_Expensive = 4 ///< The cost of a 'div' instruction on x86.
   };

   /// Estimate the cost of a GEP operation when lowered.
+  ///
+  /// \p PointeeType is the source element type of the GEP.
+  /// \p Ptr is the base pointer operand.
+  /// \p Operands is the list of indices following the base pointer.
+  ///
+  /// \p AccessType is a hint as to what type of memory might be accessed by
+  /// users of the GEP. getGEPCost will use it to determine if the GEP can be
+  /// folded into the addressing mode of a load/store. If AccessType is null,
+  /// then the resulting target type based off of PointeeType will be used as an
+  /// approximation.
   InstructionCost
   getGEPCost(Type *PointeeType, const Value *Ptr,
-             ArrayRef<const Value *> Operands,
+             ArrayRef<const Value *> Operands, Type *AccessType = nullptr,
              TargetCostKind CostKind = TCK_SizeAndLatency) const;

   /// Describe known properties for a set of pointers.
   struct PointersChainInfo {
     /// All the GEPs in a set have same base address.
     unsigned IsSameBaseAddress : 1;
     /// These properties only valid if SameBaseAddress is set.
     /// True if all pointers are separated by a unit stride.
     unsigned IsUnitStride : 1;
     /// True if distance between any two neigbouring pointers is a known value.
     unsigned IsKnownStride : 1;
     unsigned Reserved : 29;

     bool isSameBase() const { return IsSameBaseAddress; }
     bool isUnitStride() const { return IsSameBaseAddress && IsUnitStride; }
     bool isKnownStride() const { return IsSameBaseAddress && IsKnownStride; }

     static PointersChainInfo getUnitStride() {
[1,367 unchanged lines hidden]
 };

 class TargetTransformInfo::Concept {
 public:
   virtual ~Concept() = 0;
   virtual const DataLayout &getDataLayout() const = 0;
   virtual InstructionCost getGEPCost(Type *PointeeType, const Value *Ptr,
                                      ArrayRef<const Value *> Operands,
+                                     Type *AccessType,
                                      TTI::TargetCostKind CostKind) = 0;
   virtual InstructionCost
   getPointersChainCost(ArrayRef<const Value *> Ptrs, const Value *Base,
                        const TTI::PointersChainInfo &Info, Type *AccessTy,
                        TTI::TargetCostKind CostKind) = 0;
   virtual unsigned getInliningThresholdMultiplier() const = 0;
   virtual unsigned adjustInliningThreshold(const CallBase *CB) = 0;
   virtual int getInlinerVectorBonusPercent() const = 0;
[345 unchanged lines hidden]
   ~Model() override = default;

   const DataLayout &getDataLayout() const override {
     return Impl.getDataLayout();
   }

   InstructionCost
   getGEPCost(Type *PointeeType, const Value *Ptr,
-             ArrayRef<const Value *> Operands,
+             ArrayRef<const Value *> Operands, Type *AccessType,
              TargetTransformInfo::TargetCostKind CostKind) override {
-    return Impl.getGEPCost(PointeeType, Ptr, Operands, CostKind);
+    return Impl.getGEPCost(PointeeType, Ptr, Operands, AccessType, CostKind);
   }
   InstructionCost getPointersChainCost(ArrayRef<const Value *> Ptrs,
                                        const Value *Base,
                                        const PointersChainInfo &Info,
                                        Type *AccessTy,
                                        TargetCostKind CostKind) override {
     return Impl.getPointersChainCost(Ptrs, Base, Info, AccessTy, CostKind);
   }
[786 unchanged lines hidden]

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

[41 unchanged lines hidden]
 public:
   // Provide value semantics. MSVC requires that we spell all of these out.
   TargetTransformInfoImplBase(const TargetTransformInfoImplBase &Arg) = default;
   TargetTransformInfoImplBase(TargetTransformInfoImplBase &&Arg) : DL(Arg.DL) {}

   const DataLayout &getDataLayout() const { return DL; }

   InstructionCost getGEPCost(Type *PointeeType, const Value *Ptr,
-                             ArrayRef<const Value *> Operands,
+                             ArrayRef<const Value *> Operands, Type *AccessType,
                              TTI::TargetCostKind CostKind) const {
     // In the basic model, we just assume that all-constant GEPs will be folded
     // into their uses via addressing modes.
     for (const Value *Operand : Operands)
       if (!isa<Constant>(Operand))
         return TTI::TCC_Basic;

     return TTI::TCC_Free;
[923 unchanged lines hidden]

 protected:
   explicit TargetTransformInfoImplCRTPBase(const DataLayout &DL) : BaseT(DL) {}

 public:
   using BaseT::getGEPCost;

   InstructionCost getGEPCost(Type *PointeeType, const Value *Ptr,
-                             ArrayRef<const Value *> Operands,
+                             ArrayRef<const Value *> Operands, Type *AccessType,
                              TTI::TargetCostKind CostKind) {
     assert(PointeeType && Ptr && "can't get GEPCost of nullptr");
     assert(cast<PointerType>(Ptr->getType()->getScalarType())
                ->isOpaqueOrPointeeTypeMatches(PointeeType) &&
            "explicit pointee type doesn't match operand's pointee type");
     auto *BaseGV = dyn_cast<GlobalValue>(Ptr->stripPointerCasts());
     bool HasBaseReg = (BaseGV == nullptr);

[11 unchanged lines hidden]

     for (auto I = Operands.begin(); I != Operands.end(); ++I, ++GTI) {
       TargetType = GTI.getIndexedType();
       // We assume that the cost of Scalar GEP with constant index and the
       // cost of Vector GEP with splat constant index are the same.
       const ConstantInt *ConstIdx = dyn_cast<ConstantInt>(*I);
       if (!ConstIdx)
         if (auto Splat = getSplatValue(*I))
           ConstIdx = dyn_cast<ConstantInt>(Splat);
       if (StructType *STy = GTI.getStructTypeOrNull()) {
         // For structures the index is always splat or scalar constant
         assert(ConstIdx && "Unexpected GEP index");
         uint64_t Field = ConstIdx->getZExtValue();
         BaseOffset += DL.getStructLayout(STy)->getElementOffset(Field);
       } else {
         // If this operand is a scalable type, bail out early.
         // TODO: handle scalable vectors
[9 unchanged lines hidden]
           if (Scale != 0)
             // No addressing mode takes two scale registers.
             return TTI::TCC_Basic;
           Scale = ElementSize;
         }
       }
     }

+    // If we haven't been provided a hint, use the target type for now.
+    //
+    // TODO: Take a look at potentially removing this: This is slightly wrong
+    // as it's possible to have a GEP with a foldable target type but a memory
+    // access that isn't foldable. For example, this load isn't foldable on
+    // RISC-V:
+    //
+    // %p = getelementptr i32, ptr %base, i32 42
+    // %x = load <2 x i32>, ptr %p
+    if (!AccessType)
+      AccessType = TargetType;
+
+    // If the final address of the GEP is a legal addressing mode for the given
+    // access type, then we can fold it into its users.
     if (static_cast<T *>(this)->isLegalAddressingMode(
-            TargetType, const_cast<GlobalValue *>(BaseGV),
+            AccessType, const_cast<GlobalValue *>(BaseGV),
             BaseOffset.sextOrTrunc(64).getSExtValue(), HasBaseReg, Scale,
             Ptr->getType()->getPointerAddressSpace()))
       return TTI::TCC_Free;
+
+    // TODO: Instead of returning TCC_Basic here, we should use
+    // getArithmeticInstrCost. Or better yet, provide a hook to let the target
+    // model it.
     return TTI::TCC_Basic;
   }

InstructionCost getPointersChainCost(ArrayRef<const Value *> Ptrs,		InstructionCost getPointersChainCost(ArrayRef<const Value *> Ptrs,
const Value *Base,		const Value *Base,
const TTI::PointersChainInfo &Info,		const TTI::PointersChainInfo &Info,
Type *AccessTy,		Type *AccessTy,
TTI::TargetCostKind CostKind) {		TTI::TargetCostKind CostKind) {
InstructionCost Cost = TTI::TCC_Free;		InstructionCost Cost = TTI::TCC_Free;
// In the basic model we take into account GEP instructions only		// In the basic model we take into account GEP instructions only
// (although here can come alloca instruction, a value, constants and/or		// (although here can come alloca instruction, a value, constants and/or
// constant expressions, PHIs, bitcasts ... whatever allowed to be used as a		// constant expressions, PHIs, bitcasts ... whatever allowed to be used as a
// pointer). Typically, if Base is a not a GEP-instruction and all the		// pointer). Typically, if Base is a not a GEP-instruction and all the
// pointers are relative to the same base address, all the rest are		// pointers are relative to the same base address, all the rest are
		ABataevUnsubmitted Done Reply Inline Actions Same question, cost of int add instead of basic? ABataev: Same question, cost of int add instead of basic?
		lukeAuthorUnsubmitted Done Reply Inline Actions I agree we should use getArithmeticInstrCost here, but I have some ideas about adding a new target hook to control this and was wondering if I could defer it to that patch? On RISC-V if `Scale != 1` then we will need an extra `mul` instruction, as opposed to aarch64 which can perform both an add and multiply in one. I think this would make sense to expose the address calculation cost as a hook since it's target dependent, and we already have all the information available here to pass onto TTI. luke: I agree we should use getArithmeticInstrCost here, but I have some ideas about adding a new…
		ABataevUnsubmitted Done Reply Inline Actions Add FIXME/TODOs here then ABataev: Add FIXME/TODOs here then
// either GEP instructions, PHIs, bitcasts or constants. When we have same		// either GEP instructions, PHIs, bitcasts or constants. When we have same
// base, we just calculate cost of each non-Base GEP as an ADD operation if		// base, we just calculate cost of each non-Base GEP as an ADD operation if
// any their index is a non-const.		// any their index is a non-const.
// If no known dependecies between the pointers cost is calculated as a sum		// If no known dependecies between the pointers cost is calculated as a sum
// of costs of GEP instructions.		// of costs of GEP instructions.
for (const Value *V : Ptrs) {		for (const Value *V : Ptrs) {
const auto *GEP = dyn_cast<GetElementPtrInst>(V);		const auto *GEP = dyn_cast<GetElementPtrInst>(V);
if (!GEP)		if (!GEP)
continue;		continue;
if (Info.isSameBase() && V != Base) {		if (Info.isSameBase() && V != Base) {
if (GEP->hasAllConstantIndices())		if (GEP->hasAllConstantIndices())
continue;		continue;
Cost += static_cast<T *>(this)->getArithmeticInstrCost(		Cost += static_cast<T *>(this)->getArithmeticInstrCost(
Instruction::Add, GEP->getType(), CostKind,		Instruction::Add, GEP->getType(), CostKind,
{TTI::OK_AnyValue, TTI::OP_None}, {TTI::OK_AnyValue, TTI::OP_None},		{TTI::OK_AnyValue, TTI::OP_None}, {TTI::OK_AnyValue, TTI::OP_None},
std::nullopt);		std::nullopt);
} else {		} else {
SmallVector<const Value *> Indices(GEP->indices());		SmallVector<const Value *> Indices(GEP->indices());
Cost += static_cast<T *>(this)->getGEPCost(GEP->getSourceElementType(),		Cost += static_cast<T *>(this)->getGEPCost(GEP->getSourceElementType(),
GEP->getPointerOperand(),		GEP->getPointerOperand(),
Indices, CostKind);		Indices, nullptr, CostKind);
}		}
}		}
return Cost;		return Cost;
}		}
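The inline discussion in this function (RISC-V needing a separate multiply when `Scale != 1`, AArch64 folding the add and multiply into one instruction) can be illustrated with a small standalone model. This is only a sketch of the idea of a target-dependent address-calculation hook; `AddrCalc`, `ToyTarget` and `toyAddressCalcCost` are hypothetical names invented for illustration and are not part of LLVM or of this patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of the address computation a GEP would need.
// None of these names exist in LLVM; they only model the review discussion.
struct AddrCalc {
  bool HasVariableIndex; // true if at least one GEP index is non-constant
  int64_t Scale;         // multiplier applied to the variable index
};

// Two toy target flavours taken from the thread:
//  - AddThenMul: scaling needs its own instruction (the RISC-V case described)
//  - FusedAddMul: add and scale happen in one instruction (the AArch64 case)
enum class ToyTarget { AddThenMul, FusedAddMul };

// Hypothetical hook: number of extra ALU instructions needed to form the
// address. Constant-only offsets are assumed to be costed elsewhere.
inline unsigned toyAddressCalcCost(ToyTarget T, const AddrCalc &A) {
  if (!A.HasVariableIndex)
    return 0;
  unsigned Cost = 1; // the add of base and (scaled) index
  if (A.Scale != 1 && T == ToyTarget::AddThenMul)
    ++Cost; // separate instruction for the scale
  return Cost;
}
```

Under these assumptions a scaled variable index costs two instructions on the `AddThenMul` flavour and one on `FusedAddMul`, which is the difference a single `Instruction::Add` cost cannot express.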

InstructionCost getInstructionCost(const User *U,		InstructionCost getInstructionCost(const User *U,
ArrayRef<const Value *> Operands,		ArrayRef<const Value *> Operands,
TTI::TargetCostKind CostKind) {		TTI::TargetCostKind CostKind) {
Show All 35 Lines	InstructionCost getInstructionCost(const User *U,
case Instruction::Freeze:		case Instruction::Freeze:
return TTI::TCC_Free;		return TTI::TCC_Free;
case Instruction::Alloca:		case Instruction::Alloca:
if (cast<AllocaInst>(U)->isStaticAlloca())		if (cast<AllocaInst>(U)->isStaticAlloca())
return TTI::TCC_Free;		return TTI::TCC_Free;
break;		break;
case Instruction::GetElementPtr: {		case Instruction::GetElementPtr: {
const auto *GEP = cast<GEPOperator>(U);		const auto *GEP = cast<GEPOperator>(U);
		Type *AccessType = nullptr;
		// For now, only provide the AccessType in the simple case where the GEP
		// only has one user.
		if (GEP->hasOneUser() && I)
		AccessType = I->user_back()->getAccessType();

return TargetTTI->getGEPCost(GEP->getSourceElementType(),		return TargetTTI->getGEPCost(GEP->getSourceElementType(),
Operands.front(), Operands.drop_front(),		Operands.front(), Operands.drop_front(),
CostKind);		AccessType, CostKind);
}		}
case Instruction::Add:		case Instruction::Add:
case Instruction::FAdd:		case Instruction::FAdd:
case Instruction::Sub:		case Instruction::Sub:
case Instruction::FSub:		case Instruction::FSub:
case Instruction::Mul:		case Instruction::Mul:
case Instruction::FMul:		case Instruction::FMul:
case Instruction::UDiv:		case Instruction::UDiv:
▲ Show 20 Lines • Show All 220 Lines • Show Last 20 Lines
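The new `Instruction::GetElementPtr` case above only supplies an access type when the GEP has exactly one user. A minimal standalone model of that rule follows; `ToyType`, `ToyInst` and `guessAccessType` are hypothetical stand-ins, not LLVM classes, and counting entries in a user list is only an approximation of `hasOneUser()`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for LLVM types; None models a null Type pointer.
enum class ToyType { None, I32, V2I32 };

struct ToyInst {
  // Type read or written if this is a memory operation, None otherwise.
  ToyType AccessTy = ToyType::None;
  // Instructions that use this value (assumed to hold distinct users).
  std::vector<const ToyInst *> Users;
};

// Mirrors the "for now, only the single-user case" rule: with zero or
// several users no guess is made and the caller sees None.
inline ToyType guessAccessType(const ToyInst &GEP) {
  if (GEP.Users.size() != 1)
    return ToyType::None;
  return GEP.Users.back()->AccessTy;
}
```

With a single `<2 x i32>` load as the only user, the guess is the vector type rather than the GEP's own element type, which is the mismatch case described in the summary.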

llvm/include/llvm/CodeGen/BasicTTIImpl.h

Show First 20 Lines • Show All 414 Lines • ▼ Show 20 Lines	public:
}		}

unsigned getRegUsageForType(Type *Ty) {		unsigned getRegUsageForType(Type *Ty) {
EVT ETy = getTLI()->getValueType(DL, Ty);		EVT ETy = getTLI()->getValueType(DL, Ty);
return getTLI()->getNumRegisters(Ty->getContext(), ETy);		return getTLI()->getNumRegisters(Ty->getContext(), ETy);
}		}

InstructionCost getGEPCost(Type *PointeeType, const Value *Ptr,		InstructionCost getGEPCost(Type *PointeeType, const Value *Ptr,
ArrayRef<const Value *> Operands,		ArrayRef<const Value *> Operands, Type *AccessType,
TTI::TargetCostKind CostKind) {		TTI::TargetCostKind CostKind) {
return BaseT::getGEPCost(PointeeType, Ptr, Operands, CostKind);		return BaseT::getGEPCost(PointeeType, Ptr, Operands, AccessType, CostKind);
}		}

unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,		unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,
unsigned &JumpTableSize,		unsigned &JumpTableSize,
ProfileSummaryInfo *PSI,		ProfileSummaryInfo *PSI,
BlockFrequencyInfo *BFI) {		BlockFrequencyInfo *BFI) {
/// Try to find the estimated number of clusters. Note that the number of		/// Try to find the estimated number of clusters. Note that the number of
/// clusters identified in this function could be different from the actual		/// clusters identified in this function could be different from the actual
▲ Show 20 Lines • Show All 2,042 Lines • Show Last 20 Lines

llvm/lib/Analysis/TargetTransformInfo.cpp

Show First 20 Lines • Show All 220 Lines • ▼ Show 20 Lines	unsigned TargetTransformInfo::getCallerAllocaCost(const CallBase *CB,
const AllocaInst *AI) const {		const AllocaInst *AI) const {
return TTIImpl->getCallerAllocaCost(CB, AI);		return TTIImpl->getCallerAllocaCost(CB, AI);
}		}

int TargetTransformInfo::getInlinerVectorBonusPercent() const {		int TargetTransformInfo::getInlinerVectorBonusPercent() const {
return TTIImpl->getInlinerVectorBonusPercent();		return TTIImpl->getInlinerVectorBonusPercent();
}		}

InstructionCost		InstructionCost TargetTransformInfo::getGEPCost(
TargetTransformInfo::getGEPCost(Type *PointeeType, const Value *Ptr,		Type *PointeeType, const Value *Ptr, ArrayRef<const Value *> Operands,
ArrayRef<const Value *> Operands,		Type *AccessType, TTI::TargetCostKind CostKind) const {
TTI::TargetCostKind CostKind) const {		return TTIImpl->getGEPCost(PointeeType, Ptr, Operands, AccessType, CostKind);
return TTIImpl->getGEPCost(PointeeType, Ptr, Operands, CostKind);
}		}
		luke (author): I remember seeing this logic duplicated in a few places across LLVM; EarlyCSE is the first one that comes to mind. It might be worthwhile addressing this in a follow-up patch.

InstructionCost TargetTransformInfo::getPointersChainCost(		InstructionCost TargetTransformInfo::getPointersChainCost(
ArrayRef<const Value *> Ptrs, const Value *Base,		ArrayRef<const Value *> Ptrs, const Value *Base,
const TTI::PointersChainInfo &Info, Type *AccessTy,		const TTI::PointersChainInfo &Info, Type *AccessTy,
TTI::TargetCostKind CostKind) const {		TTI::TargetCostKind CostKind) const {
assert((Base || !Info.isSameBase()) &&		assert((Base || !Info.isSameBase()) &&
"If pointers have same base address it has to be provided.");		"If pointers have same base address it has to be provided.");
return TTIImpl->getPointersChainCost(Ptrs, Base, Info, AccessTy, CostKind);		return TTIImpl->getPointersChainCost(Ptrs, Base, Info, AccessTy, CostKind);
▲ Show 20 Lines • Show All 1,047 Lines • Show Last 20 Lines

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

Show First 20 Lines • Show All 1,635 Lines • ▼ Show 20 Lines	if (Info.isSameBase() && V != Base) {
continue;		continue;
Cost += getArithmeticInstrCost(Instruction::Add, GEP->getType(), CostKind,		Cost += getArithmeticInstrCost(Instruction::Add, GEP->getType(), CostKind,
{TTI::OK_AnyValue, TTI::OP_None},		{TTI::OK_AnyValue, TTI::OP_None},
{TTI::OK_AnyValue, TTI::OP_None},		{TTI::OK_AnyValue, TTI::OP_None},
std::nullopt);		std::nullopt);
} else {		} else {
SmallVector<const Value *> Indices(GEP->indices());		SmallVector<const Value *> Indices(GEP->indices());
Cost += getGEPCost(GEP->getSourceElementType(), GEP->getPointerOperand(),		Cost += getGEPCost(GEP->getSourceElementType(), GEP->getPointerOperand(),
Indices, CostKind);		Indices, nullptr, CostKind);
}		}
}		}
return Cost;		return Cost;
}		}

void RISCVTTIImpl::getUnrollingPreferences(Loop *L, ScalarEvolution &SE,		void RISCVTTIImpl::getUnrollingPreferences(Loop *L, ScalarEvolution &SE,
TTI::UnrollingPreferences &UP,		TTI::UnrollingPreferences &UP,
OptimizationRemarkEmitter *ORE) {		OptimizationRemarkEmitter *ORE) {
▲ Show 20 Lines • Show All 119 Lines • Show Last 20 Lines

llvm/lib/Target/X86/X86TargetTransformInfo.cpp


Show First 20 Lines • Show All 4,963 Lines • ▼ Show 20 Lines	X86TTIImpl::getPointersChainCost(ArrayRef<const Value *> Ptrs,
Type *AccessTy, TTI::TargetCostKind CostKind) {		Type *AccessTy, TTI::TargetCostKind CostKind) {
if (Info.isSameBase() && Info.isKnownStride()) {		if (Info.isSameBase() && Info.isKnownStride()) {
// If all the pointers have known stride all the differences are translated		// If all the pointers have known stride all the differences are translated
// into constants. X86 memory addressing allows encoding it into		// into constants. X86 memory addressing allows encoding it into
// displacement. So we just need to take the base GEP cost.		// displacement. So we just need to take the base GEP cost.
if (const auto *BaseGEP = dyn_cast<GetElementPtrInst>(Base)) {		if (const auto *BaseGEP = dyn_cast<GetElementPtrInst>(Base)) {
SmallVector<const Value *> Indices(BaseGEP->indices());		SmallVector<const Value *> Indices(BaseGEP->indices());
return getGEPCost(BaseGEP->getSourceElementType(),		return getGEPCost(BaseGEP->getSourceElementType(),
BaseGEP->getPointerOperand(), Indices, CostKind);		BaseGEP->getPointerOperand(), Indices, nullptr,
reames: Pass nullptr here so that you don't change the x86 behavior.
		CostKind);
}		}
return TTI::TCC_Free;		return TTI::TCC_Free;
}		}
return BaseT::getPointersChainCost(Ptrs, Base, Info, AccessTy, CostKind);		return BaseT::getPointersChainCost(Ptrs, Base, Info, AccessTy, CostKind);
}		}

InstructionCost X86TTIImpl::getAddressComputationCost(Type *Ty,		InstructionCost X86TTIImpl::getAddressComputationCost(Type *Ty,
ScalarEvolution *SE,		ScalarEvolution *SE,
▲ Show 20 Lines • Show All 1,666 Lines • Show Last 20 Lines

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp


Show First 20 Lines • Show All 7,448 Lines • ▼ Show 20 Lines	if (isa<LoadInst, StoreInst>(VL0)) {

// Remark: it not quite correct to use scalar GEP cost for a vector GEP,		// Remark: it not quite correct to use scalar GEP cost for a vector GEP,
// but it's not clear how to do that without having vector GEP arguments		// but it's not clear how to do that without having vector GEP arguments
// ready.		// ready.
// Perhaps using just TTI::TCC_Free/TTI::TCC_Basic would be better option.		// Perhaps using just TTI::TCC_Free/TTI::TCC_Basic would be better option.
if (const auto *Base = dyn_cast<GetElementPtrInst>(BasePtr)) {		if (const auto *Base = dyn_cast<GetElementPtrInst>(BasePtr)) {
SmallVector<const Value *> Indices(Base->indices());		SmallVector<const Value *> Indices(Base->indices());
VecCost = TTI->getGEPCost(Base->getSourceElementType(),		VecCost = TTI->getGEPCost(Base->getSourceElementType(),
Base->getPointerOperand(), Indices, CostKind);		Base->getPointerOperand(), Indices, nullptr,
		CostKind);
}		}
}		}

LLVM_DEBUG(dumpTreeCosts(E, 0, VecCost, ScalarCost,		LLVM_DEBUG(dumpTreeCosts(E, 0, VecCost, ScalarCost,
"Calculated GEPs cost for Tree"));		"Calculated GEPs cost for Tree"));

return VecCost - ScalarCost;		return VecCost - ScalarCost;
};		};
▲ Show 20 Lines • Show All 7,566 Lines • Show Last 20 Lines

llvm/test/Analysis/CostModel/ARM/mve-gather-scatter-cost.ll

	Show First 20 Lines • Show All 520 Lines • ▼ Show 20 Lines
	; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %gep3 = getelementptr i8, ptr %base, <16 x i32> %indsext			; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %gep3 = getelementptr i8, ptr %base, <16 x i32> %indsext
	; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %res3 = call <16 x i8> @llvm.masked.gather.v16i8.v16p0(<16 x ptr> %gep3, i32 2, <16 x i1> %mask, <16 x i8> undef)			; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %res3 = call <16 x i8> @llvm.masked.gather.v16i8.v16p0(<16 x ptr> %gep3, i32 2, <16 x i1> %mask, <16 x i8> undef)
	; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %res3, <16 x ptr> %gep3, i32 2, <16 x i1> %mask)			; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %res3, <16 x ptr> %gep3, i32 2, <16 x i1> %mask)
	; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %gepbs = getelementptr i16, ptr %base16, <16 x i32> %indzext			; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %gepbs = getelementptr i16, ptr %base16, <16 x i32> %indzext
	; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %gepbsb = bitcast <16 x ptr> %gepbs to <16 x ptr>			; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %gepbsb = bitcast <16 x ptr> %gepbs to <16 x ptr>
	; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %resbs = call <16 x i8> @llvm.masked.gather.v16i8.v16p0(<16 x ptr> %gepbsb, i32 2, <16 x i1> %mask, <16 x i8> undef)			; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: %resbs = call <16 x i8> @llvm.masked.gather.v16i8.v16p0(<16 x ptr> %gepbsb, i32 2, <16 x i1> %mask, <16 x i8> undef)
	; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %resbs, <16 x ptr> %gepbsb, i32 2, <16 x i1> %mask)			; CHECK-NEXT: Cost Model: Found an estimated cost of 224 for instruction: call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %resbs, <16 x ptr> %gepbsb, i32 2, <16 x i1> %mask)
	; CHECK-NEXT: Cost Model: Found an estimated cost of 74 for instruction: %indzext4 = zext <16 x i8> %ind8 to <16 x i32>			; CHECK-NEXT: Cost Model: Found an estimated cost of 74 for instruction: %indzext4 = zext <16 x i8> %ind8 to <16 x i32>
	; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %gep4 = getelementptr i8, ptr %base, <16 x i32> %indzext			; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %gep4 = getelementptr i8, ptr %base, <16 x i32> %indzext
	; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %indtrunc = trunc <16 x i32> %ind32 to <16 x i8>			; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: %indtrunc = trunc <16 x i32> %ind32 to <16 x i8>
	; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %indtrunc, <16 x ptr> %gep4, i32 2, <16 x i1> %mask)			; CHECK-NEXT: Cost Model: Found an estimated cost of 32 for instruction: call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %indtrunc, <16 x ptr> %gep4, i32 2, <16 x i1> %mask)
	; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void			; CHECK-NEXT: Cost Model: Found an estimated cost of 0 for instruction: ret void
	;			;
	; no offset ext			; no offset ext
	%gep1 = getelementptr i8, ptr %base, <16 x i32> %ind32			%gep1 = getelementptr i8, ptr %base, <16 x i32> %ind32
	%res1 = call <16 x i8> @llvm.masked.gather.v16i8.v16p0(<16 x ptr> %gep1, i32 1, <16 x i1> %mask, <16 x i8> undef)			%res1 = call <16 x i8> @llvm.masked.gather.v16i8.v16p0(<16 x ptr> %gep1, i32 1, <16 x i1> %mask, <16 x i8> undef)
	call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %res1, <16 x ptr> %gep1, i32 2, <16 x i1> %mask)			call void @llvm.masked.scatter.v16i8.v16p0(<16 x i8> %res1, <16 x ptr> %gep1, i32 2, <16 x i1> %mask)
	▲ Show 20 Lines • Show All 105 Lines • Show Last 20 Lines

llvm/test/Analysis/CostModel/RISCV/gep.ll

	Show First 20 Lines • Show All 255 Lines • ▼ Show 20 Lines
	}			}

	; Ensure that memory operations of a different type than the pointer source type			; Ensure that memory operations of a different type than the pointer source type
	; use the correct type to determine if folding is possible. These operations			; use the correct type to determine if folding is possible. These operations
	; are on vector types so there should be a cost for the GEP as the offset cannot			; are on vector types so there should be a cost for the GEP as the offset cannot
	; be folded into the instruction.			; be folded into the instruction.
	define void @non_foldable_vector_uses(ptr %base, <2 x ptr> %base.vec) {			define void @non_foldable_vector_uses(ptr %base, <2 x ptr> %base.vec) {
	; RVI-LABEL: 'non_foldable_vector_uses'			; RVI-LABEL: 'non_foldable_vector_uses'
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %1 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %x1 = load volatile <2 x i8>, ptr %1, align 2			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %x1 = load volatile <2 x i8>, ptr %1, align 2
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %2 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %2 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %x2 = call <2 x i8> @llvm.masked.load.v2i8.p0(ptr %2, i32 1, <2 x i1> undef, <2 x i8> undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %x2 = call <2 x i8> @llvm.masked.load.v2i8.p0(ptr %2, i32 1, <2 x i1> undef, <2 x i8> undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = getelementptr i8, <2 x ptr> %base.vec, <2 x i32> <i32 42, i32 43>			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %3 = getelementptr i8, <2 x ptr> %base.vec, <2 x i32> <i32 42, i32 43>
	; RVI-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %x3 = call <2 x i8> @llvm.masked.gather.v2i8.v2p0(<2 x ptr> %3, i32 1, <2 x i1> undef, <2 x i8> undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 2 for instruction: %x3 = call <2 x i8> @llvm.masked.gather.v2i8.v2p0(<2 x ptr> %3, i32 1, <2 x i1> undef, <2 x i8> undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %4 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %4 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %x4 = call <2 x i8> @llvm.masked.expandload.v2i8(ptr %4, <2 x i1> undef, <2 x i8> undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %x4 = call <2 x i8> @llvm.masked.expandload.v2i8(ptr %4, <2 x i1> undef, <2 x i8> undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %5 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %5 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %x5 = call <2 x i8> @llvm.vp.load.v2i8.p0(ptr %5, <2 x i1> undef, i32 undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %x5 = call <2 x i8> @llvm.vp.load.v2i8.p0(ptr %5, <2 x i1> undef, i32 undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %6 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %6 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %x6 = call <2 x i8> @llvm.experimental.vp.strided.load.v2i8.p0.i64(ptr %6, i64 undef, <2 x i1> undef, i32 undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 5 for instruction: %x6 = call <2 x i8> @llvm.experimental.vp.strided.load.v2i8.p0.i64(ptr %6, i64 undef, <2 x i1> undef, i32 undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %7 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %7 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: store volatile <2 x i8> undef, ptr %7, align 2			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: store volatile <2 x i8> undef, ptr %7, align 2
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %8 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %8 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: call void @llvm.masked.store.v2i8.p0(<2 x i8> undef, ptr %8, i32 1, <2 x i1> undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: call void @llvm.masked.store.v2i8.p0(<2 x i8> undef, ptr %8, i32 1, <2 x i1> undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %9 = getelementptr i8, <2 x ptr> %base.vec, <2 x i32> <i32 42, i32 43>			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %9 = getelementptr i8, <2 x ptr> %base.vec, <2 x i32> <i32 42, i32 43>
	; RVI-NEXT: Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.masked.scatter.v2i8.v2p0(<2 x i8> undef, <2 x ptr> %9, i32 1, <2 x i1> undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 2 for instruction: call void @llvm.masked.scatter.v2i8.v2p0(<2 x i8> undef, <2 x ptr> %9, i32 1, <2 x i1> undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %10 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %10 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.masked.compressstore.v2i8(<2 x i8> undef, ptr %10, <2 x i1> undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.masked.compressstore.v2i8(<2 x i8> undef, ptr %10, <2 x i1> undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %11 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %11 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.vp.store.v2i8.p0(<2 x i8> undef, ptr %11, <2 x i1> undef, i32 undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.vp.store.v2i8.p0(<2 x i8> undef, ptr %11, <2 x i1> undef, i32 undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 0 for instruction: %12 = getelementptr i8, ptr %base, i32 42			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %12 = getelementptr i8, ptr %base, i32 42
	; RVI-NEXT: Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i8.p0.i64(<2 x i8> undef, ptr %12, i64 undef, <2 x i1> undef, i32 undef)			; RVI-NEXT: Cost Model: Found an estimated cost of 12 for instruction: call void @llvm.experimental.vp.strided.store.v2i8.p0.i64(<2 x i8> undef, ptr %12, i64 undef, <2 x i1> undef, i32 undef)
	; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void			; RVI-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
	;			;
	%1 = getelementptr i8, ptr %base, i32 42			%1 = getelementptr i8, ptr %base, i32 42
	%x1 = load volatile <2 x i8>, ptr %1			%x1 = load volatile <2 x i8>, ptr %1

	%2 = getelementptr i8, ptr %base, i32 42			%2 = getelementptr i8, ptr %base, i32 42
	%x2 = call <2 x i8> @llvm.masked.load.v2i8.p0(ptr %2, i32 1, <2 x i1> undef, <2 x i8> undef)			%x2 = call <2 x i8> @llvm.masked.load.v2i8.p0(ptr %2, i32 1, <2 x i1> undef, <2 x i8> undef)
	▲ Show 20 Lines • Show All 115 Lines • Show Last 20 Lines
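The expectation changes in this test (GEP cost 0 becoming 1 when the user is a vector memory operation) follow from whether the constant offset can be folded into the memory instruction. A small standalone model of that decision is sketched here. It assumes a signed 12-bit immediate for scalar loads and stores and no immediate at all for vector ones, matching the `lw`/`vle32` example in the summary; `ToyAccess`, `toyCanFoldOffset` and `toyGEPCost` are hypothetical names, not the RISC-V backend's API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical classification of the type accessed through the GEP result.
enum class ToyAccess { Scalar, Vector };

// Assumption: scalar loads/stores take a signed 12-bit immediate offset,
// vector loads/stores only take a bare base register.
inline bool toyCanFoldOffset(ToyAccess A, int64_t Offset) {
  if (A == ToyAccess::Vector)
    return Offset == 0;
  return Offset >= -2048 && Offset <= 2047;
}

// A GEP whose offset folds into its user is free; otherwise it is modelled
// as one extra add instruction.
inline unsigned toyGEPCost(ToyAccess A, int64_t Offset) {
  return toyCanFoldOffset(A, Offset) ? 0 : 1;
}
```

For the offset 42 used throughout `non_foldable_vector_uses`, this model gives 0 when the access is scalar and 1 when it is a vector, in line with the updated RVI check lines.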

Revision Contents

Diff 535750

llvm/include/llvm/Analysis/TargetTransformInfo.h

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

llvm/include/llvm/CodeGen/BasicTTIImpl.h

llvm/lib/Analysis/TargetTransformInfo.cpp

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

llvm/lib/Target/X86/X86TargetTransformInfo.cpp

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp

llvm/test/Analysis/CostModel/ARM/mve-gather-scatter-cost.ll

llvm/test/Analysis/CostModel/RISCV/gep.ll
