This is an archive of the discontinued LLVM Phabricator instance.

Differential D121677

[RISCV] Return Invalid cost in getGatherScatterOpCost instead of crashing for scalable vectors
Abandoned · Public

Authored by liaolucy on Mar 15 2022, 2:13 AM.

Details

Reviewers

craig.topper
kito-cheng
frasercrmck
Jim
sdesmalen

Summary

getCommonMaskedMemoryOpCost tries to cast<FixedVectorType>
on a scalable vector type. Return invalid instead.
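
The guard-before-cast idea in this summary can be sketched outside of LLVM. The block below is a minimal model, not the patch itself: `VecType`, `Cost`, and `gatherScatterCost` are hypothetical stand-ins, and `std::optional` plays the role of an invalid `InstructionCost`.

```cpp
#include <cassert>
#include <optional>

// Simplified stand-ins for illustration only; these are NOT LLVM's classes.
struct VecType {
  unsigned MinNumElements; // element count (fixed) or minimum count (scalable)
  bool Scalable;           // true models a type such as <vscale x 4 x i32>
};

// Models a cost that may be "invalid", i.e. no meaningful cost is available.
using Cost = std::optional<unsigned>;

// Hypothetical cost function with the same shape as the patched code:
// return an invalid cost before anything assumes a fixed element count.
Cost gatherScatterCost(const VecType &Ty, unsigned ScalarMemOpCost) {
  if (Ty.Scalable)
    return std::nullopt; // analogous to InstructionCost::getInvalid()
  // Fixed-length path: one scalar memory operation per element.
  return Ty.MinNumElements * ScalarMemOpCost;
}
```

In this model a caller has to check the result with `has_value()` before using it, which is roughly how an invalid cost tells a client such as the vectorizer to reject the candidate rather than crash.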

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

liaolucy created this revision. Mar 15 2022, 2:13 AM

Herald added a project: Restricted Project. Mar 15 2022, 2:13 AM

Herald added subscribers: VincentWu, luke957, achieveartificialintelligence and 25 others.

liaolucy requested review of this revision. Mar 15 2022, 2:13 AM

Herald added a project: Restricted Project. Mar 15 2022, 2:13 AM

Herald added subscribers: llvm-commits, alextsao1999, pcwang-thead and 2 others.

I tried to support scalable vectors in getGatherScatterOpCost, but I don't know how to do so.

In https://reviews.llvm.org/D115143, SVE adds an overhead cost for gathers and scatters, which is a rough estimate based on performance investigations.

Should RVV add similar values?

https://reviews.llvm.org/D119529 may be trying to fix the same issue.
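
To make the question about an overhead value concrete, here is one possible shape for a scalable-vector gather/scatter estimate. It is only a sketch under stated assumptions: the helper name, the worst-case element count derived from a maximum `vscale`, and the idea of adding the overhead per element are all hypothetical, and any numbers passed to it are placeholders rather than measured RVV values.

```cpp
#include <cassert>

// Hypothetical helper, not an LLVM API. Estimates the cost of a gather or
// scatter on a scalable vector by assuming the largest possible element
// count and charging a fixed overhead on top of each scalar memory access.
unsigned scalableGatherScatterCost(unsigned MinNumElements, unsigned MaxVScale,
                                   unsigned ScalarMemOpCost,
                                   unsigned Overhead) {
  // Worst case: the hardware vector is MaxVScale times the minimum size.
  unsigned MaxNumElements = MinNumElements * MaxVScale;
  return MaxNumElements * (ScalarMemOpCost + Overhead);
}
```

For example, with a minimum of 2 elements, an assumed maximum `vscale` of 4, a scalar access cost of 1, and an overhead of 10, this estimate is 8 * 11 = 88. A deliberately high value like this would bias a cost-based client away from gathers and scatters unless the rest of the loop pays for them.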

Harbormaster completed remote builds in B154276: Diff 415362. Mar 15 2022, 3:02 AM

I actually hit this same issue the day you posted this, which is fun. But I fear this is quite a lot of work to get watertight. I've left some comments showing where we'd still crash. I've also seen us crash on getIntrinsicInstrCost when given, e.g., llvm.cttz.nxv1i8.

Looking at D119529, I'm not sure I see the value in accepting that the BaseTTI version is just allowed to crash on scalable vectors. It's a very common idiom for implementations to fall back to it. It also just means extra work for new targets supporting scalable vectorization: now they need a fully-fledged TTI implementation to avoid crashing? Seems questionable to me.

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
Line 198: We can still crash here: I've seen this in `CodeMetrics::analyzeBasicBlock` because it calls with `TCK_CodeSize`.
Line 201: We'd still crash here?

Herald added subscribers: s, arichardson. Mar 16 2022, 8:21 AM

In D121677#3386240, @frasercrmck wrote:

I actually hit this same issue the day you posted this, which is fun. But I fear this is quite a lot of work to get watertight. I've left some comments showing where we'd still crash. I've also seen us crash on getIntrinsicInstrCost when given, e.g., llvm.cttz.nxv1i8.

Can adding a cost model for llvm.cttz.nxv1i8 solve this problem? I would like to try.

Herald added subscribers: sunshaoce, StephenFan. Mar 30 2022, 11:26 PM

In D121677#3418361, @liaolucy wrote:

Can adding a cost model for llvm.cttz.nxv1i8 solve this problem? I would like to try.

In my experience, the minimum we need to handle is:

1. Any scalable-vector intrinsics in getTypeBasedIntrinsicInstrCost that crash in the base version. This would fix cttz.
2. Scalable-vector fshl, fshr, experimental_stepvector, experimental_vector_insert and experimental_vector_extract in getIntrinsicInstrCost.

In #2, these intrinsics all call BaseT::getIntrinsicInstrCost on scalable-vectors, so we need to catch them ahead of time. The other intrinsics either return some kind of cost or fall through to getTypeBasedIntrinsicInstrCost which we'll handle in #1.

This is what I mean about it looking like a lot of work. I've done the bare minimum in our downstream to fix crashes but the costs I used are wildly inaccurate: I just don't expect us to crash on IR containing intrinsics.
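
The "catch them ahead of time" step for the second group of intrinsics can be sketched as follows. This is a simplified model, not LLVM code: the `Intrinsic` enum, `needsEarlyHandling`, and `targetIntrinsicCost` are hypothetical names, and returning an empty `std::optional` is just one placeholder choice where a real target might return a modeled cost instead.

```cpp
#include <cassert>
#include <optional>

// Simplified stand-ins for illustration; not LLVM's real enums or classes.
enum class Intrinsic {
  Fshl,
  Fshr,
  StepVector,
  VectorInsert,
  VectorExtract,
  Other
};

// Models a cost that may be "invalid".
using Cost = std::optional<unsigned>;

// Intrinsics that, in this model, would reach the base implementation on
// scalable vectors and therefore have to be intercepted first.
bool needsEarlyHandling(Intrinsic ID) {
  switch (ID) {
  case Intrinsic::Fshl:
  case Intrinsic::Fshr:
  case Intrinsic::StepVector:
  case Intrinsic::VectorInsert:
  case Intrinsic::VectorExtract:
    return true;
  default:
    return false;
  }
}

// Hypothetical target hook: handle the risky cases before deferring.
Cost targetIntrinsicCost(Intrinsic ID, bool Scalable) {
  if (Scalable && needsEarlyHandling(ID))
    return std::nullopt; // placeholder; avoids reaching the base version
  return 1;              // stand-in for deferring to the base implementation
}
```

The sketch only covers the explicitly listed intrinsics; the type-based path from the first item would need its own guard on scalable types.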

liaolucy mentioned this in D119529: [BasicTTI] Set scalarization cost of getCommonMaskedMemoryOpCost to Invalid. Mar 31 2022, 6:57 PM

After a series of @reames's patches, no crash, thanks.

Herald added a subscriber: shiva0217. Jun 16 2022, 7:13 PM

Revision Contents

Path	Size
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp	3 lines
llvm/test/Transforms/LoopVectorize/RISCV/masked_gather_scatter.ll	4 lines

Diff 415362

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp

@@ ... @@ RISCVTTIImpl::getMaskedMemoryOpCost(unsigned Opcode, Type *Src, Align Alignment,

   return getMemoryOpCost(Opcode, Src, Alignment, AddressSpace, CostKind);
 }

 InstructionCost RISCVTTIImpl::getGatherScatterOpCost(
     unsigned Opcode, Type *DataTy, const Value *Ptr, bool VariableMask,
     Align Alignment, TTI::TargetCostKind CostKind, const Instruction *I) {
   if (CostKind != TTI::TCK_RecipThroughput)
     return BaseT::getGatherScatterOpCost(Opcode, DataTy, Ptr, VariableMask,
       [Inline comment, frasercrmck: We can still crash here: I've seen this in `CodeMetrics::analyzeBasicBlock` because it calls with `TCK_CodeSize`.]
                                          Alignment, CostKind, I);

   if ((Opcode == Instruction::Load &&
       [Inline comment, frasercrmck: We'd still crash here?]
        !isLegalMaskedGather(DataTy, Align(Alignment))) ||
       (Opcode == Instruction::Store &&
        !isLegalMaskedScatter(DataTy, Align(Alignment))))
     return BaseT::getGatherScatterOpCost(Opcode, DataTy, Ptr, VariableMask,
                                          Alignment, CostKind, I);

   // FIXME: Only supporting fixed vectors for now.
   if (!isa<FixedVectorType>(DataTy))
-    return BaseT::getGatherScatterOpCost(Opcode, DataTy, Ptr, VariableMask,
-                                         Alignment, CostKind, I);
+    return InstructionCost::getInvalid();

   auto *VTy = cast<FixedVectorType>(DataTy);
   unsigned NumLoads = VTy->getNumElements();
   InstructionCost MemOpCost =
       getMemoryOpCost(Opcode, VTy->getElementType(), Alignment, 0, CostKind, I);
   return NumLoads * MemOpCost;
 }

...

llvm/test/Transforms/LoopVectorize/RISCV/masked_gather_scatter.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
 ; RUN: opt < %s -loop-vectorize -mtriple=riscv32 -mattr=+v,+d -riscv-v-vector-bits-min=256 -S | FileCheck %s -check-prefixes=RV32
 ; RUN: opt < %s -loop-vectorize -mtriple=riscv64 -mattr=+v,+d -riscv-v-vector-bits-min=256 -S | FileCheck %s -check-prefixes=RV64

+; Check that we don't crash when scalable-vectorization=on.
+; RUN: opt < %s -loop-vectorize -mtriple=riscv32 -mattr=+v,+d -scalable-vectorization=on -S
+; RUN: opt < %s -loop-vectorize -mtriple=riscv64 -mattr=+v,+d -scalable-vectorization=on -S
+
 ; The source code:
 ;
 ;void foo4(double *A, double *B, int *trigger) {
 ;
 ;  for (int i=0; i<10000; i += 16) {
 ;    if (trigger[i] < 100) {
 ;          A[i] = B[i*2] + trigger[i]; << non-consecutive access
 ;    }
...