This is an archive of the discontinued LLVM Phabricator instance.

Differential D107945

[RISCV] Use RISCV::RVVBitsPerBlock for RGK_ScalableVector in getRegisterBitWidth.
ClosedPublic

Authored by craig.topper on Aug 11 2021, 8:55 PM.

Details

Reviewers

frasercrmck
sdesmalen

Commits

rG8f6cea43e745: [RISCV] Use RISCV::RVVBitsPerBlock for RGK_ScalableVector in…

Summary

I might be wrong, but I think this should be the width of the known
minimum size we use for scalable vectors. It shouldn't scale with
the minimum VLEN.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

craig.topper created this revision. Aug 11 2021, 8:55 PM

Herald added subscribers: vkmr, evandro, luismarques and 24 others. Aug 11 2021, 8:55 PM

craig.topper requested review of this revision. Aug 11 2021, 8:55 PM

Herald added a project: Restricted Project. Aug 11 2021, 8:55 PM

Herald added a subscriber: MaskRay.

Harbormaster completed remote builds in B119187: Diff 365906. Aug 11 2021, 9:16 PM

Yeah I'm not sure, to be honest. It could be either meaning, going by a quick look around. Any ideas how we can know conclusively?

@sdesmalen can you help?

According to the description of getRegisterBitWidth, the function returns the width of the largest vector register type, which is probably where SVE and RVV are a bit different. For SVE, the maximum vector length is always a multiple of 128 bits and bounded by the maximum vscale, so we can always return an ElementCount of 'vscale x 128'. The LV uses this to determine a suitable VF based on the widest element type. For example, if the maximum element width is 64 bits, the maximum VF would be "vscale x 2", whereas if the maximum element width is 32 bits, the maximum VF would be "vscale x 4". RVV can choose different LMULs, so you may want to return a wider bitwidth by default to get a more suitable vectorization factor, or alternatively experiment with adding a new RGK_* enum value to request a smaller/wider bitwidth. The LoopVectorizer also has an option, "-vectorizer-maximize-bandwidth", which forces the LV to choose a higher bitwidth based on the smallest element type in the loop (instead of the biggest element type).
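The arithmetic in the comment above can be sketched in a few lines. This is only an illustration, not LoopVectorizer code: `minLanes` is a hypothetical helper, and the sketch assumes the VF's known-minimum lane count is simply the known-minimum register width divided by the chosen element width.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not an LLVM API: given the known-minimum width of a
// scalable register (the N in "vscale x N" bits) and an element width in
// bits, return the known-minimum lane count M of a "vscale x M" VF.
constexpr uint64_t minLanes(uint64_t KnownMinRegBits, uint64_t EltBits) {
  return KnownMinRegBits / EltBits;
}

// SVE-style register of "vscale x 128" bits, sized by the widest element type.
static_assert(minLanes(128, 64) == 2, "i64 elements give VF = vscale x 2");
static_assert(minLanes(128, 32) == 4, "i32 elements give VF = vscale x 4");

// Sizing by the smallest element type instead (the effect described for
// -vectorizer-maximize-bandwidth) yields a larger VF for a loop using i8.
static_assert(minLanes(128, 8) == 16, "i8 elements give VF = vscale x 16");
```

Under the same assumption, a known-minimum width of 64 gives 1, 2 and 4 lanes for i64, i32 and i16 elements.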

In D107945#2945521, @sdesmalen wrote:

According to the description of getRegisterBitWidth, the function returns the width of the largest vector register type, which is probably where SVE and RVV are a bit different. For SVE, the maximum vector length is always a multiple of 128 bits and bounded by the maximum vscale, so we can always return an ElementCount of 'vscale x 128'. The LV uses this to determine a suitable VF based on the widest element type. For example, if the maximum element width is 64 bits, the maximum VF would be "vscale x 2", whereas if the maximum element width is 32 bits, the maximum VF would be "vscale x 4". RVV can choose different LMULs, so you may want to return a wider bitwidth by default to get a more suitable vectorization factor, or alternatively experiment with adding a new RGK_* enum value to request a smaller/wider bitwidth. The LoopVectorizer also has an option, "-vectorizer-maximize-bandwidth", which forces the LV to choose a higher bitwidth based on the smallest element type in the loop (instead of the biggest element type).

Ignoring LMUL for now, I think what is in the code right now is wrong, so I'd like something that is at least functionally correct. If I just want the vectorizer to use at most LMUL=1, should I return the fixed size of 64 that is used by our LMUL=1 types, <vscale x 1 x i64>, <vscale x 2 x i32>, <vscale x 4 x i16>? This is what RISCV::RVVBitsPerBlock represents.

Further to the information received in the SVE call, this seems like the correct thing to do.

This revision is now accepted and ready to land. Aug 17 2021, 10:00 AM

This revision was landed with ongoing or failed builds. Aug 17 2021, 11:13 AM

Closed by commit rG8f6cea43e745: [RISCV] Use RISCV::RVVBitsPerBlock for RGK_ScalableVector in… (authored by craig.topper).

This revision was automatically updated to reflect the committed changes.

craig.topper added a commit: rG8f6cea43e745: [RISCV] Use RISCV::RVVBitsPerBlock for RGK_ScalableVector in….

May I ask why RISCV::RVVBitsPerBlock is set to 64? Is there any reference (such as an RFC) for this concept? Thanks.

Herald added a subscriber: achieveartificialintelligence. Sep 28 2021, 12:06 AM

In D107945#3026709, @luke957 wrote:

May I ask why RISCV::RVVBitsPerBlock is set to 64? Is there any reference (such as an RFC) for this concept? Thanks.

We map RVV types to scalable vector types in IR like <vscale x 1 x i64>, where vscale is a runtime value calculated as (VLEN/RVVBitsPerBlock).

So <vscale x 1 x i64> is ((VLEN/RVVBitsPerBlock) x 1 x 64) bits, which simplifies to VLEN bits. Any type that simplifies to VLEN bits is an LMUL=1 type. A type smaller than VLEN represents a fractional LMUL. A larger type would be LMUL=2, 4, or 8.
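The mapping described above can be checked with a small self-contained sketch. The helper names (`vscaleFor`, `typeBits`, `lmulTimes8`) are hypothetical and are not taken from the LLVM sources; only the value 64 for RVVBitsPerBlock and the formula vscale = VLEN/RVVBitsPerBlock come from this thread. LMUL is returned multiplied by 8 so that the fractional values 1/8, 1/4 and 1/2 stay integral.

```cpp
#include <cassert>
#include <cstdint>

// Value discussed in this review; the helpers below are illustrative only.
constexpr uint64_t RVVBitsPerBlock = 64;

// vscale is a runtime value: VLEN / RVVBitsPerBlock.
constexpr uint64_t vscaleFor(uint64_t VLEN) { return VLEN / RVVBitsPerBlock; }

// Size in bits of <vscale x MinElts x iEltBits> for a given VLEN.
constexpr uint64_t typeBits(uint64_t VLEN, uint64_t MinElts, uint64_t EltBits) {
  return vscaleFor(VLEN) * MinElts * EltBits;
}

// LMUL scaled by 8: 1 means LMUL=1/8, 8 means LMUL=1, 64 means LMUL=8.
constexpr uint64_t lmulTimes8(uint64_t MinElts, uint64_t EltBits) {
  return MinElts * EltBits * 8 / RVVBitsPerBlock;
}

// <vscale x 1 x i64> and <vscale x 2 x i32> both occupy exactly VLEN bits.
static_assert(typeBits(128, 1, 64) == 128, "LMUL=1 type is VLEN bits wide");
static_assert(typeBits(256, 2, 32) == 256, "LMUL=1 type is VLEN bits wide");
static_assert(lmulTimes8(1, 64) == 8, "<vscale x 1 x i64> is LMUL=1");
static_assert(lmulTimes8(2, 64) == 16, "<vscale x 2 x i64> is LMUL=2");
static_assert(lmulTimes8(1, 8) == 1, "<vscale x 1 x i8> is LMUL=1/8");
```

Note that VLEN cancels out of the LMUL computation, which is why `lmulTimes8` does not take it as a parameter.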

The value needs to be large enough that we can support a fractional LMUL of 1/8 for i8, which is required for ELEN=64. With RVVBitsPerBlock==64 we can use <vscale x 1 x i8>. RVVBitsPerBlock also needs to be divisible by ELEN.

RVVBitsPerBlock is the smallest VLEN we can support. I think we are going to need to select a value of 32 at compile time when targeting Zve32x or Zve32f. This will require all the intrinsic types to map to different LLVM IR types depending on which ELEN we are targeting.
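The constraint on the block size can be illustrated as follows. This is a sketch under two stated assumptions: the smallest IR type is <vscale x 1 x i8>, and the smallest fractional LMUL that must be supported for i8 is 8/ELEN (which matches the "1/8 for ELEN=64" figure given above). The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Denominator D of the smallest LMUL (1/D) expressible in IR for a given
// block size, assuming the smallest type is <vscale x 1 x i8> (8 bits/block).
constexpr uint64_t smallestLmulDenominator(uint64_t BitsPerBlock) {
  return BitsPerBlock / 8;
}

// Denominator of the smallest LMUL needed for i8, assuming LMUL >= 8/ELEN.
constexpr uint64_t requiredLmulDenominator(uint64_t ELEN) { return ELEN / 8; }

// A block size suffices if <vscale x 1 x i8> reaches the required fraction.
constexpr bool blockSizeSuffices(uint64_t BitsPerBlock, uint64_t ELEN) {
  return smallestLmulDenominator(BitsPerBlock) >= requiredLmulDenominator(ELEN);
}

static_assert(blockSizeSuffices(64, 64), "64-bit blocks reach LMUL=1/8");
static_assert(!blockSizeSuffices(32, 64), "32-bit blocks stop at LMUL=1/4");
static_assert(blockSizeSuffices(32, 32), "ELEN=32 only needs LMUL=1/4");
```

Under these assumptions a 32-bit block is enough for an ELEN=32 target such as Zve32x, but not for ELEN=64.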

Revision Contents

Path: llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
Size: 2 lines

Diff 366961

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h

   TypeSize getRegisterBitWidth(TargetTransformInfo::RegisterKind K) const {
     switch (K) {
     case TargetTransformInfo::RGK_Scalar:
       return TypeSize::getFixed(ST->getXLen());
     case TargetTransformInfo::RGK_FixedWidthVector:
       return TypeSize::getFixed(
           ST->hasStdExtV() ? ST->getMinRVVVectorSizeInBits() : 0);
     case TargetTransformInfo::RGK_ScalableVector:
       return TypeSize::getScalable(
-          ST->hasStdExtV() ? ST->getMinRVVVectorSizeInBits() : 0);
+          ST->hasStdExtV() ? RISCV::RVVBitsPerBlock : 0);
     }

     llvm_unreachable("Unsupported register kind");
   }

   InstructionCost getGatherScatterOpCost(unsigned Opcode, Type *DataTy,
                                          const Value *Ptr, bool VariableMask,
                                          Align Alignment,