This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Set getMinVectorRegisterBitWidth to 16 if enable fixed length vector code gen for RVV
ClosedPublic

Authored by kito-cheng on Jan 3 2022, 5:09 AM.

Details

Summary

getMinVectorRegisterBitWidth describes which vector types are supported by
this target. RISC-V actually supports all fixed-length vector types with a
vector length less than getMinRVVVectorSizeInBits, so set it to 16, which
means 2 x i8, the minimal fixed-length vector size in theory.

This also fixes an issue where some test cases might become non-vectorizable
when -riscv-v-vector-bits-min is set to a larger value, because the vector size is
smaller than -riscv-v-vector-bits-min.

For example, the following code can be vectorized by SLP with
-riscv-v-vector-bits-min=128 or -riscv-v-vector-bits-min=256, but
cannot be vectorized with -riscv-v-vector-bits-min=512 or larger:

void foo(double *da) {
  da[0] = 0;
  da[1] = 1;
  da[2] = 2;
  da[3] = 3;
}
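
To make the failure mode concrete, here is a minimal sketch of the kind of check involved. It is a simplified model, not LLVM's actual code: the function names minVF and canVectorizeStores are hypothetical, and the assumption is that the vectorizer derives its smallest allowed vectorization factor by dividing the minimum vector register width by the element width, with a floor of 2.

```cpp
#include <algorithm>
#include <cassert>

// Simplified model (an assumption, not LLVM's real implementation) of how a
// store-vectorizing pass could derive the smallest vectorization factor from
// the value returned by getMinVectorRegisterBitWidth.
unsigned minVF(unsigned MinVecRegBits, unsigned EltBits) {
  // A vector needs at least two elements, even when the division yields 0.
  return std::max(2u, MinVecRegBits / EltBits);
}

// Hypothetical helper: a group of NumStores consecutive stores is a
// candidate only if it contains at least minVF elements.
bool canVectorizeStores(unsigned NumStores, unsigned EltBits,
                        unsigned MinVecRegBits) {
  return NumStores >= minVF(MinVecRegBits, EltBits);
}
```

Under this model, the four double stores in foo form a 256-bit group. With a minimum width of 128 or 256 the group is large enough, with 512 it would need eight elements and is rejected, and with 16 the floor of two elements applies, so the group is accepted regardless of -riscv-v-vector-bits-min.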

Event Timeline

kito-cheng created this revision. Jan 3 2022, 5:09 AM
kito-cheng requested review of this revision. Jan 3 2022, 5:09 AM
Herald added a project: Restricted Project. Jan 3 2022, 5:09 AM
kito-cheng edited the summary of this revision.

I agree it shouldn't be based on -riscv-v-vector-bits-min. But 16 feels maybe too low. What do other targets use?

I think only VE and AArch64 are meaningful references for RISC-V, since we are the only three targets with scalable vector support, so I only looked at those two targets:

VE: No VLS code gen support; getMinVectorRegisterBitWidth always returns 0.
AArch64: Returns 64 by default, and is set to 128 for many cores with this comment: // FIXME: remove this to enable 64-bit SLP if performance looks good. [1]

My thought: this hook describes the capability of the target, so I would prefer to describe what we can really support, which is 2 x i8. I know there is concern about whether that is beneficial, but I think that belongs in the cost model, and we could describe it there in follow-up patches. For example, 2 x i8 and (ST->getMinRVVVectorSizeInBits() / 8) x i8 could have the same cost, so SLP and loop vectorization would use the larger type if possible.

[1] https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AArch64/AArch64Subtarget.cpp#L138
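
As a rough illustration of the cost-model idea above, the sketch below charges a fixed-length vector by the number of minimum-size vector registers it would occupy. This is a hypothetical model for discussion only: fixedVectorCost is an invented name, and the follow-up patches may model cost quite differently.

```cpp
#include <cassert>

// Hypothetical cost function (not part of LLVM): the cost of a fixed-length
// vector is the number of MinVLenBits-wide registers needed to hold it,
// assuming the hardware vector length is at least MinVLenBits.
unsigned fixedVectorCost(unsigned NumElts, unsigned EltBits,
                         unsigned MinVLenBits) {
  unsigned Bits = NumElts * EltBits;
  // Round up: any type that fits in one register costs the same.
  return (Bits + MinVLenBits - 1) / MinVLenBits;
}
```

With a minimum vector length of 128 bits, 2 x i8 and 16 x i8 both cost 1 under this model, while 32 x i8 costs 2, so a vectorizer comparing per-element cost would tend to pick the widest type that still fits in a single register.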

This revision is now accepted and ready to land. Jan 7 2022, 5:57 PM
This revision was landed with ongoing or failed builds. Jan 7 2022, 7:16 PM
This revision was automatically updated to reflect the committed changes.