This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Initial support of LoopVectorizer for RISC-V Vector.
ClosedPublic

Authored by HsiangKai on Jan 28 2021, 7:08 PM.

Details

Summary

Define an option -riscv-vector-bits-max to specify the maximum vector
bits for vectorizer. Loop vectorizer will use the value to check if it
is safe to use the whole vector registers to vectorize the loop.

It is not the optimum solution for loop vectorizing for scalable vector.
It assumed the whole vector registers will be used to vectorize the code.
If it is possible, we should configure vl to do vectorize instead of
using whole vector registers.

Diff Detail

Event Timeline

HsiangKai created this revision.Jan 28 2021, 7:08 PM
HsiangKai requested review of this revision.Jan 28 2021, 7:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 28 2021, 7:08 PM
Herald added a subscriber: MaskRay. · View Herald Transcript
craig.topper added inline comments.Jan 28 2021, 11:17 PM
llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
330 ↗(On Diff #320034)

I'm not sure this should be in this file. This file belongs to the MC layer, but this isn't an MC layer property or a property of the V extension. It's a property of our CodeGen implementation.

I'm not sure where a better place to put it is.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
4718

You can probably just the element VTs and then check that Align == ElemVT.getStoreSize() rather than spelling out all of the alignments.

What is considered misaligned for scalable vectors? Should we be checking the alignement is >= the element size?

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
55

Is this used with scalable vectors?

AArch64 seems to base their return here for SVE on a command line, but I expected them to require specific scalable vector types in IR for the backend to work.

llvm/test/Transforms/LoopVectorize/RISCV/scalable-vf-hint.ll
2

Is this directory new? If so it needs a lit.local.cfg to mark that all tests in it require the RISCV target to be compiled

Hi Kai, are we OK with having a test that goes from IR to assembly in the Transforms component?

I'd expect here a vectorized IR test. Then we can add tests those inputs Codegen so they generate sensible RVV instructions.

Hi Kai, are we OK with having a test that goes from IR to assembly in the Transforms component?

I'd expect here a vectorized IR test. Then we can add tests those inputs Codegen so they generate sensible RVV instructions.

I'd prefer this too.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
4718

Yeah, I agree. For RVV we just need to check that the vectors are at least aligned to the element size, don't we? I reckon Align >= EltVT.getStoreSize() is sufficient.

vkmr added a subscriber: vkmr.Jan 29 2021, 4:28 AM
vkmr added inline comments.Jan 29 2021, 5:15 AM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
20

Minor nit: Reword the description for more clarity - may be something like "Maximum vector register size in bits"?

100

Nit: Use call to getRegisterBitWidth() here instead of RISCVVType::RVVBitsPerBlock. (Or implement and use getMinVectorRegisterBitWidth())

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
52–55

If I understand correctly, the assumption behind this code is that a single vector register is of size vscale x RVVBitsPerBlock and ignore the idea (for now?) of having register groups, i.e LMUL>1.
Unless we are ignoring register grouping for now, from Loop Vectorizer's perspective it would make sense to view the register group size as the real register size, specially for computing a feasible VF based on register usage.
Since the documentation of getRegisterBitWidth() defines it to be "The width of the largest scalar or vector register type", it might be more accurate to use getMinVectorRegisterBitWidth() to return RISCVVType::RVVBitsPerBlock and getRegisterBitWidth() to return getMinVectorRegisterBitWidth() * MAX_LMUL. (I am not considering fractional LMUL here.)

craig.topper added inline comments.Jan 29 2021, 10:14 AM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp
100

getRegisterWidth() is likely going to be updated to be similar to AArch64 and be controlled by a command line option for minimum width. So it won't be the right thing.

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
52–55

Returning a non-zero value seems to at least partially enable the vectorizer to generate fixed vectors which isn't supported by the backend yet. It looks like something else stopped it in my testing, but it at least queried the cost model. Not sure what stopped it.

I do plan to support fixed vectors in the RVV backend, but it will probably be a couple weeks away. The register width here will probably need to be a command line controlled value like AArch64. And it should be at least 128 bits per the 0.10 spec. So I don't think its connected to RVVBitsPerBlock.

vkmr added inline comments.Jan 29 2021, 10:45 AM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
52–55

Perhaps I misunderstood something, my concern here is more about how to encapsulate the idea of register grouping for scalable vectors in the TTI methods to query register widths. Having a command line option to control register width would still only reflect the width of a single register, right? Perhaps, we can add another command line option to specify a max group multiplier (essentially the Maximum LMUL value).

IIRC, the TTI method getMinVectorRegisterBitWidth() in addition to getRegisterBitWidth() was introduced to handle similar concerns with NEON. With scalable vectors, things are a little more complicated.

craig.topper added inline comments.Jan 29 2021, 10:58 AM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
52–55

I don't think I understand how this interface works for scalable vector vectorization. AArch64 has it connected to a command line which means it can be larger than 128 bits. But I thought the backend needed specific types like <vscale x 4 x i32>. Does this interface effect the fixed portion of the scalable type for scalable vector vectorization?

HsiangKai updated this revision to Diff 320261.Jan 29 2021, 7:29 PM
HsiangKai added inline comments.
llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
330 ↗(On Diff #320034)

I will move it to RISCVISelLowering.h.

llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
52–55

I think I didn't dig into how the callback is used. I remove it in this patch. We could add it back after we have clear idea how to do it.

HsiangKai added inline comments.Jan 30 2021, 2:47 PM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
52–55

Craig is right. getRegisterBitWidth() is not related to scalable vector vectorization. It is reasonable to remove it in this patch.

vkmr added inline comments.Feb 1 2021, 5:08 AM
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h
52–55

Does this interface effect the fixed portion of the scalable type for scalable vector vectorization?

Yes, the auto vectorizer uses this interface to compute the VF (for scalable vectors this is the fixed part of the VF) with the most optimal cost.

Craig is right. getRegisterBitWidth() is not related to scalable vector vectorization. It is reasonable to remove it in this patch.

Agreed.

This revision is now accepted and ready to land.Feb 4 2021, 5:04 PM
frasercrmck added inline comments.Feb 5 2021, 1:52 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
4706
/// .... If true, it also returns
/// whether the unaligned memory access is "fast" in the last argument by
/// reference.

I think this suggests that setting Fast is part of the contract. It could theoretically read an uninitialized variable if we don't set it but return true.

HsiangKai updated this revision to Diff 321911.Feb 5 2021, 6:38 PM

Address @frasercrmck's comments.

HsiangKai marked an inline comment as done.Feb 5 2021, 6:39 PM
This revision was landed with ongoing or failed builds.Feb 8 2021, 2:51 PM
This revision was automatically updated to reflect the committed changes.