Check if it is legal to vectorize reduction.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
133 | This returns true when the VF is not scalable. Does that mean any fixed-length reduction is supported? |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
132 | Can we just return true when VF is not scalable to support fixed-length vector reductions? | |
133 | As I understand it, returning true for non-scalable VFs is just to make canVectorizeReductions() in LoopVectorizer.cpp return the right value. There will be other checks after canVectorizeReductions() returns. |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
151 | The added test case contains an fmin case. With opt -loop-vectorize, there seems to be no crash. |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
151 | The crash would be in llc in SelectionDAG, not opt. |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
133 | Let me rephrase a little. You're returning true for fixed vectors, but not checking element type or opcode or hasStdExtV. Does that mean the vectorizer will start generating reductions for vectors of i128? Or reductions of floats when the F extension isn't enabled? |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
133 | I believe the "legal" in this function really means "are you going to crash if there is a reduction of this type". Normal non-scalable reductions can always be legalized to something, even if that means expanding, scalarizing, or converting to soft float. The standard cost modelling then kicks in to say whether it is actually a good idea, i.e. it works the same as X86/Arm/etc. (But, because they are outside the loop, the vectorizer doesn't account for them directly, only the arithmetic instructions that will be in the loop. The assumption is that what is in the loop will dominate performance.) |
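
To make the policy under discussion concrete, here is a minimal standalone sketch, not the code from the patch. `ReductionKind`, `VectorFactor`, and `isReductionLegalSketch` are hypothetical stand-ins for the LLVM types and the TTI hook, and the set of kinds accepted for scalable VFs is an assumption chosen for illustration.

```cpp
// Hypothetical stand-ins for the LLVM types; these are NOT the real classes.
enum class ReductionKind { Add, Mul, And, Or, Xor, SMin, SMax, FAdd, FMul, FMin, FMax };

struct VectorFactor {
  unsigned MinElements; // known minimum number of lanes
  bool Scalable;        // true for a scalable VF
};

// Sketch of the "legal means it will not crash" policy from the thread.
// Fixed-length reductions are always accepted because they can be expanded
// or scalarized later; scalable ones need a direct lowering.
bool isReductionLegalSketch(ReductionKind Kind, VectorFactor VF) {
  if (!VF.Scalable)
    return true; // legalization can always fall back to an expansion

  switch (Kind) {
  // Assumed allow-list for illustration only.
  case ReductionKind::Add:
  case ReductionKind::And:
  case ReductionKind::Or:
  case ReductionKind::Xor:
  case ReductionKind::SMin:
  case ReductionKind::SMax:
  case ReductionKind::FAdd:
  case ReductionKind::FMin:
  case ReductionKind::FMax:
    return true;
  default:
    return false; // e.g. Mul/FMul: assumed to have no scalable lowering
  }
}
```

Under this sketch a fixed-length multiply reduction is reported as legal even though it would be expanded, while the scalable form of the same reduction is rejected. Whether the result is profitable is left to the cost model.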
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
135 | I'm wondering if returning true for fixed-length vectors, even if correct (i.e. not crashing), is likely to produce worse code. Will it trick the cost modeling into producing code which we'll then expand during legalization, and increase register pressure and the likelihood of spilling? |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
130 | There should be a blank line above this. | |
135 | For most reductions the expansion should be log2(elements) SingleSrcPermute shuffles and binops to reduce elements by half each step. So it shouldn't increase the register pressure much since you just need 2 registers. This should match the default cost model for getArithmeticReductionCost/getMinMaxReductionCost. That isn't what happens for ordered floating point reduction though, but it also doesn't look like this function is told that it is an ordered reduction. It doesn't look like getArithmeticReductionCost gets told either. Maybe we only vectorize to unordered? |
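
The halving expansion described above can be modelled in scalar code. This is only a sketch: `treeReduceAdd` is a hypothetical helper, it assumes a power-of-two element count, and each outer iteration stands in for one shuffle plus one binop.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of the shuffle+binop expansion of an add reduction.
// Assumes v.size() is a power of two (or zero). `steps` receives the number
// of halving rounds, which is log2(elements) for a power-of-two input.
int treeReduceAdd(std::vector<int> v, unsigned &steps) {
  steps = 0;
  for (std::size_t half = v.size() / 2; half >= 1; half /= 2) {
    // "Shuffle" the upper half down and add it to the lower half.
    for (std::size_t i = 0; i < half; ++i)
      v[i] += v[i + half];
    ++steps;
  }
  return v.empty() ? 0 : v[0];
}
```

For 8 elements this takes 3 rounds, and each round only needs the current vector and its shuffled copy, which is the "2 registers" argument.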
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
135 | I think the ordered property is in RecurrenceDescriptor after https://reviews.llvm.org/D98435. But getArithmeticReductionCost/getMinMaxReductionCost don't receive the RecurrenceDescriptor. |
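
As a reminder of why the ordered property matters for floating point, here is a small sketch comparing a strict in-order sum with a halving (unordered) sum. `orderedSum` and `treeSum` are hypothetical helpers, `treeSum` assumes a power-of-two length, and the comparison assumes IEEE double arithmetic without fast-math reassociation.

```cpp
#include <cstddef>
#include <vector>

// Strict left-to-right accumulation, as an in-order reduction would do.
double orderedSum(const std::vector<double> &v) {
  double acc = 0.0;
  for (double x : v)
    acc += x;
  return acc;
}

// Halving reduction, as an unordered (reassociated) reduction might do.
// Assumes v.size() is a power of two (or zero).
double treeSum(std::vector<double> v) {
  for (std::size_t half = v.size() / 2; half >= 1; half /= 2)
    for (std::size_t i = 0; i < half; ++i)
      v[i] += v[i + half];
  return v.empty() ? 0.0 : v[0];
}
```

With the input {1e17, 1.0, -1e17, 1.0}, the in-order sum loses the first 1.0 when it is added to 1e17 and ends at 1.0, while the halving sum cancels the large terms first and ends at 2.0. The two strategies are therefore not interchangeable without permission to reassociate.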
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
135 | Yeah fair enough, that sounds fine then. Part of me also finds it weird that we'd say it's legal to generate a fixed-length fmin/fmax/mul reduction even though we'd expand it 100% of the time. But perhaps it's just that this API's job isn't perfectly clear. |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
135 | The ordered property was added to RecurrenceDescriptor in https://reviews.llvm.org/D98435, while in-order reduction support is off by default and controlled with the -enable-strict-reductions flag. Would ordered floating point reductions be fixed up in InnerLoopVectorizer::fixVectorizedLoop() if the -enable-strict-reductions flag is used? | |
135 | I agree the API's name is weird. |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.h | ||
---|---|---|
154 | Updated. Thanks for your work. |