SLP vectorizer supports horizontal reductions for Add/FAdd binary operations. Patch adds support for horizontal min/max reductions.
Function getReductionCost() is split to getArithmeticReductionCost() for binary operation reductions and getMinMaxReductionCost() for min/max reductions.
Patch fixes PR26956.
Details
- Reviewers
spatel mzolotukhin mkuper hfinkel RKSimon chandlerc - Commits
- rGccce7afee8ad: [SLP] Support for horizontal min/max reduction.
rG6dd29fccb881: [SLP] Support for horizontal min/max reduction.
rL314101: [SLP] Support for horizontal min/max reduction.
rL312791: [SLP] Support for horizontal min/max reduction.
Diff Detail
Event Timeline
A few comments, but someone with more vectorizer knowledge needs to review as well.
include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
1009 | Missing assert message | |
1012 | Move this comment out and just above the arithmetic/minmax functions? | |
1020 | where only the first n/2 elements are meaningful, | |
1023 | , not n, | |
lib/Target/X86/X86TargetTransformInfo.cpp | ||
1659 | Unnecessary? |
lib/Target/X86/X86TargetTransformInfo.cpp | ||
---|---|---|
1659 | Yes, missed it, thanks. |
This is another example of what I was talking about re smaller patches.
You could do things like name changes (Reduction -> ArithmeticReduction) or moving comments around (from inside the body of getReductionCost to above the declaration) in separate NFC patches. In a lot of cases they're simple enough not to require pre-commit review, and it makes reviewing actual meaningful patches much much easier, because the diff only contains relevant things. You could also start with a patch that only supports one kind of min/max reduction, and add the rest of them as a follow-up.
I'm sorry if it seems like I'm being petty. In fact I'm really interested in getting all of this stuff in. But the SLP vectorizer is not the simplest or the clearest piece of code to begin with, and I'm not deeply familiar with its nuances, so it's really hard for me to meaningfully review SLP patches if they also contain noise, and if they do more than one thing. If there's someone else who's capable of reviewing and LGTMing these patches as is, I don't oppose it. But I can't.
Michael, no problems at all. Will do my best to split this and other patches into several smaller pieces, but I can't guarantee that it will be possible to do in all cases.
include/llvm/CodeGen/BasicTTIImpl.h | ||
---|---|---|
1008 | ScalarTy->isFloatingPointTy() | |
lib/Analysis/CostModel.cpp | ||
171 | Do you need the this == &RD? Won't it always match on (Kind == RD.Kind && Opcode == RD.Opcode)? | |
172 | They're not public, but maybe keep to style guide (also, maybe drop the class?) enum ReductionKind { RK_None, /// Not a reduction. RK_Arithmetic, /// Binary reduction data. RK_MinMax, /// Min/max reduction data. }; | |
lib/Target/X86/X86TargetTransformInfo.cpp | ||
1778 | One cost entry per line | |
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
4138 | Same as above: enum ReductionKind { RK_Not, /// Not a reduction. RK_Arithmetic, /// Binary reduction data. RK_Min, /// Minimum reduction data. RK_UMin, /// Unsigned minimum reduction data. RK_Max, /// Maximum reduction data. RK_UMax, /// Unsigned maximum reduction data. }; |
I think its almost there now, my only concern is that we're not discriminating between signed/unsigned in getMinMaxReductionCost calls - e.g. SSE is very inconsistent with support for these.
LGTM with one minor
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
4133 | Call this RK_None to match the other version? |
Is there any chance that you can simplify the PR34635.ll test case that you committed? There's a lot of metadata/unnecessary code in there which is likely to make the testcase very brittle.
llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
5065 ↗ | (On Diff #116544) | Hi, Why are we filling the NonNan flags for the "OperationData" object the value from the condition of the select instruction instead of the select Instruction itself? Wheb ew get to code gen we check that the value itself is not a Nan, am I missing something? |
llvm/trunk/lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
5065 ↗ | (On Diff #116544) |
|
ScalarTy->isFloatingPointTy()