So long as the operation is reassociative, we can reassociate the double vecreduce from for example fadd(vecreduce(a), vecreduce(b)) to vecreduce(fadd(a,b)). This will in general save a few instructions, but some architectures (MVE) require the opposite fold, so a shouldExpandReduction is added to account for it. Only targets that use shouldExpandReduction will be affected.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
We probably want this for all associate reductions - fmul, min/max, integer ops, etc. Generalize into a helper function that takes an opcode and maps it to the corresponding reduction opcode? Subsequent patches then just need to add set of tests and another case into a switch or something like that.
Should we be doing this at IR level as well? https://simd.godbolt.org/z/aTE8qjMET (to be clear - I'm not expecting this patch to address it)
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
15695 |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
15695 | Never mind, I misunderstood the transformation. |
The case that was reported to me was a <16 x float> reduction and a <4 x float> reduction added together, which needs to be handled after legalization. Similar to fadd_reduct_reassoc_v4v8f32 but with a 16x and more going on.
I'm not sure that certain architectures would like the combine in for all reductions. It can depend on whether the reduction+add is cheap. MVE might prefer multiple reductions for integer adds for example, which can require quite precise cost modelling, unfortunately.
Sounds good. I can take a look, and see if it needs a target-hook to prevent the transform for any backends.
I've update this to now handle all the reduction types, and added tests for various architectures.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
1325 | Flags param is not used? SelectionDAG::FlagInserter FlagsInserter(DAG, Flags); |
Thanks - Now using a FlagInserter to copy the flags. The flags for integer ops are not passed through, as nsw/nuw/etc may not apply to the new operations.
LGTM
Ah, right - I was only thinking of FMF flags in this context. That's worth a header comment on reassociateReduction. It looks like we drop all flags in reassociateOps, but that's not called from FP opcodes.
We could also make it less likely to go wrong by using a default flags param and/or naming that param "FMFFlags" or something like that.
For example, we have functions like this:
SDValue getMemBasePlusOffset(SDValue Base, SDValue Offset, const SDLoc &DL, const SDNodeFlags Flags = SDNodeFlags());
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
5521 | Should here be "min/max/...(vecreduce..."? |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
5521 | Yep! Thanks. Updated in 14914fb1573f0393979492238735beecde65b3bb. |
Flags param is not used?
They could propagate with: