In foldSelectIntoOp we sometimes transform a select of a fadd into a
fadd of a select, where we select between data and an identity value.
For both fadd and fsub the identity is always -0.0, but if the nsz
flag is set on the select instruction we can use +0.0 instead. Doing
so then triggers other optimisations, such as when folding the select
of masked load into a new masked load.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/IR/Constants.cpp | ||
---|---|---|
2851–2852 | Remove the TODO comment. | |
2852–2853 | This API is awkward. Should we change the constant's method to take a positive/negative param (to match the APFloat call within this call)? ConstantFP::getZero(Type, bool Negative = false) | |
llvm/test/Transforms/InstCombine/select-binop-foldable-floating-point.ll | ||
66 | We can get a bit more coverage from this test by adding some extra FMF (nnan or ninf for example). That way, we'll verify that all flags are propagated as expected to the new select instruction. This could also swap the select's true/false operands, so we're testing each of the changed code paths. Add some FMF on the fadd too? | |
169 | This pair of tests already transformed, but we dropped the FMF right? This will be easier to see if we pre-commit the tests with the baseline CHECK lines. |
The Arm backend has patterns to turn the awkward fadd(select(.., -0.0)) back into a more natural predicated fadd. They will need to be made to work with both the forms, if this is changing.
That's a good point, although it's a shame that there isn't a single test in CodeGen/ARM that defends the DAG combine in PerformFAddVSelectCombine. So I'm not entirely sure what DAG combine I need for the IR cases that you're worried about. :(
Yeah - it's handled via tablegen patterns at the moment, from these:
https://github.com/llvm/llvm-project/blob/4fed5f174fa57c73907bb3344018a5f9c2bc2e68/llvm/lib/Target/ARM/ARMInstrMVE.td#L3774
And comes up from any float reduction I think, due to the way we specify the predicated reduction op in the loop:
https://godbolt.org/z/GjhhbKx9b
- Added more pattern matches for MVE targets so we detect splats of +0.0 as well as -0.0.
- Added new ConstantFP::getZero interface.
- Added more flags to the tests for more coverage.
The backend code change and tests should be a separate patch/review that lands before the InstCombine change, so we don't regress those patterns.
It would be good to refactor foldSelectIntoOp() with a lambda as a preliminary cleanup, so we don't have to duplicate the code.
But this is an improvement even without that, so LGTM.
I think cleaning this up makes sense. I'll merge this patch first, then create a follow-on NFC to refactor it.
Remove the TODO comment.