This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] Use +0.0 instead of -0.0 as the FP identity for some folds
ClosedPublic

Authored by david-arm on Jun 1 2022, 5:39 AM.

Details

Summary

In foldSelectIntoOp we sometimes transform a select of a fadd into a
fadd of a select, where we select between data and an identity value.
For both fadd and fsub the identity is always -0.0, but if the nsz
flag is set on the select instruction we can use +0.0 instead. Doing
so then triggers other optimisations, such as when folding the select
of masked load into a new masked load.

Diff Detail

Event Timeline

david-arm created this revision.Jun 1 2022, 5:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2022, 5:39 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
david-arm requested review of this revision.Jun 1 2022, 5:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 1 2022, 5:39 AM
spatel added inline comments.Jun 1 2022, 10:28 AM
llvm/lib/IR/Constants.cpp
2851–2852

Remove the TODO comment.

2852–2853

This API is awkward. Should we change the constant's method to take a positive/negative param (to match the APFloat call within this call)?

ConstantFP::getZero(Type, bool Negative = false)
llvm/test/Transforms/InstCombine/select-binop-foldable-floating-point.ll
66

We can get a bit more coverage from this test by adding some extra FMF (nnan or ninf for example). That way, we'll verify that all flags are propagated as expected to the new select instruction.

This could also swap the select's true/false operands, so we're testing each of the changed code paths.

Add some FMF on the fadd too?

169

This pair of tests already transformed, but we dropped the FMF right? This will be easier to see if we pre-commit the tests with the baseline CHECK lines.

dmgreen added a subscriber: dmgreen.Jun 2 2022, 6:48 AM

The Arm backend has patterns to turn the awkward fadd(select(.., -0.0)) back into a more natural predicated fadd. They will need to be made to work with both the forms, if this is changing.

The Arm backend has patterns to turn the awkward fadd(select(.., -0.0)) back into a more natural predicated fadd. They will need to be made to work with both the forms, if this is changing.

That's a good point, although it's a shame that there isn't a single test in CodeGen/ARM that defends the DAG combine in PerformFAddVSelectCombine. So I'm not entirely sure what DAG combine I need for the IR cases that you're worried about. :(

The Arm backend has patterns to turn the awkward fadd(select(.., -0.0)) back into a more natural predicated fadd. They will need to be made to work with both the forms, if this is changing.

Ah! My mistake - was looking in the wrong backend - should be CodeGen/Thumb2. Doh!

Yeah - it's handled via tablegen patterns at the moment, from these:
https://github.com/llvm/llvm-project/blob/4fed5f174fa57c73907bb3344018a5f9c2bc2e68/llvm/lib/Target/ARM/ARMInstrMVE.td#L3774

And comes up from any float reduction I think, due to the way we specify the predicated reduction op in the loop:
https://godbolt.org/z/GjhhbKx9b

david-arm updated this revision to Diff 434763.Jun 7 2022, 3:53 AM
  • Added more pattern matches for MVE targets so we detect splats of +0.0 as well as -0.0.
  • Added new ConstantFP::getZero interface.
  • Added more flags to the tests for more coverage.
david-arm marked 4 inline comments as done.Jun 7 2022, 3:54 AM
david-arm added inline comments.
llvm/lib/IR/Constants.cpp
2852–2853

Good suggestion @spatel !

spatel added a comment.Jun 7 2022, 7:49 AM
  • Added more pattern matches for MVE targets so we detect splats of +0.0 as well as -0.0.

The backend code change and tests should be a separate patch/review that lands before the InstCombine change, so we don't regress those patterns.

david-arm updated this revision to Diff 435075.Jun 8 2022, 1:36 AM
david-arm marked an inline comment as done.

Hi @spatel, I've split the MVE changes out into D127275.

spatel accepted this revision.Jun 8 2022, 5:27 AM

It would be good to refactor foldSelectIntoOp() with a lambda as a preliminary cleanup, so we don't have to duplicate the code.
But this is an improvement even without that, so LGTM.

This revision is now accepted and ready to land.Jun 8 2022, 5:27 AM

It would be good to refactor foldSelectIntoOp() with a lambda as a preliminary cleanup, so we don't have to duplicate the code.
But this is an improvement even without that, so LGTM.

I think cleaning this up makes sense. I'll merge this patch first, then create a follow-on NFC to refactor it.