This is an archive of the discontinued LLVM Phabricator instance.

[Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'
ClosedPublic

Authored by wristow on Jul 11 2022, 4:14 PM.

Details

Summary

[Reassociate] Enable FP reassociation via 'reassoc' and 'nsz'

Compiling with '-ffast-math' tuns on all the FastMathFlags (FMF), as
expected, and that enables FP reassociation. Only the two FMF flags
'reassoc' and 'nsz' are technically required to perform reassociation,
but disabling other unrelated FMF bits is needlessly suppressing the
optimization.

This patch fixes that needless suppression, and makes appropriate
adjustments to test-cases, fixing some outstanding TODOs in the process.

Fixes: #56483

Diff Detail

Event Timeline

wristow created this revision.Jul 11 2022, 4:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2022, 4:14 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
wristow requested review of this revision.Jul 11 2022, 4:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 11 2022, 4:14 PM
wristow added a comment.EditedJul 11 2022, 4:27 PM

Assuming this approach is sensible, I'm raising the question of whether this patch should be split into two.

Specifically, in implementing the fix, I found cases where opt produced less efficient code with my fix than without it. I ultimately noticed that the similar inefficiency already existed even when the full set of fast flags were enabled, and there were some TODOs suggesting that should be investigated. See for example, test7() in "llvm/test/Transforms/Reassociate/fast-basictest.ll". So in fixing this, I made some changes to deal with those inefficiencies.

Splitting this patch into two (one to deal with those inefficiencies, and one to deal with the needless suppression of reassociation) seems reasonable. But regardless, I wanted to show the entire patch here, to give the full context of the fix.

nikic added a subscriber: nikic.Jul 12 2022, 1:04 AM
nikic added inline comments.
llvm/lib/Transforms/Scalar/Reassociate.cpp
151

Just I->isAssociative() is sufficient, isFast() implies isAssociative().

I support the direction. IIRC, fixing FMF requirements for -reassociate came up during similar patches that were made to fix -instcombine like D47335. This just got lost in my bug queue.

Using I->isAssociative() directly seems like the right change rather than adding a wrapper (agree with inline comment from @nikic). Any uses of isFast() in IR are bogus at this point (that API should be deprecated) - there is no transform that requires all of the individual FMF.

If we can fix the regressions as an independent patch, that would be ideal.

Thanks for the quick feedback!

Using I->isAssociative() directly seems like the right change rather than adding a wrapper (agree with inline comment from @nikic). Any uses of isFast() in IR are bogus at this point (that API should be deprecated) - there is no transform that requires all of the individual FMF.

I'll update the check to look for hasAllowReassoc() && hasNoSignedZeros() rather than using isFast() at all, and rather than invoking isAssociative() (described in more detail in my inline comment).

If we can fix the regressions as an independent patch, that would be ideal.

I'll split this into two independent patches.

llvm/lib/Transforms/Scalar/Reassociate.cpp
151

Actually that is what I tried in my first attempt, but it didn't work. :)
The problem is that there are operators that are not associative (and so isAssociative() returns false), but some of those operators can be transformed into something associative, and the Reassociation Pass then nicely handles them. For example, subtraction isn't associative, but if we have:

(0.0-X)*Y + Z

then this can (arithmetically) be transformed into:

Z - X*Y

But that transformation was suppressed when isAssociative() alone was used, because it returned false. So for example, code like the following:

%A = fsub fast float 0.0, %X
%B = fmul fast float %A, %Y
%C = fadd fast float %B, %Z

no longer was optimized when I only used isAssociative().

But as I write this, I realize that by including isFast(), that's not the right solution. I should have done is check for reassoc and nsz:
hasAllowReassoc() && hasNoSignedZeros()
rather than isFast(). I'll change that. That will optimize more cases.

I'll split this into two independent patches.

I've split off the miscellaneous cleanup into a separate patch: D129612. Once that wraps up, I'll update the patch here to just be the improved optimization with reassoc and nsz.

craig.topper added inline comments.
llvm/lib/Transforms/Scalar/Reassociate.cpp
151

I think there was some confusion about what isAssociative is. It's based on opcode not the fast math flags. It's a property like isCommutative.

wristow added inline comments.Jul 12 2022, 10:27 PM
llvm/lib/Transforms/Scalar/Reassociate.cpp
151

To elaborate, for the Instruction method I->isAssociative(), it's based purely on the opcode for integer-typed operations; but for floating-point operations, it's based on both the opcode and whether or not reassoc and nsz are set.

craig.topper added inline comments.Jul 12 2022, 10:36 PM
llvm/lib/Transforms/Scalar/Reassociate.cpp
151

Oops. I missed that. Thanks for clarifying!

wristow updated this revision to Diff 444878.Jul 14 2022, 10:08 PM

Update to:

  1. Remove the cleanup that has been done separately (and committed as 230c8c56f21cfe4e23a24793f3137add1af1d1f0).
  2. Check only for reassoc and nsz (rather than isFast()).
wristow marked 3 inline comments as done.Jul 14 2022, 10:11 PM
spatel accepted this revision.Jul 15 2022, 8:35 AM

This is a straightforward translation of the existing FMF checks, so LGTM.
I added some inline comments for potential cleanups that could be done as NFC commits either before or after.

llvm/lib/Transforms/Scalar/Reassociate.cpp
145–147

It would be good to add a line to the code comment to state the non-obvious bit that came up in this review.
Something like this:

/// This is not the same as testing Instruction::isAssociative() because it includes 
/// operations like fsub.
156

Existing issue: this is awkward - we're dyn_casting to Instruction, but then we nakedly cast to BinaryOperator on the return. Could this cast to BO directly? There might be some subtlety with binop constant expressions that needs to be addressed.

2229

Existing issue: checking the type isn't consistent with the other diffs that check if the value is an FPMathOp. I'm not sure what, if any, functional difference it would make to the pass, but it seems wrong that these aren't using the same condition.

This revision is now accepted and ready to land.Jul 15 2022, 8:35 AM

This is a straightforward translation of the existing FMF checks, so LGTM.
I added some inline comments for potential cleanups that could be done as NFC commits either before or after.

Thanks for the review @spatel. I'll update the comment and commit. And I'll revisit the suggested cleanups a bit later.

llvm/lib/Transforms/Scalar/Reassociate.cpp
145–147

Good idea Sanjay. I'll do that before committing.

156

Good point. I'll be out of the office for a week, and I don't want to rush into this change (but I do want to get the main fix done now). So I'll revisit this point when I return.

2229

Another good point. Again, I'll revisit this point when I return.

This revision was landed with ongoing or failed builds.Jul 15 2022, 11:45 AM
This revision was automatically updated to reflect the committed changes.