Page MenuHomePhabricator

[x86] try harder to form LEA from ADD to avoid flag conflicts (PR40483)
ClosedPublic

Authored by spatel on Jul 14 2019, 7:24 AM.

Details

Summary

LEA doesn't affect flags, so use it more liberally to replace an ADD when we know that the ADD operands affect flags.

In the motivating example from PR40483:
https://bugs.llvm.org/show_bug.cgi?id=40483
...this lets us avoid duplicating a math op just to avoid flag conflict.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Jul 14 2019, 7:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 14 2019, 7:24 AM
grandinj added inline comments.
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
2487 ↗(On Diff #209731)

Should this not be an || condition ?

spatel marked an inline comment as done.Jul 14 2019, 12:59 PM
spatel added inline comments.
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
2487 ↗(On Diff #209731)

It could be || (and I tried that initially), but I saw possible regressions (missed load folding) with that logic.

Given that this is a heuristic, it's hard to determine statically when it will be profitable.

Ok if I add a TODO comment about extending this in a follow-up patch? If we can recover the load folding, I agree that using || would be better.

grandinj added inline comments.Jul 14 2019, 11:32 PM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
2487 ↗(On Diff #209731)

Sounds reasonable (noting that I am just a bystander here, not a real reviewer)

LGTM - @craig.topper any comments?

craig.topper added inline comments.Jul 15 2019, 10:33 AM
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
2482 ↗(On Diff #209731)

Should we make sure the flag output has a user? ADC/SBB could definitely only have a live flag input. Not sure about the others.

spatel marked 4 inline comments as done.Jul 15 2019, 1:28 PM
spatel added inline comments.
llvm/lib/Target/X86/X86ISelDAGToDAG.cpp
2482 ↗(On Diff #209731)

Yes, that seems safer, although we do try to reduce back to regular opcodes if the flag wasn't needed. I don't see any test diffs with a draft of that change.

spatel updated this revision to Diff 209944.Jul 15 2019, 1:31 PM
spatel marked an inline comment as done.

Patch updated:

  1. Check flag uses to avoid unintended transform.
  2. Add TODO comment about && vs. ||.

Are the multiply test changes due to the flags being used by seto? But seto usage should never be in danger of creating the instruction duplication we're seeing in the motivating case. It does look like we're getting an improvement on those tests, but not for the reason we're selecting LEA.

Are the multiply test changes due to the flags being used by seto? But seto usage should never be in danger of creating the instruction duplication we're seeing in the motivating case. It does look like we're getting an improvement on those tests, but not for the reason we're selecting LEA.

Yes, I think you're right - we have something like this:

t95: i32,i32 = X86ISD::ADD t73:1, t93
t102: i32,i32 = X86ISD::UMUL t36, t38
t106: i32,i32 = X86ISD::UMUL t40, t33
  t93: i32 = add t106, t102
        t107: i8 = X86ISD::SETCC Constant:i8<0>, t106:1
      t104: i8 = X86ISD::SETCC Constant:i8<0>, t102:1

So the use of LEA is presumably just making things easier for register allocation.
Not sure how to handle this:

  1. Remove UMUL/SMUL from the cases in this patch and forego those improvements.
  2. Add a comment expanding on our reasoning for the transform and keep the improvements.
  3. Refine the flags check somehow (and again forego those improvements in this patch)?

If we focussed just on PR40483 for now - do we just need X86ISD ADD + SUB (ADC + SBB) ?

If we focussed just on PR40483 for now - do we just need X86ISD ADD + SUB (ADC + SBB) ?

Yes - mul and logic don't need to be in the list to change the motivating bug. I'll comment those opcodes out, and we can make this patch independent of them.

spatel updated this revision to Diff 210531.Jul 18 2019, 5:10 AM

Patch updated:
Just the add/sub opcodes for now.

RKSimon accepted this revision.Jul 18 2019, 5:40 AM

LGTM - cheers. Ideally we'd get this into the 9.00 release branch.

This revision is now accepted and ready to land.Jul 18 2019, 5:40 AM
This revision was automatically updated to reflect the committed changes.