This is an archive of the discontinued LLVM Phabricator instance.

[x86] improve CMOV codegen by pushing add into operands
ClosedPublic

Authored by spatel on Jul 22 2021, 3:15 PM.

Details

Summary

This is not the transform direction we want in general, but by the time we have a CMOV, we've already tried everything else that could be better.
The transform increases the uses of the other add operand, but that is safe according to Alive2:
https://alive2.llvm.org/ce/z/Yn6p-A

We could probably extend this to other binops (not just add), but this is the motivating pattern discussed in:
https://llvm.org/PR51069

The test with i8 shows a missed fold because there's a trunc sitting in front of the add. I haven't checked to see what it would take to get that case.

Diff Detail

Event Timeline

spatel created this revision.Jul 22 2021, 3:15 PM
spatel requested review of this revision.Jul 22 2021, 3:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2021, 3:15 PM
pengfei added inline comments.Jul 22 2021, 10:54 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
49872

Surprising to see the first operand is false operand. I saw we are taking it as true in same cases

def : Pat<(f128 (X86cmov VR128:$t, VR128:$f, timm:$cond, EFLAGS)),
          (CMOV_VR128 VR128:$t, VR128:$f, timm:$cond)>;

Which one is right?

craig.topper added inline comments.Jul 22 2021, 11:04 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
49872

It should be false first. For non-pseudos false should be same as the dest operand making it a tied register which should be the first source operand. The comments in emitLoweredSelect which handle the pseudo also have false first.

Aha, yes, this late this is fine.
Looks good to me.
I guess you want to deal with i8 as a follow-up step?

llvm/lib/Target/X86/X86ISelLowering.cpp
49850

It's not really hoisting, pushAddThroughCmovOfConsts/pushAddIntoCmovOfConsts maybe or something?

49872

Unless i'm really misreading it, i believe the assembly is correct, so false first.

llvm/test/CodeGen/X86/add-cmov.ll
51–52

I guess you need to look past a truncation when looking for the cmov,
and then widen everything to the cmov type.

Nice!

FTR - I'd started trying to get cmov-of-constants support into X86DAGToDAGISel::matchAddress (but maybe it should be in a LEA pass?) as I think we could then handle "add(x,add(y,cmov(20, 50))) -> add(add(x, 20), cmov(y,add(y,30)))" to help pull more of the math inside LEA?

spatel marked 5 inline comments as done.Jul 23 2021, 5:35 AM
spatel added inline comments.
llvm/lib/Target/X86/X86ISelLowering.cpp
49850

Sure - will update.

49872

I copied that code from combineCMov(), so we should be consistent in this file at least.

llvm/test/CodeGen/X86/add-cmov.ll
51–52

Yes - or we try to carve out another case where we prefer i32 instead of trying to narrow the add. Changing the preferred type would probably be the better choice, but there will likely be a lot of test diffs...

spatel updated this revision to Diff 361166.Jul 23 2021, 5:45 AM
spatel marked 3 inline comments as done.
spatel retitled this revision from [x86] improve CMOV codegen by hoisting add to [x86] improve CMOV codegen by pushing add into operands.

Patch updated - changed "hoist" to "push".

lebedev.ri accepted this revision.Jul 23 2021, 5:49 AM

I think byte-sized add should be dealt with separately.
LG unless there are other comments.

llvm/test/CodeGen/X86/add-cmov.ll
51–52

I think we should in general widen everything less than i32 to i32,
but here i'd guess it would be better to just look past trunc.

This revision is now accepted and ready to land.Jul 23 2021, 5:49 AM
RKSimon accepted this revision.Jul 23 2021, 5:53 AM

LGTM - cheers

This revision was landed with ongoing or failed builds.Jul 23 2021, 6:41 AM
This revision was automatically updated to reflect the committed changes.