This is an archive of the discontinued LLVM Phabricator instance.

[X86] combineADC - fold ADC(C1,C2,Carry) -> ADC(0,C1+C2,Carry)
ClosedPublic

Authored by RKSimon on Mar 25 2022, 7:19 AM.

Details

Summary

If we're not relying on the flag result, we can fold the constants together into the RHS immediate operand and set the LHS operand to zero, simplifying for further folds.

We could do something similar if the flag result is in use and the constant fold doesn't affect it, but I don't have any real test cases for this yet.

As suggested by @davezarzycki on Issue #35256

Diff Detail

Event Timeline

RKSimon created this revision.Mar 25 2022, 7:19 AM
RKSimon requested review of this revision.Mar 25 2022, 7:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 25 2022, 7:19 AM

This is a great improvement but a scenario is still missing. For example, from a build of LLVM with this change applied (i.e. self hosted):

000000000180e020 <_ZNK4llvm12X86InstrInfo30getFMA3OpcodeToCommuteOperandsERKNS_12MachineInstrEjjRKNS_17X86InstrFMA3GroupE>:
180e020: mov 0x10(%rsi),%rax
180e024: movzwl (%rax),%r9d
180e028: mov 0x10(%rax),%rsi
180e02c: cmp %ecx,%edx
180e02e: mov %ecx,%edi
180e030: cmova %edx,%ecx
180e033: cmovb %edx,%edi
180e036: xor %eax,%eax
180e038: bt $0x2a,%rsi
180e03d: mov $0x0,%edx
180e042: adc $0x2,%edx
180e045: cmp $0x1,%edi

Sorry, I can't see it - what were you hoping it would fold to?

I was just surprised to see that the last four lines didn't swap the BT and the zeroing of EDX. Maybe this is a separate problem. This seems like the natural output of the last four lines of the example above:

xor %edx,%edx
bt $0x2a,%rsi
adc $0x2,%edx
cmp $0x1,%edi

Yes, the schedulers don't seem to do a good job of working around eflags uses.

Okay. I'll try to remember to create a followup bug after this lands. I'm sure I'm naive about it, but it seems to me that after register allocation, MOV $0, <reg> instructions should move earlier (when possible) in the basic block to get out of the way of EFLAG dependencies and allow the MOV $0, <reg> to XOR <reg>, <reg> optimization, but that's just me hand waving.

craig.topper added inline comments.Mar 26 2022, 5:49 PM
llvm/test/CodeGen/X86/combine-adc.ll
88

This still seems more complicated that it needs to be.

RKSimon added inline comments.Mar 27 2022, 10:49 AM
llvm/test/CodeGen/X86/combine-adc.ll
88

Yes combineCarryThroughADD is missing a fold to X86ISD::BT, its on my todo list.......

Allen added a subscriber: Allen.Mar 27 2022, 1:38 PM

any further comments?

llvm/test/CodeGen/X86/combine-adc.ll
88

This is addressed in D122572

This revision is now accepted and ready to land.Mar 29 2022, 10:24 AM
This revision was landed with ongoing or failed builds.Mar 30 2022, 1:13 AM
This revision was automatically updated to reflect the committed changes.