This is an archive of the discontinued LLVM Phabricator instance.

[x86][eflags] Fix PR37431 by teaching the EFLAGS copy lowering to specifically handle SETB_C* pseudo instructions.
ClosedPublic

Authored by chandlerc on May 12 2018, 5:29 PM.

Details

Summary

While the logic here is somewhat similar to the arithmetic lowering, it
is different enough that it made sense to have its own function.
I actually tried a bunch of different optimizations here and none worked
well so I gave up and just always do the arithmetic based lowering.

Looking at code from the PR test case, we actually pessimize a bunch of
code when generating these. Because SETB_C* pseudo instructions clobber
EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple
SETB_C* pseudos from a single set of EFLAGS. This in turn causes the
lowering code to ruin all the clever code generation that SETB_C* was
hoping to achieve. None of this is needed. Whenever we're generating
multiple SETB_C* instructions from a single set of EFLAGS we should
instead generate a single maximally wide one and extract subregs for all
the different desired widths. That would result in substantially better
code generation. But this patch doesn't attempt to address that.

The test case from the PR is included as well as more directed testing
of the specific lowering pattern used for these pseudos.

Diff Detail

Event Timeline

chandlerc created this revision.May 12 2018, 5:29 PM
chandlerc planned changes to this revision.May 12 2018, 6:00 PM
  1. I need CHECK lines in this test case.
  1. More importantly, this is the wrong approach. SETB_C* is more different from SETB than I realized. Props to Craig for pointing this out. I need to specially handle SETB_C* in the EFLAGS copy lowering to "promote" the way I emit SETB to use SETB_C* when appropriate, and then fix the registers afterward.
chandlerc updated this revision to Diff 146711.May 14 2018, 3:50 PM

Take a completely different approach here that should work much more reliably.
The code is pretty nasty. Suggestions on simplifying it are very welcome.

chandlerc retitled this revision from [x86][eflags] Fix PR37431 by teaching the x86 backend to understand a setCC pseudo when trying to understand generic setCC behavior. to [x86][eflags] Fix PR37431 by teaching the EFLAGS copy lowering to specifically handle SETB_C* pseudo instructions..May 14 2018, 4:23 PM
chandlerc edited the summary of this revision. (Show Details)
chandlerc added a reviewer: rnk.

Updated Phabricator description and such as well to match the new approach in this patch. PTAL.

craig.topper added inline comments.May 14 2018, 5:24 PM
llvm/test/CodeGen/X86/flags-copy-lowering.mir
526

This looks wrong. You didn't do anything to put zeros in bits 15:8. SUBREG_TO_REG doesn't make that happen.

chandlerc updated this revision to Diff 146741.May 14 2018, 7:52 PM

Fix a bug where we didn't explicitly zero-extend registers smaller than
32-bits. Thanks to Craig for pointing this out in review.

chandlerc marked an inline comment as done.May 14 2018, 7:52 PM
chandlerc updated this revision to Diff 146742.May 14 2018, 7:56 PM

Tweak comments slightly.

craig.topper added inline comments.May 14 2018, 9:46 PM
llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
856

Are you intending to change the map entry in CondRegs here?

chandlerc updated this revision to Diff 146763.May 15 2018, 2:12 AM

Fix a tiny issue noticed by Craig in review.

chandlerc added inline comments.May 15 2018, 2:12 AM
llvm/lib/Target/X86/X86FlagsCopyLowering.cpp
856

Yikes. Good catch, fixed.

Amazingly, I don't think this could have broken anything, but just by chance. It happens that because this instruction is a def of EFLAGS as well, we never keep scanning past it, and so we never re-use the now busted cache entry.

Still, better to not modify the entry in CondRegs. I've also left a FIXME that I'll address in a follow-up to just use an API that doesn't require a local reference so that this bug can't be easily written.

This revision is now accepted and ready to land.May 15 2018, 11:31 AM
This revision was automatically updated to reflect the committed changes.