This is an archive of the discontinued LLVM Phabricator instance.

CodeGenPrepare: Avoid and/or i1 in select conditions.
ClosedPublic

Authored by MatzeB on Feb 13 2015, 1:34 PM.

Details

Summary

Thanks for the comments chandler.

I extracted the visitAND/visitOR refactoring to http://reviews.llvm.org/D8026. This new version only deals with the select DAG combines:

This is based on the following equivalences:
select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))

Many target cannot perform and/or on the CPU flags and therefore the
right side should be choosen to avoid materializign the i1 flags in an
integer register. If the target can perform this operation efficiently
we normalize to the left form.

Diff Detail

Repository
rL LLVM

Event Timeline

MatzeB updated this revision to Diff 19927.Feb 13 2015, 1:34 PM
MatzeB retitled this revision from to CodeGenPrepare: Avoid and/or i1 in select conditions..
MatzeB updated this object.
MatzeB edited the test plan for this revision. (Show Details)
MatzeB set the repository for this revision to rL LLVM.
MatzeB added subscribers: Unknown Object (MLST), hfinkel, arsenm.
ab added a subscriber: ab.Feb 13 2015, 1:54 PM

I should note that this would have been useful as a DAGCombine as
well, because for instance on X86, (select (fcmp oeq/une)) is lowered
into the pattern matched here.

Matthias tells me doing it as a DAGCombine was frowned upon (why?), so
I'm doing pretty much the same thing on X86ISD::CMOV nodes, and will
submit shortly.
-Ahmed

In D7622#123459, @ab wrote:

I should note that this would have been useful as a DAGCombine as
well, because for instance on X86, (select (fcmp oeq/une)) is lowered
into the pattern matched here.

Matthias tells me doing it as a DAGCombine was frowned upon (why?),

I also don't understand why this would not be done in DAGCombine. It is basic-block local, and concerns itself with type legality, it seems appropriate for DAGCombine. One complication with DAGCombine is, if SELECT is not legal, then matching this might be hard.

so
I'm doing pretty much the same thing on X86ISD::CMOV nodes, and will
submit shortly.
-Ahmed

We discussed this in IRC, and I think chandler (and you?) told me that DAGCombine is supposed to normalize in a target independent way, as this normalization should or should not be done depending on the target I was told to do it in CodeGenPrepare...

Legality should not be a problem here as there is already a select instruction in use, I don't see why duplicating this select with a different condition (but same true/false input) could not be legal.

  • Matthias
MatzeB updated this revision to Diff 21073.Mar 2 2015, 6:37 PM

New version performing the transformation as a DAGCombine operation instead of CodeGenPrepare.

MatzeB added a comment.Mar 2 2015, 6:43 PM

The commit message for this:

DAGCombine: Canonicalize select(and/or,x,y) depending on target.

This is based on the following equivalences:
select(C0 & C1, X, Y) <=> select(C0, select(C1, X, Y), Y)
select(C0 | C1, X, Y) <=> select(C0, X, select(C1, X, Y))

Many target cannot perform and/or on the CPU flags and therefore the
right side should be choosen to avoid materializign the i1 flags in an
integer register. If the target can perform this operation efficiently
we normalize to the left form.

Canonicalizing to nested selects has the problem that the and/or in the
conditions is only implcit afterwards, so this patch factors our bigger
parts of the and/or DAGCombiners so they can be applied to nested
selects as well.

There is no context in the diff, and there appears to be a bunch of unrelated code motion mixed in with the change. Fixing those issues would make it much easier to review.

If the code motion doesn't change any behavior but makes it easier to fit this into the DAG combine, just submit the no-op code motion change first and base this patch off that?

MatzeB updated this revision to Diff 21077.Mar 2 2015, 8:26 PM
MatzeB updated this object.
hfinkel added inline comments.Mar 4 2015, 8:51 AM
include/llvm/Target/TargetLowering.h
1108

I think it makes more sense to make this default to:

!hasMultipleConditionRegisters()

because if a target has multiple condition registers, then it probably has operations on those registers.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4715

Range-based for?

lib/Target/PowerPC/PPCISelLowering.h
624 ↗(On Diff #21077)

This needs to default to !Subtarget.useCRBits() (so you might need to move the implementation into the PPCISelLowering.cpp).

Better yet, if you use hasMultipleConditionRegisters() for the default implementation (which I think you should), then you don't need this at all.

MatzeB updated this revision to Diff 21217.Mar 4 2015, 10:41 AM

Thanks for the review Hal, the new version uses hasMultipleConditionRegisters() as default.

Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4714
@@ +4713,3 @@
+ const SDNode *Node = Value.getNode();
+ for (auto I = Node->use_begin(), E = Node->use_end(); I != E; ++I) {

+ const SDUse &Use = I.getUse();

Range-based for?

A range based for prohibits me from doing I.getUse() above as I can't access the iterator anymore.

Thanks for the review Hal, the new version uses hasMultipleConditionRegisters() as default.

Comment at: lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4714
@@ +4713,3 @@
+ const SDNode *Node = Value.getNode();
+ for (auto I = Node->use_begin(), E = Node->use_end(); I != E; ++I) {

+ const SDUse &Use = I.getUse();

Range-based for?

A range based for prohibits me from doing I.getUse() above as I can't access the iterator anymore.

Ah, alright.

include/llvm/Target/TargetLowering.h
1107

You should update the comment also. I'd say:

// If a target has multiple condition registers, then it likely has logical operations on those registers.
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4717

Don't you want to check that it is used as the condition operand of the select?

MatzeB updated this revision to Diff 21219.Mar 4 2015, 11:18 AM

Thanks Hal,

good points. Addressed in this update.

Thanks,
Matthias

hfinkel accepted this revision.Mar 4 2015, 11:25 AM
hfinkel added a reviewer: hfinkel.

LGTM.

This revision is now accepted and ready to land.Mar 4 2015, 11:25 AM

Thanks again for the review. I can't commit this yet though as it depends on http://reviews.llvm.org/D8026

MatzeB updated this revision to Diff 21317.Mar 5 2015, 2:41 PM
MatzeB edited edge metadata.

After looking at some more examples I tweaked the code some more:

  • I'm not checking for hasOnlySelectAsUser() anymore. The fact that we have only select users means that we could get rid of the extra register and the SETCC but we produce an extra select instruction for each of the users, this is not necessarily better so I changed to the conservative hasOneUser().
  • Tweak the callback to not cover types that need multiple registers as those will typically be broken down into multiple select-like instructions.
hfinkel added inline comments.Mar 5 2015, 5:38 PM
include/llvm/Target/TargetLowering.h
1113

Tweak the callback to not cover types that need multiple registers as those will typically be broken down into multiple select-like instructions.

Normally, to check this condition, we'd just use:

isTypeLegal(VT)
MatzeB added inline comments.Mar 6 2015, 11:24 AM
include/llvm/Target/TargetLowering.h
1113

But that will report false for types like i8 and i16 on many target even though we will end up with a single select like instruction. What about?

LegalizeTypeAction Action = getTypeAction(Context, VT);
return Action != TypeExpandInteger && Action != TYpeExpandFloat &&
  Action != TypeSplitVector
hfinkel added inline comments.Mar 6 2015, 11:28 AM
include/llvm/Target/TargetLowering.h
1113

Sure; seems reasonable to me. Go ahead whenever you're ready.

This revision was automatically updated to reflect the committed changes.

Hi all,

working on compiler-rt for an out-of-tree target I found an issue related to this commit. I was able to reproduce it using the Mips backend. Please have a look to the inline comments.

Thanks in advance.

-Michele

llvm/trunk/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
4853–4854 ↗(On Diff #21383)

Running "llc -O3" on the attached example I reach the following assertion

llc: /home/scandale/devel/pure-llvm/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:3204: llvm::SDValue llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc, llvm::EVT, llvm::SDValue, llvm::SDValue, bool, bool, bool): Assertion `N1.getValueType() == N2.getValueType() && N1.getValueType() == VT && "Binary operator types must match!"' failed.
#0 0x1997d75 llvm::sys::PrintStackTrace(llvm::raw_ostream&) /home/scandale/devel/pure-llvm/llvm/lib/Support/Unix/Signals.inc:425:0
#1 0x19980f1 PrintStackTraceSignalHandler(void*) /home/scandale/devel/pure-llvm/llvm/lib/Support/Unix/Signals.inc:483:0
#2 0x1996cc9 SignalHandler(int) /home/scandale/devel/pure-llvm/llvm/lib/Support/Unix/Signals.inc:199:0
#3 0x7f677b48e740 __restore_rt (/usr/lib/libpthread.so.0+0x10740)
#4 0x7f677a6ce4b7 __GI_raise (/usr/lib/libc.so.6+0x334b7)
#5 0x7f677a6cf88a __GI_abort (/usr/lib/libc.so.6+0x3488a)
#6 0x7f677a6c741d __assert_fail_base (/usr/lib/libc.so.6+0x2c41d)
#7 0x7f677a6c74d2 (/usr/lib/libc.so.6+0x2c4d2)
#8 0x181f0cb llvm::SelectionDAG::getNode(unsigned int, llvm::SDLoc, llvm::EVT, llvm::SDValue, llvm::SDValue, bool, bool, bool) /home/scandale/devel/pure-llvm/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:3203:0
#9 0x17781ae (anonymous namespace)::DAGCombiner::visitSELECT(llvm::SDNode*) /home/scandale/devel/pure-llvm/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp:4888:0

This is because the condition values of the two select nodes have different types, thus we cannot create an AND between them.

A quick workaround is to check also that the types of the two conditions are the same. An alternative could be to extend one of the condition to have homogeneous types.

4872–4873 ↗(On Diff #21383)

Same issue of lines 4853-4854...

I would be fine with either skipping the transformation if the types don't match up or extending the types. I personally would lean towards skipping the transformation as the extension is an extra instruction which makes the transformation less beneficial. As none of the targets I know are affected, what would the preference for your target?

  • Matthias

I would be fine with either skipping the transformation if the types don't match up or extending the types. I personally would lean towards skipping the transformation as the extension is an extra instruction which makes the transformation less beneficial. As none of the targets I know are affected, what would the preference for your target?

  • Matthias

The previous example was showing the issue using the in-tree Mips backend.
Btw, the behavior of the target I am working on is quite close to Mips. On the proposed example it is better to skip the transformation when the two condition are not homogeneous.

-Michele

I fixed this in r234763.