Page MenuHomePhabricator

[SelectionDAG] swap select_cc operands to enable folding
ClosedPublic

Authored by labrinea on Oct 12 2018, 6:39 PM.

Details

Summary

The DAGCombiner tries to SimplifySelectCC as follows:
select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)
It can't cope with the situation of reordered operands:
select_cc(x, y, 0, 16, cc)
In that case we just need to swap the operands and invert the Condition Code:
select_cc(x, y, 16, 0, ~cc)

Diff Detail

Event Timeline

labrinea created this revision.Oct 12 2018, 6:39 PM

Is the motivating case integer or FP?
I'm asking because we have a canonicalization for integer cmp+sel for the IR in these tests, but we're missing the corresponding FP transform.
If we add the FP canonicalization in IR, would there still be a need for this backend patch? Ie, is something generating this select code in the DAG itself?

Is the motivating case integer or FP?
I'm asking because we have a canonicalization for integer cmp+sel for the IR in these tests, but we're missing the corresponding FP transform.
If we add the FP canonicalization in IR, would there still be a need for this backend patch? Ie, is something generating this select code in the DAG itself?

Yes, the motivating example was floating point comparison. Indeed, I added the unordered predicates and modified InstCombine to handle fpcmp. It then swapped the operands and inverted the predicates for me. But what does "Canonical" actually mean? The backend still won't be able to do the transformation of the description for select_cc(x, y, 0, 16, cc) if cc is already canonical, assuming this is a reachable state in DAG.

Is the motivating case integer or FP?
I'm asking because we have a canonicalization for integer cmp+sel for the IR in these tests, but we're missing the corresponding FP transform.
If we add the FP canonicalization in IR, would there still be a need for this backend patch? Ie, is something generating this select code in the DAG itself?

Yes, the motivating example was floating point comparison. Indeed, I added the unordered predicates and modified InstCombine to handle fpcmp. It then swapped the operands and inverted the predicates for me. But what does "Canonical" actually mean? The backend still won't be able to do the transformation of the description for select_cc(x, y, 0, 16, cc) if cc is already canonical, assuming this is a reachable state in DAG.

Yes, we can't guarantee that this pattern won't appear in the DAG if the cc was already the canonical predicate. We could try harder in instcombine to make the larger constant appear as the 'true value' for the select, but we can't guarantee it (if there are extra uses of the compare, then we won't do that transform in IR and increase the instruction count).

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
18287

I don't know how to expose this as a bug, but this scares me. We are changing the value of CC without swapping the N2/N3 values, and it's possible to fall-through to the code underneath this block. At that point, we have the wrong CC value.

Please move this block of code into a helper function as a preliminary step, so we don't have this risk.

test/CodeGen/AArch64/select_cc.ll
1–4

Please add this test file to trunk as a preliminary step and use utils/update_llc_test_checks.py to auto-generate the check lines.

labrinea updated this revision to Diff 171488.Oct 29 2018, 5:32 AM

I've autogenerated the filecheck lines to show the diff compared to the trunk codegen. For making sure we never fall-through to the next block, having changed the CC but not swapped (N2, N3), I've moved all the preconditions to the beginning of the block (instead of moving the block into a helper function).

labrinea added inline comments.Oct 29 2018, 5:35 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
18282

Shouldn't TLI.getBooleanContents(N0.getValueType()) be TLI.getBooleanContents(getSetCCResultType(N0.getValueType())) instead, or it doesn't matter?

spatel added inline comments.Oct 29 2018, 2:03 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
18282

Given that we're using getSetCCResultType(N0.getValueType()) when we create the SCC value below, that seems right. But that should be a separate clean-up step either before or after this patch. I'm not sure if there's a way to expose that bug in a test though.

18282–18284

clang-format?

18329–18331

clang-format?

labrinea updated this revision to Diff 172359.Nov 2 2018, 8:01 AM

Rebased and clang-formatted.

spatel accepted this revision.Nov 6 2018, 12:13 PM

LGTM

(For reference: I was wondering why x86 doesn't show any diffs for this change; it looks like there's custom code in X86ISelLowering that already does the same thing.)

This revision is now accepted and ready to land.Nov 6 2018, 12:13 PM

LGTM

(For reference: I was wondering why x86 doesn't show any diffs for this change; it looks like there's custom code in X86ISelLowering that already does the same thing.)

I see. Thanks for the review!

This revision was automatically updated to reflect the committed changes.