The DAGCombiner tries to SimplifySelectCC as follows:
select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)
It can't cope with the situation of reordered operands:
select_cc(x, y, 0, 16, cc)
In that case we just need to swap the operands and invert the Condition Code:
select_cc(x, y, 16, 0, ~cc)
Details
Diff Detail
Event Timeline
Is the motivating case integer or FP?
I'm asking because we have a canonicalization for integer cmp+sel for the IR in these tests, but we're missing the corresponding FP transform.
If we add the FP canonicalization in IR, would there still be a need for this backend patch? Ie, is something generating this select code in the DAG itself?
Yes, the motivating example was floating point comparison. Indeed, I added the unordered predicates and modified InstCombine to handle fpcmp. It then swapped the operands and inverted the predicates for me. But what does "Canonical" actually mean? The backend still won't be able to do the transformation of the description for select_cc(x, y, 0, 16, cc) if cc is already canonical, assuming this is a reachable state in DAG.
Yes, we can't guarantee that this pattern won't appear in the DAG if the cc was already the canonical predicate. We could try harder in instcombine to make the larger constant appear as the 'true value' for the select, but we can't guarantee it (if there are extra uses of the compare, then we won't do that transform in IR and increase the instruction count).
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
18287 | I don't know how to expose this as a bug, but this scares me. We are changing the value of CC without swapping the N2/N3 values, and it's possible to fall-through to the code underneath this block. At that point, we have the wrong CC value. Please move this block of code into a helper function as a preliminary step, so we don't have this risk. | |
test/CodeGen/AArch64/select_cc.ll | ||
1–4 | Please add this test file to trunk as a preliminary step and use utils/update_llc_test_checks.py to auto-generate the check lines. |
I've autogenerated the filecheck lines to show the diff compared to the trunk codegen. For making sure we never fall-through to the next block, having changed the CC but not swapped (N2, N3), I've moved all the preconditions to the beginning of the block (instead of moving the block into a helper function).
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
18282 | Shouldn't TLI.getBooleanContents(N0.getValueType()) be TLI.getBooleanContents(getSetCCResultType(N0.getValueType())) instead, or it doesn't matter? |
lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
18282 | Given that we're using getSetCCResultType(N0.getValueType()) when we create the SCC value below, that seems right. But that should be a separate clean-up step either before or after this patch. I'm not sure if there's a way to expose that bug in a test though. | |
18282–18284 | clang-format? | |
18329–18331 | clang-format? |
LGTM
(For reference: I was wondering why x86 doesn't show any diffs for this change; it looks like there's custom code in X86ISelLowering that already does the same thing.)
Shouldn't TLI.getBooleanContents(N0.getValueType()) be TLI.getBooleanContents(getSetCCResultType(N0.getValueType())) instead, or it doesn't matter?