This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] Fold BinOp into Select containing identity constant
Changes PlannedPublic

Authored by laytonio on Oct 24 2020, 11:21 PM.

Details

Summary

To prevent materializing unneeded identity constants, and to avoid unnecessary arithmetic in branches that where the value wouldn't be modified, add folds for the following:

(add x, (select cc, 0, cf)) -> (select cc, x, (add x, cf))
(or x, (select cc, 0, cf)) -> (select cc, x, (or x, cf))
(xor x, (select cc, 0, cf)) -> (select cc, x, (xor x, cf))
(sub x, (select cc, 0, cf)) -> (select cc, x, (sub x, cf))
(shl x, (select cc, 0, cf)) -> (select cc, x, (shl x, cf))
(lshr x, (select cc, 0, cf)) -> (select cc, x, (lshr x, cf))
(ashr x, (select cc, 0, cf)) -> (select cc, x, (ashr x, cf))
(mul x, (select cc, 1, cf)) -> (select cc, x, (mul x, cf))
(sdiv x, (select cc, 1, cf)) -> (select cc, x, (sdiv x, cf))
(udiv x, (select cc, 1, cf)) -> (select cc, x, (udiv x, cf))
(and x, (select cc, -1, cf)) -> (select cc, x, (and x, cf))

Fixes select cases of PR47922

Diff Detail

Event Timeline

laytonio created this revision.Oct 24 2020, 11:21 PM
laytonio requested review of this revision.Oct 24 2020, 11:21 PM
laytonio added inline comments.Oct 24 2020, 11:24 PM
llvm/test/CodeGen/PowerPC/cmpb.ll
243

Large regressions here unfortunately.

First attempt, feedback appreciated.

foad added a subscriber: foad.Oct 26 2020, 1:51 AM
foad added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
536–537

Maybe pass IdentityInt in as an Optional<int64_t> to simplify all the callers?

2086–2087

"Don't do this if the select has more than one use", surely?

2118

Use int or unsigned instead of char.

llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
339–341

Regression here and in lots of other tests below. It looks like it's no longer using an add-with-carry instruction to implement x+(cond?1:0).

There's a large number of regressions in here that still need addressing - have you had any luck nailing any of them down?

llvm/test/CodeGen/AMDGPU/amdgpu-codegenprepare-idiv.ll
339–341

Is this just a generic combine we're missing or would this need to be fixed with an isel pattern?

llvm/test/CodeGen/X86/avx-select.ll
26

I'm not sure if these tests are testing what there were supposed to anymore?

llvm/test/CodeGen/X86/binop-identity.ll
3

Maybe pre-commit these tests to trunk and rebase this patch to show the delta?

I'm not sure you need -verify-machineinstrs

We're going to need a binop-identity-vector.ll set of tests as well.

laytonio updated this revision to Diff 300884.Oct 26 2020, 9:13 PM

Fixed some of the regressions.

laytonio marked 2 inline comments as done.Oct 26 2020, 9:16 PM
laytonio added inline comments.
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
536–537

I don't think there would be a good way to handle ISD::AND with AllOnes with a int64_t

laytonio added inline comments.Oct 28 2020, 11:06 AM
llvm/test/CodeGen/X86/binop-identity.ll
3

We're going to need a binop-identity-vector.ll set of tests as well.

What vector widths do you think would be best for testing here? Should I do a left and right for each binop at say <4 x i32>, <8 x i32>, and <16 x i32> with SSE, AVX, and AVX512, or is that overkill?

laytonio planned changes to this revision.Oct 28 2020, 11:23 AM
foad added inline comments.Oct 29 2020, 6:57 AM
llvm/test/CodeGen/AMDGPU/dagcombine-select.ll
58–69

Regression here. The "or" is suppoed to be combined away.

llvm/test/CodeGen/AMDGPU/sdiv64.ll
576–578

Another add-with-carry regression here.

foad added inline comments.Oct 29 2020, 7:00 AM
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
536–537

I was thinking you'd pass in -1 for that case, which would be sign-extended whenever it gets converted to an APInt.

xbolva00 added inline comments.
llvm/test/CodeGen/PowerPC/cmpb.ll
243

It depends, you should regenerate test checks with current codegen and then commit them.
Then rebase this patch.

nemanjai added inline comments.Jan 30 2022, 8:20 AM
llvm/test/CodeGen/PowerPC/cmpb.ll
243

The instruction selection for this rather complex instruction is handled in PPCDAGToDAGISel::combineToCMPB(). The instruction does a comparison of pairs of bytes in two registers producing a byte with all 1's if the two bytes are equal and all 0's if they are not.

Presumably the DAG that comes out for this after your transformation looks different enough that the idiom recognition for that instruction is no longer capable of detecting the DAG.

But I agree that you should produce the test checks with the script so we can see the direct comparison of the full codegen.

@laytonio Hasn't been active since Dec 2020 - is there anyone that has the bandwidth to continue this (either comandeer or start a new patch)?

spatel added a comment.Feb 8 2022, 5:16 AM

@laytonio Hasn't been active since Dec 2020 - is there anyone that has the bandwidth to continue this (either comandeer or start a new patch)?

We're not there yet, but we're on a path to eventually overlap this patch with:
D118644 (enabled this fold for a subset of x86 vector types/opcodes)
D119150 (moves the code into DAGCombiner with a TLI hook)
D119111 (adds more FP opcodes - abandoned, but we can easily adapt it into D119150)

This first set of patches is a direct response to perf regressions that occurred via D113442, so we will try to get them into the 14 release branch too.

lebedev.ri resigned from this revision.Jan 12 2023, 4:57 PM

This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.

Herald added a project: Restricted Project. · View Herald TranscriptJan 12 2023, 4:57 PM