Page MenuHomePhabricator

AMDGPU: Add a fast path for icmp.i1(src, false, NE)
ClosedPublic

Authored by mareko on Sep 13 2018, 2:30 PM.

Details

Summary

This allows moving the condition from the intrinsic to the standard ICmp
opcode, so that LLVM can do simplifications on it. The icmp.i1 intrinsic
is an identity for retrieving the SGPR mask.

And we can also get the mask from and i1, or i1, xor i1.

Diff Detail

Repository
rL LLVM

Event Timeline

mareko created this revision.Sep 13 2018, 2:30 PM

Should the instcombine part change also to allow creation of i1 uses?

Should the instcombine part change also to allow creation of i1 uses?

What do you mean by that? I'm not sure what you mean.

Should the instcombine part change also to allow creation of i1 uses?

What do you mean by that? I'm not sure what you mean.

In InstCombineCalls we whitelist bitwidth sizes that are legal, so if the input compare is an i1 compare, it will fold into the intrinsic

mareko updated this revision to Diff 168520.Oct 5 2018, 1:31 PM
mareko edited the summary of this revision. (Show Details)

AMDGPU: Add a fast path for icmp.i1(src, false, NE)

Summary:
This allows moving the condition from the intrinsic to the standard ICmp
opcode, so that LLVM can do simplifications on it. The icmp.i1 intrinsic
is an identity for retrieving the SGPR mask.

And we can also get the mask from and i1, or i1, xor i1.

Don't fold icmp in InstCombineCalls.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D52060

arsenm added inline comments.Oct 29 2018, 6:30 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
3656–3658 ↗(On Diff #168520)

Needs test in InstCombine

mareko added inline comments.Nov 20 2018, 3:18 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
3656–3658 ↗(On Diff #168520)

What should the test do?

arsenm added inline comments.Nov 20 2018, 3:59 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
3656–3658 ↗(On Diff #168520)

use an original i1 eq/ne comparison. Like the others, just i1

mareko updated this revision to Diff 175387.Nov 26 2018, 8:29 PM
mareko edited the summary of this revision. (Show Details)

Add InstCombine tests.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 14 2019, 6:17 PM
Closed by commit rL351150: AMDGPU: Add a fast path for icmp.i1(src, false, NE) (authored by mareko, committed by ). · Explain Why
This revision was automatically updated to reflect the committed changes.