Page MenuHomePhabricator

[AMDGPU] Fold brcond (setcc zext(i1 x), 1, ne) -> brcond x

Authored by rampitec on Nov 28 2018, 2:34 PM.


Diff Detail

Event Timeline

rampitec created this revision.Nov 28 2018, 2:34 PM

Should this be a generic combine?


sext should also be OK


Can't this be reduced further?

rampitec marked 2 inline comments as done.Nov 28 2018, 3:04 PM

Should this be a generic combine?

I tried to make it generic but got real regression in ARM (tests 2007-05-09-tailmerge-2.ll and 2007-05-22-tailmerge-3.ll). The tail merge did not happen when expected. Turns out that is not unconditionally good regardless of the target.


sext would mean comparison with -1, not with 1 and does not really happen.
I can add it, but it will complicate the code.


I do not think so. That is the final code produced:

        s_and_b64 vcc, exec, s[12:13]
        s_cbranch_vccz BB0_4
; %bb.3:                                ; %bb132
                                        ;   in Loop: Header=BB0_2 Depth=2
        s_add_u32 s4, s8, s14
        s_addc_u32 s5, s9, s15
        s_branch BB0_5

As you may see we do not change exec here, just test it.

rampitec updated this revision to Diff 175780.Nov 28 2018, 3:31 PM
rampitec marked an inline comment as done.

Added sext handling.

mareko added a subscriber: mareko.Nov 28 2018, 5:04 PM
mareko added inline comments.

How do you know OrigCC is setcc? It could also be any bitwise op.

rampitec marked an inline comment as done.Nov 28 2018, 5:08 PM
rampitec added inline comments.

I do not need to kbow it, it can be anything producing i1. An i1 value will become sreg64 and will be directly used. In fact in the testcase it is not setcc, it is CopyFromReg.

rampitec planned changes to this revision.Nov 28 2018, 7:48 PM

It does not validate in some tests. I will investigate it as I do not see right away what is wrong.

rampitec abandoned this revision.Nov 29 2018, 6:31 AM

The error is that condition is actually has to be negated. However, if I just negate it it goes into an endless loop inside the combiner. I need to fix our BRCOND lowering instead.