This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Rewrite the elide-br-by-swapping-icmp-ops combine to do less.
ClosedPublic

Authored by aemerson on Aug 26 2020, 4:58 PM.

Details

Summary

This combine previously tried to take sequences like:

%cond = G_ICMP pred, a, b
G_BRCOND %cond, %truebb
G_BR %falsebb
%truebb:
...
%falsebb:
...

and by inverting the compare predicate and swapping branch targets, delete the G_BR and instead have a single conditional branch to the falsebb. Since in an earlier patch we have a combine to fold not(icmp) into just an inverted icmp, we don't need this combine to do as much. This patch instead generalizes the combine by just looking for:

G_BRCOND %cond, %truebb
G_BR %falsebb
%truebb:
...
%falsebb:
...

and then inverting the condition using a not (xor). The xor can be folded away in a separate combine. This change also lets us avoid some optimization code in the IRTranslator.

I also think that deleting G_BRs in the combiner is unnecessary. That's something that targets can decide to do at selection time and could simplify generic code in future.

Diff Detail

Event Timeline

aemerson created this revision.Aug 26 2020, 4:58 PM
aemerson requested review of this revision.Aug 26 2020, 4:58 PM

The IRTranslator chooses to omit the G_BR if it can use a fallthrough. Should it not do that either, and the verifier enforce always having G_BR and G_BRCOND?

The IRTranslator chooses to omit the G_BR if it can use a fallthrough. Should it not do that either, and the verifier enforce always having G_BR and G_BRCOND?

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
921

Should be able to get ths from MachinEIRBuilder now

llvm/lib/CodeGen/GlobalISel/Utils.cpp
744

Not sure there's much point in using APInt here?

aemerson added inline comments.Aug 27 2020, 12:07 PM
llvm/lib/CodeGen/GlobalISel/Utils.cpp
744

I've moved this function to an earlier patch and changed it to use int64_t instead.

The IRTranslator chooses to omit the G_BR if it can use a fallthrough. Should it not do that either, and the verifier enforce always having G_BR and G_BRCOND?

I think this is a reasonable idea. We discussed this a long time ago internally but ultimately didn't have any strong opinions to change it.

The IRTranslator chooses to omit the G_BR if it can use a fallthrough. Should it not do that either, and the verifier enforce always having G_BR and G_BRCOND?

I think this is a reasonable idea. We discussed this a long time ago internally but ultimately didn't have any strong opinions to change it.

My main question for this is: Where would we handle the fallthrough cases for this? It's common enough that I'd prefer it to be somewhere target independent.

aemerson updated this revision to Diff 288462.Aug 27 2020, 2:47 PM

Address comments.

aemerson updated this revision to Diff 288979.Aug 31 2020, 10:17 AM

Rebase and move getICmpTrueVal() from an earlier patch to this one.

arsenm accepted this revision.Sep 9 2020, 8:00 AM

Seems like you would have to find a compare or something to set isFP

This revision is now accepted and ready to land.Sep 9 2020, 8:00 AM
This revision was landed with ongoing or failed builds.Sep 9 2020, 1:16 PM
This revision was automatically updated to reflect the committed changes.
llvm/lib/Target/AMDGPU/AMDGPUCombine.td