This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel][IRTranslator] Generate better conditional branch lowering.
ClosedPublic

Authored by aemerson on Aug 26 2020, 5:00 PM.

Details

Summary

This is a port of the functionality from SelectionDAG, which tries to find a tree of conditions from compares that are then combined using OR or AND, before using that result as the input to a branch. Instead of naively lowering the code as is, this change converts that into a sequence of conditional branches on the sub-expressions of the tree.

Like SelectionDAG, we re-use the case block codegen functionality from the switch lowering utils, which causes us to generate some different code. The result of which I've tried to mitigate in earlier combine patches.

Diff Detail

Event Timeline

aemerson created this revision.Aug 26 2020, 5:00 PM
aemerson requested review of this revision.Aug 26 2020, 5:00 PM
aemerson added inline comments.Aug 26 2020, 5:02 PM
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
396

I don't think this is strictly necessary for GlobalISel because we don't have to care about values being exported across basic blocks. However, I'd like to port over the functionality as is for now, and then remove it later after doing some performance tests to check the impact.

I was actually just looking at the original globalisel proposal, and switches were intended to be directly translated into a G_SWITCH instruction. Would it be much effort to move this into a machine transformation?

For AMDGPU we're stuck using LowerSwitch and control flow intrinsics, but in the near-ish future we can directly handle control flow here

I was actually just looking at the original globalisel proposal, and switches were intended to be directly translated into a G_SWITCH instruction. Would it be much effort to move this into a machine transformation?

For AMDGPU we're stuck using LowerSwitch and control flow intrinsics, but in the near-ish future we can directly handle control flow here

Unfortunately we have a fair amount of shared code for doing the analysis for the jump table/bit test optimizations on the IR (in SwitchLoweringUtils). It seemed easier to share the code and pay the cost of doing this in the translator than to duplicate even more logic by having completely separate implementations. Myself I don't have the bandwidth to make such a large change even if we wanted to.

paquette added inline comments.Aug 27 2020, 10:07 AM
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
437

static_cast?

485

Would it make sense to assert that we have an Instruction::And or an Instruction::Or at the beginning of the function?

Then we can catch it before any of the code under, say BOpc != unsigned(Opc) kicks in and potentially does weird stuff/asserts.

522

Why?

570

Remove commented out code?

aemerson added inline comments.Aug 27 2020, 2:49 PM
llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
522

Because for multiple cases, it's generally better to emit as branches. The code below is trying to handle some special cases which can be simplified better than using multiple branches. I'll add tests for those.

aemerson updated this revision to Diff 288464.Aug 27 2020, 2:49 PM

Address comments.

paquette accepted this revision.Sep 4 2020, 10:45 AM

LGTM

llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp
826

don't need braces here?

This revision is now accepted and ready to land.Sep 4 2020, 10:45 AM
This revision was landed with ongoing or failed builds.Sep 9 2020, 1:16 PM
This revision was automatically updated to reflect the committed changes.