Page MenuHomePhabricator

[GISel] Add new combines for G_ADD
Needs ReviewPublic

Authored by mkitzan on Sep 18 2020, 1:14 PM.

Details

Summary

Patch adds four new GICombineRules for G_ADD:

  • G_ADD(x, -cst) -> G_SUB(x, cst)
  • G_ADD(x, G_SUB(y, x)) -> y
  • G_ADD(G_SUB(y, x), x) -> y
  • G_ADD(x, y) -> G_OR(x, y) (iff x and y share no common bits)

Patch additionally adds new combine tests for AArch64 target for these new rules, as well as updating AMDGPU GISel tests. A new GlobalISel/Utils.h helper function was added to check if two vregs have no common bits set.

Diff Detail

Event Timeline

mkitzan created this revision.Sep 18 2020, 1:14 PM
mkitzan requested review of this revision.Sep 18 2020, 1:14 PM

Out of curiosity, is the first rule a win or canonicalisation?

arsenm added inline comments.Sep 18 2020, 1:46 PM
llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
322

APInt to allow it to work for wide integers?

llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir
86

Add an s128 case? This currently won't work for vectors, but it would be nice to handle a few vector tests too

llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
23–25

I don't understand where this regression came from (but isn't particularly important)

54–63

This is a much more interesting regression

mkitzan added a comment.EditedSep 18 2020, 2:57 PM

Out of curiosity, is the first rule a win or canonicalisation?

Primarily canonicalization, however depending on the target there can be perf wins. By eliminating the signed constant, there's a better chance the constant can be folded into the opcode as an immediate operand rather than being loaded into a register before being used (if the target supports immediate operands, otherwise nothing lost, eh?).

Assume hypothetical target only supports 2b immediate operands to arithmetic ops and registers are 8b. Here's an example where the combine is a perf win.

mov r1, -1
add r0, r1
---
sub r0, 1
llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h
322

Make sense, will add

llvm/test/CodeGen/AArch64/GlobalISel/combine-add.mir
86

Will add an s128 and a vector test case for each of these

llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
54–63

I noticed regressions in this test file too. The combine causing them is the: G_ADD(x, y) -> G_OR(x, y) (iff x and y share no common bits). That combine rule is the cause of nearly all the AMDGPU test changes (with the exception of three or four tests).

arsenm added inline comments.Sep 18 2020, 4:10 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
54–63

So we need the equivalent utility for SelectionDAG::isBaseWithConstantOffset(

mkitzan added inline comments.Sep 18 2020, 5:50 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
54–63

Not quite sure where that fits into the implementation of the combine. The added GlobalISel/Utils.cpp:haveNoCommonBitsSet acts the same as SelectionDAG.pp:haveNoCommonBitsSet which is used to gate the DAG combine version (x + y) -> (x | y) iff x and y share no bits.

mkitzan updated this revision to Diff 292930.Sep 18 2020, 6:24 PM
  • Changed G_ADD(x, -cst) -> G_SUB(x, cst) to use APInt
  • Added vector and s128 test cases (though some combines don't fire with these input types)
arsenm added inline comments.Fri, Sep 25, 1:29 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.i128.ll
54–63

It's not for the combine itself, it's for all of the other code to deal with the side effect of it. In general this is a terrible combine because nobody expects adds to turn into ors. Without a utility function you can easily use to match adds that were turned into ors, I think it's generally worse to do this combine. I think having this utility function is a prerequisite for this