Similar to ARM and SystemZ.
Related patches: D101778 (preferZeroCompareBranch),
https://reviews.llvm.org/rG9a9421a461166482465e786a46f8cced63cd2e9f (== 0 to u< 1)
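As a quick illustration of the equivalences involved (a Python model, not LLVM code; the 64-bit width and constants match the optbranch_64 case discussed in the comments): for unsigned values, `x == 0` is the same test as `x u< 1`, and branching on `x == -1` is the overflow bit of `x + 1`.

```python
def uadd_with_overflow(a, b, bits=64):
    # Models @llvm.uadd.with.overflow: wrapped sum plus an overflow flag.
    total = a + b
    return total % (1 << bits), total >= (1 << bits)

all_ones = (1 << 64) - 1  # i64 -1 viewed as an unsigned value

for x in (0, 1, 42, all_ones - 1, all_ones):
    assert (x == 0) == (x < 1)       # "== 0" rewritten as "u< 1"
    math, ov = uadd_with_overflow(x, 1)
    assert ov == (x == all_ones)     # branch on x == -1 is the overflow of x + 1
```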
Differential D142071
[RISCV] Enable preferZeroCompareBranch to optimize branch on zero in codegenprepare
Authored by liaolucy on Jan 18 2023, 6:23 PM
Thanks for this! We typically add CHECK lines for both RV32 and RV64, and I think it would be worth doing here. I did look at this before, but held off as it seemed to cause a regression on one of the SystemZ test cases and I didn't have a chance to look into it. Trying now, it looks like that's currently only the case for optbranch_64 with RV32 - I don't know if you've looked at that regression at all? It might be worth grabbing the optbranch_32 test case from SystemZ as well to expand the test coverage.
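For context, the optbranch_64 test (from the SystemZ tests) returns -1 when the argument is all-ones and Arg + 1 otherwise; the branchy overflow form and the select form that SimplifyCFG produces (shown in the dumps below) compute the same thing. A rough Python model (the function bodies are paraphrased from the IR, not copied):

```python
MASK64 = (1 << 64) - 1

def optbranch_64_branch(arg):
    # Branch form: test the overflow bit of arg + 1.
    if (arg + 1) > MASK64:        # uadd.with.overflow's %ov
        return MASK64             # i64 -1 as unsigned
    return (arg + 1) & MASK64     # %math

def optbranch_64_select(arg):
    # Select form, as SimplifyCFG produces on ARM.
    i1 = arg == MASK64            # icmp eq i64 %Arg, -1
    i4 = (arg + 1) & MASK64       # add nuw i64 %Arg, 1
    return MASK64 if i1 else i4   # select i1 %i1, i64 -1, i64 %i4

for arg in (0, 7, MASK64 - 1, MASK64):
    assert optbranch_64_branch(arg) == optbranch_64_select(arg)
```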
I analyzed ARM and SystemZ.

RISCV: after CodeGenPrepare we generate @llvm.uadd.with.overflow.i64 + extractvalue, which causes the regression:

*** IR Dump After CodeGen Prepare (codegenprepare) ***
define i64 @optbranch_64(i64 %Arg) {
bb:
  %0 = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %Arg, i64 1)
  %math = extractvalue { i64, i1 } %0, 0
  %ov = extractvalue { i64, i1 } %0, 1
  br i1 %ov, label %bb2, label %bb3

bb2:                                              ; preds = %bb
  ret i64 -1

bb3:                                              ; preds = %bb
  ret i64 %math
}

ARM runs a SimplifyCFG pass before CodeGenPrepare:

*** IR Dump After Simplify the CFG (simplifycfg) ***
define i64 @optbranch_64(i64 %Arg) {
bb:
  %i1 = icmp eq i64 %Arg, -1
  %i4 = add nuw i64 %Arg, 1
  %common.ret.op = select i1 %i1, i64 -1, i64 %i4
  ret i64 %common.ret.op
}

SystemZ is the same as RISCV - it generates @llvm.uadd.with.overflow.i64 + extractvalue - but SystemZ has the register pair $r2d:

*** IR Dump After SystemZ DAG->DAG Pattern Instruction Selection (systemz-isel) ***:
# Machine code for function optbranch_64: IsSSA, TracksLiveness
Function Live Ins: $r2d in %1

bb.0.bb:
  successors: %bb.1(0x40000000), %bb.2(0x40000000); %bb.1(50.00%), %bb.2(50.00%)
  liveins: $r2d
  %1:gr64bit = COPY $r2d
  %0:gr64bit = ALGHSIK %1:gr64bit, 1, implicit-def $cc
  BRC 15, 12, %bb.2, implicit $cc
  J %bb.1

bb.1.bb2:
; predecessors: %bb.0
  %2:gr64bit = LGHI -1
  $r2d = COPY %2:gr64bit
  Return implicit $r2d

bb.2.bb3:
; predecessors: %bb.0
  $r2d = COPY %0:gr64bit
  Return implicit $r2d

Can you land the test with current codegen so that we can see the regression being discussed?

@reames Sorry for the delay. Precommit tests: https://reviews.llvm.org/rG66a30b9fd0bab0d5db24772674932a66d433212c
I plan to implement similar code in riscv-codegenprepare, handling only the initial Shift. This should avoid the regression for add.

Based on your prior comment, I think your regression here comes down to how we lower uadd_with_overflow.
It looks like our lowering of that could use some improvement. You could investigate improving that and then return here. Another option to consider is whether we should be returning false from TLI->shouldFormOverflowOp. (In either case, the result should be a separate patch and this one should then depend on it.)

Thanks. A deeper look at ARM's lowering. Testcase:

declare {i64, i1} @llvm.uadd.with.overflow.i64(i64, i64) nounwind readnone

define i64 @optbranch_64(i64 %Arg) {
bb:
  %0 = call { i64, i1 } @llvm.uadd.with.overflow.i64(i64 %Arg, i64 -1)
  %math = extractvalue { i64, i1 } %0, 0
  %ov = extractvalue { i64, i1 } %0, 1
  %.math = select i1 %ov, i64 -1, i64 %math
  ret i64 %.math
}

At step 2 (Optimized lowered selection DAG: %bb.0), ARM and RISCV are the same:

SelectionDAG has 18 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t4: i32,ch = CopyFromReg t0, Register:i32 %1
  t5: i64 = build_pair t2, t4
  t7: i64,i1 = uaddo t5, Constant:i64<-1>
  t8: i64 = select t7:1, Constant:i64<-1>, t7
  t12: i32 = extract_element t8, Constant:i32<0>
  t14: ch,glue = CopyToReg t0, Register:i32 $r0, t12
  t10: i32 = extract_element t8, Constant:i32<1>
  t16: ch,glue = CopyToReg t14, Register:i32 $r1, t10, t14:1
  t17: ch = ARMISD::RET_FLAG t16, Register:i32 $r0, Register:i32 $r1, t16:1

But at step 3 (Type-legalized selection DAG: %bb.0 'optbranch_64:bb') they diverge.

ARM:
SelectionDAG has 18 nodes:
  t0: ch,glue = EntryToken
  t23: i32 = select t28, Constant:i32<-1>, t21
  t14: ch,glue = CopyToReg t0, Register:i32 $r0, t23
  t24: i32 = select t28, Constant:i32<-1>, t22
  t16: ch,glue = CopyToReg t14, Register:i32 $r1, t24, t14:1
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t21: i32,i32 = uaddo t2, Constant:i32<-1>
  t4: i32,ch = CopyFromReg t0, Register:i32 %1
  t26: i32 = and t21:1, Constant:i32<1>
  t22: i32,i32 = addcarry t4, Constant:i32<-1>, t26
  t28: i32 = and t22:1, Constant:i32<1>
  t17: ch = ARMISD::RET_FLAG t16, Register:i32 $r0, Register:i32 $r1, t16:1

RISCV:
SelectionDAG has 24 nodes:
  t0: ch,glue = EntryToken
  t2: i32,ch = CopyFromReg t0, Register:i32 %0
  t4: i32,ch = CopyFromReg t0, Register:i32 %1
  t31: i32 = select t38, Constant:i32<-1>, t26
  t14: ch,glue = CopyToReg t0, Register:i32 $x10, t31
  t32: i32 = select t38, Constant:i32<-1>, t29
  t16: ch,glue = CopyToReg t14, Register:i32 $x11, t32, t14:1
  t26: i32 = add t2, Constant:i32<-1>
  t28: i32 = setcc t26, t2, setult:ch
  t27: i32 = add t4, Constant:i32<-1>
  t29: i32 = add t27, t28
  t35: i32 = setcc t29, t4, seteq:ch
  t33: i32 = setcc t29, t4, setult:ch
  t36: i32 = select t35, t28, t33
  t38: i32 = and t36, Constant:i32<1>
  t17: ch = RISCVISD::RET_FLAG t16, Register:i32 $x10, Register:i32 $x11, t16:1

ARM seems to have introduced the new target nodes ARMISD::ADDC and ARMISD::ADDE.

If we want to fix this regression for @optbranch_64, we need to lower ISD::ADDCARRY and ISD::SETCCCARRY. But I checked, and those nodes are all used for carry bits/flags; for rv64gc/rv32gc we do not have instructions that handle carry-add or carry-sub. Maybe this lowering could be used with RVV, which has "11.4. Vector Integer Add-with-Carry / Subtract-with-Borrow Instructions". So I will create a patch for TLI->shouldFormOverflowOp.
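The type-legalized RISCV DAG above computes the 64-bit carry from two 32-bit halves: a low add whose carry is `sum_lo u< lo` (t26/t28), a high add that folds that carry in (t27/t29), and a `select(sum_hi == hi, carry_lo, sum_hi u< hi)` (t35/t33/t36) to recover the carry-out. A Python sketch (my reading of the node chain, not LLVM code) checking that this is a correct 64-bit unsigned add-with-overflow:

```python
import random

M32 = 1 << 32

def uaddo64_split(lo, hi, b_lo, b_hi):
    """64-bit unsigned add-with-overflow built from 32-bit halves,
    mirroring the node chain in the type-legalized RISCV DAG."""
    sum_lo = (lo + b_lo) % M32
    c_lo = int(sum_lo < lo)                 # setcc ... setult (t28)
    sum_hi = (hi + b_hi + c_lo) % M32       # add, add (t27, t29)
    # select(seteq, carry_in, setult): when sum_hi == hi, the high addend
    # plus carry wrapped to exactly 0 mod 2^32, so carry-out == carry-in.
    c_hi = c_lo if sum_hi == hi else int(sum_hi < hi)
    return sum_lo, sum_hi, c_hi

random.seed(0)
for _ in range(1000):
    a, b = random.getrandbits(64), random.getrandbits(64)
    lo, hi, ov = uaddo64_split(a % M32, a // M32, b % M32, b // M32)
    assert lo + (hi << 32) == (a + b) % (1 << 64)
    assert ov == (a + b >= (1 << 64))
```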