This is an archive of the discontinued LLVM Phabricator instance.

[CodeGenPrepare][RISCV] Reverse transform in CGP to use zero-compare branch
AbandonedPublic

Authored by dtcxzyw on Apr 7 2023, 7:16 AM.

Details

Summary

This patch handles signed icmp cases of Add and Sub based on D101778 if it guarantees no overflow after hoisting. For example:
https://alive2.llvm.org/ce/z/wbCO7s

define i32 @src(i32 %0) {
  %x = icmp sge i32 %0, 0
  br i1 %x, label %nonnegative, label %5

nonnegative:                                      ; preds = %1
  %2 = icmp slt i32 %0, 13
  br i1 %2, label %5, label %3

3:                                                ; preds = %nonnegative
  %4 = add i32 %0, -13
  ret i32 %4

5:                                                ; preds = %nonnegative, %1
  ret i32 0
}

define i32 @tgt(i32 %0) {
  %x = icmp sge i32 %0, 0
  br i1 %x, label %nonnegative, label %5

nonnegative:                                      ; preds = %1
  %2 = add i32 %0, -13
  %3 = icmp slt i32 %2, 0
  br i1 %3, label %5, label %4

4:                                                ; preds = %nonnegative
  ret i32 %2

5:                                                ; preds = %nonnegative, %1
  ret i32 0
}

This patch addresses performance regressions in RISC-V benchmarks reported by our CI. I think this patch can protect against regressions from patch D147568.

Diff Detail

Event Timeline

dtcxzyw created this revision.Apr 7 2023, 7:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 7 2023, 7:16 AM
dtcxzyw requested review of this revision.Apr 7 2023, 7:16 AM
nikic added a comment.Apr 7 2023, 9:24 AM

Can you base this on nowrap flags instead of re-deriving them?

Can you base this on nowrap flags instead of re-deriving them?

We cannot deduce that Add/Sub instructions with NSW flags do not overflow after hoisting.
For example:
https://alive2.llvm.org/ce/z/GVZbvc

define i32 @src(i32 %0) {
  %2 = add nsw i32 %0, -13
  %3 = icmp slt i32 %2, 0
  br i1 %3, label %5, label %4

4:                                                
  ret i32 %2

5:                                               
  ret i32 0
}

define i32 @tgt(i32 %0) {
  %2 = icmp slt i32 %0, 13
  br i1 %2, label %5, label %3

3:                                               
  %4 = add nsw i32 %0, -13
  ret i32 %4

5:                                                
  ret i32 0
}

This transformation in InstCombine is safe but loses the information that add i32 %0, -13 does not overflow before the branch.
Thus, the reverse transformation in CodeGenPrepare depending on the NSW flag will fail:
https://alive2.llvm.org/ce/z/woAW5v

Therefore the re-derivation is necessary as a "mitigation" for the loss of information.

dtcxzyw abandoned this revision.Mon, Nov 20, 7:37 PM