This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG] add nsw on BuildLookuptable LinearMap calculation
ClosedPublic

Authored by khei4 on May 19 2023, 1:37 AM.

Details

Summary

1-c step of https://reviews.llvm.org/D146903

We can attach NSW on the addition and multiplication of LinearMap from the switch if the values are monotonic.

alive proofs: https://alive2.llvm.org/ce/z/F8mZUf

Note: SwitchToLookUpTable is inherently target-independent except for BitMap patterns, which embed the values into one register. It might be better to separate independent tests from target-specific test cases also.

Diff Detail

Event Timeline

khei4 created this revision.May 19 2023, 1:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2023, 1:37 AM
khei4 requested review of this revision.May 19 2023, 1:37 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2023, 1:37 AM
khei4 updated this revision to Diff 523690.May 19 2023, 1:40 AM

remove comment

khei4 updated this revision to Diff 523692.May 19 2023, 1:43 AM

fix style

khei4 updated this revision to Diff 523760.May 19 2023, 7:15 AM
khei4 retitled this revision from (WIP)[SimplifyCFG] add nsw on BuildLookuptable LinearMap calculation to [SimplifyCFG] add nsw on BuildLookuptable LinearMap calculation.
khei4 edited the summary of this revision. (Show Details)
khei4 added a reviewer: nikic.

remove unnecessary initializer

TODO: track compile time effect

khei4 edited the summary of this revision. (Show Details)May 19 2023, 7:17 AM
khei4 edited the summary of this revision. (Show Details)May 19 2023, 7:20 AM
nikic added inline comments.May 21 2023, 11:28 AM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
6122

I don't think this is sufficient. Let's say we have multiplier 2 and MaxIndex is -1 (that is, full range). Then 2*-1 does not overflow, but there are still plenty of values in the range that do overflow (e.g. INT_MAX).

A possible way to properly handle this would be using ConstantRange. I think another possibility is to used ssub_ov for the Val - PrevVal calculation above and set NSW based on the overflow flag there.

In any case, we need additional tests to cover this.

khei4 added a comment.EditedMay 21 2023, 10:45 PM

TODO: use ConstantRange,
TODO: add test case which have select case value ranges [2^n-1, 2^n].

TODO: add signed overflow results

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
6122

Let's say we have multiplier 2 and MaxIndex is -1 (that is, full range). Then 2*-1 does not overflow, but there are still plenty of values in the range that do overflow (e.g. INT_MAX).

Yeah, its case matters. Overflowed MaxIndex should be handled. I'll use ConstantRange!

Also, I'll add a linear map test that uses full range.

khei4 planned changes to this revision.May 21 2023, 11:30 PM

I would skip the LinearMap part! Until the overflow benefits are revealed!G

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
6122

@nikic It seems like I missed Max/MinIndex is signed sense.
I think the tests we need are unsigned results with signed-overflowed/underflowed values

khei4 added inline comments.Jun 12 2023, 11:19 PM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp
6122

It seems like I missed Max/MinIndex in a signed sense.

I mean we don't need to worry about signed-over/underflow checks with falsy unsigned sense Max/MinIndex.

And I guess this is index full range tests.
(from above URL)

define i32 @signed_overflow1(i8 %n) {
; CHECK-LABEL: @signed_overflow1(
; CHECK-NEXT:  start:
; CHECK-NEXT:    [[TRUNC:%.*]] = trunc i8 [[N:%.*]] to i2
; CHECK-NEXT:    [[SWITCH_TABLEIDX:%.*]] = sub i2 [[TRUNC]], -2
; CHECK-NEXT:    [[SWITCH_TABLEIDX_ZEXT:%.*]] = zext i2 [[SWITCH_TABLEIDX]] to i3
; CHECK-NEXT:    [[SWITCH_GEP:%.*]] = getelementptr inbounds [4 x i32], ptr @switch.table.signed_overflow1, i32 0, i3 [[SWITCH_TABLEIDX_ZEXT]]
; CHECK-NEXT:    [[SWITCH_LOAD:%.*]] = load i32, ptr [[SWITCH_GEP]], align 4
; CHECK-NEXT:    ret i32 [[SWITCH_LOAD]]
;
start:
  %trunc = trunc i8 %n to i2
  switch i2 %trunc, label %bb1 [
  i2 0, label %bb6
  i2 1, label %bb3
  i2 -2, label %bb4
  i2 -1, label %bb5
  ]

bb1:                                              ; preds = %start
  unreachable

bb3:                                              ; preds = %start
  br label %bb6

bb4:                                              ; preds = %start
  br label %bb6

bb5:                                              ; preds = %start
  br label %bb6

bb6:                                              ; preds = %start, %bb3, %bb4, %bb5
  %.sroa.0.0 = phi i32 [ 4444, %bb5 ], [ 3333, %bb4 ], [ 2222, %bb3 ], [ 1111, %start ]
  ret i32 %.sroa.0.0
}
khei4 edited the summary of this revision. (Show Details)Jun 18 2023, 8:26 PM
khei4 updated this revision to Diff 532521.Jun 18 2023, 9:04 PM

Change the plan to use PrevDist to check monotonicity.
Rebase for pre-commit test https://reviews.llvm.org/D153238.

khei4 edited the summary of this revision. (Show Details)Jun 19 2023, 8:34 PM
nikic accepted this revision.Jun 25 2023, 6:10 AM

LGTM

llvm/lib/Transforms/Utils/SimplifyCFG.cpp
6064

I would drop the Inc var and directly write

Wrapped |= Dist.isStrictlyPositive() ? Val.sle(PrevVal) : Val.sgt(PrevVal);

here.

This revision is now accepted and ready to land.Jun 25 2023, 6:10 AM
khei4 updated this revision to Diff 534383.Jun 25 2023, 4:44 PM

inline Inc.

This revision was landed with ongoing or failed builds.Jun 25 2023, 5:11 PM
This revision was automatically updated to reflect the committed changes.
uabelho added a subscriber: uabelho.Sep 8 2023, 3:52 AM

Hi @khei4 and @nikic

I think I've found a case where this patch causes a miscompile.
If we do

opt bbi-86196.ll -o - -S -passes="simplifycfg<switch-to-lookup>"

on

define i8 @foo(i3 %0) {                              ; Assume 1
entry:
  switch i3 %0, label %cond.end [
    i3 -1, label %cond.false
    i3 -2, label %cond.false
    i3 1, label %cond.false                          ; -> %cond.false
    i3 0, label %cond.false
  ]

cond.false:                                       ; preds = %entry, %entry, %entry, %entry
  %mul = shl nsw i3 %0, 1                            ; 1 << 1 = 2
  br label %cond.end

cond.end:                                         ; preds = %entry, %cond.false
  %cond = phi i3 [ %mul, %cond.false ], [ 2, %entry ]; 2
  %conv = sext i3 %cond to i8
  ret i8 %conv
}

we get

define i16 @foo(i3 %0) {                            ; Assume 1
entry:
  %switch.tableidx = sub i3 %0, -2                  ; 1 - -2 = 3
  %1 = icmp ult i3 %switch.tableidx, -4             ; 3 (011) u< -4 (100) = 1
  %switch.idx.mult = mul nsw i3 %switch.tableidx, 2 ; 3 * 2 = 6 ovf -2 poison?!
  %switch.offset = add nsw i3 %switch.idx.mult, -4  ; -2 + -4 = -6 ovf 2 poison?!
  %cond = select i1 %1, i3 %switch.offset, i3 2     ; %1 is 1 -> 2 poison?!
  %conv = sext i3 %cond to i16                      ; 2 poison
  ret i16 %conv                                     ; 2 poison
}

which I think is wrong.
I've added some comments in the input and output showing what I think happens for the input value 1.
There are "nsw" on the mul and add instructions, but both caluculations overflow the i3 so I think we get poison there after the transformation?

khei4 added a comment.EditedSep 10 2023, 1:38 AM

@uabelho Thank you for catching this! As you see, attaching NSW to this example you gave seems unreasonable!

I thought switch results values' monotonicity was enough to say wrapping must not happen, but it wasn't correct.
(-2, -1, 1, 2) |→ (-4, -2, 0, 2) is monotonic, but actual calculation contains ov (-2, -1, 1, 2) |→ (0, 1, 2, 3) (*2) |→ (0, 2, 4(ov), 6(ov)) |→ (-4, -2, 0, 2).

I think we can handle this case if we calculate the maximum value, with monotonicity checking.

Candidate patch: https://github.com/llvm/llvm-project/pull/65882