This is an archive of the discontinued LLVM Phabricator instance.

[InstSimplify] Fix addo/subo undef folds (PR42209)
AbandonedPublic

Authored by nikic on Jun 9 2019, 2:05 PM.

Details

Summary

Fix folds of addo and subo with an undef operand to be:

  • subo(X, undef) folds to { 0, false } (holds for undef = X).
  • addo(X, undef) folds to { -1, false } (holds for undef = ~X).
  • Same for commuted variants.

These are consistent with the folds we use for subsat and addsat.

Fixes PR42209

Diff Detail

Event Timeline

nikic created this revision.Jun 9 2019, 2:05 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 9 2019, 2:05 PM
lebedev.ri edited the summary of this revision. (Show Details)Jun 14 2019, 3:12 PM

I wonder if we should instead fold to { i8 undef, i1 false } ?

nikic added a comment.Jun 15 2019, 1:18 AM

@lebedev.ri It's not possible to fold to { i8 undef, i1 false }. As a counter-example uaddo(1, undef) can never have the value { 0, false }. Reaching the zero value would have required overflow, which is precluded by the overflow result. Similar cases exist for the other intrinsics as well.

lebedev.ri requested changes to this revision.Jun 15 2019, 4:20 PM

@lebedev.ri It's not possible to fold to { i8 undef, i1 false }. As a counter-example uaddo(1, undef) can never have the value { 0, false }. Reaching the zero value would have required overflow, which is precluded by the overflow result. Similar cases exist for the other intrinsics as well.

Ok so. I *think* i'm finally holding alive the right way, gently-enough, and breathing in other direction to not disturb it.
And it does not agree with that statement:

$ .../alive  -root-only /tmp/test.opt  
OMP: Info #270: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
Processing /tmp/test.opt..

----------------------------------------
Name: zz
  %undef0 = udiv i8 %a, 0
  %X = uadd_overflow i8 1, %undef0
  %Y = extractvalue {i8, i1} %X, 0
  %Z = extractvalue {i8, i1} %X, 1
  ret %Y
=>
  %undef0 = udiv i8 %a, 0
  %X = uadd_overflow i8 1, %undef0
  %Y = 0
  %Z = 0
  ret %Y

Done: 64
Optimization is correct!

----------------------------------------
Name: zz 2
  %undef0 = udiv i8 %a, 0
  %X = uadd_overflow i8 1, %undef0
  %Y = extractvalue {i8, i1} %X, 0
  %Z = extractvalue {i8, i1} %X, 1
  ret %Z
=>
  %undef0 = udiv i8 %a, 0
  %X = uadd_overflow i8 1, %undef0
  %Y = 0
  %Z = 0
  ret %Z

Done: 64
Optimization is correct!

In fact, it tells me that my suggestion about folding all 4 of @llvm.{u,s}{add,sub}.with.overflow to {undef, false} is correct.
And the @llvm.{u,s}mul.with.overflow can all be folded to {0, false}

This revision now requires changes to proceed.Jun 15 2019, 4:20 PM
nikic added a comment.Jun 16 2019, 1:27 AM

@lebedev.ri Not familiar with what exactly alive proves, but assuming you're using two separate proofs for a reason: It is possible to make the result zero, yes. But only if the overflow flag is chosen as true (rather than false) at the same time. Both result & overflow need to be checked together for the result to be meaningful.

(Additionally, if alive models udiv semantics correctly, any transformation will be correct by definition (IUB) after a udiv 0 -- I'm assuming that's not how it works though.)

@lebedev.ri Not familiar with what exactly alive proves, but assuming you're using two separate proofs for a reason: It is possible to make the result zero, yes. But only if the overflow flag is chosen as true (rather than false) at the same time. Both result & overflow need to be checked together for the result to be meaningful.

Good observarvation. Close, but nope:

$ ./src/alive2/build-Clang-release/alive -root-only /tmp/test.opt 
OMP: Info #270: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
Processing /tmp/test.opt..

----------------------------------------
Name: uadd_overflow
  %X = uadd_overflow i8 %a, undef
  %Op = extractvalue {i8, i1} %X, 0
  %Ov = extractvalue {i8, i1} %X, 1
  %OpIsZero = icmp eq %Op, 0
  %NoOverflow = icmp eq %Ov, 0
  %Zero = and i1 %OpIsZero, %NoOverflow
  ret i1 %Zero
=>
  ret i1 1

Done: 1
Optimization is correct!
$ ./src/alive2/build-Clang-release/alive -root-only /tmp/test.opt 
OMP: Info #270: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
Processing /tmp/test.opt..

----------------------------------------
Name: uadd_overflow
  %X = uadd_overflow i8 %a, %b
  %Op = extractvalue {i8, i1} %X, 0
  %Ov = extractvalue {i8, i1} %X, 1
  %OpIsZero = icmp eq %Op, 0
  %NoOverflow = icmp eq %Ov, 0
  %Zero = and i1 %OpIsZero, %NoOverflow
  ret i1 %Zero
=>
  ret i1 1

ERROR: Value mismatch

Example:
i8 %a = #xa0 (160)
i8 %b = #x80 (128)

Source:
{i8, i1} %X = { #x20 (32), #x1 (0) }
i8 %Op = #x20 (32)
i1 %Ov = #x1 (1)
i1 %OpIsZero = #x0 (0)
i1 %NoOverflow = #x0 (0)
i1 %Zero = #x0 (0)

Target:
Source value: #x0 (0)
Target value: #x1 (1)

(Additionally, if alive models udiv semantics correctly, any transformation will be correct by definition (IUB) after a udiv 0 -- I'm assuming that's not how it works though.)

nikic added a comment.Jun 16 2019, 4:13 AM

@lebedev.ri Not familiar with what exactly alive proves, but assuming you're using two separate proofs for a reason: It is possible to make the result zero, yes. But only if the overflow flag is chosen as true (rather than false) at the same time. Both result & overflow need to be checked together for the result to be meaningful.

Good observarvation. Close, but nope:

$ ./src/alive2/build-Clang-release/alive -root-only /tmp/test.opt 
OMP: Info #270: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
Processing /tmp/test.opt..

----------------------------------------
Name: uadd_overflow
  %X = uadd_overflow i8 %a, undef
  %Op = extractvalue {i8, i1} %X, 0
  %Ov = extractvalue {i8, i1} %X, 1
  %OpIsZero = icmp eq %Op, 0
  %NoOverflow = icmp eq %Ov, 0
  %Zero = and i1 %OpIsZero, %NoOverflow
  ret i1 %Zero
=>
  ret i1 1

Done: 1
Optimization is correct!

Isn't this checking exactly the transform proposed here?

nikic added a comment.Jun 16 2019, 4:18 AM

Isn't this checking exactly the transform proposed here?

Ah no it isn't, got confused between addo and subo there.

Ok, so {undef, 0} it is?

nikic abandoned this revision.Jun 16 2019, 8:42 AM

Sorry, I'm still not convinced. Alive evaluates a single atomic with_overflow operation by computing the operation result and the overflow flag independently, which is usually okay but in my eyes not right when it comes to undef modeling, because it artificially increases the number of undef users over what exists in IR. (For reference, some of the discussion is in https://github.com/AliveToolkit/alive2/issues/71.)

I'll leave this to someone else. Undef and I clearly don't get along...

Hm, not sure how to reclaim the differential, there is no such thing in "Add Action" dropdown.