This is an archive of the discontinued LLVM Phabricator instance.

[ubsan] Handle undef values in the integer overflow diagnostic
Needs ReviewPublic

Authored by vsk on Oct 5 2016, 4:16 PM.

Details

Summary

The compiler may optimize away the values passed into ubsan's integer
overflow handlers. This can lead to confusing reports. E.g, the runtime
reports overflows in nonsensical expressions such as "0 + 0":

struct Undef { int I; }

int getUndefInt() { Undef U; return U.I; }

int main() { return getUndefInt() + getUndefInt(); }

Improve the situation by checking for an actual overflow in the handler.
If we can't detect an overflow, explain what happened to the user.

Depends on D25294.

Diff Detail

Event Timeline

vsk updated this revision to Diff 73707.Oct 5 2016, 4:16 PM
vsk retitled this revision from to [ubsan] Handle undef values in the integer overflow diagnostic.
vsk updated this object.
vsk added a reviewer: rsmith.
vsk added a subscriber: llvm-commits.
vsk updated this revision to Diff 73708.Oct 5 2016, 4:24 PM
  • Fix -Winvalid-noreturn warnings.
vsk added a subscriber: filcab.

Ping. @filcab, would you mind taking a look?

In particular, I'd be interested in any alternative implementations of hasIntegerOverflow, since the current version can't handle overflows in 128-bit integers.

filcab edited edge metadata.Oct 21 2016, 3:17 AM
In D25295#568734, @vsk wrote:

In particular, I'd be interested in any alternative implementations of hasIntegerOverflow, since the current version can't handle overflows in 128-bit integers.

You can use #if HAVE_INT128_T (and UIntMax) to get 128 bit integers if available.

Why is the test a bc file instead of the plain ll file?
Better yet: Can you get a reproducible for this bug in C/C++? (I can't get your original example to trigger the bug) Would be better, and closer to UBSan uses.

Thank you,
Filipe

lib/ubsan/ubsan_handlers.cc
115

Don't make errors disappear like this. Add an UNREACHABLE() to the default case.

123

Won't you overflow if the values would overflow (for the case where your values are the same type as SIntMax)?

128

return V > UpperLimit || V < LowerLimit;
Then remove the indentation level for the else, make the same transformation, and remove the return false; at the end.

vsk added a comment.Oct 21 2016, 9:22 AM

Why is the test a bc file instead of the plain ll file?

llvm has a bitcode backwards compatibility policy. Checking in bitcode means that people can change the textual IR without breaking this test (e.g when the typeless pointer work lands). We should also check in the textual IR used to produce the bitcode so that people have an easy time updating the test in the future.

Better yet: Can you get a reproducible for this bug in C/C++?

I can reproduce this issue several ways with my system compiler. The problem is that we can't guarantee that all versions of all compilers which compile a test case in C will treat overflow UB in the same way. That's why I thought it'd be better to check in a bitcode test which unambiguously demonstrates the issue.

Fwiw, here is the original program I tried:

// Compile at -fsanitize=integer -O2

struct Undef {
  int I;
}

int getUndefInt() {
  Undef U;
  return U.I; //< Read of uninitialized memory.
}

int main() { return getUndefInt() + getUndefInt(); }
lib/ubsan/ubsan_handlers.cc
115

I will try and think of a better way to do this. As it stands, the problem is that the default case isn't unreachable. If e.g the optimizer decides to pass in (0, 0) into the overflow handler instead of (INT_MAX, INT_MAX), we would hit it.

123

Yes, I brought this up in my last comment as something I didn't have a good solution for. I think I have a better approach now that doesn't actually necessitate evaluating the arithmetic op. I'll update the patch soon!

128

Good point; but I'd rather rework the overflow check.

vsk updated this revision to Diff 75489.Oct 21 2016, 2:50 PM
vsk updated this object.
vsk edited edge metadata.
  • Don't depend on sizeof(SIntMax) being greater than sizeof(ValueHandle). I.e, get a bit smarter about doing overflow checks.
lib/ubsan/ubsan_handlers.cc
115

D'oh, what a silly thinko! Sorry, of course this is unreachable.

vsk updated this revision to Diff 75630.Oct 24 2016, 12:53 PM
  • Get rid of the IR/bitcode test and re-write in C. This makes the test a bit easier to understand, and more portable.
filcab requested changes to this revision.Oct 28 2016, 3:16 PM
filcab edited edge metadata.
filcab added a subscriber: kcc.

I really don't like that the test is reimplementing part of the instrumentation.
I know a test like this will probably be a bit fragile, as we will be relying on some optimizations (to make the undefs appear).

The original test (the one in the phab summary) works, and gets reduced to a call to the reported function on -O2 -fsanitize=undefined. It looks safe enough to use tests like that (With some comments so people understand what we're doing there if the tests break for some reason).
@kcc might have a different opinion, so I'd like him to chime in.

lib/ubsan/ubsan_handlers.cc
105

No need for \brief

111

Signed integer overflow triggers undefined behavior.

112

What do these comments mean? You seem to want to "name" some of the variables, but if I got that correctly, you ended up with two variables which you refer to as L.
Having names depend on the values (Ln vs Lp) doesn't seem to help either.

135

Won't LowerLimit / -1 overflow?

This revision now requires changes to proceed.Oct 28 2016, 3:16 PM
vsk added a comment.Oct 28 2016, 3:35 PM

I really don't like that the test is reimplementing part of the instrumentation.

OK. Then we should see if we can do anything about llvm.sadd.with.overflow() returning different values at -O0/-O3 when its operands are %undef. I'll ask around.

I know a test like this will probably be a bit fragile, as we will be relying on some optimizations (to make the undefs appear).

Right. I have tried changing together e.g multiple calls to getUndef() in one program, but the optimizer throws it away. We may be able to play some tricks with noinline to make this work.

The original test (the one in the phab summary) works, and gets reduced to a call to the reported function on -O2 -fsanitize=undefined. It looks safe enough to use tests like that (With some comments so people understand what we're doing there if the tests break for some reason).
@kcc might have a different opinion, so I'd like him to chime in.

Right, I can try this approach again.

lib/ubsan/ubsan_handlers.cc
111

Is your concern that getIntegerBitWidth() could return something that exceeds the bit-width of SIntMax?

112

I'm trying to attach labels to "LHS" and "RHS" based on whether or not they are negative. Would it help to state explicitly that the 'n' suffix denotes that the variable is negative, or should the comments just be dropped?

135

Yes, this is a bug. The check should be for RHS < LowerLimit/LHS.

vsk marked 5 inline comments as done.Nov 7 2016, 4:24 PM
In D25295#582574, @vsk wrote:

I really don't like that the test is reimplementing part of the instrumentation.

OK. Then we should see if we can do anything about llvm.sadd.with.overflow() returning different values at -O0/-O3 when its operands are %undef. I'll ask around.

I've asked around and haven't found a way around this. Part of the nature of undef is that two uses of a single undef may assume different values: "an ‘undef‘ “variable” can arbitrarily change its value over its “live range”" (http://llvm.org/docs/LangRef.html#undefined-values).

I know a test like this will probably be a bit fragile, as we will be relying on some optimizations (to make the undefs appear).

I tried again to write a plain-C test which gets compiled down to unconditional branches into the ubsan diagnostic routines, but haven't managed to do it. Here is my latest attempt: https://ghostbin.com/paste/wy2g2. The compiler just decides not to codegen the branch-on-undef in the way I like.

Do you have other suggestions for how to test this?

I've taken the rest of your suggestions into account in a local patch.

lib/ubsan/ubsan_handlers.cc
112

It seems like these comments aren't adding anything, so I've removed them locally.

vsk updated this revision to Diff 85022.Jan 19 2017, 1:46 PM
vsk edited edge metadata.
vsk marked 4 inline comments as done.
  • Fix two overflow issues Filipe pointed out in inline comments.
  • Remove the tests for undef values in diagnostic handlers. Instead, we are just testing the corner cases for the overflow checking logic. The rationale is that we don't want to mock up ubsan instrumentation in a test, and we also can't rely on the compiler treating UB in a consistent/testable way.