[ubsan] add pointer overflow checking
Needs RevisionPublic

Authored by regehr on May 17 2016, 4:56 AM.



This is a patch by Will Dietz that I dusted off and would like to try to get accepted. It teaches UBSan about overflowing pointers. Will wrote a blog post about using it:


There is a corresponding compiler-rt patch that I'll submit next.

Not really sure who to add as reviewers for this, suggestions welcome.

Diff Detail

regehr retitled this revision from to [ubsan] add pointer overflow checking.May 17 2016, 4:56 AM
regehr updated this object.
regehr added reviewers: filcab, rsmith.
regehr added a subscriber: dtzWill.

For EmitInBoundsGEPOverflowCheck, since it's easier to have clang build the GEP and then add the verifications, it might be better to create an IRBuilder that inserts before the GEP instructions and use that builder so your verification comes before the GEP calculation (and the UB).


Nit: Since you're maintaining the naming convention that is already there, you could just call this addr too. Otherwise, seems out of place.


Shouldn't you check before you create the pointer?


Shouldn't we be looking at the sanitizer's PointerOverflow?


Since you always used the IntPtrTy s{add,mul}_with_overflow operations, won't this always be true, if Valid is true by this time?


No need for \brief anymore.

filcab added inline comments.May 17 2016, 8:29 AM

Derp, you can add or subtract stuff. Nevermind.

This comment was removed by dtzWill.
dtzWill added inline comments.May 17 2016, 2:21 PM

No, at least I don't think so.

PointerOverflow is checked in EmitInBoundsGEPOverflowCheck-- what's going on here is also what is going on elsewhere-- we want to only call EmitInBoundsGEPOverflowCheck on GEP's that are 'inbounds'. Here that means checking isSignedOverflowDefined().

That said, I think we may be emitting GEP checks twice on this path, will fix if so.

regehr updated this revision to Diff 57563.May 18 2016, 12:14 AM

updated patch from Will:

  • fix pointer-overflow.c
  • only check for overflows in the default address space
  • avoid a duplicate overflow check
  • address reviewer feedback about variable names
  • address reviewer feedback: remove \brief

I should add a note about testing:

  • I built SPEC CINT 2006 with pointer overflow checking, it runs normally (at 5% slowdown) and two overflows in 400.perlbench were found
  • FFmpeg builds normally and dies during testing with an overflow
  • PHP builds normally and a couple of overflows during testing
  • I haven't yet bootstrapped LLVM with pointer checking but will do that next, if I don't report back here then you can assume this worked and nothing interesting was found

I still think there's a problem here:
The check is meant to protect against creating invalid pointers via GEP. Which means that this should be done before the GEP instruction, not after. Since the GEP instruction is the one that would trigger UB.
With this patch, you're creating the pointer, and then checking it. The way I see it, you should be using something like (in EmitInBoundsGEPOverflowCheck):

auto Builder = IRBuilder(GEPExpr); // Shadowing CGF.Builder

Maybe there are better constructors of IRBuilder to use, but this one seems appropriate.
I'm not the most well-versed person in building IR, so I'd like to have Richard Smith (or another person knowing a bit more about this) double check.

Also add a CHECK-NOT line for the getelementptr intructions before your CHECK for function calls, please.
The tests are mostly testing that we're "most likely" instrumenting the proper expressions. Please add at least one test which actually checks the instructions added and the math performed.

tss added a subscriber: tss.Nov 18 2016, 6:16 PM

Is this patch missing something? I applied this patch to a recent clang git, but when compiling a test program I get:

test.c:(.text+0xb0): undefined reference to `__ubsan_handle_pointer_overflow'

According to the linked web page, it should print errors such as:

./test.c:7:19: runtime error: pointer index expression with base 0x7fffffffd3cb overflowed to 0xffffffffffffffff

But I don't see any such text in this patch.

In D20322#600582, @tss wrote:

Is this patch missing something? I applied this patch to a recent clang git, but when compiling a test program I get:

You also need to apply the patch https://reviews.llvm.org/D20323 to compiler-rt.


I'd suggest:

  • Instead of adding an Emit...Check call after creating each InBounds GEP, factor out a function that does both (EmitCheckedInBoundsGEP maybe?) (This would match how we deal with other similar cases such as EmitCheckedLValue.)
  • Either pass your just-computed value to the callback rather than the GEP value, or drop the 'inbounds' flag from the GEP instruction when you're going to emit a check for it. (In the former case, it shouldn't matter which order the GEP and check are emitted in; the value of the GEP is not used by the check, so we don't get UB at the IR level.)

As things stand, LLVM is permitted to delete the checks on the basis that the argument to the runtime library call would always be a poison value.


@dtzWill, did you ever get round to checking for repeated GEPs on this path?

regehr marked 7 inline comments as done.Nov 18 2016, 7:36 PM

I had some deadlines to get through, but now I'll get back to updating this patch, hopefully in a few days!

filcab requested changes to this revision.Nov 29 2016, 9:44 AM

Just changing the state so it doesn't show up on the reviewers' TODO lists on phab.

This revision now requires changes to proceed.Nov 29 2016, 9:44 AM