This is an archive of the discontinued LLVM Phabricator instance.

[ubsan] add pointer overflow checking
AbandonedPublic

Authored by regehr on May 17 2016, 4:56 AM.

Details

Reviewers
filcab
rsmith
Summary

This is a patch by Will Dietz that I dusted off and would like to try to get accepted. It teaches UBSan about overflowing pointers. Will wrote a blog post about using it:

https://wdtz.org/catching-pointer-overflow-bugs.html

There is a corresponding compiler-rt patch that I'll submit next.

Not really sure who to add as reviewers for this, suggestions welcome.

Diff Detail

Event Timeline

regehr updated this revision to Diff 57460.May 17 2016, 4:56 AM
regehr retitled this revision from to [ubsan] add pointer overflow checking.
regehr updated this object.
regehr added reviewers: filcab, rsmith.
regehr added a subscriber: dtzWill.
filcab edited edge metadata.May 17 2016, 8:14 AM

For EmitInBoundsGEPOverflowCheck, since it's easier to have clang build the GEP and then add the verifications, it might be better to create an IRBuilder that inserts before the GEP instructions and use that builder so your verification comes before the GEP calculation (and the UB).

lib/CodeGen/CGExpr.cpp
2798

Nit: Since you're maintaining the naming convention that is already there, you could just call this addr too. Otherwise, seems out of place.

2799

Shouldn't you check before you create the pointer?

3150

Shouldn't we be looking at the sanitizer's PointerOverflow?

3254

Since you always used the IntPtrTy s{add,mul}_with_overflow operations, won't this always be true, if Valid is true by this time?

lib/CodeGen/CodeGenFunction.h
1974

No need for \brief anymore.

filcab added inline comments.May 17 2016, 8:29 AM
lib/CodeGen/CGExpr.cpp
3254

Derp, you can add or subtract stuff. Nevermind.

This comment was removed by dtzWill.
dtzWill added inline comments.May 17 2016, 2:21 PM
lib/CodeGen/CGExpr.cpp
3150

No, at least I don't think so.

PointerOverflow is checked in EmitInBoundsGEPOverflowCheck-- what's going on here is also what is going on elsewhere-- we want to only call EmitInBoundsGEPOverflowCheck on GEP's that are 'inbounds'. Here that means checking isSignedOverflowDefined().

That said, I think we may be emitting GEP checks twice on this path, will fix if so.

regehr updated this revision to Diff 57563.May 18 2016, 12:14 AM
regehr edited edge metadata.

updated patch from Will:

  • fix pointer-overflow.c
  • only check for overflows in the default address space
  • avoid a duplicate overflow check
  • address reviewer feedback about variable names
  • address reviewer feedback: remove \brief

I should add a note about testing:

  • I built SPEC CINT 2006 with pointer overflow checking, it runs normally (at 5% slowdown) and two overflows in 400.perlbench were found
  • FFmpeg builds normally and dies during testing with an overflow
  • PHP builds normally and a couple of overflows during testing
  • I haven't yet bootstrapped LLVM with pointer checking but will do that next, if I don't report back here then you can assume this worked and nothing interesting was found

I still think there's a problem here:
The check is meant to protect against creating invalid pointers via GEP. Which means that this should be done before the GEP instruction, not after. Since the GEP instruction is the one that would trigger UB.
With this patch, you're creating the pointer, and then checking it. The way I see it, you should be using something like (in EmitInBoundsGEPOverflowCheck):

auto Builder = IRBuilder(GEPExpr); // Shadowing CGF.Builder

Maybe there are better constructors of IRBuilder to use, but this one seems appropriate.
I'm not the most well-versed person in building IR, so I'd like to have Richard Smith (or another person knowing a bit more about this) double check.

Also add a CHECK-NOT line for the getelementptr intructions before your CHECK for function calls, please.
The tests are mostly testing that we're "most likely" instrumenting the proper expressions. Please add at least one test which actually checks the instructions added and the math performed.

tss added a subscriber: tss.Nov 18 2016, 6:16 PM

Is this patch missing something? I applied this patch to a recent clang git, but when compiling a test program I get:

test.c:(.text+0xb0): undefined reference to `__ubsan_handle_pointer_overflow'

According to the linked web page, it should print errors such as:

./test.c:7:19: runtime error: pointer index expression with base 0x7fffffffd3cb overflowed to 0xffffffffffffffff

But I don't see any such text in this patch.

rsmith edited edge metadata.Nov 18 2016, 6:33 PM
In D20322#600582, @tss wrote:

Is this patch missing something? I applied this patch to a recent clang git, but when compiling a test program I get:

You also need to apply the patch https://reviews.llvm.org/D20323 to compiler-rt.

lib/CodeGen/CGExpr.cpp
2799

I'd suggest:

  • Instead of adding an Emit...Check call after creating each InBounds GEP, factor out a function that does both (EmitCheckedInBoundsGEP maybe?) (This would match how we deal with other similar cases such as EmitCheckedLValue.)
  • Either pass your just-computed value to the callback rather than the GEP value, or drop the 'inbounds' flag from the GEP instruction when you're going to emit a check for it. (In the former case, it shouldn't matter which order the GEP and check are emitted in; the value of the GEP is not used by the check, so we don't get UB at the IR level.)

As things stand, LLVM is permitted to delete the checks on the basis that the argument to the runtime library call would always be a poison value.

3150

@dtzWill, did you ever get round to checking for repeated GEPs on this path?

regehr marked 7 inline comments as done.Nov 18 2016, 7:36 PM

I had some deadlines to get through, but now I'll get back to updating this patch, hopefully in a few days!

filcab requested changes to this revision.Nov 29 2016, 9:44 AM
filcab edited edge metadata.

Just changing the state so it doesn't show up on the reviewers' TODO lists on phab.

This revision now requires changes to proceed.Nov 29 2016, 9:44 AM
vsk added a subscriber: vsk.May 9 2017, 5:29 PM

Hi @dtzWill, I have some time to work on this, and would be happy to implement the changes suggested by @rsmith. Let me know if it's all right for me to take a stab at this.

regehr added a comment.May 9 2017, 5:35 PM
In D20322#750381, @vsk wrote:

Hi @dtzWill, I have some time to work on this, and would be happy to implement the changes suggested by @rsmith. Let me know if it's all right for me to take a stab at this.

I"m not @dtzWill but I'm guessing he'll be happy for you to do this (same with me)!

In D20322#750381, @vsk wrote:

Hi @dtzWill, I have some time to work on this, and would be happy to implement the changes suggested by @rsmith. Let me know if it's all right for me to take a stab at this.

I"m not @dtzWill but I'm guessing he'll be happy for you to do this (same with me)!

Yes, that would be great! Let me know if there are any problems or questions :).

sberg added a subscriber: sberg.Jun 7 2017, 6:31 AM

This is apparently superseded by https://reviews.llvm.org/D33305 "[ubsan] Add a check for pointer overflow UB" (and should probably be abandoned or whatever the appropriate Phabricator state?).

regehr abandoned this revision.Jun 7 2017, 6:35 AM