[ubsan] Add a check for pointer overflow UB
ClosedPublic

Authored by vsk on May 17 2017, 5:39 PM.

Details

Summary

Check pointer arithmetic for overflow.

For some more background on this check, see:

https://wdtz.org/catching-pointer-overflow-bugs.html
https://reviews.llvm.org/D20322

Patch by Will Dietz and John Regehr!

This version of the patch is different from the original in a few ways:

  • It introduces the EmitCheckedInBoundsGEP utility which inserts checks when the pointer overflow check is enabled.
  • It does some constant-folding to reduce instrumentation overhead.
  • It does not check some GEPs in CGExprCXX. I'm not sure that inserting checks here, or in CGClass, would catch many bugs.

Possible future directions for this check:

  • Introduce CGF.EmitCheckedStructGEP, to detect overflows when accessing structures.

Testing: Apart from the added lit test, I ran check-llvm and check-clang
with a stage2, ubsan-instrumented clang. I found one overflow (see:
https://reviews.llvm.org/D33149).

Diff Detail

Repository
rL LLVM
vsk created this revision.May 17 2017, 5:39 PM
filcab added inline comments.May 18 2017, 8:34 AM
lib/CodeGen/CGExprScalar.cpp
3854 ↗(On Diff #99378)

You're creating the GEP first (possibly triggering UB), and then checking ("after" UB). Shouldn't you put the checking instructions before the GEP?

3948 ↗(On Diff #99378)

Do we want an extra test for TotalOffset being a constant + not overflowing? (Not strictly need, but you've been working on avoiding __ubsan() calls :-) )

vsk added a comment.May 18 2017, 9:38 AM

Thanks for the comments, responses inline --

lib/CodeGen/CGExprScalar.cpp
3854 ↗(On Diff #99378)

The checking instructions are not users of the (possibly poisoned) GEP value so inserting the checks after the GEP should be OK. FWIW ubsan's scalar range checks do the same thing and I haven't seen an issue there.

I do have a concern about the IR constant folder reducing GEPs to Undef, but I don't think this affects any interesting cases, and we handle that below by bailing out if the GEP is a Constant.

Does that clear up your concerns?

3948 ↗(On Diff #99378)

If the total offset is a constant and doesn't overflow, there's no need for the select instruction below, but we'd still need to emit a pointer overflow check. Teaching clang to omit the select is a little messy, and I don't think we'd save very much, so I'm leaning against doing it.

vsk updated this revision to Diff 100475.May 26 2017, 2:34 PM
vsk edited the summary of this revision. (Show Details)

Ping.

LGTM!

I've built quite a bit with this (ground-up Linux distribution) which attests to it being fairly robust (no crashing or new errors not experienced with unpatched clang) and the bugs found so far are all true positives (few of which were caught and reported last time around).

For what it's worth, this needs to be adjusted slightly to work with latest Clang. Changes were straightforward as I recall, updated patch attached.

dtzWill accepted this revision.Mon, May 29, 10:44 AM
This revision is now accepted and ready to land.Mon, May 29, 10:44 AM
vsk added a comment.Wed, May 31, 2:52 PM

Thanks for the review! I've rebased the patch and plan on checking it in tomorrow. At the moment I'm getting some additional test coverage (running check-libcxx and testing more backends).

This revision was automatically updated to reflect the committed changes.
sberg added a subscriber: sberg.Wed, Jun 7, 6:50 AM

Just a heads up that I ran into an arguably somewhat unexpected instance of this with (a copy of the Graphite project included in) LibreOffice, see the commit message of https://cgit.freedesktop.org/libreoffice/core/commit/?id=681b4a49d797996229513d3e842d2a431030730a "external/graphite: Avoid -fsanitize=pointer-overflow" for details. (In short, "ptrdiff_t d = ...; T * p += d / sizeof(T);" can trigger an overflow warning when d is negative.)

regehr added a comment.Wed, Jun 7, 8:27 AM

I'm taking a look. For reference here's the test program I'm using.

#include <stddef.h>

typedef struct {
  int x[2];
} T;

int main(void) {
  T f;
  T *p = &f;
  ptrdiff_t d = -3293184;
  p += d / sizeof(T);
  return 0;
}
regehr added a comment.Wed, Jun 7, 8:51 AM

Sorry, let's go with this example instead, which makes it clear that the program is attempting to do something completely sensible:

#include <stddef.h>

typedef struct {
  int x[2];
} T;

int main(void) {
  T f[1000];
  T *p = &f[500];
  ptrdiff_t d = -10;
  p += d / sizeof(T);
  return 0;
}
regehr added a comment.Wed, Jun 7, 9:29 AM

Well, my second program should subtract a multiple of sizeof(T). But anyway, my view is that this is a real overflow and a nasty consequence of the unsigned size_t and the usual arithmetic conversions and I don't think we want to try to poke a hole in UBSan to allow this idiom unless it turns out to be extremely common.

I think it would be better style to cast the sizeof() to a ptrdiff_t rather than to an int, but as long as d is a ptrdiff_t this won't matter.

vsk added a comment.Mon, Jun 12, 5:02 PM

@sberg I agree with @regehr's analysis, and do think that this is a real overflow. Once D34121 lands, we will report this issue in a better way:

runtime error: addition of unsigned offset to 0x7fff59dfe990 overflowed to 0x7fff59dfe980