Page MenuHomePhabricator

[ubsan] Detect invalid unsigned pointer index expression (clang)
ClosedPublic

Authored by vsk on Jun 5 2017, 12:54 PM.

Details

Summary

Adding an unsigned offset to a base pointer has undefined behavior if
the result of the expression would precede the base. An example from
@regehr:

int foo(char *p, unsigned offset) {
  return p + offset >= p; // This may be optimized to '1'.
}

foo(p, -1); // UB.

This patch extends the pointer overflow check in ubsan to detect invalid
unsigned pointer index expressions. It changes the instrumentation to
only permit non-negative offsets in pointer index expressions when all
of the GEP indices are unsigned.

Aside: If anyone has a better name for this type of bug, I'm all ears.
Using "unsigned pointer index expression" could be a problem, because it
sounds like an indexing expression with an _unsigned pointer_.

Diff Detail

Repository
rL LLVM

Event Timeline

vsk created this revision.Jun 5 2017, 12:54 PM
regehr edited edge metadata.Jun 5 2017, 1:04 PM

I'm afraid I don't have a better name for this.

Here's the obligatory gcc explorer link though: https://godbolt.org/g/s10h0O

rsmith edited edge metadata.Jun 5 2017, 3:34 PM

Looks good, with a couple of tweaks (and corresponding test changes).

lib/CodeGen/CGExprScalar.cpp
3910–3911 ↗(On Diff #101447)

Reverse the order of operands here; Builder will simplify or instructions with a constant RHS.

3963–3965 ↗(On Diff #101447)

Likewise reverse the operand order here....

3967 ↗(On Diff #101447)

... and here.

vsk updated this revision to Diff 101479.Jun 5 2017, 5:48 PM
vsk marked 3 inline comments as done.

Thanks for the review comments. I've changed the calls which use Builder as suggested, and fixed up the tests.

It sounds like this patch is in good shape, so I'll commit this after two days provided that there's no blocking feedback.

vsk added a comment.Jun 8 2017, 12:29 PM

I've encountered some new diagnostics when running tests on a stage2 instrumented clang, and will need more time to investigate them. Sorry for the delayed communication, I am a bit swamped this week owing to wwdc and being a build cop.

rsmith accepted this revision.Jun 8 2017, 1:11 PM
This revision is now accepted and ready to land.Jun 8 2017, 1:11 PM
dtzWill accepted this revision.Jun 12 2017, 8:40 AM

LGTM!

Sorry for missing this originally, as a perhaps interesting note:
the checks were extracted from a research prototype that worked at the IR level --where pointer itself is unsigned but the offsets (including the computed total offset) is a signed expression[1].
(we also tracked conversions and whatnot, so... well things were different. Anyway, sorry for missing this!)

This looks great to me, thanks for identifying this and putting it together!

[1] http://llvm.org/docs/GetElementPtr.html#what-happens-if-a-gep-computation-overflows

This revision was automatically updated to reflect the committed changes.
efriedma added inline comments.
cfe/trunk/lib/CodeGen/CGExprScalar.cpp
2666

This logic doesn't look quite right; what happens here if you write "p - SIZE_MAX"?