This is an archive of the discontinued LLVM Phabricator instance.

[ubsan] Disable bounds-check for flexible array ivars
ClosedPublic

Authored by vsk on Jul 11 2016, 10:35 AM.

Details

Summary

Ubsan does not emit a bounds check when accessing flexible array members in C-style structs, e.g:

struct Foo { char arr[0]; };
char bar(struct Foo *F) { return F->arr[1]; } //< Unchecked access.

Teach ubsan to skip the bounds check for flexible array ObjC ivars as well.

This reduces the false-positive rate when instrumenting Objective-C frameworks.

A note on testing: it looks like runnable ubsan tests are typically added to compiler-rt. I chose to create a test in clang instead, because I don't want to make the ubsan tests depend on libobjc.

Diff Detail

Repository
rL LLVM

Event Timeline

vsk updated this revision to Diff 63534.Jul 11 2016, 10:35 AM
vsk retitled this revision from to [ubsan] Disable bounds-check for flexible array ivars .
vsk updated this object.
vsk added reviewers: rsmith, samsonov.
vsk added a subscriber: cfe-commits.
vsk updated this object.Oct 3 2016, 3:35 PM
vsk edited reviewers, added: aprantl; removed: samsonov.
aprantl edited edge metadata.Oct 3 2016, 8:16 PM

Not knowing the specifics of the ObjC class layout: Does this work correctly with private ivars (i.e.: Does this need an extra check that there are no extra ivars in a later @implementation)?

vsk added a comment.Oct 4 2016, 10:28 AM

The ivar list is set by all_declared_ivar_begin(), which accounts for ivars introduced by an implementation. Thanks for raising the point.

Consider this test:

@interface HasFlexibleArray {
@public char chars[0];
}
@implementation HasFlexibleArray {
@public char chars2[0];
}

I found that the current patch will sanitize accesses to 'chars', but will not sanitize accesses to 'chars2'. I think that's the desired behavior -- wdyt?

vsk updated this revision to Diff 73508.Oct 4 2016, 10:42 AM
vsk edited edge metadata.

Add tests for implementations which introduce ivars.

aprantl accepted this revision.Oct 4 2016, 11:03 AM
aprantl edited edge metadata.

Can you double-check that the memory layout is actually what we think it is by inspecting the generated IR? Otherwise this LGTM.

This revision is now accepted and ready to land.Oct 4 2016, 11:03 AM
vsk added a comment.Oct 4 2016, 1:42 PM

Thanks for the review!

I looked at the IR and confirmed that the ivars are laid out in the order they're defined, that the indirect ivar offsets make sense, and that the runtime ivar offsets match up with what we expect. E.g;

@"OBJC_IVAR_$_FlexibleArray1.chars" = global i64 0 ...
@"\01l_OBJC_$_INSTANCE_VARIABLES_FlexibleArray1" = private global { i32, i32, [1 x %struct._ivar_t] } ...

---

@"OBJC_IVAR_$_FlexibleArray2.chars" = global i64 0
@"OBJC_IVAR_$_FlexibleArray2.chars2" = global i64 0 ;; < This offset increases if we sandwich an int between chars and chars2.
@"\01l_OBJC_$_INSTANCE_VARIABLES_FlexibleArray2" = private global { i32, i32, [2 x %struct._ivar_t] } ...
This revision was automatically updated to reflect the committed changes.