This is an archive of the discontinued LLVM Phabricator instance.

[ubsan] Check implicit casts in ObjC for-in statements
ClosedPublic

Authored by vsk on Dec 13 2019, 2:14 PM.

Details

Summary

Check that the implicit cast from id used to construct the element
variable in an ObjC for-in statement is valid.

This check is included as part of a new objc-cast sanitizer, outside
of the main 'undefined' group, as (IIUC) the behavior it's checking for
is not technically UB.

The check can be extended to cover other kinds of invalid casts in ObjC.

Partially addresses: rdar://12903059, rdar://9542496

Diff Detail

Event Timeline

vsk created this revision.Dec 13 2019, 2:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptDec 13 2019, 2:14 PM
delcypher added inline comments.Dec 13 2019, 9:22 PM
compiler-rt/lib/ubsan/ubsan_value.cpp
32

The compiler-rt codebase tends to use SANITIZER_MAC macro (defined to be 1 if Apple otherwise it's 0) rather than __APPLE__.

40

You might want some sort of lock here (or atomic variable) to ensure we don't race and try to dlopen() multiple times.

delcypher added inline comments.Dec 13 2019, 10:01 PM
clang/lib/Driver/ToolChain.cpp
953 ↗(On Diff #233881)

SanitizerKind::ObjCCast doesn't seem to fit the comment above. It is platform dependent (only really works for Apple platforms) and it does require runtime support (i.e. the Objective-C runtime).

vsk updated this revision to Diff 234136.Dec 16 2019, 1:35 PM
vsk marked 3 inline comments as done.

Address review feedback.

clang/lib/Driver/ToolChain.cpp
953 ↗(On Diff #233881)

The runtime dependency is optional, and there's nothing inherently Apple-specific about this check. However, perhaps it's best not to inadvertently advertise support for the GNU objc runtime before it's in tree.

compiler-rt/lib/ubsan/ubsan_value.cpp
32

I see. That seems problematic, as it makes it tricky to add macOS-specific (or iOS-specific) functionality down the road. I don't think SANITIZER_MAC should be defined unless TARGET_OS_MACOS is true.

40

Yes, thanks.

vsk updated this revision to Diff 234139.Dec 16 2019, 1:40 PM

Avoid a static initializer.

vsk updated this revision to Diff 234142.Dec 16 2019, 1:51 PM

Ignore an objc-cast report at a given SourceLocation after it's been reported once.

delcypher added inline comments.Dec 16 2019, 5:17 PM
compiler-rt/lib/ubsan/ubsan_value.cpp
32

Sorry I should clarify. SANITIZER_MAC is poorly named but it is defined to be 1 for Apple platforms and 0. I'm just pointing out the convention that exists today. You're absolutely right that we might want to do different things for different Apple platforms but I don't think we want to start doing a mass re-name until arm64e and arm64_32 support are completed landed in llvm's master.

vsk added inline comments.Dec 17 2019, 11:24 AM
compiler-rt/lib/ubsan/ubsan_value.cpp
32

I think 'SANITIZER_MAC' is confusing, and my preference would be to not use it. __APPLE__ seems clearer to me, and (IIUC) the plan is to replace usage of 'SANITIZER_MAC' with it down the line anyway?

delcypher added inline comments.Dec 20 2019, 2:12 PM
compiler-rt/lib/ubsan/ubsan_value.cpp
32

There aren't any plans to do it right now but cleaning this up seems like a reasonable thing to do. If you take a look at compiler-rt/lib/sanitizer_common/sanitizer_platform.h you'll see that we actually have SANITIZER_<platform> for the other platforms that seems to be set as you'd expect. It's just SANITIZER_MAC that's badly named.

I've filed a radar for this issue (rdar://problem/58124919). So if you prefer to use __APPLE__ could you leave a comment with something like

// TODO(dliew): Don't use unclear `SANITIZER_MAC` here. Instead wait for its replacement rdar://problem/58124919.

Note the TODO(<name>): style is enforced by the sanitizer specific linter so you have to put a name there.

vsk updated this revision to Diff 238393.Jan 15 2020, 4:39 PM
vsk marked an inline comment as done.
vsk added inline comments.
compiler-rt/lib/ubsan/ubsan_value.cpp
32

Thanks, done!

vsk added a comment.Feb 4 2020, 11:11 AM

Friendly ping.

@vsk The compiler-rt side seems fine to me but I'm not very familiar with the Clang side of things. @arphaman @jfb @rjmccall any thoughts?

+ Erik and Akira for IRgen expertise.

vsk updated this revision to Diff 271508.Jun 17 2020, 3:44 PM

Rebase.

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 17 2020, 3:44 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
ahatanak added inline comments.Jun 26 2020, 1:15 AM
clang/lib/CodeGen/CGObjC.cpp
1855

Can you use GetUnarySelector here?

1859

It looks like Args should be cleared before adding the argument. Or should the argument be added to IsKindOfClassArg?

vsk marked 2 inline comments as done.Jun 26 2020, 12:25 PM
vsk added inline comments.
clang/lib/CodeGen/CGObjC.cpp
1859

Thanks for catching this. The argument should be added to IsKindOfClassArg. This wasn't working correctly before: a leftover %struct.__objcFastEnumerationState* was included in the message-send. I've added a runtime test to ensure that the diagnostic does not fire when the implicit cast is correct.

vsk updated this revision to Diff 273815.Jun 26 2020, 12:29 PM

Use the IsKindOfClass CallArgList when emitting the check, and add a runtime test to ensure that an objc-cast diagnostic is not emitted on correct code.

Thank you, LGTM.

This revision was not accepted when it landed; it landed in state Needs Review.Jul 13 2020, 3:11 PM
This revision was automatically updated to reflect the committed changes.