This is an archive of the discontinued LLVM Phabricator instance.

Convert some ObjC msgSends to runtime calls
ClosedPublic

Authored by pete on Nov 16 2015, 5:15 PM.

Details

Summary

It is faster to directly call the ObjC runtime for methods such as retain/release instead of sending a message to those functions.

This patch adds support for converting messages to retain/release/alloc/autorelease to their equivalent runtime calls.

Tests included for the positive case of applying this transformation, negative tests that we ensure we only convert "alloc" to objc_alloc, not "alloc2", and also a driver test to ensure we enable this only for supported runtime versions.

Diff Detail

Repository
rL LLVM

Event Timeline

pete updated this revision to Diff 40365.Nov 16 2015, 5:15 PM
pete retitled this revision from to Convert some ObjC msgSends to runtime calls.
pete updated this object.
pete added a reviewer: rjmccall.
pete added a subscriber: cfe-commits.
pete updated this revision to Diff 40426.Nov 17 2015, 1:20 PM

Updated the driver test to ensure that the option is set for tvOS and watchOS.

rjmccall added inline comments.Nov 17 2015, 1:36 PM
include/clang/Basic/LangOptions.def
194 ↗(On Diff #40365)

I think this can just be a code-gen option.

Also, please use the description "Replace certain message sends with calls to ObjC runtime entrypoints".

include/clang/Basic/ObjCRuntime.h
179 ↗(On Diff #40365)

This isn't really an appropriate comment. Please use:

Does this runtime provide ARC entrypoints that are likely to be faster than
an ordinary message send of the appropriate selector?

The ARC entrypoints are guaranteed to be equivalent to just sending the
corresponding message.  If the entrypoint is implemented naively as just a
message send, using it is a trade-off: it sacrifices a few cycles of overhead
to save a small amount of code.  However, it's possible for runtimes to detect
and special-case classes that use "standard" retain/release behavior; if that's
dynamically a large proportion of all retained objects, using the entrypoint will
also be faster than using a message send.

When this method returns true, Clang will turn non-super message sends of
certain selectors into calls to the correspond entrypoint:
  retain => objc_retain
  release => objc_release
182 ↗(On Diff #40365)

Would you mind asking our runtime folks about the fragile runtime? We do provide these entrypoints there.

lib/CodeGen/CGObjCMac.cpp
1871 ↗(On Diff #40365)

Hmm. It would be better if you could implement this in CGObjC instead of CGObjCMac so that it automatically works on all runtimes.

Also, you need to gate this on !IsSuper. Please add a test for that case.

1876 ↗(On Diff #40365)

You really ought to have a separate query on the runtime for whether this is okay, since objc_alloc isn't one of the standard ARC entrypoints.

I'm not sure what that ends up meaning for the code-gen option. It makes sense to just have one command-line option: we should have the target's default rules for rewriting message sends, there should be a runtime option to suppress it for people who need to (e.g. because they're writing a runtime), end of story. Maybe what you should do is have the driver pass down the suppression option, if indeed it was provided last, and then have the frontend enable stuff based on the target runtime if the suppression option wasn't given; that is, -fno-objc-X would be the -cc1 flag, not -fobjc-X.

test/CodeGenObjC/convert-messages-to-runtime-calls.m
38 ↗(On Diff #40365)

You should also test that sending "alloc" with a dynamic receiver work. This includes something like [self alloc] in a class method.

I hope that it's not presumptuous of me to inquire but I was wondering if the intent of this patch is to optimize calls to RR methods (and alloc) in non-ARC code? Would I be correct in assuming that clang will already emit direct calls to relevant RR runtime functions when ARC is enabled?

rjmccall edited edge metadata.Nov 17 2015, 6:57 PM

I hope that it's not presumptuous of me to inquire but I was wondering if the intent of this patch is to optimize calls to RR methods (and alloc) in non-ARC code? Would I be correct in assuming that clang will already emit direct calls to relevant RR runtime functions when ARC is enabled?

Correct. The patch turns ObjC message sends of -retain, -release, and +alloc into calls to the runtime functions we already use in ARC.

This is largely irrelevant in ARC because it's illegal to write an ObjC message send of -retain or -release. The change to +alloc would still apply, however.

rjmccall added inline comments.Nov 18 2015, 4:21 PM
include/clang/Basic/ObjCRuntime.h
182 ↗(On Diff #40426)

I went ahead and asked Greg, and he agreed that it's best to not enable this on the Mac fragile runtime. The functions exist, but they aren't profitable to call, other than a small code-size decrease.

pete updated this revision to Diff 40703.Nov 19 2015, 1:55 PM
pete edited edge metadata.

Updated with the following changes:

  • Removed the -f-objc-X option and made the -fno-objc-X option be codegen only.
  • Updated all the comments to be what John suggested.
  • Added a method (shouldUseRuntimeFunctionsForAlloc) to single out the alloc case with its own checks.
  • Added check and test case for IsSuper
  • Added test for retain of self.
  • The driver test now only checks for whether the flag is forwarded correctly as the target checks are done during codegen.
  • The CodeGen test now includes all the variants for macos, fragile, iOS, tvos and watchOS.

Given we now have 2 runtime methods to check, I tried to reduce duplication by splitting alloc out of the switch. The alternative is to check for shouldUseARCFunctionsForRetainRelease() in each of the retain/release/autorelease cases. I'm fine with either solution.

I wasn't able to move the code to CGObjC. Unfortunately there's no common EmitMessageSend method which is shared by CGObjCMac and CGObjCGNU. I could move this to a helper in CGObjC and call the helper from CGObjCMac and CGObjCGNU if thats preferable. Perhaps we should do that in a follow up only if the CGObjCGNU valid targets were ever true for the versions being checked in shouldUseARCFunctionsForRetainRelease.

pete updated this revision to Diff 40705.Nov 19 2015, 2:33 PM

After chatting with John offline, he mentioned that this code could be shared with all ObjC CG's if I put it in EmitObjCMessageExpr.

This moves it here and also checks that we don't do the retain/release call->msgSend optimization when GC is enabled. There's also a test for that case.

By "in CGObjC", I mean you should be able to do it in CodeGenFunction::EmitObjCMessageExpr. Maybe put it in a separate function like

static llvm::Value *tryGenerateSpecializedMessageSend(...)

and then do something like

} else if (llvm::Value *SpecializedResult = tryGenerateSpecializedMessageSend(...)) {
  result = RValue::get(SpecializedResult);
} else {

right before the call to GenerateMessageSend.

include/clang/Basic/ObjCRuntime.h
176 ↗(On Diff #40703)

The vertical whitespace is good, but please continue the doc comments across it.

207 ↗(On Diff #40703)

You can just say "alloc" here. :)

include/clang/Driver/Options.td
896 ↗(On Diff #40703)

It's fine to have both the "yes" and "no" options; I just meant that only the "no" option needs to be a cc1 option.

lib/CodeGen/CGObjCMac.cpp
1874 ↗(On Diff #40703)

I'd just write this as a switch over the method family. The runtime check isn't unreasonable to duplicate into each of the cases; also, in the dominant case it will be true, so we're not really saving any work in practice by doing it early. It's the method family check that will usually avoid the rest of the work.

We don't really care if we have a method declaration; if there's no method, you can just ask the selector for its method family. That won't be cached, though, and it's implemented with string comparisons, so please be sure to avoid calling it multiple times.

Please loosen the type restrictions here. You shouldn't need to restrict the receiver type, and any ObjC pointer type should be fine for the return type. Of course, you'll need to add bitcasts to/from i8*, but that's easy enough.

1883 ↗(On Diff #40703)

Oh. Additional check: don't do this if GC != NonGC.

1897 ↗(On Diff #40703)

This has precise lifetime, actually. Not that it really matters, since the ARC optimizer should be disabled, but it's best to be correct.

pete updated this revision to Diff 40719.Nov 19 2015, 4:57 PM

Thanks for all the feedback!

I've addressed all of your previous feedback and also applied clang-format to the whole patch as some parts were looking questionable.

BTW, each of the implementations of EmitObjC* ultimately called to emitARCValueOperation which handled the bit cast insertion when the dest type was not an id. I've added a check for this in the test case, and verified that the IR looked good.

Let me know what you think. Thanks!

The casts done by emitARCValueOperation will handle the input, but they don't quite handle the result properly. The right test case here is a method named "retain" that's declared to return something completely unrelated to its receiver type, e.g.

@class A;
@interface B
- (A*) retain;
@end

You should also add a test case for unusual return types that your mechanism just can't support and that need to go through the ordinary message-send emission, like returning a float.

test/CodeGenObjC/convert-messages-to-runtime-calls.m
62 ↗(On Diff #40719)

You already test this case. If you make retain_self a class method, this will be a class message; but it should still be okay to use objc_retain.

pete updated this revision to Diff 40790.Nov 20 2015, 9:16 AM

Added a couple of tests for retain returning types other than id. Returning a pointer should still be converted to a call, while returning a non-pointer such as float will get a message instead.

I walked through the code in the debugger to check on the return cast. Turns out it is handled at the very end of emitARCValueOperation as follows:

// Cast the result back to the original type.
return CGF.Builder.CreateBitCast(call, origType);
In D14737#293967, @pete wrote:

Added a couple of tests for retain returning types other than id. Returning a pointer should still be converted to a call, while returning a non-pointer such as float will get a message instead.

I walked through the code in the debugger to check on the return cast. Turns out it is handled at the very end of emitARCValueOperation as follows:

// Cast the result back to the original type.
return CGF.Builder.CreateBitCast(call, origType);

Right, that'll cast back to the original type. That will be the type of the receiver of the message, which is not necessarily the type of the result of the message.

pete added a comment.Mar 11 2016, 2:45 PM

Hi John

Sorry, getting back to this after way too long!

In D14737#293967, @pete wrote:

Added a couple of tests for retain returning types other than id. Returning a pointer should still be converted to a call, while returning a non-pointer such as float will get a message instead.

I walked through the code in the debugger to check on the return cast. Turns out it is handled at the very end of emitARCValueOperation as follows:

// Cast the result back to the original type.
return CGF.Builder.CreateBitCast(call, origType);

Right, that'll cast back to the original type. That will be the type of the receiver of the message, which is not necessarily the type of the result of the message.

I stepped through this one in the debugger to make sure I had it right.

So the reason the bit cast ends up not being needed is because we restricted this optimization to cases where the result type "isObjCObjectPointerType()" which conveniently ends up begin i8* and is exactly the same type as the message receiver. I guess if either the receiver type or the result type wasn't represented as i8* then the IRBuilder would have crashed.

If we permitted converting say messaging (int*)retain to a call to objc_retain then the bit cast would be needed. Would you like me to permit this change on functions returning anything other than isObjCObjectPointerType(), eg, change that to "isAnyPointerType()"?

Cheers,
Pete

In D14737#373532, @pete wrote:

I stepped through this one in the debugger to make sure I had it right.

So the reason the bit cast ends up not being needed is because we restricted this optimization to cases where the result type "isObjCObjectPointerType()" which conveniently ends up begin i8* and is exactly the same type as the message receiver. I guess if either the receiver type or the result type wasn't represented as i8* then the IRBuilder would have crashed.

If we permitted converting say messaging (int*)retain to a call to objc_retain then the bit cast would be needed. Would you like me to permit this change on functions returning anything other than isObjCObjectPointerType(), eg, change that to "isAnyPointerType()"?

The IR type for an AST type satisfying isObjCObjectPointerType() isn't always i8*; in particular, that's not true when the receiver has a concrete class type.

A test case here would be something like:

@interface Foo

  • (id) retain;

@end

...

id test(Foo *foo) {

return [foo retain];

}

Here the receiver type will be something like %struct.Foo* and the expected return type will be i8*.

pete updated this revision to Diff 50651.Mar 14 2016, 2:42 PM

Thanks for the example John. I understand what you mean now.

I've added this piece to the test case which verifies that the following IR has the correct bit cast in it. Similarly added cases for alloc and autorelease.

@class A;
@interface B

  • (A*) retain;

@end

A* test_retain_class_ptr(B *b) {

return [b retain];

}

define %1* @test_retain_class_ptr(%2* %b) #0 {
entry:

%b.addr = alloca %2*, align 8
store %2* %b, %2** %b.addr, align 8
%0 = load %2*, %2** %b.addr, align 8
%1 = bitcast %2* %0 to i8*
%2 = call i8* @objc_retain(i8* %1) #2
%3 = bitcast i8* %2 to %1*
ret %1* %3

}

Can you find where that bitcast is being added? I know that different parts of IRGen are differently sensitive to types — it's possible that the return code is one of those more-permissive places.

pete added a comment.Mar 15 2016, 11:35 AM

Can you find where that bitcast is being added? I know that different parts of IRGen are differently sensitive to types — it's possible that the return code is one of those more-permissive places.

Sure, will do.

I should say that the bit cast here is my doing. I should have said that I added code to emitARCValueOperation which optionally takes a return type to cast to, instead of always casting to the receiver type as you noted it was doing before. But as you point out, even without that change, the IR is still getting bit casts when needed.

Ah, okay, if you changed it to cast explicitly, that's all I was concerned about.

pete added a comment.Mar 15 2016, 2:40 PM

Ah, okay, if you changed it to cast explicitly, that's all I was concerned about.

Cool. Thanks. Any other concerns or does this look good to you?

This revision was automatically updated to reflect the committed changes.