Page MenuHomePhabricator

Change CGObjC to use objc intrinsics instead of runtime methods
ClosedPublic

Authored by pete on Dec 17 2018, 6:12 PM.

Details

Summary

This is the final commit to make clang emit the intrinsics so that the ARC optimizer will always see intrinsics instead of runtime method calls.

The changes are mostly mechanical, just using getIntrinsic instead of creating a new method.

The tricky cases are:

  • Old runtimes get weak_external declarations, so i've set the linkage on the intrinsics
  • retain/release should be nonlazybind, which will now be handed in llvm itself when converted from intrinsic to runtime method for ISel
  • objc_autoreleasePoolPop can throw so needs to invoke the runtime method, or call the intrinsic, as needed

Diff Detail

Repository
rC Clang

Event Timeline

pete created this revision.Dec 17 2018, 6:12 PM
rjmccall accepted this revision.Dec 17 2018, 9:43 PM

You're making intrinsics with weak_external linkage? I feel like that's going to be unnecessarily awkward in the future, but okay.

This revision is now accepted and ready to land.Dec 17 2018, 9:43 PM
pete added a comment.Dec 17 2018, 9:45 PM

You're making intrinsics with weak_external linkage? I feel like that's going to be unnecessarily awkward in the future, but okay.

Yeah... i was a little surprised that was possible, but seems to work just fine.

Thanks for the quick review! Will land it first thing tomorrow.

This revision was automatically updated to reflect the committed changes.

After some bisection, it appears that this is the revision that introduced the regression in the GNUstep Objective-C runtime test suite that I reported on the list a few weeks ago. In this is the test (compiled with -fobjc-runtime=gnustep-2.0 -O3 and an ELF triple):

https://github.com/gnustep/libobjc2/blob/master/Test/AssociatedObject.m

After this change, Early CSE w/ MemorySSA is determining that the second load of deallocCalled is redundant. The code goes from:

  %7 = load i1, i1* @deallocCalled, align 1
  br i1 %7, label %8, label %9

; <label>:8:                                      ; preds = %0
  call void @__assert(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @__func__.main, i64 0, i64 0), i8* getelementptr inbounds ([27 x i8], [27 x i8]* @.str, i64 0, i64 0), i32 26, i8* getelementptr inbounds ([15 x i8], [15 x i8]* @.str.1, i64 0, i64 0)) #5
  unreachable

; <label>:9:                                      ; preds = %0
  call void @llvm.objc.autoreleasePoolPop(i8* %1)
  %10 = load i1, i1* @deallocCalled, align 1
  br i1 %10, label %12, label %11

; <label>:11:                                     ; preds = %9
  call void @__assert(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @__func__.main, i64 0, i64 0), i8* getelementptr inbounds ([27 x i8], [27 x i8]* @.str, i64 0, i64 0), i32 29, i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str.2, i64 0, i64 0)) #5
  unreachable

to:

  %7 = load i1, i1* @deallocCalled, align 1
  br i1 %7, label %8, label %9

; <label>:8:                                      ; preds = %0
  call void @__assert(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @__func__.main, i64 0, i64 0), i8* getelementptr inbounds ([27 x i8], [27 x i8]* @.str, i64 0, i64 0), i32 26, i8* getelementptr inbounds ([15 x i8], [15 x i8]* @.str.1, i64 0, i64 0)) #5
  unreachable

; <label>:9:                                      ; preds = %0
  call void @llvm.objc.autoreleasePoolPop(i8* %1)
  br i1 %7, label %11, label %10

; <label>:10:                                     ; preds = %9
  call void @__assert(i8* getelementptr inbounds ([5 x i8], [5 x i8]* @__func__.main, i64 0, i64 0), i8* getelementptr inbounds ([27 x i8], [27 x i8]* @.str, i64 0, i64 0), i32 29, i8* getelementptr inbounds ([14 x i8], [14 x i8]* @.str.2, i64 0, i64 0)) #5
  unreachable

Later optimisations then determine that, because the assert does not return, the only possible value for %7 is false and cause the second assert to fire unconditionally.

It appears that we are not correctly modelling the side effects of the llvm.objc.autoreleasePoolPop intrinsic, but it's not entirely clear why not. The same test compiled for the macos runtime does not appear to exhibit the same behaviour. The previous revision, where we emitted a call to objc_autoreleasePoolPop and not the intrinsic worked correctly, but with this change the optimisers are assuming that no globals can be modified across an autorelease pool pop operation (at least, in some situations).

Looking at the definition of the intrinsic, I don't see anything wrong, so I still suspect that there is a MemorySSA bug that this has uncovered, rather than anything wrong in this series of commits. Any suggestions as to where to look would be appreciated.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2019, 7:47 AM