Details

Reviewers

dexonsmith
ahatanak
erik.pilkington
arphaman

Commits

rG2c0e875c2398: [objc_direct] fix codegen for mismatched Decl/Impl return types
rG6eb969b7c5b5: [objc_direct] fix codegen for mismatched Decl/Impl return types

Summary

For non-direct methods, the codegen uses the type of the Implementation.
Because Objective-C rules allow some differences between the Declaration
and Implementation return types, when the Implementation is in this
translation unit, the type of the Implementation should be preferred
over that of the Declaration when emitting the Function.

Radar-Id: rdar://problem/58797748

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MadCoder created this revision. Jan 22 2020, 8:53 AM

Herald added a subscriber: cfe-commits. Jan 22 2020, 8:53 AM

Why isn't a similar dance needed for non-direct methods?

In D73208#1835051, @dexonsmith wrote:

Why isn't a similar dance needed for non-direct methods?

Because non-direct methods do not need an llvm::Function to be synthesized at the call site. Direct methods do, and they form one with the type of the declaration they see. That same llvm::Function is then used when you CodeGen the Implementation, so if there's a mismatch, sadness ensues: the LLVM IR verifier will notice the discrepancy between the declared return type of the function and the actual types coming out of the ret codepaths.

Regular Objective-C methods use the _implementation_ types for the codegen (the declaration(s) aren't even consulted), and I want to stick to what Objective-C does as much as I can.

(As a data point: if you use Objective-C types with C functions, the type of the first declaration seen is used instead.)
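
To make the failure mode concrete, here is a toy C++ model of the caching behaviour described above. It is only a sketch under stated assumptions: `ToyFunction`, `ToyCodeGen`, `emitDirectCall` and `emitImplementation` are hypothetical names rather than clang's API, and return types are represented as plain strings.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for an emitted function (NOT llvm::Function):
// only the return type matters for this illustration.
struct ToyFunction {
  std::string returnType;
};

struct ToyCodeGen {
  // Hypothetical cache: selector -> function created so far.
  std::map<std::string, ToyFunction> directMethods;

  // A call site of a direct method needs a function immediately, so it
  // creates one from the declaration's type unless one already exists.
  ToyFunction &emitDirectCall(const std::string &selector,
                              const std::string &declType) {
    return directMethods.emplace(selector, ToyFunction{declType})
        .first->second;
  }

  // Emitting the body reuses the cached function. The result plays the
  // role of the verifier: true if the function's type matches the body.
  bool emitImplementation(const std::string &selector,
                          const std::string &implType) {
    ToyFunction &fn =
        directMethods.emplace(selector, ToyFunction{implType}).first->second;
    return fn.returnType == implType;
  }
};
```

In this model, emitting a call before the implementation pins the declared type, so a later implementation with a different return type fails the check; with no earlier call site, the implementation's type is used and the check passes.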

Okay, that makes sense to me.

Another solution would be to change IRGen for the implementation: if the declaration already exists (getFunction), do a bitcast + RAUW dance to fix it up (and update the DirectMethodDefinitions table). WDYT?

I didn't want to do that because that would mean that the type used for the implementation would depart from dynamic Objective-C methods, and it feels like it shouldn't. Hence I took this option.

I think we're talking past each other. The idea is to check the type when generating the implementation; if it's not correct, you fix it (you need to update existing uses to go through a bitcast). So the type used for the implementation would match dynamic methods.

Ah, I see. I don't know how to do that; I couldn't figure out how to fix the function declaration. If you want to give it a stab, I'd love it.

I missed that question until now. This uses llvm::Value::replaceAllUsesWith; if you git grep through clang you'll find examples.

Here's one from CodeGenFunction::AddInitializerToStaticVarDecl:

// The initializer may differ in type from the global. Rewrite
// the global to match the initializer.  (We have to do this
// because some types, like unions, can't be completely represented
// in the LLVM type system.)
if (GV->getType()->getElementType() != Init->getType()) {
  llvm::GlobalVariable *OldGV = GV;

  GV = new llvm::GlobalVariable(CGM.getModule(), Init->getType(),
                                OldGV->isConstant(),
                                OldGV->getLinkage(), Init, "",
                                /*InsertBefore*/ OldGV,
                                OldGV->getThreadLocalMode(),
                         CGM.getContext().getTargetAddressSpace(D.getType()));
  GV->setVisibility(OldGV->getVisibility());
  GV->setDSOLocal(OldGV->isDSOLocal());
  GV->setComdat(OldGV->getComdat());

  // Steal the name of the old global
  GV->takeName(OldGV);

  // Replace all uses of the old global with the new global
  llvm::Constant *NewPtrForOldDecl =
  llvm::ConstantExpr::getBitCast(GV, OldGV->getType());
  OldGV->replaceAllUsesWith(NewPtrForOldDecl);

  // Erase the old global, since it is no longer used.
  OldGV->eraseFromParent();
}

This example is analogous to your case. The pattern is:

1. Create the new thing (with no name, or with a temporary/bad name).
2. Steal the name of the old thing.
3. Bitcast from the new thing to the old type.
4. RAUW to replace all uses of the old thing with the bitcast-of-new-thing from (3).
5. Delete the old thing.

The RAUW isn't going to fix up the DenseMap you have on the side though, so you'll need to handle that explicitly; likely during step (4) makes sense.
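
The five steps, plus the side-table fix-up, can be sketched with a toy model. This is an illustration in plain C++ rather than LLVM code: `ToyFunction`, `ToyCallSite`, `ToyModule` and `replaceWithType` are hypothetical stand-ins, a boolean flag stands in for the bitcast, and the string-keyed map only plays the role of the DenseMap mentioned above.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-ins (NOT LLVM classes): a function has a name and a return type.
struct ToyFunction {
  std::string name;
  std::string returnType;
};

// A use of a function. castToOldType models a bitcast back to the old type.
struct ToyCallSite {
  ToyFunction *callee;
  bool castToOldType;
};

struct ToyModule {
  std::vector<std::unique_ptr<ToyFunction>> functions;
  std::vector<ToyCallSite> callSites;
  // Hypothetical side table (selector -> function); replacing uses does
  // not touch it, so it has to be updated by hand.
  std::map<std::string, ToyFunction *> directMethods;

  ToyFunction *create(const std::string &name, const std::string &type) {
    functions.push_back(
        std::unique_ptr<ToyFunction>(new ToyFunction{name, type}));
    return functions.back().get();
  }
};

ToyFunction *replaceWithType(ToyModule &m, const std::string &selector,
                             const std::string &newType) {
  ToyFunction *oldFn = m.directMethods.at(selector);
  if (oldFn->returnType == newType)
    return oldFn; // Types already agree; nothing to fix up.

  // (1) Create the new thing with no name.
  ToyFunction *newFn = m.create("", newType);
  // (2) Steal the name of the old thing.
  newFn->name.swap(oldFn->name);
  // (3) + (4) Point every use of the old thing at the new one, via a cast.
  for (ToyCallSite &use : m.callSites) {
    if (use.callee == oldFn) {
      use.callee = newFn;
      use.castToOldType = true;
    }
  }
  // The side table is not a "use", so update it explicitly.
  m.directMethods[selector] = newFn;
  // (5) Delete the old thing.
  m.functions.erase(
      std::remove_if(m.functions.begin(), m.functions.end(),
                     [oldFn](const std::unique_ptr<ToyFunction> &f) {
                       return f.get() == oldFn;
                     }),
      m.functions.end());
  return newFn;
}
```

The early return keeps the common case cheap: when the existing function already has the right type, nothing is created or rewritten.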

@dexonsmith here, I still hook the same method but do it in a more LLVM-approved way ;)

It works with minimal perf impact because the only case where we call GenerateDirectMethod with an Implementation is when we're about to codegen the body, so building against a header without seeing the @implementation will not be affected perf-wise.

damn you tabs!

LGTM, with one style nitpick.

clang/lib/CodeGen/CGObjCMac.cpp
4049–4051	I think the LLVM style is to leave out these braces.

This revision is now accepted and ready to land. Jan 30 2020, 5:13 PM

MadCoder updated this revision to Diff 241613. Jan 30 2020, 5:18 PM

MadCoder marked an inline comment as done.

Closed by commit rG6eb969b7c5b5: [objc_direct] fix codegen for mismatched Decl/Impl return types (authored by MadCoder). Jan 30 2020, 6:23 PM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

[objc_direct] fix codegen for mismatched Decl/Impl return types
ClosedPublic

Revision Contents

Diff 241623

clang/lib/CodeGen/CGObjCMac.cpp

clang/test/CodeGenObjC/direct-method-ret-mismatch.m