Page MenuHomePhabricator

Please use GitHub pull requests for new patches. Avoid migrating existing patches. Phabricator shutdown timeline

GNUstep Objective-C ABI version 2

Authored by theraven on Apr 25 2018, 5:17 AM.



This includes initial support for the (hopefully final) updated Objective-C ABI, developed here:

It also includes some cleanups and refactoring from older GNU ABIs.

This is not yet in a state that's ready to commit (for example, it's currently ELF-only, but doesn't include driver changes to downgrade to the 1.x ABI for non-ELF targets until other binary formats are supported), but it's probably in a state where it could benefit from review.

Diff Detail

rC Clang

Event Timeline

theraven created this revision.Apr 25 2018, 5:17 AM

Are you asking for a code review or a design review of the ABI? The second would be much easier to do from a design document.


There's an isOptional() method now.

Are you asking for a code review or a design review of the ABI?

I don't think a design review is appropriate here, I am asking for a code review.

The second would be much easier to do from a design document.

If you'd be willing to do a design review, I'd be very happy to send you a copy of the ABI document.

theraven updated this revision to Diff 144070.Apr 26 2018, 1:24 AM
  • Fixed non-use of isOptional (in all three places where it should have been used)
  • Reject -fobjc-runtime=gnustep-2.x in the driver for non-ELF targets
  • Fix a bug where the wrong variable was assigned in protocol generation, which triggered an assertion failure in one of the runtime's tests.
  • Rename protocol reference variables to prevent accidental conflicts with protocol references (this was already done for other indirection variables, but I'd missed protocols)
theraven updated this revision to Diff 144094.Apr 26 2018, 4:20 AM

Fix the symbol names that I thought I'd done previously.

Also a small refactoring of how class init variables are generated to do it at the sensible time.

This version has been tested with the runtime's test suite and the GNUstep-Base (Foundation framework) test suite.

rjmccall added inline comments.Apr 26 2018, 8:51 AM

I'd encourage you to use ConstantBuilder whenever you would want to use this.


Why copy the string? Also, you don't actually use it until later.


Please hoist SL->getLength() into a variable.

Also, consider breaking this bit of code (maybe just building a uint64_t?) into a separate function.


It's weird to give a struct like this and have it only be right on a *big*-endian target. You might also consider just using uint64_t.


I think this might be clearer as 64 - 7 - (i*7).


This is a very misleading variable name.


You should leave a comment saying that it's safe to expand an N-code-unit UTF-8 string into an N-code-unit UTF-16 buffer because UTF-16 never uses more code units than UTF-8.


Is @ really the only illegal character in ELF symbol names? Surely at least \0 as well. And your mangling will collide strings if they contain \1.

I think you should probably just have this use internal linkage (and a meaningless symbol name) for strings containing bad characters.


Surely this has to be linkonce_odr?


Field declaration comments, like you use elsewhere, would help here. And something that includes the struct name that you're generating as well.


Same comment here about adding a comment somewhere that mentions the struct name from the runtime header that you're generating.


Please make this exhaustive, since you're just missing Qualifiers::OCL_Autoreleasing (which is illegal on an ivar, of course).


Why 4?


Maybe you can have a method that returns the start/end symbols as a pair? It would cut down on the redundancy here a lot.

theraven updated this revision to Diff 144441.Apr 28 2018, 4:41 AM
theraven marked 15 inline comments as done.
theraven added inline comments.

I don't think separating it would simplify the code much. It's 8 lines of non-comment code and you'd probably need at least three of them in the caller.


No, I copied the mangling for selectors without engaging my brain. I've now replaced it with something much more conservative. I might tweak it a bit before the next release, but this seems to give pretty good coverage.

DHowett-MSFT added inline comments.Apr 30 2018, 3:04 PM

While we're here, is there value in storing the ivar size in layout as well, so that the runtime doesn't need to calculate it from the distance to the next ivar/end of the instance?


This should be readily expandable for PE/COFF, but we'll need to change some of the section names to fit better. Is it worth abstracting the names of the sections across the target format somehow?

(pe/coff will need to emit COMDAT symbols pointing into lexicographically sortable section names that the linker can fold away)


Is there a way to ensure that objc_load happens before other static initializers across the entire set of linker inputs?

theraven added inline comments.May 1 2018, 1:57 AM

Normally the runtime calculates it from the type encoding, if it's required. I agree that it might be nice to have though. Do you have strong feelings about needing it in the 2.0 ABI? The looser coupling means that it would be easy to add in the 2.1 ABI if we want it later. Are you seeing cases where the runtime is calculating it incorrectly because of insufficient information in the type encoding, or where calculating it is causing performance problems?


As I understand it, we can just append $m for the symbol names on PE/COFF and then create zero-size symbols with $a and $z suffixes for the end? The symbol names should be able to be the same, but if that's not possible then it would be a good idea to change them so that ELF and PE/COFF binaries look the same.


This part is quite ELF-specific, and on ELF unfortunately the order of constructor initialisation is undefined (it will be in whatever order the linker decides to put them in). Running the Objective-C code first isn't necessarily correct: C++ static constructors and Objective-C +load methods could each depend on the other (and C++ on ObjC classes having been loaded).

Of course, if you have control over the linker (or possibly even a linker script) then you can adjust the ordering however you want...

Thanks, the comments help a lot.


The distance between ivar offsets wouldn't be correct anyway because of alignment padding. A set-ivar function might be able to get away with copying too many bytes from an input buffer (although I wouldn't recommend it!), but a get-ivar function definitely should not copy too many bytes into an output buffer.

And all that's assuming that the runtime promises not to reorder ivars dynamically. I don't know what the GNU runtime says about that. (Static reordering is fine, since the runtime can reasonably demand that the compiler emit ivars in layout order.)


Seems much more reasonable, thanks.


Micro-optimization: I'm pretty sure the type of this ternary is an r-value std::string, which means copying the string in either case; if you explicitly build a StringRef from StringName, the ternary will instead construct a StringRef and you'll avoid those unnecessary copies.


The indentation here is a little misleading; I'd suggest aligning the two cases of the ternary operator.


I would suggest clarifying in what sense this is a placeholder. Does the runtime recognize it specially? If so, how? Is it replaced statically by a later pass in IRGen?

theraven marked 3 inline comments as done.May 20 2018, 11:01 AM
theraven added inline comments.

The runtime was calculating the size from the type encoding (which can also be wrong in a few cases, such as packed structures). I agree it makes sense to add the size though, and have done..


This comment was inherited from the old version and is actually nonsense now. I have replaced it with one that actually makes sense.

theraven updated this revision to Diff 147716.May 20 2018, 11:02 AM
theraven marked an inline comment as done.

Address review comments, including adding a size field for ivars.

I think I am happy with this revision and would like to commit it before working on PE/COFF support.

rjmccall accepted this revision.May 20 2018, 10:21 PM

Thanks, my comments seem to all be addressed.

This revision is now accepted and ready to land.May 20 2018, 10:21 PM
This revision was automatically updated to reflect the committed changes.
Ka-Ka added a subscriber: Ka-Ka.May 22 2018, 12:50 AM
Ka-Ka added inline comments.

The isnumber() function was added to cctype.h by Apple. I don't think it can be used in llvm.

According to

theraven added inline comments.May 22 2018, 1:31 AM

Ah, isnumber is from 4.4BSD, I assumed it worked everywhere. Changing it to isdigit is fine.

GBuella added inline comments.

BTW isalnum is in ISO since 1989.

GBuella added inline comments.May 22 2018, 2:09 AM

Also, you might want to isalnum((unsigned char)c), as passing a negative value to these functions is UB.

xbolva00 added inline comments.

This causes
CGObjCGNU.cpp:589:16: warning: ‘virtual llvm::Value* {anonymous}::CGObjCGNU::GetSelector(clang::CodeGen::CodeGenFunction&, const clang::ObjCMethodDecl*)’ was hidden [-Woverloaded-virtual]

llvm::Value *GetSelector(CodeGenFunction &CGF,
theraven added inline comments.May 26 2018, 4:19 AM

I can't reproduce this, and I'm not sure what the issue is. The two lines have different overloads, so one shouldn't be hiding the other.

xbolva00 added inline comments.May 26 2018, 4:22 AM

It is a warning from GCC 7.2

theraven added inline comments.May 26 2018, 4:25 AM

Sounds like it's a spurious one. Any idea how to silence it?

xbolva00 added inline comments.May 26 2018, 4:30 AM

Not sure ..

virtual llvm::Value *GetSelector(CodeGenFunction &CGF, Selector Sel, const std::string &TypeEncoding); is here

but base class (CGObjCRuntime) has:

virtual llvm::Value * GetSelector (CodeGenFunction &CGF, Selector Sel)=0