This is an archive of the discontinued LLVM Phabricator instance.

Fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode
ClosedPublic

Authored by jinlin on Dec 9 2019, 11:29 AM.

Details

Summary

The change is to fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode.
The purpose is to provide the support of LTO for swift and Objective-C mixed project.

Diff Detail

Event Timeline

jinlin created this revision.Dec 9 2019, 11:29 AM
dexonsmith added inline comments.
llvm/lib/Linker/IRMover.cpp
1277–1308

The use case you have sounds great, but I disagree with specializing module flag logic for specific keys. Module flags are meant to be generic ways of enforcing rules.

  • If Swift is emitting a value that's incompatible with the value Clang is emitting but the IR is compatible, then at least one of them needs to change what value they're emitting for this flag so that the normal module flags semantics work.
  • If the IR that Swift and Clang emit is *not* compatible, then that has to be fixed first.
  • If we need support for old, archived bitcode getting merged with new bitcode from one of the compilers (I doubt we do, but maybe?), then we can figure out a scheme for some upgrade logic.

Yeah, the discussion Jin and I had privately was that we should split the Swift ABI version out of the ObjC GC global metadata and update the backend so that it assembles the image info appropriately from different places. I don't know what happened to that.

That will probably require an IR updater, but that shouldn't be a problem.

Yeah, the discussion Jin and I had privately was that we should split the Swift ABI version out of the ObjC GC global metadata and update the backend so that it assembles the image info appropriately from different places. I don't know what happened to that.

That will probably require an IR updater, but that shouldn't be a problem.

Something like that SGTM. @jinlin, the logic for auto upgrading bitcode is in lib/IR/AutoUpgrade.cpp. I'm not sure whether there are module flags that get updated right now; let me or @steven_wu know if need a pointer there.

Breaking part the meta data sounds like a good idea. You will probably need to combine the value in the backend to generate the field in the object file.

Also, please add test case. If you go down the route of splitting the value, you need:

  • codegen tests for correct value in the object file
  • llvm-link tests for combining objc and swift bitcode
  • auto upgrade test if you want to support the old bitcode

Breaking part the meta data sounds like a good idea. You will probably need to combine the value in the backend to generate the field in the object file.

Also, please add test case. If you go down the route of splitting the value, you need:

  • codegen tests for correct value in the object file
  • llvm-link tests for combining objc and swift bitcode
  • auto upgrade test if you want to support the old bitcode

Steven,

  1. I will add a llvm-link tests for combining objc and swift bitcode.
  2. What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?
  3. Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?

-Jin

  1. I will add a llvm-link tests for combining objc and swift bitcode.
  2. What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?

I think the decision was to break apart the current "Objective-C Garbage Collection" metadata. You need codegen test to test that OBJC_IMAGE_INFO is generated correctly after breaking apart the value.

  1. Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?

I disagree with that. That is leaving a gap for incompatible bitcode.

Steven

jinlin updated this revision to Diff 246337.Feb 24 2020, 4:18 PM

Update with llvm-link test case.

  1. I will add a llvm-link tests for combining objc and swift bitcode.
  2. What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?

I think the decision was to break apart the current "Objective-C Garbage Collection" metadata. You need codegen test to test that OBJC_IMAGE_INFO is generated correctly after breaking apart the value.

I have question regarding this. Let's assume we have one assembly file from Objective-C and another one from Swift.

Objective-C assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 64

Swift assembly:

L_OBJC_IMAGE_INFO:
.long 0
.long 83953472

Let's assume that the llvm-link can link Swift bitcode and Objective-C bit successfully and the llc can generate the assembly for the merged output bitcode.

In the final assembly file, what value do you expect for L_OBJC_IMAGE_INFO?

  1. Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?

I disagree with that. That is leaving a gap for incompatible bitcode.

Steven

  1. I will add a llvm-link tests for combining objc and swift bitcode.
  2. What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?

I think the decision was to break apart the current "Objective-C Garbage Collection" metadata. You need codegen test to test that OBJC_IMAGE_INFO is generated correctly after breaking apart the value.

I have question regarding this. Let's assume we have one assembly file from Objective-C and another one from Swift.

Objective-C assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 64

Swift assembly:

L_OBJC_IMAGE_INFO:
.long 0
.long 83953472

Let's assume that the llvm-link can link Swift bitcode and Objective-C bit successfully and the llc can generate the assembly for the merged output bitcode.

In the final assembly file, what value do you expect for L_OBJC_IMAGE_INFO?

You can easily test that out with ld64. It depends on the bits in the current garbage collection fields. Most of the fields are useless now (only there for GC which is obsoleted), that is why the module flag uses Override. Some bits in there are taking Max, like OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES. Swift version should be taking Error since mismatch is a failure, unless it is not set.

If you are breaking the module flag "Objective-C Garbage Collection" into all the corresponding bits, you can correctly specify all the right behavior for the bitfield.

  1. Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?

I disagree with that. That is leaving a gap for incompatible bitcode.

Steven

Just to clarify :]

For Swift, we already have "Swift Version" as module flag, "Garbage Collection" contains swiftVersion, Major and Minor. We should add a new module flag (with type Error) that contains Major and Minor and make sure "Garbage Collection" value is the same for Swift as for ObjC.

In the backend, we will reconstruct IMAGE_INFO from the new module flags and make sure the value stays the same with and without this patch.

IMAGE_INFO will be different for Swift vs ObjC.

Thanks!
Manman

We should add a new module flag (with type Error) that contains Major and Minor and make sure "Garbage Collection" value is the same for Swift as for ObjC.

In the backend, we will reconstruct IMAGE_INFO from the new module flags and make sure the value stays the same with and without this patch.

IMAGE_INFO will be different for Swift vs ObjC.

Should be question mark! Meant for clarification from @steven_wu and @rjmccall. Should IMAGE_INFO include the value from the new module flag?

jinlin added a comment.EditedFeb 25 2020, 3:02 PM
  1. I will add a llvm-link tests for combining objc and swift bitcode.
  2. What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?

I think the decision was to break apart the current "Objective-C Garbage Collection" metadata. You need codegen test to test that OBJC_IMAGE_INFO is generated correctly after breaking apart the value.

That is the thing I am confused. For the merged bitcode, what OBJ_IMAGE_INFO should we generate in the assembly code?

I have question regarding this. Let's assume we have one assembly file from Objective-C and another one from Swift.

Objective-C assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 64

Swift assembly:

L_OBJC_IMAGE_INFO:
.long 0
.long 83953472

Let's assume that the llvm-link can link Swift bitcode and Objective-C bit successfully and the llc can generate the assembly for the merged output bitcode.

In the final assembly file, what value do you expect for L_OBJC_IMAGE_INFO?

You can easily test that out with ld64. It depends on the bits in the current garbage collection fields. Most of the fields are useless now (only there for GC which is obsoleted), that is why the module flag uses Override. Some bits in there are taking Max, like OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES. Swift version should be taking Error since mismatch is a failure, unless it is not set.

If you are breaking the module flag "Objective-C Garbage Collection" into all the corresponding bits, you can correctly specify all the right behavior for the bitfield.

  1. Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?

I disagree with that. That is leaving a gap for incompatible bitcode.

Steven

We should add a new module flag (with type Error) that contains Major and Minor and make sure "Garbage Collection" value is the same for Swift as for ObjC.

In the backend, we will reconstruct IMAGE_INFO from the new module flags and make sure the value stays the same with and without this patch.

IMAGE_INFO will be different for Swift vs ObjC.

Should be question mark! Meant for clarification from @steven_wu and @rjmccall. Should IMAGE_INFO include the value from the new module flag?

We should add a new module flag (with type Error) that contains Major and Minor and make sure "Garbage Collection" value is the same for Swift as for ObjC.

In the backend, we will reconstruct IMAGE_INFO from the new module flags and make sure the value stays the same with and without this patch.

IMAGE_INFO will be different for Swift vs ObjC.

Should be question mark! Meant for clarification from @steven_wu and @rjmccall. Should IMAGE_INFO include the value from the new module flag?

Here is my opinion of the best option:

  • Break up Garbage Collection field to multiple fields, including swift version, major and minor, and remaining bits (can break down more if needed). All those bits can be set up have the correct type so LTO can link them correctly.
  • Update backend to emit the same IMAGE_INFO by piecing together new module flags.
  • Add a IRUpgrader to upgrade Garbage Collection to new fields.

IMAGE_INFO is different between swift and objc. For example, in the testcase, Objc has IMAGE_INFO 0x40 and swift has 0x5010700, linker should merge them and produce 0x5010740.

Yes, the OBJC_IMAGE_INFO needs to include the information from the new module flag.

The basic situation here is not very complicated. The OBJC_IMAGE_INFO contains information about the code present in the image, like the ObjC ABI in use, the Swift ABI in use, and whether ObjC GC is enabled. In general, that information only applies to the subset of code that uses a particular language, but it does need to be consistent across all the code that uses that language. Because Swift code on Darwin uses both the ObjC ABI and the Swift ABI, there's actually a sort of hierarchy here:

  • C translation units don't care about either ABI
  • ObjC translation units only care about the ObjC ABI
  • Swift translation units care about both the ObjC and Swift ABIs

This can all be linked into the same image, and it's fine as long as the ObjC and Swift translation units all agree about the ObjC ABI details and all the Swift translation units agree about all the Swift ABI details.

This can be easily represented in LLVM module flags by using separate flags for each logical component, all with the Error behavior. That's because Error requires the flag value to match if present in multiple translation units (thus achieving consistency among the translation units that care about an ABI), but also allows translation units to not provide the flag (thus permitting code to be linked in that doesn't care about an ABI).

This is currently subverted because, as a hasty convenience, we lumped the Swift ABI information into the GC flag. This leads to a valid OBJC_IMAGE_INFO structure because of the details of how that structure is assembled, and the result is correctly handled by the Mach-O linker; however, it causes artificial problems for the LLVM IR linker by incorrectly preventing Swift translation units from being linked with ObjC translation units. The solution is to stop lumping things into a single module flag so that LLVM doesn't have these artificial linking problems; instead, we should use separate module flags for the separate components. But the basic requirement — to assemble all of this information into the emitted OBJC_IMAGE_INFO — will not change.

We should add a new module flag (with type Error) that contains Major and Minor and make sure "Garbage Collection" value is the same for Swift as for ObjC.

In the backend, we will reconstruct IMAGE_INFO from the new module flags and make sure the value stays the same with and without this patch.

IMAGE_INFO will be different for Swift vs ObjC.

Should be question mark! Meant for clarification from @steven_wu and @rjmccall. Should IMAGE_INFO include the value from the new module flag?

Here is my opinion of the best option:

  • Break up Garbage Collection field to multiple fields, including swift version, major and minor, and remaining bits (can break down more if needed). All those bits can be set up have the correct type so LTO can link them correctly.
  • Update backend to emit the same IMAGE_INFO by piecing together new module flags.
  • Add a IRUpgrader to upgrade Garbage Collection to new fields.

IMAGE_INFO is different between swift and objc. For example, in the testcase, Objc has IMAGE_INFO 0x40 and swift has 0x5010700, linker should merge them and produce 0x5010740.

I agree with all of this except that I'm not sure there's any situation where ObjC and Swift translation units should be disagreeing about the ObjC-specific parts of the OBJC_IMAGE_INFO.

I agree with all of this except that I'm not sure there's any situation where ObjC and Swift translation units should be disagreeing about the ObjC-specific parts of the OBJC_IMAGE_INFO.

The lower bits of the flag is for objc info and here is the definition:
#define OBJC_IMAGE_SUPPORTS_GC (1<<1)
#define OBJC_IMAGE_REQUIRES_GC (1<<2)
#define OBJC_IMAGE_OPTIMIZED_BY_DYLD (1<<3)
#define OBJC_IMAGE_SUPPORTS_COMPACTION (1<<4)
#define OBJC_IMAGE_IS_SIMULATED (1<<5)
#define OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES (1<<6)

The 0x40 bit is actually OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES, which probably can differ from translation unit to translation unit even within ObjC. Some of the other bits should not be different. That is why I say you can break the bits down even more if needed.

jinlin added a comment.EditedFeb 25 2020, 4:16 PM
  1. I will add a llvm-link tests for combining objc and swift bitcode.
  2. What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?

I think the decision was to break apart the current "Objective-C Garbage Collection" metadata. You need codegen test to test that OBJC_IMAGE_INFO is generated correctly after breaking apart the value.

I have question regarding this. Let's assume we have one assembly file from Objective-C and another one from Swift.

Objective-C assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 64

Swift assembly:

L_OBJC_IMAGE_INFO:
.long 0
.long 83953472

Let's assume that the llvm-link can link Swift bitcode and Objective-C bit successfully and the llc can generate the assembly for the merged output bitcode.

In the final assembly file, what value do you expect for L_OBJC_IMAGE_INFO?

You can easily test that out with ld64. It depends on the bits in the current garbage collection fields. Most of the fields are useless now (only there for GC which is obsoleted), that is why the module flag uses Override. Some bits in there are taking Max, like OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES. Swift version should be taking Error since mismatch is a failure, unless it is not set.

If you are breaking the module flag "Objective-C Garbage Collection" into all the corresponding bits, you can correctly specify all the right behavior for the bitfield.

  1. Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?

I disagree with that. That is leaving a gap for incompatible bitcode.

Steven

I agree with all of this except that I'm not sure there's any situation where ObjC and Swift translation units should be disagreeing about the ObjC-specific parts of the OBJC_IMAGE_INFO.

The lower bits of the flag is for objc info and here is the definition:
#define OBJC_IMAGE_SUPPORTS_GC (1<<1)
#define OBJC_IMAGE_REQUIRES_GC (1<<2)
#define OBJC_IMAGE_OPTIMIZED_BY_DYLD (1<<3)
#define OBJC_IMAGE_SUPPORTS_COMPACTION (1<<4)
#define OBJC_IMAGE_IS_SIMULATED (1<<5)
#define OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES (1<<6)

The 0x40 bit is actually OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES, which probably can differ from translation unit to translation unit even within ObjC. Some of the other bits should not be different. That is why I say you can break the bits down even more if needed.

Do you suggest I should break apart the current "Objective-C Garbage Collection" metadata for both Swift and Objective-C?

I agree with all of this except that I'm not sure there's any situation where ObjC and Swift translation units should be disagreeing about the ObjC-specific parts of the OBJC_IMAGE_INFO.

The lower bits of the flag is for objc info and here is the definition:
#define OBJC_IMAGE_SUPPORTS_GC (1<<1)
#define OBJC_IMAGE_REQUIRES_GC (1<<2)
#define OBJC_IMAGE_OPTIMIZED_BY_DYLD (1<<3)
#define OBJC_IMAGE_SUPPORTS_COMPACTION (1<<4)
#define OBJC_IMAGE_IS_SIMULATED (1<<5)
#define OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES (1<<6)

The 0x40 bit is actually OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES, which probably can differ from translation unit to translation unit even within ObjC. Some of the other bits should not be different. That is why I say you can break the bits down even more if needed.

Clang seems to set it unconditionally. Presumably it just controls whether category descriptors have a class-properties field; offhand, I don't know why that would need to be set globally.

The 0x40 bit is actually OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES, which probably can differ from translation unit to translation unit even within ObjC. Some of the other bits should not be different. That is why I say you can break the bits down even more if needed.

Clang seems to set it unconditionally. Presumably it just controls whether category descriptors have a class-properties field; offhand, I don't know why that would need to be set globally.

Does swift set this bit unconditionally? If so, then there is probably no reason for the lower bits to be different.

@jinlin Let me be more specific about what I suggest:

  • clang can keep "Objective-C Garbage Collection" metadata but its value should be i8 instead of i32. The type of the module flag should be Error.
  • swift compiler will emit "Objective-C Garbage Collection" together with swift ABI version, major and minor. All of them should be Error type.
  • backend should know how to generate IMAGE_INFO from Swift ABI version + major + minor + "Objective-C Garbage Collection"
  • IRUpgrader should turn a i32 type "Objective-C Garbage Collection" into i8 value, if the higher bits are set, it adds the module flag for swift info.
jinlin updated this revision to Diff 249201.Mar 9 2020, 1:25 PM
jinlin added a comment.Mar 9 2020, 1:30 PM

I have not made changes in swift side since the changes in LLVM have to go in first.

It looks good in general. I would prefer to have two separate tests, one for upgrade, one for ir-linking.

I will have @rjmccall to have the final saying on this.

llvm/lib/IR/AutoUpgrade.cpp
4083 ↗(On Diff #249201)

Should we upgrade the first operand to Error? That also means you need to update your test case.

Code looks like, minor comments aside. Please add a test case for the auto-upgrader.

clang/lib/CodeGen/CGObjCMac.cpp
5173 ↗(On Diff #249201)

These two comments are both just restating the code and can be dropped.

llvm/lib/IR/AutoUpgrade.cpp
4069 ↗(On Diff #249201)

We don't generally assert basic structural well-formedness properties like "values have types", or else we'd have a million assertions everywhere.

jinlin updated this revision to Diff 249274.Mar 9 2020, 9:54 PM
jinlin marked 3 inline comments as done.
jinlin added inline comments.
clang/lib/CodeGen/CGObjCMac.cpp
5173 ↗(On Diff #249201)

Removed these two as suggested.

llvm/lib/IR/AutoUpgrade.cpp
4069 ↗(On Diff #249201)

Remove as suggested.

4083 ↗(On Diff #249201)

Updated.

Tests look good, thanks. A couple little things remaining, but feel free to commit after you fix them.

clang/lib/CodeGen/CGObjCMac.cpp
5166 ↗(On Diff #249274)

I also meant this comment.

llvm/lib/IR/AutoUpgrade.cpp
4067 ↗(On Diff #249274)

Spurious newline

jinlin updated this revision to Diff 249348.Mar 10 2020, 6:06 AM
jinlin marked 2 inline comments as done.
jinlin added inline comments.
clang/lib/CodeGen/CGObjCMac.cpp
5166 ↗(On Diff #249274)

Removed.

llvm/lib/IR/AutoUpgrade.cpp
4067 ↗(On Diff #249274)

Thanks for pointing out. Removed.

rjmccall accepted this revision.Mar 10 2020, 9:13 AM

Thanks, approved.

This revision is now accepted and ready to land.Mar 10 2020, 9:13 AM
steven_wu accepted this revision.Mar 10 2020, 10:21 AM

Thank you all for your review and suggestions!

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2020, 1:36 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript