The change is to fix conflict value for metadata "Objective-C Garbage Collection" in the mix of swift and Objective-C bitcode.
The purpose is to provide the support of LTO for swift and Objective-C mixed project.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Linker/IRMover.cpp | ||
---|---|---|
1277–1282 ↗ | (On Diff #232903) | The use case you have sounds great, but I disagree with specializing module flag logic for specific keys. Module flags are meant to be generic ways of enforcing rules.
|
Yeah, the discussion Jin and I had privately was that we should split the Swift ABI version out of the ObjC GC global metadata and update the backend so that it assembles the image info appropriately from different places. I don't know what happened to that.
That will probably require an IR updater, but that shouldn't be a problem.
Something like that SGTM. @jinlin, the logic for auto upgrading bitcode is in lib/IR/AutoUpgrade.cpp. I'm not sure whether there are module flags that get updated right now; let me or @steven_wu know if need a pointer there.
Breaking part the meta data sounds like a good idea. You will probably need to combine the value in the backend to generate the field in the object file.
Also, please add test case. If you go down the route of splitting the value, you need:
- codegen tests for correct value in the object file
- llvm-link tests for combining objc and swift bitcode
- auto upgrade test if you want to support the old bitcode
Steven,
- I will add a llvm-link tests for combining objc and swift bitcode.
- What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?
- Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?
-Jin
- I will add a llvm-link tests for combining objc and swift bitcode.
- What do you mean by "codegen tests for correct value in the object file"? There is no "object file" in the picture. Are you asking for a test that checks the value in the output of llvm-link?
I think the decision was to break apart the current "Objective-C Garbage Collection" metadata. You need codegen test to test that OBJC_IMAGE_INFO is generated correctly after breaking apart the value.
- Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?
I disagree with that. That is leaving a gap for incompatible bitcode.
Steven
I have question regarding this. Let's assume we have one assembly file from Objective-C and another one from Swift.
Objective-C assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 64
Swift assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 83953472
Let's assume that the llvm-link can link Swift bitcode and Objective-C bit successfully and the llc can generate the assembly for the merged output bitcode.
In the final assembly file, what value do you expect for L_OBJC_IMAGE_INFO?
- Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?
I disagree with that. That is leaving a gap for incompatible bitcode.
Steven
You can easily test that out with ld64. It depends on the bits in the current garbage collection fields. Most of the fields are useless now (only there for GC which is obsoleted), that is why the module flag uses Override. Some bits in there are taking Max, like OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES. Swift version should be taking Error since mismatch is a failure, unless it is not set.
If you are breaking the module flag "Objective-C Garbage Collection" into all the corresponding bits, you can correctly specify all the right behavior for the bitfield.
- Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?
I disagree with that. That is leaving a gap for incompatible bitcode.
Steven
Just to clarify :]
For Swift, we already have "Swift Version" as module flag, "Garbage Collection" contains swiftVersion, Major and Minor. We should add a new module flag (with type Error) that contains Major and Minor and make sure "Garbage Collection" value is the same for Swift as for ObjC.
In the backend, we will reconstruct IMAGE_INFO from the new module flags and make sure the value stays the same with and without this patch.
IMAGE_INFO will be different for Swift vs ObjC.
Thanks!
Manman
We should add a new module flag (with type Error) that contains Major and Minor and make sure "Garbage Collection" value is the same for Swift as for ObjC.
In the backend, we will reconstruct IMAGE_INFO from the new module flags and make sure the value stays the same with and without this patch.
IMAGE_INFO will be different for Swift vs ObjC.
Should be question mark! Meant for clarification from @steven_wu and @rjmccall. Should IMAGE_INFO include the value from the new module flag?
That is the thing I am confused. For the merged bitcode, what OBJ_IMAGE_INFO should we generate in the assembly code?
I have question regarding this. Let's assume we have one assembly file from Objective-C and another one from Swift.
Objective-C assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 64Swift assembly:
L_OBJC_IMAGE_INFO:
.long 0
.long 83953472Let's assume that the llvm-link can link Swift bitcode and Objective-C bit successfully and the llc can generate the assembly for the merged output bitcode.
In the final assembly file, what value do you expect for L_OBJC_IMAGE_INFO?
You can easily test that out with ld64. It depends on the bits in the current garbage collection fields. Most of the fields are useless now (only there for GC which is obsoleted), that is why the module flag uses Override. Some bits in there are taking Max, like OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES. Swift version should be taking Error since mismatch is a failure, unless it is not set.
If you are breaking the module flag "Objective-C Garbage Collection" into all the corresponding bits, you can correctly specify all the right behavior for the bitfield.
- Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?
I disagree with that. That is leaving a gap for incompatible bitcode.
Steven
Here is my opinion of the best option:
- Break up Garbage Collection field to multiple fields, including swift version, major and minor, and remaining bits (can break down more if needed). All those bits can be set up have the correct type so LTO can link them correctly.
- Update backend to emit the same IMAGE_INFO by piecing together new module flags.
- Add a IRUpgrader to upgrade Garbage Collection to new fields.
IMAGE_INFO is different between swift and objc. For example, in the testcase, Objc has IMAGE_INFO 0x40 and swift has 0x5010700, linker should merge them and produce 0x5010740.
Yes, the OBJC_IMAGE_INFO needs to include the information from the new module flag.
The basic situation here is not very complicated. The OBJC_IMAGE_INFO contains information about the code present in the image, like the ObjC ABI in use, the Swift ABI in use, and whether ObjC GC is enabled. In general, that information only applies to the subset of code that uses a particular language, but it does need to be consistent across all the code that uses that language. Because Swift code on Darwin uses both the ObjC ABI and the Swift ABI, there's actually a sort of hierarchy here:
- C translation units don't care about either ABI
- ObjC translation units only care about the ObjC ABI
- Swift translation units care about both the ObjC and Swift ABIs
This can all be linked into the same image, and it's fine as long as the ObjC and Swift translation units all agree about the ObjC ABI details and all the Swift translation units agree about all the Swift ABI details.
This can be easily represented in LLVM module flags by using separate flags for each logical component, all with the Error behavior. That's because Error requires the flag value to match if present in multiple translation units (thus achieving consistency among the translation units that care about an ABI), but also allows translation units to not provide the flag (thus permitting code to be linked in that doesn't care about an ABI).
This is currently subverted because, as a hasty convenience, we lumped the Swift ABI information into the GC flag. This leads to a valid OBJC_IMAGE_INFO structure because of the details of how that structure is assembled, and the result is correctly handled by the Mach-O linker; however, it causes artificial problems for the LLVM IR linker by incorrectly preventing Swift translation units from being linked with ObjC translation units. The solution is to stop lumping things into a single module flag so that LLVM doesn't have these artificial linking problems; instead, we should use separate module flags for the separate components. But the basic requirement — to assemble all of this information into the emitted OBJC_IMAGE_INFO — will not change.
I agree with all of this except that I'm not sure there's any situation where ObjC and Swift translation units should be disagreeing about the ObjC-specific parts of the OBJC_IMAGE_INFO.
I agree with all of this except that I'm not sure there's any situation where ObjC and Swift translation units should be disagreeing about the ObjC-specific parts of the OBJC_IMAGE_INFO.
The lower bits of the flag is for objc info and here is the definition:
#define OBJC_IMAGE_SUPPORTS_GC (1<<1)
#define OBJC_IMAGE_REQUIRES_GC (1<<2)
#define OBJC_IMAGE_OPTIMIZED_BY_DYLD (1<<3)
#define OBJC_IMAGE_SUPPORTS_COMPACTION (1<<4)
#define OBJC_IMAGE_IS_SIMULATED (1<<5)
#define OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES (1<<6)
The 0x40 bit is actually OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES, which probably can differ from translation unit to translation unit even within ObjC. Some of the other bits should not be different. That is why I say you can break the bits down even more if needed.
If you are breaking the module flag "Objective-C Garbage Collection" into all the corresponding bits, you can correctly specify all the right behavior for the bitfield.
- Let us keep the auto upgrade to a different future diff to keep each diff self contained. Is that ok?
I disagree with that. That is leaving a gap for incompatible bitcode.
Steven
Do you suggest I should break apart the current "Objective-C Garbage Collection" metadata for both Swift and Objective-C?
Clang seems to set it unconditionally. Presumably it just controls whether category descriptors have a class-properties field; offhand, I don't know why that would need to be set globally.
The 0x40 bit is actually OBJC_IMAGE_HAS_CATEGORY_CLASS_PROPERTIES, which probably can differ from translation unit to translation unit even within ObjC. Some of the other bits should not be different. That is why I say you can break the bits down even more if needed.
Clang seems to set it unconditionally. Presumably it just controls whether category descriptors have a class-properties field; offhand, I don't know why that would need to be set globally.
Does swift set this bit unconditionally? If so, then there is probably no reason for the lower bits to be different.
@jinlin Let me be more specific about what I suggest:
- clang can keep "Objective-C Garbage Collection" metadata but its value should be i8 instead of i32. The type of the module flag should be Error.
- swift compiler will emit "Objective-C Garbage Collection" together with swift ABI version, major and minor. All of them should be Error type.
- backend should know how to generate IMAGE_INFO from Swift ABI version + major + minor + "Objective-C Garbage Collection"
- IRUpgrader should turn a i32 type "Objective-C Garbage Collection" into i8 value, if the higher bits are set, it adds the module flag for swift info.
Code looks like, minor comments aside. Please add a test case for the auto-upgrader.
clang/lib/CodeGen/CGObjCMac.cpp | ||
---|---|---|
5171 | These two comments are both just restating the code and can be dropped. | |
llvm/lib/IR/AutoUpgrade.cpp | ||
4084 | We don't generally assert basic structural well-formedness properties like "values have types", or else we'd have a million assertions everywhere. |
I also meant this comment.