Multi-versioned functions defined by cpu_dispatch and implemented with IFunc
cannot be called outside the translation units where they are defined because
their symbols are missing. This patch adds function aliases for these functions and thus
makes them visible outside.
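As a rough illustration of what a function alias provides, the sketch below uses the GCC/Clang `alias` attribute to publish a second symbol name for an existing definition. It is not the patch itself: `foo_impl` and `foo` are hypothetical names, an ordinary function stands in for the IFunc that the patch aliases, and the `#else` branch is only a fallback for targets where the attribute is not available.

```c
/* Stand-in for the symbol that is actually emitted. The patch aliases an
   IFunc; an ordinary function is used here to keep the sketch portable. */
int foo_impl(void) { return 42; }

#if defined(__ELF__) && defined(__GNUC__)
/* Publish a second symbol, foo, that refers to the same code. */
int foo(void) __attribute__((alias("foo_impl")));
#else
/* Fallback where the alias attribute is unsupported: forward the call. */
int foo(void) { return foo_impl(); }
#endif
```

With the alias in place, another translation unit can declare `extern int foo(void);` and link against the plain name.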
Details
- Reviewers
erichkeane
- Commits
- rG9ca1b94a6d3f: [CodeGen] Add alias for cpu_dispatch function with IFunc & Fix resolver linkage…
rC371586: [CodeGen] Add alias for cpu_dispatch function with IFunc & Fix resolver linkage…
rL371586: [CodeGen] Add alias for cpu_dispatch function with IFunc & Fix resolver linkage…
Event Timeline
Tests missing.
Is that what gcc does? I'd personally thought those should be internalized.
GCC doesn't implement CPU dispatch, this is an ICC thing. ICC uses the Windows behavior on both, but when implementing in Clang we opted to use IFuncs for Linux. This is troublesome, as it can result in the 'FuncName' symbol not existing (so a separate TU treating this function as an extern function on Linux wouldn't link).
I accept this approach, but need to spend some time tomorrow doing some additional review.
That said, this definitely still needs to be covered by lit tests.
Thanks @lebedev.ri , I'm currently discussing this with @erichkeane , and I'll add a lit test after the final decision on how to solve the issue.
I prefer this to be in the place where the ifunc gets created, otherwise we definitely need tests. There are sufficient tests for this that show the ifunc having been created, so I'd suggest just adding to them.
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
2957 | I think we want this in GetOrCreateMultiVersionResolver, so that it gets created when the ifunc does. That way you just need a "FD->isCPUDispatchMultiVersion() || isCPUSpecificMultiVersion()" check inside the supportsIFunc branch. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
2957 | After discussing this offline, I believe this is the right function to create the alias. The motivating example is: // TU1: __attribute__((cpu_dispatch(a,b,c))) void foo(void); // TU2: extern void foo(void); Currently, TU1 doesn't bother to emit the ifunc, because we've attached emitting this to when this is referenced. We made that choice because we expected TU2 to mark 'foo' as cpu_dispatch/cpu_specific in SOME way. I believe that it is harmless to emit the ifunc all the time, which this is attempting to do. However, this needs to change the ifunc to have LinkOnceODR linkage in GetOrCreateMultiVersionResolver, otherwise this can cause linker errors. | |
2961 | I think the alias always needs LinkOnceODR linkage, to match the (new) IFunc linkage. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
2957 | I've finished most parts. But I think we should also set the resolver to have LinkOnceODR Linkage. Otherwise, we cannot have the cpu_dispatch declaration in multiple TUs. However there's no 'weak' symbol on Windows, so even setting the resolver linkage to LinkOnceODR cannot solve the duplicate defined symbol problem on Windows. Do you have any suggestions on it? :-) |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
2957 | Yep, I think the resolver should also be linkonceodr (as well as the ifunc). See where they are generated and do it there. I can't help but think that we've solved the weak symbols issue with windows before, but I cannot for the life of me remember. @rnk @lebedev.ri , do either of you remember? Does LinkOnceODR not do that trickery? |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
2957 | There must be SOME solution for it, since otherwise inline functions wouldn't work. For example: It's totally permissible to define and use 'foo::bar' in multiple TUs. Note that it is marked linkonce_odr dso_local and comdat. I'm not sure what the latter two do, but please experiment and see which will allow the symbol to be merged in the linker. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
2957 | Thanks Erich! Yes, It's the comdat that does the trick. I'll update the patch later. https://llvm.org/docs/LangRef.html#comdats |
Actually... I think it might need to be weak_odr based on https://llvm.org/docs/LangRef.html#linkage-types
We want the merge semantics, but need to make sure that the symbols aren't discarded.
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | Actually, I wanted to set the linkage here in the first place, but failed. When compiling code with cpu_specific but no cpu_dispatch, we cannot set it to LinkOnceODR or WeakODR. E.g.: $ cat specific_only.c __declspec(cpu_specific(pentium_iii)) int foo(void) { return 0; } int usage() { return foo(); } $ clang -fdeclspec specific_only.c Global is external, but doesn't have external or weak linkage! i32 ()* ()* @foo.resolver fatal error: error in backend: Broken module found, compilation aborted! This was found by the lit test test/CodeGen/attr-cpuspecific.c, in which 'SingleVersion()' doesn't have a cpu_dispatch declaration. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3005 | Changing this also changes the linkage for __attribute__((target())). Should I also change the test cases for them (including test/CodeGen{,CXX}/attr-*.ll)? |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | No, actually I tried it earlier with the example I mentioned in my last comment, but WeakODR still makes the compiler complain. I think it's foo.resolver that cannot be declared as WeakODR/LinkOnceODR without a definition. But I'm really not familiar with these rules. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | According to Verifier::visitGlobalValue() in Verifier.cpp, a declaration can only have ExternalLinkage or ExternalWeakLinkage. So I still believe we cannot set the resolver to LinkOnceODRLinkage/WeakODRLinkage here, as it is declared but not defined when we only have cpu_specific but no cpu_dispatch in a TU, as in the example above. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | That doesn't seem right then. IF it allows ExternalWeakLinkage I'd expect WeakODR to work as well, since it is essentially the same thing. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | I think we should double-check. The last line of the Linkage Types section in the reference manual [1] says "It is illegal for a function declaration to have any linkage type other than external or extern_weak". I guess weak_odr is not designed for declarations and should only be used by definitions. |
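The rule quoted from the reference manual can be sketched as a small predicate. This is only a toy model for illustration, not LLVM's verifier code: the `linkage` enum and `declaration_linkage_is_valid` are hypothetical names, and the enum lists only the linkage kinds mentioned in this thread.

```c
#include <stdbool.h>

/* Hypothetical subset of linkage kinds, named after their IR spellings. */
enum linkage {
    LINKAGE_EXTERNAL,
    LINKAGE_EXTERN_WEAK,
    LINKAGE_LINKONCE_ODR,
    LINKAGE_WEAK_ODR,
    LINKAGE_AVAILABLE_EXTERNALLY
};

/* A function declaration (no body) may only be external or extern_weak;
   every other linkage kind in this model requires a definition. */
bool declaration_linkage_is_valid(enum linkage l) {
    return l == LINKAGE_EXTERNAL || l == LINKAGE_EXTERN_WEAK;
}
```

Under this model, a resolver that is only declared in a TU fails the check as soon as it is given weak_odr or linkonce_odr linkage, which matches the "Broken module found" error reported earlier in the thread.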
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | I had typed a reply, but apparently it didn't submit: Ah, nvm, I see now that external-weak is different from weak. I don't really get the linkages sufficiently to know what the right thing to do is then. If we DO have a definition, I'd say weak_odr so it can be merged, right? If we do NOT, could externally_available work? |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | No, I think it should be external instead of available_externally. The latter cannot be used for declarations either. So, getting back to the example: 1) If we have cpu_dispatch and cpu_specific in the same TU, it's okay to use weak_odr for foo.resolver, as it is defined in emitCPUDispatchDefinition and can be merged. 2) If we only have cpu_specific in a TU and have a reference to the dispatched function, foo.resolver will be referenced without a definition, and external is the proper linkage to make it work. That's why I didn't set the linkage type at this line. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 |
Wouldn't that make it un-mergable later? Meaning, if you emitted the declaration from one TU, and the definition from another that you'd get a link error? I think the rules are more subtle than that. Any time you have a cpu_dispatch, the resolver is weak_odr so that it can be merged later. The presence of cpu_specific shouldn't matter. For 2, I think you're mostly correct, as long as the linker can still merge them. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 |
No, it wouldn't. Declaration has nothing to do with definition IMO.
$ cat main.ll
declare external i32 @bar()
define i32 @main() {
  %call = call i32 @bar()
  ret i32 %call
}
$ cat bar.ll
define weak_odr i32 @bar() {
  ret i32 10
}
$ cp bar.ll bar2.ll # copy here so we have 2 weak 'bar'
$ clang -c main.ll bar.ll bar2.ll
$ clang main.o bar.o bar2.o -o main
$ ./main
$ echo $?
10
$ nm bar.o
0000000000000000 W bar
$ nm main.o
U bar
0000000000000000 T main
Yes, you're right. So setting linkage type at line 3005 does it: if we have a cpu_dispatch, then it's set to weak_odr. |
clang/lib/CodeGen/CodeGenModule.cpp | ||
---|---|---|
3002 | I see, thank you for clarifying. These changes should likely be reflected for target MV as well, but in that case the resolver should likely always be weak_odr linkage (since it is always emitted). |
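The rule this thread converges on for the cpu_dispatch resolver can be summarized in a few lines. This is a sketch of the decision only, not the code in CodeGenModule.cpp: `resolver_linkage`, `pick_resolver_linkage`, and the `tu_has_cpu_dispatch` flag are hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* The two outcomes discussed for the resolver symbol. */
enum resolver_linkage {
    RESOLVER_EXTERNAL, /* only referenced in this TU, so it stays a declaration */
    RESOLVER_WEAK_ODR  /* defined in this TU; mergeable and not discarded */
};

/* If the TU contains a cpu_dispatch declaration, the resolver body is
   emitted there and can be weak_odr. Otherwise the resolver is a bare
   declaration and has to keep external linkage. */
enum resolver_linkage pick_resolver_linkage(bool tu_has_cpu_dispatch) {
    return tu_has_cpu_dispatch ? RESOLVER_WEAK_ODR : RESOLVER_EXTERNAL;
}
```

The presence or absence of cpu_specific versions does not enter the decision, which is the point made in the replies above.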
All done IMO. :-) Thanks a lot to Erich for reviewing! Would you mind helping me commit it?