This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Fix linker error for function multiversioning
Needs RevisionPublic

Authored by eandrews on Aug 23 2023, 1:13 PM.

Details

Summary

Currently target_clones attribute results in a linker error when there are no multi-versioned function declarations in the calling TU.

foo.cpp
int foo1();
__attribute__((target_clones("default", "arch=core2")))
int foo1() { return 0; }

main.cpp
int foo1();
int main()
{   return foo1(); }

$ clang++ main.cpp foo.cpp
/usr/bin/ld: /tmp/main-981c32.o: in function `main':
main.cpp:(.text+0x10): undefined reference to `foo1()'
clang++: error: linker command failed with exit code 1 (use -v to see invocation)

In the calling TU, the call is generated with ‘normal’ assembly name. This does not match any of the versions since their mangling includes a .versionstring. The linker error is not seen with GCC since the mangling for the ifunc symbol in GCC is the ‘normal’ assembly name for function.

Clang – 

$nm foo.o
U   __cpu_indicator_init
U   __cpu_model
T    _Z4foo1v.arch_core2.0
T    _Z4foo1v.default.1
i     _Z4foo1v.ifunc   <----------
W  _Z4foo1v.resolver

$ nm main.o
T main
 U _Z4foo1v

GCC

$ nm foo.o
U   __cpu_indicator_init
U  __cpu_model
i    _Z4foo1v <---------------
t    _Z4foo1v.arch_core2.0
t    _Z4foo1v.default.1
W  _Z4foo1v.resolver

$ nm main.o
T main
U _Z4foo1v

I was initially inclined to match GCC behavior here and remove the ifunc suffix but I decided against it because I am not sure why ifunc mangling was used in the first place. Maybe to maintain consistency between various multiversion attributes? I see target attribute also uses ifunc mangling (while GCC has some other mangling scheme for this attribute). As a less disruptive solution I just added an alias to the ifunc function, similar to what was done for CPU dispatch here - https://reviews.llvm.org/D67058. If the correct solution here is to remove the ifunc suffix, I can make that change instead.

Diff Detail

Event Timeline

eandrews created this revision.Aug 23 2023, 1:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 23 2023, 1:13 PM
eandrews requested review of this revision.Aug 23 2023, 1:13 PM
erichkeane accepted this revision.Aug 23 2023, 1:19 PM

I think the .ifunc spelling was an oversight on my part when I implemented this, I didn't spend enough time investigating GCC's behavior when implementing this feature. I think the alias is the right way about it, but I think the .ifunc should be the alias (at least as far as I can think it through right now). I think that works better because it supports a case where the 'definition' of the target-clones function is generated with GCC, but the 'caller' (also with target clones) comes from clang. I THINK that makes more sense? But perhaps try to chart out the behavior of the GCC/Clang "Knows about TC"/"Doesn't know about TC" in each situation to see which are troublesome?

Additionally, this needs a release note.

This revision is now accepted and ready to land.Aug 23 2023, 1:19 PM
erichkeane requested changes to this revision.Aug 23 2023, 1:19 PM

Accepted before I thought about the other order, so requesting changes so we can think this through... 1 thing we DO need to consider is how this works with OLDER versions of Clang/clang's behavior.

This revision now requires changes to proceed.Aug 23 2023, 1:19 PM

I think the .ifunc spelling was an oversight on my part when I implemented this, I didn't spend enough time investigating GCC's behavior when implementing this feature. I think the alias is the right way about it, but I think the .ifunc should be the alias (at least as far as I can think it through right now). I think that works better because it supports a case where the 'definition' of the target-clones function is generated with GCC, but the 'caller' (also with target clones) comes from clang. I THINK that makes more sense? But perhaps try to chart out the behavior of the GCC/Clang "Knows about TC"/"Doesn't know about TC" in each situation to see which are troublesome?

Additionally, this needs a release note.

Thanks for taking a look! Can you explain why we need an alias? As in, if we just remove the .ifunc suffix in the 'ifunc' function here, it should work without an alias I think. I have to re-check but IIRC this is what GCC does

I think the .ifunc spelling was an oversight on my part when I implemented this, I didn't spend enough time investigating GCC's behavior when implementing this feature. I think the alias is the right way about it, but I think the .ifunc should be the alias (at least as far as I can think it through right now). I think that works better because it supports a case where the 'definition' of the target-clones function is generated with GCC, but the 'caller' (also with target clones) comes from clang. I THINK that makes more sense? But perhaps try to chart out the behavior of the GCC/Clang "Knows about TC"/"Doesn't know about TC" in each situation to see which are troublesome?

Additionally, this needs a release note.

Thanks for taking a look! Can you explain why we need an alias? As in, if we just remove the .ifunc suffix in the 'ifunc' function here, it should work without an alias I think. I have to re-check but IIRC this is what GCC does

At least short term, we have to have both .ifunc and <just function name> exposed, else we end up breaking SOMEONE. At the moment, we're breaking the example you have, however if we just switch the .ifunc to be <function name>, we end up breaking cases where 1 TU is built with Clang 18 and the other with Clang-trunk. So we need to figure out what the 'compatibility story' is between Clang 18, Clang-Trunk, and GCC with each of them in the position of:

1- Generated the function TU
2- Consumed the function, not knowing about it being TC
3- Consumed the function, knowing it was TC.

I think the .ifunc spelling was an oversight on my part when I implemented this, I didn't spend enough time investigating GCC's behavior when implementing this feature. I think the alias is the right way about it, but I think the .ifunc should be the alias (at least as far as I can think it through right now). I think that works better because it supports a case where the 'definition' of the target-clones function is generated with GCC, but the 'caller' (also with target clones) comes from clang. I THINK that makes more sense? But perhaps try to chart out the behavior of the GCC/Clang "Knows about TC"/"Doesn't know about TC" in each situation to see which are troublesome?

Additionally, this needs a release note.

Thanks for taking a look! Can you explain why we need an alias? As in, if we just remove the .ifunc suffix in the 'ifunc' function here, it should work without an alias I think. I have to re-check but IIRC this is what GCC does

At least short term, we have to have both .ifunc and <just function name> exposed, else we end up breaking SOMEONE. At the moment, we're breaking the example you have, however if we just switch the .ifunc to be <function name>, we end up breaking cases where 1 TU is built with Clang 18 and the other with Clang-trunk. So we need to figure out what the 'compatibility story' is between Clang 18, Clang-Trunk, and GCC with each of them in the position of:

1- Generated the function TU
2- Consumed the function, not knowing about it being TC
3- Consumed the function, knowing it was TC.

Makes sense. I'll experiment a bit and update review!

skan added a subscriber: skan.Aug 23 2023, 6:03 PM

A couple of observations with regard to compatibility:

gcc,, at least by default, emits the TC implementations as local functions, the resolver as a weak global function, and the undecorated name as an ifunc. When only a TC declaration (not a definition) is present, gcc still emits the resolver and ifunc, but a failure then occurs at line time since the implementation symbols referenced by the resolver function cannot be resolved to the local symbols emitted for a TU with the TC definitions. I can only guess that this is intentional; that inter-TU calls are intended to be dispatched via the ifunc+resolver and that TC use is therefore restricted to TUs that have definitions. Clang, at least by default, emits the TC implementations as global functions that can be resolved for direct reference by other TUs. This could be seen as potentially exposing implementation details (e.g., the author of the TC functions might reserve the right to change the list of targets without worrying about breaking usage by other TUs.

For TUs that only have a declaration of the function without TC, it will be necessary to recompile the TUs that define TC functions no matter what. This is because the calling TU (that has neither a declaration nor a definition of the function with TC applied) cannot know to dispatch to the ifunc+resolver and cannot provide a definition for the undecorated function name (not even an alias since it doesn't know to do so).

I think it would also be a good idea to exercise the test case for a target that lacks ifunc support to ensure that inter-TU calls work as expected.

aarch64 has different mangling for default functions when using target_clones. Sigh

int __attribute__((target_clones("ls64_v+fp16", "default"))) foo_ovl(int) { return 1; } results in @_Z7foo_ovli._Mfp16Mls64_v and @_Z7foo_ovlv. Note there is no ".default" as in other architectures.