This is an archive of the discontinued LLVM Phabricator instance.

Normalize GlobalDecls when used with CPUDispatch
AbandonedPublic

Authored by erichkeane on Dec 10 2018, 11:53 AM.

Details

Summary

Previously, CPUDispatch functionality did some wacky things with
GlobalDecls, which resulted in multiple GlobalDecls having the same
mangled name. This patch corrects this in a few ways:

First, it Moves the emission of CPUDispatch resolvers to the end of the
translation unit. This is akin to how 'target' multiversioning works,
and permits the resolver emission mechanism to see all possible
FunctionDecls in the TU.

Second, it changes the meaning of the GlobalDecl MultiVersionIndex to
MultiVersionOrdinal. In the case of CPUSpecific, '0' now means "A call
to a non-present CPUDispatch function". Other ordinals are the order of
the CPU Name in the attribute source. In the case of CPUDispatch, '0'
is the resolver function, and other ordinals represent the non-present
CPU-Specific version.

Because of this, the 1:1 relationship between a GlobalDecl and a
Mangled Name is restored. Additionally, it results in functions only
being emitted when necessary, rather than use a variety of hackery to
emit in cases we presume they would be used.

Diff Detail

Event Timeline

erichkeane created this revision.Dec 10 2018, 11:53 AM
aaron.ballman added inline comments.Dec 11 2018, 12:43 PM
include/clang/Basic/Attr.td
896–910

gets -> Gets

901

if (II->isStr(Name))

908

Can this return a const IdentifierInfo * instead?

lib/CodeGen/CodeGenModule.cpp
2150

Rather than use hasAttr<> followed by getAttr<>, just get the attribute and test it for null.

Also, const auto *.

2464

const auto *

2584

const auto *

2611

const auto *

2634–2641

const auto *

erichkeane marked 8 inline comments as done.

Thanks @aaron.ballman . This should fix all of your complaints.

rsmith marked an inline comment as done.Dec 11 2018, 1:55 PM

Thanks, this makes a lot of sense to me.

lib/CodeGen/CodeGenModule.cpp
907

Missing indentation on this line.

918

Hmm, it looks like we don't have a unique name for a GlobalDecl naming the CPUOrdinal == 0 case of an overloaded cpu_specific function:

__attribute__((cpu_specific(atom)))
void f() {} // CPUOrdinal 0 GD mangled as _Z1fv.ifunc
__attribute__((cpu_specific(haswell)))
void f() {} // CPUOrdinal 0 GD also mangled as _Z1fv.ifunc

Maybe we could append the CPU-specific manglings for all named CPUs before the .ifunc?

2142–2155

Instead of adding a special-case list of MultiVersionFuncs, could we instead change MayBeEmittedEagerly to return false for CPUDispatch functions and use the normal DeferredDecls mechanism for this?

2656–2657

I think the first part of this should now be equivalent to getMangledName(...ImplDecl...), which should means that these 6 lines can be replaced by GetAddrOfFunction(ImplDecl, DeclTy, false, false, NotForDefinition).

(Also I think you should pass NotForDefinition here because you just want to reference the symbol, you don't want to define it yourself.)

2657–2663

This too should just be the same GetAddrOfFunction call, I think.

2659

I think we should have a comment somewhere explaining the subtle use of the multiversion ordinal:

  • (Resolver, 0) refers to the resolver itself
  • (Resolver, N) refers to an external (outside this module) symbol for the N'th CPU in Resolver's cpu_dispatch list
  • (Specific, 0) is a placeholder used to refer to the set of function symbols defined by a cpu_specific function declaration; no corresponding symbol is ever emitted (right?)
  • (Specific, N) refers to an internal (within this module) symbol for the N'th CPU in Specific's cpu_specific list
erichkeane marked 4 inline comments as done.Dec 11 2018, 2:28 PM
erichkeane added inline comments.
lib/CodeGen/CodeGenModule.cpp
918

Well shoot, thats true, since you can call the function at different points and have overload resolution 'choose' a different one (since it is just by ordering).

If we were to give it a different name, we would either have to replace it later, or make sure a call to 'not the ifunc' doesn't happen.

2142–2155

That list already would have to exist for GCC Multiversioning, since there isn't a single declaration that corresponds to the resolver. I could definitely remove CPUDispatch from this mechanism and return it to a separate one, but I'm not sure what that buys us.

Unless you think this is something we can do that will fix the GCC Multiversioning as well?

2656–2657

I remember 'DontDefer=true' being set for a good reason, since we need this function emitted.

At the moment, this doesn't work because GetOrCreateLLVMFunction (called by GetAddressOfFunction) replaces NotForDefinition with the ifunc instead of the function, so calling that here ends up getting the ifunc/resolver instead of the function itself.

I'll continue looking into that, but if you have alternative design decisions, I'd love to hear them.

2659

(Specific,0) is a placeholder for a not-yet-existing CPU-Dispatch version of the function. Otherwise, I think documenting that somewhere is a good idea. I'll look for a good spot for it.

erichkeane marked 7 inline comments as done.Dec 12 2018, 7:10 AM

I got a couple of @rsmith's requests done. Most importantly I suspect is MultiVersionFuncs, though NotForDefinition vs ForDefinition is perhpas something you'll find important.

lib/CodeGen/CodeGenModule.cpp
918

I thought about it more last night, if the purpose is simply unique-ness, we can generate it as a unique name. The intent is that it should never make it into the IR, so I think appending the names and .ifunc is fine, simply so that it is unique.

2142–2155

I tried for a bit, and couldn't come up with a way that this would work for Target as well. IMO, since MultiVersionFuncs already has to exist, this fits better there.

Additionally, I noticed that when I switched Dispatch over to this version that I was having to emit more functions during resolver emission, since they were in the deferred list. The deferred list then attempted to emit them, but no longer needed to (though no problems were caused by this).

2656–2657

I replaced both of these as you stated, though I DID need to do ForDefinition. I couldn't come up with a not-super-intrusive way to differentiate between "Not for definition, but for resolver use" and "Not for definition, give me the IFunc".

2659

I rephrased this slightly and put it in "EmitMultiVersionFunctionDefinition", since it seems to fit there best IMO.

As mentioned, the @rsmith comments that I thought were doable without feedback.

erichkeane abandoned this revision.Mar 19 2022, 7:54 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 19 2022, 7:54 PM