This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Work around getBaseObject returning null for alias-to-ifunc
Needs ReviewPublic

Authored by nextsilicon-itay-bookstein on Jun 16 2020, 12:00 AM.

Details

Summary

Previously, computeAliasSummary would crash on a null dereference
for a GlobalIFunc aliasee because getBaseObject returns nullptr in this
case. This patch works around the issue by explicitly handling that
case and returning the GlobalIFunc aliasee cast to a GlobalValue.

Works around https://bugs.llvm.org/show_bug.cgi?id=46340

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 16 2020, 12:00 AM

Wondering what the most appropriate place for the test case is.
I can hit the crash by passing --module-summary to opt, but the crash is a bit removed from the underlying issue itself, I think.

Odd that it is only exposed through the alias summary generation. There are lots of places that call getBaseObject, but maybe not as many as I assumed. If it is only reproducible that way a test based on that is fine with me.

llvm/lib/IR/Globals.cpp
447 ↗(On Diff #270971)

Would it make more sense to just generalize the GlobalAlias handling already here to handle all GlobalIndirectSymbols? And specifically, it seems to be looking for cycles (via the Aliases DenseSet parameter) - does that make sense for IFuncs too?

I added a test case that exercises the crash before the fix.
However, your comment made me realize that there are two separate issues + one question here:

  1. getBaseObject() is currently unable to indirect through GlobalIFuncs (as the current fix tries to address).
  2. computeAliasSummary() assumes that getBaseObject() can never return nullptr, while it has potential flows that return nullptr. In Verifier::visitGlobalAlias() I see that aliases-to-ConstantExpr-s are valid, but in Verifier::visitAliaseeSubExpr I see that alias cycles are invalid. If we assume that the module for which we generate a summary is valid, this question amounts to whether it is valid for the ConstantExpr case of an alias to ever return nullptr there. I can add an assert(Aliasee) / fatal(..) in computeAliasSummary.
  3. Is it valid for a GlobalIFunc to have an alias as its resolver function operand? It sounds contrived, but valid nonetheless (e.g. if the resolver is in a different translation unit and they end up getting merged in LTO?). A GlobalIFunc with a resolver that is itself a GlobalIFunc sounds like an invalid construction to me, though. I don't see references to GlobalIFunc in Verifier.cpp to use as guiding principles for what would be considered valid. So, it sounds to me that with respect to module validity, in an alias-...-alias-ifunc-alias-...-resolverfunc chain there should only ever be a single ifunc at most.

Regardless, changing the findBaseObject implementation to take a set of GlobalIndirectSymbols and uniformizing that handling sounds like a fine alternative to the currently proposed fix, depending (?) on the points I raised above.

I added a test case that exercises the crash before the fix.
However, your comment made me realize that there are two separate issues + one question here:

  1. getBaseObject() is currently unable to indirect through GlobalIFuncs (as the current fix tries to address).
  2. computeAliasSummary() assumes that getBaseObject() can never return nullptr, while it has potential flows that return nullptr. In Verifier::visitGlobalAlias() I see that aliases-to-ConstantExpr-s are valid, but in Verifier::visitAliaseeSubExpr I see that alias cycles are invalid. If we assume that the module for which we generate a summary is valid, this question amounts to whether it is valid for the ConstantExpr case of an alias to ever return nullptr there. I can add an assert(Aliasee) / fatal(..) in computeAliasSummary.
  3. Is it valid for a GlobalIFunc to have an alias as its resolver function operand? It sounds contrived, but valid nonetheless (e.g. if the resolver is in a different translation unit and they end up getting merged in LTO?). A GlobalIFunc with a resolver that is itself a GlobalIFunc sounds like an invalid construction to me, though. I don't see references to GlobalIFunc in Verifier.cpp to use as guiding principles for what would be considered valid. So, it sounds to me that with respect to module validity, in an alias-...-alias-ifunc-alias-...-resolverfunc chain there should only ever be a single ifunc at most.

Regardless, changing the findBaseObject implementation to take a set of GlobalIndirectSymbols and uniformizing that handling sounds like a fine alternative to the currently proposed fix, depending (?) on the points I raised above.

My knowledge of IFunc details is pretty thin. Adding @DmitryPolukhin who ported over ifunc support from gcc and might be able to provide some guidance. I agree it makes sense to have the use in computeAliasSummary check for the null with an assert. And if the usage you describe in 3 is illegal, it would be good to have the verifier check for this.

I added a test case that exercises the crash before the fix.
However, your comment made me realize that there are two separate issues + one question here:

  1. getBaseObject() is currently unable to indirect through GlobalIFuncs (as the current fix tries to address).
  2. computeAliasSummary() assumes that getBaseObject() can never return nullptr, while it has potential flows that return nullptr. In Verifier::visitGlobalAlias() I see that aliases-to-ConstantExpr-s are valid, but in Verifier::visitAliaseeSubExpr I see that alias cycles are invalid. If we assume that the module for which we generate a summary is valid, this question amounts to whether it is valid for the ConstantExpr case of an alias to ever return nullptr there. I can add an assert(Aliasee) / fatal(..) in computeAliasSummary.
  3. Is it valid for a GlobalIFunc to have an alias as its resolver function operand? It sounds contrived, but valid nonetheless (e.g. if the resolver is in a different translation unit and they end up getting merged in LTO?). A GlobalIFunc with a resolver that is itself a GlobalIFunc sounds like an invalid construction to me, though. I don't see references to GlobalIFunc in Verifier.cpp to use as guiding principles for what would be considered valid. So, it sounds to me that with respect to module validity, in an alias-...-alias-ifunc-alias-...-resolverfunc chain there should only ever be a single ifunc at most.

Regardless, changing the findBaseObject implementation to take a set of GlobalIndirectSymbols and uniformizing that handling sounds like a fine alternative to the currently proposed fix, depending (?) on the points I raised above.

My knowledge of IFunc details is pretty thin. Adding @DmitryPolukhin who ported over ifunc support from gcc and might be able to provide some guidance. I agree it makes sense to have the use in computeAliasSummary check for the null with an assert. And if the usage you describe in 3 is illegal, it would be good to have the verifier check for this.

I implemented ifunc using GCC doc as a kind of spec because there is no formal definition of the feature as far as I know. For for ifunc resolver function is not a base object, it is called at runtime to return address of the function that will called. I think ifunc itself should be used as the base object.

Doesn't that mean that GlobalIFunc shouldn't really inherit from GlobalIndirectSymbol (and then GlobalIndirectSymbol loses the justification for its existence)?
I get the feeling like there's an expectation for getBaseObject to be idempotent (since it pierces through ConstantExprs and GlobalAliases), but GlobalIndirectSymbol::getBaseObject takes the resolver (getOperand(0)) and performs the lookup on it.

Doesn't that mean that GlobalIFunc shouldn't really inherit from GlobalIndirectSymbol (and then GlobalIndirectSymbol loses the justification for its existence)?
I get the feeling like there's an expectation for getBaseObject to be idempotent (since it pierces through ConstantExprs and GlobalAliases), but GlobalIndirectSymbol::getBaseObject takes the resolver (getOperand(0)) and performs the lookup on it.

GlobalIndirectSymbol doesn't mean to be an alias. GlobalIFunc doesn't inherit from GlobalAliases that would be wrong.

As far as I can tell, GlobalIndirectSymbol is there to share code between GlobalAlias and GlobalIFunc.
This shared code is mostly a common getBaseObject() implementation, which just delegates to getIndirectSymbol() (== getOperand(0)).
For GlobalAliases, the indirect symbol is the aliasee. For GlobalIFuncs, the indirect symbol is the resolver function.

You said that for GlobalIFuncs the resolver function shouldn't be the base object. It makes sense to me, because the signature of the resolver is different from the signature of the fptr that it's supposed to return.
BUT this means that the getBaseObject() implementation that GlobalIFunc inherits from GlobalIndirectSymbol is wrong for GlobalIFunc.
The getBaseObject() code sharing between GlobalAlias and GlobalIFunc looks like the only purpose of GlobalIndirectSymbol, and it's wrong for one of the classes (or so we claim). That's why I don't understand why GlobalIndirectSymbol should exist at all

How do we want to proceed, then? Does my analysis make sense?

getBaseObject is only small part of code sharing, the majority of the code is outside the class in common code that handles both GlobalIFunc and GlobalAliases in the same way. Here they should be treated differently so, IMHO, there is no need in changing the inheritance but GlobalIFunc should be handles as a special case here.

llvm/lib/IR/Globals.cpp
448 ↗(On Diff #271316)

return GI; should work here I think.

DmitryPolukhin added inline comments.Jul 7 2020, 12:38 PM
llvm/lib/IR/Globals.cpp
448 ↗(On Diff #271316)

Nope, it doesn't just from types point of view. But IFunc should be treated as an object itself.

I did a bit of archeology and it turns out that getBaseObejct was part of moved from GlobalAlias to GlobalIndirectSymbol in https://github.com/llvm/llvm-project/commit/95549497ec8b5269f0439f12859537b7371b7c90
It looks like the simplest solution is to handle nullptr from getBaseObejct in computeAliasSummary...

I did a bit of archeology and it turns out that getBaseObejct was part of moved from GlobalAlias to GlobalIndirectSymbol in https://github.com/llvm/llvm-project/commit/95549497ec8b5269f0439f12859537b7371b7c90
It looks like the simplest solution is to handle nullptr from getBaseObejct in computeAliasSummary...

Doesn't that mean that getBaseObject is given two contradictory meanings, then?
On the one hand, from what you say, the base object should be able to "act in place of the enclosing object" => GlobalIFunc::getBaseObject() should return itself, since the resolver doesn't substitute for the symbol it's supposed to resolve.
On the other hand, from the expectations of code like DCE and SplitModule (in the commit you linked), the base object should tell you that these objects are "linked together" in some respect => GlobalIFunc::getBaseObject() should return the resolver, since they are 'inseparable' in that respect.

From a types point of view, either of the two restores the idempotence getBaseObject().

But as far as this PR is concerned:

  • Without this PR's changes, if you have Alias-to-IFunc, getBaseObject() on the Alias returns nullptr and getBaseObject() on the IFunc returns the resolver.
  • If we put "return GI;" in findBaseObject(), getBaseObject() for the Alias returns the IFunc, but getBaseObject() on the IFunc returns the resolver.

Both cases are not idempotent :\

I don't understand what do you mean by "not idempotent" behavior in this case. As far as I can see GlobalIFunc doesn't implement own getBaseObject (and it is not virtual) so calling getBaseObject on the IFunc should return null same as calling it on Alias-to-IFunc. Calling getbaseObject on Alias-to-IFunc will recursively call it on IFunc that will return null that will be propagated, isn't it? So in my opinion computeAliasSummary should handle null without crash because other places have checks for null returned from getBaseObject.

@tejohnson would simply not emitting an AliasSummary in computeAliasSummary when getBaseObject() returns nullptr be sufficient, then?

@tejohnson would simply not emitting an AliasSummary in computeAliasSummary when getBaseObject() returns nullptr be sufficient, then?

We need a full reference graph in the thin link otherwise we may do incorrect optimizations (e.g. think that the aliasee is dead). I think then you'll run into the problem that D82745 is trying to address (ifunc removed because we don't summarize it).

Perhaps the right fix for now is to implement @DmitryPolukhin 's suggestion of returning GI there, so at least we get the alias summary correct. Then when D82745 is completed, the IFunc itself will be summarized.

The problem with what @DmitryPolukhin suggested (which he later acknowledged in an additional comment), is that getBaseObject() returns a GlobalObject, but GlobalIFunc does not inherit from GlobalObject, so that can't compile.

I get a sense that GlobalIFunc should derive directly from GlobalObject (since it should be a valid final target for aliases), instead of GlobalIndirectSymbol; but that's a pretty big, cross-cutting change which I'm not even sure 100% makes sense :\

The problem with what @DmitryPolukhin suggested (which he later acknowledged in an additional comment), is that getBaseObject() returns a GlobalObject, but GlobalIFunc does not inherit from GlobalObject, so that can't compile.

I get a sense that GlobalIFunc should derive directly from GlobalObject (since it should be a valid final target for aliases), instead of GlobalIndirectSymbol; but that's a pretty big, cross-cutting change which I'm not even sure 100% makes sense :\

Oh I see. Hmm, there seems to be a fundamental issue with the way IFunc are currently represented, assuming it is legal to have an alias to ifunc (I don't see why not), and if it is not correct to return the ifunc resolver in this case (not sure).

How about this:

  1. For now, to fix your immediate issue, change computeAliasSummary so that if getBaseObject returns null, then you see if you can cast the getAliasee result to GlobalIFunc and use that as the Aliasee (which needs to be made type GlobalValue instead of GlobalObject, which is fine since we just call getGUID on it and that's defined on GlobalValue). (Note there is still is the related problem shown in D82745.)
  1. Start an llvm-dev thread on the higher level issue of whether GlobalIFunc should be derived from GlobalIndirectSymbol vs GlobalObject with a summary of the discussion/issues here.
nextsilicon-itay-bookstein retitled this revision from [IR] Fix getBaseObject for GlobalAlias-to-GlobalIFunc to [ThinLTO] Work around getBaseObject returning null for alias-to-ifunc.Sep 7 2020, 9:34 AM
nextsilicon-itay-bookstein edited the summary of this revision. (Show Details)

I have updated the diff to include the changes you suggested. However, as you said, for the test to have a chance of passing, D82745 is required. I will send out a mail to discuss the design issue on llvm-dev soon.

I have updated the diff to include the changes you suggested. However, as you said, for the test to have a chance of passing, D82745 is required. I will send out a mail to discuss the design issue on llvm-dev soon.

If the code won't actually work for now, does it even make sense to submit this one before D82745 and the general issue are resolved? Perhaps this one should be made dependent on D82745.