This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Handle chains of aliases
ClosedPublic

Authored by tejohnson on Nov 13 2018, 6:18 PM.

Details

Summary

At -O0, globalopt is not run during the compile step, and we can have a
chain of an alias having an immediate aliasee of another alias. We were
not correctly summarizing this (we always got the base object when
recording the aliasee), and also not handling it properly in ThinLTO
dead stripping, which would only mark the base object as live. This
could result in the intermediate alias incorrectly being marked dead and
dropped to a declaration.

Fix by correctly representing an alias to another alias in the summary.
Also change the dead stripping computation so that it marks all aliases
in the chain as live.

This required adding a version of stripInBoundsOffsets() that doesn't
follow aliases (this is the interface called by getBaseObject, and we
want to be able to strip in bounds offsets from the immediate aliasee).

Diff Detail

Repository
rL LLVM

Event Timeline

tejohnson created this revision.Nov 13 2018, 6:18 PM
davidxl added inline comments.Dec 11 2018, 10:49 AM
include/llvm/IR/ModuleSummaryIndex.h
427 ↗(On Diff #173973)

Can this be refactored as

return const_cast<const GlobalValueSummary *>(this)->getBaseObject();

to avoid code duplication?

lib/Analysis/ModuleSummaryAnalysis.cpp
446 ↗(On Diff #173973)

Add a comment for the function?

lib/IR/Value.cpp
475 ↗(On Diff #173973)

Add some documentation on the InBounds kinds?

tejohnson marked 4 inline comments as done.Dec 11 2018, 12:01 PM
tejohnson added inline comments.
include/llvm/IR/ModuleSummaryIndex.h
427 ↗(On Diff #173973)

Yep. Although needs another const_cast on the return value to remove the const again.

tejohnson marked an inline comment as done.

Address comments

pcc added a comment.Dec 11 2018, 12:19 PM

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

davidxl added inline comments.Dec 11 2018, 12:25 PM
lib/IR/Value.cpp
469 ↗(On Diff #177754)

Perhaps there is no need to change the name of this kind -- proper documentation is probably enough.

471 ↗(On Diff #177754)

Name the new one PSK_InBoundsNoAliases so that existing enumerator's semantics do not change.

473 ↗(On Diff #177754)

Perhaps leave the semantics of this enum unchanged?

In D54507#1327394, @pcc wrote:

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

Can you clarify? The summary doesn't contain any references to the intermediate aliasee so it looks completely unreferenced.

pcc added a comment.Dec 11 2018, 1:31 PM
In D54507#1327394, @pcc wrote:

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

Can you clarify? The summary doesn't contain any references to the intermediate aliasee so it looks completely unreferenced.

And indeed it isn't. The important thing to keep alive is the base object, which is what the code already does.

My first idea was that we would change the code in convertToDeclaration here:
http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/FunctionImport.cpp#923
so that we would RAUW all alias references to point to the external and all non-alias references to point to the aliasee. But unfortunately it doesn't seem easy to implement "replace some uses with" on constants.

Probably the simplest fix is to make sure that each aliasee is canonicalized to point to the base object before calling convertToDeclaration. This seems pretty straightforward using ConstantExpr::getWithOperands. This also takes us one step closer to the canonical alias form we discussed last year: we would just need to apply the same canonicalization *before* writing the bitcode file.

In D54507#1327573, @pcc wrote:
In D54507#1327394, @pcc wrote:

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

Can you clarify? The summary doesn't contain any references to the intermediate aliasee so it looks completely unreferenced.

And indeed it isn't. The important thing to keep alive is the base object, which is what the code already does.

My first idea was that we would change the code in convertToDeclaration here:
http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/FunctionImport.cpp#923
so that we would RAUW all alias references to point to the external and all non-alias references to point to the aliasee. But unfortunately it doesn't seem easy to implement "replace some uses with" on constants.

Probably the simplest fix is to make sure that each aliasee is canonicalized to point to the base object before calling convertToDeclaration.

Do you mean make sure that each "alias" is canonicalized to point to the base object? I.e. if we have an alias relationship "A aliases B aliases C", then convert it to "A aliases C" and "B aliases C"? Presumably that would fix the issue, but the summary is still not accurately reflecting the IR (although not sure where else it would be problematic).

This seems pretty straightforward using ConstantExpr::getWithOperands. This also takes us one step closer to the canonical alias form we discussed last year: we would just need to apply the same canonicalization *before* writing the bitcode file.

Yes this alias canonicalization would also fix the issue, before the summary is generated. Unfortunately it's not something I have had time to work on since that first attempt.

pcc added a comment.Dec 11 2018, 2:36 PM
In D54507#1327573, @pcc wrote:
In D54507#1327394, @pcc wrote:

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

Can you clarify? The summary doesn't contain any references to the intermediate aliasee so it looks completely unreferenced.

And indeed it isn't. The important thing to keep alive is the base object, which is what the code already does.

My first idea was that we would change the code in convertToDeclaration here:
http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/FunctionImport.cpp#923
so that we would RAUW all alias references to point to the external and all non-alias references to point to the aliasee. But unfortunately it doesn't seem easy to implement "replace some uses with" on constants.

Probably the simplest fix is to make sure that each aliasee is canonicalized to point to the base object before calling convertToDeclaration.

Do you mean make sure that each "alias" is canonicalized to point to the base object? I.e. if we have an alias relationship "A aliases B aliases C", then convert it to "A aliases C" and "B aliases C"?

Yes, exactly.

Presumably that would fix the issue, but the summary is still not accurately reflecting the IR (although not sure where else it would be problematic).

I don't think it would be problematic. In this case the summary contains the more accurate representation of reality because references to globals in alias definitions, unlike all other references, are not subject to linker preemption, and we shouldn't allow the less-accurate representation in the IR to leak anywhere else.

This seems pretty straightforward using ConstantExpr::getWithOperands. This also takes us one step closer to the canonical alias form we discussed last year: we would just need to apply the same canonicalization *before* writing the bitcode file.

Yes this alias canonicalization would also fix the issue, before the summary is generated. Unfortunately it's not something I have had time to work on since that first attempt.

Yes but note that I am not proposing that we canonicalize everywhere right now. We just need to canonicalize here: http://llvm-cs.pcc.me.uk/lib/LTO/LTOBackend.cpp#490 (before dropDeadSymbols).

In D54507#1327677, @pcc wrote:
In D54507#1327573, @pcc wrote:
In D54507#1327394, @pcc wrote:

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

Can you clarify? The summary doesn't contain any references to the intermediate aliasee so it looks completely unreferenced.

And indeed it isn't. The important thing to keep alive is the base object, which is what the code already does.

My first idea was that we would change the code in convertToDeclaration here:
http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/FunctionImport.cpp#923
so that we would RAUW all alias references to point to the external and all non-alias references to point to the aliasee. But unfortunately it doesn't seem easy to implement "replace some uses with" on constants.

Probably the simplest fix is to make sure that each aliasee is canonicalized to point to the base object before calling convertToDeclaration.

Do you mean make sure that each "alias" is canonicalized to point to the base object? I.e. if we have an alias relationship "A aliases B aliases C", then convert it to "A aliases C" and "B aliases C"?

Yes, exactly.

Presumably that would fix the issue, but the summary is still not accurately reflecting the IR (although not sure where else it would be problematic).

I don't think it would be problematic. In this case the summary contains the more accurate representation of reality because references to globals in alias definitions, unlike all other references, are not subject to linker preemption, and we shouldn't allow the less-accurate representation in the IR to leak anywhere else.

Philosophically, I think the summary should summarize the IR that we have (in the bitcode file), not the IR that we want. So I'm not completely comfortable with this approach - I'm actually thinking it may be better to add a skeleton alias canonicalization pass (taken from that draft alias canonicalization patch D29781) that is invoked during thinlto preparation. For now, it can just do this flatting of alias chains, and eventually expanded into the full canonicalization (and invoked in all paths as described in that patch description). I don't want to sign up to do the whole hog canonicalization currently since this bug only shows up at -O0 and I don't have the bandwidth to focus on it ATM. Does that seem like a reasonable interim approach?

This seems pretty straightforward using ConstantExpr::getWithOperands. This also takes us one step closer to the canonical alias form we discussed last year: we would just need to apply the same canonicalization *before* writing the bitcode file.

Yes this alias canonicalization would also fix the issue, before the summary is generated. Unfortunately it's not something I have had time to work on since that first attempt.

Yes but note that I am not proposing that we canonicalize everywhere right now. We just need to canonicalize here: http://llvm-cs.pcc.me.uk/lib/LTO/LTOBackend.cpp#490 (before dropDeadSymbols).

pcc added a comment.Dec 12 2018, 1:35 PM
In D54507#1327677, @pcc wrote:
In D54507#1327573, @pcc wrote:
In D54507#1327394, @pcc wrote:

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

Can you clarify? The summary doesn't contain any references to the intermediate aliasee so it looks completely unreferenced.

And indeed it isn't. The important thing to keep alive is the base object, which is what the code already does.

My first idea was that we would change the code in convertToDeclaration here:
http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/FunctionImport.cpp#923
so that we would RAUW all alias references to point to the external and all non-alias references to point to the aliasee. But unfortunately it doesn't seem easy to implement "replace some uses with" on constants.

Probably the simplest fix is to make sure that each aliasee is canonicalized to point to the base object before calling convertToDeclaration.

Do you mean make sure that each "alias" is canonicalized to point to the base object? I.e. if we have an alias relationship "A aliases B aliases C", then convert it to "A aliases C" and "B aliases C"?

Yes, exactly.

Presumably that would fix the issue, but the summary is still not accurately reflecting the IR (although not sure where else it would be problematic).

I don't think it would be problematic. In this case the summary contains the more accurate representation of reality because references to globals in alias definitions, unlike all other references, are not subject to linker preemption, and we shouldn't allow the less-accurate representation in the IR to leak anywhere else.

Philosophically, I think the summary should summarize the IR that we have (in the bitcode file), not the IR that we want. So I'm not completely comfortable with this approach - I'm actually thinking it may be better to add a skeleton alias canonicalization pass (taken from that draft alias canonicalization patch D29781) that is invoked during thinlto preparation. For now, it can just do this flatting of alias chains, and eventually expanded into the full canonicalization (and invoked in all paths as described in that patch description). I don't want to sign up to do the whole hog canonicalization currently since this bug only shows up at -O0 and I don't have the bandwidth to focus on it ATM. Does that seem like a reasonable interim approach?

Sounds reasonable enough. It seems a little better to do this during LTO because it would allow us to upgrade old bitcode but I'm happy for it to be a pass as well. I left some comments on D29781 about the overall approach there.

In D54507#1328732, @pcc wrote:
In D54507#1327677, @pcc wrote:
In D54507#1327573, @pcc wrote:
In D54507#1327394, @pcc wrote:

Isn't this just a bug in promotion? It should probably internalize any non-prevailing aliases like lib/Linker does.

Can you clarify? The summary doesn't contain any references to the intermediate aliasee so it looks completely unreferenced.

And indeed it isn't. The important thing to keep alive is the base object, which is what the code already does.

My first idea was that we would change the code in convertToDeclaration here:
http://llvm-cs.pcc.me.uk/lib/Transforms/IPO/FunctionImport.cpp#923
so that we would RAUW all alias references to point to the external and all non-alias references to point to the aliasee. But unfortunately it doesn't seem easy to implement "replace some uses with" on constants.

Probably the simplest fix is to make sure that each aliasee is canonicalized to point to the base object before calling convertToDeclaration.

Do you mean make sure that each "alias" is canonicalized to point to the base object? I.e. if we have an alias relationship "A aliases B aliases C", then convert it to "A aliases C" and "B aliases C"?

Yes, exactly.

Presumably that would fix the issue, but the summary is still not accurately reflecting the IR (although not sure where else it would be problematic).

I don't think it would be problematic. In this case the summary contains the more accurate representation of reality because references to globals in alias definitions, unlike all other references, are not subject to linker preemption, and we shouldn't allow the less-accurate representation in the IR to leak anywhere else.

Philosophically, I think the summary should summarize the IR that we have (in the bitcode file), not the IR that we want. So I'm not completely comfortable with this approach - I'm actually thinking it may be better to add a skeleton alias canonicalization pass (taken from that draft alias canonicalization patch D29781) that is invoked during thinlto preparation. For now, it can just do this flatting of alias chains, and eventually expanded into the full canonicalization (and invoked in all paths as described in that patch description). I don't want to sign up to do the whole hog canonicalization currently since this bug only shows up at -O0 and I don't have the bandwidth to focus on it ATM. Does that seem like a reasonable interim approach?

Sounds reasonable enough. It seems a little better to do this during LTO because it would allow us to upgrade old bitcode but I'm happy for it to be a pass as well. I left some comments on D29781 about the overall approach there.

Uploading patch to do this now. Will update the summary if we go with this approach. Note that the clang side patch which I will upload separately contains the invocations of the new pass for the new PM (mirroring how this was done for NameAnonGlobals).

tejohnson updated this revision to Diff 177918.Dec 12 2018, 2:15 PM

Update per comments.

pcc added inline comments.Dec 12 2018, 2:22 PM
lib/Transforms/Utils/CanonicalizeAliases.cpp
51 ↗(On Diff #177918)

This isn't quite right; it will rewrite

ga1 = alias ga2
ga2 = alias gep(gv, 0, 1)

as

ga1 = alias gv

i.e. you will drop the offset from the alias. The implementation in https://reviews.llvm.org/D29781#1328720 handles this case correctly.

tejohnson marked an inline comment as done.Dec 12 2018, 6:07 PM
tejohnson added inline comments.
lib/Transforms/Utils/CanonicalizeAliases.cpp
51 ↗(On Diff #177918)

You're right. We want to look at the direct aliasee (recursively). Made that change and enhanced the test case to cover this.

tejohnson updated this revision to Diff 177983.Dec 12 2018, 6:07 PM

Address comments

tejohnson marked an inline comment as done.Dec 12 2018, 6:38 PM
tejohnson added inline comments.
lib/Transforms/Utils/CanonicalizeAliases.cpp
51 ↗(On Diff #177918)

Actually, I think my new approach won't quite work if both aliases in the chain involve expressions - probably needs something like your ConstantExpr case in D29781. Need to construct an example.

pcc added inline comments.Dec 12 2018, 6:46 PM
lib/Transforms/Utils/CanonicalizeAliases.cpp
51 ↗(On Diff #177918)

Yes, here's an example that won't canonicalize correctly:

ga1 = alias gep(ga2, 0, 1)
ga2 = alias gv
40 ↗(On Diff #177983)

I think you can change this to namespace { and then the functions below don't need to be static.

48 ↗(On Diff #177983)

With the current code you don't actually need the recursive call here, it could just be

while (auto *GA2 = dyn_cast<GlobalAlias>(Aliasee))
  Aliasee = GA2->getAliasee();

(but ignore if you plan to make this recursive).

51 ↗(On Diff #177983)

This is just GA->setAliasee(Aliasee), no?

tejohnson marked 4 inline comments as done.Dec 13 2018, 8:44 AM

New version which more closely mirrors logic proposed in D29781 comments. One difference is that I update all aliases in the chain in one recursive pass. In any case, this should make it easy to implement D29781 on top as a follow on.

lib/Transforms/Utils/CanonicalizeAliases.cpp
51 ↗(On Diff #177918)

Added that case to my test as well.

48 ↗(On Diff #177983)

I decided to make it recursive, so we can fix up all aliases in the chain in one go.

51 ↗(On Diff #177983)

Right, that was leftover from an older flawed version.

tejohnson updated this revision to Diff 178074.Dec 13 2018, 8:44 AM

Fix algorithm, update test.

pcc added inline comments.Dec 14 2018, 1:52 PM
lib/Transforms/Utils/CanonicalizeAliases.cpp
61 ↗(On Diff #178074)

You don't need to track this yourself, getWithOperands will do it for you.
http://llvm-cs.pcc.me.uk/lib/IR/Constants.cpp#1185

70 ↗(On Diff #178074)

Remove braces

tejohnson marked 2 inline comments as done.Dec 14 2018, 2:02 PM
tejohnson updated this revision to Diff 178280.Dec 14 2018, 2:03 PM

Address comments

ping - @pcc does this look ok now?

pcc accepted this revision.Jan 4 2019, 10:16 AM

LGTM

lib/Transforms/IPO/PassManagerBuilder.cpp
749 ↗(On Diff #178280)

Move comment.

This revision is now accepted and ready to land.Jan 4 2019, 10:16 AM
tejohnson marked an inline comment as done.Jan 4 2019, 10:35 AM
tejohnson updated this revision to Diff 180270.Jan 4 2019, 10:35 AM

Address comment

This revision was automatically updated to reflect the committed changes.