This is an archive of the discontinued LLVM Phabricator instance.

[StackSafety] Run ThinLTO
ClosedPublic

Authored by vitalybuka on Jun 5 2020, 1:20 AM.

Details

Summary

ThinLTO linking runs dataflow processing on collected
function parameters. Then StackSafetyGlobalInfoWrapperPass
in ThinLTO backend will run as usual looking up to external
symbol in the summary if needed.

Depends on D80985.

Diff Detail

Event Timeline

vitalybuka created this revision.Jun 5 2020, 1:20 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJun 5 2020, 1:20 AM
eugenis added inline comments.Jun 11 2020, 6:01 PM
llvm/lib/Analysis/StackSafetyAnalysis.cpp
620

Do we have a test for this overflow check?

923

Why is this necessary? This affects only !live || !dso_local functions, right? The rest is overwritten in the loop below. As I understand, the post-lto function analysis ignores such functions, so that's strictly a space saving thing. Explain what's going on in a comment, and why removing this access data won't cause false positive results.

llvm/lib/LTO/LTO.cpp
1437

Bad name - "process" is meaningless here. Consider "generateParamAccessSummary" ?

llvm/test/Analysis/StackSafetyAnalysis/ipa-alias.ll
55

For my education - how do you come up with these symbol lists?

vitalybuka marked 5 inline comments as done.

@eugenis review

llvm/lib/Analysis/StackSafetyAnalysis.cpp
620

yes
Example, in bit width 8
[-128,-127)+[-128,-127) = [0,1)
both non-wrapped and result is non-wrapped so we have no way to spot overflow

llvm/test/Analysis/StackSafetyAnalysis/ipa-alias.ll
55

it complains about each missing <obj, symbol> pair
then flags guess and try

eugenis accepted this revision.Jun 12 2020, 3:01 PM

LGTM with 2 notes

llvm/lib/Analysis/StackSafetyAnalysis.cpp
620

Sure. I was asking if we have a testcase that covers the overflow check.

924

typo: summaries

This revision is now accepted and ready to land.Jun 12 2020, 3:01 PM
vitalybuka marked 3 inline comments as done.

address comments

llvm/lib/Analysis/StackSafetyAnalysis.cpp
620

I've extracted this into function and added test to cover that.
It is possible to test this check, however we discussed clamping offsets to something more useful, e.g. 2^(ptrwidth/2). This way we will not be able to hit this check.

vitalybuka edited the summary of this revision. (Show Details)Jun 12 2020, 6:06 PM
vitalybuka edited the summary of this revision. (Show Details)
This revision was automatically updated to reflect the committed changes.

I just noticed that generateParamAccessSummary is taking a bit over 5% of the ThinLTO thin link step in an internal build (and will soon be more than 5% as I found another analysis that is hogging compile time that I'm going to work on fixing). This is currently the second hottest analysis in the thin link. Is the stack safety analysis meant to be always on with ThinLTO?

I have a theory on what is causing it to incur so much overhead, see below.

llvm/lib/Analysis/StackSafetyAnalysis.cpp
913

This is doing a hash table lookup by GUID in the combined index. Can the Callee field be changed to hold the ValueInfo instead of the GUID? Calling findSummaryInModule on a ValueInfo avoids the hash table lookup. E.g. see how we compute the Ref edges in ModuleSummaryAnalysis.cpp (i.e. via getOrInsertValueInfo), and then propagate that through the bitcode.

I just noticed that generateParamAccessSummary is taking a bit over 5% of the ThinLTO thin link step in an internal build (and will soon be more than 5% as I found another analysis that is hogging compile time that I'm going to work on fixing).

FYI with D84985 now committed, this goes up to >7% of the runtime.

This is currently the second hottest analysis in the thin link. Is the stack safety analysis meant to be always on with ThinLTO?

I have a theory on what is causing it to incur so much overhead, see below.

Thanks. Looking.

I just noticed that generateParamAccessSummary is taking a bit over 5% of the ThinLTO thin link step in an internal build (and will soon be more than 5% as I found another analysis that is hogging compile time that I'm going to work on fixing). This is currently the second hottest analysis in the thin link. Is the stack safety analysis meant to be always on with ThinLTO?

I have a theory on what is causing it to incur so much overhead, see below.

vitalybuka added a comment.EditedJul 31 2020, 3:57 PM

Is the stack safety analysis meant to be always on with ThinLTO?

During compilation most of the time it should be off.
However during linking I assume that most build FS->paramAccesses() is empty, so no hash lookup is expected. So I assume empty looks should be cheap:

for (auto &GVS : Index) {
    for (auto &GV : GVS.second.SummaryList) {
}}

As it paramAccesses suppose to be non-empty for MTE builds for now, so if it's not empty on internal build, then the bug it likely around why it's not empty there.
Can you send me email with internal build details?

Is the stack safety analysis meant to be always on with ThinLTO?

During compilation most of the time it should be off.
However during linking I assume that most build FS->paramAccesses() is empty, so no hash lookup is expected. So I assume empty looks should be cheap:

for (auto &GVS : Index) {
    for (auto &GV : GVS.second.SummaryList) {
}}

As it paramAccesses suppose to be non-empty for MTE builds for now, so if it's not empty on internal build, then the bug it likely around why it's not empty there.
Can you send me email with internal build details?

Will do

Is the stack safety analysis meant to be always on with ThinLTO?

During compilation most of the time it should be off.
However during linking I assume that most build FS->paramAccesses() is empty, so no hash lookup is expected. So I assume empty looks should be cheap:

for (auto &GVS : Index) {
    for (auto &GV : GVS.second.SummaryList) {
}}

As it paramAccesses suppose to be non-empty for MTE builds for now, so if it's not empty on internal build, then the bug it likely around why it's not empty there.
Can you send me email with internal build details?

Will do

Thanks, I was able to reproduce.
As expected hash table lookup was not reached. However code inserts empty FunctionInfo<FunctionSummary> into std::map Functions. Which does not affect correctness, but totally unnecessary and noticeable on profile.
https://reviews.llvm.org/rG08cf49658c1d resolves the issue.

I'll still check if we can replace guids with ValueInfo to optimize case when StackSafety is active.

Is the stack safety analysis meant to be always on with ThinLTO?

During compilation most of the time it should be off.
However during linking I assume that most build FS->paramAccesses() is empty, so no hash lookup is expected. So I assume empty looks should be cheap:

for (auto &GVS : Index) {
    for (auto &GV : GVS.second.SummaryList) {
}}

As it paramAccesses suppose to be non-empty for MTE builds for now, so if it's not empty on internal build, then the bug it likely around why it's not empty there.
Can you send me email with internal build details?

Will do

Thanks, I was able to reproduce.
As expected hash table lookup was not reached. However code inserts empty FunctionInfo<FunctionSummary> into std::map Functions. Which does not affect correctness, but totally unnecessary and noticeable on profile.
https://reviews.llvm.org/rG08cf49658c1d resolves the issue.

I'll still check if we can replace guids with ValueInfo to optimize case when StackSafety is active.

Thanks for the fix! I'm surprised simply inserting an entry for every function into a map would be so expensive. This was more expensive than the function importing algorithm, which also does a lot of map insertions (although per GUID not per function summary).

Looking at the code, could you improve that by not inserting a map entry in the case where the summary is not live and or not dso local? I.e. the case where the code isn't even looking at the paramAccesses(). Also, note that if any copy in the summary list for a VI is not live, all copies must be dead (the way computeDeadSymbols works guarantees this). So you could skip the SummaryList iteration completely if the first one is not live. I'm also exploiting this fact in my recent fix D84985.