This is an archive of the discontinued LLVM Phabricator instance.

[Inliner] Add a flag to disable manual alloca merging in the Inliner.
ClosedPublic

Authored by chandlerc on Aug 2 2016, 12:56 AM.

Details

Summary

This is off for now while testing can take place to make sure that in
fact we do sufficient stack coloring to fully obviate the manual alloca
array merging.

Some context on why we should be using stack coloring rather than
merging allocas in this way:

LLVM relies very heavily on analyzing pointers as coming from different
allocas in order to make aliasing decisions. These are some of the most
powerful aliasing signals available in LLVM. So merging allocas is an
extremely destructive operation on the LLVM IR -- it takes away highly
valuable and hard to reconstruct information.

As a consequence, inlined functions which happen to have array allocas
that this pattern matches will fail to be properly interleaved unless
SROA manages to hoist everything to an SSA register. Instead, the
inliner will have added an unnecessary dependence that one inlined
function execute after the other because they will have been rewritten
to refer to the same memory.

All that said, folks will reasonably want some time to experiment here
and make sure there are no significant regressions. A flag should give
us an easy knob to test.

For more context, see the thread here:
http://lists.llvm.org/pipermail/llvm-dev/2016-July/103277.html
http://lists.llvm.org/pipermail/llvm-dev/2016-August/103285.html

Diff Detail

Repository
rL LLVM

Event Timeline

chandlerc updated this revision to Diff 66435.Aug 2 2016, 12:56 AM
chandlerc retitled this revision from to [Inliner] Add a flag to disable manual alloca merging in the Inliner..
chandlerc updated this object.
chandlerc added a subscriber: llvm-commits.
silvas accepted this revision.Aug 2 2016, 1:14 AM
silvas added a reviewer: silvas.
silvas added a subscriber: silvas.

Pulling it out into a separate function seems like a nice cleanup anyway. That seems fine regardless of whether we ultimately remove the logic.

And the flag is off by default and easy enough to remove. So LGTM.

Any initial results from disabling the alloca merging?
Also, do you know if we have any PR's related to the information loss due to this alloca merging?

This revision is now accepted and ready to land.Aug 2 2016, 1:14 AM
mcrosier edited edge metadata.Aug 8 2016, 12:47 PM
mcrosier added a subscriber: bmakam.

@bmakam: Would you mind downloading and testing this patch? Please do full correctness and SPEC200X performance (our head-to-head methodology with train input would be fine).

bmakam added a comment.Aug 9 2016, 8:12 AM

@bmakam: Would you mind downloading and testing this patch? Please do full correctness and SPEC200X performance (our head-to-head methodology with train input would be fine).

Tested on Kryo. Only these SPEC200X benchmarks had non-noise performance gains(runtime) with this change:

spec2006/soplex:train[-2.347%, +3.157%]
spec2006/sphinx3:train[+0.469%, +2.124%]
spec2006/xalancbmk:train[+6.566%, +8.743%]

There were no regressions.

@bmakam: Would you mind downloading and testing this patch? Please do full correctness and SPEC200X performance (our head-to-head methodology with train input would be fine).

Tested on Kryo. Only these SPEC200X benchmarks had non-noise performance gains(runtime) with this change:

spec2006/soplex:train[-2.347%, +3.157%]
spec2006/sphinx3:train[+0.469%, +2.124%]
spec2006/xalancbmk:train[+6.566%, +8.743%]

There were no regressions.

Thanks, Balaram. No correctness issues, correct?

bmakam added a comment.Aug 9 2016, 8:28 AM

@bmakam: Would you mind downloading and testing this patch? Please do full correctness and SPEC200X performance (our head-to-head methodology with train input would be fine).

Tested on Kryo. Only these SPEC200X benchmarks had non-noise performance gains(runtime) with this change:

spec2006/soplex:train[-2.347%, +3.157%]
spec2006/sphinx3:train[+0.469%, +2.124%]
spec2006/xalancbmk:train[+6.566%, +8.743%]

There were no regressions.

Thanks, Balaram. No correctness issues, correct?

Oh I forgot running full correctness. There were no correctness issues in SPEC200X benchmarks. I will run full correctness tests.

@bmakam: Would you mind downloading and testing this patch? Please do full correctness and SPEC200X performance (our head-to-head methodology with train input would be fine).

Tested on Kryo. Only these SPEC200X benchmarks had non-noise performance gains(runtime) with this change:

spec2006/soplex:train[-2.347%, +3.157%]
spec2006/sphinx3:train[+0.469%, +2.124%]
spec2006/xalancbmk:train[+6.566%, +8.743%]

There were no regressions.

Thanks, Balaram. No correctness issues, correct?

Oh I forgot running full correctness. There were no correctness issues in SPEC200X benchmarks. I will run full correctness tests.

Just finished running full correctness tests. There were no correctness issues found in my tests.
Thanks,
Balaram

This revision was automatically updated to reflect the committed changes.