This is an archive of the discontinued LLVM Phabricator instance.

[AutoFDO] Avoid merging inlinee samples multiple times
ClosedPublic

Authored by hoyFB on Jul 30 2020, 6:23 PM.

Details

Summary

A function call can be replicated by optimizations like loop unroll and jump threading and the replicates end up sharing the sample nested callee profile. Therefore when it comes to merging samples for uninlined callees in the sample profile inliner, a callee profile can be merged multiple times which will cause an assert to fire.

This change avoids merging same callee profile for duplicate callsites by filtering out callee profiles with a non-zero head sample count.

Diff Detail

Unit TestsFailed

Event Timeline

hoyFB created this revision.Jul 30 2020, 6:23 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 30 2020, 6:23 PM
hoyFB requested review of this revision.Jul 30 2020, 6:23 PM
wenlei accepted this revision.Jul 30 2020, 8:15 PM

LGTM. Thanks!

llvm/lib/Transforms/IPO/SampleProfile.cpp
1107–1126

A comment explain the check? e.g. we want to merge it exactly once as earlier pass may have duplicated the call site already, and when duplicating, we don't slicing the original inlinee's profile.

This revision is now accepted and ready to land.Jul 30 2020, 8:15 PM
wmi added a comment.Jul 30 2020, 8:22 PM

The multiple inline instance copies generated from loop unrolling or jump threading should have their own FunctionSamples objects. Why merging multiple inline instance to the outline profile will trigger assertion? Could you have a testcase for it?

For the second problem, we got a bug report for the same problem. I create a testcase and a fix in https://reviews.llvm.org/D84994. I think your fix is more concise while mine is a straightforward fix. If you think your fix is better, free to use the testcase in D84994. Otherwise, I can commit that one.

wmi added a comment.Jul 30 2020, 8:25 PM

Yes, the fix in D84997 is more concise. Here I use a straightforward fix. It is ok for Hongtao to commit D84997, but just add a testcase (free to use the one here).

wmi added a comment.Jul 30 2020, 8:28 PM
In D84997#2186522, @wmi wrote:

Yes, the fix in D84997 is more concise. Here I use a straightforward fix. It is ok for Hongtao to commit D84997, but just add a testcase (free to use the one here).

Sorry. Paste to the wrong thread (meant to paste to D84994)

In D84997#2186518, @wmi wrote:

The multiple inline instance copies generated from loop unrolling or jump threading should have their own FunctionSamples objects. Why merging multiple inline instance to the outline profile will trigger assertion? Could you have a testcase for it?

It's a slightly difference case. In the training build, we didn't unroll, so there's a single inline site in the loop, but in the PGO optimizing build, the loop got unrolled with call site duplicated. And the duplicated call sites all point to the same FunctionSamples, in which case we should merge them only once. Agree that we should add a test case.

For the second problem, we got a bug report for the same problem. I create a testcase and a fix in https://reviews.llvm.org/D84994. I think your fix is more concise while mine is a straightforward fix. If you think your fix is better, free to use the testcase in D84994. Otherwise, I can commit that one.

wmi added a comment.Jul 30 2020, 9:06 PM

The multiple inline instance copies generated from loop unrolling or jump threading should have their own FunctionSamples objects. Why merging multiple inline instance to the outline profile will trigger assertion? Could you have a testcase for it?

It's a slightly difference case. In the training build, we didn't unroll, so there's a single inline site in the loop, but in the PGO optimizing build, the loop got unrolled with call site duplicated. And the duplicated call sites all point to the same FunctionSamples, in which case we should merge them only once. Agree that we should add a test case.

Thanks. So it is a combined version of normal PGO and sampleFDO, or the PGO here means the instrument hook for sampleFDO?

For the second problem, we got a bug report for the same problem. I create a testcase and a fix in https://reviews.llvm.org/D84994. I think your fix is more concise while mine is a straightforward fix. If you think your fix is better, free to use the testcase in D84994. Otherwise, I can commit that one.

I will commit Hongtao's version in D84994 so the two fixes can be separated.

In D84997#2186553, @wmi wrote:

The multiple inline instance copies generated from loop unrolling or jump threading should have their own FunctionSamples objects. Why merging multiple inline instance to the outline profile will trigger assertion? Could you have a testcase for it?

It's a slightly difference case. In the training build, we didn't unroll, so there's a single inline site in the loop, but in the PGO optimizing build, the loop got unrolled with call site duplicated. And the duplicated call sites all point to the same FunctionSamples, in which case we should merge them only once. Agree that we should add a test case.

Thanks. So it is a combined version of normal PGO and sampleFDO, or the PGO here means the instrument hook for sampleFDO?

Sorry, by PGO, I actually meant AutoFDO (sample PGO). We don't use instrumentation PGO internally so far. The new sample PGO with pseudo instrumentation we're developing internally didn't hit this problem - we used a trie tree to manipulate context profile merging there.

For the second problem, we got a bug report for the same problem. I create a testcase and a fix in https://reviews.llvm.org/D84994. I think your fix is more concise while mine is a straightforward fix. If you think your fix is better, free to use the testcase in D84994. Otherwise, I can commit that one.

I will commit Hongtao's version in D84994 so the two fixes can be separated.

Sounds good, thanks!

wmi added a comment.Jul 30 2020, 10:01 PM

I see. Thanks for the explanation.

hoyFB marked an inline comment as done.Jul 30 2020, 11:14 PM
In D84997#2186553, @wmi wrote:

The multiple inline instance copies generated from loop unrolling or jump threading should have their own FunctionSamples objects. Why merging multiple inline instance to the outline profile will trigger assertion? Could you have a testcase for it?

It's a slightly difference case. In the training build, we didn't unroll, so there's a single inline site in the loop, but in the PGO optimizing build, the loop got unrolled with call site duplicated. And the duplicated call sites all point to the same FunctionSamples, in which case we should merge them only once. Agree that we should add a test case.

Thanks. So it is a combined version of normal PGO and sampleFDO, or the PGO here means the instrument hook for sampleFDO?

For the second problem, we got a bug report for the same problem. I create a testcase and a fix in https://reviews.llvm.org/D84994. I think your fix is more concise while mine is a straightforward fix. If you think your fix is better, free to use the testcase in D84994. Otherwise, I can commit that one.

I will commit Hongtao's version in D84994 so the two fixes can be separated.

Thanks for committing the change for me!

hoyFB updated this revision to Diff 282138.Jul 30 2020, 11:15 PM

Updating D84997: [AutoFDO] Avoid merging inlinee samples multiple times

In D84997#2186518, @wmi wrote:

The multiple inline instance copies generated from loop unrolling or jump threading should have their own FunctionSamples objects. Why merging multiple inline instance to the outline profile will trigger assertion? Could you have a testcase for it?

It's a slightly difference case. In the training build, we didn't unroll, so there's a single inline site in the loop, but in the PGO optimizing build, the loop got unrolled with call site duplicated. And the duplicated call sites all point to the same FunctionSamples, in which case we should merge them only once. Agree that we should add a test case.

Test case added.

wmi accepted this revision.Jul 31 2020, 8:46 AM

LGTM. Thanks.

hoyFB edited the summary of this revision. (Show Details)Jul 31 2020, 9:06 AM
This revision was landed with ongoing or failed builds.Jul 31 2020, 9:30 AM
This revision was automatically updated to reflect the committed changes.