This is an archive of the discontinued LLVM Phabricator instance.

PGOInstrumentation, GCOVProfiling: Split indirectbr critical edges regardless of PHIs
ClosedPublic

Authored by MatzeB on Feb 17 2022, 3:51 PM.

Details

Summary

The SplitIndirectBrCriticalEdges function was originally designed for CodeGenPrepare and skipped splitting of edges when the destination block didn't contain any PHI instructions. This only makes sense when reducing COPYs like CodeGenPrepare. In the case of PGOInstrumentation or GCOVProfiling it would result in missed counters and wrong result in functions with computed goto.

Diff Detail

Event Timeline

MatzeB created this revision.Feb 17 2022, 3:51 PM
MatzeB requested review of this revision.Feb 17 2022, 3:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 17 2022, 3:51 PM
MatzeB edited the summary of this revision. (Show Details)
modimo added inline comments.Feb 17 2022, 5:56 PM
llvm/include/llvm/Transforms/Utils/BasicBlockUtils.h
503–505

Please document this flag in the comment above

llvm/test/Transforms/GCOVProfiling/split-indirectbr-critical-edges.ll
37

nit:s/impossibel/impossible/g

nit2: If we're willing to change the labels the indirect gotos jump to we can technically split an arbitrary # of edges. Not that we should, of course..

llvm/test/Transforms/PGOProfile/irreducible.ll
142

I'm curious how the counters shifted around now that indirectgoto->sw.bb gets split. How does the changed counter values in the profiles above map to that?

llvm/unittests/Transforms/Utils/BasicBlockUtilsTest.cpp
475

s/bb3/bb2/g has no PHI

MatzeB updated this revision to Diff 410884.Feb 23 2022, 11:29 AM
MatzeB marked 3 inline comments as done.

rebase, address review comments

llvm/test/Transforms/PGOProfile/irreducible.ll
142

This is hard to figure out given the size of the test and the fact that the profile files are checked-in as-is. I hope this is fine...

MatzeB updated this revision to Diff 410885.Feb 23 2022, 11:31 AM
MatzeB added inline comments.
llvm/test/Transforms/GCOVProfiling/split-indirectbr-critical-edges.ll
37

Not completely sure what you mean in "nit2", but I changed the wording to make it clear that this is a trick to keep the edge from getting split so the test keeps working (I know more normalization etc. could "fix" this, but we're only running a single pass here)

MatzeB updated this revision to Diff 410887.Feb 23 2022, 11:33 AM
MatzeB marked an inline comment as done.
modimo accepted this revision.Feb 23 2022, 1:10 PM

LGTM

llvm/test/Transforms/GCOVProfiling/split-indirectbr-critical-edges.ll
37

You got it, "nit2" was that handling this case isn't technically impossible. Thanks for the wording change.

llvm/test/Transforms/PGOProfile/irreducible.ll
142

Took a look because I was curious how to get this information. Adding -debug-only=pgo-instrumentation dumps the edges and the instrumented edges line up with the counts in the profile. Comparing the two (irreducible.proftext) left is original right is with this diff:

The count of 100 gets inferred now and the new counts are for the split edges as expected. The change that causes the INDIRECTGOTO_IRR_LOOP to go from 400->399 is that the probes shifted so that edge 21 and edge 22 on RHS flipped compared to LHS. That's not a big deal though.

This revision is now accepted and ready to land.Feb 23 2022, 1:10 PM
xur added a comment.Feb 23 2022, 2:52 PM

LGTM.
Thanks for this fix. I kind of know this (I used to have an assert to after calling SplitIndirectBrCriticalEdges(), but got rid of it because some cases it does return false).
Even this fix, there are still chance that we cannot split the edge.
Maybe we should emit a warning in this case.

Even this fix, there are still chance that we cannot split the edge.
Maybe we should emit a warning in this case.

Yes I'm aware that it's not always possible to split critical edges; I think it should be possible in those case to just instrument the source and destination of the critical edge with a counter to get valid data?

Admittedly I did not try to fix that either as this fix here seems to be all we need for our software... I think the remaining cases should only effect exception handling code which is not perf critical anyway? For cases with two indirectbr predecessors I always saw LLVM normalizing control flow to a single one in earlier passes, so that was never a problem for instrumentation in my experiments.

MatzeB added inline comments.Feb 23 2022, 4:28 PM
llvm/test/Transforms/PGOProfile/irreducible.ll
142

Thanks!

wenlei accepted this revision.Feb 23 2022, 5:38 PM

good catch, lgtm!