This is an archive of the discontinued LLVM Phabricator instance.

[SCCP] Reduce the number of times ResolvedUndefsIn is called for large modules.
ClosedPublic

Authored by efriedma on Oct 8 2020, 4:39 PM.

Details

Summary

If a module has many values that need to be resolved by ResolvedUndefsIn, compilation takes quadratic time overall. Solve should do a small amount of work, since not much is added to the worklists each time markOverdefined is called. But ResolvedUndefsIn is linear over the length of the function/module, so resolving one undef at a time is quadratic in general.

To solve this, make ResolvedUndefsIn resolve every undef value at once, instead of resolving them one at a time. This loses a little optimization power, but can be a lot faster.

We still need a loop around ResolvedUndefsIn because markOverdefined could change the set of blocks that are live. That should be uncommon, hopefully. We could optimize it by tracking which blocks transition from dead to live, instead of iterating over the whole module to find them. But I'll leave that for later. (The whole function will become a lot simpler once we start pruning branches on undef.)

The regression test changes seem minor. The specific cases in question could probably be optimized with a bit more work, but they seem like edge cases that don't really matter.

Fixes an "infinite" compile issue my team found on an internal workoad.

Diff Detail

Event Timeline

efriedma created this revision.Oct 8 2020, 4:39 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 8 2020, 4:39 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
efriedma requested review of this revision.Oct 8 2020, 4:39 PM
ychen added a subscriber: ychen.Oct 8 2020, 6:13 PM
fhahn added a comment.Oct 9 2020, 12:26 PM

This looks reasonable to me and IIUC this mostly restores the original behaviour. LGTM.

Out of curiosity, do you think keeping track of which functions may contain undef states may help with compile-time in your case? E.g. like in D74535

fhahn accepted this revision.Oct 9 2020, 12:26 PM
This revision is now accepted and ready to land.Oct 9 2020, 12:26 PM

I would assume that the first pass of ResolveUndefsIn isn't that expensive relative to the other work SCCP does, so it would be more important to ensure we don't need additional passes. And probably we want to work towards removing the branch handling before we invest more effort into optimizing ResolveUndefsIn.

fhahn added a comment.Oct 9 2020, 1:45 PM

I would assume that the first pass of ResolveUndefsIn isn't that expensive relative to the other work SCCP does, so it would be more important to ensure we don't need additional passes. And probably we want to work towards removing the branch handling before we invest more effort into optimizing ResolveUndefsIn.

SGTM