This is an archive of the discontinued LLVM Phabricator instance.

[CodeGenPrepare] Fix for using outdated/corrupt LoopInfo
ClosedPublic

Authored by chill on May 11 2023, 10:31 AM.

Details

Summary

Some transformation in CodeGenPrepare pass may create and/or delete
basic block, but they don't update the LoopInfo, so the LoopInfo may
end up containing dangling pointers and sometimes reused basic blocks,
which leads to "interesting" non-deterministic behaviour.

These transformations do not seem to alter the loop structure of the
function, and updating the loop info is quite straighforward.

Diff Detail

Event Timeline

chill created this revision.May 11 2023, 10:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2023, 10:31 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
chill requested review of this revision.May 11 2023, 10:31 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2023, 10:31 AM

We should probably have code somewhere that calls verify() on the LoopInfo, so we're confident these updates are correct, and to catch future issues. (If we do it once at the end of CodeGenPrepare, it should be cheap enough to just turn it on with +Asserts? Or we could stick it under EXPENSIVE_CHECKS, I guess.)

llvm/lib/CodeGen/CodeGenPrepare.cpp
2182

Wrong indentatation.

chill planned changes to this revision.May 23 2023, 5:41 AM

There are multiple issues with this patch, I may go back to rebuilding the LoopInfo, wel'll see ...

chill updated this revision to Diff 527414.Jun 1 2023, 7:29 AM
chill marked an inline comment as done.
efriedma added inline comments.Jun 1 2023, 10:51 AM
llvm/lib/CodeGen/CodeGenPrepare.cpp
590

Doing this for every basic block is going to get way too expensive, even for something labeled "expensive".

chill updated this revision to Diff 527785.Jun 2 2023, 2:38 AM
chill marked an inline comment as done.
chill updated this revision to Diff 530937.Jun 13 2023, 9:19 AM

Rebase/refresh, no changes.

efriedma added inline comments.Jun 13 2023, 10:36 AM
llvm/lib/CodeGen/CodeGenPrepare.cpp
650

Given the way the code is structured, isn't this always running just after you've just called releaseMemory()/analyze() on LI?

chill added inline comments.Jun 14 2023, 2:05 AM
llvm/lib/CodeGen/CodeGenPrepare.cpp
650

D'oh, indeed, after we exit the loop because of MadeChange being false it means all the code after LI->analyze didn't do anything, including changing the loop structure.

chill updated this revision to Diff 531714.Jun 15 2023, 6:08 AM

Fix some more issues with maintaining DT/LI up to date:

If the DT was modified in a normal (not huge) function, the over-the-blocks loop
will exit with an outdated DT, which is then used by mergeSExts.
Thus the reset of the DT was moved immediately after returning from optimizeBlock.
This ensures that mergeSExts will work with an up to date DT and allows
to avoid unnecessary rebuilds of the DT by moving the initial reset outside the outer loop.
The latter necessitates updating the LI and DT in splitLargeGEPOffsets and
eliminateFallThrough (only if DT is already present).

chill marked an inline comment as done.Jun 15 2023, 6:17 AM
efriedma accepted this revision.Jun 15 2023, 11:57 AM

LGTM

llvm/lib/CodeGen/CodeGenPrepare.cpp
632

Weird indentation

6946

I think we have a SplitBasicBlock utility that handles DT/LI automatically, but I don't want to make any complex refactorings block this.

This revision is now accepted and ready to land.Jun 15 2023, 11:57 AM
This revision was landed with ongoing or failed builds.Jun 19 2023, 1:43 AM
This revision was automatically updated to reflect the committed changes.
chill marked an inline comment as done.Jun 19 2023, 1:43 AM
nikic added a subscriber: nikic.Jun 19 2023, 5:06 AM

Doesn't this change essentially break the current model where DT is supposed to only be calculated lazily, if it is needed? Now we'll end up always calculating it on each iteration.

I think if you do this, you also need to start keeping DT up to date everywhere, instead of using the current crude invalidation approach.

chill added a comment.Jun 19 2023, 7:37 AM

Doesn't this change essentially break the current model where DT is supposed to only be calculated lazily, if it is needed? Now we'll end up always calculating it on each iteration.

I think if you do this, you also need to start keeping DT up to date everywhere, instead of using the current crude invalidation approach.

Doesn't this change essentially break the current model where DT is supposed to only be calculated lazily, if it is needed? Now we'll end up always calculating it on each iteration.

I think if you do this, you also need to start keeping DT up to date everywhere, instead of using the current crude invalidation approach.

Now that we don't have (?) correctness/corruption issues, I'll see what I can do about updating the DT. (From a quick glance it does not seem like a terribly huge amount of work).