This is an archive of the discontinued LLVM Phabricator instance.

[SimplifyCFG][TranformUtils]Do not simplify away a trivial basic block if both this block and at least one of its predecessors are loop latches.
ClosedPublic

Authored by mingmingl on Sep 18 2022, 6:14 PM.

Details

Summary

Before this patch, loop metadata (if exists) will override the metadata of each predecessor; if the predecessor block already has loop metadata, the orignal loop metadata won't be preserved abd cause missed loop transformations (see 'test2' in llvm/test/Transforms/SimplifyCFG/preserve-llvm-loop-metadata.ll).

To illustrate how inner-loop metadata might be dropped:

CFG Before

  
# BB is while.cond.exit, attached with loop metdata md2.
# BB->Pred is for.body, attached with loop metadata md1.
        entry
          |
          v
   ---> while.cond   ------------->  while.end
   |       |
   |       v
   |   while.body
   |       |
   |       v
   |    for.body <---- (md1)
   |       |  |______|
   |       v
   |    while.cond.exit (md2)
   |       |
   |_______|

CFG After

 
# while.cond1 is the merge of while.cond.exit and while.cond above.
# for.body is attached with md2, and md1 is dropped.
# If LoopSimplify runs later (as a part of loop pass), it could create
# dedicated exits for inner-loop (essentially adding `while.cond.exit` basck),
# but won't it won't see 'md1' nor restore it for the inner-loop.

        entry
          |
          v
  ---> while.cond1  ------------->  while.end
  |       |
  |       v
  |   while.body
  |       |
  |       v
  |    for.body <---- (md2)
  |_______|  |______|

Diff Detail

Event Timeline

mingmingl created this revision.Sep 18 2022, 6:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 18 2022, 6:14 PM
Herald added a subscriber: hiraditya. · View Herald Transcript
mingmingl requested review of this revision.Sep 18 2022, 6:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 18 2022, 6:14 PM

It would be clearer to describe the problem with example CFGs before and after (probably using ascii art) and explain how the meta data is lost during this transformation.

mingmingl planned changes to this revision.Sep 19 2022, 11:08 PM
mingmingl updated this revision to Diff 462208.Sep 22 2022, 9:40 AM
mingmingl edited the summary of this revision. (Show Details)

address comments.

It would be clearer to describe the problem with example CFGs before and after (probably using ascii art) and explain how the meta data is lost during this transformation.

Done.

mingmingl updated this revision to Diff 462213.Sep 22 2022, 9:45 AM
mingmingl added a reviewer: efriedma.

Add efriedma@ who has context on two related patch (https://reviews.llvm.org/D42691 and https://reviews.llvm.org/D35411)

p.s. 'test1' CHECK statements in 'preserve-llvm-loop-metadata.ll' is inadvertently removed. add them back.

Thanks for the nice graph. It looks like after the simplification, for.cond is used as latch for both the inner and outer loop, but only one MD data is allowed. In this case, I think we should disable the block merge.

The current representation of loop metadata (using a loop latch terminator attachment) is known to be fundamentally broken. Loop latches are not uniquely associated with loops (both in that a latch can be part of multiple loops and a loop may have multiple latches). Loop headers are. The solution to this problem is also known: Add support for basic block metadata, and attach loop metadata to the loop header.

I don't think adding these kinds of hacks is going to scale much further -- someone needs to actually address the root problem.

Re nikic's comment -- I agree with the statement in general, but I think the infrastructure fix is orthogonal to the bug fix (with the status quo). For what you suggested, we should probably file a bug to track it.

davidxl added inline comments.Sep 22 2022, 9:24 PM
llvm/include/llvm/Transforms/Utils/Local.h
160 ↗(On Diff #462213)

This fixme seems out of context. Is it needed?

llvm/lib/Transforms/Utils/Local.cpp
1147

There is no need to document option 1 and option 2 here.

mingmingl updated this revision to Diff 462529.Sep 23 2022, 9:38 AM
mingmingl marked 2 inline comments as done.
mingmingl edited the summary of this revision. (Show Details)

address comments.

mingmingl added inline comments.Sep 23 2022, 9:38 AM
llvm/include/llvm/Transforms/Utils/Local.h
160 ↗(On Diff #462213)

It's verbose for a fix, but related with test coverage

  1. When called by 'JumpThreading', loop latches are not attempted by TryToSimplifyUncondBranchFromEmptyBlock (code and comment)
  2. When called by 'SimplifyCFG', loop latches with more than two predecessors are not attempted by TryToSimplifyUncondBranchFromEmptyBlock (code and patch)

As a result of 1) and 2), test cases are possible for SimplifyCFG but not straightforward (if makes sense at all) for JumpThreading -> what that means is shown in the relevant test case (test2), empty block BB has one predecessor, but it shouldn't be optimized away (for inner-loop metadata).

Removed this comment.

Jumpthreading 'avoids' the problem with an earlier guard. Since the fix is in a common path, it is ok to just test simplifyCFG.

davidxl accepted this revision.Sep 27 2022, 7:54 PM

Perhaps add a TODO in the comment part for the longer term solution suggested by nikic. Otherwise LGTM

This revision is now accepted and ready to land.Sep 27 2022, 7:54 PM

It's fine to work around this in the meantime (it is a correctness problem when it comes to mustprogress at least) -- but shouldn't this be done by dropping the loop metadata? I believe that's the general rule for metadata: If a transform cannot safely preserve metadata, this is always resolved in favor of dropping the metadata, not preventing the transform.

address comments.

Perhaps add a TODO in the comment part for the longer term solution suggested by nikic. Otherwise LGTM

Thanks. Added a FIXME and quoted nikic's comment there.

The current representation of loop metadata (using a loop latch terminator attachment) is known to be fundamentally broken. Loop latches are not uniquely associated with loops (both in that a latch can be part of multiple loops and a loop may have multiple latches). Loop headers are. The solution to this problem is also known: Add support for basic block metadata, and attach loop metadata to the loop header.

I don't think adding these kinds of hacks is going to scale much further -- someone needs to actually address the root problem.

Thanks for the input! It makes sense to me that current representation won't scale well (from the history around this line, to first count for loop headers, then loop latches, then about the number of predecessors).

I didn't reply though, after davidxl@ beat me to it (that fixing loop metadata representation is a long shot to preserve the inner-loop metadata).

For efficiency considerations,

  1. Regarding the scope, this patch only makes a difference when 'BB' has only one predecessor, in the sense that ToT won't simplify it away when 'BB' has more than two predecessors (as a loop latch)
  2. branch-folder in the codegen pipeline is likely forward branches to unconditional branches to destination directly if this trivial block presents by then.
  3. I did a loadtest on one representative internal benchmark and didn't see regressions.

It's fine to work around this in the meantime (it is a correctness problem when it comes to mustprogress at least) -- but shouldn't this be done by dropping the loop metadata? I believe that's the general rule for metadata: If a transform cannot safely preserve metadata, this is always resolved in favor of dropping the metadata, not preventing the transform.

(Oh we sent out the comment around the same time and I quoted the previous comment :))

that's the general rule for metadata: If a transform cannot safely preserve metadata, this is always resolved in favor of dropping the metadata, not preventing the transform.

Got it. This would make sense to me.

I once thought hard about how to 'concat' loop metadata correctly when drafting something, but that effort ended up nowhere.

TBH I haven't thought through what it would take (maybe clang and llvm IR first, while keeping IR compatibility for frontends of other languages) to implement header-based loop metadata. However, I concur that

  1. in C++ code a loop could easily have many latches, compared with headers
  2. Even when inner loop and outer loop have the same IR header [1], the inner-loop and outer-loop have headers (in c++ form) respectively.

[1]

This example from https://llvm.org/docs/LoopTerminology.html#important-notes

for (int i = 0; i < 128; ++i)
  for (int j = 0; j < 128; ++j)
    body(i,j);

It's fine to work around this in the meantime (it is a correctness problem when it comes to mustprogress at least) -- but shouldn't this be done by dropping the loop metadata? I believe that's the general rule for metadata: If a transform cannot safely preserve metadata, this is always resolved in favor of dropping the metadata, not preventing the transform.

Dropping the metadata in this case changes the program semantics (unlike dropping profile data), thus the transformation in this case is unsafe so it should be disabled as this patch does.

fhahn added a comment.Sep 28 2022, 9:00 AM

It's fine to work around this in the meantime (it is a correctness problem when it comes to mustprogress at least) -- but shouldn't this be done by dropping the loop metadata? I believe that's the general rule for metadata: If a transform cannot safely preserve metadata, this is always resolved in favor of dropping the metadata, not preventing the transform.

Dropping the metadata in this case changes the program semantics (unlike dropping profile data), thus the transformation in this case is unsafe so it should be disabled as this patch does.

I think I am missing what the issue with dropping the loop metadata would be, could you elaborate? If we drop the metadata we should only lose information that may have enabled additional transformations, but shouldn't change the result of the program.

It's fine to work around this in the meantime (it is a correctness problem when it comes to mustprogress at least) -- but shouldn't this be done by dropping the loop metadata? I believe that's the general rule for metadata: If a transform cannot safely preserve metadata, this is always resolved in favor of dropping the metadata, not preventing the transform.

Dropping the metadata in this case changes the program semantics (unlike dropping profile data), thus the transformation in this case is unsafe so it should be disabled as this patch does.

I think I am missing what the issue with dropping the loop metadata would be, could you elaborate? If we drop the metadata we should only lose information that may have enabled additional transformations, but shouldn't change the result of the program.

True - I should have been clearer. The behavior of the program does not change. What changes is the 'expected' shape of the affected loop. This is similar in spirit for 'always_inline' /'noinline' etc. We should have strong reason to 'not' preserve user directives.

fix a typo (basck->back) around line 1145. No other change (history tab could show diff).