Page MenuHomePhabricator

[MergedLoadStoreMotion] Sink stores to BB with more than 2 predecessors
AcceptedPublic

Authored by dendibakh on Aug 14 2019, 10:55 AM.

Details

Summary

If we have:

bb5:
  br i1 %arg3, label %bb6, label %bb7

bb6:
  %tmp = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp, align 4
  br label %bb9

bb7:
  %tmp8 = getelementptr inbounds i32, i32* %arg1, i64 2
  store i32 3, i32* %tmp8, align 4
  br label %bb9

bb9:	; preds = %bb4, %bb6, %bb7
  ...

We can't sink stores directly into bb9.
This patch creates new BB that is successor of %bb6 and %bb7
and sinks stores into that block.

Diff Detail

Event Timeline

dendibakh created this revision.Aug 14 2019, 10:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 14 2019, 10:55 AM

Please improve patch description, current one is not too useful.

dendibakh marked 2 inline comments as done.Aug 14 2019, 10:59 AM
dendibakh added inline comments.
llvm/lib/Transforms/Scalar/MergedLoadStoreMotion.cpp
364

Maybe we can invalidate only if new BB was created.

394

Maybe we can invalidate only if new BB was created.

Can you use utils/update_test_checks.py on this new IR file, please?

lebedev.ri added inline comments.Aug 14 2019, 11:02 AM
llvm/test/Transforms/InstMerge/st_sink_split_bb.ll
43–45

This test is hard to read, can you use utils/update_test_checks.py and precommit it first?

dendibakh edited the summary of this revision. (Show Details)

Ran update_test_checks.py

dendibakh edited the summary of this revision. (Show Details)Aug 14 2019, 11:17 AM
bjope added inline comments.Aug 15 2019, 2:40 PM
llvm/lib/Transforms/Scalar/MergedLoadStoreMotion.cpp
22

Could probably add something here that a new tail/footer block may be insterted if the tail/footer block has more predecessors (not only the two predecessors that are forming the diamond).

290

coding rules: asserts should have a string

293

coding rules: asserts should have a string

dendibakh updated this revision to Diff 215479.Aug 15 2019, 3:14 PM
dendibakh marked 2 inline comments as not done.

Fixed Bjorn's comments.

dendibakh marked 4 inline comments as done.Aug 15 2019, 3:15 PM
bjope added a comment.Aug 17 2019, 1:57 PM

Code changes basically looks good to me, but I have no idea about:

  • the expected gain
  • if this is handled elsewhere (or if it should be)
  • if there is a (high) cost related to "not preserving the CFG"
llvm/lib/Transforms/Scalar/MergedLoadStoreMotion.cpp
279–280

nit: Afaict it seems like a limitation in the current implementation to only sink when we find a diamond (there is a todo on line 72 about generalizing to other code structures). I guess the transform done by this method could be valid for any pair of joining flows, so it is a little bit weird to base it on a diamond head. The function would be more general if it for example took Pred0 and Pred1 as input (asserting that they both have identical single successors, and that Pred0!=Pred1), and then moving the logic about how we detect the predecessor pair to the caller.

I introduced an option that controls splitting the footer BB which is disabled by default.
So, no functional change at the moment.

@bjope, please review.

bjope added inline comments.Sep 4 2019, 10:23 AM
llvm/lib/Transforms/Scalar/MergedLoadStoreMotion.cpp
355

Are we sure that this iteration is safe now when we may end up adding new blocks while iterating.
I think the iterator still will be valid in the sense that it still points to a BB (or end) and that we can continue iterating. I also assume that it will be deterministic, right?

The inserted blocks may or may not end up being analysed here (afaict) depending on if their predecessors in BB order already have been analysed or not). But since the inserted blocks never will be diamond heads that isn't a problem right now, since this loop only care about diamond heads.

Maybe it is worth adding a comment that mergeStores can insert new blocks, and that it isn't sure if they will be included in the iteration or not?

bjope accepted this revision.Sep 4 2019, 10:34 AM
bjope added a reviewer: lebedev.ri.

LGTM (apart from the comment about maybe saying something about the iteration order).

@lebedev.ri had comments about the test case earlier, so you could wait another day or two before landing this, to see if someone else got a different opinion (I don't know that much about this, but the code seem to do what you are aiming for afaict).

This revision is now accepted and ready to land.Sep 4 2019, 10:34 AM
dendibakh marked an inline comment as done.Sep 4 2019, 4:26 PM
dendibakh added inline comments.
llvm/lib/Transforms/Scalar/MergedLoadStoreMotion.cpp
355

I took the idea from SimplifyCFG which does similar splitting. In SimplifyCFG we have the same implementation, i.e. we iterate over BBs in a function:
https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/SimplifyCFGPass.cpp#L162
simplifyCFG() internally does the splitting.

Anyway, thanks for a good point. I will add a comment.