This is an archive of the discontinued LLVM Phabricator instance.

[LICM] Hoist stores of invariant values to invariant addresses out of loops
ClosedPublic

Authored by reames on Aug 17 2018, 2:12 PM.

Details

Summary

(now ready for review)

Teach LICM to hoist stores out of loops when the store writes to a location otherwise unused in the loop, writes a value which is invariant, and is guaranteed to execute if the loop is entered.

Worth noting is that this transformation is partially overlapping with the existing promotion transformation. Reasons this is worthwhile anyway include:

  1. For multi-exit loops, this doesn't require duplication of the store.
  2. It kicks in for case where we can't prove we exit through a normal exit (i.e. we may throw), but can prove the store executes before that possible side exit.

Diff Detail

Event Timeline

reames created this revision.Aug 17 2018, 2:12 PM
reames updated this revision to Diff 161825.Aug 21 2018, 2:40 PM

Rebase on top of tests, with test updates.

Still not quite ready for review. I want to remove the writeonly call stuff into it's own patch.

reames updated this revision to Diff 161826.Aug 21 2018, 2:43 PM

Strip out the call hoisting bits until a future patch.

reames retitled this revision from [WIP] [LICM] Hoist stores of invariant values to invariant addresses out of loops to [LICM] Hoist stores of invariant values to invariant addresses out of loops.Aug 21 2018, 2:44 PM
reames added reviewers: mkazantsev, anna, skatkov, hfinkel.
reames set the repository for this revision to rL LLVM.
reames edited the summary of this revision. (Show Details)Aug 21 2018, 2:44 PM

I'm generally fine with the patch once all nits are addressed. It needs more extended testing for conditionally executed stores (we should show that they don't get hoisted).

At the first glance, AliasSetTracker looks independent, but I may be missing something. If possible, please split it into a separate patch and add some dedicated testing on it (my guess is that a unit test can be written on this change).

include/llvm/Analysis/AliasSetTracker.h
229–230

This function is growing bigger, maybe it makes sense to move it to cpp file. Just a suggestion.

233

Isn't it the same as !empty()?
Do you mind adding a clarifying comment on what is being proved here?

236

Unneeded ;

lib/Transforms/Scalar/LICM.cpp
670

Can it go separately?

685

No need for that.

722

We can hoist stores to locations that are read within the loop if we can prove that no read may happen before the store. It doesn't matter if this location is then stored again or not in this case.

Please add it as a TODO item, may be interesting in the future.

test/Transforms/LICM/store-hoisting.ll
47 ↗(On Diff #161826)

I don't see any tests exercising conditionally executed stores in this file. Please add some (with control flow and with potentially throwing calls before the store).

reames planned changes to this revision.Aug 21 2018, 8:20 PM
reames marked 2 inline comments as done.
reames added inline comments.
include/llvm/Analysis/AliasSetTracker.h
229–230

Fair, mind if I do that post commit?

lib/Transforms/Scalar/LICM.cpp
670

As in, land the TODO by itself? Sure, if you prefer.

722

The particular case mentioned could be done, but the motivating case you described offline w/a readonly call in the loop is much harder than it first sounds. The basic problem is the AST doesn't actually keep track of the *instructions* it only tracks memory locations. So, for a large read set, we don't have an efficient way to determine if there's a single mod.

I think the transform you're looking for is probably better done w/MemorySSA once we get around to that (larger) rewrite.

I'd prefer not to put a todo here if you don't mind.

test/Transforms/LICM/store-hoisting.ll
47 ↗(On Diff #161826)

Will do before next round of review.

reames updated this revision to Diff 161883.Aug 21 2018, 8:30 PM

Address first couple comments, another update to come before re-review justified.

Max, I can't test the unique instruction bit through AST printer at the moment. I could add printer support for the unique instruction helper, but I'm starting to seriously wonder if this belongs in AST at all. If you don't mind, I'd prefer to exercise through LICM at the moment so splitting/moving the API is easy.

reames planned changes to this revision.Aug 21 2018, 8:30 PM
reames updated this revision to Diff 161979.Aug 22 2018, 9:21 AM

Adding requested tests, ready for review.

anna added a comment.Aug 23 2018, 8:54 AM

I'll take a closer look at the patch, but at first glance it looks like some cases of invariant store that's preventing the vectorizer (because LICM wasn't hoisting/sinking the store) may be handled by running LICM before vectorization: testcases in D50665 maybe worth trying here - note that those test cases are running LICM before vectorization.

There are 3 kinds of tests worth adding:

  1. predicated invariant stores, i.e. the block containing the store itself is predicated and not guaranteed to execute (cannot be handled by LICM)
  2. invariant store value is a phi containing invariant incoming values and the phi result depends on an invariant condition (can be handled by LICM. This patch handles?)
  3. invariant store value is a phi containing invariant incoming values and the phi result depends on a variant condition (cannot be handled by LICM safely)
anna added inline comments.Aug 23 2018, 9:26 AM
lib/Transforms/Scalar/LICM.cpp
724

I might be missing something. Here we're only checking for the address we're storing into is invariant. Where's the check for the stored value is invariant? That decides if we should hoist or sink the store right?

Also, could you pls add test cases where the stored value is variant while the store address is invariant.

reames marked an inline comment as done.Aug 23 2018, 12:19 PM

There are 3 kinds of tests worth adding:

  1. predicated invariant stores, i.e. the block containing the store itself is predicated and not guaranteed to execute (cannot be handled by LICM)

Covered by existing early exit test.

  1. invariant store value is a phi containing invariant incoming values and the phi result depends on an invariant condition (can be handled by LICM. This patch handles?)

Unclear what you mean here.

  1. invariant store value is a phi containing invariant incoming values and the phi result depends on a variant condition (cannot be handled by LICM safely)

Again, I don't know what you mean by this. If the value is a phi in the loop, it's by definition not invariant.

lib/Transforms/Scalar/LICM.cpp
724

This is handled in the caller. See conditional logic before call to hoist. We check that *all* operands are loop invariant.

As for the test, see the existing neg_lv_value.

anna added a comment.Aug 23 2018, 12:45 PM

There are 3 kinds of tests worth adding:

  1. predicated invariant stores, i.e. the block containing the store itself is predicated and not guaranteed to execute (cannot be handled by LICM)

Covered by existing early exit test.

  1. invariant store value is a phi containing invariant incoming values and the phi result depends on an invariant condition (can be handled by LICM. This patch handles?)

Unclear what you mean here.

Added example:

define void @inv_val_store_to_inv_address_conditional_inv(i32* %a, i64 %n, i32* %b, i32 %k) {
entry:
  %ntrunc = trunc i64 %n to i32
  %cmp = icmp eq i32 %ntrunc, %k
  br label %for.body

for.body:                                         ; preds = %for.body, %entry
  %i = phi i64 [ %i.next, %latch ], [ 0, %entry ]
  %tmp1 = getelementptr inbounds i32, i32* %b, i64 %i
  %tmp2 = load i32, i32* %tmp1, align 8
  store i32 %ntrunc, i32* %tmp1
  br i1 %cmp, label %cond_store, label %cond_store_k

cond_store:
  br label %latch

cond_store_k:
  br label %latch

latch:
  %storeval = phi i32 [ %ntrunc, %cond_store ], [ %k, %cond_store_k ]
  store i32 %storeval, i32* %a <-- this store can be hoisted by LICM (the stored value is invariant in the loop). 
  %i.next = add nuw nsw i64 %i, 1
  %cond = icmp slt i64 %i.next, %n
  br i1 %cond, label %for.body, label %for.end

for.end:                                          ; preds = %for.body
  ret void
  1. invariant store value is a phi containing invariant incoming values and the phi result depends on a variant condition (cannot be handled by LICM safely)

Again, I don't know what you mean by this. If the value is a phi in the loop, it's by definition not invariant.

yes, that's why I stated it "cannot be handled by LICM safely", in the sense it is not correct :)

reames marked an inline comment as done.Aug 23 2018, 1:02 PM

Added example:

(snipped example for length)

This is a really interesting example, but it's out of scope for this patch. The critical bit in your example is not the store, it's the phi. You're pointing out that when we have a diamond which exists only to select between two loop invariant values, we can do phi-to-select transform and then hoist the resulting select out of the loop. I have some questions on your example, let's discuss offline. This is definitely off topic for this review, so let's move further discussion of this one to a bug or another review thread if we get that far.

Looks fine for me, except for I think that we should not give up improvement opportunities just because AST is not strong enough to do it. We may use other data structures for that.
I'm OK with concerns I had. If Anna is OK with her concerns, I think we can have it merged.

lib/Transforms/Scalar/LICM.cpp
722

It only means that we might want to give up using AST here. The mention that it can be generalized still makes sense.

This revision was not accepted when it landed; it landed in state Needs Review.Aug 29 2018, 2:50 PM
This revision was automatically updated to reflect the committed changes.
reames added inline comments.Aug 29 2018, 3:11 PM
lib/Transforms/Scalar/LICM.cpp
722

Add the todo and corresponding tests:
Sending lib/Transforms/Scalar/LICM.cpp
Sending test/Transforms/LICM/store-hoisting.ll
Transmitting file data ..done
Committing transaction...
Committed revision 340978.

When writing the tests, I realized we can handle a lot of common cases, just not the alias set collapse one. Most of tests can be implemented easily.