This is an archive of the discontinued LLVM Phabricator instance.

[CallSiteSplitting] Only record conditions up to the IDom(call site).
ClosedPublic

Authored by fhahn on Mar 19 2018, 7:53 AM.

Details

Summary

We can stop recording conditions once we reached the immediate dominator
for the block containing the call site. Conditions in predecessors of the
that node will be the same for all paths to the call site and splitting
is not beneficial.

This patch makes CallSiteSplitting dependent on the DT anlysis. because
the immediate dominators seem to be the easiest way of finding the node
to stop at.

I had to update some exiting tests, because they were checking for
conditions that were true/false on all paths to the call site.

Diff Detail

Event Timeline

fhahn updated this revision to Diff 138946.Mar 19 2018, 8:46 AM

Update test/Other/new-pm-lto-defaults.ll to account for running the dominator tree analysis earlier.

junbuml added inline comments.Mar 28 2018, 9:40 AM
test/Transforms/CallSiteSplitting/callsite-split-or-phi.ll
46

We may want to keep this test as it is. With this patch, the call sites split will take %a, instead of null. null value might be propagated to the callsite later in the pipeline, but there is no reason not to pass the known null value early in the pipeline. If the callee has early exit condition with null check. Inliner may have better opportunity to get this callee inlined by passing null.

fhahn added inline comments.Mar 28 2018, 9:42 AM
test/Transforms/CallSiteSplitting/callsite-split-or-phi.ll
46

Ah yes, if we split anyways, we could also propagate other known constraints.

junbuml added inline comments.Mar 28 2018, 9:48 AM
test/Transforms/CallSiteSplitting/callsite-split-or-phi.ll
46

This may show that inliner fail to inline the callee with this change.

define i32 @test_eq_eq_eq(i32* %a, i32 %v, i32 %p) {
Header:
  %tobool1 = icmp eq i32* %a, null
  br i1 %tobool1, label %Header2, label %End

Header2:
  %tobool2 = icmp eq i32 %p, 10
  call void @dummy()
  br i1 %tobool2, label %Tail, label %TBB

TBB:
  %cmp = icmp eq i32 %v, 1
  call void @dummy()
  br i1 %cmp, label %Tail, label %End

Tail:
  %r = call i32 @callee(i32* %a, i32 %v, i32 %p)
  ret i32 %r

End:
  ret i32 %v
}


define i32 @callee(i32* %a, i32 %v, i32 %p) {
entry:
  %c = icmp eq i32* %a, null
  br i1 %c, label %BB1, label %BB2

BB1:
  ret i32 0

BB2:
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  call void @dummy2(i32 %v, i32 %p)
  br label %End

End:
  ret i32 %p
}

declare void @dummy()
declare void @dummy2(i32, i32)
fhahn updated this revision to Diff 140212.Mar 29 2018, 5:51 AM

I moved code around a bit and updated it to explicitly gather common conditions starting from IDom(CS). While I do think that we ideally handle that somewhere else before inlining, it is quite easy to do it here for now.

Thanks for doing this. Now, you intended it to be a NFC, right? If then, it will be good to keep the original test cases as it is. Adding more tests is definitely good.

lib/Transforms/Scalar/CallSiteSplitting.cpp
468–479

I think we should move this to line 461. Otherwise, we wont be able to handle the case where a known value is not detected in immediate predecessors, but there is a known value after IDom.

Looks like test_cond_no_effect() shows the case.

test/Transforms/CallSiteSplitting/callsite-split-or-phi.ll
133

We should keep this test to check that nonnull is passed to both call-sites.

133

We can add this test, but I don't think there is any change in this test with/without this patch.

185

We should keep this test to check that nonnull is passed to both call-sites.

junbuml added inline comments.Mar 29 2018, 8:12 AM
lib/Transforms/Scalar/CallSiteSplitting.cpp
468–479

In this case (test_cond_no_effect), we should just pass the known value without splitting. Looks like I tried thing similar before in https://reviews.llvm.org/D41782.

fhahn added inline comments.Mar 29 2018, 8:18 AM
lib/Transforms/Scalar/CallSiteSplitting.cpp
468–479

As this patch is at the moment, it is not a NFC, as it prevents splitting in case there are no suitable conditions along any path between the IDom and the call site.

In this case (test_cond_no_effect), we should just pass the known value without splitting. Looks like I tried thing similar before in https://reviews.llvm.org/D41782.

I just thought the same thing :) With my change, it should be quite straight forward to handle that case. But I would prefer to do that in a different patch

Sounds good to me.

fhahn added a comment.Apr 5 2018, 11:04 AM

I thought

lib/Transforms/Scalar/CallSiteSplitting.cpp
468–479

In this case (test_cond_no_effect), we should just pass the known value without splitting. Looks like I tried thing similar before in https://reviews.llvm.org/D41782.

I thought a bit more about how to best handle propagating common facts that do not require splitting. I do not think we can do it properly here without making things to complicated. I've shared a WIP patch that is a first step towards enabling IPSCCP to propagate facts from compare instructions : D45330

fhahn updated this revision to Diff 173155.Nov 8 2018, 6:36 AM

Resurrecting the patch. The cases we loose (propagating %a == null in test_cond_no_effect) is now handled by ipsccp, which is a better place IMO. @junbuml what do you think?

This patch makes CallSiteSplitting stop backtracking once it is not necessary any more and avoid splitting unnecessarily.

junbuml accepted this revision.Nov 8 2018, 1:24 PM

Thanks for revisiting this.
LGTM.

lib/Transforms/Scalar/CallSiteSplitting.cpp
155–156

It seems better to place "To != StopAt" first :

while (To != StopAt && !Visited.count(From->getSinglePredecessor()) &&
         (From = From->getSinglePredecessor()))
This revision is now accepted and ready to land.Nov 8 2018, 1:24 PM
This revision was automatically updated to reflect the committed changes.