Page MenuHomePhabricator

[JumpThread] Enhance finding partial redundant loads by continuing scanning single predecessor
ClosedPublic

Authored by junbuml on Jan 26 2017, 3:26 PM.

Details

Summary

While scanning predecessors to find an available loaded value, if the predecessor has a single predecessor, we can continue scanning through the single predecessor.

Diff Detail

Repository
rL LLVM

Event Timeline

junbuml created this revision.Jan 26 2017, 3:26 PM
rengolin added inline comments.Jan 30 2017, 5:14 AM
lib/Transforms/Scalar/JumpThreading.cpp
1009 ↗(On Diff #85973)

Why not just transform the single access above into the loop below? Why do you need both?

test/Transforms/JumpThreading/thread-loads.ll
312 ↗(On Diff #85973)

Is this just the phi, or is there also a call to fn2(%l2)?

If c1/c3 is false and c2 is true, then only fn2 is called, not fn3, which means the new cond3 block has to be only conditionalised via c1 not c2, as in the original IR.

326 ↗(On Diff #85973)

Isn't there just one hop here? %l2 -> %l1?

I thought you were testing multiple predecessors.

junbuml updated this revision to Diff 86333.Jan 30 2017, 1:15 PM
junbuml marked 3 inline comments as done.

Addressed Renato's comments.

junbuml added inline comments.Jan 30 2017, 1:17 PM
test/Transforms/JumpThreading/thread-loads.ll
312 ↗(On Diff #85973)

If c1/c3 is false and c2 is true, only fn2() will be called, so in the new cond2, it unconditionally branch to %end after calling fn2(). I added CHECKs for the branch to %end in cond2 and for the call to fn2(%l2) in cond2.

When %c1 is false, it directly branch to %cond3 from entry. We have this CHECK in entry. So, we only branch to cond2 from cond1 by %c1 when c1/c3 is false.

326 ↗(On Diff #85973)

It's one hop from cond2 to entry, but cond2 -> cond1 -> entry is two hop.
I also added another test which has three hop.

zzheng added a subscriber: zzheng.Jan 30 2017, 2:29 PM
zzheng added inline comments.
include/llvm/Analysis/Loads.h
89 ↗(On Diff #86333)

Can we use

Optional<unsigned &> NumScannedInst

here?

junbuml added inline comments.Jan 31 2017, 7:38 AM
include/llvm/Analysis/Loads.h
89 ↗(On Diff #86333)

Honestly I don't have much idea about the Optional<>, but it looks okay to use it here. However, I believe we can make a separate patch to consider using the Optional for other parameters, not just for this parameter. Please let me know if we have to use the Optional specifically in this parameter unlikely other parameters.

rengolin added inline comments.Jan 31 2017, 8:17 AM
include/llvm/Analysis/Loads.h
89 ↗(On Diff #86333)

I agree this sounds like something for a different patch.

zzheng added inline comments.Jan 31 2017, 8:41 AM
include/llvm/Analysis/Loads.h
89 ↗(On Diff #86333)

I agree with you guys. I was only looking at the new parameter and its usage and thought it's appropriate to use Optional<>.

rengolin accepted this revision.Jan 31 2017, 9:12 AM

LGTM, thanks!

test/Transforms/JumpThreading/thread-loads.ll
326 ↗(On Diff #85973)

Right, now it makes more sense. Thanks!

This revision is now accepted and ready to land.Jan 31 2017, 9:12 AM
This revision was automatically updated to reflect the committed changes.

Why is the right answer here to improve jump threading and not PRE if you want to catch more "partially redundant" loads ?

Additionally, why is this necessary at all?
-gvn already takes care of both of your testcases.

llvm/trunk/lib/Analysis/Loads.cpp
316

Scanned

348

Scanned

349

Scanned

Also, for the record, all of these testcases are fully redundant loads.

they are all of the form

load
if (...)
{
load
}

This is a full redundancy. The initial load dominates the others, with nothing in between, and so they are fully redundant This is also why every pass we have that does any load elimination will already eliminate them.

A partially redundant load would be

if (a)
 load a
else
  <nothing>
load a

This load is partially redundant because it unnecessarily happens twice when we take the if(a)'s true branch

PRE will turn it into:

if (a)
 load a
else
 load a
result = phi(...)

As mentioned, GVN already does this transformation, so i'd really like to see a testcase where you doing this in jump threading enables an optimization we don't get already.