This is an archive of the discontinued LLVM Phabricator instance.

[JumpThreading] Teach jump threading how to analyze (and (cmp A, C1), (cmp A, C2)) after InstCombine has turned it into (cmp (add A, C3), C4)
ClosedPublic

Authored by craig.topper on May 16 2017, 3:45 PM.

Details

Summary

Currently JumpThreading can use LazyValueInfo to analyze an 'and' or 'or' of compares if the compares are fed by a livein of a basic block. This can be used to prove the condition can't be met for some predecessor, and the jump from that predecessor can be moved to the false path of the condition.

But if the compare is something that InstCombine turns into an add and a single compare, it can't be analyzed because the livein is now an input to the add and not the compare.

This patch adds a new getPredicateOnEdgeAfterAdd entry point to LazyValueInfo that allows for an adjustment to be made to the livein's range before doing the compare predicate check. JumpThreading is then taught to use this entry point if it can see that there is an add of a livein with a constant that is used by a compare.

I'm not sure this is the right division of labor between the two passes so I'm open to other opinions on where the line should be.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.May 16 2017, 3:45 PM
reames edited edge metadata.May 16 2017, 5:27 PM

Can you give a test case without the instcombine processing? (i.e. I'd like to see what jump-threading is actually encountering) At least on first glance, this looks like something LVI should just be able to analyse without any special casing in jump threading.

Change test case to be the output after InstCombine.

@reames were you able to take a look at this?

Ok, I finally got a chance to look at this, sorry for the long delay.

I see the problem you're trying to solve, but I think this patch goes about it the wrong way. I don't think you need any special handling per se; this is a case that JumpThreading::ComputeValueKnownInPredecessors should handle for you. We should be able to figure out that the %0 icmp expression is known to be false if the "if.then, if.end" edge is executed. In fact, the handling for recursion on icmps and binary operators appears to already be there, so I'm not quite sure why it's not working today. I think this likely comes down to some small bug in either CVKInP or in the jump-threading logic itself.

The problem is that LVI knows the value on the edge is within a specific ConstantRange, but the method we call in LVI is getConstantOnEdge which can't return a range.

reames added a comment.EditedMay 30 2017, 4:05 PM

Ok, my previous comment was wrong. This isn't something the existing infrastructure can handle because CVKInP reasons only about constants and we need to forward propagate constant ranges here to discharge the icmp. That's what I'd missed on the first look.

I think there's a couple of reasonable ways to implement this:

  1. ask for the constant range of each input to the add, then forward propagate using the ConstantRange accessors. Doing this specifically for add directly in CVKInP doesn't seem too ugly.
  2. restructure CVKInP to work in terms of constant ranges. If we'd directly returned the CRs from the inputs, and unwound the recursion in CVKInP applying the forward propagation rules (i.e. the add), then we'd handle not just this case, but many others. (CVKInPred essentially does this today for constants using the constant expressions.)
  3. we could sink the result of (2) directly into LVI - this is close to what you're doing currently, but substantially more general.
  4. In an alternate approach, we could backwards propagate a predicate (think weakest precondition style) to the beginning of the block, then ask LVI to discharge that predicate for each incoming value. (Clarification: LVI already does something similar to this inside getPredicateAt. We don't currently handle the case within a single block, but potentially could. If we did, we should really separate this into its own analysis layered on top of LVI.)

(1-3) and (4) have slightly different power. Doing (2) is probably the most straightforward extension of the existing logic, but we already have some reasoning ala (4) in LVI itself. Either approach seems reasonable and I could be convinced we should do either. I'd lean towards forward propagation of constant ranges just because that'd probably be easier to write and reason about.

The problem is that LVI knows the value on the edge is within a specific ConstantRange, but the method we call in LVI is getConstantOnEdge which can't return a range.

Our comments interwove here. :) I think I came to the same conclusion you did, right?

Farhana edited edge metadata.May 30 2017, 6:38 PM

It seems to me that the improvement is done in the wrong place, JumpThreading, which makes the change-set very specific. In my opinion, it does not have to be specific to JumpThreading. It can be a very general improvement of LVI using your new utilities.

Basically, we can have the following:

  • getPredicateOnEdge() does not require a value to be live-in but later functions do. So, I would think we can easily extend getPredicateOnEdge() to handle any kind of values/ranges and do the forward substitution in getPredicateOnEdge(). Basically, the new code you have in JumpThreading that is processing a local value and computing the non-local value can be placed inside getPredicateOnEdge(). We can also handle all binary operators.
  • Then allow any kind of values from ComputeValueKnownInPredecessors() to be queried in getPredicateOnEdge().

@reames For 1, were you envisioning detecting the add at the time we're visiting the compare, similar to what I do now? Or were you envisioning propagating the add result as we unwind CVKInP after visiting the add directly? The latter I think would require us to make CVKInP work in ConstantRanges like suggestion 2. For the former I think I'd also have to replicate some of the compare handling from LVI to calculate the result for the range from the add?

For 2, I think our range solving capabilities are considerably weaker than our ConstantExpr handling so I think we'd still need to rely on ConstantExpr when we have single element ranges?

@Farhana for your suggestion of moving the logic into getPredicateOnEdge. We'd have to decide how far back to make getPredicateOnEdge search for a livein. One instruction? Multiple instructions? We'd also still need some check in jump threading to know whether we are looking at a livein or whether we should recurse into CVKInP like we do today. So we'd have to try getPredicateOnEdge and if it fails, recurse into CVKInP only if we aren't looking at a livein.

@reames For 1, were you envisioning detecting the add at the time we're visiting the compare, similar to what I do now? Or were you envisioning propagating the add result as we unwind CVKInP after visiting the add directly? The latter I think would require us to make CVKInP work in ConstantRanges like suggestion 2. For the former I think I'd also have to replicate some of the compare handling from LVI to calculate the result for the range from the add?

I was picturing roughly the following:

  1. Pattern match add leading to compare (as done today)
  2. Ask LVI for constant range representing each input to the add
  3. Forward propagate those CRs using CR functions.
  4. Expose the compare handling you mention as a helper function and use it.

For 2, I think our range solving capabilities are considerably weaker than our ConstantExpr handling so I think we'd still need to rely on ConstantExpr when we have single element ranges?

This sounds like a reasonable implementation technique for ConstantRange when dealing with single element ranges, but the interface to caller code shouldn't reflect that. :)

Implement by asking LVI for a ConstantRange for the add on the edge and propagate forward in JumpThreading.

I tried Farhana's suggestion of moving the logic into getPredicateOnEdge, but I had some trouble with PHINodes in single-basic-block loops. It became difficult to know what the caller was asking in order to decide whether we should look back for an instruction. I think this was also compounded by the fact that we use getPredicateOnEdge for select conditions and maybe other things that aren't real compare instructions.

Lost the test case in the previous patch.

reames accepted this revision.Jun 22 2017, 2:03 PM

LGTM w/minor changes applied, no further review needed.

Very nice.

lib/Transforms/Scalar/JumpThreading.cpp
638 ↗(On Diff #102241)

This would probably be clearer with either a matcher or a helper lambda.

This revision is now accepted and ready to land.Jun 22 2017, 2:03 PM
This revision was automatically updated to reflect the committed changes.

Sorry for the drive by review, but I noticed one cleanup below and I have to ask: why is there no test case here? That seems kinda bad. I feel like there should be tests for folding to true, folding to false, and some negative testing that things which *shouldn't* be threaded here aren't....

llvm/trunk/lib/Transforms/Scalar/JumpThreading.cpp
647–649

You can remove all of the isa<Instruction> and cast<Instruction> dance by using m_Instruction in the pattern match and making AddLHS an Instruction*.

I had a test. Maybe I forgot to git add it when I turned it in. I'll check.

craig.topper added inline comments.Jun 27 2017, 8:16 AM
llvm/trunk/lib/Transforms/Scalar/JumpThreading.cpp
647–649

AddLHS isn't always an instruction. It might be a livein to the block, including an argument to the function, and we need to handle that.

Test case committed in r306416