This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Use loop guard info when computing the max BE taken count in howFarToZero.
ClosedPublic

Authored by fhahn on Sep 4 2019, 8:14 AM.

Details

Summary

For some expressions, we can use information from loop guards when
we are looking for a maximum. This patch applies information from
loop guards to the expression used to compute the maximum backedge
taken count in howFarToZero. It currently replaces an unknown
expression X with UMin(X, Y), if the loop is guarded by
X ult Y.

This patch is minimal in what conditions it applies, and there
are a few TODOs to generalize.

This partly addresses PR40961. We will also need an update to
LV to address it completely.

Event Timeline

fhahn created this revision.Sep 4 2019, 8:14 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2019, 8:14 AM

Instead of writing a C++ unittest, you should be able to use "opt -analyze -scalar-evolution" to test this.

Not sure I like the approach; rewriting the SCEV to a more complicated expression seems like it's stacking complexity. But it might be the least code overall, I guess.

fhahn added a comment.Sep 4 2019, 1:22 PM

Instead of writing a C++ unittest, you should be able to use "opt -analyze -scalar-evolution" to test this.

Sure, I'll do that. The current version of the unit test only tests stuff covered by -analyze -scalar-evolution.

Not sure I like the approach; rewriting the SCEV to a more complicated expression seems like it's stacking complexity. But it might be the least code overall, I guess.

Yep, the re-writing the expression was the least invasive change. But I would be happy to change the implementation to use additional conditions differently. To cover the motivating case for the patch, it would be enough to collect the additional conditions, pass them to getRangeRef and use them there. But I think the additional conditions could be helpful in the various reasoning functions as well. Currently we can pass one additional condition to most reasoning functions, but here it would be helpful to pass multiple conditions. I am happy with either direction, please let me know what you prefer :)

In terms of general API, I don't think we want to expose "applyLoopGuards"; the SCEV transform proposed here isn't really useful outside of trying to find the minimum or maximum, as far as I can tell. Which min/max expressions we want to form depends on whether we're computing a "max" or a "min". And restricting the API so the point in the CFG we're querying has to be a loop header doesn't seem helpful; other places might care about values after a loop etc.

In terms of the implementation, this composes well, in a sense: it annotates the relevant SCEV expressions with the relevant conditions, then uses the general implementation that ignores control flow. But I'm not sure how it scales for larger SCEV expressions; keeping a map on the side seems like it would have more predictable performance.

I don't really have a general vision for what the range computation implementation should look like in the future: what sources of information are practical to use? What overall data structure do we use to integrate them?

fhahn added a comment.Sep 6 2019, 12:58 PM

In terms of general API, I don't think we want to expose "applyLoopGuards"; the SCEV transform proposed here isn't really useful outside of trying to find the minimum or maximum, as far as I can tell. Which min/max expressions we want to form depends on whether we're computing a "max" or a "min". And restricting the API so the point in the CFG we're querying has to be a loop header doesn't seem helpful; other places might care about values after a loop etc.

In terms of the implementation, this composes well, in a sense: it annotates the relevant SCEV expressions with the relevant conditions, then uses the general implementation that ignores control flow. But I'm not sure how it scales for larger SCEV expressions; keeping a map on the side seems like it would have more predictable performance.

Yeah, duplicating a whole expression just because we want to attach additional range information could lead to poor performance for large expressions. I'll look into the map approach.

I don't really have a general vision for what the range computation implementation should look like in the future: what sources of information are practical to use? What overall data structure do we use to integrate them?

Do you mean in general or in for SCEV?

Do you mean in general or in for SCEV?

Well, I guess there's a question in general, but here I'm more concerned specifically about SCEV: what other analysis can it use, what sort of data structures can we use for caching. But maybe I'm looking too many steps ahead; I don't want to get in the way of an incremental improvement.

SCEV already has support for isLoopEntryGuardedByCond which should be a super set of the added code. Have you explored why the distance expression isn't be canonicalized at construction?

(There are some circularity issues between exit counts and scev canonical forms, so that might be your issue. Or it might be something easy to fix. We can always hope.)

fhahn planned changes to this revision.Sep 25 2019, 2:18 PM

SCEV already has support for isLoopEntryGuardedByCond which should be a super set of the added code.

Agreed, I think the final version of the patch should definitely share the guard discovery logic with isLoopEntryGuardedByCond.

Have you explored why the distance expression isn't be canonicalized at construction?

Do you mean why the information from the loop guard is not included in the distance expression when we create it?
IIUC, the distance expression here is the subtraction of RHS and LHS of the exit condition. As Eli indicated, including information from the guards in the expressions for the operands of the condition might increase the complexity, without much gain, besides using it to get more precise upper bounds on ranges. Do you think there would be additional benefits for including information from the guards in the distance expression, beside better range information? AFAIK, some of the reasoning methods themselves try to rewrite/simplify expressions based on information from loop guards and we might be able to skip some of those rewrites.

fhahn updated this revision to Diff 292596.Sep 17 2020, 12:54 PM

Finally had some time to get back to this one. Rebased and also added SCEV test for PR40961. I will also post a follow-up that extends this to also support assumes to catch PR47247.

Sorry for the long delay! There is still bits that can be improved (e.g. sharing code with isLoopEntryGuardedByCond), but I'd like to converge on the general approach before focusing on that.

SCEV already has support for isLoopEntryGuardedByCond which should be a super set of the added code.

Yes, this code here is similar to isLoopEntryGuardedByCond in that it looks at the same information (conditions dominating the loop body). But it is also different conceptually I think. isLoopEntryGuardedByCond is useful, if we known which question to ask, e.g. we could use the loop guards to check if an expressions is less than a constant (but we need to know which specific constant/condition to check).

Unfortunately I do not think this is very useful to clamp the range of the expressions at hand here, because it would require both explicitly checking what information is available through guards, as well as checking how it is used in the expression.

The approach of applyLoopGuards is different: the idea is to re-write an expression with information from the loop guards and use the existing reasoning to try to use the additional info to simplify the overall expression. This approach seems quite effective and allows for all the existing simplification logic to be completely reused.

Have you explored why the distance expression isn't be canonicalized at construction?

(There are some circularity issues between exit counts and scev canonical forms, so that might be your issue. Or it might be something easy to fix. We can always hope.)

I thought about this one a bit more. I am not sure any general canonicalization is missing here, because the information we apply here is very context specific: we can use information from the guards, because we construct an expression that is only valid in the loop body. And the backedge taken count is a artificial expression we are building up specifically in howFarToZero (and it's sibling functions). Unless I am missing something, I don't think there is any general canonicalization that is missing here.

reames accepted this revision.Sep 17 2020, 1:55 PM

LGTM. This makes a nice starting point - as you note, we can definitely extend this.

This revision is now accepted and ready to land.Sep 17 2020, 1:55 PM
fhahn updated this revision to Diff 293210.Sep 21 2020, 10:14 AM

LGTM. This makes a nice starting point - as you note, we can definitely extend this.

Thank you very much for taking a look! I made some smaller changes to fix a few edge-cases and extended the test coverage in 3cbdfe424fec.

I plan to land this in a day or two.

fhahn updated this revision to Diff 293991.Sep 24 2020, 3:01 AM

Another rebase just before committing.

This revision was landed with ongoing or failed builds.Sep 24 2020, 3:14 AM
This revision was automatically updated to reflect the committed changes.