This is an archive of the discontinued LLVM Phabricator instance.

[ScalarEvolution] Ensure backedge-taken counts are not pointers.
ClosedPublic

Authored by efriedma on Jun 3 2021, 4:09 PM.

Details

Summary

A backedge-taken count doesn't refer to memory; returning a pointer type is nonsense. So make sure we always return an integer.

The obvious way to do this would be to just convert the operands of the icmp to integers, but that doesn't quite work out at the moment: isLoopEntryGuardedByCond currently gets confused by ptrtoint operations. So we perform the ptrtoint conversion late for lt/gt operations.

The test changes are mostly innocuous. The most interesting changes are more complex SCEV expressions of the form "(-1 * (ptrtoint i8* %ptr to i64)) + %ptr)". This is expected: we can't fold this to zero because we need to preserve the pointer base.

The call to isLoopEntryGuardedByCond in howFarToZero is less precise because of ptrtoint operations; this shows up in the function pr46786_c26_char in ptrtoint.ll. Fixing it here would require more complex refactoring. It should eventually be fixed by future improvements to isImpliedCond.

See https://bugs.llvm.org/show_bug.cgi?id=46786 for context.

Diff Detail

Event Timeline

efriedma created this revision.Jun 3 2021, 4:09 PM
efriedma requested review of this revision.Jun 3 2021, 4:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 3 2021, 4:09 PM

I have previously looked into something similar.
Are we not worried that this breaks SCEV for non-integral pointers/too small of an gep index sizes?

llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
835

Please precommit regeneration of check lines in this test file.

llvm/test/Transforms/PhaseOrdering/scev-custom-dl.ll
118

here

lebedev.ri added inline comments.Jun 4 2021, 1:29 AM
llvm/test/Transforms/IndVarSimplify/lftr-reuse.ll
32–33 ↗(On Diff #349718)

I *think* this is not a non-integral pointer, and i think GEP index size (in DL) isn't too small,
so i'm not sure what is going on here?

For pointers where the index type is smaller than the pointer type, coming up with the right result is a little tricky. As far as I can tell from the LangRef rules, icmp compares all the bits of the pointer, not just the index bits? So I guess we can special-case comparisons where both sides have the same pointer base: the non-index bits will be the same. If both sides have the same pointer base, and we've inferred some sort of nowrap, we can then convert the icmp to compare the index bits. I guess we can handle non-integral pointers the same way.

But I don't want to try to implement this logic without any real-world testing. As far as I know, no in-tree targets use this feature. So I'll leave it to someone who cares about such targets, and I don't think it should block this patch.

llvm/test/Transforms/IndVarSimplify/lftr-reuse.ll
32–33 ↗(On Diff #349718)

Maybe something to do with the icmp in the preheader; I'll look further.

For pointers where the index type is smaller than the pointer type, coming up with the right result is a little tricky. As far as I can tell from the LangRef rules, icmp compares all the bits of the pointer, not just the index bits? So I guess we can special-case comparisons where both sides have the same pointer base: the non-index bits will be the same. If both sides have the same pointer base, and we've inferred some sort of nowrap, we can then convert the icmp to compare the index bits. I guess we can handle non-integral pointers the same way.

But I don't want to try to implement this logic without any real-world testing. As far as I know, no in-tree targets use this feature. So I'll leave it to someone who cares about such targets, and I don't think it should block this patch.

So that is "yes, let's \"break\" SCEV for non-integral pointers and datalayouts with too small of an GEP index sizes".
I see. That does indeed allow us to fix SCEV :)

efriedma updated this revision to Diff 349954.Jun 4 2021, 1:23 PM

Also perform ptrtoint conversion when analyzing implied conditions, for consistency with backedge taken analysis. This fixes some minor diffs in the previous version of the patch.

reames requested changes to this revision.Jun 4 2021, 2:44 PM

The implementation and description of this patch seem badly out of sync.

Modifying the loop exit computation to always return an integer type *might* be reasonable. As has been raised already, there are concern with non-integral pointers, or cases where we don't *know* the size of a pointer, but those might be addressable. I have concerns, but they're potentially addressable concerns.

However, that doesn't seem to be what the implementation of this patch actually does? Having isKnownPredicate simply return false for a pointer argument which can't be cast is definitely not okay. Did you maybe upload the wrong diff?

This revision now requires changes to proceed.Jun 4 2021, 2:44 PM

It's the right diff. The initial version only touched the backedge calculation, but it turned out to be sort of tied together with condition handling because the backedge calculation queries it. I'll update the description.

cases where we don't *know* the size of a pointer

There are no such cases. datalayout has been mandatory for a while.

Having isKnownPredicate simply return false for a pointer argument which can't be cast is definitely not okay.

The only pointers which can't be cast are non-integral pointers, and other exotic pointers where the index type is not equal to the pointer width. This doesn't apply to any in-tree target, as far as I know. I'm not sure why a minor optimization loss for out-of-tree targets is "definitely not okay". (We don't even have any test coverage for this besides the one loop in scev-custom-dl.ll.)

efriedma edited the summary of this revision. (Show Details)Jun 7 2021, 12:31 PM

Eli,

Sorry for the silence. Partly busy, and partly having a hard time explaining something which seems obvious to me. Let me give this another try.

First, let's start with the interface question. You seem to be assuming that the only time we'll query facts about two SCEVs is when they're both trip counts. This is blatantly untrue. We will frequently query relations between SCEVs when e.g. discharging loop exits in IndVarSimplify. (And a bunch of other places.) Having those queries return unknown more often would be undesirable.

Second, the getLosslessPtrToIntExpr routine seems to not match what you're describing. It appears to have a bunch of cases where it does not succeed. Examples include min/max expressions, and udivs. (Going by code structure here. If I'm missing something, let me know.) Are you claiming that this routine is guaranteed to succeed? The existing code and comments don't seem to make this obvious.

If you want to achieve the stated goal, have you tried checking for lossless conversion of the *result* of the exit computation? If that worked, it would be a much cleaner and more obvious change. (e.g. never return an exit count which can't be converted to integer type) This would mean instrumenting the construction of ExitLimit to drop information if needed. Or maybe that part of your change with an assert in the ExitLimit constructor? Not sure there.

Having those queries return unknown more often would be undesirable.

I agree.

Second, the getLosslessPtrToIntExpr routine seems to not match what you're describing. It appears to have a bunch of cases where it does not succeed. Examples include min/max expressions, and udivs. (Going by code structure here. If I'm missing something, let me know.) Are you claiming that this routine is guaranteed to succeed? The existing code and comments don't seem to make this obvious.

SCEVPtrToIntSinkingRewriter always succeeds; it inherits implementations for every kind of expression from SCEVRewriteVisitor. The explicit implementations of visitAddExpr and visitMulExpr are just there to preserve the flags. Not sure how you're reaching the conclusion that it fails for min/max expressions.

If you want to achieve the stated goal, have you tried checking for lossless conversion of the *result* of the exit computation? If that worked, it would be a much cleaner and more obvious change. (e.g. never return an exit count which can't be converted to integer type) This would mean instrumenting the construction of ExitLimit to drop information if needed. Or maybe that part of your change with an assert in the ExitLimit constructor? Not sure there.

This could work as an intermediate step. Digging a little, it wouldn't be that painful; there aren't that many places in howFarToZero and howManyLessThans/howManyGreaterThans that would need changes.

But we eventually want something like the current patch, I think. We don't really want to be doing computations like getMinusSCEV(LHS, RHS) where LHS and RHS are pointers. Currently, that returns a SCEV with pointer type, but it doesn't really make sense to treat it as a pointer, particularly in the context of an "icmp eq".

Having those queries return unknown more often would be undesirable.

I agree.

Second, the getLosslessPtrToIntExpr routine seems to not match what you're describing. It appears to have a bunch of cases where it does not succeed. Examples include min/max expressions, and udivs. (Going by code structure here. If I'm missing something, let me know.) Are you claiming that this routine is guaranteed to succeed? The existing code and comments don't seem to make this obvious.

SCEVPtrToIntSinkingRewriter always succeeds; it inherits implementations for every kind of expression from SCEVRewriteVisitor. The explicit implementations of visitAddExpr and visitMulExpr are just there to preserve the flags. Not sure how you're reaching the conclusion that it fails for min/max expressions.

Reading back through, I see you are correct. The only bailout (e.g. return of SCEVCouldNotCompute) is the effective type size check. Can you update/add some comments to the function to that effect?

If you want to achieve the stated goal, have you tried checking for lossless conversion of the *result* of the exit computation? If that worked, it would be a much cleaner and more obvious change. (e.g. never return an exit count which can't be converted to integer type) This would mean instrumenting the construction of ExitLimit to drop information if needed. Or maybe that part of your change with an assert in the ExitLimit constructor? Not sure there.

This could work as an intermediate step. Digging a little, it wouldn't be that painful; there aren't that many places in howFarToZero and howManyLessThans/howManyGreaterThans that would need changes.

But we eventually want something like the current patch, I think. We don't really want to be doing computations like getMinusSCEV(LHS, RHS) where LHS and RHS are pointers. Currently, that returns a SCEV with pointer type, but it doesn't really make sense to treat it as a pointer, particularly in the context of an "icmp eq".

Can I ask you to be incremental here? I'm not really bothered by your change in computeExitLimit, but I am very bothered by the accessor routine changes. This is complicated, and I don't think any of us fully understand it. I'd prefer a series of incremental patches with each one standing on it's own building towards the desired objective. In particular, I'm worried (mostly on your other patch admittedly) that this intersects with the discussion around pointer providence in subtle ways. Incrementalism, and thus time to catch mistakes early, seems very worthwhile here.

On the pointer subtraction point, that's a whole different can of worms. The vectorizers and various loop transforms - I think? - use that idiom for overlap checks. I agree it's very questionable, but I think that's a deeper issue than "just" SCEV.

efriedma updated this revision to Diff 352553.Jun 16 2021, 2:24 PM
efriedma edited the summary of this revision. (Show Details)

Updated to avoid the isImpliedCond changes, mostly. howFarToZero is slightly less powerful with this version of patch (see revised commit message), but hopefully the difference doesn't matter in practice.

reames requested changes to this revision.Jun 18 2021, 10:00 AM

This looks really close to good to go.

One minor comment inline, but the main thing is we need to tweak getLosslessPtrToIntExpr to return CouldNotCompute for a non-integral type. As described in D104547, we can't insert new ptrtoints for non-integrals. We could move a single cast down to an operand, but we can do that later. A simple bailout should be fine for the moment. I don't think that really changes anything else in the patch.

llvm/lib/Analysis/ScalarEvolution.cpp
7513–7514

You've got the refactoring to reuse the constructor mixed in with functional changes. Would you mind splitting that out, landing, and rebasing this change?

This revision now requires changes to proceed.Jun 18 2021, 10:00 AM
efriedma updated this revision to Diff 353254.Jun 20 2021, 5:47 PM

Address review comments.

reames accepted this revision.Jun 21 2021, 9:29 AM

LGTM.

For the record, some of the test diffs are a bit concerning code quality wise, but I think we may need to accept them to achieve the goal. At least a couple of them have obvious possibilities for improvement in follow ups if needed, so I'm willing to have this go in. We may end up having to revert, fix a few things, then reapply though. We'll see.

Also, mostly for myself when reading through this later, this may regression exit counts for non-integral pointer tested loops, but that's - I think - okay because they're not idiomatic. Once (if?), we have a generic answer for pointer subtraction we can relax some of these and restore power while maintaining integer results.

llvm/lib/Analysis/ScalarEvolution.cpp
1084

Just for my context, they are? This is news to me. Do you have an example?

This revision is now accepted and ready to land.Jun 21 2021, 9:29 AM
efriedma added inline comments.Jun 21 2021, 10:09 AM
llvm/lib/Analysis/ScalarEvolution.cpp
1084

I got this impression reading the patches that led up to adding the "indexing type" for pointers. I might be wrong, though.

This revision was landed with ongoing or failed builds.Jun 21 2021, 4:25 PM
This revision was automatically updated to reflect the committed changes.