This is an archive of the discontinued LLVM Phabricator instance.

[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers.
ClosedPublic

Authored by efriedma on Jun 23 2021, 11:48 AM.

Details

Summary

As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers.

There are a few ways we could deal with this:

  1. We could completely forbid subtracting pointers in getMinusSCEV()
  2. We could forbid subracting pointers with different pointer bases (this patch).
  3. We could try to ptrtoint pointer operands.

The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option.

As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact.

This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better.

I deleted the test lsr-undef-in-binop.ll becuase I couldn't figure out how to repair it to test what it was actually trying to test.

Diff Detail

Event Timeline

efriedma created this revision.Jun 23 2021, 11:48 AM
efriedma requested review of this revision.Jun 23 2021, 11:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2021, 11:48 AM

This looks reasonable. I'd probably still perfect we just special case the non-integral pointers here because I don't see any reason to harm LSR for integral ones, but if you really really want to treat all pointers the same, this looks acceptable.

Let's give it a day or two for discussion, but if we find nothing better, I'd be willing to LGTM this.

I'd probably still perfect we just special case the non-integral pointers here because I don't see any reason to harm LSR for integral ones,

We could do something different for LSR icmpzero, that's different from what we do with getMinusSCEV in general, for integral pointers. It's not hard to change, the question is what exactly can we do. These are the possibilities I've thought of:

  1. What we do without any changes. This involves multiplying pointers by -1, which violates the rules laid out in D104498. This is not something we really to be doing; even with integral pointers, it leads to issues with pointer provenance.
  2. What this patch does, don't do icmpzero analysis for pointers with different pointer bases. The change seem mostly okay.
  3. Explicitly ptrtoint icmp operands. This isn't too hard to implement. Without any other changes, it produces a different set of diffs, which isn't obviously better. (The issue here is that "p" and "ptrtoint p" are different SCEV expressions, so LSR heuristics treat them differently.)
  4. Explicitly ptrtoint one of the operands of the icmp. Or a variation, explicitly ptrtoint one of the operands of the icmp if they have different pointer bases. I think I briefly tried this, but I don't remember the details; I'll experiment a bit more.
  5. Explicitly ptrtoint all pointers analyzed by LSR, so the heuristics treat them as equal. This produces messy results, which are hard to clean up.
mkazantsev added inline comments.Jun 23 2021, 9:39 PM
llvm/lib/Analysis/ScalarEvolution.cpp
4145

Should we fail by assertion if RHS is a pointer and LHS is not? I cannot figure a physical meaning of such situation.

efriedma added inline comments.Jun 23 2021, 10:37 PM
llvm/lib/Analysis/ScalarEvolution.cpp
4145

The result almost certainly doesn't represent any value the program cares about, sure. And probably no optimization would subtract a pointer from an integer intentionally. Some of the code I'm patching just didn't consider whether a SCEV was pointer-typed, though; for example, isImpliedCondBalancedTypes() used getNotSCEV() on a pointer.

Maybe I'll leave adding an assertion for a followup, though? I'm already sort of worried about this causing assertion failures in some codepath that isn't covered by the regression tests.

I'm generally worried that previously minus SCEV was always returning something, and now it's returning CouldNotCompute. We may end up catching multiple failures with unexpected CouldNotCompute in different places, such as implication proof.

Do we know how many situations may lead to this query? I would strongly prefer to assert away this situation and avoid ever making a query to compute minus if SCEVs with different bases. Unless it is, for some reason, a massive amount of work.

I think the options we have are either we change getMinusSCEV to have this behavior, or we assert the operands to getMinusSCEV aren't pointers and add a new API that has this behavior. See the current version of D104498 for the impact of adding a new getPointerDiff() API. @reames didn't think the new API was necessary. I'm sort of on the fence; I don't think adding a new API that's essentially the same thing as getMinusSCEV() is really that useful, but it is easier to audit.

@reames didn't think the new API was necessary. I'm sort of on the fence; I don't think adding a new API that's essentially the same thing as getMinusSCEV() is really that useful, but it is easier to audit.

I'd still prefer not adding a new API. There aren't that many uses of this API. I also don't really care though. So I'll defer to Eli's preference here since he's the one actually doing the work.

@mkazantsev Any further thoughts?

mkazantsev accepted this revision.Jul 5 2021, 8:51 PM

Sorry for late response.
I'm fine with that, but please revert it if we face massive CouldNotCompute storm in different places. It's the only concern I have, and it's more theoretical than real.

This revision is now accepted and ready to land.Jul 5 2021, 8:51 PM
This revision was landed with ongoing or failed builds.Jul 6 2021, 10:55 AM
This revision was automatically updated to reflect the committed changes.

We're seeing a crash after this patch. Sent out reproducer in https://bugs.llvm.org/show_bug.cgi?id=51787.

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp