This is an archive of the discontinued LLVM Phabricator instance.

[SCEV] Stop blindly propagating flags from inbound geps to SCEV nodes
ClosedPublic

Authored by reames on Sep 14 2021, 3:15 PM.

Details

Summary

This fixes a violation of the wrap flag rules introduced in c4048d8f. This was also noted in the (very old) PR23527.

The issue being fixed is that we assume the inbound flag on any GEP assumes that all users of *any* gep (or add) which happens to map to that SCEV would also be UB if the (other) gep overflowed. That's simply not true.

In terms of the test diffs, I don't see anything seriously problematic. The lost flags are expected (given the semantic restriction on when its legal to tag the SCEV), and there are several cases where the previously inferred flags are unsound per the new semantics.

The only common trend I noticed when looking at the deltas is that by not considering branch on poison as immediate UB in ValueTracking, we do miss a few cases we could reclaim. We may be able to claw some of these back with the follow ideas mentioned in PR51817.

It's worth noting that most of the changes are analysis result only changes. The two transform changes are pretty minimal. In one case, we miss the opportunity to infer a nuw (correctly). In the other, we fail to fold an exit and produce a loop invariant form instead. This one is probably over-reduced as the program appears to be undefined in practice, and neither before or after exploits that.

Diff Detail

Event Timeline

reames created this revision.Sep 14 2021, 3:15 PM
reames requested review of this revision.Sep 14 2021, 3:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 14 2021, 3:15 PM
nikic accepted this revision.Sep 30 2021, 1:07 PM

LGTM in that the code change here is clearly right. If we get too many perf regression reports though, we may have to back this out and do some improvements to the poison reasoning first. I don't consider fixing this particularly urgent, as it's a very long-standing known bug.

Note that there are some polly test failures -- please update those as well to make sure there are no surprises there.

llvm/test/Analysis/LoopAccessAnalysis/memcheck-wrapping-pointers.ll
51

I believe we're missing out here, because isSCEVExprNeverPoison() only looks at the immediate operations of the SCEV to find an addrec. In this case, it would have to look through the mul and the zext to find it.

llvm/test/Analysis/LoopCacheAnalysis/PowerPC/stencil.ll
15

Lower cost after less precise analysis seems suspicious...

llvm/test/Analysis/ScalarEvolution/load.ll
76

We should be able to keep this one by considering not just addrecs, but also SCEVUnknown defined in the loop header.

This revision is now accepted and ready to land.Sep 30 2021, 1:07 PM

If we get too many perf regression reports though, we may have to back this out and do some improvements to the poison reasoning first. I don't consider fixing this particularly urgent, as it's a very long-standing known bug.

We're caught in a bit of a catch-22 here. If I strengthen the poison reasoning without fixing this and D109845, we start seeing miscompiles in practice. I'd originally hoped to start with the stronger analysis, but my first attempt locally turned up some worrying results. If I can get these in, the basic structure I used in D109845 ports to the generic case exactly as you mention in comments.

At worst, I'll put both under a flag, implement the enhancements to the analysis under the same flag, then enable all of it in one big flag throw. I'm really hoping not to have to do that though...

This revision was landed with ongoing or failed builds.Oct 1 2021, 4:32 PM
This revision was automatically updated to reflect the committed changes.
dim added a subscriber: dim.Dec 6 2021, 2:27 PM

From a fellow FreeBSD user I received a test case which *appears* to be fixed by this particular review/commit, and which otherwise leads to an assertion:

Assertion failed: (isSCEVable(Ty) && "Type is not SCEVable!"), function getTypeSizeInBits, file /usr/src/contrib/llvm-project/llvm/lib/Analysis/ScalarEvolution.cpp, line 3922.
PLEASE submit a bug report to https://bugs.freebsd.org/submit/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: clang -cc1 -triple x86_64-- -S -target-cpu znver3 -O2 read_termcap-min.c
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'read_termcap-min.c'.
4.      Running pass 'Loop Pass Manager' on function '@e'
5.      Running pass 'Loop Strength Reduction' on basic block '%while.cond2.15'

However, it *needs* the -target-cpu znver3, otherwise the assertion won't trigger. I assume this review/commit is not expected to fix this?

For reference, the minimized test case is just this (easiest is to use clang 13.0.0 to check it, it should assert right away):

// clang -cc1 -triple x86_64-- -S -target-cpu znver3 -O2 read_termcap-min.c
char *a;
char **b;
char *d[0];
int e() {
  char c[0];
  b = d;
  a = c;
  while (++a)
    if (*a) {
      while (++a)
        if (*a)
          break;
      b++;
      if (b >= d + 32)
        break;
    }
  return 0;
}

From a fellow FreeBSD user I received a test case which *appears* to be fixed by this particular review/commit, and which otherwise leads to an assertion:

I would doubt there's any connection here. I suspect that if you reduce this to an IR example using clang-13 and llvm-reduce/bugpoint that the reduced test case would still trigger after this change.

jrtc27 added a subscriber: jrtc27.Dec 11 2021, 1:34 PM

Indeed, both die on https://godbolt.org/z/zh81Pa8fd with a stack overflow (presumably that's what happens if assertions are disabled and don't catch it), and both are happy with trunk's IR (https://godbolt.org/z/v31Tzo7zP for both lots of IR). Something in SCEV/loop-reduce is getting very confused and keeps inserting new copies of the same GEP until the recursive expansion runs out of stack space.

Can you file a bug now that the transition is complete?

dim added a comment.Dec 14 2021, 10:24 AM

Can you file a bug now that the transition is complete?

https://github.com/llvm/llvm-project/issues/52697