Page MenuHomePhabricator

[SCEV] Fix nsw flags for GEP expressions
ClosedPublic

Authored by nikic on Nov 2 2020, 2:25 PM.

Details

Summary

The SCEV code for constructing GEP expressions currently assumes that the addition of the base and all the offsets is nsw if the GEP is inbounds. While the addition of the offsets is indeed nsw, the addition to the base address is not, as the base address is interpreted as an unsigned value (gep inbounds 0x7fffffff, 1 is legal).

Fix the GEP expression code to not assume nsw for the base+offset calculation. However, do assume nuw if we know that the offset is non-negative. With this, we use the same behavior as the construction of GEP addrecs does. (Modulo the fact that we disregard SCEV unification, as the pre-existing FIXME points out).

Diff Detail

Event Timeline

nikic created this revision.Nov 2 2020, 2:25 PM
nikic requested review of this revision.Nov 2 2020, 2:25 PM
nikic added inline comments.Nov 2 2020, 2:34 PM
llvm/test/Analysis/ScalarEvolution/nsw.ll
130–131

An unexpected improvement.

llvm/test/Transforms/LoopFusion/simple.ll
301

Genuine regression. Looking at the loop fusion log, the problem is that SCEV can no longer prove

Before:
    Relation: {%arg,+,4}<nsw><%bb19>  >=  {(-12 + %arg)<nsw>,+,4}<nsw><%bb19>

After:
    Relation: {%arg,+,4}<nuw><%bb19>  may <  {(-12 + %arg),+,4}<nw><%bb19>

where the comparisons are in a signed sense. And indeed, I do not believe that condition actually holds.

mkazantsev added inline comments.Nov 2 2020, 8:56 PM
llvm/lib/Analysis/ScalarEvolution.cpp
3487

How about case when gep has no inbounds flag, but we can prove that base+ offset does not sign-wrap? Likewise, we could prove that base + offset does not unsign-wrap even without non-negative fact for offset in some cases. Why don't we use isKnownPredicate(slt/ult, base, base + offset) (at least in affine case)?

mkazantsev added inline comments.Nov 2 2020, 9:04 PM
llvm/lib/Analysis/ScalarEvolution.cpp
3488

Isn't inbounds gep always nuw? Do we really need the non-negative check?

I'd suggest spllitting this one into two parts: fix for illegal nsw and adding of missing nuw (which is pure improvement).

nikic added inline comments.Thu, Nov 5, 12:25 PM
llvm/lib/Analysis/ScalarEvolution.cpp
3487

If we can prove that, the fact wouldn't follow from the presence of inbounds in the GEP -- in fact, we could use this method to prove additional nowrap flags on arbitrary adds. As such, I don't think that this code is the right place to perform such a check.

3488

As the offset is interpreted as a signed number, no. For a negative offset, sub %base, -1 * %offset would be nuw, but we don't have a way to express that.

mkazantsev accepted this revision.Sun, Nov 8, 8:06 PM

Looks good, thanks!

llvm/lib/Analysis/ScalarEvolution.cpp
3488

Ah, got it. It's nasty. Thanks for explaining!

This revision is now accepted and ready to land.Sun, Nov 8, 8:06 PM
This revision was landed with ongoing or failed builds.Fri, Nov 13, 9:19 AM
This revision was automatically updated to reflect the committed changes.

This patch does not have a new test case added to demonstrate what this patch is trying to fix or reference the bug number showing the bug end-to-end.

This patch does not have a new test case added to demonstrate what this patch is trying to fix or reference the bug number showing the bug end-to-end.

This does not fix a specific end-to-end miscompile. It fixes incorrect analysis results that may result in a miscompile. See also D90708 for a clarification of GEP inbounds semantics in LangRef.

thanks for the background. Regarding D90708 about the LangRef change, was there a RFC before the patch?

This patch does not have a new test case added to demonstrate what this patch is trying to fix or reference the bug number showing the bug end-to-end.

This does not fix a specific end-to-end miscompile. It fixes incorrect analysis results that may result in a miscompile. See also D90708 for a clarification of GEP inbounds semantics in LangRef.

We're actually seeing this *cause* a miscompile, AFAICT.

The method my_strnxfrm_unicode_full_bin from mysql 5.7 [1] has a loop like:

const uchar *const weightend = dst + (nweights*3);
const uchar *const end = (weightend < de) ? weightend : de;

while (dst < end-2)
{
  *dst++= 0x00;
  *dst++= 0x00;
  *dst++= 0x20;
  nweights--;
}

The failure we see is ASAN kill the process with heap-buffer-overflow when it runs into bad memory. However, the write is well past end, indicating the loop is not exiting when it should.

I dumped the IR for this method, and it looks like the beginning & end changed like so:

  %add.ptr56 = getelementptr inbounds i8, i8* %cond, i64 -2, !dbg !1195  ; This is "end - 2"
...
  br i1 %cmp57212, label %while.body, label %while.end, !dbg !1197

while.body:                                       ; preds = %if.then48, %163
...
  %incdec.ptr65 = getelementptr inbounds i8, i8* %dst.addr.4214, i64 3, !dbg !1202  ; This is the last "*dst++ = 0x20" expression
...
  %cmp57 = icmp ult i8* %incdec.ptr65, %add.ptr56, !dbg !1196  ; Compare dst against end - 2
  br i1 %cmp57, label %while.body, label %while.end, !dbg !1197, !llvm.loop !1205

After this patch, we see:

  %add.ptr56 = getelementptr inbounds i8, i8* %cond, i64 -2, !dbg !1195
...
  br i1 %cmp57212, label %while.body.preheader, label %while.end, !dbg !1197

while.body.preheader:                             ; preds = %if.then48
  %scevgep228 = getelementptr i8, i8* %dst.addr.3, i64 3, !dbg !1197
  br label %while.body, !dbg !1197

while.body:                                       ; preds = %while.body.preheader, %163
...
  %incdec.ptr65 = getelementptr inbounds i8, i8* %dst.addr.4214, i64 3, !dbg !1202
...
  %cmp57229 = icmp ult i8* %scevgep228, %add.ptr56, !dbg !1197
  br i1 %cmp57229, label %while.body, label %while.end, !dbg !1197, !llvm.loop !1205

If I'm reading the diff correctly, this patch adds a while.body.preheader BB for the SCEV GEP expression, and at the end compares that against end - 2 to exit the loop. However, we jump back to the loop body, not the preheader, so the GEP expression is never updated. And hence, we end up with an infinite loop.

[1] https://github.com/mysql/mysql-server/blob/5.7/strings/ctype-utf8.c#L5217

Turns out this isn't failing at trunk anymore. It looks like this was fixed by 02fdbc3567249471349474c70828cb5a5d4881c8. For posterity, here's my reduction:

#include <cstdio>

void repro(char *dst, size_t dstlen, char *src) {
  char *end = dst + dstlen;

  while (dst < end - 3) {
    *dst++ = 0x00;
    *dst++ = 0x00;
    *dst++ = 0x00;
  }
}

int main() {
  char src[] = {1, 2, 3};
  char dst[100];
  repro(dst, 100, src);
}

That fails with running with clang++ -O1 -fsanitize=address,null -fno-sanitize-recover=all