This is an archive of the discontinued LLVM Phabricator instance.

[Vectorization] Actually return from error case in isStridedPtr
ClosedPublic

Authored by timshen on Jan 5 2016, 2:43 PM.

Details

Summary

The early return seems to be missed. This causes a radical and wrong loop optimization on powerpc. It isn't reproducible on x86_64, because "UseInterleaved" is false.

Diff Detail

Event Timeline

timshen updated this revision to Diff 44064.Jan 5 2016, 2:43 PM
timshen retitled this revision from to [Vectorization] Actually return from error case in isStridedPtr.
timshen updated this object.
timshen added subscribers: llvm-commits, echristo.

This test case is a bit large, can it be made smaller?

test/Analysis/LoopAccessAnalysis/interleave_innermost.ll
1 ↗(On Diff #44064)

You can shorten this pipeline with:
; RUN: opt -O2 -S < %s | FileCheck %s

timshen updated this revision to Diff 44072.Jan 5 2016, 4:51 PM

Done.

In order to reproduce the bug, it seems to require a nested loop structure. I've reduced the test file to 54 lines, in which the nested loops are pretty obvious to me. Do you think that's simple enough?

timshen marked an inline comment as done.Jan 5 2016, 4:52 PM

Mark previous comment as done.

The change itself looks good, but there are several things about the test that could be improved:

  1. I think you don't need to run the entire O2 pipeline on the test. For instance, if the offending optimization is "loop-vectorizer", we could extract IR before it, and replace opt -O2 with opt -loop-vectorize on this IR (to find the list of the O2 passes, run opt -O2 -debug-pass=Arguments).
  2. Do you use CHECK + XFAIL only to make sure that the specified lines aren't present in the output? Could you use CHECK-NOT instead? Also, if you run only one pass, it will be much easier to check that nothing unexpected happened with the IR.
  3. Could you please run opt -instnamer on the IR to avoid %123 and <label>:123 names? It's very painful to edit tests with such variable names (as the numbers must be consecutive).
  4. Minor point: you probably could remove function attributes.

Thanks,
Michael

timshen updated this revision to Diff 44080.Jan 5 2016, 8:02 PM

Done. These are really useful options! Thanks!

majnemer edited edge metadata.Jan 5 2016, 8:26 PM

FWIW, the following also triggers the missing early return with opt -loop-vectorize

target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"

define void @TestFoo(i1 %X, i1 %Y) {
bb:
  br label %.loopexit5.outer

.loopexit5.outer:
  br label %.lr.ph12

.loopexit:
  br i1 %X, label %.loopexit5.outer, label %.lr.ph12

.lr.ph12:
  %f.110 = phi i32* [ %tmp1, %.loopexit ], [ null, %.loopexit5.outer ]
  %tmp1 = getelementptr inbounds i32, i32* %f.110, i64 -1
  br i1 %Y, label %bb4, label %.loopexit

bb4:
  %j.27 = phi i32 [ 0, %.lr.ph12 ], [ %tmp7, %bb4 ]
  %tmp5 = load i32, i32* %f.110, align 4
  %tmp7 = add nsw i32 %j.27, 1
  %exitcond = icmp eq i32 %tmp7, 0
  br i1 %exitcond, label %.loopexit, label %bb4
}
test/Analysis/LoopAccessAnalysis/interleave_innermost.ll
1 ↗(On Diff #44080)

Is -instcombine necessary here?

timshen updated this revision to Diff 44090.Jan 5 2016, 9:53 PM
timshen edited edge metadata.

Yes, your test case is much simpler, thanks! :) I tweaked a bit (stride needs to be at least 2 to trigger interleave) to reveal the wrong behavior of the old opt. In such case it has to be a XFAIL, since what we hope is "interleave should not fire" - testing the correct code may be fragile.

timshen marked an inline comment as done.Jan 5 2016, 9:55 PM
timshen added inline comments.
test/Analysis/LoopAccessAnalysis/interleave_innermost.ll
2 ↗(On Diff #44090)

Not applicable. It was necessary, otherwise the old opt produces correct code.

1 ↗(On Diff #44080)

Not applicable. It was necessary, otherwise the old opt produces correct code.

timshen marked 2 inline comments as done.Jan 5 2016, 10:31 PM

Check not?

I'm not sure - can I use CHECK-NOT(s) for a multi-line pattern?

A bit thinking: semantically I want not (LINE_PATTERN_1 and LINE_PATTERN_2 and LINE_PATTERN3), which means I may do something roughly like (not LINE_PATTERN_1) or (not LINE_PATTERN_2) or (not LINE_PATTERN_3), not LINE_PATTERNx can be expressed as CHECK-NOT, but I'm not sure if we have a way to express or.

timshen updated this revision to Diff 44151.Jan 6 2016, 1:16 PM

Removed XFAIL and CHECK-NOT for interleave related value names - if XFAIL doesn't seem to be used in such a scenario.

echristo edited edge metadata.Jan 7 2016, 4:01 PM

I don't think that you can CHECK-NOT two things in a row like this. You'll need a check between. In general for this you should be able to just check that there aren't any vector types in the program? Or you could add something in between as well.

-eric

timshen updated this revision to Diff 44292.Jan 7 2016, 4:51 PM
timshen edited edge metadata.

I think CHECK-NOT for "wide.vec" is good enough.

echristo accepted this revision.Jan 7 2016, 4:52 PM
echristo edited edge metadata.

Seems reasonable to me as well.

-eric

This revision is now accepted and ready to land.Jan 7 2016, 4:52 PM
timshen closed this revision.Jan 8 2016, 11:33 AM

Thanks iteratee for committing as r257134 !