This is an archive of the discontinued LLVM Phabricator instance.

[clang] Update an optimization remark test for change D18777
Abandoned
Public

Authored by lihuang on Jun 27 2016, 4:12 PM.

Details

Reviewers
reames
Summary

Update an optimization remark test for change D18777.

This test checks the loop-vectorization remarks when pointer checking threshold is exceeded. The change in D18777 would introduce zexts that cannot be removed so that the "loop not vectorized" reason is changed, hence breaking this test.

The offsets are changed to 1 so that indvars can finally remove the zexts (apparently thanks to some SCEV mechanisms). Since the purpose of this test is to check the vectorization options, the offset values don't matter.

Event Timeline

lihuang updated this revision to Diff 62024. Jun 27 2016, 4:12 PM
lihuang retitled this revision from to [clang] Update an optimization remark test for change D18777.
lihuang updated this object.
lihuang added reviewers: sanjoy, reames.
lihuang added a subscriber: cfe-commits.
sanjoy edited edge metadata. Jun 28 2016, 5:38 PM
sanjoy added a subscriber: anemet.

Sounds plausible, but I don't know this area (optimization remarks) well enough to sign off on this. @anemet can you please take a look?

This test checks the loop-vectorization remarks when pointer checking threshold is exceeded. The change in D18777 would introduce zexts that cannot be removed so that the "loop not vectorized" reason is changed, hence breaking this test.

Can you please elaborate? The "loop not vectorized" reason is changed, to what?

Hi Adam,

The change in D18777 breaks this test because it converts some sexts to zexts, which cannot be eliminated by indvar-simplification after the IV is widened.

The IR after indvar-simplification and before loop-vectorization is like:

...
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
...
%11 = add nuw nsw i64 %indvars.iv, 2            ; i + 2
%12 = trunc i64 %11 to i32
%idxprom2047 = zext i32 %12 to i64
%arrayidx21 = getelementptr inbounds i32, i32* %C, i64 %idxprom2047
...
%14 = add nuw nsw i64 %indvars.iv, 3            ; i + 3
%15 = trunc i64 %14 to i32
%idxprom2448 = zext i32 %15 to i64
...

The IV is promoted to 64 bits, but the trunc/zext pairs cannot be eliminated (at least not with the -O1 pass pipeline). The optimization remark then becomes:

optimization-remark-options.c:17:3: remark: loop not vectorized: cannot identify array bounds
   [-Rpass-analysis=loop-vectorize]
for (int i = 0; i < N; i++) {
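
As a rough illustration (not taken from the patch or from the test), the trunc/zext pair in the IR above can be modeled in plain C. The helper names `index_via_trunc_zext` and `count_mismatches` are hypothetical. The sketch only shows the arithmetic: as long as the widened IV plus the offset fits in 32 bits, the round trip is the identity, so the pair is redundant in principle; once the sum reaches 2^32 the two values differ, which is presumably why a pass has to prove the range before dropping the pair.

```c
#include <stdint.h>

/* Hypothetical helper: models "trunc i64 -> i32" followed by
   "zext i32 -> i64" applied to a widened induction variable plus an offset. */
uint64_t index_via_trunc_zext(uint64_t iv, uint64_t offset) {
  uint32_t narrow = (uint32_t)(iv + offset); /* trunc: keep the low 32 bits */
  return (uint64_t)narrow;                   /* zext: high 32 bits become 0 */
}

/* Hypothetical helper: counts IV values in [start, start + count) for which
   the trunc/zext round trip differs from the plain 64-bit add. */
uint64_t count_mismatches(uint64_t start, uint64_t count, uint64_t offset) {
  uint64_t mismatches = 0;
  for (uint64_t k = 0; k < count; k++) {
    uint64_t iv = start + k;
    if (index_via_trunc_zext(iv, offset) != iv + offset)
      mismatches++;
  }
  return mismatches;
}
```

For small trip counts, `count_mismatches(0, 1000, 2)` finds no difference, while `index_via_trunc_zext(UINT32_MAX, 2)` wraps around to 1 instead of 2^32 + 1.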

That sounds like an optimization regression. It seems to me that you could create a testcase with fewer arrays than in the above test such that you don't exceed the max number of memchecks. This new testcase would be vectorized before D18777 but not after.

Adam

You are right. A regression test could be:

void foo2(int *dw, int *uw, int *A, int *B, int *C, int *D, int N) {
  for (int i = 0; i < N; i++) {
    dw[i] = A[i] + B[i - 1] + C[i - 2];
    uw[i] = A[i] + B[i + 2];
  }
}

We need to fix the fundamental problem.
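
For anyone who wants to exercise this loop outside the compiler test suite, here is a minimal sketch of a driver. Everything apart from `foo2` itself is an assumption made for illustration: the name `run_foo2_padded`, the sizes, and the fill values are hypothetical. Because the loop reads `B[i - 1]`, `C[i - 2]`, and `B[i + 2]`, the sketch passes pointers into the middle of padded buffers so that every access stays in bounds.

```c
/* Loop from the proposed regression test above (D is unused there too). */
void foo2(int *dw, int *uw, int *A, int *B, int *C, int *D, int N) {
  for (int i = 0; i < N; i++) {
    dw[i] = A[i] + B[i - 1] + C[i - 2];
    uw[i] = A[i] + B[i + 2];
  }
}

enum { LEN = 8, PAD = 2 }; /* hypothetical sizes, chosen for illustration */

/* Hypothetical driver: returns 1 if foo2 produced the expected values. */
int run_foo2_padded(void) {
  int dw[LEN], uw[LEN], A[LEN], D[LEN];
  int Bbuf[LEN + 2 * PAD], Cbuf[LEN + 2 * PAD];

  for (int i = 0; i < LEN; i++) {
    A[i] = i;
    D[i] = 0;
  }
  for (int i = 0; i < LEN + 2 * PAD; i++) {
    Bbuf[i] = 10 * i;
    Cbuf[i] = 100 * i;
  }

  /* Offset the pointers so that indices -2 .. LEN + 1 are valid. */
  foo2(dw, uw, A, Bbuf + PAD, Cbuf + PAD, D, LEN);

  /* Recompute each element directly from the padded buffers. */
  for (int i = 0; i < LEN; i++) {
    if (dw[i] != A[i] + Bbuf[PAD + i - 1] + Cbuf[PAD + i - 2])
      return 0;
    if (uw[i] != A[i] + Bbuf[PAD + i + 2])
      return 0;
  }
  return 1;
}
```

This only checks that the loop computes the same values regardless of how it is compiled; whether it is actually vectorized still has to be read from the `-Rpass-analysis=loop-vectorize` remarks.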

sanjoy resigned from this revision. Aug 5 2016, 6:51 PM
sanjoy removed a reviewer: sanjoy.
lihuang abandoned this revision. Aug 10 2016, 3:05 PM