This is an archive of the discontinued LLVM Phabricator instance.

[LoopVectorize] Induction variables: support arbitrary constant step
ClosedPublic

Authored by HaoLiu on Jan 26 2015, 10:36 PM.

Details

Summary

Hi,

Now for induction variables it's possible to have only -1 and +1 step values. But for induction variables with other steps, LV will do nothing.
Alexey wrote a path in D6051 to support arbitrary induction variable steps. I think his patch is very useful so I toke it over and fixed some bugs (Minor bugs about induction calculation which caused some runtime failures in SPEC2000 and LNT). Now this patch can pass our internal tests.

There are two kinds of induction variables:

  1. integer induction: for (int i = 0; i < 1024; i+=2) { int tmp = *A++; sum += i * tmp; }

"i" is an integer induction variable of step 2. Actually such case can be well vectorized if we support arbitrary induction variable steps.

  1. pointer induction: for (int i = 0; i < 1024; i++) { int tmp0 = *A++; int tmp1 = *A++; sum += tmp0 * tmp1; }

pointer "A" is an pointer induction variable of step 2. Even we support arbitrary stepsCurrently, we still can not vectorize such case well. LoopVectorizer will say "vectorization is possible but not benefical". But we still can force the LoopVectorizer to do vectorization.

Actually if the targets support masked/interleaved load/store, we can vectorize the second case very well. For example, AArch64 backend supports interleaved load/store. To vectorize "tmp0" and "tmp1", we only need one interleaved load such as "LD2 {V0, V1}, [X0]". Vector register V0 and V1 will contain interleaved data. V0 contains "A[0], A[2], A[4], A[6]", and V1 contains "A[1], A[3], A[5], A[7]".

This patch has no big impact on performance according to the tests on AArch64 targets. There are few cases like the first case. There are many cases like the second case, the loop vectorizer thinks it is not beneficial to do vectorization, and most likely it will just do interleave. But I think if we support masked/interleaved load/store in the future, we can get many performance improvements. But I think the first step is that the LoopVectorizer should support arbitrary induction variable steps.

Review please.

Thanks,
-Hao

Diff Detail

Event Timeline

HaoLiu retitled this revision from to [LoopVectorize] Induction variables: support arbitrary constant step.
HaoLiu updated this object.
HaoLiu edited the test plan for this revision. (Show Details)
HaoLiu added reviewers: nadav, hfinkel.
HaoLiu added subscribers: volkalexey, Unknown Object (MLST).
hfinkel edited edge metadata.Jan 27 2015, 6:38 AM

Thanks for continuing to work on this. I think this is a nice over-all code simplification (that also gives us some forward-looking added capabilities).

lib/Transforms/Vectorize/LoopVectorize.cpp
712

I prefer that is* function return a boolean, but this returns a (-1, 0, 1) value. Maybe getConsecutiveDirection() would be better?

722

Given that we don't vectorize loops with wrap-around index spaces, as far as I know, we should be able to add nsw (or nuw or both) flags on these operations. Please do so when possible.

1670

I know it was like this before, but nsw/nuw?

3205

nsw/nuw? (same with the others in the code below)

test/Transforms/LoopVectorize/arbitrary-induction-step.ll
8

If you're checking debug output, you'll need to add:

; REQUIRES: asserts

to the entire test. Please break this out into a separate test file, if you'd like to do this, to limit the reduction in test coverage for non-asserts builds.

aschwaighofer added inline comments.Jan 27 2015, 8:03 AM
lib/Transforms/Vectorize/LoopVectorize.cpp
722

Hal, I don't think this is safe. Consider the following example. The canonical induction variable does not wrap but the derived one does.

int test(int *a) {

int r = 0;
uint64_t idx2=0;
for (uint64_t i = 0 ; i < UINT64_MAX; ++i) {
  r+=a[idx2];
  idx2 += 2;
}
return r;

}

hfinkel added inline comments.Jan 27 2015, 8:10 AM
lib/Transforms/Vectorize/LoopVectorize.cpp
722

Agreed. But shouldn't we be able to get the nsw/nuw flags from the original scalar operation?

(If so, but this departs too far from the scope of this patch, I'm fine with adding a FIXME and leaving this for follow-up)

HaoLiu edited edge metadata.

Hi,

According to the comments, a new patch has been attached.
Review, please.

Thanks,
-Hao

lib/Transforms/Vectorize/LoopVectorize.cpp
722

Agree with you two. Several FIXMEs have been added.

test/Transforms/LoopVectorize/arbitrary-induction-step.ll
8

Hal, thanks for let me know about this.
As I think it's not worth to add another file just for testing the debug info, I remove such 'CHECK' and '-debug-only'.

hfinkel accepted this revision.Jan 28 2015, 7:22 AM
hfinkel edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Jan 28 2015, 7:22 AM
nadav edited edge metadata.Jan 28 2015, 1:59 PM

Hao, this patch LGTM. This is a major change to the vectorizer so please run the llvm test suite and try to identify miscompiles or regressions.

HaoLiu closed this revision.Mar 15 2015, 7:19 PM