This is an archive of the discontinued LLVM Phabricator instance.

LoopVectorize support for simd functions
Needs ReviewPublic

Authored by mmasten on Nov 28 2017, 1:11 PM.

Details

Summary

This patch includes the LoopVectorize changes necessary to call simd functions and is associated with revision D22792.

Diff Detail

Event Timeline

mmasten created this revision.Nov 28 2017, 1:11 PM
fhahn added a subscriber: fhahn.Nov 28 2017, 2:20 PM
hfinkel added inline comments.Dec 14 2017, 6:30 PM
lib/Analysis/LoopAccessAnalysis.cpp
1735

I don't think that this is obvious. If IsAnnotatedParallel is true, then this is fine. If not, I don't think that vector-variants implies anything about the memory-referencing behavior of the variant (and it might be some external thing we can't analyze). I imagine that, at least most of the time, when locally defined, these will be argmemonly functions (but I don't think that even that is guaranteed, unfortunately).

Also, if we've been told to vectorize, and assured that it is safe, we should do this even if there are calls without vector variants (we'll just scalarize in that case). I see no reason that a function with inapplicable vector variants should be special in this sense. You might want to do that and then rebase on top of that change.

lib/Analysis/VectorUtils.cpp
601

Please make this comment more formal. In part, you should explain that the ABI for _Complex is platform dependent (including at the IR level).

lib/Transforms/Vectorize/LoopVectorize.cpp
4731

the only -> The only

4736

the only -> The only

4965

SimdVariant might be nullptr; don't segfault in the DEBUG statement in that case.

5082

Don't do this. VecClone needs to be able to detect when it doesn't need to generate a new function.

Hi Matt, this is very nice. I have a couple of comments to add to Hal's one.

  1. I would like to see some description of which bits of the patch the tests are verifying. For example, I can see that you have two similar tests with masked and unmasked functions, but it is not clear from the IR I see and from the opt invocation why one tests generates a masked call and the other one an unmasked one. It would be great if you could also add the original C code you intended to vectorize with this patch as a comment.
  2. Would it be possible to reduce the tests to minimal size? For example, you have generated "vector-variants"="_ZGVbN8vlu_dowork,_ZGVcN8vlu_dowork,_ZGVdN8vlu_dowork,_ZGVeN8vlu_dowork,_ZGVbM8vlu_dowork,_ZGVcM8vlu_dowork,_ZGVdM8vlu_dowork,_ZGVeM8vlu_dowork", but it should also work with vector-variants consisting only of the single variant you want to use in the vecttorizer.
  3. You are testing linear and uniform. Could you also add tests for simple cases like the following?
#pragma omp declare simd linear(y)
double f(double x) {
//...
}

void loop(double *x, double *y, int N) {
  for (int i = 0; i < N; ++i) {
    y[i] = f(x[i]);
  }
}
include/llvm/Analysis/VectorUtils.h
182

Given that this is likely to change for different architectures, I wonder whether it is worth to redirect it to an overload method of TargetTransformInfo.

test/Transforms/LoopVectorize/masked_simd_func.ll
91

Could you unit test this only adding the _ZGVdM8vlu_dowork variant to the attribute?