This is an archive of the discontinued LLVM Phabricator instance.

Ignore Unknown dependencies using vectorize.ivdep metadata
Needs ReviewPublic

Authored by YashasAndaluri on Nov 23 2019, 9:36 AM.

Details

Summary

This patch implements the functionality of llvm.loop.vectorize.ivdep.enable metadata. The metadata indicates that dependencies of Unknown type can be assumed to be Safe during the CanVecMem check. This patch depends on D69391

Diff Detail

Event Timeline

YashasAndaluri created this revision.Nov 23 2019, 9:36 AM

[serious] We did not come to a conclusion during the discussion about the semantics ivdep being implementation-dependent.

Also consider that vectorize_enable(assume_safety) has been designed such that if previous passes move code into the loop that has not been in the source code loop, the safety semantic does not apply to the moved code. This implementation would apply ivdep on the moved code (which might be okay if we shift the responsibility to the code motion pass to remove the metadata).

llvm/docs/LangRef.rst
5492

[serious] Please also describe the relationship to llvm.mem.parallel_loop_access.

llvm/include/llvm/Analysis/LoopAccessAnalysis.h
204

[style] Start variables with a capital letter. Also to not use two different styles on the same line.

llvm/lib/Analysis/LoopAccessAnalysis.cpp
2035–2037

[serious] areDepsSafe is an expensive call which we should not do redundantly when only one of the result is actually used.

[suggestion] Use only CanVecMem and update it according to ivdep.

Passed Ivdep hint from LoopVectorizationLegality using GetLAA, to avoid computing areDepsSafe twice.

Thank you for checking this diff and for the suggestions! Sorry for the delayed update on this.

[serious] We did not come to a conclusion during the discussion about the semantics ivdep being implementation-dependent.

I was under the impression that Unknown dependencies were ambiguous to the vectorizer and so would be along the lines of the Cray semantics. Please suggest what would be a good approach for this.

Also consider that vectorize_enable(assume_safety) has been designed such that if previous passes move code into the loop that has not been in the source code loop, the safety semantic does not apply to the moved code. This implementation would apply ivdep on the moved code (which might be okay if we shift the responsibility to the code motion pass to remove the metadata).

I believe vectorize_enable(assume_safety) does this by marking the memory access instructions in the original loop (before any code is moved into the loop) with llvm.access.group metadata? Would it be recommended to do something similar for ivdep?

YashasAndaluri marked an inline comment as done.Dec 12 2019, 6:16 AM
YashasAndaluri added inline comments.
llvm/docs/LangRef.rst
5492

llvm.loop.parallel_accesses metadata indicates that no dependencies exist between instructions marked with llvm.access.group metadata and can be executed in parallel, whereas llvm.loop.vectorize.ivdep.enable indicates that Unknown dependencies are safe for vectorization.

[serious] We did not come to a conclusion during the discussion about the semantics ivdep being implementation-dependent.

I was under the impression that Unknown dependencies were ambiguous to the vectorizer and so would be along the lines of the Cray semantics. Please suggest what would be a good approach for this.

Cray's semantics are dangerous in that it may unpredictably miscompile depending on which compiler you are using. Just being slower would have been fine. The issue is that there might not be a good approach and I am not convinced that approximated compatibility to Cray is a good-enough reason to introduce unpredictable miscompilation. However, just parsing #pragma ivdep but ignoring it might be fine.

Why may it miscompile unpredictably? Consider:

#pragma ivdep
for (int i = 0; i < n; i+=1)
  sum[i/n] += A[i];

Since 0 <= i < n, i/n will always evaluate to 0. But does the compiler known? If it does, it might recognize the reduction in

double tmp = sum[0];
for (int i = 0; i < n; i+=1)
  tmp += A[i];
sum[0] = tmp;

and vectorize it correctly. However, if it does not, ivdep allows it to ignore the loop-carried dependency and write arbitrary values to sum[0] (for instance just the last value sum[0] = A[n-1]), i.e. miscompile the reduction.

We cannot assume that LLVM's capability to recognize dependencies and reductions are stricly superior over Cray's, hence miscompile programs that Cray's compiler did not. Even worse, the user cannot deduce whether their code will be compiled correctly or not.

Updated tests

Add checks for vector instructions

Do you try getting this uptreamed? It won't be accepted before we come to a conclusion for the unpredictable miscompilation.

Do you try getting this uptreamed? It won't be accepted before we come to a conclusion for the unpredictable miscompilation.

I understand, I am not trying to upstream it.

mdchen added a subscriber: mdchen.Feb 9 2021, 12:58 AM
troyj added a subscriber: troyj.Feb 9 2021, 5:30 PM
Meinersbur resigned from this revision.Feb 19 2021, 10:53 AM