This is an archive of the discontinued LLVM Phabricator instance.

[LV] Parallel annotated loop does not imply all loads can be hoisted.
ClosedPublic

Authored by fodinabor on Jun 8 2021, 9:19 AM.

Details

Summary

As noted in https://bugs.llvm.org/show_bug.cgi?id=46666, the current behavior of assuming if-conversion safety if a loop is annotated parallel (!llvm.loop.parallel_accesses), is not expectable, the documentation for this behavior was since removed from the LangRef again, and can lead to invalid reads.
This was observed in POCL (https://github.com/pocl/pocl/issues/757) and would require similar workarounds in current work at hipSYCL.

The question remains why this was initially added and what the implications of removing this optimization would be.
Do we need an alternative mechanism to propagate the information about legality of if-conversion?
Or is the idea that conditional loads in #pragma clang loop vectorize(assume_safety) can be executed unmasked without additional checks flawed in general?
I think this implication is not part of what a user of that pragma (and corresponding metadata) would expect and thus dangerous.

Only two additional tests failed, which are adapted in this patch. Depending on the further direction force-ifcvt.ll should be removed or further adapted.

Diff Detail

Event Timeline

fodinabor created this revision.Jun 8 2021, 9:19 AM
fodinabor requested review of this revision.Jun 8 2021, 9:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 8 2021, 9:19 AM

Could you also update the language reference? It currently reads:

The metadata on memory reads also implies that if conversion (i.e. speculative execution within a loop iteration) is safe.

OK, disregard that, I was looking at an old language reference (the first found by Google).

The part was added in by @hfinkel in rG411d31ad72456ba88c0b0bee0faba2b774add65f and removed by me in rG978ba61536c2cdafa8454b7330c5d8e58d0d5048 (might have been unintentional, at least don't remember removing intentionally)

As @pjaaskel noted in the bug report, the speculative semantics violate semantics such as by OpenMP. IMHO if really needed, speculation-safety should have its own metadata.

inducer added a subscriber: inducer.Jun 8 2021, 2:59 PM

Thanks for tackling this! As a minor point, the correct link to the original pocl bug report is https://github.com/pocl/pocl/issues/757.

Meinersbur added inline comments.Jun 8 2021, 5:29 PM
llvm/test/Transforms/LoopVectorize/X86/force-ifcvt.ll
-41–-36

I suggest to remove broken-by-design tests entirely.

Kazhuu added a subscriber: Kazhuu.Jun 8 2021, 10:25 PM
fodinabor updated this revision to Diff 350826.Jun 9 2021, 2:10 AM

Remove broken-by-design force-ifcvt.ll test.

fodinabor edited the summary of this revision. (Show Details)Jun 9 2021, 2:11 AM

I am in favor of this. Would be great if we had another person had a look. @pjaaskel? @fhahn?

To make another case: I think this kind speculation causes undefined behaviour/implementation-dependent semantics. For example:

#pragma omp simd // adds llvm.loop.parallel_accesses
for (int i = 0; i < N; i+=1) {
  double *C = i ? A : B;
  V = C[i].;
}

which an optimization might transform to

#pragma omp simd
for (int i = 0; i < N; i+=1) {
  if (i)
    V = A[i];
  if (!i)
    V = *B;
}

This moved from an unconditional memory access (no if-conversion) to two speculable accesses. In cases such as N == 1, A was never accessed (and might not have been allocated), but it would after an if-conversion. IMHO it is not predictable what memory is going to accessed.

jdoerfert accepted this revision.Jun 10 2021, 1:45 PM
jdoerfert added a subscriber: jdoerfert.

This fixes a bug and people are in favor. If we need to add a new mechanism for this we can afterwards still. LGTM

This revision is now accepted and ready to land.Jun 10 2021, 1:45 PM

FWIW, if someone adds documentation what #pragma clang loop vectorize(assume_safety) actually means (somewhere), we can go back and make it imply if-conversion. However, we should not conflate it with the access groups/parallel metadata. Instead, annotate loads as "speculatable" (or sth. similar) and we go with that.