This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Emit parallel_loop_access for each loop in the loop stack.
ClosedPublic

Authored by Meinersbur on Jun 30 2018, 10:03 PM.

Diff Detail

Repository
rC Clang

Event Timeline

Meinersbur created this revision.Jun 30 2018, 10:03 PM

I don't think that this is the intended behavior of the #pragma clang loop. it is better to ask the author of this pragma is this correct or not.

I don't think that this is the intended behavior of the #pragma clang loop. it is better to ask the author of this pragma is this correct or not.

It is the intended behavior that the memory accesses are independent with respect to the outer (annotated) loop (even if those accesses are within an inner loop). They're not independent with respect to the inner loop unless that loop is also annotated. Thus, this looks correct.

Michael, can you please add a test with two inner loops, one where more than one is annotated, and one where only the outer loop is annotated? It's not clear to me that I->setMetadata will do the right thing here in the former case.

I don't think that this is the intended behavior of the #pragma clang loop. it is better to ask the author of this pragma is this correct or not.

I understand it as the intended behavior of the assume_safety option (also used for #pragma omp simd).

@tyler.nowicki What is the intended behaviour?

Hi Michael, Hal,

Sorry it has been a while since I looked at this. My memory is a little
fuzzy. The intent of 'assume_safety' is to tell LAA to skip dependency
checking on loads and stores so the vectorizer doesn't stop as soon as it
sees both in a loop. At the time 'assume_safety' was implemented the
vectorizer was limited to inner-loops. I am not up-to-date but it seems to
have the ability to perform some vectorization of non-inner loop
instructions?

If we can vectorize non-inner loop instructions then what behavior would
make the most sense: 'assume_safety' applies to the same loop scope(s) as
the other loop pragmas, or it applies to all nested loops?

My opinion is that for consistency 'assume_safety' and similar options
apply to the same scope(s) as 'vectorize(enable)'. But I am open to
alternatives if others see it differently.

Tyler Nowicki

Michael, can you please add a test with two inner loops, one where more than one is annotated, and one where only the outer loop is annotated? It's not clear to me that I->setMetadata will do the right thing here in the former case.

The test case pragma-loop-safety-nested.cpp check the case where the outer loop is annotated.

Iterating over Active will iterate from outer to inner loops, meaning I->setMetadata will overwrite the annotation of the outermost loop (which IMHO is the most useful behaviour). Since it not possible to add multiple !llvm.mem.parallel_loop_access annotation, we cannot annotate multiple loops.

Michael, can you please add a test with two inner loops, one where more than one is annotated, and one where only the outer loop is annotated? It's not clear to me that I->setMetadata will do the right thing here in the former case.

The test case pragma-loop-safety-nested.cpp check the case where the outer loop is annotated.

Iterating over Active will iterate from outer to inner loops, meaning I->setMetadata will overwrite the annotation of the outermost loop (which IMHO is the most useful behaviour). Since it not possible to add multiple !llvm.mem.parallel_loop_access annotation, we cannot annotate multiple loops.

We specifically defined the metadata to support nested loops. The LangRef says, "The llvm.mem.parallel_loop_access metadata refers to a loop identifier, or metadata containing a list of loop identifiers for nested loops." To handle nested loops, we need to make the instruction metadata point to a list of loop IDs.

We specifically defined the metadata to support nested loops. The LangRef says, "The llvm.mem.parallel_loop_access metadata refers to a loop identifier, or metadata containing a list of loop identifiers for nested loops." To handle nested loops, we need to make the instruction metadata point to a list of loop IDs.

Thanks for pointing me to it. I've not seen parallel_loop_access pointing to a list, so I didn't know it was possible.

We specifically defined the metadata to support nested loops. The LangRef says, "The llvm.mem.parallel_loop_access metadata refers to a loop identifier, or metadata containing a list of loop identifiers for nested loops." To handle nested loops, we need to make the instruction metadata point to a list of loop IDs.

Thanks for pointing me to it. I've not seen parallel_loop_access pointing to a list, so I didn't know it was possible.

Yea, well, we defined it to be possible. As to whether the backend will actually correctly read the metadata in that format, I don't know.

  • Allow multiple parallel annotations
Meinersbur updated this revision to Diff 153803.Jul 2 2018, 2:46 PM
  • Adapt test file message.
  • Hoist I->mayReadOrWriteMemory() condition.
hfinkel accepted this revision.Aug 1 2018, 5:32 PM

LGTM

This revision is now accepted and ready to land.Aug 1 2018, 5:32 PM
This revision was automatically updated to reflect the committed changes.