Download Raw Diff

Details

Reviewers

ABataev
hfinkel
amusman
tyler.nowicki

Commits

rGcba47b49785f: [CodeGen] Emit parallel_loop_access for each loop in the loop stack.
rL338810: [CodeGen] Emit parallel_loop_access for each loop in the loop stack.
rC338810: [CodeGen] Emit parallel_loop_access for each loop in the loop stack.

Summary

Emit !llvm.mem.parallel_loop_access metadata for memory accesses even if the parallel loop is not the top on the loop stack.

Fixes llvm.org/PR37558.

Diff Detail

Repository: rC Clang

Event Timeline

Meinersbur created this revision.Jun 30 2018, 10:03 PM

I don't think that this is the intended behavior of the #pragma clang loop. it is better to ask the author of this pragma is this correct or not.

Meinersbur added reviewers: amusman, tyler.nowicki.Jul 2 2018, 8:36 AM

In D48808#1149534, @ABataev wrote:

I don't think that this is the intended behavior of the #pragma clang loop. it is better to ask the author of this pragma is this correct or not.

It is the intended behavior that the memory accesses are independent with respect to the outer (annotated) loop (even if those accesses are within an inner loop). They're not independent with respect to the inner loop unless that loop is also annotated. Thus, this looks correct.

Michael, can you please add a test with two inner loops, one where more than one is annotated, and one where only the outer loop is annotated? It's not clear to me that I->setMetadata will do the right thing here in the former case.

In D48808#1149534, @ABataev wrote:

I don't think that this is the intended behavior of the #pragma clang loop. it is better to ask the author of this pragma is this correct or not.

I understand it as the intended behavior of the assume_safety option (also used for #pragma omp simd).

@tyler.nowicki What is the intended behaviour?

Hi Michael, Hal,

Sorry it has been a while since I looked at this. My memory is a little
fuzzy. The intent of 'assume_safety' is to tell LAA to skip dependency
checking on loads and stores so the vectorizer doesn't stop as soon as it
sees both in a loop. At the time 'assume_safety' was implemented the
vectorizer was limited to inner-loops. I am not up-to-date but it seems to
have the ability to perform some vectorization of non-inner loop
instructions?

If we can vectorize non-inner loop instructions then what behavior would
make the most sense: 'assume_safety' applies to the same loop scope(s) as
the other loop pragmas, or it applies to all nested loops?

My opinion is that for consistency 'assume_safety' and similar options
apply to the same scope(s) as 'vectorize(enable)'. But I am open to
alternatives if others see it differently.

Tyler Nowicki

In D48808#1149549, @hfinkel wrote:

In D48808#1149534, @ABataev wrote:

Michael, can you please add a test with two inner loops, one where more than one is annotated, and one where only the outer loop is annotated? It's not clear to me that I->setMetadata will do the right thing here in the former case.

The test case pragma-loop-safety-nested.cpp check the case where the outer loop is annotated.

Iterating over Active will iterate from outer to inner loops, meaning I->setMetadata will overwrite the annotation of the outermost loop (which IMHO is the most useful behaviour). Since it not possible to add multiple !llvm.mem.parallel_loop_access annotation, we cannot annotate multiple loops.

In D48808#1149828, @Meinersbur wrote:

In D48808#1149549, @hfinkel wrote:

In D48808#1149534, @ABataev wrote:

Michael, can you please add a test with two inner loops, one where more than one is annotated, and one where only the outer loop is annotated? It's not clear to me that I->setMetadata will do the right thing here in the former case.

The test case pragma-loop-safety-nested.cpp check the case where the outer loop is annotated.

Iterating over Active will iterate from outer to inner loops, meaning I->setMetadata will overwrite the annotation of the outermost loop (which IMHO is the most useful behaviour). Since it not possible to add multiple !llvm.mem.parallel_loop_access annotation, we cannot annotate multiple loops.

We specifically defined the metadata to support nested loops. The LangRef says, "The llvm.mem.parallel_loop_access metadata refers to a loop identifier, or metadata containing a list of loop identifiers for nested loops." To handle nested loops, we need to make the instruction metadata point to a list of loop IDs.

In D48808#1149845, @hfinkel wrote:

We specifically defined the metadata to support nested loops. The LangRef says, "The llvm.mem.parallel_loop_access metadata refers to a loop identifier, or metadata containing a list of loop identifiers for nested loops." To handle nested loops, we need to make the instruction metadata point to a list of loop IDs.

Thanks for pointing me to it. I've not seen parallel_loop_access pointing to a list, so I didn't know it was possible.

In D48808#1149862, @Meinersbur wrote:

In D48808#1149845, @hfinkel wrote:

We specifically defined the metadata to support nested loops. The LangRef says, "The llvm.mem.parallel_loop_access metadata refers to a loop identifier, or metadata containing a list of loop identifiers for nested loops." To handle nested loops, we need to make the instruction metadata point to a list of loop IDs.

Thanks for pointing me to it. I've not seen parallel_loop_access pointing to a list, so I didn't know it was possible.

Yea, well, we defined it to be possible. As to whether the backend will actually correctly read the metadata in that format, I don't know.

Allow multiple parallel annotations

Harbormaster completed remote builds in B19955: Diff 153769.Jul 2 2018, 12:41 PM

Adapt test file message.

Harbormaster completed remote builds in B19963: Diff 153803.Jul 2 2018, 2:46 PM

Hoist I->mayReadOrWriteMemory() condition.

Harbormaster completed remote builds in B20332: Diff 155288.Jul 12 2018, 3:37 PM

ping

LGTM

This revision is now accepted and ready to land.Aug 1 2018, 5:32 PM

Closed by commit rC338810: [CodeGen] Emit parallel_loop_access for each loop in the loop stack. (authored by Meinersbur). · Explain WhyAug 2 2018, 9:43 PM

This revision was automatically updated to reflect the committed changes.

This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Emit parallel_loop_access for each loop in the loop stack.
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 158912

lib/CodeGen/CGLoopInfo.cpp

test/CodeGenCXX/pragma-loop-safety-nested.cpp

test/CodeGenCXX/pragma-loop-safety-outer.cpp

This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Emit parallel_loop_access for each loop in the loop stack.ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 158912

lib/CodeGen/CGLoopInfo.cpp

test/CodeGenCXX/pragma-loop-safety-nested.cpp

test/CodeGenCXX/pragma-loop-safety-outer.cpp

[CodeGen] Emit parallel_loop_access for each loop in the loop stack.
ClosedPublic