This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][NVPTX] Use single loops when generating code for distribute parallel for
ClosedPublic

Authored by gtbercea on Oct 19 2018, 12:33 PM.

Details

Summary

This patch adds a new code generation path for bound sharing directives containing distribute parallel for. The new code generation scheme applies to chunked schedules on distribute and parallel for directives. The scheme simplifies the code that is being generated by eliminating the need for an outer for loop over chunks for both distribute and parallel for directives. In the case of distribute it applies to any sized chunk while in the parallel for case it only applies when chunk size is 1.

Diff Detail

Repository
rL LLVM

Event Timeline

gtbercea created this revision.Oct 19 2018, 12:33 PM
ABataev added inline comments.Oct 24 2018, 7:29 AM
lib/CodeGen/CGOpenMPRuntime.h
904 ↗(On Diff #170241)

I'd rename this into isDistStaticChunked

lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
4249 ↗(On Diff #170241)

If the ChunkOne field is not required, you need to restore original code here

lib/CodeGen/CGStmtOpenMP.cpp
2360 ↗(On Diff #170241)

This whole code is very similar to the unchunked case. Could you merge it?

2362 ↗(On Diff #170241)

It allows you to check only the implicit case, what about if the user explicitly specifies that chunk is 1?

3421 ↗(On Diff #170241)

Again, very similar to the unchunked code. Merge it.

lib/Sema/SemaOpenMP.cpp
5207 ↗(On Diff #170241)

Seems to me, you need to use NumIterations instead of LastIteration

gtbercea updated this revision to Diff 171081.Oct 25 2018, 6:41 AM
Refactor chunk one checking.
gtbercea updated this revision to Diff 171084.Oct 25 2018, 6:53 AM
gtbercea marked an inline comment as done.

Use NumIterations.

gtbercea marked 3 inline comments as done.Oct 25 2018, 6:55 AM
gtbercea added inline comments.
lib/CodeGen/CGOpenMPRuntime.h
904 ↗(On Diff #170241)

I've used the same naming convention as the isStaticNonchunked function for consistency.

ABataev added inline comments.Oct 25 2018, 6:56 AM
lib/CodeGen/CGOpenMPRuntime.h
904 ↗(On Diff #170241)

What about this?

lib/CodeGen/CGStmtOpenMP.cpp
2360 ↗(On Diff #170241)

What about this?

3421 ↗(On Diff #170241)

?

lib/Sema/SemaOpenMP.cpp
5207 ↗(On Diff #170241)

Add the tests for the collapsed loops.

gtbercea updated this revision to Diff 171116.Oct 25 2018, 8:58 AM
gtbercea marked an inline comment as done.
Refactor static chunk schedules. Fix tests.
gtbercea marked 2 inline comments as done.Oct 25 2018, 9:00 AM

What about collapsed loops?

lib/CodeGen/CGStmtOpenMP.cpp
3390 ↗(On Diff #171116)

Restore the original code here, the logic can be simplified

3396 ↗(On Diff #171116)

The same

3434–3454 ↗(On Diff #171116)

Please, simplify this

gtbercea updated this revision to Diff 171127.Oct 25 2018, 10:02 AM
Simplify code.
gtbercea marked 3 inline comments as done.Oct 25 2018, 10:04 AM
gtbercea updated this revision to Diff 171170.Oct 25 2018, 12:34 PM

Add test for collapse.

ABataev added inline comments.Oct 25 2018, 12:55 PM
test/OpenMP/distribute_parallel_for_codegen.cpp
410 ↗(On Diff #171170)

Bad check for names, you should not rely on them.

test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
117 ↗(On Diff #171170)

Also, bad checks.

gtbercea updated this revision to Diff 171180.Oct 25 2018, 1:39 PM

Change tests.

gtbercea marked 2 inline comments as done.Oct 25 2018, 1:40 PM
ABataev accepted this revision.Oct 26 2018, 8:17 AM

LG, with a nit

lib/Sema/SemaOpenMP.cpp
5308 ↗(On Diff #171180)

Fix the comment here

This revision is now accepted and ready to land.Oct 26 2018, 8:17 AM
gtbercea updated this revision to Diff 171499.Oct 29 2018, 7:37 AM

Fix comment.

gtbercea marked an inline comment as done.Oct 29 2018, 7:37 AM
This revision was automatically updated to reflect the committed changes.