This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Stride in distribute parallel for loops with no chunk size
ClosedPublic

Authored by grokos on Sep 12 2016, 5:42 PM.

Details

Summary

For a distribute parallel for construct, the compiler calls __kmpc_for_static_init_*, which in turn calls __kmp_for_static_init. When no chunk size is specified in the dist_schedule clause, the latter function does not set the stride correctly: *pstride is left equal to 1. This is harmless for plain parallel for-loops with no chunk size, but it is wrong for distribute parallel for-loops, where it causes chunks of the distributed loop to be handed to multiple teams and executed more than once. The fix is to set *pstride to a value at least as large as the loop trip count.
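The effect of the stride can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the runtime's code: run_team, max_hits, TRIP_COUNT and NUM_TEAMS are hypothetical names, and the model assumes that each team receives one contiguous block [lb, ub] and then advances both bounds by the stride until lb passes the global upper bound.

```c
#include <assert.h>

/* Hypothetical sizes chosen for illustration only. */
enum { TRIP_COUNT = 8, NUM_TEAMS = 2 };

/* Model of one team's strided chunk loop (an assumption, not runtime code):
 * execute [lb, ub], then shift both bounds by `stride` and repeat while the
 * chunk still starts inside the iteration space. hits[i] counts executions. */
static void run_team(int lb, int ub, int stride, int global_ub, int hits[]) {
    assert(stride > 0); /* a non-positive stride would never terminate here */
    for (; lb <= global_ub; lb += stride, ub += stride) {
        int last = ub < global_ub ? ub : global_ub; /* clamp the final chunk */
        for (int i = lb; i <= last; ++i)
            hits[i]++;
    }
}

/* Split the iteration space evenly across teams, run every team with the
 * given stride, and return the largest execution count of any iteration. */
static int max_hits(int stride) {
    int hits[TRIP_COUNT] = {0};
    int block = TRIP_COUNT / NUM_TEAMS;
    int max = 0;

    assert(TRIP_COUNT % NUM_TEAMS == 0); /* keep the model's blocks equal */
    for (int t = 0; t < NUM_TEAMS; ++t)
        run_team(t * block, t * block + block - 1, stride, TRIP_COUNT - 1, hits);
    for (int i = 0; i < TRIP_COUNT; ++i)
        if (hits[i] > max)
            max = hits[i];
    return max;
}
```

In this model, max_hits(TRIP_COUNT) reports that every iteration runs exactly once, because the first stride step already moves each team's chunk past the end of the loop. max_hits(1) reports a count greater than one, which corresponds to the duplicated work described in the summary.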

Diff Detail

Repository
rL LLVM

Event Timeline

grokos updated this revision to Diff 71072. Sep 12 2016, 5:42 PM
grokos retitled this revision from to [OpenMP] Stride in distribute parallel for loops with no chunk size.
grokos updated this object.
grokos set the repository for this revision to rL LLVM.
grokos added a subscriber: openmp-commits.

Code looks fine to me, but it'd be really useful also to add a regression test...

AndreyChurbanov accepted this revision. Sep 27 2016, 8:34 AM
AndreyChurbanov edited edge metadata.

LGTM

This revision is now accepted and ready to land. Sep 27 2016, 8:34 AM
This revision was automatically updated to reflect the committed changes.