[OPENMP][NVPTX]Fix behavior of omp_in_parallel() function.
Abandoned · Public

Authored by ABataev on May 1 2019, 7:34 AM.

Details

Summary

According to the OpenMP standard, the effect of the omp_in_parallel routine is to return true if the current task is enclosed by an active parallel region, and the parallel region is enclosed by the outermost initial task region on the device; otherwise it returns false.
Active parallel region - A parallel region that is executed by a team consisting of more than one thread.

Event Timeline

ABataev created this revision. May 1 2019, 7:34 AM
Herald added a project: Restricted Project. May 1 2019, 7:34 AM

My understanding of enclosed is that its meaning is recursive. So omp_in_parallel is to return true iff (at least) one of the surrounding parallel regions is active. That also matches the summary in OpenMP 5.0 which says

The omp_in_parallel routine returns true if the active-levels-var ICV is greater than zero; otherwise, it returns false.

and

active-levels-var - the number of nested active parallel regions that enclose the current task such that all of the parallel regions are enclosed by the outermost initial task region on the current device. There is one copy of this ICV per data environment.

In that sense I think the test is currently right and the change is not compliant with the standard.
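
To make the recursive reading concrete, here is a minimal sketch in plain C that models the active-levels-var ICV outside of any OpenMP runtime. The function names and the array-of-team-sizes representation are assumptions made for illustration; they are not part of the OpenMP API or of the NVPTX device runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: team_sizes[i] is the number of threads in the i-th
   enclosing parallel region, outermost first. A region counts as active
   only if its team has more than one thread. */
int model_active_levels(const int *team_sizes, size_t n) {
    assert(team_sizes != NULL || n == 0);
    int active = 0;
    for (size_t i = 0; i < n; ++i) {
        if (team_sizes[i] > 1) {
            ++active;
        }
    }
    return active;
}

/* Under the quoted wording, omp_in_parallel is true iff the modelled
   active-levels-var is greater than zero. */
bool model_in_parallel(const int *team_sizes, size_t n) {
    return model_active_levels(team_sizes, n) > 0;
}
```

Under this model, an outer region with four threads enclosing an inner serialized region (team sizes 4, 1) still reports true, while two nested single-thread regions (1, 1) report false.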

Hmm, the current implementation also does not look correct in this case. It does not take into account whether the enclosing parallel region is active. If the number of threads is 1, the region is not active. Currently, it only checks whether the innermost region is a parallel region, regardless of the number of threads in it (plus, it does not work correctly if we have an outer parallel region with an inner task region).
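
The two failure cases mentioned here can be illustrated with a small standalone C model. This is only a sketch of the behavior as described in the comment, not the actual runtime code; the `region` struct, the enum, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum region_kind { REGION_PARALLEL, REGION_TASK };

/* Hypothetical record for one enclosing region, outermost first. */
struct region {
    enum region_kind kind;
    int num_threads; /* team size; only meaningful for parallel regions */
};

/* Models the behavior described above: only the innermost region is
   inspected, and its team size is ignored. */
bool innermost_only_check(const struct region *stack, size_t depth) {
    assert(stack != NULL || depth == 0);
    if (depth == 0) {
        return false;
    }
    return stack[depth - 1].kind == REGION_PARALLEL;
}

/* Models the active-levels reading: true if any enclosing parallel
   region has a team of more than one thread. */
bool active_levels_check(const struct region *stack, size_t depth) {
    assert(stack != NULL || depth == 0);
    for (size_t i = 0; i < depth; ++i) {
        if (stack[i].kind == REGION_PARALLEL && stack[i].num_threads > 1) {
            return true;
        }
    }
    return false;
}
```

In this model, a single parallel region with one thread makes the innermost-only check return true although the region is not active, and an outer four-thread parallel region with an inner task region makes it return false although an active region encloses the task.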

ABataev abandoned this revision. May 2 2019, 7:37 AM