This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] With nested parallelism, threadprivate variables become shared on outer parallel when appearing in inner parallel copyin clause
Needs ReviewPublic

Authored by Prince781 on Jul 11 2019, 1:18 PM.

Details

Summary

There is a bug since at least clang 8.0.0 wherein a static threadprivate variable appearing in a copyin() clause on a parallel construct (that is nested within another parallel construct) becomes shared on the outer parallel. This happens only when the threadprivate variable is backed by TLS and does not appear in global scope. Here is an example that compiles incorrectly:

#include <omp.h>
#include <stdbool.h>
#include <stdio.h>
#include <assert.h>
#define NT 4
 
int main(void) {
    static int threadprivate_var = 1;
    #pragma omp threadprivate(threadprivate_var)
 
    omp_set_dynamic(false);
    omp_set_num_threads(NT);
    omp_set_nested(true);
 
    #pragma omp parallel
    {
        threadprivate_var = 1;
        printf("[B] thread %d: val %d: threadprivate @ %p\n", omp_get_thread_num(), threadprivate_var, &threadprivate_var);
 
        #pragma omp master
        {
            threadprivate_var = 2;
            #pragma omp parallel copyin(threadprivate_var)
            {
                printf("[B] thread %d, %d: val %d: threadprivate @ %p\n", omp_get_ancestor_thread_num(1), omp_get_thread_num(), threadprivate_var, &threadprivate_var);
                // check that copyin succeeded
                assert(threadprivate_var == 2);
            }
        }
        #pragma omp barrier
        printf("[A] thread %d: val %d: threadprivate @ %p\n", omp_get_thread_num(), threadprivate_var, &threadprivate_var);
        if (omp_get_thread_num() != 0)  // 0 is the master thread
            // non-master threads should not have seen changes
            assert(threadprivate_var == 1);
    }
}

The resulting IR looks something like this:

@main.threadprivate_var = internal thread_local global i32 1, align 4
…
main() {
   call void __kmpc_fork_call(omp_outlined_outer_parallel_region, &main.threadprivate_var)
}
…
omp_outlined_outer_parallel_region(…, i32* %threadprivate_var) {
    if (I am the master thread)
        call void __kmpc_fork_call(omp_outlined_inner_parallel_region, %threadprivate_var)
}
…
omp_outlined_inner_parallel_region(…, i32* %threadprivate_var) {
}

When it should look something like this:

@main.threadprivate_var = internal thread_local global i32 1, align 4
…
main() {
   call void __kmpc_fork_call(omp_outlined_outer_parallel_region)
}
…
omp_outlined_outer_parallel_region(…) {
    if (I am the master thread)
        call void __kmpc_fork_call(omp_outlined_inner_parallel_region, &main.threadprivate_var)
}
…
omp_outlined_inner_parallel_region(…, i32* %threadprivate_var) {
}

Without the copyin, the function for the outer parallel region does not have the extra parameter. For the copyin clause above to work, the inner parallel needs a reference to the thread-local variable of the encountering thread (in this case, the master thread) in an extra parameter. It does not make sense for the outer parallel function(s) to capture the thread-local variable.

I’ve made a patch that prevents TLS-backed threadprivate variables from being captured in outer scopes. I don’t know if this is the best way to go about it, so I welcome feedback from someone with much more knowledge on clang’s OpenMP backend.

Diff Detail

Event Timeline

Prince781 created this revision.Jul 11 2019, 1:18 PM
ABataev added inline comments.Jul 11 2019, 1:22 PM
clang/lib/Sema/SemaExpr.cpp
15329

this is not the right place to fix this bug, it must be fixed in SemaOpenMP.cpp. If you want, I can try to fix it.

Prince781 added inline comments.Jul 11 2019, 1:33 PM
clang/lib/Sema/SemaExpr.cpp
15329

I see. Thanks for your comment. I noticed elsewhere in this function tryCaptureVariable that there is a if (getLangOpts().OpenMP) ..., so I thought it would be okay.

If you don't mind, I wish to take another look at this and get back to you later.

ABataev added inline comments.Jul 11 2019, 1:33 PM
clang/lib/Sema/SemaExpr.cpp
15329

Ok, no problems

Fixed this bug myself to be sure it will be merged with 9.0 release, sorry.

Fixed this bug myself to be sure it will be merged with 9.0 release, sorry.

That's cool. Thanks for the fix.

malcolm.parsons resigned from this revision.Jan 13 2020, 9:14 AM