This is an archive of the discontinued LLVM Phabricator instance.

[openmp] Annotate tmp variables with omp_thread_mem_alloc
ClosedPublic

Authored by JonChesterfield on Aug 12 2021, 9:03 AM.

Details

Summary

Fixes miscompile of calls into ocml. Bug 51445.

The stack variable double __tmp is moved to dynamically allocated shared
memory by CGOpenMPRuntimeGPU. This is usually fine, but when the variable
is passed to a function that is explicitly annotated address_space(5) then
allocating the variable off-stack leads to a miscompile in the back end,
which cannot decide to move the variable back to the stack from shared.

This could be fixed by removing the AS(5) annotation from the math library
or by explicitly marking the variables as thread_mem_alloc. The cast to
AS(5) is still a no-op once IR is reached.

Diff Detail

Event Timeline

JonChesterfield requested review of this revision.Aug 12 2021, 9:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 12 2021, 9:03 AM
JonChesterfield added a subscriber: ye-luo.

@ye-luo this fixes the modf and sincos test cases from https://github.com/ye-luo/openmp-target

jdoerfert accepted this revision.Aug 12 2021, 9:12 AM

Assuming this causes us to generate an alloc as(5) for __tmp, LG

This revision is now accepted and ready to land.Aug 12 2021, 9:12 AM

Assuming this causes us to generate an alloc as(5) for __tmp, LG

Yes, with the running example and this patch we get the perfect:

%__tmp = alloca double, align 8, addrspace(5)
%call = call double @__ocml_sincos_f64(double %__x, double addrspace(5)* %__tmp)

Before, we get:

%__tmp = call fastcc i8* @__kmpc_alloc_shared()
%__tmp_on_stack = bitcast i8* %__tmp to double*
%__tmp_on_stack.ascast = addrspacecast double* %__tmp_on_stack to double addrspace(5)*
%call = call double @__ocml_sincos_f64(double %__x, double addrspace(5)* %__tmp_on_stack.ascast)

which interacts badly with the addrspace(5) annotation on the ocml function

This revision was landed with ongoing or failed builds.Aug 12 2021, 9:30 AM
This revision was automatically updated to reflect the committed changes.
JonChesterfield added a comment.EditedAug 12 2021, 9:40 AM

Failed a CI job that builds an openmp test in an environment without omp.h, will revert.

Thoughts on fixing? Putting the omp allocator definition in this header is likely to collide with a real omp.h. Fairly clean fix is to move the definitions into the deviceRTL. Tempting fix is to add a minimal omp.h to the test dir

example failure:

In file included from /home/tcwg-buildslave/worker/clang-aarch64-quick/llvm/clang/test/Headers/amdgcn_openmp_device_math.c:9:
In file included from /home/tcwg-buildslave/worker/clang-aarch64-quick/llvm/clang/test/Headers/../../lib/Headers/openmp_wrappers/math.h:55:
/home/tcwg-buildslave/worker/clang-aarch64-quick/stage1/lib/clang/14.0.0/include/__clang_hip_math.h:23:10: fatal error: 'omp.h' file not found
#include <omp.h>
JonChesterfield reopened this revision.Aug 12 2021, 9:44 AM
This revision is now accepted and ready to land.Aug 12 2021, 9:44 AM

Failed a CI job that builds an openmp test in an environment without omp.h, will revert.

Thoughts on fixing? Putting the omp allocator definition in this header is likely to collide with a real omp.h. Fairly clean fix is to move the definitions into the deviceRTL. Tempting fix is to add a minimal omp.h to the test dir

example failure:

In file included from /home/tcwg-buildslave/worker/clang-aarch64-quick/llvm/clang/test/Headers/amdgcn_openmp_device_math.c:9:
In file included from /home/tcwg-buildslave/worker/clang-aarch64-quick/llvm/clang/test/Headers/../../lib/Headers/openmp_wrappers/math.h:55:
/home/tcwg-buildslave/worker/clang-aarch64-quick/stage1/lib/clang/14.0.0/include/__clang_hip_math.h:23:10: fatal error: 'omp.h' file not found
#include <omp.h>

yes add omp.h, in clang/test/Headers/Inputs/include.

JonChesterfield added a comment.EditedAug 18 2021, 4:47 PM

intended content for the new omp.h is

#ifndef __OMP_H
#define __OMP_H

#if _OPENMP
// Follows the pattern in interface.h
// Clang sema checks this type carefully, needs to closely match that from omp.h
typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0,
  omp_default_mem_alloc = 1,
  omp_large_cap_mem_alloc = 2,
  omp_const_mem_alloc = 3,
  omp_high_bw_mem_alloc = 4,
  omp_low_lat_mem_alloc = 5,
  omp_cgroup_mem_alloc = 6,
  omp_pteam_mem_alloc = 7,
  omp_thread_mem_alloc = 8,
  KMP_ALLOCATOR_MAX_HANDLE = ~(0U)
} omp_allocator_handle_t;
#endif

#endif

trying to reproduce the failure locally first to verify that the fix works

edit: confirmed, will let the pre-commit build go through green then re-commit

  • add minimal omp.h to clang test inputs
This revision was landed with ongoing or failed builds.Aug 18 2021, 6:22 PM
This revision was automatically updated to reflect the committed changes.