This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Implement 5.0 memory management
ClosedPublic

Authored by jlpeyton on Mar 25 2019, 10:49 AM.

Details

Summary
  • Replace HBWMALLOC API with more general MEMKIND API, new functions and variables added.
  • Have libmemkind.so loaded when accessible.
  • Redirect memspaces to default one except for high bandwidth which is processed separately.
  • Ignore some allocator traits e.g., sync_hint, access, pinned, while others are processed normally e.g., alignment, pool_size, fallback, fb_data, partition.
  • Add tests for memory management

Patch by Andrey Churbanov

Diff Detail

Repository
rOMP OpenMP

Event Timeline

jlpeyton created this revision.Mar 25 2019, 10:49 AM
ABataev added inline comments.
runtime/src/include/50/omp.h.var
294

By default, the base type for enum is int. Value UINTPTR_MAX is too big for int.

AndreyChurbanov added inline comments.
runtime/src/include/50/omp.h.var
294

Alexey, theoretically you are correct. But the majority of compilers (gcc, clang, icc, maybe others) support pointer-size enum by default. The OpenMP language committee decided that this type must be enumeration in order to allow strict type checking by compilers. And at the same time it is supposed to keep pointer value as a convenient way to designate custom memory allocator or custom memory space object.

An alternative would require mapping of pointers to integers inside runtime library that looks a worse solution comparing to utilizing common non-standard language extension.

Compilers those do not support pointer-size enumeration can use implementation under #if above, that is currently specified for cl and icl (and maybe others) on Windows. This breaks OpenMP specification requirement, but again looks preferable comparing to mapping pointers to integers inside the runtime.

If you still think this is a big problem, then we can think of changing the OpenMP specification to not require enumeration types here, and allow integers instead. But this is also not a simple and time consuming procedure.

ABataev added inline comments.Mar 26 2019, 4:06 AM
runtime/src/include/50/omp.h.var
294

I think it is better to use based enum explicitly, if the compiler supports it. I think you need to add a check for c++11 and use enum omp_allocator_handle_t : uintptr_t explicitly.

jlpeyton updated this revision to Diff 192665.Mar 28 2019, 9:54 AM
  1. Addressed Alexey’s comment on enumeration type in C++11 - added base type under “#if __cplusplus >= 201103”.
  2. Fixed Fortan headers so that they can be built by gfortran – removed bind(c) from omp_init_allocator function declaration (gfortran complained that array of structures is not c-boundable type). Also fixed implementation in kmp_ftn_entry.h – added pointers dereference.
This revision is now accepted and ready to land.Apr 8 2019, 10:32 AM
This revision was automatically updated to reflect the committed changes.
tianshilei1992 added inline comments.
runtime/src/include/50/omp.h.var
391

This enum causes the front end some troubles.

extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0,
  omp_default_mem_alloc = 1,
  omp_large_cap_mem_alloc = 2,
  omp_const_mem_alloc = 3,
  omp_high_bw_mem_alloc = 4,
  omp_low_lat_mem_alloc = 5,
  omp_cgroup_mem_alloc = 6,
  omp_pteam_mem_alloc = 7,
  omp_thread_mem_alloc = 8,
  llvm_omp_target_host_mem_alloc = 100,
  llvm_omp_target_shared_mem_alloc = 101,
  llvm_omp_target_device_mem_alloc = 102,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}

This simple code shows that sizeof(omp_allocator_handle_t)=8, while sizeof(omp_null_allocator)=4. It is not safe to assume that all members of the enum will be of the same type as omp_allocator_handle_t. It's not generally a problem at runtime, but things are messed up in front end if user uses allocators.

In clang, in order to determine the type of omp_allocator_handle_t, clang checks the type of those predefined allocators. The first one it checks is omp_null_allocator. What clang gets is a int, instead of an enum of size 8. If the allocator is captured by a region, let's say a parallel region, the allocator will be privatized. Because clang deems omp_allocator_handle_t as an int, it will first cast the value returned by runtime (for libomp it is a void *) to int, and then in the outlined function, it casts the int back to omp_allocator_handle_t. This two casts completely shaves the first 32-bit of the pointer, and when the private "new" pointer is fed to another runtime function call __kmpc_allocate(), it causes segment fault.

Can we define those allocators like how they are defined on Windows, aka global variables? This can make sure the size is fixed.

Herald added a project: Restricted Project. · View Herald Transcript

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

I tried that. It doesn't work.

➜  cat test_omp_null_allocator.c
extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0UL,
  omp_default_mem_alloc = 1UL,
  omp_large_cap_mem_alloc = 2UL,
  omp_const_mem_alloc = 3UL,
  omp_high_bw_mem_alloc = 4UL,
  omp_low_lat_mem_alloc = 5UL,
  omp_cgroup_mem_alloc = 6UL,
  omp_pteam_mem_alloc = 7UL,
  omp_thread_mem_alloc = 8UL,
  llvm_omp_target_host_mem_alloc = 100UL,
  llvm_omp_target_shared_mem_alloc = 101UL,
  llvm_omp_target_device_mem_alloc = 102UL,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}
➜  clang test_omp_null_allocator.c -o test_omp_null_allocator
➜  ./test_omp_null_allocator
sizeof(omp_allocator_handle_t)=8, sizeof(omp_null_allocator)=4

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

I tried that. It doesn't work.

➜  cat test_omp_null_allocator.c
extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0UL,
  omp_default_mem_alloc = 1UL,
  omp_large_cap_mem_alloc = 2UL,
  omp_const_mem_alloc = 3UL,
  omp_high_bw_mem_alloc = 4UL,
  omp_low_lat_mem_alloc = 5UL,
  omp_cgroup_mem_alloc = 6UL,
  omp_pteam_mem_alloc = 7UL,
  omp_thread_mem_alloc = 8UL,
  llvm_omp_target_host_mem_alloc = 100UL,
  llvm_omp_target_shared_mem_alloc = 101UL,
  llvm_omp_target_device_mem_alloc = 102UL,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}
➜  clang test_omp_null_allocator.c -o test_omp_null_allocator
➜  ./test_omp_null_allocator
sizeof(omp_allocator_handle_t)=8, sizeof(omp_null_allocator)=4

Try ULL instead of UL

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

I tried that. It doesn't work.

➜  cat test_omp_null_allocator.c
extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0UL,
  omp_default_mem_alloc = 1UL,
  omp_large_cap_mem_alloc = 2UL,
  omp_const_mem_alloc = 3UL,
  omp_high_bw_mem_alloc = 4UL,
  omp_low_lat_mem_alloc = 5UL,
  omp_cgroup_mem_alloc = 6UL,
  omp_pteam_mem_alloc = 7UL,
  omp_thread_mem_alloc = 8UL,
  llvm_omp_target_host_mem_alloc = 100UL,
  llvm_omp_target_shared_mem_alloc = 101UL,
  llvm_omp_target_device_mem_alloc = 102UL,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}
➜  clang test_omp_null_allocator.c -o test_omp_null_allocator
➜  ./test_omp_null_allocator
sizeof(omp_allocator_handle_t)=8, sizeof(omp_null_allocator)=4

Try ULL instead of UL

But the thing is, this has to work on X86 as well. On 32-bit systems, void * is 32-bit, which is a int.

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

I tried that. It doesn't work.

➜  cat test_omp_null_allocator.c
extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0UL,
  omp_default_mem_alloc = 1UL,
  omp_large_cap_mem_alloc = 2UL,
  omp_const_mem_alloc = 3UL,
  omp_high_bw_mem_alloc = 4UL,
  omp_low_lat_mem_alloc = 5UL,
  omp_cgroup_mem_alloc = 6UL,
  omp_pteam_mem_alloc = 7UL,
  omp_thread_mem_alloc = 8UL,
  llvm_omp_target_host_mem_alloc = 100UL,
  llvm_omp_target_shared_mem_alloc = 101UL,
  llvm_omp_target_device_mem_alloc = 102UL,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}
➜  clang test_omp_null_allocator.c -o test_omp_null_allocator
➜  ./test_omp_null_allocator
sizeof(omp_allocator_handle_t)=8, sizeof(omp_null_allocator)=4

Try ULL instead of UL

But the thing is, this has to work on X86 as well. On 32-bit systems, void * is 32-bit, which is a int.

How about explicit vast to uintptr_t, like (uintptr_t) 0, etc.?

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

I tried that. It doesn't work.

➜  cat test_omp_null_allocator.c
extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0UL,
  omp_default_mem_alloc = 1UL,
  omp_large_cap_mem_alloc = 2UL,
  omp_const_mem_alloc = 3UL,
  omp_high_bw_mem_alloc = 4UL,
  omp_low_lat_mem_alloc = 5UL,
  omp_cgroup_mem_alloc = 6UL,
  omp_pteam_mem_alloc = 7UL,
  omp_thread_mem_alloc = 8UL,
  llvm_omp_target_host_mem_alloc = 100UL,
  llvm_omp_target_shared_mem_alloc = 101UL,
  llvm_omp_target_device_mem_alloc = 102UL,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}
➜  clang test_omp_null_allocator.c -o test_omp_null_allocator
➜  ./test_omp_null_allocator
sizeof(omp_allocator_handle_t)=8, sizeof(omp_null_allocator)=4

Try ULL instead of UL

But the thing is, this has to work on X86 as well. On 32-bit systems, void * is 32-bit, which is a int.

How about explicit vast to uintptr_t, like (uintptr_t) 0, etc.?

Not working.

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

I tried that. It doesn't work.

➜  cat test_omp_null_allocator.c
extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0UL,
  omp_default_mem_alloc = 1UL,
  omp_large_cap_mem_alloc = 2UL,
  omp_const_mem_alloc = 3UL,
  omp_high_bw_mem_alloc = 4UL,
  omp_low_lat_mem_alloc = 5UL,
  omp_cgroup_mem_alloc = 6UL,
  omp_pteam_mem_alloc = 7UL,
  omp_thread_mem_alloc = 8UL,
  llvm_omp_target_host_mem_alloc = 100UL,
  llvm_omp_target_shared_mem_alloc = 101UL,
  llvm_omp_target_device_mem_alloc = 102UL,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}
➜  clang test_omp_null_allocator.c -o test_omp_null_allocator
➜  ./test_omp_null_allocator
sizeof(omp_allocator_handle_t)=8, sizeof(omp_null_allocator)=4

Try ULL instead of UL

But the thing is, this has to work on X86 as well. On 32-bit systems, void * is 32-bit, which is a int.

How about explicit vast to uintptr_t, like (uintptr_t) 0, etc.?

Not working.

Conditional versions for 32/64 bit systems? With ULL for 64 bit and ul for 32 bits? We'd better to fix the type declaration to make it consistent across compilation units.

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

Can you try to fix initializers instead and make them 0UL, 1UL, etc.?

I tried that. It doesn't work.

➜  cat test_omp_null_allocator.c
extern int printf(const char *, ...);

typedef enum omp_allocator_handle_t {
  omp_null_allocator = 0UL,
  omp_default_mem_alloc = 1UL,
  omp_large_cap_mem_alloc = 2UL,
  omp_const_mem_alloc = 3UL,
  omp_high_bw_mem_alloc = 4UL,
  omp_low_lat_mem_alloc = 5UL,
  omp_cgroup_mem_alloc = 6UL,
  omp_pteam_mem_alloc = 7UL,
  omp_thread_mem_alloc = 8UL,
  llvm_omp_target_host_mem_alloc = 100UL,
  llvm_omp_target_shared_mem_alloc = 101UL,
  llvm_omp_target_device_mem_alloc = 102UL,
  KMP_ALLOCATOR_MAX_HANDLE = 18446744073709551615UL
} omp_allocator_handle_t;

int main(int argc, char *argv[]) {
  printf("sizeof(omp_allocator_handle_t)=%zu, sizeof(omp_null_allocator)=%zu\n",
         sizeof(omp_allocator_handle_t), sizeof(omp_null_allocator));
  return 0;
}
➜  clang test_omp_null_allocator.c -o test_omp_null_allocator
➜  ./test_omp_null_allocator
sizeof(omp_allocator_handle_t)=8, sizeof(omp_null_allocator)=4

Try ULL instead of UL

But the thing is, this has to work on X86 as well. On 32-bit systems, void * is 32-bit, which is a int.

How about explicit vast to uintptr_t, like (uintptr_t) 0, etc.?

Not working.

Conditional versions for 32/64 bit systems? With ULL for 64 bit and ul for 32 bits? We'd better to fix the type declaration to make it consistent across compilation units.

Nah, let's forget about portability now, even 0ULL doesn't work.

@ABataev Can we handle the detection of type omp_allocator_handle_t in another way? For example, is it possible to get the TypeDecl * from the identifier table and use that as the type (we can get the QualType from TypeDecl *)?

I tried to avoid this, since we have different representations fir this type. And such lookup would require some non-portable lookup. But if it is not stable anyway, we can try to do it and tie the frontend to the libomp implementation. Or need a better way to represent this type and corresponding constants.

runtime/src/kmp_stub.cpp