This patch adds support for two environment variables to configure the device.
`LIBOMPTARGET_STACK_SIZE` sets the amount of memory in bytes that each thread
has for its stack. `LIBOMPTARGET_HEAP_SIZE` sets the amount of heap memory
that can be allocated using malloc / free on the device.
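As a rough illustration of how such a variable could be read, here is a minimal sketch. It is not the code from the patch: `readSizeFromEnv` is a hypothetical helper name, and the strict "decimal digits only" parsing policy is an assumption. The review thread shows the stack value being passed to `cuCtxSetLimit(CU_LIMIT_STACK_SIZE, ...)`; using `CU_LIMIT_MALLOC_HEAP_SIZE` for the heap value is assumed here by analogy.

```cpp
#include <cctype>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical helper (not from the patch): read an environment variable and
// parse it as a byte count. Returns false if the variable is unset, empty,
// not a plain decimal number, or too large for std::size_t.
bool readSizeFromEnv(const char *Name, std::size_t &Out) {
  const char *Value = std::getenv(Name);
  if (!Value || !std::isdigit(static_cast<unsigned char>(*Value)))
    return false; // unset, empty, signed, or leading whitespace

  errno = 0;
  char *End = nullptr;
  unsigned long long Parsed = std::strtoull(Value, &End, 10);
  if (*End != '\0' || errno == ERANGE)
    return false; // trailing garbage or overflow
  if (Parsed > std::numeric_limits<std::size_t>::max())
    return false;

  Out = static_cast<std::size_t>(Parsed);
  return true;
}

// Assumed usage inside device initialization (requires the CUDA driver API,
// so it is shown only as a comment):
//   std::size_t StackLimit;
//   if (readSizeFromEnv("LIBOMPTARGET_STACK_SIZE", StackLimit))
//     cuCtxSetLimit(CU_LIMIT_STACK_SIZE, StackLimit);
//   std::size_t HeapLimit;
//   if (readSizeFromEnv("LIBOMPTARGET_HEAP_SIZE", HeapLimit))
//     cuCtxSetLimit(CU_LIMIT_MALLOC_HEAP_SIZE, HeapLimit);
```

With a helper like this, an unset or malformed variable simply leaves the driver's default limit in place.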
Repository: rG LLVM Github Monorepo
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
652 | These enums don't seem to be defined in cuda.h, or somewhere else. Can you please take a look? |
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
652 | My cuda.h defines them: `/opt/cuda/targets/x86_64-linux/include/cuda.h`, line 1130, has `CU_LIMIT_STACK_SIZE = 0x00, /**< GPU thread stack size */`. https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TYPES.html lists them as well. |
openmp/libomptarget/plugins/cuda/src/rtl.cpp:652

I meant in the compiler sources. The other enums used in libomptarget seem to be defined in openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h. The absence of these definitions on a machine that doesn't have the CUDA drivers installed is causing a build failure for me with these errors:

    llvm/openmp/libomptarget/plugins/cuda/src/rtl.cpp: In member function ‘int {anonymous}::DeviceRTLTy::initDevice(int)’:
    llvm/openmp/libomptarget/plugins/cuda/src/rtl.cpp:649:25: error: ‘CU_LIMIT_STACK_SIZE’ was not declared in this scope
          if (cuCtxSetLimit(CU_LIMIT_STACK_SIZE, StackLimit) != CUDA_SUCCESS)
                            ^~~~~~~~~~~~~~~~~~~

I looked at the buildbots, but they skip building the CUDA plugin altogether, so they don't report any failures. For instance, https://lab.llvm.org/buildbot/#/builders/84/builds/12107/steps/4/logs/stdio has this in the log:

    -- Could NOT find LIBOMPTARGET_DEP_CUDA_DRIVER (missing: LIBOMPTARGET_DEP_CUDA_DRIVER_LIBRARIES)
    -- Could NOT find LIBOMPTARGET_DEP_VEO (missing: LIBOMPTARGET_DEP_VEO_LIBRARIES LIBOMPTARGET_DEP_VEOSINFO_LIBRARIES LIBOMPTARGET_DEP_VEO_INCLUDE_DIRS)
    -- LIBOMPTARGET: Building offloading runtime library libomptarget.
    ...
    -- LIBOMPTARGET: Not building CUDA offloading plugin: libelf dependency not found.
    ...
    -- check-libomptarget does nothing.
openmp/libomptarget/plugins/cuda/src/rtl.cpp:652

This patch makes the build pass for me, but I have no way to verify it. @jhuber6 can you please take a look?

    diff --git a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp
    index c84b3814065e..235efd2728de 100644
    --- a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp
    +++ b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.cpp
    @@ -61,6 +61,9 @@ DLWRAP(cuDeviceCanAccessPeer, 3);
     DLWRAP(cuCtxEnablePeerAccess, 2);
     DLWRAP(cuMemcpyPeerAsync, 6);

    +DLWRAP(cuCtxGetLimit, 2);
    +DLWRAP(cuCtxSetLimit, 2);
    +
     DLWRAP_FINALIZE();

     #ifndef DYNAMIC_CUDA_PATH
    diff --git a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
    index 045c39cacc97..17aa2a12ef6c 100644
    --- a/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
    +++ b/openmp/libomptarget/plugins/cuda/dynamic_cuda/cuda.h
    @@ -34,6 +34,17 @@ typedef enum CUstream_flags_enum {
       CU_STREAM_NON_BLOCKING = 0x1,
     } CUstream_flags;

    +typedef enum CUlimit_enum {
    +  CU_LIMIT_STACK_SIZE = 0x0,
    +  CU_LIMIT_PRINTF_FIFO_SIZE = 0x1,
    +  CU_LIMIT_MALLOC_HEAP_SIZE = 0x2,
    +  CU_LIMIT_DEV_RUNTIME_SYNC_DEPTH = 0x3,
    +  CU_LIMIT_DEV_RUNTIME_PENDING_LAUNCH_COUNT = 0x4,
    +  CU_LIMIT_MAX_L2_FETCH_GRANULARITY = 0x5,
    +  CU_LIMIT_PERSISTING_L2_CACHE_SIZE = 0x6,
    +  CU_LIMIT_MAX
    +} CUlimit;
    +
     typedef enum CUdevice_attribute_enum {
       CU_DEVICE_ATTRIBUTE_MAX_BLOCK_DIM_X = 2,
       CU_DEVICE_ATTRIBUTE_MAX_GRID_DIM_X = 5,
    @@ -100,4 +111,7 @@ CUresult cuCtxEnablePeerAccess(CUcontext, unsigned);
     CUresult cuMemcpyPeerAsync(CUdeviceptr, CUcontext, CUdeviceptr, CUcontext,
                                size_t, CUstream);

    +CUresult cuCtxGetLimit(size_t *, CUlimit);
    +CUresult cuCtxSetLimit(CUlimit, size_t);
    +
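To illustrate why locally declared enums and prototypes let the plugin build on a machine without the CUDA headers, here is a minimal sketch of resolving `cuCtxSetLimit` at run time. It uses plain POSIX `dlopen`/`dlsym` rather than the `DLWRAP` machinery from the diff above, so it is a stand-in technique and not how libomptarget actually does it. `lookupCtxSetLimit`, `cuCtxSetLimitTy`, and the reduced `CUresult` enum are hypothetical names introduced for illustration.

```cpp
#include <cstddef>
#include <dlfcn.h>

// Local stand-ins for the driver API types, mirroring the idea of
// dynamic_cuda/cuda.h: no vendor cuda.h is needed at compile time.
typedef enum CUlimit_enum {
  CU_LIMIT_STACK_SIZE = 0x0,
  CU_LIMIT_MALLOC_HEAP_SIZE = 0x2,
} CUlimit;

// Simplified stand-in: only the success code is declared here.
typedef enum { CUDA_SUCCESS = 0 } CUresult;

// Function-pointer type matching the prototype added in the diff.
using cuCtxSetLimitTy = CUresult (*)(CUlimit, std::size_t);

// Hypothetical helper: try to load the given library and look up
// cuCtxSetLimit in it. Returns nullptr if either step fails, so callers can
// fall back gracefully when no CUDA driver is installed.
cuCtxSetLimitTy lookupCtxSetLimit(const char *LibraryName) {
  void *Handle = dlopen(LibraryName, RTLD_NOW);
  if (!Handle)
    return nullptr;
  return reinterpret_cast<cuCtxSetLimitTy>(dlsym(Handle, "cuCtxSetLimit"));
}
```

On a machine with the driver installed, a caller would pass the driver library's name (for example `libcuda.so` on Linux) and check the result for `nullptr` before calling through it. On older glibc versions the program may need to be linked with `-ldl`.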
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
652 | Right, that makes sense. The above looks good to me. Could you commit it? |
Johannes,
I do not have the access rights needed for git push. Could you please help commit the fix?
Thanks,
Abhinav
There are three failures in my local check-all. Is this expected?
Failed Tests (3):
  libomptarget :: x86_64-pc-linux-gnu :: offloading/memory_manager.cpp
  libomptarget :: x86_64-pc-linux-gnu :: offloading/parallel_offloading_map.cpp
  libomptarget :: x86_64-pc-linux-gnu :: offloading/taskloop_offload_nowait.cpp
Pengfei, these failures come from a different part of the code that should not be affected by this change.
I see the same failures in my environment (RHEL 8.2, gcc 8.3.1) for commit 4a76bd0e (the previous buildable commit) as well.
clang-tidy: error: 'cuda.h' file not found [clang-diagnostic-error]