This patch starts adding support for adding information dumps to libomptarget and rtl plugins. The information printing is controlled by the LIBOMPTARGET_INFO environment variable introduced in D86483. The goal of this patch is to provide the user with additional information about the device during kernel execution and providing the user with information dumps in the case of failure. This patch added the ability to dump the pointer mapping table as well as printing the number of blocks and threads in the cuda RTL
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
@jdoerfert Any suggestions?
I think the current documentation is insufficient. Adding/updating any documents for any of the macro or overall are helpful.
Although other macros are not touched, it is still helpful to document them like those you added. You are probably the best person who knows how and when such macros should be used now.
I guess I'd prefer doxygen above the macro, as if they are functions. The reason is that I want us to make them functions at some point anyway.
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
508 | Maybe: "Device %d supports up to % CUDA blocks and %d threads with a warp size of %d". | |
956 | Can we please merge those two messages: INFO("Launching kernel with %d blocks and %d threads in %s mode\n", CudaBlocksPerGrid, CudaThreadsPerBlock,(KernelInfo->ExecutionMode == SPMD)? "SPMD" : "Generic") | |
openmp/libomptarget/src/interface.cpp | ||
37 | Don't we have to add space here to match the 18 above? |
openmp/libomptarget/src/interface.cpp | ||
---|---|---|
37 | The width of the pointer is 16 plus the "0x" at the start. |
I'm fine with this, one minor comment below. @ye-luo wdyt?
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
942 | Forgot, we need to print the device number here too. Also the kernel name if we have it. That name is not great but has the line number in it (IIRC). Maybe add a first mandatory argument to the INFO macro that is the device number. Then we can always forward something like: "[OMP-INFO][Device %d] " __VAR_ARGS__. |
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
942 | You mean using the CUDA RTL to get the name from the CUfunction value in the KernelTy struct? |
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
942 | We add it to KernelTy. When we do KernelsList.emplace_back(Func, ExecModeVal); in rtl.cpp, we know the name. |
Added the target kernel name and device ID to the message. Added a new method to get the offload entry given the pointer. I can't decide if it's better to just always print the device ID if we have it versus putting it in the macro itself.
Does this require any test files?
I believe there is no reason to be somewhere in this methods without a device ID. Thus, I would require it as part of the function/macro.
Does this require any test files?
I think that would be good. We should be able to set the INFO variable in the run line as well. Do we have a test for the DP macro we can copy?
openmp/libomptarget/plugins/cuda/src/rtl.cpp | ||
---|---|---|
298 | can we replace findOffloadEntry with this please. |
The changes I requested as been added. Remove my blocking. Still need other reviews to be addressed.
Changed table dumps to require LIBOMPTARGET_INFO > 1. We could move this to a bitvector in the future. Added an extra message to inform users to set LIBOMPTARGET_INFO on failure.
openmp/libomptarget/src/interface.cpp | ||
---|---|---|
82 | Need to be more specific about the "tables" |
clang-tidy: error: use of undeclared identifier 'getenv' [clang-diagnostic-error]
not useful