Add support for passing source locations to libomptarget runtime functions. This will allow the runtime system to give much more insightful error messages and debugging values.
Details
Diff Detail
Event Timeline
WIP for adding ident_t support to libomptarget. Still breaks some tests, just wanted to get a start.
That's what I did for all the clang tests, they all pass as far as I know and I can use the built clang to correctly build offloaded applications for CUDA at least. The tests that are problematic are the OpenMPOpt tests because I'm assuming they had a hard coded argument index somewhere into one of the runtime functions that needs to be incremented by one now.
I also didn't add the an ident_t argument to all of the runtime functions, only the ones that I could see being generated by CGOpenMPRuntime.cpp. I just made those functions pass in a nullptr if they call into one of the functions that requires an ident_t.
openmp/libomptarget/src/interface.cpp | ||
---|---|---|
160 | I should probably remove const here so we can pass loc to __kmpc_omp_taskwait. |
Fixed failing tests from OpenMPOpt. Formatted files which results in a lot of changes showing.
clang/lib/CodeGen/CGOpenMPRuntime.cpp | ||
---|---|---|
446 | A lot here seems unrelated. Did you clang format the file? I use the clang format patch tool. |
Yeah, I did clang-format since the last patch complained about some formatting in the linter. Didn't expect it to change quite so much. Also for some reason it won't let me quote your comment. Might've been better to ignore it until the final commit so it doesn't clutter up everything.
Why are we adding the extra parameter to those additional functions? Non-mapper API functions have been deprecated, clang does not emit them anymore...
I wasn't aware they were explicitly deprecated. If we're keeping around old interfaces for backwards compatibility I should also add in the old mapper functions without the ident_t pointer and call into the new functions with a nullptr.
Correct, all __tgt_target_* functions not ending in _mapper are part of the old interface and we are keeping them for compatibility with older versions of clang. These older clang versions do not emit the location pointer anyway, so this extra argument should be removed and each such function should call into its new API equivalent passing a nullptr.
We need the location pointer only for __tgt_target_*_mapper functions as well as __kmpc_push_target_tripcount - this is the current interface.
Since the functions are extern "C" I'll need to rename them to something else to keep them backwards compatible and then change what current Clang generates. something like __tgt_target_mapper_loc? Seems like a hacky solution to just keep adding suffixed whenever we want a new interface though.
Updated runtime library to have legacy calls into the new functions with source location information. Because the library interface requires C function naming this required adding a suffix to all the functions clang generates. Now tgt_*_mapper becomes tgt_*_mapper_loc. It's not a great solution but it's required if we want to keep this backwards compatible.
Yes, this used to be a point of contention within the community. We discussed the issue sometime ago and the majority of developers was in favor of this approach (as opposed to e.g. having an extra pointer to a structure which will contain additional information and which will be extended every time we add a new feature).
The libomptarget-part of this patch looks good, I'm leaving the other reviewers look at the clang-part.
In the future we'll want the ability to pass the mapped variable's string name so we can refer to target pointers by name and line number in the runtime. Since this will require adding another argument containing a list of the names we'll probably want to try to add both at the same time to avoid the need for more suffixes.
openmp/libomptarget/src/interface.cpp | ||
---|---|---|
122 | why not pass arg_mappers | |
177 | Please call the new version with a nullptr here. | |
202 | Why is this passing nullptr in the end? | |
263 | use the new version | |
328 | use the new version | |
357 | why not pass arg_mappers | |
420 | Where is the non-loc version? why not next to it? | |
479 | Please call the new version |
openmp/libomptarget/src/interface.cpp | ||
---|---|---|
174 | Remove this and call the nowait_mapper_loc version. Let's not duplicate logic, also below a few times. |
Added definition for the ident_t struct from kmp.h along with a method to extract the source location information. Checking the target outcome now prints the file location if ident_t location is available.
Should we wait until the next OpenMP LLVM meeting to push this?
openmp/libomptarget/include/Ident.h | ||
---|---|---|
49–52 | This will probably break with a Windows file path, but I don't think you can even build most offloading if you're on Windows. Should I just add a processor check? #ifdef _WIN32 #define PATH_DELIMITER '\\' #else #define PATH_DELIMITER '/' #endif |
openmp/libomptarget/include/Ident.h | ||
---|---|---|
49–52 | You are right, libomptarget does not run on Windows, but some fork which stays in sync with upstream libomptarget may do so and this change may break it. Can you add the proposed pre-processor check? |
Removing the _loc suffix. The Mapper API hasn't been officially released in Clang 11.x so we're still free to make changes. Currently working on augmenting the mapper API with variable declaration information so we can associate mapped pointers with their source names if the user specified debug locations.
Current build, fails offloading/target_depend_nowait for an unknown reason after calling cuStreamSynchronize in __tgt_target_teams_mapper_nowait.
I did a pull a few hours ago, so I have that patch in my tree. I compiled it both with master and this patch applied, it worked on master and failed using my patch. There are some slight differences between the code generated between the two in the device code generated, the host code only differs in the ident_t argument.
Master just has this under the device module
%class.omptarget_nvptx_Queue = type { [32 x %class.omptarget_nvptx_ThreadPrivateContext], [32 x %class.omptarget_nvptx_ThreadPrivateContext*], i32, [32 x i32], i32, [8 x i8] }
While this patch has this in addition.
%class.omptarget_nvptx_Queue = type { [32 x %class.omptarget_nvptx_ThreadPrivateContext.34], [32 x %class.omptarget_nvptx_ThreadPrivateContext.34*], i32, [32 x i32], i32, [8 x i8] } %class.omptarget_nvptx_ThreadPrivateContext.34 = type { %class.omptarget_nvptx_TeamDescr.32, [1024 x %class.omptarget_nvptx_TaskDescr], [1024 x %class.omptarget_nvptx_TaskDescr*], %union.anon, [1024 x i32], [1024 x i64], [1024 x i64], [1024 x i64], [1024 x i64], i64, [8 x i8] } %class.omptarget_nvptx_TeamDescr.32 = type { %class.omptarget_nvptx_TaskDescr, %class.omptarget_nvptx_WorkDescr, [32 x %struct.__kmpc_data_sharing_worker_slot_static] }
This probably has nothing to do with the problem but I'm wondering why this patch does anything that changes the device codegen.
A lot here seems unrelated. Did you clang format the file? I use the clang format patch tool.