Two globals KernelInfoTable & SymbolInfoTable are moved
into RTLDeviceInfoTy class.
This builds on the top of D102691.
[2/2]
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
The vector<> part is for N GPUs, and each function call is for a specific GPU. So we never need to pass the whole vector along, only the element corresponding to that GPU.
I'd prefer we populate a new map for each code object and then merge them, instead of mutating the existing map, but I guess that's not a significant difference.
PopulateInfoTablesContext looks like a hazard - it's a struct holding references, which is usually a bad sign for lifetimes. If that's only used to pass data across the void* boundary, let's use a lambda instance instead.
PopulateInfoTablesContext looks like a hazard - it's a struct holding references, which is usually a bad sign for lifetimes. If that's only used to pass data across the void* boundary, let's use a lambda instance instead.
The capturing lambda won't work here so that's why I chose to use a struct of references.
I could not find alternative to references in the PopulateInfoTablesContext. I could use pointers
but I don't see how that is different from having references.
Updated the patch to pass map directly instead of whole vector.
openmp/libomptarget/plugins/amdgpu/impl/system.cpp | ||
---|---|---|
820–822 | Minor point, output parameters at the end of the list is probably more common than at the start of the list for C++. I'd rather pass mutable things by pointer, so the call site looks like function(42, &something_mutated), but that convention is from a different codebase to this one. Better than either would be returning a pair<status, map> instead of mutating it in place. But given this change is moving from a global variable to a parameter, getting rid of the mutation at the same time seems a step too far for one patch. | |
994–995 | Would lean towards dropping the void*, e.g. via D103030: static hsa_status_t populate_InfoTables(hsa_executable_symbol_t symbol, std::map<std::string, atl_kernel_info_t> &KernelInfoTable, std::map<std::string, atl_symbol_info_t> &SymbolInfoTable) { | |
openmp/libomptarget/plugins/amdgpu/src/rtl.cpp | ||
89 | This seems orthogonal to moving the global. Also, looks like we now have declarations in rtl.cpp and definitions in atmi_interop_hsa.cpp, which seems bad as we lose the compiler checking that the call site matches the definition |
except for the lost error check above, I think this is now a non-functional change. LG with that error path restored
openmp/libomptarget/plugins/amdgpu/impl/system.cpp | ||
---|---|---|
1123 | lost an err = assignment and check here |
openmp/libomptarget/plugins/amdgpu/src/rtl.cpp | ||
---|---|---|
499 | This is dead - resize the vector will populate it with default constructed map<string>, and clear on a default constructed map is a no-op | |
577 | The KernelInfoTable and SymbolInfoTable calls here are also dead. They're free store allocated memory, don't need to tear them down before closing hsa (unlike kernel arg pool, which contains pointers from hsa allocated memory) |
passing by const& is much better than passing by &, I think we should do that in this patch
openmp/libomptarget/plugins/amdgpu/impl/atmi_interop_hsa.h | ||
---|---|---|
80 | This is only called from one place, with info == HSA_EXECUTABLE_SYMBOL_INFO_KERNEL_KERNARG_SEGMENT_SIZE, so can be simplified to a map lookup. Orthogonal to this patch. This and get_symbol_info only do lookups in the map so we should pass the map to them by const&. |
clang-tidy: warning: invalid case style for function 'atmi_interop_hsa_get_symbol_info' [readability-identifier-naming]
not useful