This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][Libomptarget] Remove global KernelNameMap
ClosedPublic

Authored by pdhaliwal on May 18 2021, 7:21 AM.

Details

Summary

KernelNameMap contains entries like "key.kd" => key which clearly
could be replaced by simple logic of removing suffix from the key.

Diff Detail

Event Timeline

pdhaliwal created this revision.May 18 2021, 7:21 AM
pdhaliwal requested review of this revision.May 18 2021, 7:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 18 2021, 7:21 AM

Let's delete it instead. It maps 'foo' to 'foo.kd'. we can replace that with a function that appends .kd to the string.

Removed KernelNameMap.

pdhaliwal retitled this revision from [AMDGPU][Libomptarget] Move KernelNameMap to function scope to [AMDGPU][Libomptarget] Remove global KernelNameMap.May 19 2021, 12:53 AM
pdhaliwal edited the summary of this revision. (Show Details)

I like the direction. Could we hold it for a day or so? I'd like to check through the uses of the kernel name to see if there's a missing edge case, or if we can simplify this a step further.

It looks like the msgpack data always contains the foo and the foo.kd strings, under different keys. I wonder if that's something we can rely on the compiler emitting.

openmp/libomptarget/plugins/amdgpu/impl/system.cpp
864

I think this is the earliest point we can check the foo to foo.kd mapping is in place, so that we can error out if it isn't

1036

Can we use the foo.kd form everywhere instead of truncating it?

t-tye added a comment.May 19 2021, 1:12 AM

I like the direction. Could we hold it for a day or so? I'd like to check through the uses of the kernel name to see if there's a missing edge case, or if we can simplify this a step further.

It looks like the msgpack data always contains the foo and the foo.kd strings, under different keys. I wonder if that's something we can rely on the compiler emitting.

The HSA ABI metadata is defined in https://llvm.org/docs/AMDGPUUsage.html#code-object-metadata . The kernel .symbol attribute provides the ELF symbol that should be used to access the kernel descriptor used in the AQL packet to specify the kernel to execute. The intent is not to make assumptions about the symbols, but simply use the metadata.

Indeed, but HSA is not the only piece here. This is tied to some generic openmp data, which I think uses the IR symbol name as the ID. That seems to be .symbol.

So far, above co v2 which is unimplemented, kernel name is derivable by appending .kd, so the suggestion here was to only use one of the two strings.

Taken a closer look at this. Cut down example msgpack,

".name" : "__omp_vmul_l8",
".symbol" : "__omp_vmul_l8.kd",

Most things use the 'name' version, i.e. without the trailing .kd suffix. The symbol name passed into the openmp runtime is that one, and that's the form used as a key in KernelInfoTable.

The HSA functions expect the suffixed version, and that's the one SYMBOL_INFO_NAME returns.

So what we presently do is:

  • Build a map from foo.kd to foo
  • Check, obliquely, that the original metadata had both foo.kd and foo present
  • Use that map to strip .kd from the result of a HSA call

The logic to drop a trailing .kd therefore looks like the right change to make. We can check the .name and .symbol fields when reading the metadata and error out if they don't match the expected format, and likewise error out if HSA returns a string that doesn't end .kd, and at that point we can rip out the map.

That's close to the patch as written above, but now I've actually worked through the control flow myself as well.

openmp/libomptarget/plugins/amdgpu/impl/system.cpp
137

these ^ are probably not the ideal structures. We convert char* into std::string at each call site, and map() is expected to be slower than unordered_map for the ~random access lookup.

vector<unordered_map<const char*, info_t>> with the custom equality / hasher for C strings is a reasonable choice.

1036

^that turns out to be worse, truncating is the way to go. Probably cleaner to mutate the name[] buffer than do extra things with strings.

pdhaliwal updated this revision to Diff 346946.May 21 2021, 1:04 AM

Added check for .kd in symbol name.

Couple of style comments above. I think this works as-is, so we have confirmation that the map of strings can go. Which is great!

openmp/libomptarget/plugins/amdgpu/impl/system.cpp
877

I think the above works but could be clearer as:
(kernelName + ".kd" ) != symbolName

kernelName can drop out of scope after this test (probably move it into a nested brace to make that clear), so we could mutate it instead of making a new string to probably save an allocation. But in general, '(foo + ".kd") == bar' seems simpler than the find calls.

927–928

Lets drop the debug print and comment here now that the map is gone

pdhaliwal updated this revision to Diff 346998.May 21 2021, 5:27 AM

Simplified if condition.

pdhaliwal marked 3 inline comments as done.May 21 2021, 5:29 AM
JonChesterfield accepted this revision.May 21 2021, 6:59 AM

Like it, thanks!

This revision is now accepted and ready to land.May 21 2021, 6:59 AM
This revision was landed with ongoing or failed builds.May 24 2021, 1:46 AM
This revision was automatically updated to reflect the committed changes.