If amdgpu-flat-work-group-size is not specified in the LLVM IR, the backend
uses a default value of 1024. For that size, the minimum waves per EU should
be 4. However, the backend still sets the minimum to 1 instead of the
calculated value. This is not normally observed because the frontend always
provides the amdgpu-flat-work-group-size attribute.
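
As a rough illustration of why a maximum flat work-group size of 1024 implies a minimum of 4 waves per EU, the arithmetic can be sketched as below. This is not the backend's actual code: the helper names are hypothetical, and the constants assume a wavefront size of 64 and 4 EUs per compute unit.

```cpp
// Sketch only: hypothetical helpers, not the LLVM backend implementation.
// Assumptions: wave64 target, 4 execution units (EUs) per compute unit.
constexpr unsigned WavefrontSize = 64;
constexpr unsigned EUsPerCU = 4;

// Integer division rounding up.
constexpr unsigned divideCeil(unsigned Num, unsigned Den) {
  return (Num + Den - 1) / Den;
}

// Number of waves needed to hold one work-group of the given size.
constexpr unsigned wavesPerWorkGroup(unsigned FlatWorkGroupSize) {
  return divideCeil(FlatWorkGroupSize, WavefrontSize);
}

// Minimum waves per EU implied by the MAXIMUM flat work-group size:
// the waves of one work-group are spread across the EUs of a compute unit.
constexpr unsigned minWavesPerEU(unsigned MaxFlatWorkGroupSize) {
  return divideCeil(wavesPerWorkGroup(MaxFlatWorkGroupSize), EUsPerCU);
}

// 1024 / 64 = 16 waves per work-group, 16 / 4 EUs = 4 waves per EU.
static_assert(minWavesPerEU(1024) == 4, "default size of 1024 implies 4");
// 256 / 64 = 4 waves per work-group, 4 / 4 EUs = 1 wave per EU.
static_assert(minWavesPerEU(256) == 1, "size of 256 implies 1");
```

Under these assumptions, a maximum of 256 yields a minimum of 1, which is consistent with the issue not showing up when a frontend supplies a smaller range.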
Repository: rG LLVM Github Monorepo
Event Timeline
Should clang now be emitting 1 for the minimum range for OpenCL? Or should we go the other direction and have HIP bump the minimum required group size by default in the frontend?
My understanding is that clang already emits amdgpu-flat-work-group-size with a default range of [1, 256], while HIP defaults to [1, 1024]. The logic that calculates min-waves-per-eu depends on the maximum work-group size, not the minimum.
Hi, your git commit contains extra Phabricator tags. You can drop Reviewers: Subscribers: Tags: and the text Summary: from the git commit with the following script:
arcfilter () {
        arc amend
        git log -1 --pretty=%B | awk '/Reviewers:|Subscribers:/{p=1} /Reviewed By:|Differential Revision:/{p=0} !p && !/^Summary:$/ {sub(/^Summary: /,"");print}' | git commit --amend --date=now -F -
}
Reviewed By: is considered important by some people, so please keep that tag. (--date=now is my personal preference: author dates are usually not useful, and using committer dates can make the log almost monotonic in time.)
llvm/utils/git/pre-push.py can validate that the message does not include unneeded tags.