This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Add 'amdgpu-flat-work-group-size' to OpenMP kernels
Needs RevisionPublic

Authored by jhuber6 on Jan 23 2023, 11:26 AM.

Details

Summary

This patch adds the amdgpu-flat-work-group-size=1,1024 attribute to
OpenMP kernels targeting AMDGPU. This also lets us use
--gpu-max-threads-per-block which is loosened from being a HIP only
option.

Diff Detail

Event Timeline

jhuber6 created this revision.Jan 23 2023, 11:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 23 2023, 11:26 AM
jhuber6 requested review of this revision.Jan 23 2023, 11:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 23 2023, 11:26 AM
arsenm added inline comments.Jan 23 2023, 11:32 AM
clang/lib/CodeGen/TargetInfo.cpp
9552

Probably shouldn’t check the language, just it’s a kernel. Also shouldn’t emit this if it’s the default 1024. I’ve been trying to cut down on the superfluous attribute spam

jhuber6 added inline comments.Jan 23 2023, 11:36 AM
clang/lib/CodeGen/TargetInfo.cpp
9552

There's a section for HIP above that does the same. We could probably consolidate here for all "AMDGPU" kernels and get rid of the redundant attribute. Maybe in a separate patch?

arsenm added inline comments.Jan 23 2023, 11:38 AM
clang/lib/CodeGen/TargetInfo.cpp
9552

All the isCUDA || HIP || OpenMP checks scattered around are driving me crazy. A bunch of the out of tree divergent patches are just adding to them. We should just purge everything checking languages to the actual features and stop putting language names in things

jhuber6 added inline comments.Jan 23 2023, 11:40 AM
clang/lib/CodeGen/TargetInfo.cpp
9552

OpenCL is the odd one out as far as I know, HIP and OpenMP are mostly equivalent as far as attributes go.

yaxunl added inline comments.Jan 23 2023, 1:47 PM
clang/lib/CodeGen/TargetInfo.cpp
9552

OpenCL uses 256 as default max block size. This is to avoid performance regressions for existing apps. HIP uses 1024 by default.

clang/lib/Driver/ToolChains/AMDGPUOpenMP.cpp
61

we should keep the default value in Options.td instead of having multiple copies at different places. save as below.

jdoerfert added inline comments.Jan 23 2023, 4:02 PM
clang/lib/Driver/ToolChains/AMDGPU.cpp
791

^

arsenm added inline comments.Feb 2 2023, 6:31 PM
clang/include/clang/Basic/LangOptions.def
271

probably should drop the language and describe what it is

arsenm requested changes to this revision.Apr 26 2023, 4:56 AM
This revision now requires changes to proceed.Apr 26 2023, 4:56 AM
yaxunl added inline comments.Apr 26 2023, 7:36 AM
clang/include/clang/Basic/LangOptions.def
271

CUDA does not use it. Drop the language may cause confusion.

arsenm added inline comments.Apr 26 2023, 10:43 AM
clang/include/clang/Basic/LangOptions.def
271

If CUDA doesn't respect the flag, that's just broken. The concept is common among all the languages, and all the unnecessary language qualifications are a plague. Controls should be expressed as a generic concept that languages selectively enable