This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU (CLANG)
ClosedPublic

Authored by t-tye on Feb 24 2018, 7:56 PM.

Details

Summary
  • Remove use of the opencl and amdopencl environment member of the target triple for the AMDGPU target.
  • Use a function attribute to communicate to the AMDGPU backend.

Diff Detail

Repository
rL LLVM

Event Timeline

t-tye created this revision.Feb 24 2018, 7:56 PM
t-tye retitled this revision from CLANG: Remove use of OpenCL triple environment and replace with function attribute for AMDGPU to [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU.
nhaehnle removed a subscriber: nhaehnle.Feb 25 2018, 7:36 AM
kzhuravl accepted this revision.Feb 26 2018, 6:56 AM

LGTM.

This revision is now accepted and ready to land.Feb 26 2018, 6:56 AM
yaxunl accepted this revision.Feb 26 2018, 7:42 AM

LGTM. Thanks!

t-tye updated this revision to Diff 136234.Feb 27 2018, 11:38 PM
t-tye added a reviewer: arsenm.
  • Changed to use function attribute that specifies the number of bytes of implicit arguments for OpenCL kernels when using the AMDHSA OS.
kzhuravl requested changes to this revision.Mar 6 2018, 2:59 PM
kzhuravl added inline comments.
docs/UsersManual.rst
2182 ↗(On Diff #136234)

-mcpu=gfx900

2290 ↗(On Diff #136234)

-mcpu=gfx900

lib/CodeGen/TargetInfo.cpp
7660 ↗(On Diff #136234)

Make <= 80 chars long.

This revision now requires changes to proceed.Mar 6 2018, 2:59 PM
yaxunl added inline comments.Mar 19 2018, 10:43 AM
lib/CodeGen/TargetInfo.cpp
7661 ↗(On Diff #136234)

Now we support enqueue_kernel, there are two extra hidden arguments. Totally 6 hidden arguments. The implicitarg-num-bytes should be 48 now.

t-tye added inline comments.Mar 19 2018, 10:57 AM
lib/CodeGen/TargetInfo.cpp
7661 ↗(On Diff #136234)

Can you give the ABI for these implicit arguments so can update the AMDGUUsage document?

yaxunl added inline comments.Mar 19 2018, 11:16 AM
lib/CodeGen/TargetInfo.cpp
7661 ↗(On Diff #136234)

The first 3 implicit arguments are x, y, z components of global offset.
The 4-th implicit argument is pointer to printf buffer.
The 5-th implicit argument is pointer to virtual queue used by enqueue_kernel.
The 6-th implicit argument is pointer to AqlWrap struct used by enqueue_kernel.
The size and alignment of each implicit argument is 8.

t-tye updated this revision to Diff 139083.Mar 20 2018, 12:38 AM

Address review feedback.

t-tye marked 3 inline comments as done.Mar 20 2018, 12:40 AM
t-tye added inline comments.
lib/CodeGen/TargetInfo.cpp
7661 ↗(On Diff #136234)

Added description to AMDGPUUsage for current 4 implicit arguments.

Will address change for device enqueue as a separate patch.

t-tye retitled this revision from [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU to [AMDGPU] Remove use of OpenCL triple environment and replace with function attribute for AMDGPU (CLANG).Mar 20 2018, 12:07 PM
t-tye added inline comments.Mar 20 2018, 12:13 PM
lib/CodeGen/TargetInfo.cpp
7661 ↗(On Diff #136234)

Posted patches D44696 (CLANG) and D44697 (LLVM) to increase implicit arg size to 48 bytes.

arsenm accepted this revision.Mar 22 2018, 12:48 PM

LGTM

This revision was not accepted when it landed; it landed in state Needs Review.Mar 23 2018, 11:45 AM
This revision was automatically updated to reflect the committed changes.