Page MenuHomePhabricator

Add documentation for target ID and ClangOffloadBundlerFormat
Needs ReviewPublic

Authored by yaxunl on Jul 28 2020, 9:08 PM.

Details

Summary

Added clang/docs/ClangOffloadBundlerFileFormat.rst to document
the file format used by clang-offload-bundler for embedding device binaries.

Added documentation for target ID to llvm/docs/AMDGPUUsage.rst

Copied the previous version of AMDGPUUsage.rst to AMDGPUUsage_V3.rst

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
yaxunl added inline comments.Aug 10 2020, 8:09 AM
llvm/docs/AMDGPUUsage.rst
264

xnack and sram-ecc are now set through target ID, and their default value are now changed to "default". As such, the default value of all features are now described in the "Target Features" section instead of in this table.

Can you add the proposed commit message to the phab diff as well? The commit subject does by far not cover changes like "Embedding Bundled Code Objects".

clang/docs/ClangOffloadBundlerFileFormat.rst
89

It is still completely unclear how non AMDGPU targets use this. The description is confusing IMHO. Specifically, the <bundle_entry_id> specifies "-" <processor_or_target_id> but it is unclear what that means for other targets, given that the processor_or_target_id description links to the AMDGPU docs. If you want to say other targets do not use the target_id at all, so there is nothing, please specify say that explicitly. If there is something, we have to say something about that as well, at least that there are alternatives to the AMDGPU way.

yaxunl updated this revision to Diff 284700.Aug 11 2020, 7:03 AM
yaxunl marked an inline comment as done.
yaxunl retitled this revision from Add target ID to AMDGPU documentation to Add documentation for target ID and ClangOffloadBundlerFormat.

revised by Johannes' comments.

yaxunl edited the summary of this revision. (Show Details)Aug 11 2020, 7:10 AM

Can you add the proposed commit message to the phab diff as well? The commit subject does by far not cover changes like "Embedding Bundled Code Objects".

Johannes, did my update address the issue? Any further changes are needed? Thanks.

t-tye added inline comments.Aug 12 2020, 9:57 AM
llvm/docs/AMDGPUUsage.rst
211

This processor does support xnack so removing that from the Features column seems incorrect. That is true for the other processors below.

t-tye requested changes to this revision.Aug 12 2020, 9:57 AM
This revision now requires changes to proceed.Aug 12 2020, 9:57 AM
yaxunl updated this revision to Diff 285183.Aug 12 2020, 2:53 PM

Fix xnack for gfx8

t-tye accepted this revision.Aug 12 2020, 10:01 PM

LGTM

arsenm added inline comments.Aug 13 2020, 11:51 AM
llvm/docs/AMDGPUUsage.rst
386

Default is an extremely confusing name. This should reflect that it's the universally compatible mode. AnyMode? Either? Universal?

yaxunl marked an inline comment as done.Aug 13 2020, 3:19 PM
yaxunl added inline comments.
llvm/docs/AMDGPUUsage.rst
386

Another word I can think of is "General" or "Generic".

@t-tye What do you think? Thanks.

t-tye added inline comments.Aug 13 2020, 6:34 PM
llvm/docs/AMDGPUUsage.rst
386

What about "any". It is short like "on" and "off" and captures what Matt suggests (a truncation of "anymode" which would imply "onmode" and "offmode" which seem a mouthful and no more obvious).

Thoughts?

yaxunl marked an inline comment as done.Aug 14 2020, 9:18 AM
yaxunl added inline comments.
llvm/docs/AMDGPUUsage.rst
386

"Any" sounds perfect. I will update the patch. Thanks.

jdoerfert resigned from this revision.Aug 16 2020, 11:24 AM

I still believe the language is confusing at best. We should not describe generic functionality by linking AMDGPU documentation if this is not the same for all targets. We can say, for AMDGPU this is how is done, but as of now it does not state it this way. To me this reads as if the AMDGPU functionality is the only way these things are used, which is not the case as far as I can tell. Anyway, I get the feeling it might be simpler to go over this in a subsequent patch.

This revision is now accepted and ready to land.Aug 16 2020, 11:24 AM
t-tye added inline comments.Aug 17 2020, 7:51 PM
llvm/docs/AMDGPUUsage.rst
227

Change all occurences of sram-ecc to sramecc to avoid problems that hyphen is used as a bundled code object entry ID separator.

2285

Add "TargetID" attribute that is a string that is the target ID for the module.

7676–7677

<target> -> <target-id>

7680

target ID

7682–7683

and `-mcpu`.

7758–7759

This directive is being removed since the information is now present in the target ID that is provided by the .amdgcn_target directive.

t-tye requested changes to this revision.Aug 17 2020, 7:51 PM
This revision now requires changes to proceed.Aug 17 2020, 7:51 PM
t-tye added inline comments.Aug 17 2020, 8:28 PM
clang/docs/ClangOffloadBundlerFileFormat.rst
12

[ "-" <processor_or_target_id> ]

This is optional as some targets do not support this.

26

(for AMD GPU see :doc:amdgpu-target-ids)

88–90

defined by the target (for AMD GPU see :doc::amdgpu-embedding-bundled-objects).

llvm/docs/AMDGPUUsage.rst
271

We should move the code object V3 into a separate page so we continue to define that ABI since old code objects still exist. Then these changes can define the code object V4.

2285

Or is this:

<target_triple> "-" <target_id>

7676–7677

Or is this:

<target_triple> "-" <target_id>

7680

Or is this:

<target_triple> "-" <target_id>

yaxunl updated this revision to Diff 286397.Aug 18 2020, 2:16 PM
yaxunl marked 15 inline comments as done.

revised by Tony's and Matt's comments

clang/docs/ClangOffloadBundlerFileFormat.rst
12

If a target does not support target ID, <processor_or_target_id> will be processor, so this item is still there.

26

clang documentation is in different URL than llvm documentation.

llvm/docs/AMDGPUUsage.rst
227

done

271

can we do that with a separate patch? This patch already contains too many changes.

2285

This is V2. Are we sure we want to add target ID to V2?

7682–7683

-mcpu is already mentioned by the same sentence

7758–7759

done

t-tye added inline comments.Aug 18 2020, 2:39 PM
clang/docs/ClangOffloadBundlerFileFormat.rst
12

My understanding was that some devices do not need a processor to be specified at all. For example, for x86 isn't the target triple enough by itself? In OpenMP it would simply use a bundled code object entry ID that is "opemp-<target-triple>".

26

So LLVM has a mono-repo but the documentation does not have a single TOC tree? That is a shame and rather defeats the benefits of being in a mono repo.

llvm/docs/AMDGPUUsage.rst
271

My concern is that this patch will obliterate the V3 definition with this V4 one. But I guess you can always reach back to an earlier commit as that is what git is for:-)

2285

Sorry I put the comment in the wrong section. I meant to put it in the V3 (not becoming V4) section.

7682–7683

My suggested replacement was to remove the "and those which specify target features." since those options are being removed. So the "and" needs to go before the -mcpu to become:

Used by the assembler to validate command-line options such as ``-triple`` and ``-mcpu``.

How does one opt out of this scheme?

OpenMP already has a convention for finding code objects in the host binary, used by amdgpu and other targets, which I don't think matches the above.

Where's the corresponding implementation? I think it's the llc part I need to read to understand whether this proposal can be used outside of HIP and amd's OpenCL.

How does one opt out of this scheme?

OpenMP already has a convention for finding code objects in the host binary, used by amdgpu and other targets, which I don't think matches the above.

Where's the corresponding implementation? I think it's the llc part I need to read to understand whether this proposal can be used outside of HIP and amd's OpenCL.

If you do not care about turning on/off xnack and sram-ecc, you do not need to use it. The current processor name still works.

If you need to turn on/off xnack and sram-ecc, you need to use target ID. From user point of view, it is an extension to processor name. Target ID support is implemented in clang driver and backend (https://reviews.llvm.org/D60620). In clang driver, it is supported by clang tool by -mcpu option as an extension for processor name. As far as you pass target ID e.g. gfx906:xnack+ to clang tool by -mcpu, you will get target ID support in backend. My understanding is that since OpenMP only loads one GPU arch, only backend support is needed, therefore target ID support on OpenMP should be automatic.

yaxunl updated this revision to Diff 286552.Aug 19 2020, 6:55 AM
yaxunl marked 5 inline comments as done.

revised by Tony's comments.

clang/docs/ClangOffloadBundlerFileFormat.rst
12

fixed

llvm/docs/AMDGPUUsage.rst
2285

Moved to V3.

7682–7683

Sorry, I misunderstood. Fixed.

t-tye added inline comments.Aug 19 2020, 7:26 PM
clang/docs/ClangOffloadBundlerFileFormat.rst
26

for AMD GPU see

88–90

defined by the target (for AMD GPU see `AMDGPU Embedding Bundled Code Objects

<https://llvm.org/docs/AMDGPUUsage.html#embedding-bundled-objects>`_).
llvm/docs/AMDGPUUsage.rst
90

We should move the code object V3 into a separate page so we continue to define that ABI since old code objects still exist. Then these changes can define the code object V4, with a reference to the pages for previous vesions.

258
*TBA*

.. TODO::
  Add product
  names.
7677–7678

Make +'s match length of above title.

If you do not care about turning on/off xnack and sram-ecc, you do not need to use it. The current processor name still works.

Nice. That makes me much less nervous about this support. Plus the change seems to have gone in last year without breaking openmp. Thanks for the link.

t-tye requested changes to this revision.Aug 20 2020, 11:04 AM
t-tye added inline comments.
llvm/docs/AMDGPUUsage.rst
256–264

Delete as gfx1031 does not support XNACK.

935–938
``EF_AMDGPU_FEATURE_XNACK_NOT_SUPPORTED_V4``   0x0
``EF_AMDGPU_FEATURE_XNACK_ANY_V4 ``            0x1
``EF_AMDGPU_FEATURE_XNACK_OFF_V4 ``            0x2
``EF_AMDGPU_FEATURE_XNACK_ON_V4 ``             0x3
947–950
``EF_AMDGPU_FEATURE_SRAMECC_NOT_SUPPORTED_V4``   0x0
``EF_AMDGPU_FEATURE_SRAMECC_ANY_V4 ``            0x1
``EF_AMDGPU_FEATURE_SRAMECC_OFF_V4 ``            0x2
``EF_AMDGPU_FEATURE_SRAMECC_ON_V4 ``             0x3
This revision now requires changes to proceed.Aug 20 2020, 11:04 AM
yaxunl updated this revision to Diff 289400.Sep 2 2020, 5:16 AM
yaxunl marked 8 inline comments as done.
yaxunl edited the summary of this revision. (Show Details)

revised by Tony's comments.

kzhuravl requested changes to this revision.Sep 2 2020, 11:12 AM
kzhuravl added inline comments.
llvm/docs/AMDGPUUsage.rst
935–938

EF_AMDGPU_FEATURE_XNACK_UNSUPPORTED_V4

947–950

EF_AMDGPU_FEATURE_SRAMECC_UNSUPPORTED_V4

This revision now requires changes to proceed.Sep 2 2020, 11:12 AM
kzhuravl added inline comments.Sep 2 2020, 11:21 AM
llvm/docs/AMDGPUUsage.rst
830

EF_AMDGPU_FEATURE_XNACK_V3

841

EF_AMDGPU_FEATURE_SRAMECC_V3

867

EF_AMDGPU_FEATURE_XNACK_V4

867

EF_AMDGPU_FEATURE_XNACK_V3

871

EF_AMDGPU_FEATURE_SRAMECC_V4

kzhuravl added inline comments.Sep 2 2020, 11:31 AM
llvm/docs/AMDGPUUsage.rst
867

EF_AMDGPU_FEATURE_XNACK_V3 in the previous comment is wrong. It has to be EF_AMDGPU_FEATURE_XNACK_V4

t-tye added inline comments.Sep 3 2020, 8:57 AM
llvm/docs/AMDGPUUsage.rst
817–818

These tables need to be corrected to accurately represent the different layouts for e_flags in the code object versions.

The code object versions are not the same as the ABI versions. It should be made clearer what the relationship is.

Note that the code object V2 had the following definitions:

EF_AMDGPU_FEATURE_XNACK_V2             0x00000001
EF_AMDGPU_FEATURE_TRAP_HANDLER_V2      0x00000002
yaxunl updated this revision to Diff 291075.Sep 10 2020, 1:37 PM
yaxunl marked 9 inline comments as done.

revised by Tony's and Konstantin's comments