This is an archive of the discontinued LLVM Phabricator instance.

[docs]Updated the AMD GPU Attributes documentation
Abandoned, Public

Authored by pooja2299 on May 9 2021, 10:08 AM.

Details

Reviewers

gandhi21299
arsenm
aaron.ballman
xgupta

Summary

Changed the documentation of amdgpu_flat_work_group_size under AMD GPU Attributes, which suggested that the attribute is an optimization hint. As suggested in the bug https://bugs.llvm.org/show_bug.cgi?id=42989, it should be made mandatory.

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	2,120 ms	x64 debian > libarcher.races::lock-unrelated.c

Event Timeline

pooja2299 created this revision. May 9 2021, 10:08 AM

Herald added a reviewer: aaron.ballman. May 9 2021, 10:08 AM

Herald added a subscriber: tpr.

pooja2299 requested review of this revision. May 9 2021, 10:08 AM

Herald added a project: Restricted Project. May 9 2021, 10:08 AM

Herald added a subscriber: cfe-commits.

pooja2299 edited reviewers, added: gandhi21299; removed: aaron.ballman. May 9 2021, 10:14 AM

Herald added a reviewer: aaron.ballman. May 9 2021, 10:14 AM

pooja2299 removed a reviewer: aaron.ballman. May 9 2021, 10:15 AM

Herald added a reviewer: aaron.ballman. May 9 2021, 10:15 AM

Amended the flat_work_group_size section.

pooja2299 edited reviewers, added: arsenm; removed: aaron.ballman. May 9 2021, 10:35 AM

Herald added a reviewer: aaron.ballman. May 9 2021, 10:35 AM

Herald added a subscriber: wdng.

Made some corrections

Harbormaster completed remote builds in B103393: Diff 343920. May 9 2021, 11:23 AM

Minor wordsmithing on the documentation changes, but more importantly: why is the correct fix to the documentation as opposed to changing the default max working group size?

clang/include/clang/Basic/AttrDocs.td
2244–2247

This revision now requires changes to proceed. May 10 2021, 5:39 AM

In D102134#2747649, @aaron.ballman wrote:

Minor wordsmithing on the documentation changes, but more importantly: why is the correct fix to the documentation as opposed to changing the default max working group size?

Hi @aaron.ballman. Thanks for your feedback! I am an Outreachy applicant and totally new to this project. I am currently trying to understand the code base, so I thought I would update the documentation in the meantime. Later on, we can change the default max working group size as you suggest. What do you say: should we change the default max working group size directly rather than the documentation?

In D102134#2751184, @pooja2299 wrote:

In D102134#2747649, @aaron.ballman wrote:

Minor wordsmithing on the documentation changes, but more importantly: why is the correct fix to the documentation as opposed to changing the default max working group size?

Hi @aaron.ballman. Thanks for your feedback! I am an Outreachy applicant and totally new to this project. I am currently trying to understand the code base.

Welcome!

So I thought I would update the documentation in the meantime. Later on, we can change the default max working group size as you suggest. What do you say: should we change the default max working group size directly rather than the documentation?

I'm not an AMD person and so I'm not certain I'm the *best* person to answer this, but my feeling is that this is a case where the implementation should be updated rather than the docs. Otherwise, we're effectively encouraging users to churn their code (add the attribute to places they didn't use it before) with the intention of undoing that in the future. However, I'm hoping someone more familiar with AMDGPU can pipe up with their opinions. @arsenm?

arsenm added inline comments. May 11 2021, 12:02 PM

clang/include/clang/Basic/AttrDocs.td
2244–2247	You're updating this with outdated information. In general, functions should be conservatively correct by default with no attribute specified. This was broken at one point in the past. The default assumed workgroup size is now 1024, but for OpenCL, clang will always default to a max of 256.

I am really not an ideal reviewer for this patch :-) I don't know anything about AMDGPU.

xgupta added a subscriber: xgupta. May 12 2021, 9:52 AM

pooja2299 added inline comments. May 19 2021, 8:31 AM

clang/include/clang/Basic/AttrDocs.td
2244–2247	Ohh. Thanks for your feedback. Will update it.

Herald added a subscriber: foad. May 24 2021, 3:09 AM

xgupta removed a subscriber: xgupta. May 24 2021, 3:09 AM

Closing this issue because the default workgroup size is 1024 now, so no changes are required.
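
As an illustration of the defaults described in the review comments above, the following C sketch models which maximum flat work-group size would be assumed for a kernel. It is a simplified model and not Clang's implementation: the function `assumed_max_flat_work_group_size`, its parameters, the helper `check_assumed_defaults`, and the two constant names are hypothetical, and the values 1024 and 256 come only from the comments in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical constants; the values are the ones quoted in the review. */
enum {
    GENERAL_DEFAULT_MAX = 1024, /* assumed maximum when no attribute is given */
    OPENCL_DEFAULT_MAX = 256    /* maximum the thread says clang uses for OpenCL */
};

/* Hypothetical model: an explicit attribute maximum wins; otherwise the
   assumed maximum depends on whether the kernel is compiled as OpenCL. */
static unsigned assumed_max_flat_work_group_size(bool has_attribute,
                                                 unsigned attribute_max,
                                                 bool is_opencl) {
    if (has_attribute)
        return attribute_max;
    return is_opencl ? OPENCL_DEFAULT_MAX : GENERAL_DEFAULT_MAX;
}

/* Usage example: exercise the three cases the model distinguishes. */
static void check_assumed_defaults(void) {
    assert(assumed_max_flat_work_group_size(false, 0, false) == 1024);
    assert(assumed_max_flat_work_group_size(false, 0, true) == 256);
    assert(assumed_max_flat_work_group_size(true, 512, true) == 512);
}
```

Under this model, a kernel without the attribute is handled conservatively by default, which matches the stated reason for closing the revision.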

Revision Contents

Path	Size
clang/include/clang/Basic/AttrDocs.td	5 lines

Diff 343920

clang/include/clang/Basic/AttrDocs.td

def AMDGPUFlatWorkGroupSizeDocs : Documentation {

let Content = [{

The flat work-group size is the number of work-items in the work-group size

specified when the kernel is dispatched. It is the product of the sizes of the

x, y, and z dimension of the work-group.

Clang supports the

``__attribute__((amdgpu_flat_work_group_size(<min>, <max>)))`` attribute for the

AMDGPU target. This attribute may be attached to a kernel function definition

- and is an optimization hint.

+ and is an optimization hint. It is mandatory to use this attribute in some

+ situations. Because when the attribute is absent, the compiler assumes the

+ default maximum workgroup size of 256 but nowadays the workgroup size can legally go

+ to 1024.

aaron.ballman (Unsubmitted, Not Done)

AMDGPU target. This attribute may be attached to a kernel function definition

and is an optimization hint. It is mandatory to use this attribute in some

- situations. Because when the attribute is absent, the compiler assumes the

- default maximum workgroup size of 256 but nowadays the workgroup size can legally go

+ situations. When the attribute is absent, the compiler assumes the default

+ maximum workgroup size is 256, however the workgroup size can legally go

to 1024.

``<min>`` parameter specifies the minimum flat work-group size, and ``<max>``

arsenm (Unsubmitted, Not Done)

You're updating this with outdated information. In general, functions should be conservatively correct by default with no attribute specified. This was broken at one point in the past. The default assumed workgroup size is now 1024, but for OpenCL, clang will always default to a max of 256.

pooja2299 (Author, Unsubmitted, Done)

Ohh. Thanks for your feedback. Will update it.

``<min>`` parameter specifies the minimum flat work-group size, and ``<max>``

parameter specifies the maximum flat work-group size (must be greater than

``<min>``) to which all dispatches of the kernel will conform. Passing ``0, 0``

as ``<min>, <max>`` implies the default behavior (``128, 256``).

If specified, the AMDGPU target backend might be able to produce better machine

code for barriers and perform scratch promotion by estimating available group

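
The documentation text in this diff defines the flat work-group size as the product of the x, y, and z work-group dimensions, and describes `<min>` and `<max>` as the bounds to which all dispatches of the kernel conform. A minimal C sketch of that arithmetic follows. The helper names `flat_work_group_size` and `dispatch_conforms` are hypothetical and are not part of Clang or any AMDGPU runtime; treating the bounds as inclusive is also an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: the flat work-group size is the product of the
   x, y, and z work-group dimensions. Each dimension is assumed to be >= 1. */
static unsigned flat_work_group_size(unsigned x, unsigned y, unsigned z) {
    assert(x >= 1 && y >= 1 && z >= 1);
    return x * y * z;
}

/* Hypothetical helper: true when a dispatch's flat size lies within the
   bounds a kernel would declare with
   __attribute__((amdgpu_flat_work_group_size(min, max))).
   Inclusive bounds are an assumption of this sketch. */
static bool dispatch_conforms(unsigned flat_size, unsigned min, unsigned max) {
    return flat_size >= min && flat_size <= max;
}
```

For example, a 16 x 16 x 1 work-group has a flat size of 256 and conforms to bounds of 64 and 256, while a 32 x 32 x 1 work-group has a flat size of 1024 and does not.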