This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU - Use MUBUF instructions for global address space access
ClosedPublic

Authored by pvellien on Dec 2 2020, 7:07 AM.

Details

Summary

Currently, the compiler crashes in instruction selection of global load/stores in gfx600 due to the lack of FLAT instructions. This patch fix the crash by selecting MUBUF instructions for global load/stores in gfx600.

Diff Detail

Event Timeline

pvellien created this revision.Dec 2 2020, 7:07 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 2 2020, 7:07 AM
pvellien requested review of this revision.Dec 2 2020, 7:07 AM
t-tye requested changes to this revision.Dec 2 2020, 11:36 AM
t-tye added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

Is this asking if the code contains use of the generic address space?

208

This seems a very strange way for carrying this knowledge to the later code. Why not use a separate variable with a clear name?

llvm/test/CodeGen/AMDGPU/si-global-buffer.ll
1–18 ↗(On Diff #308957)

I think it would be better to add a run line to existing memory-legalizer tests rather than have a separate test.

gfx600 now supports the memory model and so can now be tested by the memory-legalizer tests.

This revision now requires changes to proceed.Dec 2 2020, 11:36 AM
arsenm added inline comments.Dec 2 2020, 7:38 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
144–146

This isn't a generation check, and should be the same as the above addr64 check

146

This string search isn't actually a reliable test but this is already used

147

I'm not sure printing a warning here is particularly helpful. It's not going to be reported in any meaningful way. A proper Diagnostic would be better, but I don't think this would be the right place to emit it.

Comments inline

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
144–146

I'm not sure I follow your comment, this patch uses MUBUF instructions only for gfx60x. The addr64 check returns true for both southern_island and sea_island processors. I think if we conditionally check only for hasAddr64, that may not be enough to handle the case where (FlatForGlobal = false, for SOUTHERN_ISLANDS and FlatForGlobal = true, for SEA_ISLANDS).

147

I will remove this check and warning here.

207

No, this isn't a check for generic address space. The GPU string is set to a string "generic" by default when no target cpu is specified. It has nothing to do with address spaces access.
when OS is HSA it is set to "generic-hsa" and in all other cases it is set to "generic".

208

In the previous version for this piece of code, the Gen member of this class is initialized to sea_islands when OS is HSA and southern_islands for all other cases and then the requested 'gen' is picked by the tablegen generated function "ParseSubtargetFeatures" in initializeSubtargetDependencies.

The above mentioned behavior is not correct, because it initializes the Gen is sea_islands for "-mtriple=amdgcn-amd-amdhsa -mcpu=gfx600" where the correct "Gen" would be southern islands.

This piece of change fixes this bug, and also preserves the existing functionality. I don't think we need a separate variable since this is the only place we are handling. please correct me if I'm wrong.

fixed a typo in comments.

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
208

Sorry there is a typo, in the second passage it should be:

" The above mentioned behavior is not correct, because it initializes the Gen to
sea_islands for "-mtriple=amdgcn-amd-amdhsa -mcpu=gfx600" when the correct
'Gen' would be southern islands."

pvellien updated this revision to Diff 309889.EditedDec 7 2020, 5:46 AM

Removed error reporting based on string comparison. Updated the memory legalizer tests to include amdhsa for gfx60x

rampitec added inline comments.Dec 7 2020, 1:03 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

I would probably just error out if we have a generic CPU + amdhsa requested. It seems cleaner.

pvellien added inline comments.Dec 8 2020, 10:53 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

Thanks for your comment, this patch preserves the earlier functionality except the use of mubuf. Perhaps we can address your suggestion in a separate patch with some added context, and exploring the possibilities of reporting the error in other target-triple, cpu pairs?

rampitec added inline comments.Dec 8 2020, 12:16 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

It is OK to have it in a separate commit, but I do not follow logic here. You are setting SI for any NON-generic chip.

pvellien added inline comments.Dec 10 2020, 11:24 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

For a non-generic cpu, the correct Gen member is set by the ParseSubtargetFeatures function call (generated by tablegen) in initializeSubtargetDependencies function. It is similar to the previous version of this code, where Gen is set to SI if OS is not HSA and set to CI if OS is HSA and correct Gen is set by the call to ParseSubtargetFeatures function.

rampitec added inline comments.Dec 10 2020, 11:30 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

Then this line is a no-op? You say: "for a non-generic CPU gen is initialized elsewhere". An this change only forces SI for a non-generic CPU.

pvellien added inline comments.Dec 11 2020, 10:08 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

Actually there are bunch of lit tests which assumes Gen as CI if cpu isn't specified with AMDHSA OS. This change just means for HSA OS if the cpu isn't specified it initializes to CI, for non-HSA if cpu isn't specified it initializes to SI.
Do I want to change that if cpu isn't specified, then all mtriple assumes to SI?

rampitec added inline comments.Dec 11 2020, 11:42 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

I still do not understand how does that correspond to the condition. That is the old condition:

TT.getOS() == Triple::AMDHSA ? SEA_ISLANDS : SOUTHERN_ISLANDS

And that is literally what you have described: "This change just means for HSA OS if the cpu isn't specified it initializes to CI, for non-HSA if cpu isn't specified it initializes to SI".

At the same time this code:

!GPU.contains("generic") ? SOUTHERN_ISLANDS

Reads: use SI for any CPU besides generic regardless of the triple.

Comments inline

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
207

Sorry for the delay. I think I didn't explain it very well in my previous comments. So I take an attempt to describe here:
The problem is that gfx600 processors does not support flat instructions for global load/stores. But the compiler tries to use flat, which result in a crash in instruction selection. The root cause of this problem is due to the initialization in the code:

TT.getOS() == Triple::AMDHSA ? SEA_ISLANDS : SOUTHERN_ISLANDS

for HSA OS type and -mcpu=gfx600 the above code forces Gen member variable to be SEA_ISLANDS where it should have been SOUTHERN_ISLANDS. This bug only shows up when you try to use HSA OS for -mcpu=gfx600 processors.

What I'm trying to do is initialize the Gen member variable in following cases:
when OS is HSA

if mcpu is specified, set Gen to the correct generation that cpu belongs to, by initializing it with SOUTHERN_ISLANDS.
if mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to the SEA_LANDS to be in parity with previous code

I didn't see how I can pick up the correct generation that the processor belong to without forcing this in constructor. I would like to know you thoughts on this?
Please let me know if there is any other way to solve this?

if mcpu is specified, set Gen to the correct generation that cpu belongs to, by initializing it with SOUTHERN_ISLANDS.
if mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to the SEA_LANDS to be in parity with previous code> Comments inline

You are forcing SI for any SPECIFIED GPU instead:

!GPU.contains("generic") ? SOUTHERN_ISLANDS

Read this code line: if CPU is NOT generic, force it to SI.

if mcpu is specified, set Gen to the correct generation that cpu belongs to, by initializing it with SOUTHERN_ISLANDS.
if mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to the SEA_LANDS to be in parity with previous code> Comments inline

You are forcing SI for any SPECIFIED GPU instead:

!GPU.contains("generic") ? SOUTHERN_ISLANDS

Read this code line: if CPU is NOT generic, force it to SI.

Ugh. I see what you mean now, it took debugger to understand. The real value of Gen is assigned below in the GCNSubtarget::initializeSubtargetDependencies().

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

if mcpu is specified, set Gen to the correct generation that cpu belongs to, by initializing it with SOUTHERN_ISLANDS.
if mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to the SEA_LANDS to be in parity with previous code> Comments inline

You are forcing SI for any SPECIFIED GPU instead:

!GPU.contains("generic") ? SOUTHERN_ISLANDS

Read this code line: if CPU is NOT generic, force it to SI.

Ugh. I see what you mean now, it took debugger to understand. The real value of Gen is assigned below in the GCNSubtarget::initializeSubtargetDependencies().

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

if mcpu is specified, set Gen to the correct generation that cpu belongs to, by initializing it with SOUTHERN_ISLANDS.
if mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to the SEA_LANDS to be in parity with previous code> Comments inline

You are forcing SI for any SPECIFIED GPU instead:

!GPU.contains("generic") ? SOUTHERN_ISLANDS

Read this code line: if CPU is NOT generic, force it to SI.

Ugh. I see what you mean now, it took debugger to understand. The real value of Gen is assigned below in the GCNSubtarget::initializeSubtargetDependencies().

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA

if mcpu is specified, set Gen to the correct generation that cpu belongs to, by initializing it with SOUTHERN_ISLANDS.
if mcpu is not specified (in this case GPU = "generic-hsa"), set Gen to the SEA_LANDS to be in parity with previous code> Comments inline

You are forcing SI for any SPECIFIED GPU instead:

!GPU.contains("generic") ? SOUTHERN_ISLANDS

Read this code line: if CPU is NOT generic, force it to SI.

Ugh. I see what you mean now, it took debugger to understand. The real value of Gen is assigned below in the GCNSubtarget::initializeSubtargetDependencies().

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

rampitec requested changes to this revision.Dec 18 2020, 12:16 PM

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

This revision now requires changes to proceed.Dec 18 2020, 12:16 PM

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

At one point, while you were on vacation, the consensus was "support amdhsa OS with gfx6, and fail to compile in the presence of e.g. generic pointers".

Has there been more discussion since then?

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

At one point, while you were on vacation, the consensus was "support amdhsa OS with gfx6, and fail to compile in the presence of e.g. generic pointers".

Has there been more discussion since then?

I see that is not what we document. @t-tye whatis your opinion?

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

At one point, while you were on vacation, the consensus was "support amdhsa OS with gfx6, and fail to compile in the presence of e.g. generic pointers".

Has there been more discussion since then?

Yeah, support amdhsa OS

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

At one point, while you were on vacation, the consensus was "support amdhsa OS with gfx6, and fail to compile in the presence of e.g. generic pointers".

Has there been more discussion since then?

I see that is not what we document. @t-tye whatis your opinion?

@scott.linder there is no discussion since then.
@rampitec the final consensus is to use buffer instructions for global address space accesses in SI in presence of amdhsa. In other OS types for SI, backend generates buffer instructions for global addr space access already.

t-tye added a comment.Dec 21 2020, 2:35 PM

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

At one point, while you were on vacation, the consensus was "support amdhsa OS with gfx6, and fail to compile in the presence of e.g. generic pointers".

Has there been more discussion since then?

Yeah, support amdhsa OS

But I still do not think this is a right thing to do to accept amdhsa on SI, even if you turn off flat instructions. SI does not support HSA.

As far as I the support of HSA in SI processors. Tony suggested that gfx60x processors should support AMDHSA OS because it is capable of supporting OpenCL 1.2 which not need generic pointers. This topic was discussed in amdgcn weekly and using MUBUF instructions for global address space in gfx60x is decided. The same is also documented in AMDGPU User Guide earlier.

But it is not. In fact AMDGPUUsage.rst says that SI does NOT support amdhsa and it is only supported starting from gfx700:

**GCN GFX6 (Southern Islands (SI))** [AMD-GCN-GFX6]_
-----------------------------------------------------------------------------------------------------------------------
``gfx600``  - ``tahiti``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
                                                                   support
                                                                   generic
                                                                   address
                                                                   space
``gfx601``  - ``pitcairn``  ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``verde``                                            support
                                                                   generic
                                                                   address
                                                                   space
``gfx602``  - ``hainan``    ``amdgcn``   dGPU                    - Does not      - *pal-amdpal*
            - ``oland``                                            support
                                                                   generic
                                                                   address
                                                                   space

It is only amdpal listed here.

What was discussed is that we can use buffer instructions for global on SI even if amdhsa is not supported.

At one point, while you were on vacation, the consensus was "support amdhsa OS with gfx6, and fail to compile in the presence of e.g. generic pointers".

Has there been more discussion since then?

I see that is not what we document. @t-tye whatis your opinion?

@scott.linder there is no discussion since then.
@rampitec the final consensus is to use buffer instructions for global address space accesses in SI in presence of amdhsa. In other OS types for SI, backend generates buffer instructions for global addr space access already.

I updated AMDGPUUsage to reflect the current state which is that gfx6* does not support amdhsa. But if this patch allows gfx6* to be supported without generic addresses then that table should also be updated as part of the patch.

We did discuss this further and decided that gfx6* should report it supports amdhsa with the restriction that it does not support generic addresses. AMDGPUUsage already notes the restriction in all relevant sections. The LIT tests would also need updating accordingly. The COMgr tests would also need updating.

The alternative would be that the compiler reports an error message for gfx6* if the triple specified amdhsa (rather than crash as it does currently). But it was felt that the amdhsa ABI can allow targets that do not support generic to be included provided it is documented. There is no need to artificially restrict them as a user may want to try using the target on an HSA complaint runtime to implement OpenCL 1.2.

I updated AMDGPUUsage to reflect the current state which is that gfx6* does not support amdhsa. But if this patch allows gfx6* to be supported without generic addresses then that table should also be updated as part of the patch.

It does not allow for generic pointers to be used on SI.

We did discuss this further and decided that gfx6* should report it supports amdhsa with the restriction that it does not support generic addresses. AMDGPUUsage already notes the restriction in all relevant sections. The LIT tests would also need updating accordingly. The COMgr tests would also need updating.

AMDGPUUsage does not list amdhsa for all 3 SI targets, yet it is going to be accepted with this patch. If that is so then amdhsa has to be listed in the table for these targets.

The alternative would be that the compiler reports an error message for gfx6* if the triple specified amdhsa (rather than crash as it does currently). But it was felt that the amdhsa ABI can allow targets that do not support generic to be included provided it is documented. There is no need to artificially restrict them as a user may want to try using the target on an HSA complaint runtime to implement OpenCL 1.2.

t-tye added a comment.Dec 21 2020, 2:46 PM

We did discuss this further and decided that gfx6* should report it supports amdhsa with the restriction that it does not support generic addresses. AMDGPUUsage already notes the restriction in all relevant sections. The LIT tests would also need updating accordingly. The COMgr tests would also need updating.

AMDGPUUsage does not list amdhsa for all 3 SI targets, yet it is going to be accepted with this patch. If that is so then amdhsa has to be listed in the table for these targets.

The table is indicating what runtimes support which ABIs. Since ROCm does not support gfx6* it would still be correct to not list it in the table. It may be useful to have another column to indicate which ABIs are supported by which processors.

We did discuss this further and decided that gfx6* should report it supports amdhsa with the restriction that it does not support generic addresses. AMDGPUUsage already notes the restriction in all relevant sections. The LIT tests would also need updating accordingly. The COMgr tests would also need updating.

AMDGPUUsage does not list amdhsa for all 3 SI targets, yet it is going to be accepted with this patch. If that is so then amdhsa has to be listed in the table for these targets.

The table is indicating what runtimes support which ABIs. Since ROCm does not support gfx6* it would still be correct to not list it in the table. It may be useful to have another column to indicate which ABIs are supported by which processors.

Pfff... I do not like it, but OK.

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
208

Please at least remove the parenthesis and reformat this statement.

t-tye added a comment.Dec 21 2020, 3:52 PM

We did discuss this further and decided that gfx6* should report it supports amdhsa with the restriction that it does not support generic addresses. AMDGPUUsage already notes the restriction in all relevant sections. The LIT tests would also need updating accordingly. The COMgr tests would also need updating.

AMDGPUUsage does not list amdhsa for all 3 SI targets, yet it is going to be accepted with this patch. If that is so then amdhsa has to be listed in the table for these targets.

The table is indicating what runtimes support which ABIs. Since ROCm does not support gfx6* it would still be correct to not list it in the table. It may be useful to have another column to indicate which ABIs are supported by which processors.

Pfff... I do not like it, but OK.

I am not completely happy with it either, but wanted to address questions of which runtimes support which processors. So I looked at what ROCr, VDI and PAL runtimes currently support and added that to the column that already existed but was woefully out of date and did not include PAL information. MESA information is also still missing.

scott.linder added inline comments.Dec 22 2020, 10:08 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
208

I think I finally understand what this is doing, but I still think it is worth trying this another way as it seems to have confused each of Stas, Tony, and I to some extent. I don't think this is Praveen's doing, the underlying mechanisms we use for this are just not very intuitive.

What I'd suggest is adding a new INVALID = 0 variant to enum AMDGPUSubtarget::Generation, and changing this initializer to Gen(INVALID).

Then in GCNSubtarget::initializeSubtargetDependencies, after the call to ParseSubtargetFeatures, check for (Gen == INVALID), and implement whatever "generic" CPU stuff we want.

I'd also accompany both places (the enum and the code after ParseSubtargetFeatures) with a summary of the situation in a comment. Maybe:

  class AMDGPUSubtarget {
  public:
    enum Generation {
~     // INVALID is only used as an initial value before `ParseSubtargetFeatures`,
~     // and as a way to detect the "generic" processor case.
~     INVALID,
~     R600,
~     R700,
~     EVERGREEN,
~     NORTHERN_ISLANDS,
~     SOUTHERN_ISLANDS,
~     SEA_ISLANDS,
+     VOLCANIC_ISLANDS,
+     GFX9,
+     GFX10
    };
     ParseSubtargetFeatures(GPU, /*TuneCPU*/ GPU, FullFS);

+   // Implement the "generic" processors, which act as the default when no
+   // generation features are enabled (e.g. for -mcpu=''). There are two variants
+   // of the generic processor, one for HSA and one for everything else.
+   if (Gen == INVALID) {
+     Gen = TT.getOS() == Triple::AMDHSA ? SEA_ISLANDS : SOUTHERN_ISLANDS;
+   }

Thoughts? If this doesn't seem appreciably better to everyone I'm also OK with the patch as-is.

As a separate issue, which we can discuss and address in a separate patch if we choose to do anything, is why we need/want the "generic" processor to actually be two different processors, with the decision being based on the OS. Tony suggested, and I think I agree, that we should probably just pick one generation that is the "lowest common denominator" that supports the major things we need. As the flat address space is pretty fundamental why not just have this be SEA_ISLANDS? Then we just have:

if (Gen == INVALID) {
  Gen = SEA_ISLANDS;
}

Doing this breaks about a dozen lit tests, which I assume can all be fixed by either adding -mcpu=gfx600 (i.e. they no longer test "generic") or updating some check lines.

I like what Scott describes as it seems intuitively obvious and cleaner. Are there any concerns for making these changes?

I like what Scott describes as it seems intuitively obvious and cleaner. Are there any concerns for making these changes?

I concur.

pvellien updated this revision to Diff 313414.Dec 22 2020, 12:42 PM

Updated the code according to scott's suggestion. And I tried to compile sample program with all possible combinations of processors and ABIs( amdhsa, mesa3d, amdpal), it seems like the backend supports all the ABIs for all processors. So Instead of adding a separate column, (since values of processors will be the same) I added a text pharse above the processor table which documents. Is it right thing to do so?

thanks

rampitec accepted this revision.Dec 22 2020, 12:45 PM

LGTM, but please check with Tony for the documentation.

t-tye added inline comments.Dec 22 2020, 1:05 PM
llvm/docs/AMDGPUUsage.rst
105

I suggest:

<p>Every processor supports every OS ABI (see :ref:`amdgpu-os`) with the following exceptions:

* `amdhsa` is not supported on the `r600` architecture (see :ref:`amdgpu-architecture-table`).
pvellien updated this revision to Diff 313552.Dec 23 2020, 7:07 AM
pvellien marked an inline comment as done.

Updated the AMDGPUUsage doc. Also I don't have commit access, please commit this patch on my behalf.
Thanks a lot your time.

t-tye added inline comments.Dec 23 2020, 8:09 AM
llvm/docs/AMDGPUUsage.rst
108

Need blank line before start of list for reStructure to format it correctly.

pvellien updated this revision to Diff 313568.Dec 23 2020, 9:42 AM

added a blank line in amdgpuusage.rst file.

t-tye added inline comments.Dec 23 2020, 10:04 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
94

Seems +flat-for-global should not be added if SI. Suggest pulling up later logic to here and combine.

116–117

Should we consider using AMDGPUSubtarget::SEA_ISLANDS as the "generic" processor for all OSs? That can be a separate patch.

124–143

Would seem better to combine these? Perhaps move this up to be adjacent or combined with code above that sets flat-for-global.

pvellien added inline comments.Dec 23 2020, 10:22 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
94

But the gen member variable is set to correct generation only after the call to parsesubtargetfeatures function, so it would be incorrect to use getGeneration function in this place. I think it would be better to conditionally select later.

t-tye added inline comments.Dec 23 2020, 10:58 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
94

But could that earlier code be using some other means to determine if the target supports flat operations? Is there no target property for that?

If not then leaving "as is" is fine with me. Just would be nice if the logic for each aspect was logically organized together rather than spread over different places making it hard to figure out what is happening.

If this is fine, place land. I don't have commit access yet.

llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
94

As far as I know there is no target property for that. I agree that it would be very much clear to have a common place for this mechanism

scott.linder added inline comments.Dec 23 2020, 11:37 AM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
94

There is FeatureFlatAddressSpace, but it will also only be set as part of ParseSubtargetFeatures.

I'm not sure why this needs to be here, unless some parts of ParseSubtargetFeatures depend on these features being set or not? Otherwise this could all be done below in one place.

scott.linder added inline comments.Dec 23 2020, 1:28 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
94

I suppose the issue is that putting these after the call to ParseSubtargetFeatures makes allowing it to be overridden by the user messier?

It seems like it might be nice to just update TableGen to also maintain a bitmap of explicitly set features. Then we can avoid any string manipulation here (all of these things can move below the call, and can just make the default conditional on ExplicitySet.test(FeatureFoo)).

That way everyone has a uniform "escape hatch" for implementing things like mutual exclusion, or more complicated "implies" relationships, without all the string manipulation and comparison.

pvellien added inline comments.Dec 23 2020, 10:45 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
94

It seems good to maintain a separate bitmap in tablegen but that should be separate patch since it is a refactor and applies to all features.

t-tye added inline comments.Dec 23 2020, 10:51 PM
llvm/lib/Target/AMDGPU/AMDGPUSubtarget.cpp
92–93

Suggest:

// Turn on features that HSA ABI requires. Also turn on FlatForGlobal by
// default.
if (isAmdHsaOS())
98–106

I do not understand how this is supposed to work. The code does not seem to match the comment. The parser seems to convert all these features into a single WavefrontSizeLog2 so last + one would win. So why is this even needed?

112–114

Suggest:

// Implement the "generic" processors, which acts as the default when no
// generation features are enabled (e.g for -mcpu=''). HSA OS defaults to the
// first amdgcn target that supports flat addressing. Other OSes default to
// the first amdgcn target.
123–143

Suggest:

// Targets must either support 64-bit offsets for MUBUF instructions, and/or
// support flat operations, otherwise they cannot access a 64-bit global
// address space.
assert(hasAddr64() || hasFlat());
// Unless +-flat-for-global is specified, turn on FlatForGlobal for targets
// that do not support ADDR64 variants of MUBUF instructions. Such targets
// cannot use a 64 bit offset with a MUBUF instruction to access the global
// address space.
if (!hasAddr64() && !FS.contains("flat-for-global") && !FlatForGlobal) {
  ToggleFeature(AMDGPU::FeatureFlatForGlobal);
  FlatForGlobal = true;
}
// Unless +-flat-for-global is specified, use MUBUF instructions for global
// address space access if flat operations are not available.
if (!hasFlat() && !FS.contains("flat-for-global") && FlatForGlobal) {
  ToggleFeature(AMDGPU::FeatureFlatForGlobal);
  FlatForGlobal = false;
}

Also add the following to llvm/lib/Target/AMDGPU/AMDGPUSubtarget.h since that is the way other similar predicates are defined:

bool hasFlat() const {
  return (getGeneration() > AMDGPUSubtarget::SOUTHERN_ISLANDS);
}
pvellien updated this revision to Diff 313661.Dec 24 2020, 12:07 AM

Included the suggestions given by tony.

t-tye accepted this revision.Dec 24 2020, 12:10 AM

LGTM Thanks.

This revision is now accepted and ready to land.Dec 24 2020, 12:10 AM
pvellien updated this revision to Diff 313665.Dec 24 2020, 12:18 AM

Fixed a small typo in comments. Nothing else is changed from prior diff.

t-tye accepted this revision.Dec 24 2020, 12:19 AM
This comment was removed by pvellien.
This revision was automatically updated to reflect the committed changes.