This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Update Subtarget isWave32 method to ignore the wave32 feature pre-gfx9
Needs RevisionPublic

Authored by skc7 on Apr 12 2023, 9:18 AM.

Details

Reviewers
arsenm
Summary

si-optimize-exec-masking pass is generating saveexec_b32 ops based on isWave32() without checking if target generation > gfx10 targets.
This patch updates isWave32 method to check for gfx10 and above targets and whether subtarget has FeatureWavefrontSize32.

Test case is added to check if si-optimize-exec-masking pass generates S_ANDN2_SAVEEXEC_B32 or S_ANDN2_SAVEEXEC_B64 correctly.

Diff Detail

Event Timeline

skc7 created this revision.Apr 12 2023, 9:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 9:18 AM
skc7 requested review of this revision.Apr 12 2023, 9:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 9:18 AM
arsenm requested changes to this revision.May 19 2023, 10:22 AM

The correct way to address this would be to make isWave32 ignore the wave32 feature pre-gfx9

This revision now requires changes to proceed.May 19 2023, 10:22 AM
skc7 updated this revision to Diff 526416.May 29 2023, 2:24 AM
arsenm requested changes to this revision.May 29 2023, 3:40 AM

If we’re going to do this, no changes should be needed in SIFrameLowering. The wavesize set in the subtarget constructor would be adjusted to consistently apply this everywhere. The prolog code is a very minor piece of this

This revision now requires changes to proceed.May 29 2023, 3:40 AM
skc7 added a comment.May 29 2023, 3:45 AM

If we’re going to do this, no changes should be needed in SIFrameLowering. The wavesize set in the subtarget constructor would be adjusted to consistently apply this everywhere. The prolog code is a very minor piece of this

@arsenm isWave32() method in GCNSubtarget class needs to be updated to ignore wave32 feature pre-gfx9 ?

skc7 updated this revision to Diff 526550.May 30 2023, 2:15 AM
skc7 retitled this revision from [WIP] Check and generate 32 bit saveexec for gfx10 and above targets to [AMDGPU] Update Subtarget isWave32 method to ignore the wave32 feature pre-gfx9.
skc7 edited the summary of this revision. (Show Details)
skc7 updated this revision to Diff 526897.May 30 2023, 9:49 PM
skc7 edited the summary of this revision. (Show Details)

Update test.

skc7 edited the summary of this revision. (Show Details)May 30 2023, 9:49 PM
arsenm requested changes to this revision.Jun 6 2023, 2:15 PM

You can achieve this in the subtarget constructor by setting the wavefrontsize member. This would also be a strategy change vs. the AMDGPURemoveIncompatibleFunctions and doesn't belong in a patch to fix the saveexec combine

llvm/test/CodeGen/AMDGPU/saveexec-xor-optimize.mir
2

The exec masking issue is not solved by this. The pass itself needs to check whether the 32-bit saveexec operations are legal (i.e. gfx10). It's not directly broken because of wave32

21

the saveexec isn't supposed to exist in the input

This revision now requires changes to proceed.Jun 6 2023, 2:15 PM