This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Update Subtarget isWave32 method to ignore the wave32 feature pre-gfx9
Needs RevisionPublic

Authored by skc7 on Apr 12 2023, 9:18 AM.

Details

Reviewers
arsenm
Summary

si-optimize-exec-masking pass is generating saveexec_b32 ops based on isWave32() without checking if target generation > gfx10 targets.
This patch updates isWave32 method to check for gfx10 and above targets and whether subtarget has FeatureWavefrontSize32.

Test case is added to check if si-optimize-exec-masking pass generates S_ANDN2_SAVEEXEC_B32 or S_ANDN2_SAVEEXEC_B64 correctly.

Diff Detail

Event Timeline

skc7 created this revision.Apr 12 2023, 9:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 9:18 AM
skc7 requested review of this revision.Apr 12 2023, 9:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 9:18 AM
arsenm requested changes to this revision.May 19 2023, 10:22 AM

The correct way to address this would be to make isWave32 ignore the wave32 feature pre-gfx9

This revision now requires changes to proceed.May 19 2023, 10:22 AM
skc7 updated this revision to Diff 526416.May 29 2023, 2:24 AM
arsenm requested changes to this revision.May 29 2023, 3:40 AM

If we’re going to do this, no changes should be needed in SIFrameLowering. The wavesize set in the subtarget constructor would be adjusted to consistently apply this everywhere. The prolog code is a very minor piece of this

This revision now requires changes to proceed.May 29 2023, 3:40 AM
skc7 added a comment.May 29 2023, 3:45 AM

If we’re going to do this, no changes should be needed in SIFrameLowering. The wavesize set in the subtarget constructor would be adjusted to consistently apply this everywhere. The prolog code is a very minor piece of this

@arsenm isWave32() method in GCNSubtarget class needs to be updated to ignore wave32 feature pre-gfx9 ?

skc7 updated this revision to Diff 526550.May 30 2023, 2:15 AM
skc7 retitled this revision from [WIP] Check and generate 32 bit saveexec for gfx10 and above targets to [AMDGPU] Update Subtarget isWave32 method to ignore the wave32 feature pre-gfx9.
skc7 edited the summary of this revision. (Show Details)
skc7 updated this revision to Diff 526897.May 30 2023, 9:49 PM
skc7 edited the summary of this revision. (Show Details)

Update test.

skc7 edited the summary of this revision. (Show Details)May 30 2023, 9:49 PM
arsenm requested changes to this revision.Jun 6 2023, 2:15 PM

You can achieve this in the subtarget constructor by setting the wavefrontsize member. This would also be a strategy change vs. the AMDGPURemoveIncompatibleFunctions and doesn't belong in a patch to fix the saveexec combine

llvm/test/CodeGen/AMDGPU/saveexec-xor-optimize.mir
2 ↗(On Diff #526897)

The exec masking issue is not solved by this. The pass itself needs to check whether the 32-bit saveexec operations are legal (i.e. gfx10). It's not directly broken because of wave32

21 ↗(On Diff #526897)

the saveexec isn't supposed to exist in the input

This revision now requires changes to proceed.Jun 6 2023, 2:15 PM