This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Don't select _SGPR forms of SMEM instructions on GFX9+
ClosedPublic

Authored by foad on Mar 31 2023, 9:05 AM.

Details

Summary

On GFX9+, SMEM instructions have an _SGPR_IMM form which is strictly
more powerful than the _SGPR form. It simplifies codegen if we always
select the _SGPR_IMM form with an immediate offset of 0 instead of the
_SGPR form.

Note that this patch just makes minimal changes to the selection
patterns to prove the concept. Further simplifications are possible to
reduced the number of selection patterns.

On GFX9 the _SGPR form of the Real instruction is still required for
assembly/disassembly but on GFX10+ it can be removed completely.

Diff Detail

Event Timeline

foad created this revision.Mar 31 2023, 9:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2023, 9:05 AM
foad requested review of this revision.Mar 31 2023, 9:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 31 2023, 9:05 AM
arsenm accepted this revision.Apr 3 2023, 7:00 AM

You're suggesting there still is a case pre-gfx9 where we need different instructions? I thought the immediate field was always just available

This revision is now accepted and ready to land.Apr 3 2023, 7:00 AM
foad added a comment.Apr 10 2023, 1:38 AM

You're suggesting there still is a case pre-gfx9 where we need different instructions? I thought the immediate field was always just available

Pre-GFX9 the "offset" field was either an immediate or an SGPR number. You couldn't have both.

This revision was landed with ongoing or failed builds.Apr 17 2023, 8:23 AM
This revision was automatically updated to reflect the committed changes.

Are there plans to select _SGPR_IMM instead of _IMM variants on GFX9+ as well?

foad added a comment.Apr 25 2023, 7:15 AM

Are there plans to select _SGPR_IMM instead of _IMM variants on GFX9+ as well?

I'm not planning to do that. I guess you mean by setting soffset to an immediate zero operand instead of a register when you don't need the register offset? That might work but I don't know what the tradeoffs are.