This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] gfx11 allow dlc for MUBUF atomics
ClosedPublic

Authored by Petar.Avramovic on Jul 4 2022, 5:50 AM.

Details

Summary

Add MC support for dlc in gfx11 MUBUF atomic instructions.

Diff Detail

Event Timeline

Petar.Avramovic created this revision.Jul 4 2022, 5:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2022, 5:50 AM
Petar.Avramovic requested review of this revision.Jul 4 2022, 5:50 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 4 2022, 5:50 AM
foad added inline comments.Jul 4 2022, 6:09 AM
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
4486

Do I understand correctly: we don't need to check for isGFX9() here because the DLC bit did not exist on GFX9, so CPol & CPol::DLC would always be false?

Petar.Avramovic added inline comments.Jul 4 2022, 6:15 AM
llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
4486

yes...

5962–5965

... gfx9 (and older) check is during parsing.

dp added a comment.Jul 5 2022, 7:47 AM

Nice refactoring!

llvm/lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp
4489

I believe it would be clearer to handle dlc on GFX10 as unsupported rather than ignored because the patch actually disables this modifier.

llvm/lib/Target/AMDGPU/BUFInstructions.td
2074

Maybe remove extra spaces?
Maybe rename this class because a multiclass with the same name is defined below.

2138

Rtn may be better because `IsRtn' assumes a flag.

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
216

I believe the code with dlc=1 is legal for GFX10, and because dlc is ignored, I think we should silently drop it instead of emitting a warning.

It would be nice to have a test for this case.

Petar.Avramovic added inline comments.Jul 5 2022, 8:34 AM
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
216

I don't have strong opinion here and only wanted to match disassembler error out when 1 is in dlc bit.
Assembler/codegen printing dlc in asm string but encoding 0 in dlc bit, seems misleading to me.

Also /llvm/include/llvm/IR/IntrinsicsAMDGPU.td only mentions // cachepolicy(imm; bit 1 = slc) but it is possible to force dlc or even glc on atomic with unused return (selected as no_ret version of the mir atomic).

here is a test:

define amdgpu_ps void @noret(<4 x i32> inreg %rsrc, i32 %data, i32 %vindex) {
main_body:
  %out = call i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32 %data, <4 x i32> %rsrc, i32 %vindex, i32 0, i32 4)
  ret void
}

declare i32 @llvm.amdgcn.raw.buffer.atomic.add.i32(i32, <4 x i32>, i32, i32, i32)

but overall I think that this warning should not be reachable, and that could be avoided by checking value in cache policy operand from IR atomic. I don't know what would be the best place to implement such check. I think that Matt suggested IR verifier in some discussion.

dp added inline comments.Jul 5 2022, 10:08 AM
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUInstPrinter.cpp
216

Assembler/codegen printing dlc in asm string but encoding 0 in dlc bit, seems misleading to me.

I do not understand how dlc may be printed. I proposed to correct AMDGPUInstPrinter::printCPol so that it prints neither dlc nor a warning.

As Matt suggested, you could also correct SIInstrInfo::verifyInstruction though I'm not sure this is really necessary.

It would be nice to have a test for this case.

I mean that adding a disassembler test with dlc=1 would be useful.

Petar.Avramovic edited the summary of this revision. (Show Details)

Remove gfx10 edits.

dp accepted this revision.Jul 25 2022, 1:30 PM

LGTM

This revision is now accepted and ready to land.Jul 25 2022, 1:30 PM
This revision was landed with ongoing or failed builds.Aug 1 2022, 3:18 AM
This revision was automatically updated to reflect the committed changes.