This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Annotate functions with inline asm using agprs
AbandonedPublic

Authored by rampitec on Jan 11 2022, 2:38 PM.

Details

Reviewers
arsenm
Summary

This is needed for future patch. It is possible to allocate all
VGPRs and use MFMA with VGPRs if we have less then 257 registers
and have no calls, but we need to make sure AGPRs are not used
by an inline asm. It is done as an annotation because the way to
use MFMA with VGPRs is to select a proper opcode, so it shall be
known by the time of selection.

Diff Detail

Event Timeline

rampitec created this revision.Jan 11 2022, 2:38 PM
rampitec requested review of this revision.Jan 11 2022, 2:38 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 11 2022, 2:38 PM
Herald added a subscriber: wdng. · View Herald Transcript
arsenm requested changes to this revision.Jan 11 2022, 2:45 PM

Why does this specifically need to check for inline asm? Is this only checked in the kernel or functions too?

I'm trying to delete this pass and don't want to add more stuff to it. If it's just for kernels can't you check for AGPR physreg uses later?

This revision now requires changes to proceed.Jan 11 2022, 2:45 PM

Why does this specifically need to check for inline asm? Is this only checked in the kernel or functions too?

I'm trying to delete this pass and don't want to add more stuff to it. If it's just for kernels can't you check for AGPR physreg uses later?

There two ways of getting AGPRs: use an MFMA and inline asm. For MFMA if we fit into register budget of 256 registers we can select _vgprcd versions and have no AGPRs, but that is not possible to do if there will be other AGPR uses.
This is too late to check for AGPR uses after selection, because it is needed for the selection.

I guess I can run this code in the AMDGPUDAGToDAGISel::runOnMachineFunction or even SIMachineFunctionInfo constructor and skip the attribution.

I guess I can run this code in the AMDGPUDAGToDAGISel::runOnMachineFunction or even SIMachineFunctionInfo constructor and skip the attribution.

The downside is that I have to scan all instructions for that, and AMDGPUAnnotateKernelFeatures already does that. Where are you planning to move its code?

rampitec abandoned this revision.Jan 11 2022, 4:37 PM

I guess I can run this code in the AMDGPUDAGToDAGISel::runOnMachineFunction or even SIMachineFunctionInfo constructor and skip the attribution.

The downside is that I have to scan all instructions for that, and AMDGPUAnnotateKernelFeatures already does that. Where are you planning to move its code?

AMDGPUAttributor. If this were to be an attribute, which is pretty ugly, it should be the inverse. Why can't you just select to AGPRs, and later we can adjust the register classes if necessary?

I guess I can run this code in the AMDGPUDAGToDAGISel::runOnMachineFunction or even SIMachineFunctionInfo constructor and skip the attribution.

The downside is that I have to scan all instructions for that, and AMDGPUAnnotateKernelFeatures already does that. Where are you planning to move its code?

AMDGPUAttributor. If this were to be an attribute, which is pretty ugly, it should be the inverse. Why can't you just select to AGPRs, and later we can adjust the register classes if necessary?

Because it means changing instructions, which is untrivial in some cases. I'd better select it right.

Anyhow, I have moved the code, it takes a yet another scan over the instructions which has to happen somewhere anyway. If we have a better scan place in the future it will be easy to do, attribute or not.

An attribute may have an additional benefit to allow allocation shift to happen in functions as well, but I am not sure how practically interesting is it to use a wave wide instruction in a function.

That said, I will still need a parent of this change to work properly because otherwise we still have misrepresented RC for inline asm.