This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Turn off force init 16 input SGPRS for pal
ClosedPublic

Authored by dstuttard on Jul 12 2022, 8:16 AM.

Details

Summary

Pal uses a different mechanism for user sgprs.

Diff Detail

Event Timeline

dstuttard created this revision.Jul 12 2022, 8:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2022, 8:16 AM
dstuttard requested review of this revision.Jul 12 2022, 8:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2022, 8:16 AM

I don't see why PAL would get special treatment here. I think the problem is we don't really try to account for the formal arguments in the user SGPR count. A better fix would be to track that accurately

I don't see why PAL would get special treatment here. I think the problem is we don't really try to account for the formal arguments in the user SGPR count. A better fix would be to track that accurately

Pal uses a completely different approach where the front end does all the setup for the user sgprs and passes the info in via pal metadata. The backend can override/modify this, but for user data sgprs is generally doesn't. Moving the pal shaders to use the same model would be a fairly big piece of work, and would mean adding quite a lot of extra code to the backend that's specific to pal.

I don't see why PAL would get special treatment here. I think the problem is we don't really try to account for the formal arguments in the user SGPR count. A better fix would be to track that accurately

Pal uses a completely different approach where the front end does all the setup for the user sgprs and passes the info in via pal metadata. The backend can override/modify this, but for user data sgprs is generally doesn't. Moving the pal shaders to use the same model would be a fairly big piece of work, and would mean adding quite a lot of extra code to the backend that's specific to pal.

I'm not suggesting moving it into the same model, but the backend should figure out the user SGPR count given the formal argument list in the IR

tpr added a comment.Jul 12 2022, 10:47 AM

I'm not suggesting moving it into the same model, but the backend should figure out the user SGPR count given the formal argument list in the IR

David is right -- we use a completely different model for PAL. The middle-end compiler LGC (part of LLPC) determines what user data sgprs there are, and what hardware-set sgprs (e.g. threadgroup_id0/1/2 in a CS) it wants to enable. It sets all the relevant bits and fields in the rsrc2 passed in IR metadata, and it adds the right number and type of inreg args to the function in the IR. For PAL, the backend would not modify those fields and bits in rsrc2 at all.

In the PAL model, the backend only sets fields that only it can know about, such as vgpr usage.

So this fix looks correct to me for PAL.

Additionally, not everything marked as inreg actually contributes to the NumUserSGPRs - so the knowledge about which arguments are and aren't would need to be added to the backend.

To be honest, it looks like the entire function should only be called for ROCm-style compute. It shouldn't be used for either PAL or Mesa. Except maybe the Info.hasPrivateSegmentWaveByteOffset part? I'm not sure about that last one. But none of the hasWorkGroupFoo stuff should apply as far as I can tell.

And by the way, why is that a function of the OS? Shouldn't it be a function of the calling convention?

So to sum up: I agree with Tim and David that this code shouldn't be applied in the PAL/Mesa case (they both need their own fix for this) -- though keeping the assertion alive somewhere would be good. But it's not clear to me that this patch is the right way to do it.

It feels somehwat dubious that this entire method runs for an amdgpu_cs calling convention, but perhaps we don't need to change that, given the method already has an IsShader check. How about: apply the init16 bug workaround only if !IsShader, but keep the assertion unmodified?

I wasn't sure whether Mesa would also have issues - I was going to follow that up, so thanks for the info.
I'll update as you suggest to !IsShader

As for the assert - pal shaders won't return anything useful for Info.getNumPreloadedSGPRs() so the assert would always fail. I assume (perhaps wrongly) that Mesa would be the same?

Thanks.

As for the assert - pal shaders won't return anything useful for Info.getNumPreloadedSGPRs() so the assert would always fail. I assume (perhaps wrongly) that Mesa would be the same?

Okay, no big deal. I thought it would return the number of SGPR arguments (since those are preloaded, it would kind of make sense).

dstuttard updated this revision to Diff 444301.Jul 13 2022, 9:10 AM

Updated with review suggestions

This revision is now accepted and ready to land.Jul 14 2022, 5:25 AM
dstuttard updated this revision to Diff 444951.Jul 15 2022, 5:37 AM

Added a test

I thought the best bet for this was something that showed the original patch
didn't alter user sgpr registers (and left RSRC2 as-is).

This revision was landed with ongoing or failed builds.Jul 25 2022, 2:55 AM
This revision was automatically updated to reflect the committed changes.