This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] gfx11 add bits to COMPUTE_PGM_RSRC3
ClosedPublic

Authored by Joe_Nash on Jun 7 2022, 11:39 AM.

Details

Reviewers
rampitec
arsenm
kzhuravl
Group Reviewers
Restricted Project
Commits
rGea3c9a87d344: [AMDGPU] gfx11 add bits to COMPUTE_PGM_RSRC3
Summary

Contributors:
Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>

Patch 21/N for upstreaming of AMDGPU gfx11 architecture

Depends on D127143

Diff Detail

Event Timeline

Joe_Nash created this revision.Jun 7 2022, 11:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2022, 11:39 AM
Joe_Nash requested review of this revision.Jun 7 2022, 11:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2022, 11:39 AM
Joe_Nash added reviewers: rampitec, arsenm, Restricted Project, kzhuravl.Jun 7 2022, 11:40 AM

This looks like it affects gfx10 as well, while probably should not?

This looks like it affects gfx10 as well, while probably should not?

@kzhuravl Do you know about this?

This should include updates to https://llvm.org/docs/AMDGPUUsage.html#kernel-descriptor if not planned for a different patch.

This looks like it affects gfx10 as well, while probably should not?

@kzhuravl Do you know about this?

GFX10 is only using COMPUTE_PGM_RSRC3_GFX10_PLUS.SHARED_VGPR_COUNT, the rest of the bits for GFX10 are reserved and must be 0.

GFX11 is using COMPUTE_PGM_RSRC3_GFX10_PLUS.INST_PREF_SIZE and others.

This is similar to how we do other compute program resource registers. E.g.:

...
  COMPUTE_PGM_RSRC1(BULKY, 24, 1),
  COMPUTE_PGM_RSRC1(CDBG_USER, 25, 1),
  COMPUTE_PGM_RSRC1(FP16_OVFL, 26, 1),    // GFX9+
  COMPUTE_PGM_RSRC1(RESERVED0, 27, 2),
  COMPUTE_PGM_RSRC1(WGP_MODE, 29, 1),     // GFX10+
  COMPUTE_PGM_RSRC1(MEM_ORDERED, 30, 1),  // GFX10+
  COMPUTE_PGM_RSRC1(FWD_PROGRESS, 31, 1), // GFX10+
...

The reason we have 2 different compute program resource register 3 is because GFX90A and GFX10 have completely different layout for those. So I think this header file should be fine.

Can you add documentation to AMDGPUUsage in this patch as well?

Thanks

This looks like it affects gfx10 as well, while probably should not?

@kzhuravl Do you know about this?

GFX10 is only using COMPUTE_PGM_RSRC3_GFX10_PLUS.SHARED_VGPR_COUNT, the rest of the bits for GFX10 are reserved and must be 0.

GFX11 is using COMPUTE_PGM_RSRC3_GFX10_PLUS.INST_PREF_SIZE and others.

This is similar to how we do other compute program resource registers. E.g.:

...
  COMPUTE_PGM_RSRC1(BULKY, 24, 1),
  COMPUTE_PGM_RSRC1(CDBG_USER, 25, 1),
  COMPUTE_PGM_RSRC1(FP16_OVFL, 26, 1),    // GFX9+
  COMPUTE_PGM_RSRC1(RESERVED0, 27, 2),
  COMPUTE_PGM_RSRC1(WGP_MODE, 29, 1),     // GFX10+
  COMPUTE_PGM_RSRC1(MEM_ORDERED, 30, 1),  // GFX10+
  COMPUTE_PGM_RSRC1(FWD_PROGRESS, 31, 1), // GFX10+
...

The reason we have 2 different compute program resource register 3 is because GFX90A and GFX10 have completely different layout for those. So I think this header file should be fine.

Can you add documentation to AMDGPUUsage in this patch as well?

Thanks

Once assembler support gets added for this, we will have to make sure directives for GFX11 only fields do not work on other generations.

foad added a comment.Jun 9 2022, 7:49 AM

Can you add documentation to AMDGPUUsage in this patch as well?

That's in D127402 which is in review right now.

Can you add documentation to AMDGPUUsage in this patch as well?

That's in D127402 which is in review right now.

Thanks, just saw it.

kzhuravl accepted this revision.Jun 9 2022, 7:54 AM

LGTM, can you wait and see if @rampitec has any additional comments?

This revision is now accepted and ready to land.Jun 9 2022, 7:54 AM
This revision was landed with ongoing or failed builds.Jun 10 2022, 10:35 AM
This revision was automatically updated to reflect the committed changes.