This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Limit memory scope for scratch, LDS and GDS
ClosedPublic

Authored by t-tye on Feb 13 2021, 1:03 AM.

Details

Summary

Changes for AMD GPU SIMemoryLegalizer:

  • Limit the memory scope to maximum supported by the scratch, LDS and GDS address spaces.
  • Improve assertion checking.
  • Correct toSIAtomicScope argument name.

Diff Detail

Event Timeline

t-tye created this revision.Feb 13 2021, 1:03 AM
t-tye requested review of this revision.Feb 13 2021, 1:03 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 13 2021, 1:03 AM
rampitec accepted this revision.Feb 13 2021, 1:31 AM

LGTM, thanks Tony!
It is up to you to use min() or if(). I will postpone my patch to rebase on top your change as I expect to have zero test changes with my patch after this one and want to make sure this expectation is correct.

llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
159

Scope = std::min(Scope, SIAtomicScope::SINGLETHREAD) ?
It just looks more formal.

163

Same here. Up to you though.

This revision is now accepted and ready to land.Feb 13 2021, 1:31 AM
madhur13490 added inline comments.
llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
160

Why not?

this->scope = std::min(Scope, SIAtomicScope::SINGLETHREAD)
t-tye updated this revision to Diff 323570.Feb 13 2021, 12:19 PM
  • Address review feedback.
  • Add additional assertion checking.
  • Improve scope checking and also check GDS address space.
t-tye marked 3 inline comments as done.Feb 13 2021, 12:23 PM
rampitec accepted this revision.Feb 13 2021, 12:38 PM
t-tye updated this revision to Diff 323571.Feb 13 2021, 12:59 PM

Improve commit message.

t-tye retitled this revision from [AMDGPU] Limit memory scope for scratch and LDS to [AMDGPU] Limit memory scope for scratch, LDS and GDS.Feb 13 2021, 1:02 PM
t-tye edited the summary of this revision. (Show Details)
t-tye added a comment.Feb 13 2021, 1:52 PM

LGTM, thanks Tony!
It is up to you to use min() or if(). I will postpone my patch to rebase on top your change as I expect to have zero test changes with my patch after this one and want to make sure this expectation is correct.

I made the changes, thanks. I am waiting to get executable test results but Gerrit is currently down.

Gerrit is down until 11pm CT according to IT.

Stas

  • Original message ---

From: Tony Tye via Phabricator reviews@reviews.llvm.org
Sent: February 13, 2021 13:52:43
To: Tye, Tony Tony.Tye@amd.com, Zhuravlyov, Konstantin Konstantin.Zhuravlyov@amd.com, Linder, Scott Scott.Linder@amd.com, Sumner, Brian Brian.Sumner@amd.com, Mekhanoshin, Stanislav Stanislav.Mekhanoshin@amd.com
CC: madhur13490@gmail.com, wei.ding2@amd.com, llvm-commits@lists.llvm.org, Arsenault, Matthew Matthew.Arsenault@amd.com, jv356@scarletmail.rutgers.edu, nhaehnle@gmail.com, Liu, Yaxun (Sam) Yaxun.Liu@amd.com, Stuttard, David David.Stuttard@amd.com, tpr.ll@botech.co.uk, reviews.llvm.org@jfbastien.com, Kerbow, Austin Austin.Kerbow@amd.com, Kumar N, Bhuvanendra Bhuvanendra.KumarN@amd.com, 88888yl@gmail.com, dougpuob@gmail.com, Neubauer, Sebastian Sebastian.Neubauer@amd.com, Song, Ruiling Ruiling.Song@amd.com
Subject: [PATCH] D96643: [AMDGPU] Limit memory scope for scratch, LDS and GDS

[CAUTION: External Email]

t-tye added a comment.

In D96643#2561617 https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Freviews.llvm.org%2FD96643%232561617&data=04%7C01%7Cstanislav.mekhanoshin%40amd.com%7Ce267676049fe4e987bda08d8d069ae2b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637488499635053925%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=gtSTLQKaH1%2FU22bx0vE3BSOOZ50ZhryRfYwQXHvYv20%3D&reserved=0, @rampitec wrote:

LGTM, thanks Tony!
It is up to you to use min() or if(). I will postpone my patch to rebase on top your change as I expect to have zero test changes with my patch after this one and want to make sure this expectation is correct.

I made the changes, thanks. I am waiting to get executable test results but Gerrit is currently down.

Repository:

rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION

https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Freviews.llvm.org%2FD96643%2Fnew%2F&data=04%7C01%7Cstanislav.mekhanoshin%40amd.com%7Ce267676049fe4e987bda08d8d069ae2b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637488499635053925%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=ywkjGAacWwABrPmLCAfpRqa6spSOUy6RG2gqAQpTG0A%3D&reserved=0

https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Freviews.llvm.org%2FD96643&data=04%7C01%7Cstanislav.mekhanoshin%40amd.com%7Ce267676049fe4e987bda08d8d069ae2b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637488499635053925%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=2B1xLZkK6f6lfkQXZOBHIjm99SMU2iKiJGoNs3%2FSqXo%3D&reserved=0

This revision was landed with ongoing or failed builds.Feb 14 2021, 9:35 AM
This revision was automatically updated to reflect the committed changes.
rampitec added inline comments.Feb 14 2021, 2:17 PM
llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll
726

Looks like it is sill insufficient.

t-tye added inline comments.Feb 14 2021, 9:48 PM
llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll
726

In WGP mode it is still necessary to synchronize across CUs, hence the need to wait/invalidate for VMEM when at workgroup scope. Basically, the scope limited to workgroup has to do the equivalent of agent scope.

In CU more this is not necessary and so this waitcnt vscnt /gl0 invalidate is not present as seen in line 738-739.

t-tye added inline comments.Feb 14 2021, 9:54 PM
llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll
726

But this does demonstrate that simply treating WGP mode as agent scope is conservatively correct, but not optimal. I will look to see if can be improved further.

rampitec added inline comments.Feb 15 2021, 11:00 AM
llvm/test/CodeGen/AMDGPU/memory-legalizer-local-agent.ll
726

I do not see it in the memory model for GFX10. There is no reference that vmem counter is needed in WGP mode.