This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] gfx940 memory model
ClosedPublic

Authored by rampitec on Mar 8 2022, 12:16 PM.

Diff Detail

Event Timeline

rampitec created this revision.Mar 8 2022, 12:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2022, 12:16 PM
rampitec requested review of this revision.Mar 8 2022, 12:16 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2022, 12:16 PM
Herald added a subscriber: wdng. · View Herald Transcript
foad added inline comments.Mar 9 2022, 4:49 AM
llvm/docs/AMDGPUUsage.rst
8673

I don't know how to review 2500 lines of text. How is it different from GFX90A?

llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
1486

Typo "hava"

1527

Typo "hava"

1568

Doesn't really matter but I find != a bit easier to understand than ^ on booleans.

1574

Space after (

1680

Typo "hava"

rampitec added inline comments.Mar 9 2022, 10:21 AM
llvm/docs/AMDGPUUsage.rst
8673

The main change is the replacement of glc/scc/slc with sc0/sc1/nt bits. While looks functionally similar it is conceptually different. Instead of locking individual caches we now requesting coherency at a specific memory scope.

rampitec updated this revision to Diff 414151.Mar 9 2022, 10:37 AM
rampitec marked 4 inline comments as done.

Fixed formatting and comments.

scott.linder added inline comments.
llvm/docs/AMDGPUUsage.rst
8673

One approach is to isolate the "Memory Model GFX90A" and "Memory Model GFX940" sections into their own files, and then use your preferred diff-tool with more granular highlighting (i.e. word-based rather than line-based) to compare the two models.

At that point the diff is actually very concise.

I noted some small nits, but otherwise LGTM. I won't accept, as there are others who should probably review.

llvm/docs/AMDGPUUsage.rst
8715

I think this can be dropped, as the sentence starts with the tgsplit condition.

8717

Is this the correct scope? This seems like the same case as the last bullet, which states sc0. The table below also seems to indicate that sc0 will invalidate L1, not sc1.

8893–8895

Duplicate clause, seems like it can be deleted

8912

It might be useful to note the "re-use" of the sc0 bit in atomircrmw instructions somewhere, maybe above in the long-form overview portion. To give context to the possible confusion, my original comment here was:

I'm trying to piece together an understanding of the memory model changes, so bear with me, but why is this not sc0=1 sc1=1? I guess my misunderstanding boils down to the difference in the sc0/sc1 bits between the _load/_store instructions and the _atomic instructions.

9642

Can collapse these two spaces

9697

3

10881

Should this be must generate? It occurs in a few places

llvm/lib/Target/AMDGPU/SIMemoryLegalizer.cpp
1543

I think including a paragraph along these lines in AMDGPUUsage.rst would cover the confusion I originally had

rampitec updated this revision to Diff 414224.Mar 9 2022, 3:19 PM

Fixed some comment typos.

t-tye added inline comments.Mar 9 2022, 3:21 PM
llvm/docs/AMDGPUUsage.rst
8717

Yes this should be sc0.

8912

Adding a bullet to the preamble text to explain this would be a good suggestion.

Scott, I appreciate your doc review. It is a lot of work. Do you mind to fix doc typos in a followup change? It is a real pain to maintain the downstream merge, it will be easier to fix when it is merged back.

Scott, I appreciate your doc review. It is a lot of work. Do you mind to fix doc typos in a followup change? It is a real pain to maintain the downstream merge, it will be easier to fix when it is merged back.

Absolutely, that's fine with me!

Scott, I appreciate your doc review. It is a lot of work. Do you mind to fix doc typos in a followup change? It is a real pain to maintain the downstream merge, it will be easier to fix when it is merged back.

Absolutely, that's fine with me!

Thanks! I have created followup D121397.

t-tye accepted this revision.Mar 14 2022, 2:31 PM

LGTM

This revision is now accepted and ready to land.Mar 14 2022, 2:31 PM
This revision was landed with ongoing or failed builds.Mar 14 2022, 3:02 PM
This revision was automatically updated to reflect the committed changes.