This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Cluster shader exports
ClosedPublic

Authored by critson on May 6 2020, 4:48 AM.

Details

Summary

Add DAG scheduling mutation to cluster export instructions.
This avoids unnecessary waitcnts being added when computation
ends up interspersed with exports.

Event Timeline

critson created this revision.May 6 2020, 4:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 6 2020, 4:48 AM
critson updated this revision to Diff 262345.May 6 2020, 4:57 AM

Apply linting.

foad accepted this revision.May 6 2020, 5:19 AM

Looks good.

As a follow-on, would this be a good place to try to ensure that position exports go before parameter exports?

llvm/lib/Target/AMDGPU/AMDGPUExportClustering.cpp
32–33

You don't need these parentheses.

40

This is trivia, but isn't it simpler to start counting at 1 than to finish at (End - 1) ?

46

I don't quite understand the comment. If nothing depends on an export then surely isExport() on the next line will never be true?

59–79

Do you understand the chain formation logic, or is it copied from the standard MemOpClusterMutation? (Or both?!)

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.exp.ll
552–558

Can you pre-commit this? Did it actually have a waitcnt before this fix?

This revision is now accepted and ready to land.May 6 2020, 5:19 AM
critson marked 6 inline comments as done.May 7 2020, 2:27 AM
critson added inline comments.
llvm/lib/Target/AMDGPU/AMDGPUExportClustering.cpp
59–79

A bit of both, code adapted from MemOpClusterMutation and I have reasonable confidence I understand what it is doing.

llvm/test/CodeGen/AMDGPU/llvm.amdgcn.exp.ll
552–558

Sorry for the effort, but can you educate me on the concept of pre-committing the test. It will fail without the rest of the code in this patch as a waitcnt is inserted at present.

critson updated this revision to Diff 262573.May 7 2020, 2:28 AM
critson marked an inline comment as done.

Address Jay's comments.

foad added inline comments.May 7 2020, 3:00 AM
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.exp.ll
552–558

I mean, commit @test_export_clustering but with check lines that show the current codegen. (This is easier if you auto-generate the check lines. I hadn't noticed that you wrote these ones by hand.) Then, when you rebase the current patch, it will show us exactly what effect it had on the codegen for this function.

It's not essential but it is helpful to reviewers.

critson updated this revision to Diff 262582.May 7 2020, 3:04 AM
critson marked an inline comment as done.

Precommit test.

This revision was automatically updated to reflect the committed changes.
foad added inline comments.May 7 2020, 3:24 AM
llvm/test/CodeGen/AMDGPU/llvm.amdgcn.exp.ll
552–558

Nice!