The AMDGPU SIMemoryLegalizer was using the ordering address space
rather than the instruction address space when determining which
s_waitcnt to generate to ensure that a read-modify-write atomic has
completed. As a result, it waited on additional, unnecessary
counters.
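
To illustrate the difference, here is a minimal, self-contained sketch. It is not the SIMemoryLegalizer code: the enums, the struct, and the function names are all hypothetical, and the mapping from address spaces to counters is simplified (global and scratch accesses are assumed to be tracked by vmcnt, LDS and GDS accesses by lgkmcnt).

```cpp
#include <cstdint>

// Hypothetical bitmask of address spaces (not the LLVM definitions).
enum AddrSpace : std::uint32_t {
  AS_NONE = 0,
  AS_GLOBAL = 1u << 0,
  AS_LDS = 1u << 1,
  AS_SCRATCH = 1u << 2,
  AS_GDS = 1u << 3,
  AS_ALL = AS_GLOBAL | AS_LDS | AS_SCRATCH | AS_GDS,
};

// Hypothetical bitmask of the s_waitcnt counters modelled here.
enum Counter : std::uint32_t {
  CNT_NONE = 0,
  CNT_VM = 1u << 0,   // assumed to track global and scratch accesses
  CNT_LGKM = 1u << 1, // assumed to track LDS and GDS accesses
};

// Hypothetical summary of one atomic read-modify-write instruction.
struct AtomicRMWInfo {
  std::uint32_t OrderingAddrSpace; // address spaces the ordering applies to
  std::uint32_t InstrAddrSpace;    // address spaces the instruction accesses
};

// Map a set of address spaces to the counters that must reach zero.
inline std::uint32_t countersFor(std::uint32_t AS) {
  std::uint32_t C = CNT_NONE;
  if (AS & (AS_GLOBAL | AS_SCRATCH))
    C |= CNT_VM;
  if (AS & (AS_LDS | AS_GDS))
    C |= CNT_LGKM;
  return C;
}

// Behaviour described as the problem: wait based on the ordering address space.
inline std::uint32_t completionWaitFromOrdering(const AtomicRMWInfo &I) {
  return countersFor(I.OrderingAddrSpace);
}

// Behaviour described as the fix: wait based on the instruction address space.
inline std::uint32_t completionWaitFromInstr(const AtomicRMWInfo &I) {
  return countersFor(I.InstrAddrSpace);
}
```

Under these assumptions, an atomic RMW that accesses only global memory but whose ordering covers every address space would wait on both counters with the first function and on vmcnt alone with the second.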
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
clang-format suggested style edits found: