This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Avoid redundant mode register writes
ClosedPublic

Authored by timcorringham on Jun 19 2020, 10:36 AM.

Details

Summary

The SIModeRegister pass attempts to generate the minimal number of
writes to the mode register. However it was failing to correctly
deal with some loops, resulting in some redundant setreg instructions
being inserted.

This change amends the pass to avoid generating these redundant
instructions.

Diff Detail

Event Timeline

timcorringham created this revision.Jun 19 2020, 10:36 AM
foad added a comment.Jun 22 2020, 1:29 AM

Overall I think this is probably OK but (a) I'd like to see it with some of the irrelevant stuff removed and (b) there is ample scope for making this pass look much more like a "normal" dataflow solver, where Status should be a lattice with a proper "bottom" value, the phase 2 worklist should be seeded with just the entry block so we don't have to worry about visiting unreachable blocks, and hopefully processBlockPhase2 will get much simpler again.

llvm/lib/Target/AMDGPU/SIModeRegister.cpp
86

Looks like this is just reformatting. Please commit separately. Maybe even clang-format the whole file first?

115

Just reformatting?

136

Why is this needed? Was there a problem with the existing code using NumSetRegInserted?

355

If a block is its only predecessor then either it's the entry block (this wouldn't be allowed in IR but I guess it is in MIR?), or it's an unreachable single block loop, and it doesn't matter what we do in unreachable code. So the only useful thing this line and line 338 achieve is to set DefaultStatus at the start of the entry block. It seems like this could be simplified.

403–405

Just reformatting?

timcorringham marked 5 inline comments as done.Jun 22 2020, 3:18 AM
timcorringham added inline comments.
llvm/lib/Target/AMDGPU/SIModeRegister.cpp
86

clang-format did this - I will commit a clang-format version first and then update this review to eliminate formatting changes.

115

It now initialises ExitSet.

136

NumSetReginserted is a STATISTIC, and as such only works when LLVM is built with asserts or debug mode enabled - which is something I hadn't realised when I wrote the original code

355

This handles a case that should never happen but which would cause a loop if it did.

403–405

clang-format again.

Removed changes introduced only by clang-format format.

arsenm added inline comments.Jun 23 2020, 7:34 AM
llvm/lib/Target/AMDGPU/SIModeRegister.cpp
136

Probably should rename this, since this really should try to use S_ROUND_MODE/S_DENORM_MODE

202–206

This should use the encode function from utils for this, but that's a separate cleanup I guess (plus try to use the new gfx10 instruction)

llvm/test/CodeGen/AMDGPU/mode-register.mir
464–465

CHECK-NOT can be fragile. The case is so small I would generate the checks

foad accepted this revision.Jun 24 2020, 2:51 AM

LGTM modulo Matt's comments and please consider cleaning it all up as mentioned earlier.

llvm/lib/Target/AMDGPU/SIModeRegister.cpp
136

Changed is the usual name for this kind of flag.

This revision is now accepted and ready to land.Jun 24 2020, 2:51 AM
timcorringham marked 3 inline comments as done.Jun 24 2020, 5:24 AM
timcorringham added inline comments.
llvm/lib/Target/AMDGPU/SIModeRegister.cpp
202–206

Planned for a future change.

This revision was automatically updated to reflect the committed changes.