In the 2e29b0138ca243 we introduce a specific solving algorithm
that analyzes the VGPR to SGPR copies use chains and either lowers
the copy to v_readfirstlane_b32 or converts the whole chain to VALU forms.
Same time we still have the code that blindly converts to VALU REG_SEQUENCE and PHIs
in case they produce SGPR but have VGPRs input operands. In case the REG_SEQUENCE and PHIs
are in the VGPR to SGPR copy use chain, and this chain was considered long enough to convert
copy to v_readfistlane_b32, further lowering them to VALU leads to several kinds of issues.
At first, we have v_readfistlane_b32 which is completely useless because most parts of its use chain
were moved to VALU forms. Second, we may encounter subtle bugs related to the EXEC-dependent CF
because of the weird mixing of SALU and VALU instructions.
This change removes the code that moves REG_SEQUENCE and PHIs to VALU. Instead, we use the fact
that both REG_SEQUENCE and PHIs have copy semantics. That is, if they define SGPR but have VGPR inputs,
we insert VGPR to SGPR copies to make them pure SGPR. Then, the new copies are processed by the common
VGPR to SGPR lowering algorithm.
This is Part 2 in the series of commits aiming at the massive refactoring of the SIFixSGPRCopies pass.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
60,040 ms | x64 debian > MLIR.Examples/standalone::test.toy |
Event Timeline
Overall test code changes look either positive or neutral to me.
llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp | ||
---|---|---|
975 | Indention is off? |
Fixed corner case related with AllAGPRUses in PHIs when PHI has no uses at all.
This may happen in artificial MIR tests and causes compiler error.
This change also takes care of the VGPR to SGPR copies inserted to the PHI incoming blocks.
These copies shouldbe processed out of order since the incoming blockmay have already been processed
at the moment when the PHI parent block is being processesd.
llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp | ||
---|---|---|
1031 | Indention is off? |
llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp | ||
---|---|---|
1031 | Formatting is ok here. |
llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp | ||
---|---|---|
1031 | This is unreadable and misleading. Add the braces then. And maybe a blank line. |
llvm/lib/Target/AMDGPU/SIFixSGPRCopies.cpp | ||
---|---|---|
580 | It is better to clean these two at the end of the pass, to clean up memory. |
As a heads-up, this patch seems to be causing some Vulkan CTS failures in LLPC testing.
It is better to clean these two at the end of the pass, to clean up memory.