
[AMDGPU] Make SGPR spills exec mask agnostic
Concern Raised (da33c96d4762)

Authored by critson on Jun 2 2020, 8:34 PM.

Description

[AMDGPU] Make SGPR spills exec mask agnostic

Explicitly set the exec mask for SGPR spills and reloads.
This fixes a bug where SGPR spills to memory could be incorrect
if the exec mask was 0 (or differed between spill and reload).
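The failure mode can be illustrated with a toy model. This is a sketch, not the LLVM implementation: it assumes a 64-lane wave, models a vector store as writing only the lanes enabled in exec, and uses the hypothetical names `vectorStore`, `spillWithAmbientExec`, and `spillWithExplicitExec`.

```cpp
#include <array>
#include <cstdint>

constexpr unsigned kLanes = 64;             // toy wave size (assumption)
using VGPR = std::array<uint32_t, kLanes>;  // one 32-bit value per lane

// Toy vector store: only lanes whose bit is set in exec are written.
inline void vectorStore(VGPR &memory, const VGPR &src, uint64_t exec) {
  for (unsigned lane = 0; lane < kLanes; ++lane)
    if ((exec >> lane) & 1)
      memory[lane] = src[lane];
}

// Spill one scalar through lane 0 using whatever exec happens to be live.
// Returns the value a later reload would read back from scratch.
inline uint32_t spillWithAmbientExec(uint32_t sgpr, uint64_t exec) {
  VGPR tmp{}, scratch{};
  tmp[0] = sgpr;
  vectorStore(scratch, tmp, exec);  // silently a no-op when lane 0 is off
  return scratch[0];
}

// Same spill, but exec is explicitly set for the store and then restored.
inline uint32_t spillWithExplicitExec(uint32_t sgpr, uint64_t &exec) {
  const uint64_t saved = exec;
  exec = 1;  // enable exactly the lane that holds the spilled value
  VGPR tmp{}, scratch{};
  tmp[0] = sgpr;
  vectorStore(scratch, tmp, exec);
  exec = saved;  // the surrounding code sees its original mask again
  return scratch[0];
}
```

With an ambient exec of 0, the first function loses the value, while the second stores it regardless of the caller's mask.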

Additionally pack scalar subregisters (up to 16/32 per VGPR),
so that the majority of scalar types can be spilt or reloaded
with a simple memory access. This should amortize some of the
additional overhead of manipulating the exec mask.
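As a rough sketch of the packing arithmetic, assuming one 32-bit subregister per VGPR lane: the helper below computes how many VGPRs (and therefore memory accesses) a spill needs and which lanes must be enabled. `SpillPlan` and `planSpill` are hypothetical names, and `perVGPR` stands for the 16 or 32 limit mentioned above; the commit message does not say which limit applies when.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

struct SpillPlan {
  unsigned numVGPRs;  // temporary VGPRs, i.e. memory accesses, required
  uint64_t laneMask;  // exec mask covering the lanes of a full (or only) VGPR
};

// numSubRegs: number of 32-bit pieces in the scalar register being spilled.
// perVGPR:    maximum number of pieces packed into one VGPR (16 or 32 here).
inline SpillPlan planSpill(unsigned numSubRegs, unsigned perVGPR) {
  assert(perVGPR >= 1 && perVGPR <= 32);
  // Round up: a partially filled VGPR still costs one memory access.
  const unsigned numVGPRs = (numSubRegs + perVGPR - 1) / perVGPR;
  // One exec bit per occupied lane, starting at lane 0.
  const unsigned lanes = std::min(numSubRegs, perVGPR);
  const uint64_t laneMask = (uint64_t{1} << lanes) - 1;
  return {numVGPRs, laneMask};
}
```

Under these assumptions a 128-bit scalar (4 subregisters) fits in one VGPR with lane mask `0xF`, so a single store or load covers it.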

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D80282

Details

Auditors
arsenm
Committed
critson on Jun 2 2020, 8:34 PM
Reviewer
arsenm
Differential Revision
D80282: [AMDGPU] Make SGPR spills exec mask agnostic
Parents
rGa09bb6d77b39: Replace dyn_cast<>() with isa<>() when the result isn't used (NFC)
Branches
Unknown
Tags
Unknown

Event Timeline

arsenm added a subscriber: arsenm. (Jun 11 2020, 10:36 AM)
arsenm added inline comments.
/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp, line 949

I think this is broken. S_MOV_B64 can't be used for an arbitrary 64-bit mask. This will implicitly zero-extend to 64 bits and clobber the high half of exec.

arsenm raised a concern with this commit. (Jun 11 2020, 1:59 PM)
This commit now has outstanding concerns. (Jun 11 2020, 1:59 PM)
critson marked an inline comment as done. (Jun 11 2020, 7:47 PM)
critson added inline comments.
/llvm/lib/Target/AMDGPU/SIRegisterInfo.cpp, line 949

Overwriting the high half of exec is intentional. That is why EXEC_HI is saved and restored in the surrounding code.
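The exchange above can be sketched with a toy model. This is not the code at line 949: `movB64Literal`, `ExecState`, and `withSpillExec` are hypothetical names, the zero-extension follows the reviewer's description, and saving the low half the same way as EXEC_HI is an assumption, since the thread only mentions EXEC_HI.

```cpp
#include <cstdint>

// Toy S_MOV_B64 with a 32-bit literal operand: the literal is zero-extended,
// so the upper 32 bits of the destination always become 0.
inline uint64_t movB64Literal(uint32_t literal) { return uint64_t{literal}; }

struct ExecState {
  uint64_t exec;  // modelled 64-bit exec mask
};

// Runs body with exec set to laneMask, then restores the original mask.
template <typename Body>
void withSpillExec(ExecState &state, uint32_t laneMask, Body body) {
  // Save both halves first; the write below destroys the high half.
  const uint32_t savedLo = static_cast<uint32_t>(state.exec);
  const uint32_t savedHi = static_cast<uint32_t>(state.exec >> 32);

  state.exec = movB64Literal(laneMask);  // high half is now 0, by design
  body(state);                           // spill or reload happens here

  // Reassemble the caller's mask from the saved halves.
  state.exec = (uint64_t{savedHi} << 32) | savedLo;
}
```

In this model the clobber is harmless only because the high half is saved before the move and restored afterwards, which is the point made in the reply.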