These instructions were introduced in VI+ chips. This patch defines them and adds the intrinsics
llvm.amdgcn.ds.permute and llvm.amdgcn.ds.bpermute to expose them.
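For readers unfamiliar with the two instructions, here is a minimal host-side sketch of the lane semantics as I understand them from the ISA documentation: ds_permute is a forward ("push") permute and ds_bpermute is a backward ("pull") permute, with the address operand interpreted as a byte address (lane index times 4). The names `Wave`, `dsPermute`, and `dsBpermute` are hypothetical and not part of this patch. The model is an assumption-laden simplification: it ignores the EXEC mask and the immediate offset, and wraps lane indices to the wave size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: one 32-bit value per lane of a wavefront.
using Wave = std::vector<uint32_t>;

// Forward permute ("push"): lane i sends src[i] to the lane selected by
// addr[i]. Lanes that receive nothing stay 0 in this model; if several lanes
// target the same destination, the last one processed here wins (an
// assumption of this sketch, not a statement about the hardware).
Wave dsPermute(const Wave &addr, const Wave &src) {
  assert(addr.size() == src.size() && !src.empty());
  const std::size_t n = src.size();
  Wave dst(n, 0);
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t target = (addr[i] / 4) % n; // byte address -> lane index
    dst[target] = src[i];
  }
  return dst;
}

// Backward permute ("pull"): lane i reads src from the lane selected by
// addr[i].
Wave dsBpermute(const Wave &addr, const Wave &src) {
  assert(addr.size() == src.size() && !src.empty());
  const std::size_t n = src.size();
  Wave dst(n, 0);
  for (std::size_t i = 0; i < n; ++i) {
    const std::size_t source = (addr[i] / 4) % n; // byte address -> lane index
    dst[i] = src[source];
  }
  return dst;
}
```

With a 4-lane toy wave, `addr = {4, 8, 12, 0}` and `src = {10, 20, 30, 40}`, the forward permute rotates the values one lane up and the backward permute rotates them one lane down. Both operations take two per-lane inputs (address and data).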
Details
Diff Detail
- Repository: rL LLVM
Event Timeline
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
230–231 ↗ | (On Diff #49084) | Checking just offset0 should be sufficient, and can be moved above |
lib/Target/AMDGPU/VIInstructions.td | ||
131 ↗ | (On Diff #49084) | Are we sure these don't read M0? |
test/CodeGen/AMDGPU/llvm.amdgcn.ds.permute.ll | ||
3–4 ↗ | (On Diff #49084) | I would prefer splitting the two intrinsics into separate patches. These are also missing readnone (which should also use attribute groups). |
lib/Target/AMDGPU/VIInstructions.td | ||
---|---|---|
131 ↗ | (On Diff #49084) | It reads M0, but it is supposed to ignore its value, so for our purposes we can treat it as if it doesn't read M0. |
Updated the patch based on Matt's review:
- Check only Offset0Imm and move the check one line ahead.
- It is safe to remove M0 from the Uses list for ds_permute/ds_bpermute.
- Split the LIT test for ds_permute and ds_bpermute into separate files.
test/CodeGen/AMDGPU/llvm.amdgcn.ds.permute.ll | ||
---|---|---|
4–5 ↗ | (On Diff #49202) | I just split the test case. If you want to split the intrinsics and/or instruction definitions, I can do it in the integration. Thanks. |
lib/Target/AMDGPU/VIInstructions.td | ||
---|---|---|
136–139 ↗ | (On Diff #49202) | Why can't these be patterns on the instruction definition instead of standalone Pats? |
Move the pattern for ds_permute intrinsic code generation into
the instruction definition, based on Matt's comment.
LGTM
test/CodeGen/AMDGPU/llvm.amdgcn.ds.bpermute.ll | ||
---|---|---|
6 ↗ | (On Diff #49427) | Might want to check that there are 2 VGPR operands |