Details
- Reviewers: arsenm
- Commits: rG27a62f6317f3: [AMDGPU] global-isel support for RT
Event Timeline
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3059–3075 | What kind of operations would you like to see in the custom lowering? v_pack_b32_f16 should be fine; this is a packed half type in this case. | |
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp | ||
4393 | That's a descriptor, I'd rather refuse to select. |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3059–3075 | But v_pack_b32_f16 isn't semantically the same as the bit packing, so I would be surprised to insert this for the argument handling. | |
llvm/lib/Target/AMDGPU/AMDGPURegisterBankInfo.cpp | ||
4393 | You can never guarantee the input is uniform or in a VGPR. We can do the right thing now easily (and every other intrinsic with a descriptor does it). |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3059–3075 | That's the best instruction for the job IMO. What we are doing is repacking a vector of halves. |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3059–3075 | But it does change the input values. I believe this is a canonicalizing operation, so it may flush denorms and quiet sNaNs. |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3059–3075 | It should behave the same as bvh itself; the mode is common, right? So if flushing is on, the value will be flushed anyway. Everything else results in longer code. It could use v_lshl_or_b32, but it would also need an extra v_and_b32 to clear the high half. |
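To make the disagreement above concrete, here is a minimal C++ sketch contrasting pure bit packing (the v_and_b32 plus v_lshl_or_b32 shape) with a pack that canonicalizes each half first. The function names `packBits`, `canonicalizeF16`, and `packCanonical` are hypothetical, and the canonicalization model (quiet sNaNs, optionally flush denormals) only illustrates the concern raised in the thread; it is an assumption, not a statement of what v_pack_b32_f16 does in hardware.

```cpp
#include <cstdint>

// Pure bit packing: clear the high half of `lo` (the v_and_b32 step),
// then shift `hi` left by 16 and OR the two together (the v_lshl_or_b32 step).
constexpr uint32_t packBits(uint32_t lo, uint32_t hi) {
  return (hi << 16) | (lo & 0xFFFFu);
}

// Hypothetical model of IEEE half canonicalization on a raw 16-bit pattern.
// Assumption: NaNs are quieted and, when flushDenorms is set, denormals
// become signed zero. Everything else passes through unchanged.
constexpr uint16_t canonicalizeF16(uint16_t bits, bool flushDenorms) {
  constexpr uint16_t SignMask = 0x8000;
  constexpr uint16_t ExpMask = 0x7C00;
  constexpr uint16_t MantMask = 0x03FF;
  constexpr uint16_t QuietBit = 0x0200;
  const bool expAllOnes = (bits & ExpMask) == ExpMask;
  const bool expZero = (bits & ExpMask) == 0;
  const bool mantNonZero = (bits & MantMask) != 0;
  if (expAllOnes && mantNonZero)
    return static_cast<uint16_t>(bits | QuietBit); // sNaN becomes qNaN
  if (flushDenorms && expZero && mantNonZero)
    return static_cast<uint16_t>(bits & SignMask); // denormal becomes +/-0
  return bits;
}

// A pack that canonicalizes both halves before combining them.
constexpr uint32_t packCanonical(uint16_t lo, uint16_t hi, bool flushDenorms) {
  return packBits(canonicalizeF16(lo, flushDenorms),
                  canonicalizeF16(hi, flushDenorms));
}

// 0x7D00 is a signaling NaN in half precision, 0x3C00 is 1.0.
static_assert(packBits(0x7D00, 0x3C00) == 0x3C007D00u,
              "bit packing preserves the sNaN payload");
static_assert(packCanonical(0x7D00, 0x3C00, false) == 0x3C007F00u,
              "canonicalizing pack quiets the sNaN");
```

Under this model the two approaches agree for normal values and differ only for sNaN or (with flushing enabled) denormal inputs, which is the case the reviewer objects to for argument handling.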
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3059–3075 | Doing a custom lowering would need 4 different custom nodes and then selection. It will be much more overhead. |
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp | ||
---|---|---|
3059–3075 | You could use one wrapper instruction like the image intrinsics. We should expose the bit packing to the post-legalize combiner. |
Braces here