Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
All lit tests are passing, did not get a chance to verify on hardware yet. half.ll was replaced by amdgcn-vop*f16*.ll tests. I am in the process of f16 test restructuring, and did not have time to include following tests:
- vector vt
- vop2: v_mac_f16, v_madak_f16
- vop3: v_mad_f16
- vopc: all
I will upload those tests a bit later today.
include/llvm/IR/IntrinsicsAMDGPU.td | ||
---|---|---|
172–182 ↗ | (On Diff #75834) | Can these changes be done in a different patch? They change the intrinsic signature, and will require updating any users of this intrinsic in external projects. |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
85–86 ↗ | (On Diff #75834) | There is a 1-to-1 mapping between types and register classes, so you can drop the first addRegisterClass(MVT::f16, ... ) call |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
82–83 ↗ | (On Diff #75834) | The TODO was taken care of already |
85–86 ↗ | (On Diff #75834) | I think this should only be added to SReg_32. We already have other random inconsistencies from having f32 in a VGPR class and i32 in SGPR which I've been trying to fix |
3602 ↗ | (On Diff #75834) | Does this need an f16 is legal check? If not this can probably just be an !isVector check, any of the other FP types are unusable |
lib/Target/AMDGPU/SIInstrInfo.cpp | ||
1415–1416 ↗ | (On Diff #75834) | I would prefer checking the more common f32 cases first |
1505 ↗ | (On Diff #75834) | Ditto |
lib/Target/AMDGPU/SIInstructions.td | ||
443–445 ↗ | (On Diff #75834) | The f32 patterns on the sources look wrong. Not sure how this compiles |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
3602 ↗ | (On Diff #75834) | This is needed for v_cmp_class_f16. I have forgotten to add Subtarget->has16BitInsts(), which I have added in the revised patch. |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
3703–3704 ↗ | (On Diff #77231) | Changed it by accident, I will put it back. |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
2041–2044 ↗ | (On Diff #77247) | Braces |
3768 ↗ | (On Diff #77247) | Should this also handle f16? Could be separate optimization patch |
lib/Target/AMDGPU/SIISelLowering.h | ||
51 ↗ | (On Diff #77247) | should start with lower case |
lib/Target/AMDGPU/SIInstructions.td | ||
422–423 ↗ | (On Diff #77247) | Is this (and fpextend) correct? A correct lowering for these was just added I thought |
lib/Target/AMDGPU/SIRegisterInfo.td | ||
254 ↗ | (On Diff #77247) | Missing i16/f16 but it probably doesn't matter |
lib/Target/AMDGPU/SISchedule.td | ||
29 ↗ | (On Diff #77247) | I don't think we need this since they run at the same rat as 32-bit |
lib/Target/AMDGPU/VOP2Instructions.td | ||
326 ↗ | (On Diff #77247) | Why is this necessary? It can already mangle the FP type. This should work if you change it to VOP_F16_F16_I32, the int type doesn't matter |
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
3768 ↗ | (On Diff #77247) | Yes, I was initially planning to do it in this patch, but then decided that separate patch would be better. |
lib/Target/AMDGPU/SIInstructions.td | ||
422–423 ↗ | (On Diff #77247) | I think so. Without these changes a few existing tests (i.e. fptrunc.ll) were failing with the "cannot select error". |
lib/Target/AMDGPU/VOP2Instructions.td | ||
326 ↗ | (On Diff #77247) | Would this be acceptable to use VOP_F16_F16_I32? Spec says: D.f16 = S0.f16 * (2 ** S1.i16) So it should be i16. I have a follow up patch that changes ldexp intrinsic. |
lib/Target/AMDGPU/VOP2Instructions.td | ||
---|---|---|
326 ↗ | (On Diff #77247) | Changed to VOP_F16_F16_I32. |
LGTM except for some nits
lib/Target/AMDGPU/SIISelLowering.cpp | ||
---|---|---|
2033–2035 ↗ | (On Diff #77653) | Braces |
lib/Target/AMDGPU/SIShrinkInstructions.cpp | ||
93–94 ↗ | (On Diff #77653) | I would sort f16 after |
lib/Target/AMDGPU/VOP2Instructions.td | ||
160–162 ↗ | (On Diff #77653) | I think it should be easy to make VOP_MAC be the class with the type operand to avoid copy pasting the entire thing just to change the type |
test/CodeGen/AMDGPU/fadd.f16.ll | ||
1 ↗ | (On Diff #77653) | s/SI/CI |
test/CodeGen/AMDGPU/fcmp.f16.ll | ||
1 ↗ | (On Diff #77653) | Ditto |
test/CodeGen/AMDGPU/llvm.exp2.f16.ll | ||
41 ↗ | (On Diff #77653) | The naming convention should not name this vector and end in the actual vector type, v2f16 so this can expand to other vector widths that we might need to test later |