This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Promote f16/i16 conversions to f32/i32 + custom lower f16 = fp_round f64
ClosedPublic

Authored by kzhuravl on Nov 16 2016, 1:44 PM.

Details

Summary

I will split this change in 2 when submitting (it is easier to test everything in bulk):

  • Promote f16/i16 conversions to f32/i32
  • Custom lower f16 = fp_round f64

Testing done:

  • Conformance: half (passed), conversions (passed)
  • make check-all (passed)

Diff Detail

Event Timeline

kzhuravl updated this revision to Diff 78256.Nov 16 2016, 1:44 PM
kzhuravl retitled this revision from to [AMDGPU] Promote f16/i16 conversions to f32/i32 + custom lower f16 = fp_round f64.
kzhuravl updated this object.
kzhuravl added reviewers: tstellarAMD, arsenm.
kzhuravl added a subscriber: llvm-commits.
arsenm added inline comments.Nov 16 2016, 1:50 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668

This should be very unnecessary

kzhuravl added inline comments.Nov 16 2016, 1:51 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668

I am getting a "cannot select: i16 = bitcast i16". Do you have a suggestion on how to solve it?

Thanks

arsenm added inline comments.Nov 16 2016, 1:53 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668

Where is that coming from? I thought getNode folded out trivial bitcasts like this already

kzhuravl added inline comments.Nov 16 2016, 1:54 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668
arsenm added inline comments.Nov 16 2016, 1:59 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668

I don't see how that would create a bitcast, or create one that somehow bypasses the no-op fold

kzhuravl added inline comments.Nov 16 2016, 2:47 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668

It gets inserted in legalize phase, when legalizing fp_round:

  t16: f16 = fp_round t14, TargetConstant:i32<0>
t19: i16 = bitcast t16

Then fp_round gets legalized to (truncate to i16 (fp_to_fp16). Another approach is to legalize fp_round to (bitcast to f16 (truncate to i16 (fp_to_fp16)?

arsenm added inline comments.Nov 16 2016, 3:06 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668

I'm confused. Is it i16 to i16 or i16 to f16?

kzhuravl added inline comments.Nov 16 2016, 3:09 PM
lib/Target/AMDGPU/SIISelLowering.cpp
3664–3668

in the end it is i16 to i16. after fp_round gets legalized:

    t21: i32 = fp_to_fp16 t14
  t22: i16 = truncate t21
t19: i16 = bitcast t22
arsenm edited edge metadata.Nov 16 2016, 3:16 PM

If fp_tound has integer type after legalization, the legalization for it is broken

kzhuravl updated this revision to Diff 78275.Nov 16 2016, 3:22 PM
kzhuravl edited edge metadata.

Address review feedback

If fp_tound has integer type after legalization, the legalization for it is broken

Agreed, I have bitcasted it to f16 and removed bitcast from combining. However I found that in some existing cases we legalize fp_to_fp16 to integer types (which got me confused), namely:
https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AMDGPU/AMDGPUISelLowering.cpp#L2100

Also, why are our f16->f32 and f32->f16 are using i32?
https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AMDGPU/VOP1Instructions.td#L132

If fp_tound has integer type after legalization, the legalization for it is broken

Agreed, I have bitcasted it to f16 and removed bitcast from combining. However I found that in some existing cases we legalize fp_to_fp16 to integer types (which got me confused), namely:
https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AMDGPU/AMDGPUISelLowering.cpp#L2100

Also, why are our f16->f32 and f32->f16 are using i32?
https://github.com/llvm-mirror/llvm/blob/master/lib/Target/AMDGPU/VOP1Instructions.td#L132

fp_to_fp16 is not the same as fp_round. fp_to_fp16 is f16 stored in an integer type

arsenm accepted this revision.Nov 16 2016, 6:00 PM
arsenm edited edge metadata.

LGTM

lib/Target/AMDGPU/SIISelLowering.cpp
2063–2064

This can be return of the getNode directly

This revision is now accepted and ready to land.Nov 16 2016, 6:00 PM
kzhuravl updated this object.Nov 16 2016, 7:59 PM
kzhuravl edited edge metadata.