Improved InstCombine support for CVTPH2PS (F16C half-to-float conversion):
<4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion.
Added constant folding support.
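For illustration, here is a minimal standalone sketch of what folding a constant operand amounts to. It is not the patch's code and does not use LLVM's APIs. The names `halfToFloat` and `foldCvtPh2Ps128` are hypothetical, and NaN handling is simplified (the payload is passed through unchanged).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen one IEEE 754 binary16 bit pattern to float.
// Every half value is exactly representable as a float, so no rounding occurs.
float halfToFloat(std::uint16_t h) {
    std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    std::uint32_t exp = (h >> 10) & 0x1Fu;
    std::uint32_t mant = h & 0x3FFu;
    std::uint32_t bits;
    if (exp == 0x1Fu) {
        // Infinity or NaN (simplification: NaN payload is passed through as-is).
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {
        // Normal number: rebias the exponent from 15 to 127.
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        // Signed zero.
        bits = sign;
    } else {
        // Half subnormal: normalise the mantissa, adjusting the exponent.
        unsigned shift = 0;
        while ((mant & 0x400u) == 0) {
            mant <<= 1;
            ++shift;
        }
        mant &= 0x3FFu;
        bits = sign | ((113u - shift) << 23) | (mant << 13);
    }
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical model of the 128-bit form: 8 x i16 in, 4 x float out.
// Only lanes 0..3 are read; lanes 4..7 never affect the result, which is
// why they need not be demanded from the operand.
std::array<float, 4> foldCvtPh2Ps128(const std::array<std::uint16_t, 8>& src) {
    std::array<float, 4> out{};
    for (std::size_t i = 0; i < 4; ++i) {
        out[i] = halfToFloat(src[i]);
    }
    return out;
}
```

Because the upper four lanes are never read, two operands that differ only in lanes 4..7 fold to the same result.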
Differential D12731
[InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding
Authored by RKSimon on Sep 9 2015, 8:00 AM.
Thanks Ahmed, I'll look into doing something similar for CVTPS2PH in a future patch, although I'm a little concerned about matching all the (compile-time) rounding modes.
I couldn't help but notice that this is very similar to the ppc case above. All else being equal, can you keep them together?
Plus, the ph2ps case is similar to ss2si below!