Improved InstCombine support for CVTPH2PS (F16C half-to-float conversion):
<4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion.
Added constant folding support.
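As a rough illustration of what folding a constant call computes, here is a minimal standalone C++ sketch. It is not the InstCombine code from this patch: `halfBitsToFloat` and `foldCvtph2ps128` are hypothetical names, and the sketch assumes IEEE 754 binary16 inputs and does not model the hardware's quieting of signaling NaNs. It shows that only the bottom four of the eight i16 lanes affect the result.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical helper: widen one IEEE 754 binary16 bit pattern to a float.
// Every half value is exactly representable as a float, so no rounding occurs.
float halfBitsToFloat(std::uint16_t h) {
    const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
    const std::uint32_t exp = (h >> 10) & 0x1Fu;
    std::uint32_t mant = h & 0x3FFu;
    std::uint32_t bits;

    if (exp == 0x1Fu) {
        // Infinity or NaN: all-ones exponent, payload shifted into place.
        bits = sign | 0x7F800000u | (mant << 13);
    } else if (exp != 0) {
        // Normal number: rebias the exponent from 15 to 127 (difference 112).
        bits = sign | ((exp + 112u) << 23) | (mant << 13);
    } else if (mant == 0) {
        // Signed zero.
        bits = sign;
    } else {
        // Subnormal half: normalize the mantissa, adjusting the exponent.
        std::uint32_t shift = 0;
        while ((mant & 0x400u) == 0) {
            mant <<= 1;
            ++shift;
        }
        mant &= 0x3FFu;  // drop the now-implicit leading one
        bits = sign | ((113u - shift) << 23) | (mant << 13);
    }

    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical model of folding a constant <8 x i16> operand:
// lanes 4..7 are never read, which is why they are not "demanded".
std::array<float, 4> foldCvtph2ps128(const std::array<std::uint16_t, 8>& in) {
    std::array<float, 4> out{};
    for (std::size_t i = 0; i < 4; ++i)
        out[i] = halfBitsToFloat(in[i]);
    return out;
}
```

Because the upper four lanes never reach the output, any computation feeding only those lanes can be simplified away, which is the property the demanded-elements part of the change relies on.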
Differential D12731
[InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding
Authored by RKSimon on Sep 9 2015, 8:00 AM.
Event Timeline
LGTM
Thanks Ahmed, I'll look into doing something similar for CVTPS2PH in a future patch, although I'm a little concerned about matching all the (compile-time) rounding modes.