Page MenuHomePhabricator

[X86][FastISel] Teach how to select float-half conversion intrinsics.

Authored by andreadb on Feb 16 2015, 8:46 AM.



This patch teaches X86FastISel how to select intrinsic 'convert_from_fp16' and intrinsic 'convert_to_fp16'.

If the target has F16C (and no -soft-float), we can select instruction VCVTPS2PHrr for a float-to-half conversion, and VCVTPH2PSrr for a half-to-float conversion.

Added test fast-isel-float-half-convertion.ll to check that fast-isel doesn't fail to select float-half conversions if the target has F16C.

Please let me know if ok to submit.


Diff Detail

Event Timeline

andreadb updated this revision to Diff 20036.Feb 16 2015, 8:46 AM
andreadb retitled this revision from to [X86][FastISel] Teach how to select float-half conversion intrinsics..
andreadb updated this object.
andreadb edited the test plan for this revision. (Show Details)
andreadb added reviewers: mkuper, ributzka, qcolombet.
andreadb added a subscriber: Unknown Object (MLST).
ab added a subscriber: ab.Feb 16 2015, 11:44 AM
qcolombet added inline comments.Feb 17 2015, 11:03 AM

Shouldn’t we have some checks that the type is not double for any cases?


I think it would be cleaner to generate:
res = implicit_def
res2 = insert_subreg res, inputreg, 0

A copy with mismatching size sounds wrong to me.


EXTRACT_SUBREG here I believe.


Could you add tests with doubles?

I may be wrong but I thought the intrinsic allows any floating type.

Hi Quentin,


Right, I should check that neither the operand nor the return type is double.
I didn't take into account the fact that the intrinsic allows any floating point type.


Ok, I will change it.


I will fix it.


Right, the intrinsic allows any floating point type.
What if I add those tests into a separate test file maybe an XFAIL test)?

My concern is that if I add extra tests for doubles in this same file, then the test will start failing because of flag -fast-isel-abort. What do you think?

qcolombet added inline comments.Feb 17 2015, 1:02 PM

Good point.

Sounds good to me.

andreadb updated this revision to Diff 20399.Feb 20 2015, 8:06 AM

Hi Quentin,

Here is a new version of the patch. which hopefully addresses all your comments.

This patch checks that the operand type of intrinsic 'convert_to_fp16' is 'float', and that the return type of intrinsic 'convert_from_fp16' is 'float'. Those checks are required because both intrinsics may accept 'any' floating point type (even 'double' and 'long double').

As you suggested, I added another test (named 'fast-isel-float-double-convertion.ll') to check that fast-isel doesn't accidentally select a wrong instruction for double-to-half conversions. This new test is currently marked XFAIL since fast-isel only knows how to select float-to-half and half-to-float conversions.

In the previous patch you suggested to use an INSERT_SUBREG to perform an element insertion into a vector.
However, INSERT_SUBREG requires a valid sub-register index operand to identify which sub-register we want to address. Unfortunately, register class VR128 doesn't allow to use any sub-register index; therefore we cannot use insert_subreg to address the lower 32-bits of a VR128 register.

Instead, I implemented the element insertion (from GR32 to VR128) using tablegen'd function 'fastEmit_r' to emit the equivalent of a SCALAR_TO_VECTOR.
Conversions from FR32-to-VR128 are implicitly handled by method 'constrainOperandRegClass' (used by all the 'fastEmitInst_*' methods in FastISel).

We cannot use an 'extract_subreg' to extract a FR32 from VR128 for the same reason why we cannot use 'insert_subreg' on to promote an FR32 to VR128 (i.e. there is no sub_reg index that we can use). I found out that it is perfectly ok to 'copy' from register class VR128 to class FR32; the two classes are basically identical except for the accepted value types. This is also what ISel normally does when promoting FR32 to VR128 (and from VR128 to FR32). See for example the tablegen patterns in

For example:

(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))) -> 
    (COPY_TO_REGCLASS (v4f32 VR128:$src), FR32)

(v4f32 (scalar_to_vector FR32:%src)) ->
    (COPY_TO_REGCLASS FR32:$src, VR128)

Please let me know if ok to submit.


qcolombet accepted this revision.Feb 20 2015, 10:39 AM
qcolombet edited edge metadata.

Hi Andrea,


Thanks for checking.


This revision is now accepted and ready to land.Feb 20 2015, 10:39 AM
This revision was automatically updated to reflect the committed changes.

Thanks Quentin!
Committed revision 230043.