This is an archive of the discontinued LLVM Phabricator instance.

[X86][FastISel] Teach how to select float-half conversion intrinsics.
Closed, Public

Authored by andreadb on Feb 16 2015, 8:46 AM.

Details

Reviewers

qcolombet
mkuper
ributzka

Commits

rG7035178aebc9: [X86][FastIsel] Teach how to select float-half conversion intrinsics.
rL230043: [X86][FastIsel] Teach how to select float-half conversion intrinsics.

Summary

This patch teaches X86FastISel how to select intrinsic 'convert_from_fp16' and intrinsic 'convert_to_fp16'.

If the target has F16C (and no -soft-float), we can select instruction VCVTPS2PHrr for a float-to-half conversion, and VCVTPH2PSrr for a half-to-float conversion.
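As an illustrative aside (not part of the patch), the per-element arithmetic that the two instructions perform can be sketched in portable C++. The function names `halfToFloat` and `floatToHalf` are hypothetical, and the float-to-half direction assumes round-to-nearest-even, which corresponds to the rounding-control immediate 0 that the patch passes to VCVTPS2PHrr. NaN payload handling is simplified.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical reference helper: widen an IEEE binary16 bit pattern to float.
float halfToFloat(std::uint16_t h) {
  std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t mant = h & 0x3FFu;
  std::uint32_t bits;
  if (exp == 0x1Fu) {
    bits = sign | 0x7F800000u | (mant << 13);          // infinity or NaN
  } else if (exp != 0) {
    bits = sign | ((exp + 112u) << 23) | (mant << 13); // normal: rebias 15 -> 127
  } else if (mant != 0) {
    int shift = 0;                                     // subnormal: normalize it
    while ((mant & 0x400u) == 0) { mant <<= 1; ++shift; }
    mant &= 0x3FFu;
    bits = sign | (static_cast<std::uint32_t>(113 - shift) << 23) | (mant << 13);
  } else {
    bits = sign;                                       // signed zero
  }
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// Hypothetical reference helper: narrow a float to binary16, rounding to
// nearest with ties to even.
std::uint16_t floatToHalf(float f) {
  std::uint32_t bits;
  std::memcpy(&bits, &f, sizeof bits);
  std::uint32_t sign = (bits >> 16) & 0x8000u;
  std::uint32_t exp = (bits >> 23) & 0xFFu;
  std::uint32_t mant = bits & 0x7FFFFFu;
  if (exp == 0xFFu)                                    // infinity or NaN
    return static_cast<std::uint16_t>(
        sign | 0x7C00u | (mant ? (0x200u | (mant >> 13)) : 0u));
  int e = static_cast<int>(exp) - 127 + 15;            // rebias 127 -> 15
  if (e >= 0x1F)
    return static_cast<std::uint16_t>(sign | 0x7C00u); // overflow to infinity
  if (e <= 0) {                                        // result is subnormal or zero
    if (e < -10)
      return static_cast<std::uint16_t>(sign);         // below half the smallest subnormal
    mant |= 0x800000u;                                 // make the implicit 1 explicit
    int shift = 14 - e;                                // between 14 and 24
    std::uint32_t q = mant >> shift;
    std::uint32_t rem = mant & ((1u << shift) - 1u);
    std::uint32_t halfway = 1u << (shift - 1);
    if (rem > halfway || (rem == halfway && (q & 1u))) ++q;
    return static_cast<std::uint16_t>(sign | q);
  }
  std::uint32_t q = (static_cast<std::uint32_t>(e) << 10) | (mant >> 13);
  std::uint32_t rem = mant & 0x1FFFu;
  // A carry out of the mantissa bumps the exponent, which is the desired result.
  if (rem > 0x1000u || (rem == 0x1000u && (q & 1u))) ++q;
  return static_cast<std::uint16_t>(sign | q);
}
```

For instance, `floatToHalf(1.0f)` yields `0x3C00` and `halfToFloat(0xC000)` yields `-2.0f`. The hardware instructions operate on packed vectors; this sketch only models a single element.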

Added test fast-isel-float-half-convertion.ll to check that fast-isel doesn't fail to select float-half conversions if the target has F16C.

Please let me know if ok to submit.

Thanks,
Andrea

Event Timeline

andreadb updated this revision to Diff 20036. Feb 16 2015, 8:46 AM

andreadb retitled this revision to [X86][FastISel] Teach how to select float-half conversion intrinsics.

andreadb updated this object.

andreadb edited the test plan for this revision.

andreadb added reviewers: mkuper, ributzka, qcolombet.

andreadb added a subscriber: Unknown Object (MLST).

ab added a subscriber: ab. Feb 16 2015, 11:44 AM

qcolombet added inline comments. Feb 17 2015, 11:03 AM

lib/Target/X86/X86FastISel.cpp
2149	Shouldn’t we have some checks that the type is not double for any cases?
2162	I think it would be cleaner to generate:
	res = implicit_def
	res2 = insert_subreg res, inputreg, 0
	A copy with mismatching size sounds wrong to me.
2182	EXTRACT_SUBREG here I believe.
test/CodeGen/X86/fast-isel-float-half-convertion.ll
7	Could you add tests with doubles? I may be wrong but I thought the intrinsic allows any floating type.

Hi Quentin,

lib/Target/X86/X86FastISel.cpp
2149	Right, I should check that neither the operand nor the return type is double. I didn't take into account the fact that the intrinsic allows any floating point type.
2162	Ok, I will change it.
2182	I will fix it.
test/CodeGen/X86/fast-isel-float-half-convertion.ll
7	Right, the intrinsic allows any floating point type. What if I add those tests into a separate test file (maybe an XFAIL test)? My concern is that if I add extra tests for doubles in this same file, then the test will start failing because of the flag -fast-isel-abort. What do you think?

qcolombet added inline comments. Feb 17 2015, 1:02 PM

test/CodeGen/X86/fast-isel-float-half-convertion.ll
7	Good point. Sounds good to me.

Hi Quentin,

Here is a new version of the patch, which hopefully addresses all your comments.

This patch checks that the operand type of intrinsic 'convert_to_fp16' is 'float', and that the return type of intrinsic 'convert_from_fp16' is 'float'. Those checks are required because both intrinsics may accept 'any' floating point type (even 'double' and 'long double').

As you suggested, I added another test (named 'fast-isel-float-double-convertion.ll') to check that fast-isel doesn't accidentally select a wrong instruction for double-to-half conversions. This new test is currently marked XFAIL since fast-isel only knows how to select float-to-half and half-to-float conversions.
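As a rough illustration of the checks described above (an editorial sketch, not the committed code), the decision can be modeled in self-contained C++. All names here (`FPType`, `Conversion`, `TargetModel`, `Opcode`, `selectHalfConversion`) are hypothetical stand-ins and are not LLVM APIs. Returning `Opcode::None` models fast-isel declining to select the intrinsic, which is the case the XFAIL test exercises for 'double'.

```cpp
// Hypothetical model of the floating point types the intrinsics may carry.
enum class FPType { Float, Double, X86FP80 };

// Which of the two intrinsics is being lowered.
enum class Conversion { ToFP16, FromFP16 };

// Hypothetical stand-in for the subtarget feature and target option queries.
struct TargetModel {
  bool hasF16C;
  bool useSoftFloat;
};

enum class Opcode { None, VCVTPS2PHrr, VCVTPH2PSrr };

// fpType is the operand type for ToFP16 and the result type for FromFP16.
// Opcode::None means "give up and let another selector handle it".
Opcode selectHalfConversion(Conversion conv, FPType fpType,
                            const TargetModel &target) {
  // F16C instructions are unavailable, or hardware FP use is disabled.
  if (target.useSoftFloat || !target.hasF16C)
    return Opcode::None;
  // Only the 'float' flavor maps directly onto an F16C instruction.
  if (fpType != FPType::Float)
    return Opcode::None;
  return conv == Conversion::ToFP16 ? Opcode::VCVTPS2PHrr
                                    : Opcode::VCVTPH2PSrr;
}
```

With an F16C target, `selectHalfConversion(Conversion::ToFP16, FPType::Float, {true, false})` gives `Opcode::VCVTPS2PHrr`, while the same call with `FPType::Double` gives `Opcode::None`.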

In the previous patch you suggested using an INSERT_SUBREG to perform an element insertion into a vector.
However, INSERT_SUBREG requires a valid sub-register index operand to identify which sub-register we want to address. Unfortunately, register class VR128 does not support any sub-register index; therefore, we cannot use insert_subreg to address the lower 32 bits of a VR128 register.

Instead, I implemented the element insertion (from GR32 to VR128) using tablegen'd function 'fastEmit_r' to emit the equivalent of a SCALAR_TO_VECTOR.
Conversions from FR32 to VR128 are implicitly handled by method 'constrainOperandRegClass' (used by all the 'fastEmitInst_*' methods in FastISel).

We cannot use an 'extract_subreg' to extract an FR32 from a VR128 for the same reason we cannot use 'insert_subreg' to promote an FR32 to VR128 (i.e. there is no sub-register index that we can use). I found out that it is perfectly ok to 'copy' from register class VR128 to class FR32; the two classes are basically identical except for the accepted value types. This is also what ISel normally does when promoting FR32 to VR128 (and when going from VR128 to FR32). See for example the tablegen patterns in X86InstrSSE.td.

For example:

(f32 (vector_extract (v4f32 VR128:$src), (iPTR 0))) -> 
    (COPY_TO_REGCLASS (v4f32 VR128:$src), FR32)

(v4f32 (scalar_to_vector FR32:$src)) ->
    (COPY_TO_REGCLASS FR32:$src, VR128)
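
A conceptual way to picture why a plain copy between the two register classes is harmless (an editorial sketch under simplifying assumptions, not LLVM code): both classes name the same 128-bit XMM storage, and only the interpretation differs. The type `XmmReg` and the helpers `readAsFR32` and `writeScalarLane0` are hypothetical.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical model of one 128-bit XMM register as four 32-bit lanes.
struct XmmReg {
  std::array<std::uint32_t, 4> lanes;
};

// "FR32 view": interpret lane 0 as a scalar float; the other lanes are ignored.
float readAsFR32(const XmmReg &reg) {
  float f;
  std::memcpy(&f, &reg.lanes[0], sizeof f);
  return f;
}

// Place a scalar in lane 0, leaving lanes 1..3 untouched. A register-class
// copy moves no bits between lanes, so this is all the model needs.
void writeScalarLane0(XmmReg &reg, float value) {
  std::memcpy(&reg.lanes[0], &value, sizeof value);
}
```

Reading a register whose lane 0 holds `0x3FC00000` through `readAsFR32` gives `1.5f`, and the remaining lanes are the same before and after, which mirrors the COPY_TO_REGCLASS patterns shown above.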

Please let me know if ok to submit.

Thanks,
Andrea

Hi Andrea,

LGTM.

Thanks for checking.

Quentin

This revision is now accepted and ready to land. Feb 20 2015, 10:39 AM

Closed by commit rL230043: [X86][FastIsel] Teach how to select float-half conversion intrinsics. (authored by adibiagio). Feb 20 2015, 11:39 AM

This revision was automatically updated to reflect the committed changes.

Thanks Quentin!
Committed revision 230043.

Revision Contents

Path	Size
lib/Target/X86/X86FastISel.cpp	54 lines
test/CodeGen/X86/fast-isel-float-half-convertion.ll	21 lines

Diff 20036

lib/Target/X86/X86FastISel.cpp

(2,130 lines not shown; enclosing function: X86FastISel::TryEmitSmallMemcpy)

return true;
}

bool X86FastISel::fastLowerIntrinsicCall(const IntrinsicInst *II) {
// FIXME: Handle more intrinsics.
switch (II->getIntrinsicID()) {
default: return false;
		case Intrinsic::convert_from_fp16:
		case Intrinsic::convert_to_fp16: {
		if (TM.Options.UseSoftFloat || !Subtarget->hasF16C())
		return false;

		const Value *Op = II->getArgOperand(0);
		bool IsFloatToHalf = II->getIntrinsicID() == Intrinsic::convert_to_fp16;
		// F16C allows converting from float to half and from half to float.
		// In the case of float-to-half conversion, the type must be a float.
		if (IsFloatToHalf && !Op->getType()->isFloatTy())
		return false;

		unsigned InputReg = getRegForValue(Op);
		if (!IsFloatToHalf) {
		assert(Op->getType()->isIntegerTy(16) && "Expected a 16-bit integer!");
		// Explicitly sign-extend the input value to 32-bit.
		InputReg = fastEmit_r(MVT::i16, MVT::i32, ISD::SIGN_EXTEND,
		InputReg, /*Kill=*/false);
		}

		// Copy to a vector (VR128) register class.
		const TargetRegisterClass *RC = TLI.getRegClassFor(MVT::v8i16);
		unsigned ResultReg = createResultReg(RC);
		BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
		TII.get(TargetOpcode::COPY), ResultReg).addReg(InputReg);
		InputReg = ResultReg;

		// Now generate a VCVTPS2PHrr/VCVTPH2PSrr.
		ResultReg = createResultReg(RC);
		unsigned Opc = IsFloatToHalf ? X86::VCVTPS2PHrr : X86::VCVTPH2PSrr;
		MachineInstrBuilder MIB;
		MIB = BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, TII.get(Opc),
		ResultReg).addReg(InputReg, RegState::Kill);
		if (IsFloatToHalf)
		// Instruction VCVTPS2PHrr requires an extra immediate operand that
		// provides rounding control.
		MIB.addImm(0);
		InputReg = ResultReg;

		// Emit another copy to register class.
		RC = IsFloatToHalf ? &X86::GR32RegClass : &X86::FR32RegClass;
		ResultReg = createResultReg(RC);
		MIB = BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
		TII.get(TargetOpcode::COPY), ResultReg);
		MIB.addReg(InputReg, RegState::Kill);

		if (IsFloatToHalf)
		// In the case of float-to-half conversions, the half float is in
		// the lower 16-bits of ResultReg.
		ResultReg = fastEmitInst_extractsubreg(MVT::i16, ResultReg, /*Kill=*/true,
		X86::sub_16bit);
		updateValueMap(II, ResultReg);
		return true;
		}
case Intrinsic::frameaddress: {
MachineFunction *MF = FuncInfo.MF;
if (MF->getTarget().getMCAsmInfo()->usesWindowsCFI())
return false;

Type *RetTy = II->getCalledFunction()->getReturnType();

MVT VT;
(1,223 lines not shown)

test/CodeGen/X86/fast-isel-float-half-convertion.ll

				; RUN: llc -O0 -fast-isel-abort -mtriple=x86_64-unknown-unknown -mattr=+f16c < %s | FileCheck %s

				define i16 @test_fp32_to_fp16(float %a) {
				; CHECK-LABEL: test_fp32_to_fp16:
				; CHECK: vcvtps2ph
				entry:
				%0 = call i16 @llvm.convert.to.fp16.f32(float %a)
				ret i16 %0
				}

				define float @test_fp16_to_fp32(i16 signext %a) {
				; CHECK-LABEL: test_fp16_to_fp32:
				; CHECK: vcvtph2ps
				entry:
				%0 = call float @llvm.convert.from.fp16.f32(i16 %a)
				ret float %0
				}


				declare i16 @llvm.convert.to.fp16.f32(float)
				declare float @llvm.convert.from.fp16.f32(i16)