This is an archive of the discontinued LLVM Phabricator instance.

[X86][FastIsel] Teach how to select scalar integer to float/double conversions.
ClosedPublic

Authored by andreadb on Feb 17 2015, 5:05 AM.

Details

Reviewers

qcolombet
mkuper
ributzka

Commits

rGe7b58ee55525: [X86][FastIsel] Teach how to select scalar integer to float/double conversions.
rL229589: [X86][FastIsel] Teach how to select scalar integer to float/double conversions.

Summary

This patch teaches fast-isel how to handle integer to float/double conversions.
In particular, this patch teaches fast-isel how to select a (V)CVTSI2SSrr for an integer to float conversion, and how to select (V)CVTSI2SDrr for an integer to double conversion.

Added test 'fast-isel-int-float-conversion.ll'.
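For readers less familiar with the operation being selected, here is a minimal C++ sketch of the scalar conversions that `sitofp i32` expresses. The function names are hypothetical and not part of the patch; on x86-64 a compiler will typically lower these casts to cvtsi2sd/cvtsi2ss (or their VEX forms with AVX), though the exact output depends on the compiler and flags.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: C++ analogue of `sitofp i32 %a to double`.
// Every 32-bit integer is exactly representable as a double.
double intToDouble(std::int32_t A) { return static_cast<double>(A); }

// Hypothetical helper: C++ analogue of `sitofp i32 %a to float`.
// Large magnitudes may be rounded, since float has a 24-bit significand.
float intToFloat(std::int32_t A) { return static_cast<float>(A); }

// Small self-check on values that are exactly representable in both types.
inline void checkSmallValues() {
  assert(intToDouble(-7) == -7.0);
  assert(intToFloat(42) == 42.0f);
}
```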

Please let me know if ok to submit.

Thanks,
Andrea

Diff Detail

Repository: rL LLVM

Event Timeline

andreadb updated this revision to Diff 20079. Feb 17 2015, 5:05 AM

andreadb retitled this revision to [X86][FastIsel] Teach how to select scalar integer to float/double conversions.

andreadb updated this object.

andreadb edited the test plan for this revision.

andreadb added reviewers: mkuper, ributzka, qcolombet.

andreadb added a subscriber: Unknown Object (MLST).

Hi Andrea,

The code looks mostly good to me, but I am confused by the tests.
Shouldn’t we match some cvt instructions for the SSE2 pattern?

Thanks,
-Quentin

lib/Target/X86/X86FastISel.cpp
2010 ↗	(On Diff #20079)	Use an early exit here.
test/CodeGen/X86/fast-isel-int-float-conversion.ll
7 ↗	(On Diff #20079)	Shouldn’t we have some check for the actual pattern for SSE2?

Hi Quentin,

lib/Target/X86/X86FastISel.cpp
2010 ↗	(On Diff #20079)	I will use an early-exit.
test/CodeGen/X86/fast-isel-int-float-conversion.ll
7 ↗	(On Diff #20079)	Yes, for SSE2 we could match 'cvtsi2sdl %edi, %xmm0'. I just wanted to make sure that with SSE2 we didn't have the vex prefix. I agree that it is a bit confusing so I will change it.

Hi Quentin,

Here is an updated patch.
As you suggested, I added an early exit in X86SelectSIToFP. I also fixed the check lines for SSE2.

Please let me know what you think.

Thanks in advance,
Andrea
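As an illustration of the early-exit style requested above, the following sketch contrasts a nested check with guard clauses. Both functions are hypothetical stand-ins with plain integer arguments; they are not the code from the patch, which works on LLVM `Instruction` objects.

```cpp
#include <cassert>

// Hypothetical nested form: the success path is buried inside the checks.
bool canSelectNested(unsigned SrcBits, unsigned OpReg) {
  if (SrcBits == 32) {
    if (OpReg != 0) {
      return true;
    }
  }
  return false;
}

// Hypothetical early-exit form: each unsupported case bails out at once,
// leaving the main path unindented.
bool canSelectEarlyExit(unsigned SrcBits, unsigned OpReg) {
  if (SrcBits != 32)
    return false; // only 32-bit integer sources are handled
  if (OpReg == 0)
    return false; // no register could be assigned to the operand
  return true;
}
```

The two forms return the same result for every input; the second simply keeps the control flow flat.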

ributzka added inline comments. Feb 17 2015, 10:40 AM

lib/Target/X86/X86FastISel.cpp
2046 ↗	(On Diff #20097)	You may want to constrain the register class with "constrainOperandRegClass".

Thanks Andrea!

LGTM with Juergen’s suggestion. I believe this is not needed for correctness, but it is definitely cleaner!

Cheers,
-Quentin

test/CodeGen/X86/fast-isel-int-float-conversion.ll
7 ↗	(On Diff #20097)	You could still keep the SSE2-NOT with vcvt stuff if you want to be sure :).

This revision is now accepted and ready to land. Feb 17 2015, 10:48 AM

Hi Quentin, Juergen

thanks a lot for the reviews!
So, I have added a call to 'constrainOperandRegClass' to ensure that 'OpReg' is in the correct register class.
Since I am not very familiar with that method, before committing I just wanted to double-check with you if this new version of the patch is ok to commit :-).

Thanks again for your time!
Andrea

qcolombet added inline comments. Feb 17 2015, 1:57 PM

lib/Target/X86/X86FastISel.cpp
2042 ↗	(On Diff #20100)	Should be just 1 instead of (HasAVX …). Indeed, OpReg is the equivalent of the first operand for the description of the CVT. The fact that you will insert an implicit def is orthogonal so you do not have to account for that when doing the query.

Hi Quentin,

lib/Target/X86/X86FastISel.cpp
2042 ↗	(On Diff #20100)	I tried to just pass 1 instead of checking if the target HasAVX. However, the generated code is wrong. For example, with that change, I get the following assembly:

    int_to_double_rr:
      vmovd %edi, %xmm1
      vcvtsi2sd %xmm1, %xmm0, %xmm0
    int_to_float_rr:
      vmovd %edi, %xmm1
      vcvtsi2ssl %xmm1, %xmm0, %xmm0
      retq

The reason I added a check for AVX is that OpReg is expected to be the last input operand. According to X86InstrSSE.td, OpReg should be the second input operand for the AVX variant. If I pass index 2 on AVX, then I get the correct results. This seems to suggest that we have to account for the IMPLICIT_DEF when doing the query.
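The operand-index argument can be sketched with a toy model. Everything below (`ToyInstrDesc`, `RegKind`, `integerSourceIndex`, `kindAt`) is hypothetical and is not LLVM's `MCInstrDesc` API; the operand layouts are assumed from the discussion: the legacy form takes (dst, int source) and the VEX form takes (dst, FP source 1, int source 2).

```cpp
#include <cassert>
#include <cstddef>

// Toy model only: index 0 is the result register, the rest are inputs.
enum class RegKind { FP, GPR32 };

struct ToyInstrDesc {
  const char *Name;
  RegKind Operands[3];
  std::size_t NumOperands;
};

// Legacy SSE form: (dst, int_src). The third slot is unused padding here.
constexpr ToyInstrDesc ToyCVTSI2SDrr = {
    "CVTSI2SDrr", {RegKind::FP, RegKind::GPR32, RegKind::FP}, 2};

// VEX form: (dst, fp_src1, int_src2).
constexpr ToyInstrDesc ToyVCVTSI2SDrr = {
    "VCVTSI2SDrr", {RegKind::FP, RegKind::FP, RegKind::GPR32}, 3};

// Index of the integer source, matching the (HasAVX ? 2 : 1) in the patch.
constexpr std::size_t integerSourceIndex(bool HasAVX) { return HasAVX ? 2 : 1; }

constexpr RegKind kindAt(const ToyInstrDesc &Desc, std::size_t Idx) {
  return Desc.Operands[Idx];
}

static_assert(kindAt(ToyCVTSI2SDrr, integerSourceIndex(false)) == RegKind::GPR32,
              "legacy form: integer source is operand 1");
static_assert(kindAt(ToyVCVTSI2SDrr, integerSourceIndex(true)) == RegKind::GPR32,
              "VEX form: integer source is operand 2");
// Querying index 1 on the VEX form yields an FP register slot instead.
static_assert(kindAt(ToyVCVTSI2SDrr, 1) == RegKind::FP,
              "VEX form: operand 1 is the FP pass-through source");
```

Under this model, constraining OpReg against index 1 of the VEX form would ask for an FP register, which is consistent with the extra vmovd copy in the assembly quoted above.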

Please commit!

lib/Target/X86/X86FastISel.cpp
2042 ↗	(On Diff #20100)	Indeed! I forgot that the AVX variant copied the first operand. Sorry for the noise!

Closed by commit rL229589: [X86][FastIsel] Teach how to select scalar integer to float/double conversions. (authored by adibiagio). Feb 17 2015, 3:42 PM

This revision was automatically updated to reflect the committed changes.

Hi Andrea,

On second thought, why do we need to use the AVX variant at all if we never use the first source operand?
The bonus is that it would kill all the code related to the implicit def.

Thanks,
-Quentin

In D7698#125598, @qcolombet wrote:

Hi Andrea,

On second thought, why do we need to use the AVX variant at all if we never use the first source operand?
The bonus is that it would kill all the code related to the implicit def.

Hi Quentin,

The main reason (at least from my point of view) is that we want to avoid mixing legacy SSE instructions and AVX instructions as much as possible. Otherwise, we end up increasing AVX-SSE transition penalties. That's why (basically everywhere in X86FastISel) we always prefer vex encoded instructions over legacy SSE instructions.

I hope this makes sense.
-Andrea
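A minimal sketch of the policy described above: once AVX is available, pick the VEX-encoded opcode even when the legacy one would compute the same value. The function names and the use of strings for opcodes are hypothetical simplifications; the opcode names themselves are the ones used in the patch.

```cpp
#include <cassert>
#include <cstring>

enum class FPType { Float, Double };

// Hypothetical helper: returns the opcode name the patch would pick for a
// signed i32 to FP conversion, preferring the VEX form whenever AVX is on.
constexpr const char *selectSIToFPOpcodeName(FPType Ty, bool HasAVX) {
  return Ty == FPType::Double ? (HasAVX ? "VCVTSI2SDrr" : "CVTSI2SDrr")
                              : (HasAVX ? "VCVTSI2SSrr" : "CVTSI2SSrr");
}

// Toy convention for this sketch: VEX opcode names start with 'V'.
constexpr bool isVexName(const char *Name) { return Name[0] == 'V'; }
```

With this rule a function compiled for AVX never emits the legacy encoding for these conversions, which is the property the reply relies on to avoid mixing the two encodings.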

Yeah, that makes sense.
I was just wondering if that specific instruction had such a penalty.

Thanks for double checking!

Quentin

Revision Contents

Path	Size
llvm/trunk/lib/Target/X86/X86FastISel.cpp	48 lines
llvm/trunk/test/CodeGen/X86/fast-isel-int-float-conversion.ll	45 lines

Diff 20118

llvm/trunk/lib/Target/X86/X86FastISel.cpp

[123 lines hidden; context: private:]

    bool X86SelectTrunc(const Instruction *I);

    bool X86SelectFPExtOrFPTrunc(const Instruction *I, unsigned Opc,
                                 const TargetRegisterClass *RC);

    bool X86SelectFPExt(const Instruction *I);
    bool X86SelectFPTrunc(const Instruction *I);
+   bool X86SelectSIToFP(const Instruction *I);

    const X86InstrInfo *getInstrInfo() const {
      return Subtarget->getInstrInfo();
    }
    const X86TargetMachine *getTargetMachine() const {
      return static_cast<const X86TargetMachine *>(&TM);
    }

[1,860 lines hidden; context: bool X86FastISel::X86SelectSelect(const Instruction *I) {]

    // Fall-back to pseudo conditional move instructions, which will be later
    // converted to control-flow.
    if (X86FastEmitPseudoSelect(RetVT, I))
      return true;

    return false;
  }

+ bool X86FastISel::X86SelectSIToFP(const Instruction *I) {
+   if (!I->getOperand(0)->getType()->isIntegerTy(32))
+     return false;
+
+   // Select integer to float/double conversion.
+   unsigned OpReg = getRegForValue(I->getOperand(0));
+   if (OpReg == 0)
+     return false;
+
+   bool HasAVX = Subtarget->hasAVX();
+   const TargetRegisterClass *RC = nullptr;
+   unsigned Opcode;
+
+   if (I->getType()->isDoubleTy() && X86ScalarSSEf64) {
+     // sitofp int -> double
+     Opcode = HasAVX ? X86::VCVTSI2SDrr : X86::CVTSI2SDrr;
+     RC = &X86::FR64RegClass;
+   } else if (I->getType()->isFloatTy() && X86ScalarSSEf32) {
+     // sitofp int -> float
+     Opcode = HasAVX ? X86::VCVTSI2SSrr : X86::CVTSI2SSrr;
+     RC = &X86::FR32RegClass;
+   } else
+     return false;
+
+
+   unsigned ImplicitDefReg = 0;
+   if (HasAVX) {
+     ImplicitDefReg = createResultReg(RC);
+     BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc,
+             TII.get(TargetOpcode::IMPLICIT_DEF), ImplicitDefReg);
+   }
+
+   const MCInstrDesc &II = TII.get(Opcode);
+   OpReg = constrainOperandRegClass(II, OpReg, (HasAVX ? 2 : 1));
+
+   unsigned ResultReg = createResultReg(RC);
+   MachineInstrBuilder MIB;
+   MIB = BuildMI(*FuncInfo.MBB, FuncInfo.InsertPt, DbgLoc, II, ResultReg);
+   if (ImplicitDefReg)
+     MIB.addReg(ImplicitDefReg, RegState::Kill);
+   MIB.addReg(OpReg);
+   updateValueMap(I, ResultReg);
+   return true;
+ }
+
  // Helper method used by X86SelectFPExt and X86SelectFPTrunc.
  bool X86FastISel::X86SelectFPExtOrFPTrunc(const Instruction *I,
                                            unsigned TargetOpc,
                                            const TargetRegisterClass *RC) {
    assert((I->getOpcode() == Instruction::FPExt ||
            I->getOpcode() == Instruction::FPTrunc) &&
           "Instruction must be an FPExt or FPTrunc!");

[1,034 lines hidden; context: X86FastISel::fastSelectInstruction(const Instruction *I) {]

    case Instruction::Select:
      return X86SelectSelect(I);
    case Instruction::Trunc:
      return X86SelectTrunc(I);
    case Instruction::FPExt:
      return X86SelectFPExt(I);
    case Instruction::FPTrunc:
      return X86SelectFPTrunc(I);
+   case Instruction::SIToFP:
+     return X86SelectSIToFP(I);
    case Instruction::IntToPtr: // Deliberate fall-through.
    case Instruction::PtrToInt: {
      EVT SrcVT = TLI.getValueType(I->getOperand(0)->getType());
      EVT DstVT = TLI.getValueType(I->getType());
      if (DstVT.bitsGT(SrcVT))
        return X86SelectZExt(I);
      if (DstVT.bitsLT(SrcVT))
        return X86SelectTrunc(I);

[304 lines hidden]

llvm/trunk/test/CodeGen/X86/fast-isel-int-float-conversion.ll

; RUN: llc -mtriple=x86_64-unknown-unknown -mcpu=generic -mattr=+sse2 -O0 --fast-isel-abort < %s | FileCheck %s --check-prefix=ALL --check-prefix=SSE2
; RUN: llc -mtriple=x86_64-unknown-unknown -mcpu=generic -mattr=+avx -O0 --fast-isel-abort < %s | FileCheck %s --check-prefix=ALL --check-prefix=AVX


define double @int_to_double_rr(i32 %a) {
; ALL-LABEL: int_to_double_rr:
; SSE2: cvtsi2sdl %edi, %xmm0
; AVX: vcvtsi2sdl %edi, %xmm0, %xmm0
; ALL-NEXT: ret
entry:
  %0 = sitofp i32 %a to double
  ret double %0
}

define double @int_to_double_rm(i32* %a) {
; ALL-LABEL: int_to_double_rm:
; SSE2: cvtsi2sdl (%rdi), %xmm0
; AVX: vcvtsi2sdl (%rdi), %xmm0, %xmm0
; ALL-NEXT: ret
entry:
  %0 = load i32* %a
  %1 = sitofp i32 %0 to double
  ret double %1
}

define float @int_to_float_rr(i32 %a) {
; ALL-LABEL: int_to_float_rr:
; SSE2: cvtsi2ssl %edi, %xmm0
; AVX: vcvtsi2ssl %edi, %xmm0, %xmm0
; ALL-NEXT: ret
entry:
  %0 = sitofp i32 %a to float
  ret float %0
}

define float @int_to_float_rm(i32* %a) {
; ALL-LABEL: int_to_float_rm:
; SSE2: cvtsi2ssl (%rdi), %xmm0
; AVX: vcvtsi2ssl (%rdi), %xmm0, %xmm0
; ALL-NEXT: ret
entry:
  %0 = load i32* %a
  %1 = sitofp i32 %0 to float
  ret float %1
}
