Details

Reviewers

javed.absar
SjoerdMeijer

Commits

rG16092ab3c5f5: [AArch64] added FP16 vcvth intrinsic support
rL333410: [AArch64] added FP16 vcvth intrinsic support

Summary

Change-Id: I0df845749c7689dfc99150ba7c19c7d0dadbd705

Diff Detail

Repository: rL LLVM

Event Timeline

LukeGeeson created this revision. May 1 2018, 6:35 AM

Herald added a reviewer: javed.absar. May 1 2018, 6:35 AM

Herald added subscribers: llvm-commits, kristof.beyls, rengolin.

LukeGeeson added a reviewer: SjoerdMeijer. May 1 2018, 6:36 AM

lebedev.ri added a subscriber: lebedev.ri. May 1 2018, 6:40 AM

lebedev.ri added inline comments.

test/CodeGen/AArch64/fp16_intrinsic_scalar_2op.ll
131–200 ↗	(On Diff #144703)	Is it intentional that there are no `; CHECK` lines?

Hi Luke, thanks for fixing this, some first comments inlined.

lib/Target/AArch64/AArch64InstrFormats.td
5983 ↗	(On Diff #144703)	Remove comments
7793 ↗	(On Diff #144703)	nit: trailing whitespace?
7799 ↗	(On Diff #144703)	nit: I don't think you need to break up the lines like this.
lib/Target/AArch64/AArch64InstrInfo.td
4880 ↗	(On Diff #144703)	Better not to introduce new line breaks here and also below.
test/CodeGen/AArch64/fp16_intrinsic_scalar_2op.ll
131 ↗	(On Diff #144703)	You should add CHECK lines for all these test cases (see examples above).

[AArch64] rm'd linebreaks, added CHECKS to fp16 intrinsics

SjoerdMeijer added inline comments. May 1 2018, 7:43 AM

test/CodeGen/AArch64/fp16_intrinsic_scalar_2op.ll
1 ↗	(On Diff #144710)	Please remove this line.
14 ↗	(On Diff #144710)	Please don't modify the existing test cases. There is no need to check for the %bb.0 stuff as it doesn't add any value.
146 ↗	(On Diff #144710)	Please remove this line here and similar lines in the test cases below.
147 ↗	(On Diff #144710)	You should create a regexp for w8, and use it in the line below. Currently the test is very fragile, because if another register gets allocated this test starts failing.
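
The request above, capture the register once and then reuse it, can be illustrated with a small Python model of FileCheck's variable syntax. This is only a sketch: `match_check` is a hypothetical helper, not part of LLVM, and it handles just literal text plus `[[NAME:regex]]` definitions and `[[NAME]]` uses. In real FileCheck, writing `[[wReg:[0-9]+]]` a second time is another definition and matches any register number; the bare `[[wReg]]` form is what ties two lines to the same register.

```python
import re

# Matches [[NAME]] (a use) or [[NAME:regex]] (a definition).
VAR = re.compile(r"\[\[(\w+)(?::(.+?))?\]\]")


def match_check(pattern, line, bound):
    """Match one simplified CHECK pattern against one line of output.

    [[NAME:regex]] captures text into NAME; [[NAME]] must repeat the text
    captured earlier. Everything else is literal. Assumes the embedded
    regexes contain no capture groups of their own.
    """
    regex, pos, new_names = "", 0, []
    for m in VAR.finditer(pattern):
        regex += re.escape(pattern[pos:m.start()])
        name, sub = m.group(1), m.group(2)
        if sub is not None:
            regex += "(" + sub + ")"         # definition: capture fresh text
            new_names.append(name)
        else:
            regex += re.escape(bound[name])  # use: require the bound text
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.search(regex, line)
    if found is None:
        return False
    for name, text in zip(new_names, found.groups()):
        bound[name] = text
    return True


# Hypothetical llc output in which the allocator picked w9 instead of w8.
output = ["sxth w9, w0", "fmov s0, w9"]
bound = {}
ok_def = match_check("sxth w[[wReg:[0-9]+]], w0", output[0], bound)
ok_use = match_check("fmov s0, w[[wReg]]", output[1], bound)
ok_hardcoded = match_check("fmov s0, w8", output[1], {})
print(ok_def, ok_use, ok_hardcoded)
```

The hard-coded `w8` pattern stops matching as soon as a different register is allocated, while the capture-and-reuse pair keeps matching and still checks that both instructions refer to the same register.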

rm'd additional %bb lines, added regex for w8

give me one moment, syntax error missed

lib/Target/AArch64/AArch64InstrFormats.td
7799 ↗	(On Diff #144703)	Would you remove line 7803 too? looks better for separation
test/CodeGen/AArch64/fp16_intrinsic_scalar_2op.ll
131–200 ↗	(On Diff #144703)	missed this, will add thanks

SjoerdMeijer added inline comments. May 1 2018, 8:40 AM

test/CodeGen/AArch64/fp16_intrinsic_scalar_2op.ll
156 ↗	(On Diff #144725)	What is the supported range of constant 'n' for this intrinsic? If it is e.g. [1,16], I think it is best to test the minimum value 1, which is what we do here, but also the maximum value 16. Same comment for the other intrinsics here.
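
One way to reason about the range question is through the instruction encoding. The sketch below assumes the same-width scalar fixed-point convert forms, for which the Arm ARM decodes `fbits = (esize * 2) - UInt(immh:immb)`, giving a legal range of [1, esize]. The helper names are hypothetical and the code does not model LLVM's operand classes; it only lists the minimum and maximum `n` worth testing for each width, together with the field value each would encode to.

```python
def fbits_bounds(esize):
    """Smallest and largest legal fractional-bit count for one element size."""
    if esize not in (16, 32, 64):
        raise ValueError(f"unsupported element size: {esize}")
    return 1, esize


def encode_immh_immb(esize, fbits):
    """Return the 7-bit immh:immb field, i.e. 2 * esize - fbits."""
    lo, hi = fbits_bounds(esize)
    if not lo <= fbits <= hi:
        raise ValueError(f"fbits {fbits} outside [{lo}, {hi}] for esize {esize}")
    return 2 * esize - fbits


for esize in (16, 32, 64):
    lo, hi = fbits_bounds(esize)
    # Show the two boundary encodings as 7-bit binary strings.
    print(esize,
          format(encode_immh_immb(esize, lo), "07b"),
          format(encode_immh_immb(esize, hi), "07b"))
```

Under this assumption the leading bits of the encoded field (`001`, `01`, `1`) identify the 16-, 32-, and 64-bit forms, which is why testing only `n = 1` leaves the other end of each range unexercised.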

[AArch64] fixed syntax errors for FP16 intrinsics

Harbormaster completed remote builds in B17576: Diff 144730. May 1 2018, 8:43 AM

[AArch64] added tests for FP16 intrinsic ranges

SjoerdMeijer added inline comments. May 2 2018, 3:35 AM

lib/Target/AArch64/AArch64InstrFormats.td
7793 ↗	(On Diff #144844)	Do we use OpNode?
7805 ↗	(On Diff #144844)	Nit1: spaces are off: FPR16 should be aligned under U. Nit2: space between ">{". Same for other rules below.
lib/Target/AArch64/AArch64InstrInfo.td
4869 ↗	(On Diff #144844)	Do we need to pass the intrinsics opnodes?

[AArch64] removed OpNode+intrinsic template params

LukeGeeson marked an inline comment as done. May 2 2018, 5:23 AM

LukeGeeson added inline comments.

lib/Target/AArch64/AArch64InstrFormats.td
7793 ↗	(On Diff #144844)	builds and tests fine without it - removed

[AArch64] added spaces between [] and {

LukeGeeson marked an inline comment as done. May 2 2018, 6:15 AM

Thanks, looks good to me now. One more nit inlined, but no need for another review.

lib/Target/AArch64/AArch64InstrFormats.td
5982 ↗	(On Diff #144870)	No changes were made here, so can you keep the old formatting?

This revision is now accepted and ready to land. May 2 2018, 6:28 AM

Sorry, I had one more look, see question inlined.

lib/Target/AArch64/AArch64InstrFormats.td
7804 ↗	(On Diff #144870)	Does this need to be vecshiftR32 and thus accept values [1,32]? If that's the case, we also need to update the tests.
7822 ↗	(On Diff #144870)	We don't need to change this?

This revision now requires changes to proceed. May 2 2018, 8:12 AM

[AArch64] modified fp16 intrinsics patterns

Herald added a reviewer: andreadb. May 4 2018, 3:42 AM

Herald added a reviewer: alexander-shaposhnikov.

Herald added subscribers: george.burgess.iv, gbedwell, delcypher and 13 others.

Local setup broken, please ignore

[Aarch64 reverting diff]

LukeGeeson removed reviewers: andreadb, alexander-shaposhnikov. May 4 2018, 3:54 AM

LukeGeeson edited subscribers, added: SjoerdMeijer; removed: jfb, sanjoy, arsenm and 17 others.

LukeGeeson added a subscriber: llvm-commits. May 4 2018, 4:05 AM

[AArch64] fixed FP16 intrinsic h pattern

[AArch64] added 32 case for FP16 intrinsic case, fixed remarks

Harbormaster completed remote builds in B17696: Diff 145183. May 4 2018, 6:26 AM

[AArch64] fixed 32 case for FP16 s64 f16 test, 2op tests pass

SjoerdMeijer added inline comments. May 4 2018, 7:11 AM

lib/Target/AArch64/AArch64InstrFormats.td
5982 ↗	(On Diff #145186)	Nit: trailing whitespace
7793 ↗	(On Diff #145186)	Nit: unnecessary new line.
7816 ↗	(On Diff #145186)	Do the HDr and DHr patterns need to be guarded by predicates [HasNEON, HasFullFP16]? Can you check that all predicates are correctly set here? It also looks like we can simplify things here a bit: merge all patterns with the same neon and fullfp16 predicates in one block.
7834 ↗	(On Diff #145186)	To keep the changes minimal, can you please move the "def h" back up where it was?

[AArch64] moved FP16 h pattern, refactored FP16 intrinsics into block

Harbormaster completed remote builds in B17702: Diff 145197. May 4 2018, 8:09 AM

[AArch64] removed whitespace for consistency

Thanks, I think this looks OK.

This revision is now accepted and ready to land. May 14 2018, 1:19 AM

Closed by commit rL333410: [AArch64] added FP16 vcvth intrinsic support (authored by LukeGeeson). May 29 2018, 4:44 AM

This revision was automatically updated to reflect the committed changes.

Diff 148884

llvm/trunk/lib/Target/AArch64/AArch64InstrFormats.td

@@ class BaseSIMDScalarShiftTied<bit U, bits<5> opc, bits<7> fixed_imm,
   let Inst{10} = 1;
   let Inst{9-5} = Rn;
   let Inst{4-0} = Rd;
 }


 multiclass SIMDFPScalarRShift<bit U, bits<5> opc, string asm> {
   let Predicates = [HasNEON, HasFullFP16] in {
+  def HSr : BaseSIMDScalarShift<U, opc, {0,0,1,?,?,?,?},
+                                FPR16, FPR32, vecshiftR16, asm, []> {
+    let Inst{19-16} = imm{3-0};
+    let Inst{23-22} = 0b11;
+  }
+  def SHr : BaseSIMDScalarShift<U, opc, {?,?,?,?,?,?,?},
+                                FPR32, FPR16, vecshiftR32, asm, []> {
+    let Inst{19-16} = imm{3-0};
+  }
+  def HDr : BaseSIMDScalarShift<U, opc, {?,?,?,?,?,?,?},
+                                FPR16, FPR64, vecshiftR32, asm, []> {
+    let Inst{21-16} = imm{5-0};
+    let Inst{23-22} = 0b11;
+  }
+  def DHr : BaseSIMDScalarShift<U, opc, {?,?,?,?,?,?,?},
+                                FPR64, FPR16, vecshiftR64, asm, []> {
+    let Inst{21-16} = imm{5-0};
+    let Inst{23-22} = 0b11;
+    let Inst{31} = 1;
+  }
   def h : BaseSIMDScalarShift<U, opc, {0,0,1,?,?,?,?},
                               FPR16, FPR16, vecshiftR16, asm, []> {
     let Inst{19-16} = imm{3-0};
   }
   } // Predicates = [HasNEON, HasFullFP16]
   def s : BaseSIMDScalarShift<U, opc, {0,1,?,?,?,?,?},
                               FPR32, FPR32, vecshiftR32, asm, []> {
     let Inst{20-16} = imm{4-0};
   }

   def d : BaseSIMDScalarShift<U, opc, {1,?,?,?,?,?,?},
                               FPR64, FPR64, vecshiftR64, asm, []> {
     let Inst{21-16} = imm{5-0};
   }
 }

 multiclass SIMDScalarRShiftD<bit U, bits<5> opc, string asm,
                              SDPatternOperator OpNode> {

llvm/trunk/lib/Target/AArch64/AArch64InstrInfo.td

 def : Pat<(i64 (int_aarch64_neon_vcvtfp2fxu (f64 FPR64:$Rn), vecshiftR64:$imm)),
           (FCVTZUd FPR64:$Rn, vecshiftR64:$imm)>;
 def : Pat<(v1i64 (int_aarch64_neon_vcvtfp2fxs (v1f64 FPR64:$Rn),
                                               vecshiftR64:$imm)),
           (FCVTZSd FPR64:$Rn, vecshiftR64:$imm)>;
 def : Pat<(v1i64 (int_aarch64_neon_vcvtfp2fxu (v1f64 FPR64:$Rn),
                                               vecshiftR64:$imm)),
           (FCVTZUd FPR64:$Rn, vecshiftR64:$imm)>;
-def : Pat<(int_aarch64_neon_vcvtfxs2fp FPR32:$Rn, vecshiftR32:$imm),
-          (SCVTFs FPR32:$Rn, vecshiftR32:$imm)>;
+def : Pat<(f16 (int_aarch64_neon_vcvtfxs2fp (i64 FPR64:$Rn), vecshiftR16:$imm)),
+          (FCVTZSHDr (i64 FPR64:$Rn), vecshiftR32:$imm)>;
+def : Pat<(i32 (int_aarch64_neon_vcvtfp2fxu FPR16:$Rn, vecshiftR32:$imm)),
+          (FCVTZUSHr FPR16:$Rn, vecshiftR32:$imm)>;
+def : Pat<(i32 (int_aarch64_neon_vcvtfp2fxs FPR16:$Rn, vecshiftR32:$imm)),
+          (FCVTZSSHr FPR16:$Rn, vecshiftR32:$imm)>;
+def : Pat<(i64 (int_aarch64_neon_vcvtfp2fxs (f16 FPR16:$Rn), vecshiftR64:$imm)),
+          (FCVTZSDHr (f16 FPR16:$Rn), vecshiftR64:$imm)>;
+def : Pat<(f16 (int_aarch64_neon_vcvtfxu2fp FPR32:$Rn, vecshiftR16:$imm)),
+          (UCVTFHSr FPR32:$Rn, vecshiftR16:$imm)>;
 def : Pat<(int_aarch64_neon_vcvtfxu2fp FPR32:$Rn, vecshiftR32:$imm),
           (UCVTFs FPR32:$Rn, vecshiftR32:$imm)>;
-def : Pat<(f64 (int_aarch64_neon_vcvtfxs2fp (i64 FPR64:$Rn), vecshiftR64:$imm)),
-          (SCVTFd FPR64:$Rn, vecshiftR64:$imm)>;
 def : Pat<(f64 (int_aarch64_neon_vcvtfxu2fp (i64 FPR64:$Rn), vecshiftR64:$imm)),
           (UCVTFd FPR64:$Rn, vecshiftR64:$imm)>;
 def : Pat<(v1f64 (int_aarch64_neon_vcvtfxs2fp (v1i64 FPR64:$Rn),
                                               vecshiftR64:$imm)),
           (SCVTFd FPR64:$Rn, vecshiftR64:$imm)>;
+def : Pat<(f16 (int_aarch64_neon_vcvtfxs2fp (i32 FPR32:$Rn), vecshiftR16:$imm)),
+          (SCVTFHSr FPR32:$Rn, vecshiftR16:$imm)>;
+def : Pat<(f16 (int_aarch64_neon_vcvtfxs2fp FPR32:$Rn, vecshiftR16:$imm)),
+          (SCVTFHSr FPR32:$Rn, vecshiftR16:$imm)>;
+def : Pat<(f64 (int_aarch64_neon_vcvtfxs2fp (i64 FPR64:$Rn), vecshiftR64:$imm)),
+          (SCVTFd FPR64:$Rn, vecshiftR64:$imm)>;
 def : Pat<(v1f64 (int_aarch64_neon_vcvtfxu2fp (v1i64 FPR64:$Rn),
                                               vecshiftR64:$imm)),
           (UCVTFd FPR64:$Rn, vecshiftR64:$imm)>;

 defm SHL : SIMDScalarLShiftD< 0, 0b01010, "shl", AArch64vshl>;
 defm SLI : SIMDScalarLShiftDTied<1, 0b01010, "sli">;
 defm SQRSHRN : SIMDScalarRShiftBHS< 0, 0b10011, "sqrshrn",
                                     int_aarch64_neon_sqrshrn>;

llvm/trunk/test/CodeGen/AArch64/fp16_intrinsic_scalar_2op.ll

 define dso_local half @t_vrsqrtsh_f16(half %a, half %b) {
 ; CHECK-LABEL: t_vrsqrtsh_f16:
 ; CHECK: frsqrts h0, h0, h1
 ; CHECK-NEXT: ret
 entry:
   %vrsqrtsh_f16 = tail call half @llvm.aarch64.neon.frsqrts.f16(half %a, half %b)
   ret half %vrsqrtsh_f16
 }

+declare half @llvm.aarch64.neon.vcvtfxs2fp.f16.i32(i32, i32) #1
+declare half @llvm.aarch64.neon.vcvtfxs2fp.f16.i64(i64, i32) #1
+declare i32 @llvm.aarch64.neon.vcvtfp2fxs.i32.f16(half, i32) #1
+declare i64 @llvm.aarch64.neon.vcvtfp2fxs.i64.f16(half, i32) #1
+declare half @llvm.aarch64.neon.vcvtfxu2fp.f16.i32(i32, i32) #1
+declare i32 @llvm.aarch64.neon.vcvtfp2fxu.i32.f16(half, i32) #1
+
+define dso_local half @test_vcvth_n_f16_s16_1(i16 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_s16_1:
+; CHECK: sxth w[[wReg:[0-9]+]], w0
+; CHECK-NEXT: fmov s0, w[[wReg:[0-9]+]]
+; CHECK-NEXT: scvtf h0, s0, #1
+; CHECK-NEXT: ret
+entry:
+  %sext = sext i16 %a to i32
+  %fcvth_n = tail call half @llvm.aarch64.neon.vcvtfxs2fp.f16.i32(i32 %sext, i32 1)
+  ret half %fcvth_n
+}
+
+define dso_local half @test_vcvth_n_f16_s16_16(i16 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_s16_16:
+; CHECK: sxth w[[wReg:[0-9]+]], w0
+; CHECK-NEXT: fmov s0, w[[wReg:[0-9]+]]
+; CHECK-NEXT: scvtf h0, s0, #16
+; CHECK-NEXT: ret
+entry:
+  %sext = sext i16 %a to i32
+  %fcvth_n = tail call half @llvm.aarch64.neon.vcvtfxs2fp.f16.i32(i32 %sext, i32 16)
+  ret half %fcvth_n
+}
+
+define dso_local half @test_vcvth_n_f16_s32_1(i32 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_s32_1:
+; CHECK: fmov s0, w0
+; CHECK-NEXT: scvtf h0, s0, #1
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_f16_s32 = tail call half @llvm.aarch64.neon.vcvtfxs2fp.f16.i32(i32 %a, i32 1)
+  ret half %vcvth_n_f16_s32
+}
+
+define dso_local half @test_vcvth_n_f16_s32_16(i32 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_s32_16:
+; CHECK: fmov s0, w0
+; CHECK-NEXT: scvtf h0, s0, #16
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_f16_s32 = tail call half @llvm.aarch64.neon.vcvtfxs2fp.f16.i32(i32 %a, i32 16)
+  ret half %vcvth_n_f16_s32
+}
+
+define dso_local half @test_vcvth_n_f16_s64_1(i64 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_s64_1:
+; CHECK: fmov d0, x0
+; CHECK-NEXT: fcvtzs h0, d0, #1
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_f16_s64 = tail call half @llvm.aarch64.neon.vcvtfxs2fp.f16.i64(i64 %a, i32 1)
+  ret half %vcvth_n_f16_s64
+}
+
+define dso_local half @test_vcvth_n_f16_s64_16(i64 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_s64_16:
+; CHECK: fmov d0, x0
+; CHECK-NEXT: fcvtzs h0, d0, #16
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_f16_s64 = tail call half @llvm.aarch64.neon.vcvtfxs2fp.f16.i64(i64 %a, i32 16)
+  ret half %vcvth_n_f16_s64
+}
+
+define dso_local i16 @test_vcvth_n_s16_f16_1(half %a) {
+; CHECK-LABEL: test_vcvth_n_s16_f16_1:
+; CHECK: fcvtzs s0, h0, #1
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %fcvth_n = tail call i32 @llvm.aarch64.neon.vcvtfp2fxs.i32.f16(half %a, i32 1)
+  %0 = trunc i32 %fcvth_n to i16
+  ret i16 %0
+}
+
+define dso_local i16 @test_vcvth_n_s16_f16_16(half %a) {
+; CHECK-LABEL: test_vcvth_n_s16_f16_16:
+; CHECK: fcvtzs s0, h0, #16
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %fcvth_n = tail call i32 @llvm.aarch64.neon.vcvtfp2fxs.i32.f16(half %a, i32 16)
+  %0 = trunc i32 %fcvth_n to i16
+  ret i16 %0
+}
+
+define dso_local i32 @test_vcvth_n_s32_f16_1(half %a) {
+; CHECK-LABEL: test_vcvth_n_s32_f16_1:
+; CHECK: fcvtzs s0, h0, #1
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_s32_f16 = tail call i32 @llvm.aarch64.neon.vcvtfp2fxs.i32.f16(half %a, i32 1)
+  ret i32 %vcvth_n_s32_f16
+}
+
+define dso_local i32 @test_vcvth_n_s32_f16_16(half %a) {
+; CHECK-LABEL: test_vcvth_n_s32_f16_16:
+; CHECK: fcvtzs s0, h0, #16
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_s32_f16 = tail call i32 @llvm.aarch64.neon.vcvtfp2fxs.i32.f16(half %a, i32 16)
+  ret i32 %vcvth_n_s32_f16
+}
+
+define dso_local i64 @test_vcvth_n_s64_f16_1(half %a) {
+; CHECK-LABEL: test_vcvth_n_s64_f16_1:
+; CHECK: fcvtzs d0, h0, #1
+; CHECK-NEXT: fmov x0, d0
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_s64_f16 = tail call i64 @llvm.aarch64.neon.vcvtfp2fxs.i64.f16(half %a, i32 1)
+  ret i64 %vcvth_n_s64_f16
+}
+
+define dso_local i64 @test_vcvth_n_s64_f16_32(half %a) {
+; CHECK-LABEL: test_vcvth_n_s64_f16_32:
+; CHECK: fcvtzs d0, h0, #32
+; CHECK-NEXT: fmov x0, d0
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_s64_f16 = tail call i64 @llvm.aarch64.neon.vcvtfp2fxs.i64.f16(half %a, i32 32)
+  ret i64 %vcvth_n_s64_f16
+}
+
+define dso_local half @test_vcvth_n_f16_u16_1(i16 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_u16_1:
+; CHECK: and w[[wReg:[0-9]+]], w0, #0xffff
+; CHECK-NEXT: fmov s0, w[[wReg:[0-9]+]]
+; CHECK-NEXT: ucvtf h0, s0, #1
+; CHECK-NEXT: ret
+entry:
+  %0 = zext i16 %a to i32
+  %fcvth_n = tail call half @llvm.aarch64.neon.vcvtfxu2fp.f16.i32(i32 %0, i32 1)
+  ret half %fcvth_n
+}
+
+define dso_local half @test_vcvth_n_f16_u16_16(i16 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_u16_16:
+; CHECK: and w[[wReg:[0-9]+]], w0, #0xffff
+; CHECK-NEXT: fmov s0, w[[wReg:[0-9]+]]
+; CHECK-NEXT: ucvtf h0, s0, #16
+; CHECK-NEXT: ret
+entry:
+  %0 = zext i16 %a to i32
+  %fcvth_n = tail call half @llvm.aarch64.neon.vcvtfxu2fp.f16.i32(i32 %0, i32 16)
+  ret half %fcvth_n
+}
+
+define dso_local half @test_vcvth_n_f16_u32_1(i32 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_u32_1:
+; CHECK: fmov s0, w0
+; CHECK-NEXT: ucvtf h0, s0, #1
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_f16_u32 = tail call half @llvm.aarch64.neon.vcvtfxu2fp.f16.i32(i32 %a, i32 1)
+  ret half %vcvth_n_f16_u32
+}
+
+define dso_local half @test_vcvth_n_f16_u32_16(i32 %a) {
+; CHECK-LABEL: test_vcvth_n_f16_u32_16:
+; CHECK: fmov s0, w0
+; CHECK-NEXT: ucvtf h0, s0, #16
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_f16_u32 = tail call half @llvm.aarch64.neon.vcvtfxu2fp.f16.i32(i32 %a, i32 16)
+  ret half %vcvth_n_f16_u32
+}
+
+define dso_local i16 @test_vcvth_n_u16_f16_1(half %a) {
+; CHECK-LABEL: test_vcvth_n_u16_f16_1:
+; CHECK: fcvtzu s0, h0, #1
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %fcvth_n = tail call i32 @llvm.aarch64.neon.vcvtfp2fxu.i32.f16(half %a, i32 1)
+  %0 = trunc i32 %fcvth_n to i16
+  ret i16 %0
+}
+
+define dso_local i16 @test_vcvth_n_u16_f16_16(half %a) {
+; CHECK-LABEL: test_vcvth_n_u16_f16_16:
+; CHECK: fcvtzu s0, h0, #16
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %fcvth_n = tail call i32 @llvm.aarch64.neon.vcvtfp2fxu.i32.f16(half %a, i32 16)
+  %0 = trunc i32 %fcvth_n to i16
+  ret i16 %0
+}
+
+define dso_local i32 @test_vcvth_n_u32_f16_1(half %a) {
+; CHECK-LABEL: test_vcvth_n_u32_f16_1:
+; CHECK: fcvtzu s0, h0, #1
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_u32_f16 = tail call i32 @llvm.aarch64.neon.vcvtfp2fxu.i32.f16(half %a, i32 1)
+  ret i32 %vcvth_n_u32_f16
+}
+
+define dso_local i32 @test_vcvth_n_u32_f16_16(half %a) {
+; CHECK-LABEL: test_vcvth_n_u32_f16_16:
+; CHECK: fcvtzu s0, h0, #16
+; CHECK-NEXT: fmov w0, s0
+; CHECK-NEXT: ret
+entry:
+  %vcvth_n_u32_f16 = tail call i32 @llvm.aarch64.neon.vcvtfp2fxu.i32.f16(half %a, i32 16)
+  ret i32 %vcvth_n_u32_f16
+}

This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] added FP16 vcvth intrinsic support
Closed · Public
