Details

Reviewers

paulwalker-arm
peterwaller-arm
bsmith
david-arm
DavidTruby
efriedma
georges

Summary

Replacing intrinsics with normal binops results in
more succinct AArch64 SVE output, e.g.:

4:   65428041   fmul    z1.h, p0/m, z1.h, z2.h
8:   65408020   fadd    z0.h, p0/m, z0.h, z1.h
->
4:   65620020   fmla    z0.h, p0/m, z1.h, z2.h

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

MattDevereau created this revision. Sep 2 2021, 4:24 AM

Herald added a reviewer: efriedma. Sep 2 2021, 4:24 AM

Herald added subscribers: ctetreau, psnobl, hiraditya and 2 others.

MattDevereau requested review of this revision. Sep 2 2021, 4:24 AM

Herald added a project: Restricted Project. Sep 2 2021, 4:24 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B122277: Diff 370226. Sep 2 2021, 4:25 AM

MattDevereau added a reviewer: georges. Sep 2 2021, 4:26 AM

paulwalker-arm added inline comments. Sep 2 2021, 5:27 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
772	This is not enough as you're not considering the predicate the intrinsic takes. We can only do this transform when the predicate is the equivalent of `ptrue all`.
900–901	It seems odd for `aarch64_sve_fadd` to call `instCombineSVEVectorMul`. That said the functionality you're after is pretty generic so perhaps it's worth creating `instCombineSVEVectorBinOp` as I can see us extending this for other cases. This can be called directly for `case Intrinsic::aarch64_sve_fadd` and called at the bottom of `instCombineSVEVectorMul` whilst we figure out how much of that function is still useful.
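
To illustrate why the predicate matters here, a minimal standalone sketch follows. It is not LLVM code: `Vec`, `Pred`, `predicatedFAdd`, `plainFAdd` and the fixed lane count of 4 are hypothetical names and values chosen for illustration. It assumes merging predication (the `p0/m` form shown in the summary), where inactive lanes keep the value of the first data operand.

```cpp
#include <array>
#include <cstddef>

// Toy model with a fixed lane count; real SVE vectors are scalable.
constexpr std::size_t kLanes = 4;
using Vec = std::array<float, kLanes>;
using Pred = std::array<bool, kLanes>;

// Models a merging predicated add: active lanes compute a + b,
// inactive lanes keep the value of the first operand (assumed semantics).
Vec predicatedFAdd(const Pred &pg, const Vec &a, const Vec &b) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = pg[i] ? a[i] + b[i] : a[i];
  return out;
}

// Models the unpredicated IR binary operator: every lane computes a + b.
Vec plainFAdd(const Vec &a, const Vec &b) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i)
    out[i] = a[i] + b[i];
  return out;
}
```

In this model the two functions agree only when every lane of `pg` is true, which is the condition the comment above asks the combine to check before replacing the intrinsic.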

Matt added a subscriber: Matt. Sep 2 2021, 7:05 AM

Created new method instCombineSVEVectorBinOp and narrowed the intrinsic conditional replacement to svp_true_xx/Intrinsic::aarch64_sve_convert_from_svbool

Harbormaster completed remote builds in B122478: Diff 370529. Sep 3 2021, 4:21 AM

Added ptrue check

Harbormaster completed remote builds in B122843: Diff 371029. Sep 7 2021, 4:20 AM

bsmith added inline comments. Sep 7 2021, 4:44 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
713–718	This is not taking into consideration the type of the ptrue, this could be an i64 type ptrue passed to an i32 type mul/add, hence does not cover all lanes of the operation. There needs to be a check in here that checks that the ptrue type is the same or smaller than the predicated operation. For example, for a `<vscale x 4 x i32>` vector op, only ptrue types of `<vscale x 4 x i1>`, `<vscale x 8 x i1>`, `<vscale x 16 x i1>` are allowed, `<vscale x 2 x i1>` is not.
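
The lane-coverage rule described above can be sketched with a small standalone model. This is an illustration only, not LLVM code: it assumes a 128-bit vector (so 16 predicate bits, one per byte) and that `ptrue all` for a given element size sets the lowest predicate bit of each element. `PredReg`, `ptrueAll`, `coversAllLanes` and `ptrueTypeIsWideEnough` are hypothetical names.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Toy model: a 128-bit vector has one predicate bit per byte, so 16 bits.
constexpr std::size_t kPredBits = 16;
using PredReg = std::bitset<kPredBits>;

// Models `ptrue all` for elements of eltBytes bytes: the lowest predicate
// bit of each element is set, all other bits are clear.
PredReg ptrueAll(std::size_t eltBytes) {
  assert(eltBytes == 1 || eltBytes == 2 || eltBytes == 4 || eltBytes == 8);
  PredReg p;
  for (std::size_t bit = 0; bit < kPredBits; bit += eltBytes)
    p.set(bit);
  return p;
}

// An operation on opEltBytes-byte elements consults only the lowest
// predicate bit of each of its own elements.
bool coversAllLanes(const PredReg &p, std::size_t opEltBytes) {
  for (std::size_t bit = 0; bit < kPredBits; bit += opEltBytes)
    if (!p.test(bit))
      return false;
  return true;
}

// The same rule expressed on the minimum element counts of the IR types,
// e.g. 4 for <vscale x 4 x i1> and <vscale x 4 x i32>.
bool ptrueTypeIsWideEnough(unsigned predMinElts, unsigned opMinElts) {
  return predMinElts >= opMinElts;
}
```

For a 4-byte element operation, `ptrueAll(4)`, `ptrueAll(2)` and `ptrueAll(1)` cover every lane in this model, while `ptrueAll(8)` leaves every other lane inactive, matching the `<vscale x 2 x i1>` case called out in the comment.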

peterwaller-arm added inline comments. Sep 7 2021, 5:46 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
714	I would prefer to see `if (match(II, m_Intrinsic(Intrinsic::aarch64_sve_fmul))) && ...` which is more idiomatic rather than introducing an IsIntrinsic lambda.

Added test, removed Intrinsic::aarch64_sve_from_svbool case

Harbormaster completed remote builds in B123043: Diff 371326. Sep 8 2021, 7:05 AM

Make instCombineSVEVectorBinOp more succinct

Harbormaster completed remote builds in B123428: Diff 371899. Sep 10 2021, 6:15 AM

updated for clang-format and clang-tidy

Harbormaster completed remote builds in B123642: Diff 372208. Sep 13 2021, 4:18 AM

david-arm added inline comments. Sep 15 2021, 7:18 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
711	You could restructure this as:

	  auto BinOpCode = intrinsicIDToBinOpCode(II.getIntrinsicID());
	  if (BinOpCode == Instruction::BinaryOpsEnd || !match(...))
	    return None;
	  IRBuilder<> Builder(II.getContext());
	  ...
	  return IC.replaceInstUsesWith

	If you agree that looks better?
901	Since we're doing this for fadd shall we also do this for fsub too? It's just literally adding another case statement I think.
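
The early-return shape suggested above can be sketched in standalone form. This is a simplified model rather than the LLVM implementation: `IntrinsicID`, `BinOp`, `intrinsicToBinOp` and `tryReplace` are hypothetical stand-ins, the predicate check is reduced to a boolean, and the replacement is represented by the opcode name as a string.

```cpp
#include <optional>
#include <string>

// Hypothetical stand-ins for the intrinsic IDs and binary opcodes.
enum class IntrinsicID { SveFMul, SveFAdd, SveFSub, SveTbl };
enum class BinOp { FMul, FAdd, FSub, End };

// Maps an intrinsic to its binary opcode; End is the "no mapping" sentinel.
BinOp intrinsicToBinOp(IntrinsicID id) {
  switch (id) {
  case IntrinsicID::SveFMul:
    return BinOp::FMul;
  case IntrinsicID::SveFAdd:
    return BinOp::FAdd;
  case IntrinsicID::SveFSub:
    return BinOp::FSub;
  default:
    return BinOp::End;
  }
}

// Bails out once, up front, if either precondition fails; otherwise
// returns the name of the replacement opcode.
std::optional<std::string> tryReplace(IntrinsicID id, bool predIsPtrueAll) {
  const BinOp op = intrinsicToBinOp(id);
  if (op == BinOp::End || !predIsPtrueAll)
    return std::nullopt;
  switch (op) {
  case BinOp::FMul:
    return std::string("fmul");
  case BinOp::FAdd:
    return std::string("fadd");
  case BinOp::FSub:
    return std::string("fsub");
  case BinOp::End:
    break;
  }
  return std::nullopt;
}
```

With the sentinel, supporting another intrinsic such as fsub only needs one more case in the mapping function, which is the point of the second comment.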

Added fsub case and tests. Rearranged instCombineSVEVectorBinOp logic

MattDevereau retitled this revision from [AArch64][SVE] Replace fmul and fadd LLVM IR instrinsics with fmul and fadd to [AArch64][SVE] Replace fmul, fadd and fsub LLVM IR instrinsics with LLVM IR binary ops. Sep 16 2021, 7:56 AM

MattDevereau edited the summary of this revision.

Harbormaster completed remote builds in B124182: Diff 372931. Sep 16 2021, 8:10 AM

Hi @MattDevereau, thanks for making the changes - it looks really good now! Is it worth adding at least one negative test for the case where the predicate isn't ptrue all?

added test no_replace_on_non_ptrue to assert only ptrue_all is replaced

renamed no_replace_on_non_ptrue to no_replace_on_non_ptrue_all

LGTM! Thanks for making the changes.

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fma-binops.ll
10	nit: Could you change the test names before committing to have something different to the intrinsic name? i.e. you could replace the '.' with '_' so that you have declare <vscale x 8 x half> @llvm_aarch64_sve_fmul_nxv8f16(

This revision is now accepted and ready to land. Oct 1 2021, 12:13 AM

Closed in commit f085a9db8b8d408d08adcba8e283e637a0116622

@david-arm changing the function declarations to underscores broke the tests, so I've left them for now

Diff 376276

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

@@ 689 lines hidden @@ if (Op1 && Op2 &&

     PTest->takeName(&II);
     return IC.replaceInstUsesWith(II, PTest);
   }

   return None;
 }

+static Instruction::BinaryOps intrinsicIDToBinOpCode(unsigned Intrinsic) {
+  switch (Intrinsic) {
+  case Intrinsic::aarch64_sve_fmul:
+    return Instruction::BinaryOps::FMul;
+  case Intrinsic::aarch64_sve_fadd:
+    return Instruction::BinaryOps::FAdd;
+  case Intrinsic::aarch64_sve_fsub:
+    return Instruction::BinaryOps::FSub;
+  default:
+    return Instruction::BinaryOpsEnd;
+  }
+}
+
+static Optional<Instruction *> instCombineSVEVectorBinOp(InstCombiner &IC,
+                                                         IntrinsicInst &II) {
+  auto BinOpCode = intrinsicIDToBinOpCode(II.getIntrinsicID());
+  if (BinOpCode == Instruction::BinaryOpsEnd ||
+      !match(II.getOperand(0),
+             m_Intrinsic<Intrinsic::aarch64_sve_ptrue>(
+                 m_ConstantInt<AArch64SVEPredPattern::all>())))
+    return None;
+  IRBuilder<> Builder(II.getContext());
+  Builder.SetInsertPoint(&II);
+  return IC.replaceInstUsesWith(
+      II, Builder.CreateBinOp(BinOpCode, II.getOperand(1), II.getOperand(2)));
+}
+
 static Optional<Instruction *> instCombineSVEVectorMul(InstCombiner &IC,
                                                        IntrinsicInst &II) {
   auto *OpPredicate = II.getOperand(0);
   auto *OpMultiplicand = II.getOperand(1);
   auto *OpMultiplier = II.getOperand(2);

   IRBuilder<> Builder(II.getContext());
   Builder.SetInsertPoint(&II);
@@ 31 lines hidden @@ if (IsUnitSplat(OpMultiplier)) {
     auto *DupInst = cast<IntrinsicInst>(OpMultiplier);
     auto *DupPg = DupInst->getOperand(1);
     // TODO: this is naive. The optimization is still valid if DupPg
     // 'encompasses' OpPredicate, not only if they're the same predicate.
     if (OpPredicate == DupPg) {
       OpMultiplicand->takeName(&II);
       return IC.replaceInstUsesWith(II, OpMultiplicand);
     }
   }

-  return None;
+  return instCombineSVEVectorBinOp(IC, II);
 }

 static Optional<Instruction *> instCombineSVEUnpack(InstCombiner &IC,
                                                     IntrinsicInst &II) {
   IRBuilder<> Builder(II.getContext());
   Builder.SetInsertPoint(&II);
   Value *UnpackArg = II.getArgOperand(0);
   auto *RetTy = cast<ScalableVectorType>(II.getType());
@@ 109 lines hidden @@ AArch64TTIImpl::instCombineIntrinsic(InstCombiner &IC,
   case Intrinsic::aarch64_sve_cntb:
     return instCombineSVECntElts(IC, II, 16);
   case Intrinsic::aarch64_sve_ptest_any:
   case Intrinsic::aarch64_sve_ptest_first:
   case Intrinsic::aarch64_sve_ptest_last:
     return instCombineSVEPTest(IC, II);
   case Intrinsic::aarch64_sve_mul:
   case Intrinsic::aarch64_sve_fmul:
     return instCombineSVEVectorMul(IC, II);
+  case Intrinsic::aarch64_sve_fadd:
+  case Intrinsic::aarch64_sve_fsub:
+    return instCombineSVEVectorBinOp(IC, II);
   case Intrinsic::aarch64_sve_tbl:
     return instCombineSVETBL(IC, II);
   case Intrinsic::aarch64_sve_uunpkhi:
   case Intrinsic::aarch64_sve_uunpklo:
   case Intrinsic::aarch64_sve_sunpkhi:
   case Intrinsic::aarch64_sve_sunpklo:
     return instCombineSVEUnpack(IC, II);
   case Intrinsic::aarch64_sve_tuple_get:
@@ 1,383 lines hidden @@

llvm/test/Transforms/InstCombine/AArch64/sve-intrinsic-fma-binops.ll

This file was added.

				; RUN: opt -S -instcombine < %s | FileCheck %s

				target triple = "aarch64-unknown-linux-gnu"

				declare <vscale x 8 x i1> @llvm.aarch64.sve.ptrue.nxv8i1(i32)
				declare <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32)
				declare <vscale x 2 x i1> @llvm.aarch64.sve.ptrue.nxv2i1(i32)

				; SVE intrinsics fmul and fadd should be replaced with regular fmul and fadd
				declare <vscale x 8 x half> @llvm.aarch64.sve.fmul.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				define <vscale x 8 x half> @replace_fmul_intrinsic_half(<vscale x 8 x half> %a, <vscale x 8 x half> %b) #0 {
				; CHECK-LABEL: @replace_fmul_intrinsic_half
				; CHECK-NEXT: %1 = fmul <vscale x 8 x half> %a, %b
				; CHECK-NEXT: ret <vscale x 8 x half> %1
				%1 = tail call <vscale x 8 x i1> @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
				%2 = tail call fast <vscale x 8 x half> @llvm.aarch64.sve.fmul.nxv8f16(<vscale x 8 x i1> %1, <vscale x 8 x half> %a, <vscale x 8 x half> %b)
				ret <vscale x 8 x half> %2
				}

				declare <vscale x 4 x float> @llvm.aarch64.sve.fmul.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				define <vscale x 4 x float> @replace_fmul_intrinsic_float(<vscale x 4 x float> %a, <vscale x 4 x float> %b) #0 {
				; CHECK-LABEL: @replace_fmul_intrinsic_float
				; CHECK-NEXT: %1 = fmul <vscale x 4 x float> %a, %b
				; CHECK-NEXT: ret <vscale x 4 x float> %1
				%1 = tail call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
				%2 = tail call fast <vscale x 4 x float> @llvm.aarch64.sve.fmul.nxv4f32(<vscale x 4 x i1> %1, <vscale x 4 x float> %a, <vscale x 4 x float> %b)
				ret <vscale x 4 x float> %2
				}

				declare <vscale x 2 x double> @llvm.aarch64.sve.fmul.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)
				define <vscale x 2 x double> @replace_fmul_intrinsic_double(<vscale x 2 x double> %a, <vscale x 2 x double> %b) #0 {
				; CHECK-LABEL: @replace_fmul_intrinsic_double
				; CHECK-NEXT: %1 = fmul <vscale x 2 x double> %a, %b
				; CHECK-NEXT: ret <vscale x 2 x double> %1
				%1 = tail call <vscale x 2 x i1> @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
				%2 = tail call fast <vscale x 2 x double> @llvm.aarch64.sve.fmul.nxv2f64(<vscale x 2 x i1> %1, <vscale x 2 x double> %a, <vscale x 2 x double> %b)
				ret <vscale x 2 x double> %2
				}

				declare <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				define <vscale x 8 x half> @replace_fadd_intrinsic_half(<vscale x 8 x half> %a, <vscale x 8 x half> %b) #0 {
				; CHECK-LABEL: @replace_fadd_intrinsic_half
				; CHECK-NEXT: %1 = fadd <vscale x 8 x half> %a, %b
				; CHECK-NEXT: ret <vscale x 8 x half> %1
				%1 = tail call <vscale x 8 x i1> @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
				%2 = tail call fast <vscale x 8 x half> @llvm.aarch64.sve.fadd.nxv8f16(<vscale x 8 x i1> %1, <vscale x 8 x half> %a, <vscale x 8 x half> %b)
				ret <vscale x 8 x half> %2
				}

				declare <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				define <vscale x 4 x float> @replace_fadd_intrinsic_float(<vscale x 4 x float> %a, <vscale x 4 x float> %b) #0 {
				; CHECK-LABEL: @replace_fadd_intrinsic_float
				; CHECK-NEXT: %1 = fadd <vscale x 4 x float> %a, %b
				; CHECK-NEXT: ret <vscale x 4 x float> %1
				%1 = tail call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
				%2 = tail call fast <vscale x 4 x float> @llvm.aarch64.sve.fadd.nxv4f32(<vscale x 4 x i1> %1, <vscale x 4 x float> %a, <vscale x 4 x float> %b)
				ret <vscale x 4 x float> %2
				}

				declare <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)
				define <vscale x 2 x double> @replace_fadd_intrinsic_double(<vscale x 2 x double> %a, <vscale x 2 x double> %b) #0 {
				; CHECK-LABEL: @replace_fadd_intrinsic_double
				; CHECK-NEXT: %1 = fadd <vscale x 2 x double> %a, %b
				; CHECK-NEXT: ret <vscale x 2 x double> %1
				%1 = tail call <vscale x 2 x i1> @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
				%2 = tail call fast <vscale x 2 x double> @llvm.aarch64.sve.fadd.nxv2f64(<vscale x 2 x i1> %1, <vscale x 2 x double> %a, <vscale x 2 x double> %b)
				ret <vscale x 2 x double> %2
				}

				declare <vscale x 8 x half> @llvm.aarch64.sve.fsub.nxv8f16(<vscale x 8 x i1>, <vscale x 8 x half>, <vscale x 8 x half>)
				define <vscale x 8 x half> @replace_fsub_intrinsic_half(<vscale x 8 x half> %a, <vscale x 8 x half> %b) #0 {
				; CHECK-LABEL: @replace_fsub_intrinsic_half
				; CHECK-NEXT: %1 = fsub <vscale x 8 x half> %a, %b
				; CHECK-NEXT: ret <vscale x 8 x half> %1
				%1 = tail call <vscale x 8 x i1> @llvm.aarch64.sve.ptrue.nxv8i1(i32 31)
				%2 = tail call fast <vscale x 8 x half> @llvm.aarch64.sve.fsub.nxv8f16(<vscale x 8 x i1> %1, <vscale x 8 x half> %a, <vscale x 8 x half> %b)
				ret <vscale x 8 x half> %2
				}

				declare <vscale x 4 x float> @llvm.aarch64.sve.fsub.nxv4f32(<vscale x 4 x i1>, <vscale x 4 x float>, <vscale x 4 x float>)
				define <vscale x 4 x float> @replace_fsub_intrinsic_float(<vscale x 4 x float> %a, <vscale x 4 x float> %b) #0 {
				; CHECK-LABEL: @replace_fsub_intrinsic_float
				; CHECK-NEXT: %1 = fsub <vscale x 4 x float> %a, %b
				; CHECK-NEXT: ret <vscale x 4 x float> %1
				%1 = tail call <vscale x 4 x i1> @llvm.aarch64.sve.ptrue.nxv4i1(i32 31)
				%2 = tail call fast <vscale x 4 x float> @llvm.aarch64.sve.fsub.nxv4f32(<vscale x 4 x i1> %1, <vscale x 4 x float> %a, <vscale x 4 x float> %b)
				ret <vscale x 4 x float> %2
				}


				declare <vscale x 2 x double> @llvm.aarch64.sve.fsub.nxv2f64(<vscale x 2 x i1>, <vscale x 2 x double>, <vscale x 2 x double>)
				define <vscale x 2 x double> @replace_fsub_intrinsic_double(<vscale x 2 x double> %a, <vscale x 2 x double> %b) #0 {
				; CHECK-LABEL: @replace_fsub_intrinsic_double
				; CHECK-NEXT: %1 = fsub <vscale x 2 x double> %a, %b
				; CHECK-NEXT: ret <vscale x 2 x double> %1
				%1 = tail call <vscale x 2 x i1> @llvm.aarch64.sve.ptrue.nxv2i1(i32 31)
				%2 = tail call fast <vscale x 2 x double> @llvm.aarch64.sve.fsub.nxv2f64(<vscale x 2 x i1> %1, <vscale x 2 x double> %a, <vscale x 2 x double> %b)
				ret <vscale x 2 x double> %2
				}

				define <vscale x 2 x double> @no_replace_on_non_ptrue_all(<vscale x 2 x double> %a, <vscale x 2 x double> %b) #0 {
				; CHECK-LABEL: @no_replace_on_non_ptrue_all
				; CHECK-NEXT: %1 = tail call <vscale x 2 x i1> @llvm.aarch64.sve.ptrue.nxv2i1(i32 5)
				; CHECK-NEXT: %2 = tail call fast <vscale x 2 x double> @llvm.aarch64.sve.fsub.nxv2f64(<vscale x 2 x i1> %1, <vscale x 2 x double> %a, <vscale x 2 x double> %b)
				; CHECK-NEXT: ret <vscale x 2 x double> %2
				%1 = tail call <vscale x 2 x i1> @llvm.aarch64.sve.ptrue.nxv2i1(i32 5)
				%2 = tail call fast <vscale x 2 x double> @llvm.aarch64.sve.fsub.nxv2f64(<vscale x 2 x i1> %1, <vscale x 2 x double> %a, <vscale x 2 x double> %b)
				ret <vscale x 2 x double> %2
				}

				attributes #0 = { "target-features"="+sve" }

This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Replace fmul, fadd and fsub LLVM IR instrinsics with LLVM IR binary ops
ClosedPublic
