This is an archive of the discontinued LLVM Phabricator instance.

[BasicTTI] Allow generic handling of scalable vector fshr/fshl
ClosedPublic

Authored by reames on Jun 13 2022, 12:06 PM.

Details

Reviewers

CarolineConcatto
craig.topper
david-arm
dmgreen

Commits

rGdb85345f2d9f: [BasicTTI] Allow generic handling of scalable vector fshr/fshl

Summary

This change removes an explicit scalable vector bailout for fshl and fshr. This bailout was added in 60e4698b9aba8, when sinking an unconditional bailout for all intrinsics into selected cases. It's not clear whether the bailout was originally unneeded, or whether our cost model infrastructure has simply matured in the meantime. Either way, the generic code appears to handle scalable vectors without issue.

Note that the RISC-V cost model changes here aren't particularly interesting. They probably do match the current lowering better, but the main point is to have coverage of the BasicTTI path and simply show lack of crashing.
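
For readers less familiar with the intrinsics under discussion, here is a minimal scalar sketch of what fshl and fshr compute, following the LLVM LangRef definition: concatenate the first two operands, shift by the third operand modulo the bit width, and extract one operand's worth of bits. The names fshl32 and fshr32 are hypothetical and appear nowhere in the patch. This is not the BasicTTI implementation; it only illustrates the kind of urem/compare/select/shl/sub/lshr/or expansion that a generic cost path would plausibly be pricing. Counting those operations gives seven, which lines up with the cost of 7 in the updated RISC-V test, though that correspondence is an inference rather than something the review states.

```cpp
#include <cstdint>

// Hypothetical scalar reference for llvm.fshl.i32: take the high 32 bits of
// the 64-bit value (X:Y) after shifting it left by Z modulo 32.
uint32_t fshl32(uint32_t X, uint32_t Y, uint32_t Z) {
  const uint32_t BW = 32;
  const uint32_t Amt = Z % BW;   // shift amount is taken modulo the bit width
  if (Amt == 0)                  // shift-by-zero case: avoids Y >> 32
    return X;
  return (X << Amt) | (Y >> (BW - Amt));
}

// Hypothetical scalar reference for llvm.fshr.i32: take the low 32 bits of
// the 64-bit value (X:Y) after shifting it right by Z modulo 32.
uint32_t fshr32(uint32_t X, uint32_t Y, uint32_t Z) {
  const uint32_t BW = 32;
  const uint32_t Amt = Z % BW;
  if (Amt == 0)                  // shift-by-zero case: avoids X << 32
    return Y;
  return (X << (BW - Amt)) | (Y >> Amt);
}
```

Passing the same value for X and Y turns either function into a rotate, which is the special case where the shift-by-zero handling is not needed.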

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. Jun 13 2022, 12:06 PM

reames requested review of this revision. Jun 13 2022, 12:06 PM

Harbormaster completed remote builds in B169522: Diff 436502. Jun 13 2022, 2:12 PM

I suspect we were just being overly conservative at the time when fixing up some FixedVectorType casts, but as you say it seems to work now. Would you be able to add a similar Analysis/CostModel/AArch64/sve-intrinsics.ll test as well for the <vscale x 4 x i32> case?

RISC-V change LGTM

In D127680#3581448, @david-arm wrote:

I suspect we were just being overly conservative at the time when fixing up some FixedVectorType casts, but as you say it seems to work now. Would you be able to add a similar Analysis/CostModel/AArch64/sve-intrinsics.ll test as well for the <vscale x 4 x i32> case?

@david-arm It's not really clear which test you want me to add. Since this isn't fixing a crash, could you just land what you have in mind and I'll rebase this over?

In D127680#3583308, @reames wrote:

@david-arm It's not really clear which test you want me to add. Since this isn't fixing a crash, could you just land what you have in mind and I'll rebase this over?

I think the request is to add these to sve-intrinsics.ll

define void @fshr(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c) {
  call <vscale x 4 x i32> @llvm.fshr.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c)
  ret void
}

define void @fshl(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c) {
  call <vscale x 4 x i32> @llvm.fshl.nxv4i32(<vscale x 4 x i32> %a, <vscale x 4 x i32> %b, <vscale x 4 x i32> %c)
  ret void
}

llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
39	This should be llvm.fshr.nxv1i32. The intrinsic parser is lax about checking the type part of the string.
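
The naming point in the inline comment above can be illustrated with a small sketch. LLVM appends an overloaded-type suffix to intrinsic names; for integer vectors this is "v" plus element count plus "i" plus element width, with an "n" prefix ("nxv") for scalable vectors. So <vscale x 1 x i32> corresponds to nxv1i32, not nxv4i32. The helpers intVectorSuffix and funnelShiftName below are hypothetical and are not LLVM APIs; they only spell out the convention for this one family of types.

```cpp
#include <string>

// Hypothetical helper (not an LLVM API): type suffix for an integer vector,
// e.g. <vscale x 1 x i32> -> "nxv1i32", <4 x i32> -> "v4i32".
std::string intVectorSuffix(unsigned MinElts, unsigned EltBits, bool Scalable) {
  return std::string(Scalable ? "nxv" : "v") + std::to_string(MinElts) + "i" +
         std::to_string(EltBits);
}

// Hypothetical helper: full funnel-shift intrinsic name for that vector type.
std::string funnelShiftName(bool IsLeft, unsigned MinElts, unsigned EltBits,
                            bool Scalable) {
  return std::string("llvm.") + (IsLeft ? "fshl" : "fshr") + "." +
         intVectorSuffix(MinElts, EltBits, Scalable);
}
```

For the RISC-V test in question, funnelShiftName(false, 1, 32, true) yields "llvm.fshr.nxv1i32".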

Thanks @craig.topper for explaining better than I did. Yes, those are exactly the tests I meant! If it's easier, I can add some tests myself today and then this patch can be rebased? I just haven't had a chance to do it yet.

I've now added some fshl/fshr cost model tests for SVE (https://reviews.llvm.org/rGcd53e6b48b67) so I believe you just have to rebase now!

Rebase and add AArch64 target code to avoid change in cost modelling.

reames added inline comments. Jun 16 2022, 10:53 AM

llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll
39	Oops, fixed.

Harbormaster completed remote builds in B170305: Diff 437603. Jun 16 2022, 11:47 AM

Matt added a subscriber: Matt. Jun 16 2022, 4:44 PM

craig.topper added inline comments. Jun 18 2022, 11:30 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
425	Seems like this should be a FIXME?

dmgreen added a subscriber: dmgreen. Jun 20 2022, 12:09 AM

dmgreen added inline comments.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
425	I don't think this needs to be added. Just updating the costs in the test files to whatever is now produced will likely be more accurate, and more in line with the scalar and fixed-length vector costs. We can then update them more accurately in the future.

reames added inline comments. Jun 20 2022, 9:02 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
425	I don't work on SVE, don't have any sense for what "reasonable" costs are for the target, and don't particularly want to be on the blame list for a potential performance swing investigation. If you think the costs are likely reasonable, I'd ask that you remove the bailout in a separate following change.

Whichever way you go with the test, I think we can all agree that the change looks fine. LGTM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
425	The numbers looked fine when I looked at them. 13 IIRC? It would be unreasonable for someone to blame this commit for a performance regression, considering how few costs we have for funnel shifts. It's more likely to make things better, and they are free to blame me if it does make things worse. It would seem simpler to remove this code and just update the test. I believe that was what @david-arm was asking for. But if you insist then sure, I can update afterwards. That should be simple enough.

This revision is now accepted and ready to land. Jun 20 2022, 10:19 AM

This revision was landed with ongoing or failed builds. Jun 20 2022, 10:38 AM

Closed by commit rGdb85345f2d9f: [BasicTTI] Allow generic handling of scalable vector fshr/fshl (authored by reames).

This revision was automatically updated to reflect the committed changes.

reames added a commit: rGdb85345f2d9f: [BasicTTI] Allow generic handling of scalable vector fshr/fshl.

reames added inline comments. Jun 20 2022, 10:41 AM

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
425	In this particular case, I'm probably being paranoid, but in general, yes, I think it's well worth splitting changes to minimize the impact on other targets and configurations. You say no one would reasonably blame this patch for a performance regression. My experience says I've wasted a lot of time on whether someone was being reasonable on patches just this simple. :)

dmgreen mentioned this in rGfb4d3d238fd9: [AArch64] Remove unnecessary funnel shift sve costs. Jun 21 2022, 4:21 AM

Revision Contents

Path	Size
llvm/include/llvm/CodeGen/BasicTTIImpl.h	2 lines
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp	6 lines
llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll	4 lines

Diff 438438

llvm/include/llvm/CodeGen/BasicTTIImpl.h

InstructionCost getIntrinsicInstrCost(const IntrinsicCostAttributes &ICA,
[...]
     case Intrinsic::vector_reduce_fadd:
     case Intrinsic::vector_reduce_fmul: {
       IntrinsicCostAttributes Attrs(
           IID, RetTy, {Args[0]->getType(), Args[1]->getType()}, FMF, I, 1);
       return getTypeBasedIntrinsicInstrCost(Attrs, CostKind);
     }
     case Intrinsic::fshl:
     case Intrinsic::fshr: {
-      if (isa<ScalableVectorType>(RetTy))
-        return BaseT::getIntrinsicInstrCost(ICA, CostKind);
       const Value *X = Args[0];
       const Value *Y = Args[1];
       const Value *Z = Args[2];
       TTI::OperandValueProperties OpPropsX, OpPropsY, OpPropsZ, OpPropsBW;
       TTI::OperandValueKind OpKindX = TTI::getOperandInfo(X, OpPropsX);
       TTI::OperandValueKind OpKindY = TTI::getOperandInfo(Y, OpPropsY);
       TTI::OperandValueKind OpKindZ = TTI::getOperandInfo(Z, OpPropsZ);
       TTI::OperandValueKind OpKindBW = TTI::OK_UniformConstantValue;
[...]

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

if ((LT.second.getScalarType() == MVT::f32 ||
[...]
       Cost += getIntrinsicInstrCost(Attrs1, CostKind);
       IntrinsicCostAttributes Attrs2(IsSigned ? Intrinsic::smax : Intrinsic::umax,
                                      LegalTy, {LegalTy, LegalTy});
       Cost += getIntrinsicInstrCost(Attrs2, CostKind);
       return LT.first * Cost;
     }
     break;
   }
+  case Intrinsic::fshl:
+  case Intrinsic::fshr:
+    // FIXME: Match legacy behavior; this is probably not the right costing.
+    if (isa<ScalableVectorType>(RetTy))
+      return 1;
+    break;
   default:
     break;
   }
   return BaseT::getIntrinsicInstrCost(ICA, CostKind);
 }

 /// The function will remove redundant reinterprets casting in the presence
 /// of the control flow
[...]

llvm/test/Analysis/CostModel/RISCV/rvv-intrinsics.ll

[...]
 ;
   %log10 = call <vscale x 4 x float> @llvm.log10.nxv4f32(<vscale x 4 x float> %vec)
   %rint = call <vscale x 4 x float> @llvm.rint.nxv4f32(<vscale x 4 x float> %vec)
   %nearbyint = call <vscale x 4 x float> @llvm.nearbyint.nxv4f32(<vscale x 4 x float> %vec)
   ret void
 }

 define void @fshr(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c) {
 ; CHECK-LABEL: 'fshr'
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = call <vscale x 1 x i32> @llvm.fshr.nxv1i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %1 = call <vscale x 1 x i32> @llvm.fshr.nxv1i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   call <vscale x 1 x i32> @llvm.fshr.nxv4i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
   ret void
 }

 define void @fshl(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c) {
 ; CHECK-LABEL: 'fshl'
-; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: %1 = call <vscale x 1 x i32> @llvm.fshl.nxv1i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
+; CHECK-NEXT: Cost Model: Found an estimated cost of 7 for instruction: %1 = call <vscale x 1 x i32> @llvm.fshl.nxv1i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
 ; CHECK-NEXT: Cost Model: Found an estimated cost of 1 for instruction: ret void
 ;
   call <vscale x 1 x i32> @llvm.fshl.nxv4i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
   ret void
 }

 declare <vscale x 1 x i32> @llvm.fshr.nxv4i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
 declare <vscale x 1 x i32> @llvm.fshl.nxv4i32(<vscale x 1 x i32> %a, <vscale x 1 x i32> %b, <vscale x 1 x i32> %c)
[...]