Similar to D124357, this adds some cost modelling for fptoi_sat for Arm targets. Where VFP2 is available (and FP64/FP16 for the relevant types), the operations are legal as the Arm instructions naturally saturate. Otherwise they will need an extra smin/smax clamp, similar to AArch64.
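For readers unfamiliar with the semantics being costed, here is a minimal sketch of a saturating float-to-signed-int conversion. It is not code from this patch: the helper names `fptosi_sat_i32` and `fptosi_sat_i8` are hypothetical, and the narrower case reflects one reading of the summary, in which a result type narrower than the natively saturating conversion needs the extra smin/smax clamp.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical model of a saturating f32 -> i32 conversion:
// NaN maps to 0 and out-of-range inputs clamp to the integer limits.
int32_t fptosi_sat_i32(float x) {
  if (std::isnan(x))
    return 0;
  // 2^31 is exactly representable as a float, so these bounds are exact.
  if (x >= 2147483648.0f)
    return std::numeric_limits<int32_t>::max();
  if (x <= -2147483648.0f)
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(x); // in range: truncates toward zero
}

// Hypothetical model of the narrower case: convert with saturation to i32,
// then clamp with an smin/smax pair to the i8 range.
int8_t fptosi_sat_i8(float x) {
  int32_t wide = fptosi_sat_i32(x);
  wide = std::min<int32_t>(wide, std::numeric_limits<int8_t>::max()); // smin
  wide = std::max<int32_t>(wide, std::numeric_limits<int8_t>::min()); // smax
  return static_cast<int8_t>(wide);
}
```

In this model the i32 case corresponds to a single conversion, while the i8 case adds the two clamp operations on top of it.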
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
LGTM
Inline comment on llvm/lib/Target/ARM/ARMTargetTransformInfo.cpp, line 1776:

Nit: this if and the one below could share some code. I think this is a small improvement, but I don't mind either way:

    if ((ST->hasVFP2Base() && LT.second == MVT::f32) ||
        (ST->hasFP64() && LT.second == MVT::f64) ||
        (ST->hasFullFP16() && LT.second == MVT::f16)) {
      if (MTy == MVT::i32)
        return LT.first;
      if (LT.second.getScalarSizeInBits() < MTy.getScalarSizeInBits())
        break;
      Type *LegalTy = Type::getIntNTy(ICA.getReturnType()->getContext(),
                                      LT.second.getScalarSizeInBits());
      InstructionCost Cost = 1;
      ..
    }