Added a case for CTPOP to AArch64TTIImpl::getIntrinsicInstrCost so that
the cost estimate matches the codegen in
test/CodeGen/AArch64/arm64-vpopcnt.ll
Details
Diff Detail
Event Timeline
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
281 | From a quick look, lookup tables here are more rare, while the it seems a lot more prevalent in the x86 implementations. But I quite like this little table here, it's accurate and easy to read, so have no problem with it. | |
289 | Nit: you don't need the curly brackets for an if with 1 statement. | |
289 | Do we have a test case where the table look up fails? | |
llvm/test/Analysis/CostModel/AArch64/ctpop.ll | ||
168 | Does this really have a cost of only 2? I guess it's 2 because LT.first * Entry->Cost = 2 * 1 = 2 So I am wondering if it is easier to expand the table with vector types, and if the lookup fails let it fall back to the generic function that will hopefully then assign a high cost for it. |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
289 | Good point, I'll expand the table then the large vector cases won't be found in the table. | |
llvm/test/Analysis/CostModel/AArch64/ctpop.ll | ||
168 | I did try doing a check of the codegen for the large vector cases and they did seem to be double the cost, but I understand it is probably safer to expand the table so I'm happy to do that |
llvm/test/Analysis/CostModel/AArch64/ctpop.ll | ||
---|---|---|
168 | The costs for the large vectors are all now 4 as they fall through to a CustomCost of LT.first*2 in BasicTTIImpl.h. Is this what we want? |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
280 | It looks like this file doesn't contain all of the cases. As far as I understand, this is how it works:
Those are all good. For scalar, as opposed to vector, there is no good instruction though. The generation also looks pretty terrible at the moment. Everything else would be legalized to one of those types. So you can probably use "auto LT = TLI->getTypeLegalizationCost(DL, RetTy);" and Use LT.first for the type in the table lookup, and return Entry->Cost * LT.second, to include the cost of legalization (how many different vectors there will be). | |
282 | The table can also be easier to read if it's not formatted. |
llvm/test/Analysis/CostModel/AArch64/ctpop.ll | ||
---|---|---|
64 | nit: I don't know why ';' was added here in all functions, but I don't think you should use this patch to remove them. |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
280 | Thanks for the help with this, I've had another go at it. Only thing I'm not sure about now is that e.g. v2i16 is promoted to v2i32 so with LT.first * Entry->Cost it gets a cost of 3, but checking codegen it seems there are 5 instructions. Something like (2 * Entry->Cost - 1) would give the right numbers, but is this the correct thing to do? |
- Corrected the scalar costs
- Adjusted the table
- Return LT.first * Entry->Cost or LT.first*Entry->Cost + 1 depending on whether vector integer type is promoted
Looks sensible to me, thanks
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
293 | I may be simpler to write it as this, without the if: // Extra cost of +1 when illegal vector types are legalized by promoting // the integer type. int ExtraCost = MTy.getScalarSizeInBits() != RetTy->getScalarSizeInBits() ? 1 : 0; return LT.first * Entry->Cost + ExtraCost; | |
llvm/test/Analysis/CostModel/AArch64/ctpop.ll | ||
64 | If this is what the update script produces, I'm OK with the extra lines being removed. Either here or in a separate NFC patch. |
It looks like this file doesn't contain all of the cases. As far as I understand, this is how it works:
Those are all good. For scalar, as opposed to vector, there is no good instruction though. The generation also looks pretty terrible at the moment.
So a i8 becomes "and 0xff; expensive-mov; v8i8 cnt; addlv; expensive-mov". The others looks equally expensive too. I would guess a cost of 5 would make sense?
Everything else would be legalized to one of those types. So you can probably use "auto LT = TLI->getTypeLegalizationCost(DL, RetTy);" and Use LT.first for the type in the table lookup, and return Entry->Cost * LT.second, to include the cost of legalization (how many different vectors there will be).