This patch adds cost model for vector insert/extract element instructions. In RVV, we could use vector scalar move instruction to insert or extract the first element, and use vslide to move it. But for mask vector or i64 vector in i32 target, we need special instructions to make it.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Generally quite reasonable, a couple comments to address before a likely LGTM.
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp | ||
---|---|---|
738 | This routine is used for both insertelement and extractelement. Your comments apply only to extract. I think you're either a) missing a opcode early return, or b) need to give corresponding insert comments. | |
745 | This line is interesting as it implies we should probably have a test check configuration for cases where the vector type is scalarized. Can you add a run line (precommit it please) which *doesn't* have float, but does have vector? Just to make sure we have this case exercised. Hm, actually, thinking about this. There may be an interesting case here. What happens if we're costing a scalable vector which can't be legalized via scalarizing? Digging through the code, it looks like we return LT.first as invalid in that case. I think we need to propagate that here to be safe. | |
749 | Ok, the following is an *optional* suggestion. Feel free to include it or ignore. This might be more readable if structured as: |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp | ||
---|---|---|
738 | Thanks, I missed some special cases of insertelement, already added. | |
745 | Thanks for pointing this. If a scalable vector type is not supported in the target, getTypeLegalizationCost would return {InvalidCost, itself}. I missed this case. Now I add an early exit to make sure returning InvalidCost. |
Adding the insertelement handling caused a lot more test diffs. For ease of review, would you mind restricting this patch to extractelement and then doing insertelement in a follow up patch?
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp | ||
---|---|---|
744 | I think you're missing a return here. | |
788 | For a follow up patch... I think we can do better here using logical ops on the mask vector. Haven't fully worked through the sequence, but generating a mask with one bit set is fairly cheap, and then we're just doing bit toggles. | |
825 | I think we're missing something here. Specifically, what is the appropriate cost for an illegally typed vector which has been split? Don't we need to account for the cost of figuring out which sub-vector to extract from/insert into? For constants, this should fold away, but for a non-constant index doesn't this require some logic? |
The reason why other test cases affected is that some caculation of other instructions' cost depends on the cost of insertelement and extractelement. It my mistake to forget upload them before. So splitting this revision would be not help for reviewing.
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp | ||
---|---|---|
788 | I think you mean we could improve the codegen of insertelemet of mask vector. |
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp | ||
---|---|---|
825 | I missed considerring the case of splitted vector. In this case, we would use stack store/load to make it. I added it now. |
LGTM with one required change made. If you disagree with that change, or it doesn't work for some reason, further discussion is needed.
llvm/lib/Target/RISCV/RISCVTargetTransformInfo.cpp | ||
---|---|---|
757 | I don't follow the added code here. For purposes of driving this review forward, can you split this bit into it's own patch? For this patch, do the following: if (!isTypeLegal(Val->getType()) return BaseT::getVectorInstrCost(Opcode, Val, Index); |
This routine is used for both insertelement and extractelement. Your comments apply only to extract. I think you're either a) missing a opcode early return, or b) need to give corresponding insert comments.