We somehow missed doing this when we were working on Power9 exploitation. This just adds the missing legalization and cost for producing the vector intrinsics.
Details
- Reviewers
hfinkel steven.zhang - Group Reviewers
Restricted Project - Commits
- rGa5da8d90daa3: [PowerPC] Add missing legalization for vector BSWAP
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/PowerPC/PPCInstrVSX.td | ||
---|---|---|
2950–2951 | It is surprising to me that, vsrc didn't have v16i8 and v1i128. However, it has nothing to do with this patch. It would be great if someone could educate me on this :) | |
2962 | The PPCxxreverse has completely the same semantics of bswap. I will commit a follow up patch to clean up it, as it might stop us to leverage the combine rules for bswap that implemented in DAGCombiner. | |
llvm/test/CodeGen/PowerPC/vec-bswap.ll | ||
113 | It is never used in the IR. Please remove it together with the end pair. | |
124 | The attribute is also not needed. | |
llvm/test/Transforms/LoopVectorize/PowerPC/vectorize-bswap.ll | ||
5 | The dso_local here is not need i think. |
llvm/lib/Target/PowerPC/PPCInstrVSX.td | ||
---|---|---|
2950–2951 | The initial VSX specification did not provide any operations for sub-word elements so allowing such vectors in vsrc is not useful. Even now, aside from a few specific operations (such as bswap here), there isn't a meaningful set of VSX sub-word operations. | |
2962 | Thanks. | |
llvm/test/CodeGen/PowerPC/vec-bswap.ll | ||
113 | Ah, good catch. I'll get rid of these. | |
124 | I generally keep it to avoid the CFI directives in the ASM output, but in this case I am not using the script to produce checks and can (and will) remove this. | |
llvm/test/Transforms/LoopVectorize/PowerPC/vectorize-bswap.ll | ||
5 | I agree that it is not needed, do you have a compelling reason to remove it though? |
llvm/lib/Target/PowerPC/PPCInstrVSX.td | ||
---|---|---|
2950–2951 | I think we need to fix it, as I see lots of changes if add them. Especially when some instructions want to put the result into VSX register while we are still putting them into VSR and move it to VSX. It will have some performance benefit from what I see. I will commit a patch later. Thank you for the explanation. | |
llvm/test/Transforms/LoopVectorize/PowerPC/vectorize-bswap.ll | ||
5 | No... Just to make the case as simple as possible. Anyway, it depends on you. |
It is surprising to me that, vsrc didn't have v16i8 and v1i128. However, it has nothing to do with this patch. It would be great if someone could educate me on this :)