This patch reverts commit 7614ba0a5db8 to optimize VPERM when one of its
vector operands is XXSWAPD, similar to XXPERM. It also reorganizes the
little-endian swap code on LE, swapping the vector operand after
adjusting the mask operand. This ensures that the vector operand is
swapped at the correct point in the code, resulting in a valid
constant pool for the mask operand.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Overall I think that this makes sense to me. I do have a couple of comments but mostly nits.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
10238 | Now that the swap has been moved down should this also be moved down? Or at least part of it? | |
10310 | nit: if (isPPC64 && (V1HasXXSWAPD || V2HasXXSWAPD || Opcode == PPCISD::XXPERM)) That way if we are not PPC64 we won't need to check any of the rest. | |
llvm/test/CodeGen/PowerPC/xxperm-swap.ll | ||
20 | This is the kind of extra copy that I mean. |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
10238 | In the following code section (line 10286), it uses the NeedSwap to adjust the mask, so this single-use operand swap needs to happen before that. |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
10238 | I see what you are saying. However, maybe something like this will work: if ((!isLittleEndian && !V2->hasOneUse() && V1->hasOneUse()) || (isLittleEndian && !V1->hasOneUse() && V2->hasOneUse())) { I tried the above on one of the tests with the extra copy and it seems to get rid of it. I have not tested this extensively so please make sure that I have not broken anything. |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
10238 | I tried this by adding your code here and removing // Only need to place items backwards in LE, // the mask was properly calculated. if (isLittleEndian) std::swap(V1, V2); from line 10334. It regressed bunch of test cases, looking at them I realized we can not check the endianness here because this block of code is already guarded by if (Subtarget.hasVSX() && Subtarget.hasP9Vector() && (V1->hasOneUse() || V2->hasOneUse())) { while this check if (isLittleEndian) is for all subtargets. |
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
10238 | I think I didn't explain this properly. I think that you should leave the code for: // Only need to place items backwards in LE, // the mask was properly calculated. if (isLittleEndian) std::swap(V1, V2); at the end. That accounts for Big vs. Little Endian and as you said should be for all subtargets. What I was thinking is just changing this if statement here. This swap is only here to try to prevent copying registers. What I'm proposing is that you take into consideration whether or not there will be a final swap at the end when you check if you want to also do this swap here. |
Take into account the little-endian swap when deciding on single-use operand to be second one
I think the patch mostly looks good. I have one more minor comment.
llvm/lib/Target/PowerPC/PPCISelLowering.cpp | ||
---|---|---|
10320 | Since you have removed Opcode == PPCISD::XXPERM from other places in the code is it safe to remove it from here as well? |
Now that the swap has been moved down should this also be moved down? Or at least part of it?
I ask because I've noticed that there are a number of tests where a new vmr has been introduced and I'm wondering if that's why.