Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
7638–7639 | I think this is better done within getNode() so as to eliminate the possibility as early as possible. | |
7641 | getMinSVEVectorSizeInBits doesn't look relevant here. The restriction is linked to our expected isel to either INSR or EXT instructions. So I think we can be explicit here and use 2048/Ty.getVectorElementType().getSizeInBits() along with a suitable comment about the range of EXT's index operand. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
7641 | Is that really correct? If we just allow all vector sizes then surely you would end up with incorrect ext instructions? For example, if you had a VL of 128, a splice of a <vscale x 16 x i8> with an index of 31 would give an ext with an immediate of 31, which is only correct if your VL is >= 256, or are you suggesting that such a DAG node is malformed and hence undef? |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
7641 | The latter. We can assume the index in range and just need to ensure that when out of range we don't do anything bad (like access random memory). In our case we'll always emit an EXT instruction that will either do the correct thing when the index is in range (i.e. the defined case) or return the first vector when the index is out of range (i.e. the undefined case). |
- Move idx==0 optimization to getNode
- Allow lowering for all VL lengths, not just the current one.
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp | ||
---|---|---|
6162 | Up to you but N3 must to be a constant so you can do if (cast<ConstantSDNode>(N3)->isNullValue()). | |
llvm/test/CodeGen/AArch64/named-vector-shuffles-sve.ll | ||
18–22 | I think the intent of all the first_idx tests is to validate the lower bound of the EXT isel patterns and so think they should all pass 1 to the vector.splice and then perhaps just have a single zero_idx test for the NOP case. | |
46–48 | Is this safe? You've not touched this code so I've got a feeling the common expand code is incorrectly assuming the largest legal index+1 represents a safe index to use when expanding the code. I think it is safer to force an out-of-range index to 0. | |
221–222 | Perhaps the _1_ in the function name should now be _31_? |
llvm/test/CodeGen/AArch64/named-vector-shuffles-sve.ll | ||
---|---|---|
46–48 | Please ignore this. I didn't spot the ll test change and so got my wires crossed. |
- Change getNode to assume constant value for 3rd operand of VECTOR_SPLICE
- Change #0 index tests to #1 to check minimum ext range
- Add single #0 index test
llvm/test/CodeGen/AArch64/named-vector-shuffles-sve.ll | ||
---|---|---|
216 | I expected to find a #2 here (1 x f16's worth of bytes), but there is a #8, which I can't quickly explain? | |
227 | Continuing the logic of the earlier comment I make the immediate maximum to be 255 (bytes) - 2 (1 x f16) = 253, but we can't access byte 253 because we need a whole number of 2-byte halves, so I'm expecting to see 252/2 = 126 as input and #252 as the immediate. |
llvm/test/CodeGen/AArch64/named-vector-shuffles-sve.ll | ||
---|---|---|
216 | A colleague reminded me that this is an unpacked type, so the container fits in 8 bytes. Now it all makes sense. |
Up to you but N3 must to be a constant so you can do if (cast<ConstantSDNode>(N3)->isNullValue()).