Use predicated SEL for vector.insert instead of going through memory
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
10999–11010 | Perhaps worth combining these blocks? Like if (Idx == 0 && isPackedVectorType(VT, DAG)) { // This will be matched by custom code during ISelDAGToDAG. if (Vec0.isUndef()) return Op; unsigned int PredPattern = ...... ...... } |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
10999–11010 | The two blocks differ slightly: I had to pass VT instead of InVT to get isPackedVector to return false on the <4 x bfloat> test. All tests pass when I combine the statements and use VT. Do you think there will be any side effects later down the line from doing this? |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
10999–11010 | Oh that's interesting. I think the original code is wrong to use InVT and is likely the result of some refactoring for when this code path needed to handle scalable vectors. isPackedVectorType will always return true for fixed length vectors and so given by this point we know InVT is fixed length the call is redundant. I believe the original code should be using VT because that is the only case that is handled within ISelDAGToDAG (i.e. inserting a fixed length vector into an undef packed scalable vector at index 0). |
llvm/test/CodeGen/AArch64/split-vector-insert.ll | ||
---|---|---|
20 | You've run update_llc_test_checks on this file, which did not previously use it but the intent of the test was not to check the output-- it was checking for 'legally typed node' in the debug output. I think you can keep the newer CHECK lines but this diff hunk needs discarding. | |
108 | Same for this one. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
11009 | This is why the tests are wrong. ISDOpcodes says Select(COND, TRUEVAL, FALSEVAL) and so Vec0 and ScalableVec1 need switching. Might be worth writing a runable test just to verify this to yourself. | |
llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll | ||
82 | Is this not backwards? SEL says "Read active elements from the first source vector and inactive elements from the second source vector" so the subvector should be the first source? |
llvm/test/CodeGen/AArch64/insert-subvector-res-legalization.ll | ||
---|---|---|
82 | Yes, it was backwards. Swapping the operands has made it generate a MOV alias of SEL instead which seems more appropriate. |
This is why the tests are wrong. ISDOpcodes says Select(COND, TRUEVAL, FALSEVAL) and so Vec0 and ScalableVec1 need switching. Might be worth writing a runable test just to verify this to yourself.