Make sure scalable property is preserved by using getVectorElementCount().
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Take test function @sext_inreg
run: llc -mtriple=aarch64-linux-gnu -mattr=+sve < test/CodeGen/AArch64/DAGCombine_vscale.ll
Current upstream assert with "Assertion `(!EVT.isVector() || EVT.getVectorElementCount() == VT.getVectorElementCount()) && "Vector element counts must match in SIGN_EXTEND_INREG"' failed."
Hi, this patch looks like it has some good fixes, thanks! I just left one comment about adding a test if possible?
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
8419 | Is it possible to add a test for this too? |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
8364–8365 | Just a thought but in general it seems like a good way to mitigate such issues is to remove any explicit vector length handling and thus I'm wondering if it's worth changing this and the other instances to use changeVectorElementType (i.e. VT.changeVectorElementType(ExtVT) in this instance). It might even reduce line wrapping and thus read better. |
Thanks David and Paul for the reviews!
I have included test coverage for the second change of getVectorElementCount.
But when trying to use changeVectorElementType(), I ran into a case where VT is v3i16, and the computed ExtVT is v3i1 not a valid simple integer vector type. We might end up needing extra special handling for i1 and other types.
Please refer to detailed test case in the comments.
Let me know if it's ok to keep the usage of EVT::getVectorVT(..., ElementCount) ?
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
8364–8365 | It's definitely nicer to use changeVectorElementType(). VT is v3i16, a valid simple value type, but ExtTy will be v3i1, not a valid simple integer vector type. Please refer to test.ll , run with this patch and changeVectorElementType() change define <3 x i16> @sext_in_reg_v3i1_to_v3i16(<3 x i16> %a, <3 x i16> %b) { %c = add <3 x i16> %a, %b ; %shl = shl <3 x i16> %a, <i16 15, i16 15, i16 15> %ashr = ashr <3 x i16> %shl, <i16 15, i16 15, i16 15> ret <3 x i16> %ashr } ( I found this test from test/CodeGen/AMDGPU/sext-in-reg.ll ) Also, we will need to restrict ExtTy/TruncTy to be simple type. In most cases VT is simple type, but with ashr (shl ...) TruncTy could be extended type (e.g., i7), then we will have null pointer access to LLVMTy when calling changeExtendedVectorElementType(). | |
8419 | Test @ashr_shl and @ashr_shl_illegal_trunc_vec_ty will hit this line. |
Just a thought but in general it seems like a good way to mitigate such issues is to remove any explicit vector length handling and thus I'm wondering if it's worth changing this and the other instances to use changeVectorElementType (i.e. VT.changeVectorElementType(ExtVT) in this instance). It might even reduce line wrapping and thus read better.