This commit is moving the rejection of scalable vectors in the loop vectorization up to the target specific cost model. Before this change for interleaved memory accesses there was an early rejection from inside the loop vectorizer cost model. The BasicTTIImplBase returns invalid cost for scalable vectors what should be enough for other backends to use. AArch64 will still return invalid cost for scalable vectors for interleaved accesses as more work is required to enable it.
Details
Diff Detail
Event Timeline
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14033–14034 | The usage of getMinSVEVectorSizeInBits here is only relevant for fixed length vector types. When working with scalable types we already know the size of the legal types (i.e. vscale * 128) and so the : 128 part should be fine because Scalable_EC/(vscale * 128) ==> EC.getKnownMinValue() / 128). Fixing this should mean you no longer require vscale_range attributes for your scalable vector tests. | |
14048–14049 | Is this correct? Do you really want to say all scalable vector types are legal? | |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
2489–2490 | This is worth a comment because the naive optimisation is to simply move the check higher up whereas you deliberately want to exercise as much of the costing code as possible, even though we'll currently ignore the result. | |
2496–2498 | Perhaps simplify to just "All other uses of scalable vectors are not legal."? With that said, looking at the base implementation of getInterleavedMemoryOpCost suggests it does the right thing so you could remove this code and just fall into the base version as before. | |
llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll | ||
1–6 ↗ | (On Diff #468552) | Does the code affect the output of these tests? Perhaps in this instance it is better to commit the tests first in isolation and then it's clear to see what effect this patch has (or rather, to show this patch does not change the existing behaviour). |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14033–14034 | ok, I will simplify it | |
14048–14049 | my understanding is that at the moment we do not know which scalable vector types are legal or not for this type of accesses, it depends how this is going to be implemented, right? getInterleavedMemoryOpCost is just calling isLegalInterleavedAccessType, but later there is a check and returns invalid cost for all scalable VectorTypes. I thought that this might be enough for now, we will need to add some extra tests later to check if the right types are legal or not. At the moment the isLegalInterleavedAccessType is only used for the fixed width SVE so I think I should add extra assert here as well. I might also try to check the scalable types here assuming legal element sizes etc, but at the moment we aren't able to test it. | |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
2489–2490 | ok will add a comment, fair point | |
2496–2498 | yeah you are right, it has an early check at the beginning, I will change it | |
llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll | ||
1–6 ↗ | (On Diff #468552) | My patch does not affect the output for this test. This test is also a copy of llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll I wanted to demonstrate that despite removing the early exit from LoopVectorizationCostModel::getInterleaveGroupCost for scalable VF we still do not attempt to vectorize interleaved groups for AArch64. In that instance yes It might be worth to pre commit this test. |
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp | ||
---|---|---|
14048–14049 | This is target specific code so we can rely on implicit knowledge. For NEON you'll see the function only accepts the typical element types and then ends with Ensure the total vector size is 64 or a multiple of 128. So it seems reasonable to expect similar for scalable vectors. | |
llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll | ||
1–6 ↗ | (On Diff #468552) | Yep, by pre committing the test is becomes trivial to prove the patch works as expected by not changing the current behaviour. |
The usage of getMinSVEVectorSizeInBits here is only relevant for fixed length vector types. When working with scalable types we already know the size of the legal types (i.e. vscale * 128) and so the : 128 part should be fine because Scalable_EC/(vscale * 128) ==> EC.getKnownMinValue() / 128).
Fixing this should mean you no longer require vscale_range attributes for your scalable vector tests.