Reduce the cost of VLS loads/stores to make the vectorizor emit them more frequently.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
1519 | What's the rationale behind this cost for fixed types? | |
llvm/test/Analysis/CostModel/AArch64/masked_ldst.ll | ||
2 ↗ | (On Diff #364808) | This test is for scalable types, hence the changes in here are testing the cost of scalable masked loads/stores rather than fixed ones. You'll need to add functions (probably in another test file) that use the vscale_range attribute to specify various different fixed vector lengths) |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
1505 | I don't think we should be removing this because it's needed when we don't know the SVE vector length, in which case we will have to scalarise the masked ops. I think you can add an extra check here to see if we're using SVE for fixed width vectors. | |
1519 | Is this due to the extra predicate we have to create? If so, that's also true for SVE since we'll need a ptrue and it's highly likely to get reused anyway or hoisted out of a loop. | |
llvm/test/Analysis/CostModel/AArch64/masked_ldst.ll | ||
6 ↗ | (On Diff #364808) | Again, I think these changes look wrong - they should be high because we are going to scalarise the operations, since we haven't specified the SVE vector length so we can't use SVE or NEON. |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
1519 | A cost of 3*vscale was suggested as a suitable cost for VLS compared to the current VLA cost of 2*vscale. Rather than edit TLI->getTypeLegalizationCost(DL, Src); directly, this seemed like the simplest way of achieving it. | |
llvm/test/Analysis/CostModel/AArch64/masked_ldst.ll | ||
2 ↗ | (On Diff #364808) | What is the difference between this test called "fixed" and the test called "scalable" further down then? |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp | ||
---|---|---|
1505 | Perhaps just add something like && !ST->useSVEForFixedLengthVectors() | |
llvm/test/Analysis/CostModel/AArch64/masked_ldst.ll | ||
2 ↗ | (On Diff #364808) | @bsmith Can this not be done using multiple RUN lines instead? I'm hoping that a single set of CHECK lines can be used by utilising FileChecks math functions. |
llvm/test/Analysis/CostModel/AArch64/masked_ldst.ll | ||
---|---|---|
2 ↗ | (On Diff #364808) | That's probably a better approach yes, to avoid duplicating things. |
2 ↗ | (On Diff #364808) | The 'fixed' test is testing the fixed LLVM IR types when using scalable codegen (i.e. fixed types are treated as Neon types, not SVE), the 'scalable' test is testing scalable types using scalable codegen. You need a test that checks fixed types using fixed codegen (of various sizes). That is to say, as per Paul's approach, you need some additional run lines that specify -aarch64-sve-vector-bits-min=<value>, which result in different costs for the the masked load/stores depending on the value of <value> |
Changed the cost model by keeping the scalarised NEON costs for 128bit width vectors, but use the SVE costs for larger VLS sizes. Added a new regression test to assert the cost-model estimates depending on VLS width
llvm/test/Analysis/CostModel/AArch64/masked_ldst_vls.ll | ||
---|---|---|
2 | This should be removed as I imagine it's not true and you just inherited it from a file you copied. | |
3 | Rather than having -mtriple=aarch64-linux-gnu -mattr=+sve on ever RUN line can you use LLVM IR directly. For example: target triple = "aarch64-unknown-linux-gnu" define void @fixed-sve-vls() #0 { .... } attributes #0 = { "target-features"="+sve" } It just means you need to know less of the details when running the test manually. | |
30 | Storing vectors of i1 is a thorny issue so I think it best to not bother testing them just yet. |
Removed i1 vector from regression test, added some LLVM IR syntax cleanup to regression test
llvm/test/Analysis/CostModel/AArch64/masked_ldst_vls.ll | ||
---|---|---|
20 | Minor nit: functions with dashes in the name are quite rare. I'm a little surprised it's allowed! Out of ~240k function definitions in the LLVM test suite, only 28 of them have dashes, and ~187k have underscores, so I'd go with the majority here. |
renamed fixed-sve-vl to fixed_sve_vl, use useNeonVector() function instead of verbose if statement
I don't think we should be removing this because it's needed when we don't know the SVE vector length, in which case we will have to scalarise the masked ops. I think you can add an extra check here to see if we're using SVE for fixed width vectors.