User Details
- User Since
- Nov 24 2016, 5:21 AM (292 w, 1 d)
Today
One potential nit but otherwise this and the other i1 work is looking really good.
Discussed offline but for completeness.
Yesterday
Reading through the comments, has anything changed since my first comment? The main response seems to be "but hay it works". I'm sure that's true, but we've had several such moments when adding scalable support and each time we've been told to implement things properly. Hence the new vector type, element count, type size and instruction cost. I'm not hearing good reasons why demanded elements is any different.
Sorry for the incremental review. This was not intentional, just how my code review time worked out for this patch.
A few comment related issues but otherwise looks good to me.
A potential code placement improvement but otherwise looks good.
Wed, Jun 29
Tue, Jun 28
Out of interest are the test output produced manually or via an update script. I ask because I cannot see any of the normal headers and am wondering why.
Is this optimisation valid? The merging SVE intrinsics have strict rules about what happens to inactive lanes. For the llvm.aarch64.sve.and the inactive lanes are set to the matching lanes of the first operand. This means that the inactive lanes of the second operand play no role in the operation and thus the example in and_i64_zero_comm is not a zeroing and.
When checking the output for SVE2 I see no difference, which means we're missing out on the BSL optimisation we get for scalable vectors. I think this is because you're handling the fixed->scalable lowering too late. I think you really need to edit LowerFCOPYSIGN to first convert the fixed length ISD::FCOPYSIGN to a scalable one, then let the existing scalable vector code decide how best to lower it.
Mon, Jun 27
Sun, Jun 26
Fri, Jun 24
Thu, Jun 23
Wed, Jun 22
Tue, Jun 21
Hi @Allen, typically the SVE intrinsics only support legal types. This is a conscience choice because otherwise it will be very difficult to offer universal type support for all these target specific operations (i.e. where would we draw the line). Is there a reason not to use llvm.vector.reduce.and instead?
Mon, Jun 20
Extended to allow an arbitrary start index.
We may well have other sra related patterns so what about naming the test just sve-sra.ll?
I don't like the path this patch is taking. In it current form it is changing what the current interface means without documenting such, nor paying much heed to what all the code paths might do. This is why during initial scalable vector bring up, we decided to go the safest route and essentially bail for scalable vectors. Sure this means we implicitly changed the meaning of the "all known" value but there was little else we could do.
Fri, Jun 17
Thu, Jun 16
Tue, Jun 14
Mon, Jun 13
Rebased.
Changed isConstantSequence to return and optional.
Added tests.
I see :)
I don't believe we want this functionality. We originally added it but then it got removed by D88994 as unsound because there's no instruction available to load/store predicates that are smaller than <vscale x 16 x i1>. At the C/C++ level we don't expose these "smaller" predicate types and so there shouldn't really be a route needed to load/store them. @Allen Do you have a real world use case where loading/storing them is required?
@paulwalker-arm Would that work or do you think the test refactor is imminent?
Sat, Jun 11
Fri, Jun 10
Thanks for the information @efriedma. I'll likely need to do a fuller <vscale x 1 x investigation to close out the remaining legalisation issues as I think the current strategy may well not be a good fit.
Thu, Jun 9
@ab What's your plans for submitting this fix. I only ask because others have hit the same issue, as shown by https://github.com/llvm/llvm-project/issues/55866.
To be honest these tests are just not a good fit for auto-generation, which is why we didn't use update_llc_test_checks.py originally. The best I could come up with is to disable the NO_SVE run line when running the script and then enable it afterwards. I do have the workings of a patch to add a --ignored-check-prefixes option to update_llc_test_checks.py but even then the decision as to which CHECK lines are emitted doesn't really match what we wanted to achieve. If autogeneration is preferable and you're happy to wait a week or two I think it'll be better if I just restructure all the SVE fixed length tests (essentially split them based on register size) and enable auto-generation at the same time. I think this is worth doing anyway to increase our test coverage and test parallelism as the existing layout means these tests are the most time consuming ones with the AArch64 directory.
Wed, Jun 8
For full disclosure, I'm not 100% sure but believe the nxv1#64 output is likely wrong. This is true for the existing and new tests. I don't think this is the fault of this or previous bitcast work but rather an issue with how we're lowering ISD::INSERT_SUBVECTOR for these types. It looks suspiciously like we're treating ISD::INSERT_SUBVECTOR nxv2i64 undef, nxv1i64 data, i32 0 as a NOP.
Tue, Jun 7
Have you had chance to see if the extra NO_SVE lines can be removed. I really don't want these tests to require updates when SVE is not being used, whilst at the same time validating no SVE is used, which is what the current single NO_SVE-NOT lines verifies.
Mon, Jun 6
For most of these tests we don't care about the exact "no sve" output and thus have a single CHECK for the whole file. Is there a parameter that can be passed to update_llc_test_checks.py so those CHECK lines are not generated?
Sun, Jun 5
Add assert to getSVESafeBitCast to detect original failure case.
Sat, Jun 4
@efriedma I'd still rather have the improved code generation as a separate patch, mainly because this patch forces the need to remove an invalid type size conversion from SelectionDAGLegalize::EmitStackConvert and I'd like there to be a window (however small) that validates the code now works for fixed and scalable vector types.
Fri, Jun 3
For this patch I'm just playing it safe and using common expansion via the stack rather than anything more exotic.