- User Since
- Nov 24 2016, 5:21 AM (238 w, 4 d)
Fri, Jun 18
Thu, Jun 17
Tue, Jun 15
Mon, Jun 14
Thu, Jun 10
Wed, Jun 9
I'll take a more detailed look later but structurally it looks fine to me. I've added a comment that hopefully means we don't need the frame index limitation.
My feeling is that the vectoriser should never try to vectorise a loop that is impossible to vectorise, with the act of costing a vectorisation candidate effectively saying the loop is safe to vectorise. So a way to guard this is to ensure that when costing a vectorisation candidate we force the need for a real cost. It's just unfortunate that when it comes to scalable vectorisation there are more scenarios that lead to loops that are not vectorisable (or rather, loops where scalable VFs must be pulled from the list of candidate VFs).
Tue, Jun 8
Personally, for the testing I'd pick a different element type than i128 given that is already problematic.
Thu, Jun 3
Tue, Jun 1
May 21 2021
May 20 2021
May 19 2021
I've added a recommendation that I believe will also simplify D102777.
May 18 2021
May 17 2021
May 16 2021
A few stylistic things to consider that I'll not hold you to, but the change to validate is the main thing stopping me from accepting the patch.
May 13 2021
May 12 2021
May 11 2021
Looks good to me. I've a few comments that you're free to ignore.
May 10 2021
May 2 2021
Apr 30 2021
Please can you also remove INDEX_VECTOR from the AArch64ISD enum.
I'm not against adding the extra tests, which could also include the legal i8, i16 & i32 vector types, but for my money there's little new code here so I'm happy to assume the common expand/legalisation code is already well tested.
Are we concerned about the reduced code quality shown by the splice tests? Is there a way to know if the basic block is, or within, a loop body? so that we can restrict the patterns to instances where we're confident the extra instructions are likely to be hoisted. I guess for the PTRUE case we already know there is a need for some kind of machine pass to remove larger element PTRUEs when a smaller element one is safe to use.
Apr 29 2021
Assuming you agree I think you can just delete a bit of code and then the patch is good to go.
Apr 28 2021
Apr 27 2021
Looks fine to me given this is more a "compiler shouldn't crash" test that will be updated once we have support to custom lower these nodes.
Apr 26 2021
Apr 23 2021
Apr 22 2021
The non-power-of-two RUN lines should be extended to allow maximum testing (see comments) but otherwise looks good.
A couple of minor requests plus it's worth adding a "step-vector has more than one use" test but otherwise looks good to me. Thanks for your efforts @junparser, also thanks @frasercrmck and @craig.topper for the assists.
Apr 21 2021
Apr 20 2021
I've nothing against the change but it is more involved than updating the comment and assert. There are places where STEP_VECTOR's operand is treated as unsigned, based on the original requirement, that will need to be updated. For example DAGTypeLegalizer::PromoteIntRes_STEP_VECTOR. Ideally we'd want to add/update tests to show the signedness is properly protected during type legalisation.
Apr 19 2021
One comment about preexisting possibly redundant code so the patch looks good to me.
Sounds sensible to me but wonder if there's a nicer way to create the patterns as it seems we're likely to have a significant number of patterns targeting the same instructions. Do you think it's possible to have something like load_8 match all things where LD1_B (reg+reg) can be used?
Looks fine to me with a couple of issues that can be resolved before committing.
Apr 16 2021
Apr 15 2021
Apr 14 2021
Apr 6 2021
Apr 1 2021
Mar 31 2021
Mar 30 2021
Mar 29 2021
Just wanted to add that the patch summary no longer matches the intent of the patch.
I agree, InstructionCost::Invalid is solving a specific problem which is different to "Did not bother computing a real cost".