- User Since
- Jan 6 2015, 6:21 AM (385 w, 4 d)
Apr 7 2022
Apr 6 2022
Apr 5 2022
Ah, great. Thanks for working on this.
Mar 15 2022
Just spoke with Paul about this issue. We decided that defining the undef elements may be too aggressive. Updated patch to come...
Mar 1 2022
Addressed Paul's review...
Feb 28 2022
Updated Diff to implement commuting with PatFrags.
Feb 25 2022
LGTM. Thanks, Paul.
Feb 23 2022
Good point. Replacing the lowered truncates with ptrue sounds like a win in the general case. Abandoning this Diff.
Feb 22 2022
Updated patch based on @david-arm's review.
Feb 18 2022
Fix formatting for the Lint bots.
Feb 9 2022
Ah, sorry for the noise. Abandoning this Diff...
Feb 8 2022
Jan 25 2022
I believe this is a duplicate of D117674, which had not been reviewed yet so I've just pushed it along.
Jan 24 2022
Further reduced test case, but still not great.
Is it possible to break the 4 subtasks into separate reviews?
Nov 1 2021
Perhaps worth adding matching half/fp16 tests to sve-fixed-length-fp-fma.ll but otherwise looks good.
Oct 30 2021
Updated Diff for @paulwalker-arm's reviews...
Oct 26 2021
Fix clang-format warning and add the missing '+' to "+sve".
Jan 29 2021
[NOT READY FOR REVIEW]
Jan 25 2021
Ok, I see where you are coming from now. LoopVectorize is keeping the shuffle result full by widening the load+shuffle to double wide. LV's double wide choice seems like a weird one, but I suppose if that sequence is codegen'd correctly, then it will work out.
Jan 22 2021
Jan 19 2021
Having said that, I wonder if we should revisit the idea of allowing shuffle vectors to accept step vector masks?
Jan 15 2021
In D94444, @paulwalker-arm proposed a more generic extract vector intrinsic that accepts an index and stride. Now I'm wondering if we should just have a generic scalable shuffle vector intrinsic to handle all these operations under one intrinsic.
Jan 14 2021
Jan 13 2021
Add known minimum number of elements restrictions...
Jan 12 2021
Updated to @david-arm's suggested naming scheme...
I'm assuming scheduling the new addvls closer to their uses is a register pressure win?
Address some of @sdesmalen's comments, but deferring name changes...
Jan 11 2021
Jan 7 2021
Jan 6 2021
Jan 4 2021
Dec 26 2020
Dec 17 2020
Add FIXME comment.
Dec 15 2020
Dec 14 2020
Dec 11 2020
Dec 10 2020
LGTM with one nit below...
Dec 4 2020
I think @ctetreau's "first class citizen" argument on the RFC has merit though. But this patch is a good first step if we're not ready to extend ShuffleVector yet. I personally would like to see ShuffleVector extended eventually, since it would be easier to optimize.
Dec 1 2020
Do we need to protect against mismatched element types? Or does legalization handle those exts/truncs?
Nov 12 2020
Nov 10 2020
Nov 4 2020
Nov 3 2020
Reformat to appease pre-merge checks...
Nov 2 2020
Oct 30 2020
Update patch based on @nikic's comments...
Oct 28 2020
Updated patch with, I think, all the needed legalizations.
Oct 27 2020
Comment from ARM/ARMISelLowering.cpp:
Ah, I see it in ARM/. That will work...
Update 'neutral' element to -0.0.
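For context on the -0.0 choice, here is a minimal sketch in plain Python, not the patch's code. It assumes the neutral element is the start value of a floating-point add reduction; the comment itself does not name the operation. The helpers `fadd_reduce` and `is_negative_zero` are hypothetical names used only for illustration. Under IEEE 754 round-to-nearest, -0.0 + x == x for every x, whereas +0.0 + (-0.0) yields +0.0 and so loses the sign of a negative-zero input.

```python
import math


def fadd_reduce(values, start):
    """Sequentially add every value onto the given start (neutral) element."""
    acc = start
    for v in values:
        acc += v
    return acc


def is_negative_zero(x):
    """Return True only for -0.0, which compares equal to +0.0."""
    return x == 0.0 and math.copysign(1.0, x) < 0.0


# Starting from +0.0 turns a lone -0.0 input into +0.0.
lost_sign = fadd_reduce([-0.0], 0.0)

# Starting from -0.0 leaves the -0.0 input unchanged.
kept_sign = fadd_reduce([-0.0], -0.0)
```

With these definitions, `is_negative_zero(kept_sign)` holds while `is_negative_zero(lost_sign)` does not, and non-zero inputs reduce to the same sum with either start value.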
Oct 23 2020
I just wanted to highlight my previous VBITS_EQ_256-COUNT-33: fadd comment as this gives us a bit more test coverage and is something that will obviously fail (in a good way) when the splitting work is available.
@paulwalker-arm, back to the splitting discussion...
Updating patch, but not ready for a serious review yet as I haven't started the splitting work. I'm still not convinced we can handle splitting appropriately with the current setup, but will comment on that separately.
Oct 22 2020
Try again with 80 column fix...
Fix 80 column issue. No other changes intended...