User Details
- User Since: Jan 6 2015, 6:21 AM (429 w, 2 d)
Feb 22 2023
Updated patch for Matt's reviews...
Update the diff to include context.
Created D144571 with a patch to fix the ICE.
Feb 21 2023
Also note that a fix will be needed for the release/16.x branch.
Hi Matt,
Jan 9 2023
LGTM
Aug 24 2022
Apologies, @efriedma. I've updated the patch again to move the check into collectSRATypes(). Would you mind doing one more review?
Aug 23 2022
Not sure why you're sticking the test in llvm/test/CodeGen/AArch64 instead of llvm/test/Transforms/GlobalOpt/.
Aug 22 2022
Apr 7 2022
Apr 6 2022
Apr 5 2022
Ah, great. Thanks for working on this.
Mar 15 2022
Just spoke with Paul about this issue. We decided that defining the undef elements may be too aggressive. Updated patch to come...
Mar 1 2022
Addressed Paul's review...
Feb 28 2022
Updated Diff to implement commuting with PatFrags.
Feb 25 2022
LGTM. Thanks, Paul.
Feb 23 2022
Good point. Replacing the lowered truncates with ptrue sounds like a win in the general case. Abandoning this Diff.
Feb 22 2022
Updated patch based on @david-arm's review.
Feb 18 2022
Fix formatting for the Lint bots.
Updated Diff.
Feb 9 2022
Ah, sorry for the noise. Abandoning this Diff...
Feb 8 2022
Jan 25 2022
I believe this is a duplicate of D117674, which had not been reviewed yet so I've just pushed it along.
Jan 24 2022
Further reduced test case, but still not great.
Is it possible to break the 4 subtasks into separate reviews?
Nov 1 2021
Perhaps worth adding matching half/fp16 tests to sve-fixed-length-fp-fma.ll but otherwise looks good.
Oct 30 2021
Updated Diff for @paulwalker-arm's reviews...
Oct 26 2021
Fix clang-format warning and add the missing '+' to "+sve".
Jan 29 2021
[NOT READY FOR REVIEW]
Jan 25 2021
In D94444#2497697, @paulwalker-arm wrote:
<A x Elt> llvm.experimental.vector.extract.elements(<B x Elt> %invec, i32 index, i32 stride)
Ok, I see where you are coming from now. LoopVectorize is keeping the shuffle result full by widening the load+shuffle to double wide. LV's double wide choice seems like a weird one, but I suppose if that sequence is codegen'd correctly, then it will work out.
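For readers unfamiliar with the quoted proposal, here is a rough Python model of what an index-plus-stride extract could compute. The semantics are an assumption, not taken from the thread: result element i is read from `invec[index + i * stride]`. The function name `extract_elements` and the `num_result` parameter (standing in for the result width A) are hypothetical.

```python
def extract_elements(invec, index, stride, num_result):
    """Model of a strided extract (ASSUMED semantics, not from the review).

    Result element i is taken from invec[index + i * stride].
    num_result stands in for the result vector width A.
    """
    positions = [index + i * stride for i in range(num_result)]
    # Reject reads outside the input vector instead of guessing a result.
    if any(p < 0 or p >= len(invec) for p in positions):
        raise ValueError("strided access falls outside the input vector")
    return [invec[p] for p in positions]


if __name__ == "__main__":
    wide = list(range(8))  # stands in for a <8 x Elt> input
    print(extract_elements(wide, 0, 2, 4))  # even lanes
    print(extract_elements(wide, 1, 2, 4))  # odd lanes
```

Under this assumed model, a stride of 2 with index 0 or 1 selects the even or odd lanes of a double-wide vector.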
Jan 22 2021
In D94444#2497697, @paulwalker-arm wrote:
<A x Elt> llvm.experimental.vector.extract.elements(<B x Elt> %invec, i32 index, i32 stride)
Jan 19 2021
Having said that, I wonder if we should revisit the idea of allowing shuffle vectors to accept step vector masks?
Jan 15 2021
In D94444, @paulwalker-arm proposed a more generic extract vector intrinsic that accepts an index and stride. Now I'm wondering if we should just have a generic scalable shuffle vector intrinsic to handle all these operations under one intrinsic.
Jan 14 2021
Jan 13 2021
Add known minimum number of elements restrictions...
Jan 12 2021
Updated to @david-arm's suggested naming scheme...
I'm assuming scheduling the new addvls closer to their uses is a register pressure win?
Address some of @sdesmalen's comments, but deferring name changes...
Jan 11 2021
Jan 7 2021
Jan 6 2021
Jan 4 2021
LGTM
Ping.
Dec 26 2020
Dec 17 2020
Add FIXME comment.
Dec 15 2020
Dec 14 2020
Dec 11 2020
LGTM
Dec 10 2020
LGTM with one nit below...
Dec 4 2020
I think @ctetreau's "first class citizen" argument on the RFC has merit though. But this patch is a good first step if we're not ready to extend ShuffleVector yet. I personally would like to see ShuffleVector extended eventually, since it would be easier to optimize.
Dec 1 2020
Do we need to protect against mismatched element types? Or does legalization handle those exts/truncs?
Nov 12 2020
Nov 10 2020
Nov 4 2020
Nov 3 2020
Reformat to appease pre-merge checks...
Nov 2 2020
Oct 30 2020
Update patch based on @nikic's comments...
Oct 28 2020
Updated patch with, I think, all the needed legalizations.
Oct 27 2020
Comment from ARM/ARMISelLowering.cpp:
Ah, I see it in ARM/. That will work...
Update 'neutral' element to -0.0.
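The choice of -0.0 matches IEEE 754 addition: x + (-0.0) equals x for every x, while (+0.0) + (-0.0) rounds to +0.0 and so loses the sign of a negative zero. A minimal Python sketch, assuming the reduction in question is a floating-point add (the comment itself does not say); `fadd_reduce` and `sign_bit` are hypothetical helper names.

```python
import math


def fadd_reduce(values, neutral):
    """Sum values left to right, starting from the given neutral element."""
    acc = neutral
    for v in values:
        acc = acc + v
    return acc


def sign_bit(x):
    """Return True if x has its sign bit set (distinguishes -0.0 from +0.0)."""
    return math.copysign(1.0, x) < 0


if __name__ == "__main__":
    # Starting from +0.0 turns a lone -0.0 input into +0.0.
    print(sign_bit(fadd_reduce([-0.0], 0.0)))
    # Starting from -0.0 preserves the input exactly.
    print(sign_bit(fadd_reduce([-0.0], -0.0)))
```

For ordinary nonzero inputs, either start value gives the same sum. The difference shows up only when every input is -0.0.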