- User Since
- Aug 6 2021, 8:13 AM (111 w, 5 d)
Mon, Sep 25
Thu, Sep 21
Wed, Sep 20
Wed, Sep 13
Thu, Sep 7
Wed, Sep 6
Wed, Aug 30
Aug 23 2023
Aug 22 2023
The change looks good to me but the test needs tightening up.
Aug 11 2023
Aug 10 2023
@sdesmalen I had to make an additional changes to clang/lib/Basic/Targets/AArch64.cpp and clang/test/CodeGen/aarch64-sve-inline-asm-datatypes.c To get the constraint parsing from the clang level. It's pretty minor but thought I'd update the review before pushing in case you've anything to add.
Aug 9 2023
Updated test name & updated wording in LangRef.rst
Aug 8 2023
Aug 7 2023
Aug 2 2023
Emit an AArch64ISD node instead of the machine node directly.
Include D->S addressing mode of RSHRNB
Aug 1 2023
Jul 26 2023
Replaced explicit checks for splat values with dyn_cast_or_null
Jul 24 2023
Jul 19 2023
@kmclaughlin Thank you. I've added two tests @neg_trunc_lsr_add_op1_not_splat and @neg_trunc_lsr_op1_not_splat to bail out of emitting rshrnb when the RHS operands are not splat values.
Jul 18 2023
Jul 14 2023
Jul 12 2023
Jul 11 2023
Jul 10 2023
Jun 20 2023
Jun 5 2023
@nikic I'm still interested in landing this patch as this is a problem for tail vectorization on AArch64. Though D148010 would likely fix this, that patch is a large change that seems to be more of a longer term goal. Is it possible to re-evaluate landing this patch?
May 10 2023
Move OneUseCheck to getFNEGPatterns and use hasOneNonDBGUse instead
@manojgupta Thank you for reverting the patch and the reproducer. I've added a check to bail on the combine if there is more than one use of an FMADD which it what was causing your reproducer to fail and i've added a test to assert this behaviour now.
May 9 2023
May 5 2023
When I pushed the previous revision it produced bad machine code which triggered failures.
Previously I created a new variable MAD for capturing old FMADD instructions and used this variable
to merge the FMADD flags with the FNEG flags, however I did not mark it for deletion like the MUL did, i.e.
@foad Very well, I've reverted it
@foad apologies, I pushed another commit a9919db65a1afa71ac62631d51711383c17d43fc straight afterwards which only enables this test for aarch64. It's possible that you pulled this commit but not the one immediately afterwards. If it still fails with both commits can you let me know? Thanks.
May 2 2023
Apr 28 2023
@craig.topper That was correct, it was only checking for contract. I've added a check for both contract and nsz being present and added some more tests.
Apr 26 2023
Apr 18 2023
@nikic I've updated this patch just for the sake of it, I expect D148010 should fix the problem and make this patch unnecessary. Unfortunately I only have a small snippet of IR generated by clang but do not have my hands on the source code that generated this. Until I have the source I can't yet conclude if D148010 has fixed the original problem.
Apr 14 2023
Yes that's correct. If its impossible for the vectorization factor op0 to be lower than op1 in get.active.lane.mask it will always return an all false mask. This optimization is to fix an edge case kicked out by loop vectorizer which is unable to perform this optimization.
Apr 12 2023
Mar 14 2023
Mar 13 2023
Mar 10 2023
Mar 9 2023
Mar 7 2023
Feb 20 2023
Feb 15 2023
@nikic I've precommited the tests, and added your alive examples to the description. I've changed the flag condition to if (!NUW || (ICmpInst::isSigned(Pred) && !NSW)to assert NUW for unsigned and NUW and NSW for signed. I've added two tests neg_icmp_lshr_known_non_zero_slt_no_nuw and neg_icmp_lshr_known_non_zero_ult_no_nuw which assert the correct flag conditions.
Feb 14 2023
@sdesmalen Thanks, I've put in your comments, the patch looks a lot more compact now.
Feb 9 2023
I think not implementing several of my suggestions because of a future patch is a mistake. I don't think they're nits and have some obvious benefits for readability and control flow, and I'm of the opinion that leaving code that anticipates future work that may or may not even land or be reverted is not ideal, so I'll leave this for someone else to approve.
@goldstein.w.n Please update the commit message/description to properly reflect the patch after it was split (e.g. references to one use removed, an example of the Y >= Z transform etc)