User Details
- User Since
- Oct 21 2016, 1:19 AM (335 w, 2 d)
Fri, Mar 24
Thanks @david-arm, LGTM.
Hi, I'm getting build failures with this patch due to a linker error:
ld.lld: error: undefined symbol: llvm::createLoopDataPrefetchPass()
>>> referenced by LoongArchTargetMachine.cpp
>>> lib/Target/LoongArch/CMakeFiles/LLVMLoongArchCodeGen.dir/LoongArchTargetMachine.cpp.o:((anonymous namespace)::LoongArchPassConfig::addIRPasses())
collect2: error: ld returned 1 exit status
Thu, Mar 23
Wed, Mar 22
Tue, Mar 21
Mon, Mar 20
Fri, Mar 17
Sorry for the quick turn-around here, but I just realised that we probably want to guard these aliases with HasSVE2p1_or_HasSME2, because without those features the concept of a predicate-as-counter does not exist. I see that we haven't done this for pfalse either.
Thu, Mar 16
Fair enough, I was mostly wondering if there was a fundamental reason for it or if my understanding of poison was lacking here :) Thanks for clarifying!
In this case all comparisons just fold to true (but could also fold to any other value).
What is the reason that the IR doesn't explicitly return poison in that case? (i.e. ret i1 poison)
Abandoning since the alternative fix has now landed (D146056)
Wed, Mar 15
It gives me great joy to see this code removed :)
Tue, Mar 14
I'm happy to look at one of the other patches. Can you land this patch in the meantime?
Fri, Mar 10
Nice improvement.
Thu, Mar 9
Thanks for the changes @hassnaa-arm, I'm satisfied with the patch now so removing my 'requesting changes'.
Unless @dmgreen has more comments on the tests, I'm happy for this patch to land.
Tue, Mar 7
Added missing CHECK lines
Thanks for all the changes @hassnaa-arm, I've just left some final minor comments.
Mon, Mar 6
- Added assert to check that NumVectors=1 for svcount_t.
Thu, Mar 2
Seems fine to me, but please give @aprantl a chance to have another look at it as well before you land it.
Did you also check whether there is additional value in doing this transform at the LLVM IR layer? (I don't know if that could unlock other optimisations since ConstantFolding kicks in earlier)
Wed, Mar 1
This was addressed by D144624
Thanks for reviewing! I've addressed final comments before committing
Thanks for the reviews! I've addressed the nits before committing the patch.
Tue, Feb 28
- Use PromoteToType for svcount_t -> nxv16i1
- PNR -> PPR in one of the patterns
- Added tests for select
Feb 23 2023
- Removed unused interfaces.
- Rebased after moving out change in foldSelectInstWithICmp.
Seems like a sensible refactoring to me and also removes the need for D143642.
Rebased to use getMaxVScale function introduced in rG9449deda12c4
I agree this is a sensible change to make, LGTM!
Thanks for making the changes. I've just left a few more comments on the test.
Feb 22 2023
- IVUpdateCannotOverflow -> IVUpdateMayOverflow
- Added 'IVUpdateMayOverflow' as operand to getPreferredTailFoldingStyle.
- The runtime check is no longer emitted when tail-folding if we know the check evaluates to false.
@jcranmer-intel thanks for your review, I think I've addressed both your comments now.
Feb 21 2023
Thanks for the review @paulwalker-arm, I think I've addressed most of your comments, with the exception of giving predicate-as-counter its own register class, as that one probably makes more sense to pull out into a separate patch.
- Added EVT::isScalableVT() and refactored code to use it
- Use BITCAST instead of REINTERPRET_CAST
- Custom lower load/store of svcount to use nxv16i1
- Simplified lowering code in LowerSELECT
- Replaced the blanket bailout in visitLOAD/visitSTORE in DAGCombiner with more specific bailouts
Refactored to use Type::isScalableTy()
- Added description of the svcount type to AArch64SME.rst
- Use isIntOrIntVectorTy() in foldSelectInstWithICmp
- Added Type::isScalableTy() convenience function.
Feb 20 2023
Thanks for simplifying this patch a bit!
Feb 16 2023
Hi @goldstein.w.n, I'm still going through the patch but have already found that some code paths are currently untested. I am requesting changes to prevent it from landing in the meantime.
In general my preference would be to limit the cases your code handles, rather than adding more tests for more possible combinations of shifts/muls. That makes the code-changes easier to review and helps identify the test-coverage for the different code-paths you've added.
Feb 15 2023
Split out SROA changes into rG462227f1150f as suggested, and rebased this patch.