- User Since
- Jan 23 2017, 12:28 AM (347 w, 6 d)
May 12 2023
May 10 2023
Can we abandon this now that D141891 has landed?
May 9 2023
I suppose this makes sense for ARM MVE/SVE as well?
Generally, we should simplify this in IR for other targets and/or tripcounts with known multiples: otherwise the VP expansion pass will create %evl-to-%mask expansion code not knowing that the %evl is ineffective (eg %evl == vector_length).
You don't need to rewrite the templated code to have the option for virtual dispatch in the future:
If the need arises, we could implement one SuperMatchContext that defers to EmptyContext and VPMatchContext internally (via subclasses or otherwise) and only instantiate the templates for SuperMatchContext once.
(If you go down this route, there is a bunch of cool things you can do, eg combining different matchers or running matcher code with multiple contexts in parallel).
May 8 2023
There is a concern that using template for many functions in
DAGCombiner.cpp may make the binary too large.
Could you provide more details on this concern, eg where was it raised? To put this discussion in perspective, looking at vanilla LLVM 14, a debug build of DAGCombiner.cpp.o has 7MB vs 1322MB for libLLVM-14.so (with X86,RISCV). I would add that this approach adds virtual dispatch on the standard DAGCombiner path, which has a performance cost. IMHO, the question is what a good tradeoff would be.
Apr 24 2023
Feb 13 2023
LGTM. I am no longer the maintainer of the VE bot, though.
Jan 13 2023
Jan 12 2023
Sep 26 2022
The extension to irreducible control happens in D130746
Sep 19 2022
Yes and we should do this for all VP intrinsics. This also means we need to change the wording in the Semantics sections, eg https://llvm.org/docs/LangRef.html#id894 .
Aug 30 2022
Aug 19 2022
Aug 4 2022
Jul 20 2022
Please rebase for commit
Jul 19 2022
I could not apply this patch. Please rebase again so i can commit.
Jul 18 2022
Jul 17 2022
Jul 14 2022
Rebase onto committed add_vp.ll test to better show improvement
Jun 27 2022
Well, What isn't working on your system? All templates should be instantiated now. Please share your build configuration because static/dylib/shared lib builds of LLVM work fine on my system.
The cmake variable (in the reference patch), let's you run compile time tests with different trait configurations: cmake -DLLVM_OPTIMIZER_TRAITS_TO_INSTANTIATE=all. Whether we actually want a cmake variable is a different story. However, there is precedent in the TARGETS_TO_BUILD cmake variable: Distributions can configure the enabled backends. Yet, there are target-specific intrinsics in every LLVM distribution. I wouldn't be surprised if some distributors removed those manually to make their builds smaller.
Jun 24 2022
- Fixed m_ExtractValue pattern (all tests passing now).
- Add cmake variable to control which traits InstSimplify will be instantiated for (LLVM_OPTIMIZER_TRAITS_TO_INSTANTIATE .. eg VPTrait;CFPTrait).
- Explicitly instantiate InstSimplify templates.
Jun 22 2022
- inline -> static inline
Jun 10 2022
This is really on two separate things: First on generalized pattern matching, the second on vector predication (which is only one of the intrinsics sets benefiting from this).
Some comments/thoughts inline.
In the meantime, upstream got new fp patterns that consider the fp environment. This should work just fine with this approach (see my comment on the consider function - we may even DCE fp-environment-aware-patterns for Traits that support fp but do not care about the fp env).
Jun 9 2022
Here is the rebased patch for now. Currently two failing tests. This could be instsimplify opportunities enabled by this patch on constrained fp (have to look into it more).
D126889 has landed, which includes clang-formatting.
We got one accept and no opposition. This passes all testing locally. If pre-merge checks are ok (except for unrelated breakage), I will go forward and commit this.
- lower case more static functions.
Jun 8 2022
Jun 7 2022
Apart from applying the patch, not really. This is clang-format + lower-case all function names in InstCombine.cpp/h + compile, wait for an error and lower-case the highlighted function name in the error message, repeat.
Cleanup, repair function names, comments, variables
Jun 2 2022
Also use the new function names in the unit tests..
Btw, no matter where you stand on D126889 - the patch lower-cases all InstructionSimplify function names - the diff shows just how many different passes actually rely on instruction simplification. Those passes could potentially benefit from generalized pattern matching (even if its limited to instsimplify/combine).
we should rework those messy switches in isel legalization someday.. but this isn't a blocker for this patch
Adding the top five people blamed for changed lines in 2022 as reviewers.. Are you okay with auto-formatting the whole file with one commit?
Jun 1 2022
Thanks for chiming in!
Making this a Diff mostly because it changes (although non-functionally) such a huge file.
May 30 2022
All good :) We will find a way to work together on the patches. I will send you an e-mail to discuss this.
Ping. Can we land this patch?
Are you happy with me merging this back into D126363? Do you want to keep working on the VPlan-VP integration or what is your plan here?
May 27 2022
I suppose? These are the differences between VPPredicatedWidenRecipe and VPWidenRecipe, i see:
- Mask & EVL operands. The Mask operand could be optional (which implies a constant all-one mask), EVL is mandatory.
- EVL has significant performance implications on RVV and VE. If you count costing as a VPlan based analysis, then that would be one analysis in favor for having two recipe kinds.
- VPPredicatedWiden* recipes always widen to VP intrinsics. VPWiden* don't.
May 25 2022
Ok. If that works then having one global EVL per State defined this way should be fine for us for now.
I was looking at the comments we got earlier for the reference implementation. In particular @hahn 's comment on the EVL being loop-invariant when it's not used for tail predication.
The thing is, when EVL is used for tail predication you need to re-compute it in every vector loop iteration. I don't see how EVL could be handled like VectorTripCount in this case. Could you elaborate?
To clarify: This Diff is the second sub patch and ready for review. I changed my mind on the whole adding-another-sub-patch-and-abandoning-this-patch thing.
VP intrinsics show UB if the %evl parameter is out of bounds - they must not carry the speculatable attribute. The out-of-bounds UB disappears when the %evl parameter is expanded into the mask or expansion replaces the entire VP intrinsic with non-VP code.
This review replaces D104608 , which I am effectively commandeering as a co-author for the four original sub-patches by Vineet Kumar.
Please review the first of two sub patches: https://reviews.llvm.org/D126362
May 24 2022
Note from syncup call: Split up into a fixup for vp reduction expansion and a second patch for the speculatable flag.
May 20 2022
- More documentation on isSafeToSpeculativelyExecuteWithOpcode
- VP Intrinsics cleanup in Intrinsics.td
May 10 2022
Re-enable crossed out RUN: lines
Mar 23 2022
Mar 22 2022
Mar 21 2022
Update for vp_strided_load|store.ll tests to be fused into this patch (no attribution necessary):
Document that vp.merge, vp.select do not have a VP mask and why.
Landing this will break VP strided load/store RVV and VE tests. I'll attach a diff for the VE ones that you can merge into this patch.
Mar 18 2022
- alphabetic file order in the CMakeLists.txt
- unchanged, just to reiterate: contract only checked on the fmul, not on the fadd.