- User Since
- Jan 23 2017, 12:28 AM (284 w, 3 d)
Mon, Jun 27
Well, What isn't working on your system? All templates should be instantiated now. Please share your build configuration because static/dylib/shared lib builds of LLVM work fine on my system.
The cmake variable (in the reference patch), let's you run compile time tests with different trait configurations: cmake -DLLVM_OPTIMIZER_TRAITS_TO_INSTANTIATE=all. Whether we actually want a cmake variable is a different story. However, there is precedent in the TARGETS_TO_BUILD cmake variable: Distributions can configure the enabled backends. Yet, there are target-specific intrinsics in every LLVM distribution. I wouldn't be surprised if some distributors removed those manually to make their builds smaller.
Fri, Jun 24
- Fixed m_ExtractValue pattern (all tests passing now).
- Add cmake variable to control which traits InstSimplify will be instantiated for (LLVM_OPTIMIZER_TRAITS_TO_INSTANTIATE .. eg VPTrait;CFPTrait).
- Explicitly instantiate InstSimplify templates.
Wed, Jun 22
- inline -> static inline
Fri, Jun 10
This is really on two separate things: First on generalized pattern matching, the second on vector predication (which is only one of the intrinsics sets benefiting from this).
Some comments/thoughts inline.
In the meantime, upstream got new fp patterns that consider the fp environment. This should work just fine with this approach (see my comment on the consider function - we may even DCE fp-environment-aware-patterns for Traits that support fp but do not care about the fp env).
Thu, Jun 9
Here is the rebased patch for now. Currently two failing tests. This could be instsimplify opportunities enabled by this patch on constrained fp (have to look into it more).
D126889 has landed, which includes clang-formatting.
We got one accept and no opposition. This passes all testing locally. If pre-merge checks are ok (except for unrelated breakage), I will go forward and commit this.
- lower case more static functions.
Wed, Jun 8
Jun 7 2022
Apart from applying the patch, not really. This is clang-format + lower-case all function names in InstCombine.cpp/h + compile, wait for an error and lower-case the highlighted function name in the error message, repeat.
Cleanup, repair function names, comments, variables
Jun 2 2022
Also use the new function names in the unit tests..
Btw, no matter where you stand on D126889 - the patch lower-cases all InstructionSimplify function names - the diff shows just how many different passes actually rely on instruction simplification. Those passes could potentially benefit from generalized pattern matching (even if its limited to instsimplify/combine).
we should rework those messy switches in isel legalization someday.. but this isn't a blocker for this patch
Adding the top five people blamed for changed lines in 2022 as reviewers.. Are you okay with auto-formatting the whole file with one commit?
Jun 1 2022
Thanks for chiming in!
Making this a Diff mostly because it changes (although non-functionally) such a huge file.
May 30 2022
All good :) We will find a way to work together on the patches. I will send you an e-mail to discuss this.
Ping. Can we land this patch?
Are you happy with me merging this back into D126363? Do you want to keep working on the VPlan-VP integration or what is your plan here?
May 27 2022
I suppose? These are the differences between VPPredicatedWidenRecipe and VPWidenRecipe, i see:
- Mask & EVL operands. The Mask operand could be optional (which implies a constant all-one mask), EVL is mandatory.
- EVL has significant performance implications on RVV and VE. If you count costing as a VPlan based analysis, then that would be one analysis in favor for having two recipe kinds.
- VPPredicatedWiden* recipes always widen to VP intrinsics. VPWiden* don't.
May 25 2022
Ok. If that works then having one global EVL per State defined this way should be fine for us for now.
I was looking at the comments we got earlier for the reference implementation. In particular @hahn 's comment on the EVL being loop-invariant when it's not used for tail predication.
The thing is, when EVL is used for tail predication you need to re-compute it in every vector loop iteration. I don't see how EVL could be handled like VectorTripCount in this case. Could you elaborate?
To clarify: This Diff is the second sub patch and ready for review. I changed my mind on the whole adding-another-sub-patch-and-abandoning-this-patch thing.
VP intrinsics show UB if the %evl parameter is out of bounds - they must not carry the speculatable attribute. The out-of-bounds UB disappears when the %evl parameter is expanded into the mask or expansion replaces the entire VP intrinsic with non-VP code.
This review replaces D104608 , which I am effectively commandeering as a co-author for the four original sub-patches by Vineet Kumar.
Please review the first of two sub patches: https://reviews.llvm.org/D126362
May 24 2022
Note from syncup call: Split up into a fixup for vp reduction expansion and a second patch for the speculatable flag.
May 20 2022
- More documentation on isSafeToSpeculativelyExecuteWithOpcode
- VP Intrinsics cleanup in Intrinsics.td
May 10 2022
Re-enable crossed out RUN: lines
Mar 23 2022
Mar 22 2022
Mar 21 2022
Update for vp_strided_load|store.ll tests to be fused into this patch (no attribution necessary):
Document that vp.merge, vp.select do not have a VP mask and why.
Landing this will break VP strided load/store RVV and VE tests. I'll attach a diff for the VE ones that you can merge into this patch.
Mar 18 2022
- alphabetic file order in the CMakeLists.txt
- unchanged, just to reiterate: contract only checked on the fmul, not on the fadd.
Mar 17 2022
Fixing an SelectionDAG bug where Flags wouldn't be set in some cases of SelectionDAG::getNode. This bug fix may have (positive) trickle down effects on other targets.
Mar 16 2022
Do we really know that the ASM target is always the same as the C|CXX target?
+1 for emitting a warning at least. FATAL_ERROR if we know for sure that these are always the same.
I always wondered why we are doing the <..
Pass NodeFlags in getNode.
Mar 15 2022
Check IR generated for constexpr of bool vectors. This actually revealed a bug in the bool vector sema (bool vector comparisons would result in byte sized elements). Thx!
@jacquesguan Thanks for working on VP optimizations! We just discussed on the VP syncup call where we want to go with this.
The thing is that this patch is essentially replicating the existing MUL DAGCombiner patterns for VP_MUL. Ideally, we would lift the existing DAGCombiner logic to work on both VP and non-VP SDNodes - automatically.
Make the sema check for bool vector arithmetic specific to ext_vector_type bool. This assures that we do not interfere with scalar bool + int vector arithmetic sema (vector.cpp test was failing before).
Blocking of arithmetic on bool vectors extends to the case where only one operand is a bool vector (and the other is, eg, a vector of int). This was still caught before during sema because the integer vector operand would have been implicitly truncated down to bool, which always has been illegal.
This change results in a different error message for the mixed bool-and-non-bool arithmetic case and a more explicit check in the code ("|| instead of &&").
VP intrinsics aren't speculatable because they have undefined behavior if %evl > static vector length
Mar 14 2022
Add missing tests
Simplify following comments. Thx :)