Page MenuHomePhabricator

simoll (Simon Moll)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 23 2017, 12:28 AM (284 w, 3 d)

Recent Activity

Mon, Jun 27

simoll added a comment to D92086: Generalized PatternMatch & InstSimplify.

We can have a cmake variable that controls instantation and if your distribution does not care about constrained fp or vp, say, you just don't instantiate for it and won't see compile time or size increases. I was hinting in that direction with the EnabledTraits.def file.

I don't think this makes a lot sense. There's no way we can disable this from the distribution side if constrained FP and VP are core parts of LLVM (which at least the former is at this point, given that it is exposed by Clang). If you want to do this you'll also have to export variables for llvm-lit, so tests can be disabled based on which traits are enabled. I don't think we want to go there.

I'd really like to have a working patch (with enabled traits) so I can give this a basic compile-time evaluation at least.

Well, What isn't working on your system? All templates should be instantiated now. Please share your build configuration because static/dylib/shared lib builds of LLVM work fine on my system.
The cmake variable (in the reference patch), let's you run compile time tests with different trait configurations: cmake -DLLVM_OPTIMIZER_TRAITS_TO_INSTANTIATE=all. Whether we actually want a cmake variable is a different story. However, there is precedent in the TARGETS_TO_BUILD cmake variable: Distributions can configure the enabled backends. Yet, there are target-specific intrinsics in every LLVM distribution. I wouldn't be surprised if some distributors removed those manually to make their builds smaller.

Mon, Jun 27, 1:47 AM · Restricted Project, Restricted Project

Fri, Jun 24

simoll updated the diff for D92086: Generalized PatternMatch & InstSimplify.
  • Fixed m_ExtractValue pattern (all tests passing now).
  • Add cmake variable to control which traits InstSimplify will be instantiated for (LLVM_OPTIMIZER_TRAITS_TO_INSTANTIATE .. eg VPTrait;CFPTrait).
  • Explicitly instantiate InstSimplify templates.
Fri, Jun 24, 11:09 AM · Restricted Project, Restricted Project

Wed, Jun 22

simoll updated the diff for D92086: Generalized PatternMatch & InstSimplify.
  • inline -> static inline
  • rebased
Wed, Jun 22, 7:54 AM · Restricted Project, Restricted Project

Fri, Jun 10

simoll added a comment to D92086: Generalized PatternMatch & InstSimplify.

This is really on two separate things: First on generalized pattern matching, the second on vector predication (which is only one of the intrinsics sets benefiting from this).

Fri, Jun 10, 4:25 AM · Restricted Project, Restricted Project
simoll added inline comments to D92086: Generalized PatternMatch & InstSimplify.
Fri, Jun 10, 2:08 AM · Restricted Project, Restricted Project
simoll added a comment to D92086: Generalized PatternMatch & InstSimplify.

Some comments/thoughts inline.
In the meantime, upstream got new fp patterns that consider the fp environment. This should work just fine with this approach (see my comment on the consider function - we may even DCE fp-environment-aware-patterns for Traits that support fp but do not care about the fp env).

Fri, Jun 10, 2:05 AM · Restricted Project, Restricted Project

Thu, Jun 9

simoll updated the diff for D92086: Generalized PatternMatch & InstSimplify.

Here is the rebased patch for now. Currently two failing tests. This could be instsimplify opportunities enabled by this patch on constrained fp (have to look into it more).

Thu, Jun 9, 9:16 AM · Restricted Project, Restricted Project
simoll committed rG746908a0380c: [NFC] Clang-format PatternMatch.h (authored by simoll).
[NFC] Clang-format PatternMatch.h
Thu, Jun 9, 7:55 AM · Restricted Project, Restricted Project
simoll abandoned D126783: [NFC] clang-format InstructionSimplify.cpp.

D126889 has landed, which includes clang-formatting.

Thu, Jun 9, 7:50 AM · Restricted Project, Restricted Project
simoll committed rGb8c2781ff601: [NFC] format InstructionSimplify & lowerCaseFunctionNames (authored by simoll).
[NFC] format InstructionSimplify & lowerCaseFunctionNames
Thu, Jun 9, 7:10 AM · Restricted Project, Restricted Project
simoll closed D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames.
Thu, Jun 9, 7:10 AM · Restricted Project, Restricted Project
simoll added a comment to D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames.

We got one accept and no opposition. This passes all testing locally. If pre-merge checks are ok (except for unrelated breakage), I will go forward and commit this.

Thu, Jun 9, 6:29 AM · Restricted Project, Restricted Project
simoll updated the diff for D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames.
  • lower case more static functions.
  • rebased.
Thu, Jun 9, 6:24 AM · Restricted Project, Restricted Project

Wed, Jun 8

simoll added a comment to D92086: Generalized PatternMatch & InstSimplify.

It looks like unfortunately the patch doesn't apply on current main. Does it have any dependencies or does it just need a rebase?

Wed, Jun 8, 12:05 AM · Restricted Project, Restricted Project

Jun 7 2022

simoll added a comment to D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames.

No objection from me, and I do maintain downstream forks. Can you provide a simple one-line command that I can run to make the equivalent changes in my downstream?

Apart from applying the patch, not really. This is clang-format + lower-case all function names in InstCombine.cpp/h + compile, wait for an error and lower-case the highlighted function name in the error message, repeat.

Jun 7 2022, 3:22 AM · Restricted Project, Restricted Project
simoll updated the diff for D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames.

Cleanup, repair function names, comments, variables

Jun 7 2022, 3:17 AM · Restricted Project, Restricted Project

Jun 2 2022

simoll updated the diff for D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames.

Also use the new function names in the unit tests..

Jun 2 2022, 9:02 AM · Restricted Project, Restricted Project
simoll added a reviewer for D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames: rengolin.
Jun 2 2022, 8:57 AM · Restricted Project, Restricted Project
simoll added a comment to D92086: Generalized PatternMatch & InstSimplify.

Btw, no matter where you stand on D126889 - the patch lower-cases all InstructionSimplify function names - the diff shows just how many different passes actually rely on instruction simplification. Those passes could potentially benefit from generalized pattern matching (even if its limited to instsimplify/combine).

Jun 2 2022, 8:54 AM · Restricted Project, Restricted Project
simoll added a comment to D126783: [NFC] clang-format InstructionSimplify.cpp.

LGTM - I'd go further and update the old "FunctionName" formatting too, so we are consistent with the current "functionName" style.

Many of the changed lines would already overlap, and if it's done in this patch, then that's still just this one potential commit that anyone would be looking past with 'blame'.

Jun 2 2022, 8:44 AM · Restricted Project, Restricted Project
simoll requested review of D126889: [NFC] format InstructionSimplify & lowerCaseFunctionNames.
Jun 2 2022, 8:43 AM · Restricted Project, Restricted Project
simoll accepted D109584: [VP] Implementing expansion pass for VP load and store..

LGTM

Jun 2 2022, 2:24 AM · Restricted Project, Restricted Project, Restricted Project
simoll accepted D120664: [VP] IR expansion pass for VP gather and scatter.

LGTM

Jun 2 2022, 2:20 AM · Restricted Project, Restricted Project
simoll accepted D126847: [LegalizeTypes][VP] Add widen and split support for VP FP integer casting op..

LGTM
we should rework those messy switches in isel legalization someday.. but this isn't a blocker for this patch

Jun 2 2022, 1:38 AM · Restricted Project, Restricted Project
simoll added reviewers for D126783: [NFC] clang-format InstructionSimplify.cpp: reames, nikic, fhahn, spatel, hkmatsumoto.

Adding the top five people blamed for changed lines in 2022 as reviewers.. Are you okay with auto-formatting the whole file with one commit?

Jun 2 2022, 1:13 AM · Restricted Project, Restricted Project

Jun 1 2022

simoll added a comment to D92086: Generalized PatternMatch & InstSimplify.

First, I think this is a good idea and can eventually mitigate the general problem of intrinsics vs. instructions in other LLVM passes.

Thanks for chiming in!

Jun 1 2022, 9:04 AM · Restricted Project, Restricted Project
simoll added a comment to D126783: [NFC] clang-format InstructionSimplify.cpp.

Making this a Diff mostly because it changes (although non-functionally) such a huge file.

Jun 1 2022, 8:05 AM · Restricted Project, Restricted Project
simoll requested review of D126783: [NFC] clang-format InstructionSimplify.cpp.
Jun 1 2022, 8:03 AM · Restricted Project, Restricted Project

May 30 2022

simoll added a reverting change for rG2e2a8a2d9082: Revert "[VP] vp intrinsics are not speculatable": rG18c1ee04de44: Re-land "[VP] vp intrinsics are not speculatable" with test fix.
May 30 2022, 5:42 AM · Restricted Project, Restricted Project
simoll committed rG18c1ee04de44: Re-land "[VP] vp intrinsics are not speculatable" with test fix (authored by simoll).
Re-land "[VP] vp intrinsics are not speculatable" with test fix
May 30 2022, 5:42 AM · Restricted Project, Restricted Project, Restricted Project
simoll committed rG78a18d2b54e7: [VP] vp intrinsics are not speculatable (authored by simoll).
[VP] vp intrinsics are not speculatable
May 30 2022, 3:20 AM · Restricted Project, Restricted Project
simoll closed D125296: [VP] vp intrinsics are not speculatable.
May 30 2022, 3:20 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D126625: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

Are you happy with me merging this back into D126363? Do you want to keep working on the VPlan-VP integration or what is your plan here?

yeah! I am very happy to continue working on vplan VP integration, and improving LLVM's vectorization to enable automatic vectorization for RVV.

! In D126625#3545290, @ym1813382441 wrote:

! In D126625#3545254, @simoll wrote:

Are you happy with me merging this back into D126363? Do you want to keep working on the VPlan-VP integration or what is your plan here?

And my English may not be very good. Please don't laught at me if you find something indecent of me :)

All good :) We will find a way to work together on the patches. I will send you an e-mail to discuss this.

May 30 2022, 3:18 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D125296: [VP] vp intrinsics are not speculatable.

Ping. Can we land this patch?

May 30 2022, 2:38 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D126625: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

Are you happy with me merging this back into D126363? Do you want to keep working on the VPlan-VP integration or what is your plan here?

May 30 2022, 2:10 AM · Restricted Project, Restricted Project, Restricted Project

May 27 2022

simoll added a comment to D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

We store the the %evl as the related value for VPValue *EVL using State.set(EVL, %evl, Part) and then get required value using State.get(EVL, Part).
In this case we can treat EVL similarly to canonical iv, which is not an invariant.

Ok. If that works then having one global EVL per State defined this way should be fine for us for now.

The way VectorTripCount is handled at the moment is a workaround and probably shouldn't act as inspiration. AFAICT we already removed all uses during code-gen that where using State to access the vector trip count. If it is needed for code-gen of a recipe, it should be expressed as operand.

Better to treat EVL as CanonicalIV. Yes, I agree that the recipe is better choice here (similar to CaonicalIV) but it requires some extra work because of VPWidenIntOrFpInductionRecipe should depend on EVL. Probably, need to split VPWidenIntOrFpInductionRecipe to a PHI recipe and something like CanonicalIVIncrement, otherwise this dependency prevents it from being vectorized effectively.

May 27 2022, 7:51 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

We need to generate code to compute the EVL, its operand should be application vector length ,and EVL maybe depends on the canonical induction recipes.
For example,

for (i = 0; i < n; ++i)
  c[i] = a[i] + b[i];

VPlan should be modeled as:

n = trip count
 vector loop: {
  EMIT i = canonical induction
  EMIT avl = n - i
  EMIT evl = set.evl(avl, ...)  (generate explicit vector length)
  EMIT mask = (all true mask)
  WIDEN t0 = vp.load(a, mask, evl)
  WIDEN t1 = vp.load(b, mask, evl)
  WIDEN t2 = vp.add(t0, t1, mask, evl)
  WIDEN vp.store(t2, c, mask, evl)
  EMIT  i.next = i + evl
  EMIT  i < n    (branch-on-count)
}

EVL may be different in each vector iteration, and set.evl may also require more parameters, such as the upper limit of the vector length or the width of the data type.
Various situations show that we need to model it as recipe.
And "llvm.set.evl" intrinsic need to be added.
We may refer to RVV architecture to set some restrictions for "llvm.set.evl", like...

evl = 0 if avl = 0
evl > 0 if avl > 0
evl ≤ VF
evl ≤ avl

Thanks for the example! We seem to converge on the VPWidenEVL recipe among all active participants in the discussion. To make a concrete proposal:

evl = ExplicitVectorLength(TripCount, CanonicalInduction)

That's just pulled and rephrased from your example. This should be sufficient for RVV, right?

Shall we rename it to something like VPCanonicalEVL... instead? It is scalar.

May 27 2022, 5:43 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

We need to generate code to compute the EVL, its operand should be application vector length ,and EVL maybe depends on the canonical induction recipes.
For example,

for (i = 0; i < n; ++i)
  c[i] = a[i] + b[i];

VPlan should be modeled as:

n = trip count
 vector loop: {
  EMIT i = canonical induction
  EMIT avl = n - i
  EMIT evl = set.evl(avl, ...)  (generate explicit vector length)
  EMIT mask = (all true mask)
  WIDEN t0 = vp.load(a, mask, evl)
  WIDEN t1 = vp.load(b, mask, evl)
  WIDEN t2 = vp.add(t0, t1, mask, evl)
  WIDEN vp.store(t2, c, mask, evl)
  EMIT  i.next = i + evl
  EMIT  i < n    (branch-on-count)
}

EVL may be different in each vector iteration, and set.evl may also require more parameters, such as the upper limit of the vector length or the width of the data type.
Various situations show that we need to model it as recipe.
And "llvm.set.evl" intrinsic need to be added.
We may refer to RVV architecture to set some restrictions for "llvm.set.evl", like...

evl = 0 if avl = 0
evl > 0 if avl > 0
evl ≤ VF
evl ≤ avl
May 27 2022, 4:53 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

We store the the %evl as the related value for VPValue *EVL using State.set(EVL, %evl, Part) and then get required value using State.get(EVL, Part).
In this case we can treat EVL similarly to canonical iv, which is not an invariant.

Ok. If that works then having one global EVL per State defined this way should be fine for us for now.

The way VectorTripCount is handled at the moment is a workaround and probably shouldn't act as inspiration. AFAICT we already removed all uses during code-gen that where using State to access the vector trip count. If it is needed for code-gen of a recipe, it should be expressed as operand.

Better to treat EVL as CanonicalIV. Yes, I agree that the recipe is better choice here (similar to CaonicalIV) but it requires some extra work because of VPWidenIntOrFpInductionRecipe should depend on EVL. Probably, need to split VPWidenIntOrFpInductionRecipe to a PHI recipe and something like CanonicalIVIncrement, otherwise this dependency prevents it from being vectorized effectively.

If we need to generate code to compute the EVL, it should be modeled as recipe. If the EVL depends on any other recipes (like the canonical induction), it needs to be a recipe. If all that is needed is an opcode and operands, then it should probably just be a additional opcode in VPInstruction, instead of a new recipe.

Here is my suggestion:

  1. We get an explicit-vector-length recipe to compute EVL inside the vector loop. And this will be the only recipe we add because..
  2. We extend the existing recipes with an (optional) EVL operand. Presence of EVL implies that VP intrinsics are used for widening.

I'm afraid that it will require HUGE(!!!) amount of changes in the Vplan. I assume, there still will be the same recipe/vpvalue for EVL across of all recipes/vpinstructions.

How? I think there is some kind of misunderstanding here.
There is an existing prototype implementation that uses these three new recipes and a global EVL per vector loop.
The code for the additional recipes is small and - frankly - trivial when you know how to do it. If you use separate recipes, the existing recipes are completely unaffected by this.

What part of my suggestion makes you think that there would be huge changes? Is it the adding EVL to existing recipes?

I think when deciding whether to add new recipes here, a key question is what the differences are to other existing recipes. IIUC the only difference between VPPredicatedWidenRecipe and VPWidenRecipe modulo the extra mask & EVL operands is during code-gen, right? But fundamentally both still widen an (arithmetic) operation. Whether some elements may be masked out shouldn't really matter for VPlan based analysis at the moment.

I suppose? These are the differences between VPPredicatedWidenRecipe and VPWidenRecipe, i see:

  • Mask & EVL operands. The Mask operand could be optional (which implies a constant all-one mask), EVL is mandatory.
  • EVL has significant performance implications on RVV and VE. If you count costing as a VPlan based analysis, then that would be one analysis in favor for having two recipe kinds.
  • VPPredicatedWiden* recipes always widen to VP intrinsics. VPWiden* don't.
May 27 2022, 4:47 AM · Restricted Project, Restricted Project, Restricted Project
simoll accepted D126457: [NFC][VP] Fix llvm.vp.merge intrinsic Expansion in LangRef.
May 27 2022, 12:40 AM · Restricted Project, Restricted Project

May 25 2022

simoll added a comment to D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

Hi Simon, did you think about making EVL a member of VPlan just like TripCount? In this case we might be not needed lots of these new classes.

Hi Alexey! The EVL behaves like a mask and less like the TripCount. When used for tail predication, the value of EVL still depends on the current vector iteration and needs to be computed in the vector loop.

But you can also treat it as an effective vector factor and use it similarly to VectorTripCount. Introducing new nodes just to add an extra operand EVL does not look necessary

I was looking at the comments we got earlier for the reference implementation. In particular @hahn 's comment on the EVL being loop-invariant when it's not used for tail predication.
The thing is, when EVL is used for tail predication you need to re-compute it in every vector loop iteration. I don't see how EVL could be handled like VectorTripCount in this case. Could you elaborate?

I have this scheme in mind:

vector.body:
  %canon.iv = phi int
  %evl = evl(%canon.iv, %vector.trip.count)
  ...
  br ...

We store the the %evl as the related value for VPValue *EVL using State.set(EVL, %evl, Part) and then get required value using State.get(EVL, Part).
In this case we can treat EVL similarly to canonical iv, which is not an invariant.

Ok. If that works then having one global EVL per State defined this way should be fine for us for now.

May 25 2022, 8:43 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

Hi Simon, did you think about making EVL a member of VPlan just like TripCount? In this case we might be not needed lots of these new classes.

Hi Alexey! The EVL behaves like a mask and less like the TripCount. When used for tail predication, the value of EVL still depends on the current vector iteration and needs to be computed in the vector loop.

But you can also treat it as an effective vector factor and use it similarly to VectorTripCount. Introducing new nodes just to add an extra operand EVL does not look necessary

I was looking at the comments we got earlier for the reference implementation. In particular @hahn 's comment on the EVL being loop-invariant when it's not used for tail predication.
The thing is, when EVL is used for tail predication you need to re-compute it in every vector loop iteration. I don't see how EVL could be handled like VectorTripCount in this case. Could you elaborate?

May 25 2022, 8:06 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.

Hi Simon, did you think about making EVL a member of VPlan just like TripCount? In this case we might be not needed lots of these new classes.

May 25 2022, 7:22 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D125296: [VP] vp intrinsics are not speculatable.

To clarify: This Diff is the second sub patch and ready for review. I changed my mind on the whole adding-another-sub-patch-and-abandoning-this-patch thing.

May 25 2022, 7:12 AM · Restricted Project, Restricted Project, Restricted Project
simoll updated the diff for D125296: [VP] vp intrinsics are not speculatable.

VP intrinsics show UB if the %evl parameter is out of bounds - they must not carry the speculatable attribute. The out-of-bounds UB disappears when the %evl parameter is expanded into the mask or expansion replaces the entire VP intrinsic with non-VP code.

May 25 2022, 7:10 AM · Restricted Project, Restricted Project, Restricted Project
simoll committed rG6e12711081d7: [VP][fix] Don't discard masks in reductions (authored by simoll).
[VP][fix] Don't discard masks in reductions
May 25 2022, 6:55 AM · Restricted Project, Restricted Project
simoll closed D126362: [VP][fix] Don't discard masks in reductions.
May 25 2022, 6:55 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a reviewer for D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization: bmahjour.

This review replaces D104608 , which I am effectively commandeering as a co-author for the four original sub-patches by Vineet Kumar.

May 25 2022, 3:15 AM · Restricted Project, Restricted Project, Restricted Project
simoll requested review of D126363: [VPlan, VP] 1/4 Introduce new recipes to support predicated vectorization.
May 25 2022, 3:10 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D125296: [VP] vp intrinsics are not speculatable.

Please review the first of two sub patches: https://reviews.llvm.org/D126362

May 25 2022, 2:44 AM · Restricted Project, Restricted Project, Restricted Project
simoll requested review of D126362: [VP][fix] Don't discard masks in reductions.
May 25 2022, 2:42 AM · Restricted Project, Restricted Project, Restricted Project

May 24 2022

simoll added a comment to D125296: [VP] vp intrinsics are not speculatable.

Note from syncup call: Split up into a fixup for vp reduction expansion and a second patch for the speculatable flag.

May 24 2022, 8:15 AM · Restricted Project, Restricted Project, Restricted Project
simoll added inline comments to D126273: [DAGCombiner][VP] Add DAGCombine for merging VP_FADD and VP_FMUL to VP_FMA..
May 24 2022, 6:47 AM · Restricted Project, Restricted Project
simoll added inline comments to D126273: [DAGCombiner][VP] Add DAGCombine for merging VP_FADD and VP_FMUL to VP_FMA..
May 24 2022, 6:36 AM · Restricted Project, Restricted Project

May 20 2022

Herald added a project to D93755: [VE][isel] Map EVT vectors to vector registers.: Restricted Project.

@kaz7 @efocht Does either of you want to commandeer this patch and become the patch author?

May 20 2022, 5:41 AM · Restricted Project, Restricted Project, Restricted Project
simoll abandoned D71340: [VE,#3] Runtime libraries.
May 20 2022, 5:38 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project
simoll added inline comments to D125296: [VP] vp intrinsics are not speculatable.
May 20 2022, 2:34 AM · Restricted Project, Restricted Project, Restricted Project
simoll updated the diff for D125296: [VP] vp intrinsics are not speculatable.
  • More documentation on isSafeToSpeculativelyExecuteWithOpcode
  • VP Intrinsics cleanup in Intrinsics.td
  • Rebased
  • Formatting
May 20 2022, 2:34 AM · Restricted Project, Restricted Project, Restricted Project

May 10 2022

simoll updated the diff for D125296: [VP] vp intrinsics are not speculatable.

Re-enable crossed out RUN: lines

May 10 2022, 2:46 AM · Restricted Project, Restricted Project, Restricted Project
simoll published D125296: [VP] vp intrinsics are not speculatable for review.
May 10 2022, 2:42 AM · Restricted Project, Restricted Project, Restricted Project

Mar 23 2022

simoll added inline comments to D121292: [VP] Add vp.fcmp comparison intrinsic and docs.
Mar 23 2022, 7:01 AM · Restricted Project, Restricted Project

Mar 22 2022

simoll committed rG7de383c89213: [VP] Fix VPintrinsic::getStaticVectorLength for vp.merge|select (authored by simoll).
[VP] Fix VPintrinsic::getStaticVectorLength for vp.merge|select
Mar 22 2022, 3:43 AM · Restricted Project
simoll closed D121913: [VP] Fix VPintrinsic::getStaticVectorLength for vp.merge|select.
Mar 22 2022, 3:42 AM · Restricted Project, Restricted Project, Restricted Project

Mar 21 2022

simoll edited reviewers for D120094: [CallingConv] Generate isCCArgumentReg() predicate via tablegen, added: kaz7; removed: simoll.
Mar 21 2022, 5:35 AM · Restricted Project, Restricted Project
simoll added a comment to D122042: [VP] Preserve address space of pointer for strided load/store intrinsics..

Update for vp_strided_load|store.ll tests to be fused into this patch (no attribution necessary):

Mar 21 2022, 3:23 AM · Restricted Project, Restricted Project
simoll accepted D122032: [VP] Make VectorBuilder take IRBuilderBase instead of IRBuilder<>.
Mar 21 2022, 2:17 AM · Restricted Project, Restricted Project
simoll updated the diff for D121913: [VP] Fix VPintrinsic::getStaticVectorLength for vp.merge|select.

Document that vp.merge, vp.select do not have a VP mask and why.

Mar 21 2022, 1:15 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D122042: [VP] Preserve address space of pointer for strided load/store intrinsics..

Landing this will break VP strided load/store RVV and VE tests. I'll attach a diff for the VE ones that you can merge into this patch.

Mar 21 2022, 1:11 AM · Restricted Project, Restricted Project

Mar 18 2022

simoll added inline comments to D121714: [VP] fm flag transfer to SDNodes [VE] VVP_FMA fusion.
Mar 18 2022, 5:28 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
simoll updated the diff for D121714: [VP] fm flag transfer to SDNodes [VE] VVP_FMA fusion.
  • alphabetic file order in the CMakeLists.txt
  • unchanged, just to reiterate: contract only checked on the fmul, not on the fadd.
  • formatting
Mar 18 2022, 3:22 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Mar 17 2022

simoll requested review of D121913: [VP] Fix VPintrinsic::getStaticVectorLength for vp.merge|select.
Mar 17 2022, 7:48 AM · Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D121816: [Clang][VE] Add vector mask intrinsics to clang.

Having timeout errors on lldb-aarch64-ubuntu builder... I'm not sure what modification on this patch causes a such error.

Mar 17 2022, 3:29 AM · Restricted Project, Restricted Project
simoll updated the diff for D121714: [VP] fm flag transfer to SDNodes [VE] VVP_FMA fusion.

Fixing an SelectionDAG bug where Flags wouldn't be set in some cases of SelectionDAG::getNode. This bug fix may have (positive) trickle down effects on other targets.

Mar 17 2022, 1:32 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Mar 16 2022

simoll accepted D121816: [Clang][VE] Add vector mask intrinsics to clang.
Mar 16 2022, 9:36 AM · Restricted Project, Restricted Project
simoll added a comment to D112155: [runtimes] Remove FOO_TARGET_TRIPLE, FOO_SYSROOT and FOO_GCC_TOOLCHAIN.

Remark: Targets need to specify CMAKE_ASM_COMPILER_TARGET or UnwindRegistersRestore|Save.S will be assembled for the default target, not the CMAKE_C|CXX_COMPILER_TARGET.
This caused the libunwind miscompile for the clang-ve-ninja bot: https://lab.llvm.org/buildbot/#/builders/91/builds/4728

Damn, that is tricky. Thanks for the heads up. Do you think we should set CMAKE_ASM_COMPILER_TARGET to CMAKE_CXX_COMPILER_TARGET? [..]
Or perhaps we could do something like:

if (NOT CMAKE_ASM_COMPILER_TARGET STREQUAL CMAKE_CXX_COMPILER_TARGET)
  message(FATAL_ERROR "I'm sure this is an oversight!")
endif()

if (NOT CMAKE_C_COMPILER_TARGET STREQUAL CMAKE_CXX_COMPILER_TARGET)
  message(FATAL_ERROR "I'm sure this is an oversight!")
endif()

Anything to either avoid the miscompile would be better than the status quo, which is kind of a trap.

Do we really know that the ASM target is always the same as the C|CXX target?
+1 for emitting a warning at least. FATAL_ERROR if we know for sure that these are always the same.

Mar 16 2022, 9:34 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
simoll accepted D121799: [VE] Remove output to /dev/stdout.

I always wondered why we are doing the <..

Mar 16 2022, 6:59 AM · Restricted Project, Restricted Project
simoll added a comment to D121292: [VP] Add vp.fcmp comparison intrinsic and docs.

VP intrinsics aren't speculatable because they have undefined behavior if %evl > static vector length

Hmm, this sort of goes against what was said in 9b99927618d338a2be3f13053ccf869a39eb1637.

Mar 16 2022, 4:00 AM · Restricted Project, Restricted Project
simoll updated the diff for D121714: [VP] fm flag transfer to SDNodes [VE] VVP_FMA fusion.

Pass NodeFlags in getNode.
Rebased.

Mar 16 2022, 3:53 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D121586: [Clang][VE] Add the rest of intrinsics to clang.

@kaz7 With rG0aab34410403 in tree you should be able to enable the masked intrinsics now.

Mar 16 2022, 3:35 AM · Restricted Project, Restricted Project
simoll committed rG91fad1167aba: [VE] v512|256 f32|64 fneg isel and tests (authored by simoll).
[VE] v512|256 f32|64 fneg isel and tests
Mar 16 2022, 3:33 AM · Restricted Project
simoll closed D121620: [VE] v512|256 f32|64 fneg isel and tests.
Mar 16 2022, 3:33 AM · Restricted Project, Restricted Project, Restricted Project
simoll committed rG0aab34410403: [Clang] Allow "ext_vector_type" applied to Booleans (authored by simoll).
[Clang] Allow "ext_vector_type" applied to Booleans
Mar 16 2022, 3:12 AM · Restricted Project
simoll closed D88905: [Clang] Allow "ext_vector_type" applied to Booleans.
Mar 16 2022, 3:12 AM · Restricted Project, Restricted Project

Mar 15 2022

simoll updated the diff for D121714: [VP] fm flag transfer to SDNodes [VE] VVP_FMA fusion.

explicit Optional<unsigned>

Mar 15 2022, 11:12 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
simoll added inline comments to D88905: [Clang] Allow "ext_vector_type" applied to Booleans.
Mar 15 2022, 11:09 AM · Restricted Project, Restricted Project
simoll updated the diff for D88905: [Clang] Allow "ext_vector_type" applied to Booleans.

Check IR generated for constexpr of bool vectors. This actually revealed a bug in the bool vector sema (bool vector comparisons would result in byte sized elements). Thx!

Mar 15 2022, 11:08 AM · Restricted Project, Restricted Project
simoll requested review of D121714: [VP] fm flag transfer to SDNodes [VE] VVP_FMA fusion.
Mar 15 2022, 9:33 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project
simoll added a comment to D88905: [Clang] Allow "ext_vector_type" applied to Booleans.

Thanks! Sema bits now LGTM

Mar 15 2022, 9:27 AM · Restricted Project, Restricted Project
simoll added a comment to D121187: [DAGCombiner][VP] Add DAGCombine for VP_MUL..

@jacquesguan Thanks for working on VP optimizations! We just discussed on the VP syncup call where we want to go with this.
The thing is that this patch is essentially replicating the existing MUL DAGCombiner patterns for VP_MUL. Ideally, we would lift the existing DAGCombiner logic to work on both VP and non-VP SDNodes - automatically.

Mar 15 2022, 9:19 AM · Restricted Project, Restricted Project
simoll added inline comments to D121620: [VE] v512|256 f32|64 fneg isel and tests.
Mar 15 2022, 7:40 AM · Restricted Project, Restricted Project, Restricted Project
simoll committed rG6ac3d8ef9c43: [VE] strided v256.23 isel and tests (authored by simoll).
[VE] strided v256.23 isel and tests
Mar 15 2022, 7:30 AM · Restricted Project
simoll closed D121616: [VE] strided v256.23 isel and tests.
Mar 15 2022, 7:30 AM · Restricted Project, Restricted Project, Restricted Project
simoll updated the diff for D121620: [VE] v512|256 f32|64 fneg isel and tests.

Formatting

Mar 15 2022, 6:04 AM · Restricted Project, Restricted Project, Restricted Project
simoll updated the diff for D88905: [Clang] Allow "ext_vector_type" applied to Booleans.

Make the sema check for bool vector arithmetic specific to ext_vector_type bool. This assures that we do not interfere with scalar bool + int vector arithmetic sema (vector.cpp test was failing before).

Mar 15 2022, 6:01 AM · Restricted Project, Restricted Project
simoll updated the diff for D88905: [Clang] Allow "ext_vector_type" applied to Booleans.

Blocking of arithmetic on bool vectors extends to the case where only one operand is a bool vector (and the other is, eg, a vector of int). This was still caught before during sema because the integer vector operand would have been implicitly truncated down to bool, which always has been illegal.
This change results in a different error message for the mixed bool-and-non-bool arithmetic case and a more explicit check in the code ("|| instead of &&").

Mar 15 2022, 5:39 AM · Restricted Project, Restricted Project
simoll added a comment to D121292: [VP] Add vp.fcmp comparison intrinsic and docs.

VP intrinsics aren't speculatable because they have undefined behavior if %evl > static vector length

Mar 15 2022, 3:43 AM · Restricted Project, Restricted Project

Mar 14 2022

simoll updated the diff for D121620: [VE] v512|256 f32|64 fneg isel and tests.

Add missing tests

Mar 14 2022, 11:29 AM · Restricted Project, Restricted Project, Restricted Project
simoll added inline comments to D88905: [Clang] Allow "ext_vector_type" applied to Booleans.
Mar 14 2022, 10:34 AM · Restricted Project, Restricted Project
simoll updated the diff for D88905: [Clang] Allow "ext_vector_type" applied to Booleans.

Simplify following comments. Thx :)

Mar 14 2022, 10:34 AM · Restricted Project, Restricted Project
simoll requested review of D121620: [VE] v512|256 f32|64 fneg isel and tests.
Mar 14 2022, 10:09 AM · Restricted Project, Restricted Project, Restricted Project