Fix the indention
Active Repositories
- rZORG LLVM Github Zorg
- Thu, May 26, 2:33 PM
- Git
Recent Activity
Today
[CompilerInstance] Tweak the condition on createCodeCompletionConsumer
@mkazantsev wrote:
Will it still work if c1 and c2 are checked in the loop?
- Rebase on top of D126215
- Accept the flag with a warning, rather than erroring out.
@alexbatashev @ftynse thank you for the quick review! If you have no further objections, could someone please approve the revision?
In D126363#3541663, @ym1813382441 wrote:We need to generate code to compute the EVL, its operand should be application vector length ,and EVL maybe depends on the canonical induction recipes.
For example,for (i = 0; i < n; ++i) c[i] = a[i] + b[i];VPlan should be modeled as:
n = trip count vector loop: { EMIT i = canonical induction EMIT avl = n - i EMIT evl = set.evl(avl, ...) (generate explicit vector length) EMIT mask = (all true mask) WIDEN t0 = vp.load(a, mask, evl) WIDEN t1 = vp.load(b, mask, evl) WIDEN t2 = vp.add(t0, t1, mask, evl) WIDEN vp.store(t2, c, mask, evl) EMIT i.next = i + evl EMIT i < n (branch-on-count) }EVL may be different in each vector iteration, and set.evl may also require more parameters, such as the upper limit of the vector length or the width of the data type.
Various situations show that we need to model it as recipe.
And "llvm.set.evl" intrinsic need to be added.
We may refer to RVV architecture to set some restrictions for "llvm.set.evl", like...evl = 0 if avl = 0 evl > 0 if avl > 0 evl ≤ VF evl ≤ avl
Address review comments
Update sort order to apply conditions that constraint ranges using max/min (predicates != EQ & != NE).
In D126363#3538646, @ABataev wrote:In D126363#3538612, @fhahn wrote:In D126363#3537438, @simoll wrote:In D126363#3537416, @ABataev wrote:We store the the %evl as the related value for VPValue *EVL using State.set(EVL, %evl, Part) and then get required value using State.get(EVL, Part).
In this case we can treat EVL similarly to canonical iv, which is not an invariant.Ok. If that works then having one global EVL per State defined this way should be fine for us for now.
The way VectorTripCount is handled at the moment is a workaround and probably shouldn't act as inspiration. AFAICT we already removed all uses during code-gen that where using State to access the vector trip count. If it is needed for code-gen of a recipe, it should be expressed as operand.
Better to treat EVL as CanonicalIV. Yes, I agree that the recipe is better choice here (similar to CaonicalIV) but it requires some extra work because of VPWidenIntOrFpInductionRecipe should depend on EVL. Probably, need to split VPWidenIntOrFpInductionRecipe to a PHI recipe and something like CanonicalIVIncrement, otherwise this dependency prevents it from being vectorized effectively.
If we need to generate code to compute the EVL, it should be modeled as recipe. If the EVL depends on any other recipes (like the canonical induction), it needs to be a recipe. If all that is needed is an opcode and operands, then it should probably just be a additional opcode in VPInstruction, instead of a new recipe.
Here is my suggestion:
- We get an explicit-vector-length recipe to compute EVL inside the vector loop. And this will be the only recipe we add because..
- We extend the existing recipes with an (optional) EVL operand. Presence of EVL implies that VP intrinsics are used for widening.
I'm afraid that it will require HUGE(!!!) amount of changes in the Vplan. I assume, there still will be the same recipe/vpvalue for EVL across of all recipes/vpinstructions.
How? I think there is some kind of misunderstanding here.
There is an existing prototype implementation that uses these three new recipes and a global EVL per vector loop.
The code for the additional recipes is small and - frankly - trivial when you know how to do it. If you use separate recipes, the existing recipes are completely unaffected by this.What part of my suggestion makes you think that there would be huge changes? Is it the adding EVL to existing recipes?
I think when deciding whether to add new recipes here, a key question is what the differences are to other existing recipes. IIUC the only difference between VPPredicatedWidenRecipe and VPWidenRecipe modulo the extra mask & EVL operands is during code-gen, right? But fundamentally both still widen an (arithmetic) operation. Whether some elements may be masked out shouldn't really matter for VPlan based analysis at the moment.
I suppose? These are the differences between VPPredicatedWidenRecipe and VPWidenRecipe, i see:
- Mask & EVL operands. The Mask operand could be optional (which implies a constant all-one mask), EVL is mandatory.
- EVL has significant performance implications on RVV and VE. If you count costing as a VPlan based analysis, then that would be one analysis in favor for having two recipe kinds.
- VPPredicatedWiden* recipes always widen to VP intrinsics. VPWiden* don't.
rebased.
LGTM
In D126364#3540673, @efriedma wrote:For FENV_ROUND, I think we should try to stick to the standard as closely as possible in Sema; since the standard models FENV_ROUND as a separate state, Sema should also model it as a separate state.
LGTM, thanks!
In D126484#3542066, @thakis wrote:Looks like this breaks tests: http://45.33.8.238/linux/77072/step_11.txt
Please take a look and revert for now if it takes a while to fix.
Rebased and added '-fno-delayed-template-parsing' option for the test file.
Of course, I forgot to create a new diff. There is some debug output and logic errors for some of the tests, but the simple ones already show the issue.
The function in question is getPackTemplateParameter, which provides similar functionality to isExpandedParameter before
Looks reasonable for extra tests.
In D126451#3541590, @zixuan-wu wrote:I don't think it is largely similar to RISCV implementation except for the code structure and test reuse. And the test result is big different.
ping
In D126465#3540710, @reames wrote:In D126465#3540602, @craig.topper wrote:In D126465#3540599, @reames wrote:I'm a bit confused by this patch. Where does the VLEN > 32 bit come from? All I kind find in the spec is that VLEN >= ELEN, and must be a power of 2. Given ELEN only has to be 8, shouldn't the smallest VLEN also be 8? This seems like an utterly useless configuration, but the spec seems to allow it?
VLEN is determined by the Zvl32b, Zvl64b, Zvl128b, etc. extensions. V implies Zvl128b. Zve64* implies Zvl64b. Zve32* implies Zvl32b. VLEN can never be less than 32 with the currently defined extensions.
Thanks for the pointer. The wording here is as follows:
"Note: Explicit use of the Zvl32b extension string is not required for any standard vector extension as they all effectively mandate at least this minimum, but the string can be useful when stating hardware capabilities."Reviewing 18.2 and 18.3 confirms that none of the proposed vector variants allow VLEN < 32.
Looks like this breaks tests: http://45.33.8.238/linux/77072/step_11.txt