Ayal (Ayal Zaks)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 12 2015, 1:48 PM (167 w, 2 d)

Recent Activity

Mon, Sep 24

Ayal accepted D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address.

Thanks for taking care of everything, this LGTM now, added only a few minor optional comments.

Mon, Sep 24, 2:55 PM

Wed, Sep 12

Ayal added a comment to D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address.

Best allow only a single store to an invariant address for now, until we're sure the last one to store is always identified correctly.

Wed, Sep 12, 5:27 AM

Mon, Sep 10

Ayal added inline comments to D51313: [LV] Fix code gen for conditionally executed uniform loads.
Mon, Sep 10, 4:54 PM

Fri, Sep 7

Ayal added a comment to D51313: [LV] Fix code gen for conditionally executed uniform loads.

(post commit review)

Fri, Sep 7, 9:35 AM

Tue, Aug 28

Ayal added inline comments to D51313: [LV] Fix code gen for conditionally executed uniform loads.
Tue, Aug 28, 3:07 PM
Ayal added a comment to D50823: [VPlan] Introduce VPCmpInst sub-class in the instruction-level representation.

Jumping from D50480:

This patch aims to model a rather special early-exit condition that restricts the execution of the entire loop body to certain iterations, rather than model general compare instructions. If preferred, an "EarlyExit" extended opcode can be introduced instead of the controversial ICmpULE. This should be easy to revisit in the future if needed.

This patch is fine as is, or rather much better with ICmpULE than EarlyExit.

This patch focuses on modeling an early-exit compare and then generating it, w/o making strategic design decisions supporting future vplan-to-vplan transformations, the interfaces they may need, potential templatization, or other long-term high-level VPlan concerns. These should be explained and discussed separately along with pros and cons of alternative solutions for supporting the desired interfaces and for holding their storage, including subclassing VPInstructions, using detached Instructions, or other possibilities.

Sure. I agree.

[Full disclosure] I have a big mental barrier in accepting your "early-exit" terminology here since I relate that term to "break out of the loop", but that's just the terminology difference. Nothing to do with the substance of this patch. [End of full disclosure]

Regarding "using detached Instructions". I fully go against that because that'll forever prohibit moving the VPlan/VPInstructions into Analysis. IR Verifier will trigger if there is a detached IR Instruction at the end of an Analysis pass. I already had a hallway chat with @lattner about a possibility of using IR Instructions and IR CFG in the detached mode (and that also requires many utilities to be usable in detached mode) and he was totally pessimistic about it. That was two years ago at 2016 Developer Conference, but nothing really has changed since then in that regard. That was the end of my hope for using detached IR Instructions, instead of introducing VPInstructions. Detached Instructions under the hood of VPInstructions is not very useful if we can't keep them between vectorization Analysis pass and vectorization Transformation pass.
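
The compare discussed in this thread can be illustrated outside of LLVM with a small scalar emulation. This is only a sketch under stated assumptions: it assumes the mask enables a lane when its induction value is unsigned-less-or-equal to the backedge-taken count (which is how the ICmpULE opcode is read here, not something this comment spells out), and `sumWithFoldedTail`, `backedgeTakenCount`, and `laneActive` are hypothetical names, not LLVM APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sums the first `tripCount` elements of `a` in blocks of VF lanes without a
// scalar remainder loop. Lanes past the last iteration are masked off by an
// unsigned "index <= backedge-taken count" compare (assumed mask form).
int sumWithFoldedTail(const std::vector<int> &a, std::size_t tripCount,
                      std::size_t VF) {
  assert(VF > 0 && tripCount <= a.size());
  if (tripCount == 0)
    return 0; // Zero-trip loops are guarded before entering the vector body.
  const std::size_t backedgeTakenCount = tripCount - 1;
  int sum = 0;
  for (std::size_t base = 0; base <= backedgeTakenCount; base += VF) {
    for (std::size_t lane = 0; lane < VF; ++lane) {
      // Per-lane mask: active only for iterations the scalar loop would run.
      const bool laneActive = (base + lane) <= backedgeTakenCount;
      if (laneActive)
        sum += a[base + lane];
    }
  }
  return sum;
}
```

With a trip count of 10 and VF of 4, the third block runs with only its first two lanes active, so no remainder loop is needed.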

Tue, Aug 28, 7:46 AM
Ayal added a comment to D51313: [LV] Fix code gen for conditionally executed uniform loads.

The above holds also for conditional loads from non-uniform addresses, that can turn into gathers, but possibly also get incorrectly scalarized w/o branches.

Tue, Aug 28, 6:42 AM

Mon, Aug 27

Ayal added inline comments to D51313: [LV] Fix code gen for conditionally executed uniform loads.
Mon, Aug 27, 4:41 PM

Aug 26 2018

Ayal added a comment to D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address.

This is what the langref states for scatter intrinsic (https://llvm.org/docs/LangRef.html#id1792):

The data stored in memory is a vector of any integer, floating-point or pointer data type. Each vector element is stored in an arbitrary memory address. Scatter with overlapping addresses is guaranteed to be ordered from least-significant to most-significant element.
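
The ordering guarantee quoted above can be pictured with a scalar emulation of a masked scatter. This is a minimal sketch, not the intrinsic's implementation; `emulateScatter` and its parameter names are hypothetical, and memory is modeled as a plain vector indexed by element.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar emulation of a masked scatter. Lanes are written in order from
// lane 0 (least significant) to the last lane, so when several active lanes
// target the same index, the value of the highest active lane remains.
void emulateScatter(std::vector<int> &memory,
                    const std::vector<std::size_t> &indices,
                    const std::vector<int> &values,
                    const std::vector<bool> &mask) {
  assert(indices.size() == values.size() && values.size() == mask.size());
  for (std::size_t lane = 0; lane < values.size(); ++lane) {
    if (!mask[lane])
      continue; // Masked-off lanes do not touch memory.
    assert(indices[lane] < memory.size());
    memory[indices[lane]] = values[lane];
  }
}
```

For a store to a loop-invariant address, every lane carries the same index, so the value that survives is the one from the last active lane, matching what the scalar loop would leave behind.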
Aug 26 2018, 9:29 AM
Ayal added a comment to D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.

Reverted to use the original ICmpULE extended opcode instead of detached ICmpInst. This can be revised quite easily once VPInstructions acquire any other form of modeling compares.

Since the VPCmpInst code is ready (D50823) and this is a clear use case where we need to model a new compare (including its predicate) that is not in the input IR, I'd appreciate if we could discuss a bit more about using the VPCmpInst approach. At least, I'd like to understand what are the concerns about the VPCmpInst approach and what other people think.

I do have concerns regarding modeling ICmpULE as an opcode only for compare instructions newly created during a VPlan-to-VPlan transformation. For example:

...

Aug 26 2018, 7:21 AM

Aug 23 2018

Ayal added inline comments to D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address.
Aug 23 2018, 12:08 PM

Aug 22 2018

Ayal added a comment to D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address.

...

Yes, the stores are scalarized. Identical replicas left as-is. Either passes such as load elimination can remove it, or we can clean it up in LV itself.

  • by revisiting LoopVectorizationCostModel::collectLoopUniforms()? ;-)

Right now, I just run instcombine after loop vectorization to clean up those unnecessary stores (and test cases make sure there's only one store left). Looks like there are other places in LV which rely on InstCombine as a cleanup pass, so it may not be that bad after all? Thoughts?

Aug 22 2018, 2:24 PM
Ayal added inline comments to D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.
Aug 22 2018, 9:17 AM
Ayal updated the diff for D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.

Addressing review comments, rebased, added a couple of asserts.

Aug 22 2018, 8:39 AM
Ayal added inline comments to D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.
Aug 22 2018, 5:38 AM

Aug 20 2018

Ayal added a comment to D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address.

...

Yes, the stores are scalarized. Identical replicas left as-is. Either passes such as load elimination can remove it, or we can clean it up in LV itself.

Aug 20 2018, 4:07 PM
Ayal added inline comments to D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.
Aug 20 2018, 3:08 PM
Ayal updated the diff for D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.

Addressed review comments.

Aug 20 2018, 3:01 PM
Ayal accepted D50778: [LV] Vectorize loops where non-phi instructions used outside loop.
Aug 20 2018, 1:16 PM

Aug 19 2018

Ayal added inline comments to D50778: [LV] Vectorize loops where non-phi instructions used outside loop.
Aug 19 2018, 1:02 AM

Aug 15 2018

Ayal added a comment to D50778: [LV] Vectorize loops where non-phi instructions used outside loop.

Suggest to update InnerLoopVectorizer::fixLCSSAPHIs() as follows, now that arbitrary values are allowed to be live-out:

unsigned LastLane = Cost->isUniformAfterVectorization(IncomingValue, VF) ? 0 : VF - 1;
Value *lastIncomingValue =
    getOrCreateScalarValue(IncomingValue, {UF - 1, LastLane});
Aug 15 2018, 3:15 PM
Ayal added a comment to D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.

I have a general question about direction, not specific to this patch.

It seems like we're adding a specific form of predication to the vectorizer in this patch and I know we already have support for various predicated load and store idioms. What are our plans in terms of supporting more general predication? For instance, I don't believe we handle loops like the following at the moment:

for (int i = 0; i < N; i++) {
  if (unlikely(i > M))
    break;
  sum += a[i];
}

Can the infrastructure in this patch be generalized to handle such cases? And if so, are there any specific plans to do so?

Aug 15 2018, 12:54 PM

Aug 14 2018

Ayal added a comment to D50665: [LV][LAA] Vectorize loop invariant values stored into loop invariant address.

The decision how to vectorize invariant stores also deserves attention: LoopVectorizationCostModel::setCostBasedWideningDecision() considers loads from uniform addresses, but not invariant stores - these may end up being scalarized or becoming a scatter; the former is preferred in this case, as the identical scalarized replicas can later be removed. In any case associated cost estimates should be provided to support overall vectorization costs. Note that vectorizing conditional invariant stores deserves special attention. Unconditional invariant stores are candidates to be sunk out of the loop, preferably before trying to vectorize it. One approach to vectorize a conditional invariant store is to check if its mask is all false, and if not to perform a single invariant scalar store, for lack of a masked-scalar-store instruction. May be worth distinguishing between uniform and divergent conditions; this check is easier to carry out in the former case.
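
The approach suggested above for a conditional invariant store (test whether the mask is all false, and otherwise perform one scalar store) might look roughly like the following scalar emulation. It is a sketch under assumptions: the trip count is taken to be a multiple of VF, and `scalarLoop`, `vectorLoop`, and `anyLaneActive` are hypothetical names rather than anything in LoopVectorize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: store an invariant value to an invariant address
// whenever the per-iteration condition holds.
void scalarLoop(const std::vector<bool> &cond, int value, int &target) {
  for (std::size_t i = 0; i < cond.size(); ++i)
    if (cond[i])
      target = value;
}

// Emulated vector loop: each block of VF iterations reduces its mask to a
// single "any lane active" bit and performs at most one scalar store.
// Assumes cond.size() is a multiple of VF (no tail handling in this sketch).
void vectorLoop(const std::vector<bool> &cond, int value, int &target,
                std::size_t VF) {
  assert(VF > 0 && cond.size() % VF == 0);
  for (std::size_t base = 0; base < cond.size(); base += VF) {
    bool anyLaneActive = false;
    for (std::size_t lane = 0; lane < VF; ++lane)
      anyLaneActive = anyLaneActive || cond[base + lane];
    if (anyLaneActive)
      target = value; // One invariant scalar store replaces VF masked stores.
  }
}
```

Because both the value and the address are invariant, it does not matter which lane was active, only whether any was. When the condition is uniform, the reduction over lanes collapses to inspecting a single lane, which is the easier case mentioned above.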

Aug 14 2018, 1:53 PM
Ayal accepted D50579: [LV] Teach about non header phis that have uses outside the loop.
Aug 14 2018, 7:52 AM
Ayal added a comment to D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.

Do you see any potential issue that could make modeling this in the VPlan native path complicated once we have predication?

Aug 14 2018, 12:25 AM

Aug 13 2018

Ayal added a comment to D50579: [LV] Teach about non header phis that have uses outside the loop.

Overall looks good to me, though it could be cleaned up a bit more?

Aug 13 2018, 2:12 PM

Aug 12 2018

Ayal added inline comments to D50579: [LV] Teach about non header phis that have uses outside the loop.
Aug 12 2018, 3:03 PM

Aug 11 2018

Ayal added inline comments to D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.
Aug 11 2018, 2:06 PM
Ayal added inline comments to D50474: [LV] Vectorize header phis that feed from if-convertable latch phis.
Aug 11 2018, 2:11 AM

Aug 9 2018

Ayal added inline comments to D50474: [LV] Vectorize header phis that feed from if-convertable latch phis.
Aug 9 2018, 3:27 PM

Aug 8 2018

Ayal created D50480: [LV] Vectorizing loops of arbitrary trip count without remainder under opt for size.
Aug 8 2018, 3:11 PM

Jun 14 2018

Ayal added inline comments to D48048: [LV] Prevent LV to run cost model twice for VF=2.
Jun 14 2018, 3:16 PM

Jun 12 2018

Ayal added inline comments to D48048: [LV] Prevent LV to run cost model twice for VF=2.
Jun 12 2018, 1:02 PM

May 1 2018

Ayal added inline comments to D46126: [SLP] Vectorize transposable binary operand bundles.
May 1 2018, 12:31 PM
Ayal added inline comments to D46126: [SLP] Vectorize transposable binary operand bundles.
May 1 2018, 8:46 AM

Apr 30 2018

Ayal added a comment to D46126: [SLP] Vectorize transposable binary operand bundles.

This is reminiscent of LV's interleave group optimization, in the sense that a couple of correlated inefficient vector "gathers" are replaced by a couple of efficiently formed vectors followed by transposing shuffles. The correlated gathers may come from the two operands of a binary operation, as in this patch, or more generally from arbitrary leaves of the SLP tree.

Apr 30 2018, 8:48 AM

Apr 1 2018

Ayal accepted D43776: [SLP] Fix PR36481: vectorize reassociated instructions..

Looks good to me, thanks for addressing the issues, have only a few last minor suggestions.

Apr 1 2018, 12:31 AM

Mar 23 2018

Ayal added a comment to D43776: [SLP] Fix PR36481: vectorize reassociated instructions..

Have test(s) for extractvalue's, for completeness.
Make sure tests cover best-order selection: cases where original order is just as frequent as other orders (tie-break), less frequent, more frequent.

Mar 23 2018, 4:58 PM

Mar 18 2018

Ayal added a comment to D44523: Change calculation of MaxVectorSize.

See MaximizeBandwidth, as in
llvm-dev's Enable vectorizer-maximize-bandwidth by default?
patch: Enable vectorizer-maximize-bandwidth by default.
which is still reverted afaik: r306936 - Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default."

Mar 18 2018, 1:52 PM

Mar 9 2018

Ayal added inline comments to D43776: [SLP] Fix PR36481: vectorize reassociated instructions..
Mar 9 2018, 2:44 PM

Mar 7 2018

Ayal added a comment to D43776: [SLP] Fix PR36481: vectorize reassociated instructions..

This patch addresses the following TODO, plus handles extracts:

Mar 7 2018, 11:39 PM

Feb 27 2018

Ayal resigned from D43812: [LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast..
Feb 27 2018, 12:01 PM

Feb 21 2018

Ayal resigned from D43536: [LV] Fix for PR36311, vectorizer's isUniform() abuse triggers assert in SCEV.
Feb 21 2018, 3:26 AM

Feb 10 2018

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

Hi Ayal, Sanjoy,

The review of the last update has been pending for a long time. Of late, SLP has seen lots of changes, so I will have to rebase, but before rebasing please see whether any more changes are required in its current form.

Thanks in advance.

Feb 10 2018, 11:48 PM

Feb 1 2018

Ayal added inline comments to D42123: Derive GEP index type from Data Layout.
Feb 1 2018, 2:49 AM

Jan 15 2018

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Jan 15 2018, 11:49 PM

Jan 11 2018

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#971181, @Ayal wrote:

This should fix the case observed by @sanjoy in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171218/511721.html; please also include a testcase.

Test case, test/Transforms/SLPVectorizer/X86/external_user_jumbled_load.ll, already included.

Jan 11 2018, 1:15 PM

Jan 9 2018

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

This should fix the case observed by @sanjoy in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171218/511721.html; please also include a testcase.

Jan 9 2018, 8:40 AM

Dec 29 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Dec 29 2017, 7:31 AM

Dec 21 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Dec 21 2017, 3:25 AM

Dec 19 2017

Ayal added a comment to D41324: [SLPVectorizer] Add shuffle instruction cost for jumbled load.

Presumably this fixes the reported regressions?

Dec 19 2017, 2:44 AM

Dec 14 2017

Ayal accepted D38948: [LV] Support efficient vectorization of an induction with redundant casts.

Just to formally close this review, as it wasn't closed by the commit.

Dec 14 2017, 11:33 AM

Dec 11 2017

Ayal accepted D36130: [SLP] Vectorize jumbled memory loads..

This looks good to me, with a couple of last minor fixes.

Dec 11 2017, 2:51 PM

Dec 10 2017

Ayal added a comment to D38948: [LV] Support efficient vectorization of an induction with redundant casts.

This looks good to me, but please wait for Silviu to approve as well before committing.

Dec 10 2017, 10:25 AM

Dec 9 2017

Ayal accepted D40883: [LV] Ignore the cost of values that will not appear in the vectorized loop.

LGTM, thanks for taking this out of D38948 and into a separate commit.

Dec 9 2017, 1:57 PM
Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#945728, @Ayal wrote:

Good catch. Add a LIT test?

It was asserting in a few of the LNT MultiSource benchmarks. How can it be extracted into a LIT test?

Dec 9 2017, 1:30 PM

Dec 7 2017

Ayal abandoned D28975: [LV] Introducing VPlan to model the vectorized code and drive its transformation.

Hi Ayal,

This functionality has been submitted already, right? If so, please close this review.

Thanks,
--renato

Dec 7 2017, 8:21 AM

Dec 5 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

Good catch. Add a LIT test?

Dec 5 2017, 2:41 PM

Nov 28 2017

Ayal added a comment to D39346: [LV] [ScalarEvolution] Fix PR34965 - Cache pointer stride information before LV code gen.

Nice catch! Continuing to use SCEV and expect consistent answers in the midst of restructuring the IR is indeed wrong; its cache is not meant to cover for cases that can no longer be analyzed as before.

Nov 28 2017, 3:27 AM

Nov 19 2017

Ayal added inline comments to D38948: [LV] Support efficient vectorization of an induction with redundant casts.
Nov 19 2017, 10:31 AM

Nov 16 2017

Ayal added a comment to D38948: [LV] Support efficient vectorization of an induction with redundant casts.

A few additional minor comments..

Nov 16 2017, 2:39 AM

Nov 2 2017

Ayal added a comment to D38785: [LV/LAA] Avoid specializing a loop for stride=1 when this predicate implies a single-iteration loop.

In short, I think we are all in agreement that:

  1. This patch is a (small) improvement, * regardless of the users and their specific cost considerations *.
  2. The cost-model aspect of deciding when to specialize for a certain stride needs to be improved. The users (LV especially) are currently not making informed decisions.

    …Right?
Nov 2 2017, 2:47 PM
Ayal added inline comments to D38948: [LV] Support efficient vectorization of an induction with redundant casts.
Nov 2 2017, 9:33 AM

Nov 1 2017

Ayal added a comment to D38785: [LV/LAA] Avoid specializing a loop for stride=1 when this predicate implies a single-iteration loop.

Indeed, a loop with an iteration count smaller than VF is definitely not worth vectorizing. An interesting profitability issue is to decide how many iterations past VF suffice to amortize vectorization overheads. In any case, this single/no iteration case looks like a no-brainer and realistic case - traversing a column of an NxN matrix.

Nov 1 2017, 5:47 PM

Oct 2 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

The regression test result is as follows. There are 2 failures coming from Clang; however, these failures are also observed without this patch.

Oct 2 2017, 2:55 PM

Sep 29 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
Sep 29 2017, 12:47 PM

Sep 27 2017

Ayal created D38339: [LV] Fix PR34711 - handle widening of instruction ranges in the presence of sinking casts.
Sep 27 2017, 4:53 PM
Ayal created D38338: [LV] Fix PR34743 - handle casts that sink after interleaved loads.
Sep 27 2017, 4:24 PM

Sep 26 2017

Ayal added a comment to rL311849: [LV] Fix PR34248 - recommit D32871 after revert r311304.

Can you provide a reproducer? Best is to open a PR and continue this discussion there. See e.g. https://bugs.llvm.org/show_bug.cgi?id=34711, which also contains a suggested fix to be upstreamed, which might apply to your case as well.

Sep 26 2017, 3:42 PM

Sep 19 2017

Ayal accepted D36130: [SLP] Vectorize jumbled memory loads..

Agreed, revisiting the ReverseConsecutive/NumLoadsWantToChangeOrder/shouldReorder() logic in view of general shuffled loads deserves a separate patch.

Sep 19 2017, 10:19 AM

Sep 16 2017

Ayal added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer types..

This was closed due to committing r312331, right? Code LGTM, for the record. Tests for interleaved loads of float/pointer should still be added, as this patch presumably handles them too.

Sep 16 2017, 6:17 AM

Sep 13 2017

Ayal added inline comments to D32729: LV: Don't vectorize with unknown loop counts on divergent targets.
Sep 13 2017, 1:46 PM

Sep 12 2017

Ayal accepted D37507: Fix maximum legal VF calculation.
Sep 12 2017, 9:05 AM
Ayal accepted D37702: [LV] Clamp the VF to the trip count.

@Ayal, any other comments or does this look good to go? Thanks.

Sep 12 2017, 7:51 AM
Ayal added inline comments to D37702: [LV] Clamp the VF to the trip count.
Sep 12 2017, 6:40 AM

Sep 11 2017

Ayal added inline comments to D37702: [LV] Clamp the VF to the trip count.
Sep 11 2017, 2:05 PM
Ayal added a reviewer for D37507: Fix maximum legal VF calculation: mkuper.
Sep 11 2017, 12:29 AM
Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#863237, @Ayal wrote:

Yes, thanks, the example is clear. I agree that having OpdNums allows to represent such cases where two shuffles from the same load are to feed two distinct operands of the same user. But the SLP vectorizer with this patch alone will not optimize such patterns, right? Take for example jumbled-load.ll above; to match the case depicted in the pdf, replace the zeroes used as the 2nd operands of the cmp's and/or of the select's by a shuffle (same as that of the 1st operand or different).

Yes, that's right. However, right now I am not clear on what needs to be done to optimize those patterns. I also tried another simple case, where 2 different shuffles from the same LOAD are fed into a MUL. Here, for VF=4, SLP reports "Scalar used twice in bundle" and removes redundant scalar operations instead of vectorizing. Consequently, the STOREs of the results of the MULs do not get vectorized. Maybe we need to make a trade-off between scalar and vector code here.

Sep 11 2017, 12:28 AM
Ayal added a comment to D37507: Fix maximum legal VF calculation.

Only additional comment is whether the test needs to be X86/skx specific, or whether it can be placed in, say, memdep.ll

Sep 11 2017, 12:17 AM

Sep 10 2017

Ayal added inline comments to D37425: LoopVectorize: MaxVF should not be larger than the loop trip count.
Sep 10 2017, 1:24 AM
Ayal added inline comments to D37507: Fix maximum legal VF calculation.
Sep 10 2017, 12:40 AM

Sep 8 2017

Ayal created D37619: [LV] Fix PR34523 - avoid generating redundant selects.
Sep 8 2017, 3:30 AM

Sep 7 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

Sep 7 2017, 4:45 AM

Sep 4 2017

Ayal added inline comments to D37425: LoopVectorize: MaxVF should not be larger than the loop trip count.
Sep 4 2017, 2:45 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Sep 4 2017, 12:39 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Sep 4 2017, 12:17 AM

Sep 1 2017

Ayal added inline comments to D35498: [LoopVectorizer] Use two step casting for float to pointer types..
Sep 1 2017, 2:58 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Sep 1 2017, 1:10 AM

Aug 31 2017

Ayal added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer types..

Sorry, slight overlap. Please follow Ayal's comments before committing.

Aug 31 2017, 4:06 AM
Ayal added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer types..
  1. Moved tests to Codegen/ARM. The crash disappears without the ARM triple, so it can't be removed.
Aug 31 2017, 2:06 AM

Aug 30 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Aug 30 2017, 2:49 AM

Aug 28 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Aug 28 2017, 4:11 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Aug 28 2017, 12:26 AM

Aug 23 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Fix PR34248: pack a predicated scalar into a vector only when vectorizing; avoid doing so when only unrolling. Add a test derived from the reproducer of PR34248.

Aug 23 2017, 3:41 PM

Aug 22 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#847002, @Ayal wrote:

sortMemAccesses() is analogous to the formation of InterleaveGroups in the LoopVectorizer, which also scans a collection of Loads (or Stores) to determine if they are adjacent in some order and can be combined into one Vector Load of a given width; and if so, in what order. This requires a single scan to compute the distances relative to the first access, as done here. But knowing that we're looking for a permutation of a given width, we can more easily sort the accesses as they are entered into a map, holding the minimum and maximum indices. See insertMember() there.

Thanks Ayal for your comments.
This work has been pending for a long time, and my current focus is on adding this feature to SLP. I welcome your suggestions and would like to consider them in a separate patch after this one.

Aug 22 2017, 4:42 AM

Aug 20 2017

Ayal added a comment to D17080: [LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs.

@sbaranga, can you clarify my comments and help me understand this better? Hopefully this could move forward.

Aug 20 2017, 4:10 PM
Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

sortMemAccesses() is analogous to the formation of InterleaveGroups in the LoopVectorizer, which also scans a collection of Loads (or Stores) to determine if they are adjacent in some order and can be combined into one Vector Load of a given width; and if so, in what order. This requires a single scan to compute the distances relative to the first access, as done here. But knowing that we're looking for a permutation of a given width, we can more easily sort the accesses as they are entered into a map, holding the minimum and maximum indices. See insertMember() there.
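
The map-based idea described above could be sketched as follows. This is an illustration only, not the code of sortMemAccesses() or insertMember(); `consecutivePermutation` is a hypothetical helper, and the input is assumed to be element offsets of each access relative to the first one.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Decides whether the accesses form a permutation of one contiguous run of
// the same width. On success, mask[pos] is the lane of the single wide load
// that the access at original position `pos` reads.
bool consecutivePermutation(const std::vector<long> &offsets,
                            std::vector<std::size_t> &mask) {
  assert(!offsets.empty());
  // The ordered map sorts accesses as they are inserted and exposes the
  // minimum and maximum offsets at its two ends.
  std::map<long, std::size_t> byOffset;
  for (std::size_t pos = 0; pos < offsets.size(); ++pos)
    if (!byOffset.emplace(offsets[pos], pos).second)
      return false; // Two accesses to the same element: not a permutation.
  const long minOffset = byOffset.begin()->first;
  const long maxOffset = byOffset.rbegin()->first;
  // Distinct offsets spanning exactly `width` elements must be contiguous.
  if (maxOffset - minOffset + 1 != static_cast<long>(offsets.size()))
    return false;
  mask.assign(offsets.size(), 0);
  for (std::size_t pos = 0; pos < offsets.size(); ++pos)
    mask[pos] = static_cast<std::size_t>(offsets[pos] - minOffset);
  return true;
}
```

For loads of a[1], a[3], a[0], a[2], the offsets relative to the first access are 0, 2, -1, 1, and the resulting mask is 1, 3, 0, 2: one wide load of a[0..3] followed by a shuffle with that mask.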

Aug 20 2017, 3:52 PM

Aug 19 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Previous upload missed newly added VPlan.h and VPlan.cpp, including them here. This is the version that was committed.

Aug 19 2017, 11:44 AM

Aug 16 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Uploading the version updated to top of trunk before committing, including merging with SinkAfter patch D33058 by reordering ingredients before constructing recipes for them.

Aug 16 2017, 3:33 PM

Aug 13 2017

Ayal updated the diff for D36408: [LV] Minor fixes to Sink casts to unravel first order recurrence.

Is it worth adding a test case for this? I'm not sure...

Aug 13 2017, 9:28 AM