Ayal (Ayal Zaks)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 12 2015, 1:48 PM (157 w, 3 d)

Recent Activity

Jun 14 2018

Ayal added inline comments to D48048: [LV] Prevent LV to run cost model twice for VF=2.
Jun 14 2018, 3:16 PM

Jun 12 2018

Ayal added inline comments to D48048: [LV] Prevent LV to run cost model twice for VF=2.
Jun 12 2018, 1:02 PM

May 1 2018

Ayal added inline comments to D46126: [SLP] Vectorize transposable binary operand bundles.
May 1 2018, 12:31 PM
Ayal added inline comments to D46126: [SLP] Vectorize transposable binary operand bundles.
May 1 2018, 8:46 AM

Apr 30 2018

Ayal added a comment to D46126: [SLP] Vectorize transposable binary operand bundles.

This is reminiscent of LV's interleave group optimization, in the sense that a couple of correlated inefficient vector "gathers" are replaced by a couple of efficiently formed vectors followed by transposing shuffles. The correlated gathers may come from the two operands of a binary operation, as in this patch, or more generally from arbitrary leaves of the SLP tree.

Apr 30 2018, 8:48 AM

Apr 1 2018

Ayal accepted D43776: [SLP] Fix PR36481: vectorize reassociated instructions..

Looks good to me, thanks for addressing the issues, have only a few last minor suggestions.

Apr 1 2018, 12:31 AM

Mar 23 2018

Ayal added a comment to D43776: [SLP] Fix PR36481: vectorize reassociated instructions..

Have test(s) for extractvalue's, for completeness.
Make sure tests cover best-order selection: cases where original order is just as frequent as other orders (tie-break), less frequent, more frequent.

Mar 23 2018, 4:58 PM

Mar 18 2018

Ayal added a comment to D44523: Change calculation of MaxVectorSize.

See MaximizeBandwidth, as in
llvm-dev's Enable vectorizer-maximize-bandwidth by default?
patch: Enable vectorizer-maximize-bandwidth by default.
which is still reverted afaik: r306936 - Revert "r306473 - re-commit r306336: Enable vectorizer-maximize-bandwidth by default."

Mar 18 2018, 1:52 PM

Mar 9 2018

Ayal added inline comments to D43776: [SLP] Fix PR36481: vectorize reassociated instructions..
Mar 9 2018, 2:44 PM

Mar 7 2018

Ayal added a comment to D43776: [SLP] Fix PR36481: vectorize reassociated instructions..

This patch addresses the following TODO, plus handles extracts:

Mar 7 2018, 11:39 PM

Feb 27 2018

Ayal resigned from D43812: [LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast..
Feb 27 2018, 12:01 PM

Feb 21 2018

Ayal resigned from D43536: [LV] Fix for PR36311, vectorizer's isUniform() abuse triggers assert in SCEV.
Feb 21 2018, 3:26 AM

Feb 10 2018

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

Hi Ayal, Sanjoy,

The last update's review has been pending for a long time. SLP has seen many changes lately, so I will have to rebase, but before rebasing please check whether any more changes are required in its current form.

Thanks in advance.

Feb 10 2018, 11:48 PM

Feb 1 2018

Ayal added inline comments to D42123: Derive GEP index type from Data Layout.
Feb 1 2018, 2:49 AM

Jan 15 2018

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Jan 15 2018, 11:49 PM

Jan 11 2018

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#971181, @Ayal wrote:

This should fix the case observed by @sanjoy in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171218/511721.html; please also include a testcase.

Test case, test/Transforms/SLPVectorizer/X86/external_user_jumbled_load.ll, already included.

Jan 11 2018, 1:15 PM

Jan 9 2018

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

This should fix the case observed by @sanjoy in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171218/511721.html; please also include a testcase.

Jan 9 2018, 8:40 AM

Dec 29 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Dec 29 2017, 7:31 AM

Dec 21 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Dec 21 2017, 3:25 AM

Dec 19 2017

Ayal added a comment to D41324: [SLPVectorizer] Add shuffle instruction cost for jumbled load.

Presumably this fixes the reported regressions?

Dec 19 2017, 2:44 AM

Dec 14 2017

Ayal accepted D38948: [LV] Support efficient vectorization of an induction with redundant casts.

Just to formally close this review, as it wasn't closed by the commit.

Dec 14 2017, 11:33 AM

Dec 11 2017

Ayal accepted D36130: [SLP] Vectorize jumbled memory loads..

This looks good to me, with a couple of last minor fixes.

Dec 11 2017, 2:51 PM

Dec 10 2017

Ayal added a comment to D38948: [LV] Support efficient vectorization of an induction with redundant casts.

This looks good to me, but please wait for Silviu to approve as well before committing.

Dec 10 2017, 10:25 AM

Dec 9 2017

Ayal accepted D40883: [LV] Ignore the cost of values that will not appear in the vectorized loop.

LGTM, thanks for taking this out of D38948 and into a separate commit.

Dec 9 2017, 1:57 PM
Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#945728, @Ayal wrote:

Good catch. Add a LIT test?

It was asserting in a few of the LNT MultiSource benchmarks. How can it be extracted into a LIT test?

Dec 9 2017, 1:30 PM

Dec 7 2017

Ayal abandoned D28975: [LV] Introducing VPlan to model the vectorized code and drive its transformation.

Hi Ayal,

This functionality has been submitted already, right? If so, please close this review.

Thanks,
--renato

Dec 7 2017, 8:21 AM

Dec 5 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

Good catch. Add a LIT test?

Dec 5 2017, 2:41 PM

Nov 28 2017

Ayal added a comment to D39346: [LV] [ScalarEvolution] Fix PR34965 - Cache pointer stride information before LV code gen.

Nice catch! Continuing to use SCEV and expect consistent answers in the midst of restructuring the IR is indeed wrong; its cache is not meant to cover for cases that can no longer be analyzed as before.

Nov 28 2017, 3:27 AM

Nov 19 2017

Ayal added inline comments to D38948: [LV] Support efficient vectorization of an induction with redundant casts.
Nov 19 2017, 10:31 AM

Nov 16 2017

Ayal added a comment to D38948: [LV] Support efficient vectorization of an induction with redundant casts.

A few additional minor comments..

Nov 16 2017, 2:39 AM

Nov 2 2017

Ayal added a comment to D38785: [LV/LAA] Avoid specializing a loop for stride=1 when this predicate implies a single-iteration loop.

In short, I think we are all in agreement that:

  1. This patch is a (small) improvement, * regardless of the users and their specific cost considerations *.
  2. The cost-model aspect of deciding when to specialize for a certain stride needs to be improved. The users (LV especially) are currently not making informed decisions.

    …Right?
Nov 2 2017, 2:47 PM
Ayal added inline comments to D38948: [LV] Support efficient vectorization of an induction with redundant casts.
Nov 2 2017, 9:33 AM

Nov 1 2017

Ayal added a comment to D38785: [LV/LAA] Avoid specializing a loop for stride=1 when this predicate implies a single-iteration loop.

Indeed, a loop with an iteration count smaller than VF is definitely not worth vectorizing. An interesting profitability issue is to decide how many iterations past VF suffice to amortize vectorization overheads. In any case, this single/no iteration case looks like a no-brainer and realistic case - traversing a column of an NxN matrix.

Nov 1 2017, 5:47 PM
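
The amortization question raised in the D38785 comment above can be illustrated with a toy cost model. This is only a sketch under stated assumptions: the function names, the flat per-iteration costs, and the fixed `overhead` term are hypothetical and are not LLVM's actual cost model.

```python
from typing import Optional


def vectorized_cost(n: int, vf: int, scalar_cost: int,
                    vector_cost: int, overhead: int) -> int:
    """Toy cost of running n iterations with a vector body plus scalar remainder.

    All parameters are hypothetical: `overhead` lumps together runtime checks
    and setup, `vector_cost` is the cost of one vector iteration covering
    vf scalar iterations, `scalar_cost` is the cost of one scalar iteration.
    """
    return overhead + (n // vf) * vector_cost + (n % vf) * scalar_cost


def break_even_trip_count(vf: int, scalar_cost: int, vector_cost: int,
                          overhead: int, limit: int = 1024) -> Optional[int]:
    """Smallest trip count >= vf at which the vectorized loop is strictly cheaper."""
    for n in range(vf, limit + 1):
        if vectorized_cost(n, vf, scalar_cost, vector_cost, overhead) < n * scalar_cost:
            return n
    return None  # never profitable within the searched range


if __name__ == "__main__":
    print(break_even_trip_count(vf=4, scalar_cost=1, vector_cost=1, overhead=6))
```

Under this model a trip count below VF never pays off whenever the overhead is positive, because the vector body executes zero times and the loop still pays the overhead on top of the scalar remainder.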

Oct 2 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

The regression test result is as follows. There are 2 failures coming from Clang; however, these failures are also observed without this patch.

Oct 2 2017, 2:55 PM

Sep 29 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
Sep 29 2017, 12:47 PM

Sep 27 2017

Ayal created D38339: [LV] Fix PR34711 - handle widening of instruction ranges in the presence of sinking casts.
Sep 27 2017, 4:53 PM
Ayal created D38338: [LV] Fix PR34743 - handle casts that sink after interleaved loads.
Sep 27 2017, 4:24 PM

Sep 26 2017

Ayal added a comment to rL311849: [LV] Fix PR34248 - recommit D32871 after revert r311304.

Can you provide a reproducer? Best is to open a PR and continue this discussion there. See e.g. https://bugs.llvm.org/show_bug.cgi?id=34711, which also contains a suggested fix to be upstreamed, which might apply to your case as well.

Sep 26 2017, 3:42 PM

Sep 19 2017

Ayal accepted D36130: [SLP] Vectorize jumbled memory loads..

Agreed, revisiting the ReverseConsecutive/NumLoadsWantToChangeOrder/shouldReorder() logic in view of general shuffled loads deserves a separate patch.

Sep 19 2017, 10:19 AM

Sep 16 2017

Ayal added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer types..

This was closed due to committing r312331, right? Code LGTM, for the record. Tests for interleaved loads of float/pointer should still be added, as this patch presumably handles them too.

Sep 16 2017, 6:17 AM

Sep 13 2017

Ayal added inline comments to D32729: LV: Don't vectorize with unknown loop counts on divergent targets.
Sep 13 2017, 1:46 PM

Sep 12 2017

Ayal accepted D37507: Fix maximum legal VF calculation.
Sep 12 2017, 9:05 AM
Ayal accepted D37702: [LV] Clamp the VF to the trip count.

@Ayal, any other comments or does this look good to go? Thanks.

Sep 12 2017, 7:51 AM
Ayal added inline comments to D37702: [LV] Clamp the VF to the trip count.
Sep 12 2017, 6:40 AM

Sep 11 2017

Ayal added inline comments to D37702: [LV] Clamp the VF to the trip count.
Sep 11 2017, 2:05 PM
Ayal added a reviewer for D37507: Fix maximum legal VF calculation: mkuper.
Sep 11 2017, 12:29 AM
Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#863237, @Ayal wrote:

Yes, thanks, the example is clear. I agree that having OpdNums allows to represent such cases where two shuffles from the same load are to feed two distinct operands of the same user. But the SLP vectorizer with this patch alone will not optimize such patterns, right? Take for example jumbled-load.ll above; to match the case depicted in the pdf, replace the zeroes used as the 2nd operands of the cmp's and/or of the select's by a shuffle (same as that of the 1st operand or different).

Yes, that's right. However, right now I am not clear on what needs to be done to optimize those patterns. I also tried another simple case, where 2 different shuffles from the same load are fed into a MUL. Here, for VF=4, SLP reports "Scalar used twice in bundle" and removes the redundant scalar operations instead of vectorizing. Consequently, the stores of the results of the MULs do not get vectorized. Maybe we need to make a trade-off between scalar and vector code here.

Sep 11 2017, 12:28 AM
Ayal added a comment to D37507: Fix maximum legal VF calculation.

Only additional comment is whether the test needs to be X86/skx specific, or whether it can be placed in, say, memdep.ll

Sep 11 2017, 12:17 AM

Sep 10 2017

Ayal added inline comments to D37425: LoopVectorize: MaxVF should not be larger than the loop trip count.
Sep 10 2017, 1:24 AM
Ayal added inline comments to D37507: Fix maximum legal VF calculation.
Sep 10 2017, 12:40 AM

Sep 8 2017

Ayal created D37619: [LV] Fix PR34523 - avoid generating redundant selects.
Sep 8 2017, 3:30 AM

Sep 7 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

Sep 7 2017, 4:45 AM

Sep 4 2017

Ayal added inline comments to D37425: LoopVectorize: MaxVF should not be larger than the loop trip count.
Sep 4 2017, 2:45 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Sep 4 2017, 12:39 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Sep 4 2017, 12:17 AM

Sep 1 2017

Ayal added inline comments to D35498: [LoopVectorizer] Use two step casting for float to pointer types..
Sep 1 2017, 2:58 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Sep 1 2017, 1:10 AM

Aug 31 2017

Ayal added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer types..

Sorry, slight overlap. Please follow Ayal's comments before committing.

Aug 31 2017, 4:06 AM
Ayal added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer types..
  1. Moved tests to Codegen/ARM. The crash disappears without arm tuple so can't remove it
Aug 31 2017, 2:06 AM

Aug 30 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Aug 30 2017, 2:49 AM

Aug 28 2017

Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Aug 28 2017, 4:11 AM
Ayal added inline comments to D36130: [SLP] Vectorize jumbled memory loads..
Aug 28 2017, 12:26 AM

Aug 23 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Fix PR34248: pack a predicated scalar into a vector only when vectorizing; avoid doing so when only unrolling. Add a test derived from the reproducer of PR34248.

Aug 23 2017, 3:41 PM

Aug 22 2017

Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..
In D36130#847002, @Ayal wrote:

sortMemAccesses() is analogous to the formation of InterleaveGroups in the LoopVectorizer, which also scans a collection of Loads (or Stores) to determine if they are adjacent in some order and can be combined into one Vector Load of a given width; and if so, in what order. This requires a single scan to compute the distances relative to the first access, as done here. But knowing that we're looking for a permutation of a given width, we can more easily sort the accesses as they are entered into a map, holding the minimum and maximum indices. See insertMember() there.

Thanks Ayal for your comments.
As this work has been pending for a long time, my current focus is to add this feature to SLP. I welcome your suggestions and would like to consider them in a separate patch after this one.

Aug 22 2017, 4:42 AM

Aug 20 2017

Ayal added a comment to D17080: [LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs.

@sbaranga, can you clarify my comments and help me understand this better? Hopefully this could move forward.

Aug 20 2017, 4:10 PM
Ayal added a comment to D36130: [SLP] Vectorize jumbled memory loads..

sortMemAccesses() is analogous to the formation of InterleaveGroups in the LoopVectorizer, which also scans a collection of Loads (or Stores) to determine if they are adjacent in some order and can be combined into one Vector Load of a given width; and if so, in what order. This requires a single scan to compute the distances relative to the first access, as done here. But knowing that we're looking for a permutation of a given width, we can more easily sort the accesses as they are entered into a map, holding the minimum and maximum indices. See insertMember() there.

Aug 20 2017, 3:52 PM
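
The map-based approach suggested in the D36130 comment above can be sketched as follows. This is an illustration only, not the code of sortMemAccesses() or insertMember(): the function name and the representation of accesses as element offsets relative to the first access are assumptions made for the example.

```python
from typing import Dict, List, Optional


def find_lane_order(offsets: List[int]) -> Optional[List[int]]:
    """Return the vector lane of each access, or None if they do not form
    a permutation of `len(offsets)` consecutive elements.

    `offsets[i]` is the (hypothetical) distance, in elements, of access i
    from access 0. Accesses are entered into a map keyed by offset while the
    minimum and maximum offsets seen so far are tracked, so a single scan
    decides both whether the accesses are adjacent and in what order.
    """
    if not offsets:
        return []
    width = len(offsets)
    members: Dict[int, int] = {}  # element offset -> position in the input
    lo = hi = offsets[0]
    for pos, off in enumerate(offsets):
        if off in members:
            return None  # two accesses touch the same element
        lo, hi = min(lo, off), max(hi, off)
        if hi - lo >= width:
            return None  # span too wide for one vector of this width
        members[off] = pos
    # width distinct offsets within a span of at most width are consecutive.
    return [off - lo for off in offsets]


if __name__ == "__main__":
    print(find_lane_order([0, 2, -1, 1]))
```

The returned list plays the role of a shuffle mask: entry i says which lane of the single wide load holds the value originally read by access i.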

Aug 19 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Previous upload missed newly added VPlan.h and VPlan.cpp, including them here. This is the version that was committed.

Aug 19 2017, 11:44 AM

Aug 16 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Uploading the version updated to top of trunk before committing, including merging with SinkAfter patch D33058 by reordering ingredients before constructing recipes for them.

Aug 16 2017, 3:33 PM

Aug 13 2017

Ayal updated the diff for D36408: [LV] Minor fixes to Sink casts to unravel first order recurrence.

Is it worth adding a test case for this? I'm not sure...

Aug 13 2017, 9:28 AM

Aug 8 2017

Ayal added a comment to D36244: [LoopVectorize] Fix assertion failure in Fcmp vectorization.

Looks good to me, please wait for @mssimpso to approve.

Aug 8 2017, 8:37 AM

Aug 7 2017

Ayal created D36408: [LV] Minor fixes to Sink casts to unravel first order recurrence.
Aug 7 2017, 9:27 AM

Aug 4 2017

Ayal added inline comments to D36244: [LoopVectorize] Fix assertion failure in Fcmp vectorization.
Aug 4 2017, 2:27 PM

Aug 2 2017

Ayal added inline comments to D36244: [LoopVectorize] Fix assertion failure in Fcmp vectorization.
Aug 2 2017, 11:37 PM

Jul 27 2017

Ayal added a comment to D35227: [LV] Don't allow outside uses of IVs if the SCEV is predicated on loop conditions.

Instead of refraining from vectorizing a loop which has an externally used phi (or rather the bump thereof) and any predicate, can a predicate be added (or an existing one be extended) to also cover the last iteration? Pity to bail out on such corner cases.

Jul 27 2017, 1:00 AM

Jul 21 2017

Ayal added a comment to D35725: [LV] Avoid redundant operations manipulating masks.

This looks fine to me, but I don't quite follow why the patch is needed. Are we just trying to match what the VPlan recipe will eventually do?

Jul 21 2017, 2:26 PM
Ayal created D35725: [LV] Avoid redundant operations manipulating masks.
Jul 21 2017, 7:38 AM

Jul 19 2017

Ayal added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Ping

Jul 19 2017, 6:31 AM

Jul 18 2017

Ayal added inline comments to D35498: [LoopVectorizer] Use two step casting for float to pointer types..
Jul 18 2017, 11:37 PM
Ayal added a comment to D35498: [LoopVectorizer] Use two step casting for float to pointer types..

... Or maybe we just load them as "data" (i32/i64?) and then bitcast safely?

Makes sense?

Jul 18 2017, 1:38 PM

Jul 12 2017

Ayal updated the diff for D34150: [LV] Test once if vector trip count is zero, instead of twice.

Added the following comment following review, and updated a testcase that was recently modified (if-conversion-nest.ll)

Jul 12 2017, 3:15 PM

Jul 11 2017

Ayal updated subscribers of D34150: [LV] Test once if vector trip count is zero, instead of twice.

Re overflow - the point is that getOrCreateTripCount() returns, basically, PSE.getBackedgeTakenCount() + 1, and that may overflow, so the "trip count" may end up being 0 if the backedge taken count is the maximum unsigned value (all ones). I don't think this is outdated, and this is behavior we want to preserve. But this patch should preserve this behavior IIUC.

Jul 11 2017, 4:20 PM
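
The overflow discussed in the D34150 comment above is ordinary fixed-width unsigned wraparound. A minimal sketch, assuming a hypothetical `trip_count` helper that models "backedge-taken count + 1" in a `bits`-wide unsigned integer (this is not the LLVM API, only the arithmetic):

```python
def trip_count(backedge_taken_count: int, bits: int = 32) -> int:
    """Model `backedge_taken_count + 1` in a `bits`-wide unsigned integer."""
    mask = (1 << bits) - 1
    return (backedge_taken_count + 1) & mask


if __name__ == "__main__":
    print(trip_count(0))          # a loop whose backedge is never taken runs once
    print(trip_count(2**32 - 1))  # the largest 32-bit count wraps around
```

The wrap happens only when the backedge-taken count is the largest value representable in the chosen width, which is why a computed trip count of 0 has to be treated as a special case rather than as "no iterations".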

Jul 3 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Patch updated to llvm trunk, adapted to the new ValueMap interface of D34473. ValueMap is extracted to a standalone struct VectorizerValueMap.

Jul 3 2017, 7:33 AM

Jun 28 2017

Ayal added a comment to D34150: [LV] Test once if vector trip count is zero, instead of twice.

ping

Jun 28 2017, 12:44 PM
Ayal updated the diff for D34373: [LV] Optimize for size when vectorizing loops with tiny trip count.

Updated version includes the comment requested by @hfinkel.

Jun 28 2017, 10:34 AM
Ayal created D34760: [LV] Fix PR33613 - retain order of insertelements per part.
Jun 28 2017, 9:04 AM

Jun 24 2017

Ayal updated the diff for D34473: [LV] Changing the interface of ValueMap.

Updated version addresses review comments.

Jun 24 2017, 4:23 PM
Ayal added inline comments to D34473: [LV] Changing the interface of ValueMap.
Jun 24 2017, 3:39 PM

Jun 21 2017

Ayal added a comment to D34373: [LV] Optimize for size when vectorizing loops with tiny trip count.

Yes, we saw a couple of ~7% improvements running eembc benchmarks on x86.

Jun 21 2017, 1:45 PM
Ayal added a comment to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

An alternative interface which may be simpler and clearer ... it may be better done in a separate patch.

Jun 21 2017, 1:04 PM
Ayal created D34473: [LV] Changing the interface of ValueMap.
Jun 21 2017, 1:00 PM

Jun 19 2017

Ayal created D34373: [LV] Optimize for size when vectorizing loops with tiny trip count.
Jun 19 2017, 4:49 PM

Jun 18 2017

Ayal added inline comments to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.
Jun 18 2017, 8:46 AM

Jun 14 2017

Ayal added a comment to D32451: Improve profile-guided heuristics to use estimated trip count..

This has been conceptually approved, but is pending:

Jun 14 2017, 10:16 AM
Ayal added a reviewer for D34150: [LV] Test once if vector trip count is zero, instead of twice: jmolloy.
Jun 14 2017, 9:38 AM

Jun 13 2017

Ayal created D34150: [LV] Test once if vector trip count is zero, instead of twice.
Jun 13 2017, 9:00 AM
Ayal added inline comments to D33058: [LV] Sink casts to unravel first order recurrence.
Jun 13 2017, 4:52 AM
Ayal updated the diff for D33058: [LV] Sink casts to unravel first order recurrence.

In D33058#751569, @mssimpso wrote:
I'm wondering if it's worth extending this to handle other kinds of instructions/expressions.

Jun 13 2017, 4:48 AM

Jun 12 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

Updated following review comments.

Jun 12 2017, 5:12 AM
Ayal added inline comments to D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.
Jun 12 2017, 2:35 AM

Jun 7 2017

Ayal updated the diff for D32871: [LV] Using VPlan to model the vectorized code and drive its transformation.

(Addressing review comments; To be completed)

Done. Will upload updated version shortly.

Jun 7 2017, 8:58 AM