anton-afanasyev (Anton Afanasyev)
User

Projects

User does not belong to any projects.

User Details

User Since
Sep 30 2018, 2:53 PM (132 w, 4 h)

Recent Activity

Fri, Apr 9

anton-afanasyev added inline comments to D98714: [SLP] Add insertelement instructions to vectorizable tree.
Fri, Apr 9, 10:19 AM · Restricted Project
anton-afanasyev added inline comments to D98714: [SLP] Add insertelement instructions to vectorizable tree.
Fri, Apr 9, 10:07 AM · Restricted Project
anton-afanasyev added inline comments to D98714: [SLP] Add insertelement instructions to vectorizable tree.
Fri, Apr 9, 9:49 AM · Restricted Project
anton-afanasyev added inline comments to D98714: [SLP] Add insertelement instructions to vectorizable tree.
Fri, Apr 9, 9:09 AM · Restricted Project

Wed, Apr 7

anton-afanasyev updated the summary of D98714: [SLP] Add insertelement instructions to vectorizable tree.
Wed, Apr 7, 3:49 AM · Restricted Project
anton-afanasyev updated the diff for D98714: [SLP] Add insertelement instructions to vectorizable tree.

Update summary

Wed, Apr 7, 3:48 AM · Restricted Project
anton-afanasyev added a comment to D98714: [SLP] Add insertelement instructions to vectorizable tree.

Refactored the code and removed the new TreeEntry::State. InsertElementInst now passes through ordinary scheduling, allowing inner deps within a bundle of insertelements.

Wed, Apr 7, 3:47 AM · Restricted Project

Wed, Mar 17

anton-afanasyev closed D98423: Fix the trunc instruction insertion problem in SLP pass.

Closed by commit: https://reviews.llvm.org/rG9abe50047330
(modified summary a bit to comply with changes)

Wed, Mar 17, 4:26 AM · Restricted Project
anton-afanasyev committed rG9abe50047330: [SLP] Fix the trunc instruction insertion problem (authored by bule).
[SLP] Fix the trunc instruction insertion problem
Wed, Mar 17, 3:52 AM
anton-afanasyev added a comment to D97691: [SLP] Honor min/max regsize and min/max VF in vectorizeStores.

LGTM except for the comment. I think the MaxVecRegSize % EltSize != 0 check could be removed (together with the comment).

Wed, Mar 17, 3:49 AM · Restricted Project
anton-afanasyev added inline comments to D98423: Fix the trunc instruction insertion problem in SLP pass.
Wed, Mar 17, 3:09 AM · Restricted Project
anton-afanasyev closed D98596: precommit for D98423.

Closed by commit https://reviews.llvm.org/rGdd90c36d601e

Wed, Mar 17, 3:08 AM · Restricted Project
anton-afanasyev committed rGdd90c36d601e: [SLP][Test] Precommit test for D98423 (authored by bule).
[SLP][Test] Precommit test for D98423
Wed, Mar 17, 2:12 AM
anton-afanasyev reopened D98596: precommit for D98423.

Sorry, prematurely closed this revision by mistake.

Wed, Mar 17, 1:02 AM · Restricted Project
anton-afanasyev closed D98596: precommit for D98423.

Fix the duplicate element error that caused the failure in CI

Ok, is it working now? I could commit it on your behalf.

Thanks a lot, I am waiting for the build status. Does it still need @anton-afanasyev to review it first, since I have changed the code a lot?

Wed, Mar 17, 1:00 AM · Restricted Project

Tue, Mar 16

anton-afanasyev added inline comments to D98714: [SLP] Add insertelement instructions to vectorizable tree.
Tue, Mar 16, 10:00 AM · Restricted Project
anton-afanasyev added a comment to D98596: precommit for D98423.

Fix the duplicate element error that caused the failure in CI

Tue, Mar 16, 8:48 AM · Restricted Project
anton-afanasyev abandoned D72689: [SLP] Revectorize partially vectorized instructions.

The same work is resumed here: D98714, so abandoning this.

Tue, Mar 16, 8:33 AM · Restricted Project
anton-afanasyev abandoned D96791: [SLP] Double UserCost compensation for vector store of aggregate.

Here is another fix approach: https://reviews.llvm.org/D98714

Tue, Mar 16, 8:23 AM · Restricted Project
anton-afanasyev requested review of D98714: [SLP] Add insertelement instructions to vectorizable tree.
Tue, Mar 16, 8:21 AM · Restricted Project

Mon, Mar 15

anton-afanasyev committed rG3cec93b405f2: [SLP][Test] Precommit test for PR40522 (authored by anton-afanasyev).
[SLP][Test] Precommit test for PR40522
Mon, Mar 15, 5:54 AM

Sun, Mar 14

anton-afanasyev accepted D98423: Fix the trunc instruction insertion problem in SLP pass.

LGTM

Sun, Mar 14, 10:41 AM · Restricted Project

Sat, Mar 13

anton-afanasyev added inline comments to D98423: Fix the trunc instruction insertion problem in SLP pass.
Sat, Mar 13, 9:15 PM · Restricted Project
anton-afanasyev accepted D98596: precommit for D98423.
Sat, Mar 13, 9:07 PM · Restricted Project
anton-afanasyev added a comment to D96791: [SLP] Double UserCost compensation for vector store of aggregate.

I'm trying another approach here, so abandoned this for a while.

Sat, Mar 13, 4:44 AM · Restricted Project

Mar 11 2021

anton-afanasyev added a comment to D98423: Fix the trunc instruction insertion problem in SLP pass.

Also, could you please precommit the test, or just make the diff against it (to see the test changes before and after the patch)?

Mar 11 2021, 9:15 AM · Restricted Project

Mar 3 2021

anton-afanasyev added inline comments to D97691: [SLP] Honor min/max regsize and min/max VF in vectorizeStores.
Mar 3 2021, 12:35 PM · Restricted Project

Feb 26 2021

anton-afanasyev accepted D94992: [SLP]Merge reorder and reuse shuffles..

LGTM

Feb 26 2021, 2:29 PM · Restricted Project

Feb 25 2021

anton-afanasyev added a comment to D94992: [SLP]Merge reorder and reuse shuffles..

LG after addressing all comments.

Feb 25 2021, 3:21 AM · Restricted Project

Feb 22 2021

anton-afanasyev committed rG5207151cf652: [SLP][Test] Add test for PR49081.ll (authored by anton-afanasyev).
[SLP][Test] Add test for PR49081.ll
Feb 22 2021, 10:38 PM
anton-afanasyev added a comment to D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors..

Actually, it is not reducing. This is how the test-suite Python script works: here lhs is the number of instructions after this patch, and rhs is the number before. And the smaller the relative number, the more vector instructions we actually generate.

Feb 22 2021, 6:34 AM · Restricted Project
anton-afanasyev added a comment to D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors..

This is an integral patch, going to split it into several smaller patches.

Feb 22 2021, 5:09 AM · Restricted Project
anton-afanasyev added a comment to D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors..

Btw, how can the reduction in the NumVectorInstructions stat after this patch be explained?

Feb 22 2021, 5:05 AM · Restricted Project

Feb 18 2021

anton-afanasyev updated the diff for D96791: [SLP] Double UserCost compensation for vector store of aggregate.

Added fp2int scalar and vector test cases

Feb 18 2021, 5:10 AM · Restricted Project
anton-afanasyev added a comment to D96791: [SLP] Double UserCost compensation for vector store of aggregate.

I can add an fp2int sample, but it is not touched by this patch. Actually, the tree built for it is not vectorized because it is marked as "tiny" (the R.isTreeTinyAndNotFullyVectorizable() function): its size is 2 (since fp2int is a unary operation, whereas add is a binary one). This could be fixed by checking that stores use its result, so that the tree size could be increased. But this looks too hacky, doesn't it?

Feb 18 2021, 2:39 AM · Restricted Project
anton-afanasyev updated the diff for D96791: [SLP] Double UserCost compensation for vector store of aggregate.

Cleaned pr40522.ll

Feb 18 2021, 2:29 AM · Restricted Project
anton-afanasyev added a comment to D96791: [SLP] Double UserCost compensation for vector store of aggregate.

I think the better approach would be to pass the list of InsertUses as a buildTree function UserIgnoreLst argument. You just need to correctly generate ExtractElement instructions for these InsertElements because currently the compiler just crashes trying to remove instructions used as operands for the original InsertElements. Thoughts?

Feb 18 2021, 1:18 AM · Restricted Project

Feb 17 2021

anton-afanasyev accepted D96818: [SLP]No need to mark scatter load pointer as scalar as it gets vectorized..

LGTM

Feb 17 2021, 7:40 AM · Restricted Project

Feb 16 2021

anton-afanasyev requested review of D96791: [SLP] Double UserCost compensation for vector store of aggregate.
Feb 16 2021, 8:32 AM · Restricted Project

Feb 5 2021

anton-afanasyev abandoned D94974: [SLP] Try doubled MaxElts for stores vectorization.
Feb 5 2021, 5:43 AM · Restricted Project

Feb 2 2021

anton-afanasyev added a comment to D94974: [SLP] Try doubled MaxElts for stores vectorization.

Given what was said above, I'm going to abandon this change. It looks like over-optimization that breaks the LLVM IR middle-end abstraction.

Feb 2 2021, 7:20 AM · Restricted Project
anton-afanasyev added inline comments to D94974: [SLP] Try doubled MaxElts for stores vectorization.
Feb 2 2021, 7:20 AM · Restricted Project

Jan 27 2021

anton-afanasyev added a comment to D57779: [SLP] Add support for throttling..

At Dinar's request, I've measured compile time regression: http://llvm-compile-time-tracker.com/compare.php?from=f3449ed6073cac58efd9b62d0eb285affa650238&to=39362e11add238c45a7a7d55c1e002005f396fb7&stat=instructions. The regression is visible, but it is acceptable for such change imho. The largest regression comes from CMakeFiles/clamscan.dir/libclamav_uuencode.c.o (+11.28%), so one can investigate this particular file.

Jan 27 2021, 11:10 AM · Restricted Project
anton-afanasyev added inline comments to D94974: [SLP] Try doubled MaxElts for stores vectorization.
Jan 27 2021, 10:34 AM · Restricted Project
anton-afanasyev updated the diff for D94974: [SLP] Try doubled MaxElts for stores vectorization.

Clean up tests

Jan 27 2021, 10:33 AM · Restricted Project
anton-afanasyev added inline comments to D94974: [SLP] Try doubled MaxElts for stores vectorization.
Jan 27 2021, 9:22 AM · Restricted Project
anton-afanasyev updated the diff for D94974: [SLP] Try doubled MaxElts for stores vectorization.

Small test fix

Jan 27 2021, 9:19 AM · Restricted Project

Jan 19 2021

anton-afanasyev updated the diff for D94974: [SLP] Try doubled MaxElts for stores vectorization.

Fixed comment containing this D94974 revision number

Jan 19 2021, 8:21 AM · Restricted Project
anton-afanasyev requested review of D94974: [SLP] Try doubled MaxElts for stores vectorization.
Jan 19 2021, 8:19 AM · Restricted Project

Jan 18 2021

anton-afanasyev added a comment to D94713: Do not traverse ConstantData use-list in SLPVectorizer.

You are planning to revert this patch after the ConstantData use-list is removed, am I right?

Jan 18 2021, 2:24 AM · Restricted Project

Jan 13 2021

anton-afanasyev accepted D94446: [SLP] Don't vectorize stores of non-packed types (like i1, i2).

Yes, thanks.

Jan 13 2021, 3:52 PM · Restricted Project

Jan 12 2021

anton-afanasyev added a comment to D94446: [SLP] Don't vectorize stores of non-packed types (like i1, i2).

Looks good, but could you please precommit the test (and rebase) so we can see the actual output difference?

Jan 12 2021, 7:19 AM · Restricted Project

Jan 4 2021

anton-afanasyev accepted D93967: [SLP]Need shrink the load vector after reordering..

LGTM

Jan 4 2021, 6:59 AM · Restricted Project

Jan 1 2021

anton-afanasyev added inline comments to D93967: [SLP]Need shrink the load vector after reordering..
Jan 1 2021, 9:26 AM · Restricted Project

Dec 18 2020

anton-afanasyev added a comment to D93192: [SLP] Fix vector element size for the store chains.

In short: I'm still sure that the current patch is a bug fix. But this bug had induced some boundary vectorization cases before it was fixed. Below is a typical case.

Dec 18 2020, 8:06 AM · Restricted Project

Dec 14 2020

anton-afanasyev added a comment to D93192: [SLP] Fix vector element size for the store chains.
Dec 14 2020, 8:47 AM · Restricted Project
anton-afanasyev committed rGfac7c7ec3ccd: [SLP] Fix vector element size for the store chains (authored by anton-afanasyev).
[SLP] Fix vector element size for the store chains
Dec 14 2020, 4:54 AM
anton-afanasyev closed D93192: [SLP] Fix vector element size for the store chains.
Dec 14 2020, 4:54 AM · Restricted Project

Dec 13 2020

anton-afanasyev committed rGb8c847ee731b: [SLP][Test] Precommit test for D93192 (authored by anton-afanasyev).
[SLP][Test] Precommit test for D93192
Dec 13 2020, 10:27 PM
anton-afanasyev requested review of D93192: [SLP] Fix vector element size for the store chains.
Dec 13 2020, 10:22 PM · Restricted Project

Dec 9 2020

anton-afanasyev committed rGe5bf2e898946: [SLP] Use the width of value truncated just before storing (authored by anton-afanasyev).
[SLP] Use the width of value truncated just before storing
Dec 9 2020, 5:39 AM
anton-afanasyev closed D92824: [SLP] Use the width of value truncated just before storing.
Dec 9 2020, 5:39 AM · Restricted Project
anton-afanasyev added a comment to D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors..

AFAICT the only outstanding question is whether the compile time increase is acceptable?

Dec 9 2020, 4:58 AM · Restricted Project

Dec 8 2020

anton-afanasyev added inline comments to D92824: [SLP] Use the width of value truncated just before storing.
Dec 8 2020, 7:06 AM · Restricted Project
anton-afanasyev added a comment to D92824: [SLP] Use the width of value truncated just before storing.

@anton-afanasyev Please can you rebase? I added extra test coverage in rG41d0666391131ddee451085c72ba6513872e7f6c

Dec 8 2020, 6:15 AM · Restricted Project
anton-afanasyev updated the diff for D92824: [SLP] Use the width of value truncated just before storing.

Rebase

Dec 8 2020, 6:11 AM · Restricted Project
anton-afanasyev planned changes to D91919: [SLP] Make SLPVectorizer to use `llvm.masked.scatter` intrinsic.

This is actually NFC for now, since stores can only be seed entries, and we currently collect only consecutive stores there. This will change in future commits.

Are you looking at supporting non-consecutive stores? This patch probably shouldn't be reviewed until we actually have something that uses and tests it.

Dec 8 2020, 4:57 AM · Restricted Project
anton-afanasyev added inline comments to D92824: [SLP] Use the width of value truncated just before storing.
Dec 8 2020, 2:12 AM · Restricted Project
anton-afanasyev requested review of D92824: [SLP] Use the width of value truncated just before storing.
Dec 8 2020, 1:59 AM · Restricted Project
anton-afanasyev committed rG6c3f56efa6e6: [SLP][Test] Differentiate SSE/AVX512 test coverage (NFC) (authored by anton-afanasyev).
[SLP][Test] Differentiate SSE/AVX512 test coverage (NFC)
Dec 8 2020, 1:02 AM

Dec 7 2020

anton-afanasyev committed rG50bff64158e9: [SLP][Test] Add test for PR46983 (authored by anton-afanasyev).
[SLP][Test] Add test for PR46983
Dec 7 2020, 10:08 AM
anton-afanasyev added inline comments to D57779: [SLP] Add support for throttling..
Dec 7 2020, 8:43 AM · Restricted Project
anton-afanasyev accepted D92668: [SLP]Merge reorder and reuse shuffles..

Looks good to me, just one style remark: we have newlines between function members for all classes throughout this module, so I'd prefer to see the same for ShuffleInstructionBuilder class.

Dec 7 2020, 7:00 AM · Restricted Project
anton-afanasyev added inline comments to D92668: [SLP]Merge reorder and reuse shuffles..
Dec 7 2020, 6:50 AM · Restricted Project
anton-afanasyev added inline comments to D92668: [SLP]Merge reorder and reuse shuffles..
Dec 7 2020, 6:37 AM · Restricted Project

Dec 5 2020

anton-afanasyev added a comment to D92701: [SLPVectorize] Call isLegalMaskedGather before creating a gather TreeEntry.

I agree with @fhahn: tuning the cost model is the better way. There could also be an arch where gathers are missing but it is still beneficial to use them for building the vectorization tree. They are lowered to scalarized instructions later.

Dec 5 2020, 3:01 AM · Restricted Project

Dec 4 2020

anton-afanasyev added a comment to D92668: [SLP]Merge reorder and reuse shuffles..

The same work will be done by instcombine, so we can just zero the redundant cost here to gain the same effect globally. But it makes sense to do shuffle merging at the vectorization stage as well, of course.

Dec 4 2020, 2:09 PM · Restricted Project

Dec 1 2020

anton-afanasyev updated subscribers of D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors..

Btw, I've observed significant compile-time regression with this patch: http://llvm-compile-time-tracker.com/compare.php?from=99d82412f822190a6caa3e3a5b9f87b71f56de47&to=81b636bae72c967f526bcd18de45a6f4a76daa41&stat=instructions (thanks to @nikic for awesome service). This could be justified in case of comparable performance improvements but have you done any benchmarking?

Dec 1 2020, 3:29 PM · Restricted Project

Nov 25 2020

anton-afanasyev added a comment to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

It sounds like the throttling patch should resolve this issue, as cutting out a ScatterVectorize entry with high cost will effectively return to the previous behavior.

Nov 25 2020, 10:02 AM · Restricted Project
anton-afanasyev added a comment to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

Talked with @dtemirbulatov privately and reached a consensus that his patch reviews.llvm.org/D57779 fixes the issue defined above by @vdmitrie in general.

Nov 25 2020, 7:54 AM · Restricted Project
anton-afanasyev added a comment to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

Current SLP has a significant drawback with regard to its cost modeling, and this patch highlights it.
Consider four scalar loads of i8 type. With the prior approach (vectorization overhead), the cost of such an entry was 4 (x86 target).
With this new approach we have two entries instead of one: ScatterVectorize loads + NeedToGather GEPs. The costs of these entries are 6 and 10 respectively, so the cost increases from 4 to 16.
And the problem here is that once we put this pattern into the tree, it pulls the cost up for the entire tree. If we have multiple such patterns over the tree, their effect is magnified. These entries finally outweigh the possible profit of vectorization for the remaining portion of the tree, and we end up not vectorizing it at all (even if downstream optimizations could probably change it into optimal code). If SLP could choose between vectorization overhead and the gather intrinsic based on their costs while building the vectorizable tree, the outcome could be different.

Nov 25 2020, 3:20 AM · Restricted Project

Nov 23 2020

anton-afanasyev added inline comments to D57059: [SLP] Initial support for the vectorization of the non-power-of-2 vectors..
Nov 23 2020, 1:01 PM · Restricted Project
anton-afanasyev added a comment to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

I believe this issue is related to the default cost for getGatherScatterOpCost(). For archs that have no gather/scatter instructions we use TargetTransformInfoImplBase::getGatherScatterOpCost(), which returns 1 unconditionally: https://github.com/llvm/llvm-project/blob/release/11.x/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h#L480
I'd fix it by setting the default cost to something like 1024.

Nov 23 2020, 8:52 AM · Restricted Project
anton-afanasyev updated the diff for D91919: [SLP] Make SLPVectorizer to use `llvm.masked.scatter` intrinsic.

Align fix. Style fix.

Nov 23 2020, 2:08 AM · Restricted Project

Nov 21 2020

anton-afanasyev added a comment to D91919: [SLP] Make SLPVectorizer to use `llvm.masked.scatter` intrinsic.

Analogue of https://reviews.llvm.org/D90445 for stores.

Nov 21 2020, 11:05 AM · Restricted Project
anton-afanasyev requested review of D91919: [SLP] Make SLPVectorizer to use `llvm.masked.scatter` intrinsic.
Nov 21 2020, 11:03 AM · Restricted Project

Nov 20 2020

anton-afanasyev committed rG6f1c07b23a1c: [SLP][Test] Update pr47269.ll test. NFC (authored by anton-afanasyev).
[SLP][Test] Update pr47269.ll test. NFC
Nov 20 2020, 7:35 AM

Nov 18 2020

anton-afanasyev added a comment to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

Saw some random miscompiles after this. Fixed by 4dbe12e86649ba6b5f03a9ba97e84d718727f7a7, can you check if I got it right?

Nov 18 2020, 4:19 AM · Restricted Project

Nov 17 2020

anton-afanasyev added a comment to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

Fixed

Nov 17 2020, 7:49 AM · Restricted Project
anton-afanasyev committed rG0a1d315f9f16: [SLPVectorizer] Fix assert (authored by anton-afanasyev).
[SLPVectorizer] Fix assert
Nov 17 2020, 7:47 AM
anton-afanasyev added a comment to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

It looks like this change may cause a crash when building LNT, e.g. http://lab.llvm.org:8011/#/builders/105/builds/1899/steps/7/logs/stdio

Nov 17 2020, 7:32 AM · Restricted Project
anton-afanasyev committed rGfcad8d3635cf: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic (authored by anton-afanasyev).
[SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic
Nov 17 2020, 7:12 AM
anton-afanasyev closed D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.
Nov 17 2020, 7:12 AM · Restricted Project
anton-afanasyev added inline comments to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.
Nov 17 2020, 12:13 AM · Restricted Project
anton-afanasyev updated the diff for D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

Added another assert

Nov 17 2020, 12:13 AM · Restricted Project

Nov 16 2020

anton-afanasyev updated the diff for D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

Fixes and assert

Nov 16 2020, 3:24 PM · Restricted Project
anton-afanasyev added inline comments to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.
Nov 16 2020, 3:22 PM · Restricted Project
anton-afanasyev added inline comments to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.
Nov 16 2020, 2:18 PM · Restricted Project
anton-afanasyev updated the diff for D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.

Add overloaded newTreeEntry() for common case

Nov 16 2020, 2:18 PM · Restricted Project
anton-afanasyev added inline comments to D90445: [SLP] Make SLPVectorizer to use `llvm.masked.gather` intrinsic.
Nov 16 2020, 1:19 PM · Restricted Project