
sebpop (Sebastian Pop)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 4 2014, 10:15 AM (267 w, 6 d)

Recent Activity

Thu, Sep 19

sebpop added a comment to D67645: [aarch64] add def-pats for dot product.

I looked at both the SLP and loop vectorizer and I think this is more work than I can do right now.

Thu, Sep 19, 7:53 AM · Restricted Project

Wed, Sep 18

sebpop updated the diff for D67645: [aarch64] add def-pats for dot product.
Wed, Sep 18, 12:07 PM · Restricted Project
sebpop added a comment to D67645: [aarch64] add def-pats for dot product.
In D67645#1674197, @az wrote:

There are a few things missing in the current work, such as indexed dot product, or what they call s/udot (vector, by element) in the ARM document (no need to do it now, but a comment about that would help).

Wed, Sep 18, 11:58 AM · Restricted Project
sebpop added a comment to D67645: [aarch64] add def-pats for dot product.

to do the heavy lifting, this is probably a task for the loop vectorizer.

Wed, Sep 18, 11:23 AM · Restricted Project
sebpop updated the diff for D67645: [aarch64] add def-pats for dot product.
Wed, Sep 18, 7:58 AM · Restricted Project
sebpop added a comment to D67645: [aarch64] add def-pats for dot product.

To catch more dot product cases, we need to fix the passes above instruction selection.

Wed, Sep 18, 7:49 AM · Restricted Project
sebpop added a comment to D67645: [aarch64] add def-pats for dot product.
  • was just curious about the AddedComplexity = 30
Wed, Sep 18, 6:30 AM · Restricted Project

Tue, Sep 17

sebpop added a comment to D67645: [aarch64] add def-pats for dot product.

I've got a cheeky request, and I appreciate that it should go in a separate patch, but while you're at it would you mind repeating this exercise for the ARM backend and AArch32?

Tue, Sep 17, 7:51 AM · Restricted Project
sebpop updated the diff for D67645: [aarch64] add def-pats for dot product.

The new patch does not use the first argument of the dot product instruction: we now set it to zero.
Patch tested on x86_64-apple-darwin with make check-all.
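
For readers unfamiliar with the instruction, here is a scalar sketch of what a 128-bit udot computes per 32-bit lane, and of why a zeroed accumulator yields a plain dot product. This is a reference model only, not code from the patch; the names udot_ref and dot16 are hypothetical, and the final cross-lane sum is an assumption about how the result would be reduced.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar reference model of a 128-bit unsigned dot product (illustrative
// only): each 32-bit lane accumulates the sum of four u8 * u8 products.
std::array<uint32_t, 4> udot_ref(std::array<uint32_t, 4> acc,
                                 const std::array<uint8_t, 16> &a,
                                 const std::array<uint8_t, 16> &b) {
  for (int lane = 0; lane < 4; ++lane)
    for (int j = 0; j < 4; ++j)
      acc[lane] += static_cast<uint32_t>(a[4 * lane + j]) *
                   static_cast<uint32_t>(b[4 * lane + j]);
  return acc;
}

// Dot product of two 16-byte vectors. The accumulator starts at zero, so the
// lanes hold only the partial sums; adding the lanes is an assumed final step.
uint32_t dot16(const std::array<uint8_t, 16> &a,
               const std::array<uint8_t, 16> &b) {
  const std::array<uint32_t, 4> zero{};
  const std::array<uint32_t, 4> lanes = udot_ref(zero, a, b);
  return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

With a nonzero first argument the same model adds the partial sums onto the existing lane values, which is the accumulate form the earlier version of the patch would have had to match.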

Tue, Sep 17, 7:43 AM · Restricted Project

Mon, Sep 16

sebpop planned changes to D67645: [aarch64] add def-pats for dot product.
Mon, Sep 16, 10:28 PM · Restricted Project
sebpop created D67645: [aarch64] add def-pats for dot product.
Mon, Sep 16, 7:26 PM · Restricted Project

Sat, Sep 14

sebpop accepted D67576: [AArch64] Some more FP16 FMA pattern matching.

Excellent!
Thanks for catching those patterns.
Please commit.

Sat, Sep 14, 12:30 PM · Restricted Project
sebpop added a comment to D67368: [NFCI]Create CommonAttributeInfo Type as base type of *Attr and ParsedAttr..

I still see a link error on aarch64-linux on master:

/usr/bin/ld: tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/AttrImpl.cpp.o: in function `clang::AttributeCommonInfo::getAttributeSpellingListIndex() const':
/home/ubuntu/llvm-project/llvm/tools/clang/include/clang/Basic/AttributeCommonInfo.h:166: undefined reference to `clang::AttributeCommonInfo::calculateAttributeSpellingListIndex() const'
collect2: error: ld returned 1 exit status

I reverted this patch locally and the build finishes.

Sat, Sep 14, 9:42 AM · Restricted Project

Fri, Sep 13

sebpop committed rGd93e136be14c: [aarch64] move custom isel of extract_vector_elt to td file - NFC (authored by sebpop).
[aarch64] move custom isel of extract_vector_elt to td file - NFC
Fri, Sep 13, 12:33 PM

Thu, Sep 12

sebpop added a comment to D67497: [aarch64] move custom isel of extract_vector_elt to td file - NFC.

The patch looks okay to me, but I am still curious what happens with i16. The lowering to umov w8, v0.h[1] in build-vector-extract.ll is probably the interesting one. This is probably covered by rule:

Thu, Sep 12, 2:59 PM · Restricted Project
sebpop accepted D67403: [AArch64] MachineCombiner FMA matching. NFC..

Looks good to me, please apply.

Thu, Sep 12, 11:17 AM · Restricted Project
sebpop updated the diff for D67497: [aarch64] move custom isel of extract_vector_elt to td file - NFC.

Updated patch by removing the patterns that generate i16. Patch passes make check-all on aarch64-linux.

Thu, Sep 12, 9:06 AM · Restricted Project
sebpop added a comment to D67497: [aarch64] move custom isel of extract_vector_elt to td file - NFC.

The tablegen error only happens on the i16 patterns:

def : Pat<(i16 (extractelt (v8i16 V128:$V), (i64 0))), (EXTRACT_SUBREG V128:$V, hsub)>;
def : Pat<(i16 (extractelt (v4i16 V64:$V), (i64 0))), (EXTRACT_SUBREG V64:$V, hsub)>;

If I remove these two patterns, make check-all passes.

Thu, Sep 12, 8:37 AM · Restricted Project
sebpop added a comment to D67497: [aarch64] move custom isel of extract_vector_elt to td file - NFC.

Is there a test case that checks that this change does not break what the code in AArch64DAGToDAGISel::Select() was meant to handle?

Thu, Sep 12, 8:10 AM · Restricted Project
sebpop added inline comments to D67497: [aarch64] move custom isel of extract_vector_elt to td file - NFC.
Thu, Sep 12, 7:51 AM · Restricted Project
sebpop created D67497: [aarch64] move custom isel of extract_vector_elt to td file - NFC.
Thu, Sep 12, 7:13 AM · Restricted Project

Wed, Sep 11

sebpop added inline comments to D67403: [AArch64] MachineCombiner FMA matching. NFC..
Wed, Sep 11, 9:21 AM · Restricted Project
sebpop accepted D67403: [AArch64] MachineCombiner FMA matching. NFC..
Wed, Sep 11, 9:13 AM · Restricted Project
sebpop added a comment to D67403: [AArch64] MachineCombiner FMA matching. NFC..

Never mind. You cannot do the other ones as it would call Match too many times and would not follow the semantics of the original code.

Wed, Sep 11, 8:44 AM · Restricted Project
sebpop added a comment to D67403: [AArch64] MachineCombiner FMA matching. NFC..

Almost LGTM.

Wed, Sep 11, 8:39 AM · Restricted Project

Tue, Sep 10

sebpop added a comment to D67403: [AArch64] MachineCombiner FMA matching. NFC..

I like the patch. Thanks!

Tue, Sep 10, 5:54 PM · Restricted Project

Sat, Sep 7

sebpop committed rGeacb2c2c975c: [aarch64] Add combine patterns for fp16 fmla (authored by sebpop).
[aarch64] Add combine patterns for fp16 fmla
Sat, Sep 7, 1:25 PM

Fri, Sep 6

sebpop created D67297: [aarch64] Add combine patterns for fp16 fmla.
Fri, Sep 6, 12:04 PM · Restricted Project

Aug 20 2019

sebpop committed rG5a7bba09acff: [AArch64][asan] fix typo in AsanStats::Print (authored by sebpop).
[AArch64][asan] fix typo in AsanStats::Print
Aug 20 2019, 4:32 PM
sebpop committed rG63487bfec927: [AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA (authored by sebpop).
[AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA
Aug 20 2019, 1:54 PM
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Updated patch to current llvm trunk.

Aug 20 2019, 12:18 PM · Restricted Project
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .
Aug 20 2019, 11:55 AM · Restricted Project
sebpop added inline comments to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .
Aug 20 2019, 11:53 AM · Restricted Project

Aug 15 2019

sebpop added a comment to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Ping patch.
The last version of the patch addresses all the comments from reviews.
Ok to commit?

Aug 15 2019, 9:05 AM · Restricted Project

Jun 11 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

For some reason asan/tests/asan_noinst_test.cc is not compiled on make check-asan and that has exposed a compile error that was not triggered by the other tests:

sanitizer_double_allocator.h:30:11: error: use of non-static data member 'use_first_' of 'DoubleAllocator' from nested type 'DoubleAllocatorCache'
      if (use_first_)
          ^~~~~~~~~~

The updated patch fixes this by accessing the non-static field of the enclosing class through a this pointer to one of the instances:

-      if (use_first_)
+      if (this->use_first_)
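
As background on the error above: a nested class has no implicit enclosing object, so it cannot name a non-static member of the enclosing class without an instance. The sketch below shows one common way to make such an access well-formed, by handing the nested type a pointer to an enclosing instance. Outer, Cache, owner_ and pick are hypothetical names, and this is not necessarily how the patch itself resolves the error.

```cpp
#include <cassert>

// Hypothetical stand-ins for the allocator and its nested cache type.
class Outer {
public:
  explicit Outer(bool use_first) : use_first_(use_first) {}

  class Cache {
  public:
    // The nested type is given an explicit enclosing instance to read from.
    explicit Cache(const Outer *owner) : owner_(owner) {}

    int pick() const {
      // A bare `use_first_` here would be rejected with the error quoted
      // above: there is no Outer object implied inside Cache.
      return owner_->use_first_ ? 1 : 2;
    }

  private:
    const Outer *owner_;
  };

private:
  bool use_first_;
};
```

The nested class may read the private field because it is a member of the enclosing class; what it lacks is an object, not access rights.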
Jun 11 2019, 3:09 PM · Restricted Project
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

The updated patch passes make check-lsan check-asan and is still under test for check-all on aarch64-linux.

Jun 11 2019, 1:16 PM · Restricted Project
sebpop added a comment to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

The updated patch I will post addresses all the review comments.

Jun 11 2019, 1:13 PM · Restricted Project

May 19 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Addressed comments from @vitalybuka: factored up the 3 versions and added more tests.
ninja check-all passes with no new failures on an AArch64 Graviton A1 instance.

May 19 2019, 1:19 AM · Restricted Project

May 10 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Fix the x86_64 overflow warning with 1ULL << 48.
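
A minimal sketch of the kind of expression involved, assuming the warning came from shifting a plain int constant by 48 bits. kVmaSize48 and fits_in_48_bit_vma are hypothetical names, not identifiers from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Size of a 48-bit virtual address space. Writing `1 << 48` would shift a
// 32-bit int by more than its width, which compilers diagnose; the ULL
// suffix makes the left operand 64 bits wide before the shift happens.
constexpr uint64_t kVmaSize48 = 1ULL << 48;
static_assert(kVmaSize48 == 0x1000000000000ULL, "2^48 expected");

// True when addr lies inside the 48-bit address range.
constexpr bool fits_in_48_bit_vma(uint64_t addr) { return addr < kVmaSize48; }
```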

May 10 2019, 5:16 PM · Restricted Project
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Updated patch fixes ASan.

May 10 2019, 2:29 PM · Restricted Project
sebpop added inline comments to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .
May 10 2019, 8:25 AM · Restricted Project

May 9 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

I have verified that the updated patch compiles and that it reduces the execution time of a trivial leak-sanitized example.

May 9 2019, 10:41 PM · Restricted Project

Apr 12 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

This patch reduces the number of #ifdefs as suggested by Kostya, and speeds up both the leak and address sanitizers on aarch64.
Passes check-all on x86_64-linux; aarch64-linux is still under test.
Worked with Brian Rzycki @brzycki.

Apr 12 2019, 12:54 PM · Restricted Project

Apr 8 2019

sebpop added a comment to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Also, this is changing only the standalone lsan, not lsan used as part of asan. Right?
standalone lsan is not widely used, AFAICT.

Apr 8 2019, 1:58 PM · Restricted Project
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Address review comments from Kostya: move AArch64 lsan allocator to a separate file to avoid #ifdefs.

Apr 8 2019, 1:15 PM · Restricted Project

Apr 5 2019

sebpop added a comment to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Ok, I will prepare an updated patch.
Thanks Brian and Kostya for your reviews.

Apr 5 2019, 5:10 PM · Restricted Project
sebpop accepted D60284: [JumpThreading] [PR40992] Fix miscompile when folding a successor of an indirectbr.

Looks good to me.

Apr 5 2019, 9:14 AM · Restricted Project

Apr 3 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .

Rebased patch on today's trunk.

Apr 3 2019, 8:08 PM · Restricted Project
sebpop created D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA .
Apr 3 2019, 8:02 PM · Restricted Project
sebpop added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

I'd advise caution here, it's a really significant/impactful change, and a single sign-off is a bit worrying.

Apr 3 2019, 7:21 AM · Restricted Project

Mar 29 2019

sebpop accepted D32530: [SVE][IR] Scalable Vector IR Type.

I am accepting the scalable vector types based on the comments in
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131137.html

Mar 29 2019, 11:51 AM · Restricted Project

Feb 21 2019

sebpop added a comment to D57953: [Jump Threading] Convert conditional branches into unconditional branches using GVN results.

This seems to be a useful transform that is not yet covered by the current implementation of jump-threading.
I think GCC calls it control dependence DCE.
Please run and report performance numbers on the testsuite and other benchmarks that you have access to.

Feb 21 2019, 9:04 AM

Nov 1 2018

sebpop abandoned D53588: [hot-cold-split] split more than a cold region per function.
Nov 1 2018, 11:22 AM
sebpop added a comment to D53887: [HotColdSplitting] Outline more than once per function.

Maybe you can add the testcase from my previous patch: https://reviews.llvm.org/D53588

Nov 1 2018, 11:22 AM

Oct 29 2018

sebpop accepted D53835: [HotColdSplitting] Use TTI to inform outlining threshold.

Looks good to me.

Oct 29 2018, 3:56 PM
sebpop accepted D53824: [HotColdSplitting] Allow outlining single-block cold regions.

Ok, thanks!

Oct 29 2018, 12:02 PM

Oct 25 2018

sebpop added a comment to D53588: [hot-cold-split] split more than a cold region per function.
In D53588#1276693, @vsk wrote:

@sebpop are you interested in rebasing this on the new cold block propagation code? If not, I'd be happy to give it a try. Is the main challenge in teaching CodeExtractor to update the DT and PDT?

Oct 25 2018, 7:02 PM

Oct 24 2018

sebpop accepted D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.

The change looks good to me. Thanks!

Oct 24 2018, 3:03 PM
sebpop accepted D53534: [hot-cold-split] Name split functions with ".cold" suffix.

ok

Oct 24 2018, 10:04 AM

Oct 23 2018

sebpop accepted D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.

ok

Oct 23 2018, 10:10 PM
sebpop added a comment to D53534: [hot-cold-split] Name split functions with ".cold" suffix.

The change looks good with some minor changes. Thanks!

Oct 23 2018, 8:06 PM
sebpop added a comment to D53588: [hot-cold-split] split more than a cold region per function.
In D53588#1272789, @vsk wrote:

@sebpop thanks for this patch! I don't see any problems with it (although I would prefer that the test explicitly check that outlined functions contain the correct instructions).

At a higher-level, I'm seeing some problems with the forward/back cold propagation done in getHotBlocks on internal projects. The propagation seems to stop when it encounters simple control flow, like an if-then-else or a for loop, after which cold code is unconditionally executed.

I have a prototype of a different propagation scheme which overcomes some of these limitations. The idea is to mark blocks which are post-dominated by a cold block, or are dominated by a cold block, as cold. This is able to handle the control flow I described, and isn't limited to requiring a single exit block. Could you give me a day to evaluate it further, run benchmarks etc. and report back? If it turns out to be promising, istm that it'd make sense to rebase this patch on top of it.
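
The propagation rule described in that last paragraph can be sketched on a toy CFG as follows. This is an illustration only, not the prototype or the HotColdSplitting code: it uses a simple iterative dominator computation, and unlike the scheme described it assumes a single exit block to keep the post-dominator step short. Graph, dominators, reversed and propagate_cold are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Adjacency list: block index -> successor block indices.
using Graph = std::vector<std::vector<int>>;

// Iterative dataflow dominators: dom[b][d] is true when d dominates b.
// Blocks unreachable from root keep the initial "all blocks" set.
std::vector<std::vector<bool>> dominators(const Graph &succ, int root) {
  const int n = static_cast<int>(succ.size());
  Graph pred(n);
  for (int b = 0; b < n; ++b)
    for (int s : succ[b]) pred[s].push_back(b);

  std::vector<std::vector<bool>> dom(n, std::vector<bool>(n, true));
  dom[root].assign(n, false);
  dom[root][root] = true;

  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 0; b < n; ++b) {
      if (b == root) continue;
      // Intersect the dominator sets of all predecessors, then add b itself.
      std::vector<bool> next(n, true);
      for (int p : pred[b])
        for (int d = 0; d < n; ++d) next[d] = next[d] && dom[p][d];
      next[b] = true;
      if (next != dom[b]) {
        dom[b] = next;
        changed = true;
      }
    }
  }
  return dom;
}

// Flip every edge, so dominators on the result are post-dominators.
Graph reversed(const Graph &succ) {
  Graph rev(succ.size());
  for (int b = 0; b < static_cast<int>(succ.size()); ++b)
    for (int s : succ[b]) rev[s].push_back(b);
  return rev;
}

// Mark a block cold when a seed cold block dominates or post-dominates it.
std::vector<bool> propagate_cold(const Graph &succ, int entry, int exit,
                                 const std::vector<bool> &seed) {
  const auto dom = dominators(succ, entry);
  const auto pdom = dominators(reversed(succ), exit);
  const int n = static_cast<int>(succ.size());
  std::vector<bool> cold = seed;
  for (int b = 0; b < n; ++b)
    for (int c = 0; c < n; ++c)
      if (seed[c] && (dom[b][c] || pdom[b][c])) cold[b] = true;
  return cold;
}
```

On a function where an if-then-else (blocks 1, 2, 3) always falls into a cold block 4, with a separate hot path through block 5, seeding only block 4 marks blocks 1 to 4 cold and leaves the entry, block 5 and the exit hot, which is the "simple control flow followed by unconditionally executed cold code" case mentioned above.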

Oct 23 2018, 11:22 AM
sebpop accepted D51861: [LSR] Combine unfolded offset into invariant register.

ok

Oct 23 2018, 10:52 AM
sebpop added inline comments to D53534: [hot-cold-split] Name split functions with ".cold" suffix.
Oct 23 2018, 10:13 AM
sebpop created D53588: [hot-cold-split] split more than a cold region per function.
Oct 23 2018, 10:11 AM
sebpop accepted D53518: [HotColdSplitting] Attach MinSize to outlined code.

ok

Oct 23 2018, 10:06 AM

Oct 22 2018

sebpop added inline comments to D53534: [hot-cold-split] Name split functions with ".cold" suffix.
Oct 22 2018, 9:09 PM
sebpop accepted D53505: [hot-cold-split] Add missing FileCheck invocations.

ok

Oct 22 2018, 10:58 AM
sebpop accepted D53512: [hot-cold-split] Add opt remark on success.

ok

Oct 22 2018, 10:55 AM

Oct 20 2018

sebpop accepted D53437: Schedule Hot Cold Splitting pass after most optimization passes.

ok

Oct 20 2018, 8:47 AM

Oct 15 2018

sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.
In D52904#1266032, @NoQ wrote:
Oct 15 2018, 5:46 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Fixed in https://reviews.llvm.org/rL344566

Oct 15 2018, 3:45 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

Awesome! Thanks for improving this!

Oct 15 2018, 2:30 PM
sebpop updated the summary of D52904: [hot-cold-split] fix static analysis of cold regions.
Oct 15 2018, 2:26 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.

Oct 15 2018, 2:22 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.

Oct 15 2018, 1:44 PM
sebpop updated the diff for D52904: [hot-cold-split] fix static analysis of cold regions.

Fix two comments from @tejohnson.

Oct 15 2018, 12:51 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

I will post a patch to fix the comments from @tejohnson.

Oct 15 2018, 12:49 PM
sebpop added a reviewer for D53267: [CodeExtractor] Erase debug intrinsics in outlined thunks: tejohnson.

Does this patch fix https://bugs.llvm.org/show_bug.cgi?id=22900
In which case you may want to add the testcases from that bug.

Oct 15 2018, 11:07 AM

Oct 5 2018

sebpop updated the diff for D52904: [hot-cold-split] fix static analysis of cold regions.

Added an early return for outlining if function entry is cold.
Added a check for invoke calls: invoke should not be marked as cold by back propagation.

Oct 5 2018, 8:20 AM

Oct 4 2018

sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Before this patch we have a 10% regression in sqlite with hot-cold-split pass.
With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

Oct 4 2018, 1:38 PM
sebpop created D52904: [hot-cold-split] fix static analysis of cold regions.
Oct 4 2018, 1:35 PM

Oct 2 2018

sebpop accepted D52704: Improve static analysis of cold basic blocks.

Looks good.

Oct 2 2018, 11:45 AM

Oct 1 2018

sebpop added a comment to D52708: Add support for new pass manager.

Please add a testcase that will exercise the new flag.

Oct 1 2018, 6:48 AM

Sep 27 2018

sebpop added a comment to D50658: Hot cold splitting pass.

I see that there is a new Pass Manager version of your new pass, but I don't see that it is ever being enabled in the new pass manager pipeline (or tested). Are you planning to add that?

Pinging question. It should be straightforward to add to the new PM, is that something you plan to do soon as a follow-on?

Sep 27 2018, 12:51 PM

Sep 21 2018

sebpop accepted D52367: Remove address taken, add optnone.

lgtm

Sep 21 2018, 10:47 AM

Sep 12 2018

sebpop accepted D51980: [GVNHoist] computeInsertionPoints() miscalculates the Iterated Dominance Frontiers.

Looks good to me.

Sep 12 2018, 7:01 AM

Sep 11 2018

sebpop added a comment to D50658: Hot cold splitting pass.

I broke the check in https://reviews.llvm.org/rL341838 and fixed it in https://reviews.llvm.org/rL341839

Sep 11 2018, 8:55 AM

Sep 10 2018

sebpop accepted D51801: [MemorySSAUpdater] Avoid creating self-referencing MemoryDefs.

Looks good to me.

Sep 10 2018, 10:04 AM

Sep 6 2018

sebpop accepted D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..

The patch looks good to me.
Please address the last two comments and then apply.

Sep 6 2018, 4:18 PM
sebpop added inline comments to D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..
Sep 6 2018, 12:43 PM

Sep 5 2018

sebpop accepted D50658: Hot cold splitting pass.

lgtm, thanks!

Sep 5 2018, 1:33 PM

Sep 4 2018

sebpop added inline comments to D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..
Sep 4 2018, 11:51 AM
sebpop added inline comments to D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..
Sep 4 2018, 11:45 AM

Aug 25 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 25 2018, 2:37 PM

Aug 23 2018

sebpop updated the diff for D50658: Hot cold splitting pass.
Aug 23 2018, 2:22 PM
sebpop commandeered D50658: Hot cold splitting pass.
Aug 23 2018, 2:21 PM

Aug 20 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 20 2018, 7:05 AM

Aug 18 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 18 2018, 3:35 PM

Aug 16 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 16 2018, 1:34 PM