sebpop (Sebastian Pop)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 4 2014, 10:15 AM (258 w, 3 d)

Recent Activity

Jun 11 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

For some reason asan/tests/asan_noinst_test.cc is not compiled on make check-asan and that has exposed a compile error that was not triggered by the other tests:

sanitizer_double_allocator.h:30:11: error: use of non-static data member 'use_first_' of 'DoubleAllocator' from nested type 'DoubleAllocatorCache'
      if (use_first_)
          ^~~~~~~~~~

The updated patch fixes this by accessing the non-static field of the enclosing class through a this pointer to one of the instances:

-      if (use_first_)
+      if (this->use_first_)
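
To illustrate the class of error quoted above: in C++ a nested class has no implicit object of its enclosing class, so it can only reach a non-static member of the enclosing class through an explicit instance. The sketch below uses hypothetical names (`Outer`, `Cache`, `PickAllocator`) and is not the code from D60243; it only shows one way such an access can be made well-formed, assuming the nested object keeps a pointer to its enclosing instance.

```cpp
#include <cassert>

// Hypothetical names for illustration; this is not the code from D60243.
class Outer {
 public:
  explicit Outer(bool use_first) : use_first_(use_first) {}

  class Cache {
   public:
    explicit Cache(const Outer *owner) : owner_(owner) {}

    // Writing `return use_first_ ? 1 : 2;` here would be ill-formed:
    // Cache has no implicit Outer object, so the compiler reports a use of
    // a non-static data member of Outer from a nested type.
    int PickAllocator() const {
      assert(owner_ != nullptr);
      return owner_->use_first_ ? 1 : 2;
    }

   private:
    const Outer *owner_;  // the enclosing instance this cache belongs to
  };

 private:
  bool use_first_;
};
```

Constructing `Outer::Cache` from the address of an `Outer` built with `true` makes `PickAllocator()` select the first allocator; an `Outer` built with `false` selects the second.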
Jun 11 2019, 3:09 PM · Restricted Project
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

The updated patch passes make check-lsan check-asan and is still under test for check-all on aarch64-linux.

Jun 11 2019, 1:16 PM · Restricted Project
sebpop added a comment to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

The updated patch I am about to post addresses all the review comments.

Jun 11 2019, 1:13 PM · Restricted Project

May 19 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

Addressed comments from @vitalybuka: factored out the 3 versions and added more tests.
Passes ninja check-all with no new failures on an AArch64 Graviton A1 instance.

May 19 2019, 1:19 AM · Restricted Project

May 10 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

Fix the x86_64 overflow warning with 1ULL << 48.
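
As a minimal sketch of why the `ULL` suffix matters, assuming the warning came from shifting a plain `int` literal: on x86_64 `int` is 32 bits wide, so `1 << 48` shifts past the width of its type, while `1ULL << 48` is computed in a 64-bit unsigned type. The helper name `AddressSpaceSize` is hypothetical and does not come from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not taken from D60243.
// Returns the number of addressable bytes for a given VMA width in bits.
inline std::uint64_t AddressSpaceSize(unsigned vma_bits) {
  assert(vma_bits < 64);
  // `1 << vma_bits` would be evaluated as a 32-bit int on x86_64 and
  // overflow for vma_bits >= 31; the ULL suffix makes the literal an
  // unsigned long long, which is at least 64 bits wide.
  return 1ULL << vma_bits;
}
```

With this helper, `AddressSpaceSize(48)` evaluates to 2^48 bytes (256 TiB).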

May 10 2019, 5:16 PM · Restricted Project
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

Updated patch fixes ASan.

May 10 2019, 2:29 PM · Restricted Project
sebpop added inline comments to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.
May 10 2019, 8:25 AM · Restricted Project

May 9 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

I have verified that the updated patch compiles and that it reduces the execution time of a trivial leak-sanitized example.

May 9 2019, 10:41 PM · Restricted Project

Apr 12 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

This patch reduces the number of #ifdefs as suggested by Kostya, and speeds up both the leak and address sanitizers on aarch64.
Passes check-all on x86_64-linux; aarch64-linux is still under test.
Worked with Brian Rzycki @brzycki.

Apr 12 2019, 12:54 PM · Restricted Project

Apr 8 2019

sebpop added a comment to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

Also, this is changing only the standalone lsan, not lsan used as part of asan. Right?
standalone lsan is not widely used, AFAICT.

Apr 8 2019, 1:58 PM · Restricted Project
sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

Address review comments from Kostya: move AArch64 lsan allocator to a separate file to avoid #ifdefs.

Apr 8 2019, 1:15 PM · Restricted Project

Apr 5 2019

sebpop added a comment to D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

Ok, I will prepare an updated patch.
Thanks Brian and Kostya for your reviews.

Apr 5 2019, 5:10 PM · Restricted Project
sebpop accepted D60284: [JumpThreading] [PR40992] Fix miscompile when folding a successor of an indirectbr.

Looks good to me.

Apr 5 2019, 9:14 AM · Restricted Project

Apr 3 2019

sebpop updated the diff for D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.

Rebased patch on today's trunk.

Apr 3 2019, 8:08 PM · Restricted Project
sebpop created D60243: [LSan][AArch64] Speed-up leak and address sanitizers on AArch64 for 48-bit VMA.
Apr 3 2019, 8:02 PM · Restricted Project
sebpop added a comment to D32530: [SVE][IR] Scalable Vector IR Type.

I'd advise caution here, it's a really significant/impactful change, and a single sign-off is a bit worrying.

Apr 3 2019, 7:21 AM · Restricted Project

Mar 29 2019

sebpop accepted D32530: [SVE][IR] Scalable Vector IR Type.

I am accepting the scalable vector types based on the comments in
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131137.html

Mar 29 2019, 11:51 AM · Restricted Project

Feb 21 2019

sebpop added a comment to D57953: [Jump Threading] Convert conditional branches into unconditional branches using GVN results.

This seems to be a useful transform that is not yet covered by the current implementation of jump-threading.
I think GCC calls it control dependence DCE.
Please run and report performance numbers on the testsuite and other benchmarks that you have access to.

Feb 21 2019, 9:04 AM

Nov 1 2018

sebpop abandoned D53588: [hot-cold-split] split more than a cold region per function.
Nov 1 2018, 11:22 AM
sebpop added a comment to D53887: [HotColdSplitting] Outline more than once per function.

Maybe you can add the testcase from my previous patch: https://reviews.llvm.org/D53588

Nov 1 2018, 11:22 AM

Oct 29 2018

sebpop accepted D53835: [HotColdSplitting] Use TTI to inform outlining threshold.

Looks good to me.

Oct 29 2018, 3:56 PM
sebpop accepted D53824: [HotColdSplitting] Allow outlining single-block cold regions.

Ok, thanks!

Oct 29 2018, 12:02 PM

Oct 25 2018

sebpop added a comment to D53588: [hot-cold-split] split more than a cold region per function.
In D53588#1276693, @vsk wrote:

@sebpop are you interested in rebasing this on the new cold block propagation code? If not, I'd be happy to give it a try. Is the main challenge in teaching CodeExtractor to update the DT and PDT?

Oct 25 2018, 7:02 PM

Oct 24 2018

sebpop accepted D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.

The change looks good to me. Thanks!

Oct 24 2018, 3:03 PM
sebpop accepted D53534: [hot-cold-split] Name split functions with ".cold" suffix.

ok

Oct 24 2018, 10:04 AM

Oct 23 2018

sebpop accepted D53627: [HotColdSplitting] Identify larger cold regions using domtree queries.

ok

Oct 23 2018, 10:10 PM
sebpop added a comment to D53534: [hot-cold-split] Name split functions with ".cold" suffix.

The change looks good with some minor changes. Thanks!

Oct 23 2018, 8:06 PM
sebpop added a comment to D53588: [hot-cold-split] split more than a cold region per function.
In D53588#1272789, @vsk wrote:

@sebpop thanks for this patch! I don't see any problems with it (although I would prefer that the test explicitly check that outlined functions contain the correct instructions).

At a higher-level, I'm seeing some problems with the forward/back cold propagation done in getHotBlocks on internal projects. The propagation seems to stop when it encounters simple control flow, like an if-then-else or a for loop, after which cold code is unconditionally executed.

I have a prototype of a different propagation scheme which overcomes some of these limitations. The idea is to mark blocks which are post-dominated by a cold block, or are dominated by a cold block, as cold. This is able to handle the control flow I described, and isn't limited to requiring a single exit block. Could you give me a day to evaluate it further, run benchmarks etc. and report back? If it turns out to be promising, istm that it'd make sense to rebase this patch on top of it.
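
The propagation rule described in the quoted comment (a block is cold if it is dominated by a cold block or post-dominated by one) can be sketched on a toy CFG. This is only an illustration under stated assumptions: the names `DominatorSets` and `PropagateCold` are hypothetical, blocks are plain integers with block 0 as the entry, the graph has at most 64 blocks, and dominators are computed with a simple iterative set intersection. It is not the prototype the comment refers to, and it does not use LLVM's DominatorTree or PostDominatorTree.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Mask = std::uint64_t;  // bit i set <=> block i is in the set

// Iterative dominator sets: dom[b] = {b} plus the intersection of dom[p]
// over all predecessors p. `roots` are the blocks whose set is just
// themselves (the entry for dominators, every exit for post-dominators).
std::vector<Mask> DominatorSets(const std::vector<std::vector<int>> &preds,
                                Mask roots) {
  const int n = static_cast<int>(preds.size());
  assert(n <= 64);
  const Mask all = (n == 64) ? ~Mask{0} : (Mask{1} << n) - 1;
  std::vector<Mask> dom(n, all);
  for (int b = 0; b < n; ++b)
    if ((roots >> b) & 1) dom[b] = Mask{1} << b;
  bool changed = true;
  while (changed) {
    changed = false;
    for (int b = 0; b < n; ++b) {
      if ((roots >> b) & 1) continue;
      Mask m = all;
      for (int p : preds[b]) m &= dom[p];
      m |= Mask{1} << b;
      if (m != dom[b]) {
        dom[b] = m;
        changed = true;
      }
    }
  }
  return dom;
}

// succs[b] lists the successors of block b; block 0 is the entry.
// Returns the seed cold blocks plus every block that is dominated or
// post-dominated by a seed cold block.
Mask PropagateCold(const std::vector<std::vector<int>> &succs, Mask seed_cold) {
  const int n = static_cast<int>(succs.size());
  std::vector<std::vector<int>> preds(n);
  Mask exits = 0;
  for (int b = 0; b < n; ++b) {
    if (succs[b].empty()) exits |= Mask{1} << b;
    for (int s : succs[b]) preds[s].push_back(b);
  }
  const std::vector<Mask> dom = DominatorSets(preds, Mask{1});
  // Post-dominators are dominators of the reversed graph rooted at the exits,
  // so multiple exit blocks are handled without extra work.
  const std::vector<Mask> pdom = DominatorSets(succs, exits);
  Mask cold = seed_cold;
  for (int b = 0; b < n; ++b)
    if (((dom[b] | pdom[b]) & seed_cold) != 0) cold |= Mask{1} << b;
  return cold;
}
```

For example, with edges 0→1, 0→2, 2→3, 2→4, 3→5, 4→5, where block 1 is a hot exit and block 5 is the only seed cold block, the if-then-else formed by blocks 2, 3 and 4 is post-dominated by block 5, so blocks 2 through 5 are reported cold while blocks 0 and 1 are not.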

Oct 23 2018, 11:22 AM
sebpop accepted D51861: [LSR] Combine unfolded offset into invariant register.

ok

Oct 23 2018, 10:52 AM
sebpop added inline comments to D53534: [hot-cold-split] Name split functions with ".cold" suffix.
Oct 23 2018, 10:13 AM
sebpop created D53588: [hot-cold-split] split more than a cold region per function.
Oct 23 2018, 10:11 AM
sebpop accepted D53518: [HotColdSplitting] Attach MinSize to outlined code.

ok

Oct 23 2018, 10:06 AM

Oct 22 2018

sebpop added inline comments to D53534: [hot-cold-split] Name split functions with ".cold" suffix.
Oct 22 2018, 9:09 PM
sebpop accepted D53505: [hot-cold-split] Add missing FileCheck invocations.

ok

Oct 22 2018, 10:58 AM
sebpop accepted D53512: [hot-cold-split] Add opt remark on success.

ok

Oct 22 2018, 10:55 AM

Oct 20 2018

sebpop accepted D53437: Schedule Hot Cold Splitting pass after most optimization passes.

ok

Oct 20 2018, 8:47 AM

Oct 15 2018

sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.
In D52904#1266032, @NoQ wrote:
Oct 15 2018, 5:46 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Fixed in https://reviews.llvm.org/rL344566

Oct 15 2018, 3:45 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

Awesome! Thanks for improving this!

Oct 15 2018, 2:30 PM
sebpop updated the summary of D52904: [hot-cold-split] fix static analysis of cold regions.
Oct 15 2018, 2:26 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.

Oct 15 2018, 2:22 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Are both fixes necessary to fix the issue (the one for back propagation and the one to bail out if the entry block is cold), or is either one sufficient? The patch description only mentions the former.

Oct 15 2018, 1:44 PM
sebpop updated the diff for D52904: [hot-cold-split] fix static analysis of cold regions.

Fix two comments from @tejohnson.

Oct 15 2018, 12:51 PM
sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

I will post a patch to fix the comments from @tejohnson.

Oct 15 2018, 12:49 PM
sebpop added a reviewer for D53267: [CodeExtractor] Erase debug intrinsics in outlined thunks: tejohnson.

Does this patch fix https://bugs.llvm.org/show_bug.cgi?id=22900?
If so, you may want to add the testcases from that bug.

Oct 15 2018, 11:07 AM

Oct 5 2018

sebpop updated the diff for D52904: [hot-cold-split] fix static analysis of cold regions.

Added an early return for outlining if function entry is cold.
Added a check for invoke calls: invoke should not be marked as cold by back propagation.

Oct 5 2018, 8:20 AM

Oct 4 2018

sebpop added a comment to D52904: [hot-cold-split] fix static analysis of cold regions.

Before this patch we have a 10% regression in sqlite with hot-cold-split pass.
With this patch I now see a 3% speedup on sqlite vs. no hot-cold-split pass.

Oct 4 2018, 1:38 PM
sebpop created D52904: [hot-cold-split] fix static analysis of cold regions.
Oct 4 2018, 1:35 PM

Oct 2 2018

sebpop accepted D52704: Improve static analysis of cold basic blocks.

Looks good.

Oct 2 2018, 11:45 AM

Oct 1 2018

sebpop added a comment to D52708: Add support for new pass manager.

Please add a testcase that will exercise the new flag.

Oct 1 2018, 6:48 AM

Sep 27 2018

sebpop added a comment to D50658: Hot cold splitting pass.

I see that there is a new Pass Manager version of your new pass, but I don't see that it is ever being enabled in the new pass manager pipeline (or tested). Are you planning to add that?

Pinging question. It should be straightforward to add to the new PM, is that something you plan to do soon as a follow-on?

Sep 27 2018, 12:51 PM

Sep 21 2018

sebpop accepted D52367: Remove address taken, add optnone.

lgtm

Sep 21 2018, 10:47 AM

Sep 12 2018

sebpop accepted D51980: [GVNHoist] computeInsertionPoints() miscalculates the Iterated Dominance Frontiers.

Looks good to me.

Sep 12 2018, 7:01 AM

Sep 11 2018

sebpop added a comment to D50658: Hot cold splitting pass.

I broke the check in https://reviews.llvm.org/rL341838 and fixed it in https://reviews.llvm.org/rL341839

Sep 11 2018, 8:55 AM

Sep 10 2018

sebpop accepted D51801: [MemorySSAUpdater] Avoid creating self-referencing MemoryDefs.

Looks good to me.

Sep 10 2018, 10:04 AM

Sep 6 2018

sebpop accepted D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..

The patch looks good to me.
Please address the last two comments and then apply.

Sep 6 2018, 4:18 PM
sebpop added inline comments to D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..
Sep 6 2018, 12:43 PM

Sep 5 2018

sebpop accepted D50658: Hot cold splitting pass.

lgtm, thanks!

Sep 5 2018, 1:33 PM

Sep 4 2018

sebpop added inline comments to D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..
Sep 4 2018, 11:51 AM
sebpop added inline comments to D49151: [SimplifyIndVar] Avoid generating truncate instructions with non-hoisted Load operand..
Sep 4 2018, 11:45 AM

Aug 25 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 25 2018, 2:37 PM

Aug 23 2018

sebpop updated the diff for D50658: Hot cold splitting pass.
Aug 23 2018, 2:22 PM
sebpop commandeered D50658: Hot cold splitting pass.
Aug 23 2018, 2:21 PM

Aug 20 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 20 2018, 7:05 AM

Aug 18 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 18 2018, 3:35 PM

Aug 16 2018

sebpop added inline comments to D50658: Hot cold splitting pass.
Aug 16 2018, 1:34 PM

Aug 8 2018

sebpop added a comment to D50323: [GVNHoist] Prune out useless CHI insertions.

A given instruction always has the same Value Number.

That is correct: the integer Number that GVN returns is the same for instructions that compute the same Values at run time.
Asking GVN to provide a number for a given instruction will result in the same integer.
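
The property stated above can be sketched with a toy value-number table. This is a hypothetical illustration, far simpler than LLVM's GVN: the class `ValueTable` and its methods `LookupOrAddLeaf` and `LookupOrAdd` are invented for this example and are not LLVM's API. Expressions are keyed by opcode and by the numbers of their operands, so asking again for the same expression returns the same integer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <tuple>

// Toy value-number table for illustration; not LLVM's implementation.
class ValueTable {
 public:
  // Leaf values (arguments, constants) are numbered by name.
  unsigned LookupOrAddLeaf(const std::string &name) {
    return Intern(std::make_tuple(name, 0u, 0u));
  }

  // Expressions are numbered by opcode and the numbers of their operands.
  unsigned LookupOrAdd(const std::string &opcode, unsigned lhs, unsigned rhs) {
    return Intern(std::make_tuple(opcode, lhs, rhs));
  }

 private:
  using Key = std::tuple<std::string, unsigned, unsigned>;

  // Returns the existing number for `key`, or assigns the next free one.
  unsigned Intern(const Key &key) {
    auto it = numbers_.find(key);
    if (it != numbers_.end()) return it->second;
    const unsigned n = next_++;
    numbers_.emplace(key, n);
    return n;
  }

  std::map<Key, unsigned> numbers_;
  unsigned next_ = 1;  // 0 is reserved to mean "no operand"
};
```

Two requests for `add` over the same operand numbers yield the same value number, while a `mul` over those operands gets a different one.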

Aug 8 2018, 11:34 AM

Jul 27 2018

sebpop accepted D49858: [RFC] re-enable GVNHoist by default.

Looks good to me.
Thanks Alexandros for fixing the last known bugs with gvn-hoist.

Jul 27 2018, 3:36 PM
sebpop accepted D48489: [MemDep] Use PhiValuesAnalysis to improve alias analysis results.

Looks good to me. Thanks!

Jul 27 2018, 12:18 PM
sebpop accepted D44564: [BasicAA] Use PhiValuesAnalysis if available when handling phi alias.

Looks good to me. Thanks!

Jul 27 2018, 12:14 PM

Jul 20 2018

sebpop accepted D49617: Early exit with cheaper checks.

looks good

Jul 20 2018, 6:22 PM
sebpop accepted D49555: [GVNHoist] safeToHoistLdSt incorrectly checks whether a defining access dominates the insertion point.

looks good, thanks!

Jul 20 2018, 6:17 PM

Jun 28 2018

sebpop accepted D47893: Add a PhiValuesAnalysis pass to calculate the underlying values of phis.

Looks good to me. Thanks.

Jun 28 2018, 5:18 AM

Jun 25 2018

sebpop accepted D48481: [DA] Delinearise AddRecs if we can prove they don't wrap.

Looks good, please commit. Thanks!

Jun 25 2018, 6:53 AM

Jun 22 2018

sebpop requested changes to D47893: Add a PhiValuesAnalysis pass to calculate the underlying values of phis.
Jun 22 2018, 1:10 PM

Jun 20 2018

sebpop accepted D45872: [DA] Enable -da-delinearize by default.

lgtm thanks!

Jun 20 2018, 9:01 AM

May 24 2018

sebpop added a comment to D24033: Convert clamp into fmaxnum/fminnum pairs..

In the following experiment a positive number is an increase in performance;
the best score was taken out of 3 runs on firefly aarch64 A-72:

May 24 2018, 9:13 AM
sebpop added a comment to D45098: [AArch64] Fix PR32384: bump up the number of stores per memset and memcpy.

The experiment is cpu2000 best score out of 3 runs on A-72 of a firefly device.
A better score is positive.

May 24 2018, 9:12 AM

May 23 2018

sebpop updated the diff for D24033: Convert clamp into fmaxnum/fminnum pairs..

Updated patch. I will post perf numbers on some benchmarks with this patch.

May 23 2018, 9:32 AM
sebpop commandeered D24033: Convert clamp into fmaxnum/fminnum pairs..
May 23 2018, 9:30 AM

May 22 2018

sebpop added a comment to D46477: [AARCH64] Gang up loads and stores (for memcpy) for pairing..

You know you can either just use arc patch, and automagically get a nice commit msg,
or at least manually add "Differential Revision: link", and it will get closed automatically?

May 22 2018, 3:27 PM
sebpop updated the diff for D45098: [AArch64] Fix PR32384: bump up the number of stores per memset and memcpy.

Following Eli's recommendation, the patch does not modify memmove.
I will post the updated numbers on top of the improved code generation
for memcpy: https://reviews.llvm.org/rL332482

May 22 2018, 10:33 AM
sebpop closed D46477: [AARCH64] Gang up loads and stores (for memcpy) for pairing..

Committed in https://reviews.llvm.org/rL332482

May 22 2018, 10:28 AM
sebpop added a comment to D44564: [BasicAA] Use PhiValuesAnalysis if available when handling phi alias.

After some tinkering I've come up with the following solution (I have something that seems to work, but it needs cleaning up and testing):

  • Add a PhiAnalysis analysis pass which returns a PhiInfo
May 22 2018, 9:32 AM

May 17 2018

sebpop added a comment to D46193: [LSR] Skip LSR if the cost of input is cheaper than LSR's solution.

I tried this patch on exynos-m3 and there are several benchmarks improving by about 5%.
Among those benchmarks are spec2000 188.ammp and 256.bzip2 that improve by 3%.
All performance degradations are within noise level.

May 17 2018, 2:00 PM

May 14 2018

sebpop accepted D46477: [AARCH64] Gang up loads and stores (for memcpy) for pairing..

LGTM with some minor changes.

May 14 2018, 2:17 PM
sebpop added a comment to D45098: [AArch64] Fix PR32384: bump up the number of stores per memset and memcpy.

Looking at the generated code a bit, it looks like we do a really terrible job lowering memcpy; we don't form ldp/stp at all, ever. We should probably fix that before we mess with the threshold here; it could substantially change the codesize/performance impact of this change.

May 14 2018, 12:27 PM

May 11 2018

sebpop added a comment to D45821: [AArch64] improve code generation of vectors smaller than 64 bit.

I am rerunning the benchmarks with the patch applied on top of https://reviews.llvm.org/D46655 which fixes one of the problems exposed by this patch.

May 11 2018, 12:01 PM

May 9 2018

sebpop added a comment to D45821: [AArch64] improve code generation of vectors smaller than 64 bit.

I am rerunning the benchmarks with the patch applied on top of https://reviews.llvm.org/D46655 which fixes one of the problems exposed by this patch.

May 9 2018, 12:01 PM
sebpop added a comment to D46655: [AArch64] Improve single vector lane stores.

This fixes a perf regression we were seeing with generation of vectors smaller than 64 bit: https://reviews.llvm.org/D45821

May 9 2018, 11:58 AM
sebpop added inline comments to D46193: [LSR] Skip LSR if the cost of input is cheaper than LSR's solution.
May 9 2018, 11:48 AM

May 8 2018

sebpop added inline comments to D46477: [AARCH64] Gang up loads and stores (for memcpy) for pairing..
May 8 2018, 9:24 AM

Apr 27 2018

sebpop accepted D45695: [CodeGen] Use RegUnits to track register aliases (NFC).

looks good, thanks!

Apr 27 2018, 9:40 AM
sebpop added a comment to D46193: [LSR] Skip LSR if the cost of input is cheaper than LSR's solution.

I like this change, thanks for implementing it!

Apr 27 2018, 8:58 AM

Apr 19 2018

sebpop added inline comments to D45821: [AArch64] improve code generation of vectors smaller than 64 bit.
Apr 19 2018, 1:15 PM
sebpop updated the diff for D45821: [AArch64] improve code generation of vectors smaller than 64 bit.

Ran clang-format, added a test case, and fixed all failing "make check" tests.

Apr 19 2018, 1:06 PM
sebpop added a comment to D45821: [AArch64] improve code generation of vectors smaller than 64 bit.

I am adding test cases for the new vectorized types, and will update the patch shortly.

Apr 19 2018, 9:01 AM
sebpop created D45821: [AArch64] improve code generation of vectors smaller than 64 bit.
Apr 19 2018, 8:58 AM

Apr 5 2018

sebpop accepted D45287: [InstCombine] Properly change GEP type when reassociating loop invariant GEP chains.

Looks good. Thanks!

Apr 5 2018, 9:44 AM

Apr 4 2018

sebpop abandoned D45229: [MI-sched] schedule following instruction latencies.

Thanks @fhahn for the pointer: I'm closing this revision and I will try to fix the problem with something similar to D38279.
I tried that code out and it is not modifying the current behaviour of the scheduler.

Apr 4 2018, 1:29 PM