Page MenuHomePhabricator

davidxl (David Li)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 13 2015, 9:48 AM (371 w, 1 h)

Recent Activity

Mon, May 16

davidxl added a comment to D125495: [Inline][Remark] Annotate inline pass name with link phase information for analysis..

It might be worth annotating the early inliner for PGO as well.

To confirm if I'm understanding it correctly, you mean we should annotate the "early" (i.e., not profile-driven) information for those inline transformations that are are made without PGO profiles [1].

Ask since the early inliner is annotated with prelink/postlink information (same line around line 630 by construction of ModuleInlinerWrapperPass instance)

I am not David, but yes, it would be nice to be able to differentiate the early inlining and rhw profile-driven inlining in the FDO scenario.

Mon, May 16, 11:05 AM · Restricted Project, Restricted Project

Fri, May 13

davidxl added a comment to D124889: [ValueTracking] Add support to deduce a PHI node being a power of 2 if each incoming value is a power of 2..

This one looks fine to me. Wait to see if other reviewers have further comment.

Fri, May 13, 11:04 AM · Restricted Project, Restricted Project

Thu, May 12

davidxl added a comment to D125495: [Inline][Remark] Annotate inline pass name with link phase information for analysis..

It might be worth annotating the early inliner for PGO as well.

Thu, May 12, 2:34 PM · Restricted Project, Restricted Project
davidxl accepted D120233: [SelectOpti][5/5] Optimize select-to-branch transformation.

lgtm

Thu, May 12, 9:34 AM · Restricted Project, Restricted Project

Tue, May 10

davidxl added inline comments to D120233: [SelectOpti][5/5] Optimize select-to-branch transformation.
Tue, May 10, 8:33 PM · Restricted Project, Restricted Project
davidxl added inline comments to D120233: [SelectOpti][5/5] Optimize select-to-branch transformation.
Tue, May 10, 12:09 PM · Restricted Project, Restricted Project
davidxl added a reviewer for D120233: [SelectOpti][5/5] Optimize select-to-branch transformation: davidxl.
Tue, May 10, 11:49 AM · Restricted Project, Restricted Project

Mon, May 9

davidxl added inline comments to D124118: [Peephole-Opt][X86] Enhance peephole opt to see through SUBREG_TO_REG (following AND) and eliminates redundant TEST instruction..
Mon, May 9, 9:57 PM · Restricted Project, Restricted Project
davidxl added a comment to D125070: [InstCombine] Simplify chain of GEP with constant indices.

Perhaps split out part of the patch that handles the following and then work on a patch to do reassociation (nikic's comment)

Mon, May 9, 1:39 PM · Restricted Project, Restricted Project
davidxl accepted D125249: [Inliner] Preserve !prof metadata when converting call to invoke..

lgtm

Mon, May 9, 12:53 PM · Restricted Project, Restricted Project
davidxl added a comment to D125249: [Inliner] Preserve !prof metadata when converting call to invoke..

Should we have a test case to cover the post-inlining scaling. Otherwise looks good.

Mon, May 9, 11:48 AM · Restricted Project, Restricted Project
davidxl added a comment to D125249: [Inliner] Preserve !prof metadata when converting call to invoke..

Does the data need to be scaled first?

Mon, May 9, 11:43 AM · Restricted Project, Restricted Project
davidxl accepted D125071: [InstCombine] Baseline tests for InstCombine optimization to merge GEP instructions with constant indices.

lgtm with nikic's comment addressed.

Mon, May 9, 11:37 AM · Restricted Project, Restricted Project

Thu, May 5

davidxl accepted D120232: [SelectOpti][4/5] Loop Heuristics.

lgtm

Thu, May 5, 1:48 PM · Restricted Project, Restricted Project

Wed, May 4

davidxl accepted D124857: [AIX][PGO] Enable linux style PGO on AIX.

lgtm

Wed, May 4, 9:25 AM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Tue, May 3

davidxl accepted D124885: [ValueTracking] Baseline tests for Power-of-2 value tracking on PHI nodes.

lgtm

Tue, May 3, 10:20 PM · Restricted Project, Restricted Project
davidxl added inline comments to D124857: [AIX][PGO] Enable linux style PGO on AIX.
Tue, May 3, 2:06 PM · Restricted Project, Restricted Project, Restricted Project, Restricted Project

Mon, May 2

davidxl added a comment to D123748: [ValueTracking] Added support to deduce PHI Nodes values being a power of 2.

From the feedbacks, the action items are

  1. checkin baseine tests;
  2. split out recurrence handling from the main patch -- have it reviewed and committed
  3. send the recurrence handling patch, get it reviewed and committed (the depth may not be needed in this case)
Mon, May 2, 2:34 PM · Restricted Project, Restricted Project

Sat, Apr 30

davidxl added inline comments to D120232: [SelectOpti][4/5] Loop Heuristics.
Sat, Apr 30, 10:05 PM · Restricted Project, Restricted Project

Mon, Apr 25

davidxl accepted D124388: [LoopPeel][NFCI] Simplify the code to calculate peel count for PGO.

This patch slightly changes the behavior (e.g consider alreadyPeeled when comparing with size threshold), but looks more correct. LGTM.

Mon, Apr 25, 8:05 PM · Restricted Project, Restricted Project

Apr 22 2022

davidxl added inline comments to D120232: [SelectOpti][4/5] Loop Heuristics.
Apr 22 2022, 11:22 PM · Restricted Project, Restricted Project
davidxl added a comment to D123865: [LoopPeel] Allow partial unrolling for profile-based peeling.

Unfortunatelly, I do not have access to SPEC tests. I checked with 7zip-benchmark and found that Compressing speed is slightly improved with this patch while Decompressing is slightly degraded. But I have to note that compared to a Release build, PGO one is worse even without the patch.

As for my own tests, I noticed that the performance gain with unrolled loops results from the generated code that has less instructions. In one sample, there was no need to reserve a register for the loop counters and as in each iteration the value of the counter was known, there was no instructions for increment the counter and the comparison instructions used an immediate value instead of comparing with the register; all that made the code path shorter. In another example, the subsequent iterations used overlapped data from the memory. Both compact and unrolled version loaded the data to a set of registers, but the compact version had to shift values in registers in each iteration, so that they are placed each time in the predefined registers, while unrolled version just used instructions which work this different set of registers in each iteration. Thus, unrolling again helped to find a way to generate a code with less instructions for each peeled iteration.

Apr 22 2022, 11:18 AM · Restricted Project, Restricted Project

Apr 20 2022

davidxl added a comment to D124118: [Peephole-Opt][X86] Enhance peephole opt to see through SUBREG_TO_REG (following AND) and eliminates redundant TEST instruction..

I suspect there is something else going on.

The following case works fine without the patch.

define dso_local noundef i64 @_Z3fooPlll(ptr nocapture noundef %0, i64 noundef %1, i64 noundef %2) local_unnamed_addr #0 {

%4 = load i64, ptr %0, align 8
%5 = and i64 %4, 3
%6 = icmp eq i64 %5, 0
%7 = select i1 %6, i64 %1, i64 %2
store i64 %7, ptr %0, align 8
ret i64 %5

}

However a simple change below to replace the operand %2 in the select with %5 (output of and operation), then it stops working.

define dso_local noundef i64 @_Z3fooPlll(ptr nocapture noundef %0, i64 noundef %1, i64 noundef %2) local_unnamed_addr #0 {

%4 = load i64, ptr %0, align 8
%5 = and i64 %4, 3
%6 = icmp eq i64 %5, 0
%7 = select i1 %6, i64 %1, i64 %5
store i64 %7, ptr %0, align 8
ret i64 %5

}

Can you investigate the difference? The behavior difference on ARM is also worth comparing with.

There's a peephole in X86DAGToDAGISel::Select and PostProcessISelDAG that both fail if the AND has users other than the TEST.

Apr 20 2022, 10:00 PM · Restricted Project, Restricted Project
davidxl edited reviewers for D124118: [Peephole-Opt][X86] Enhance peephole opt to see through SUBREG_TO_REG (following AND) and eliminates redundant TEST instruction., added: craig.topper; removed: davidxl.
Apr 20 2022, 9:52 PM · Restricted Project, Restricted Project
davidxl added a comment to D124118: [Peephole-Opt][X86] Enhance peephole opt to see through SUBREG_TO_REG (following AND) and eliminates redundant TEST instruction..

I suspect there is something else going on.

Apr 20 2022, 2:26 PM · Restricted Project, Restricted Project
davidxl added inline comments to D120231: [SelectOpti][3/5] Base Heuristics.
Apr 20 2022, 9:10 AM · Restricted Project, Restricted Project

Apr 19 2022

davidxl added a comment to D120232: [SelectOpti][4/5] Loop Heuristics.

A high level comment. I saw ScalingUpFactor is used in many places. It may be better to directly use Scaled64 data type instead of uint64 to avoid that.

Apr 19 2022, 10:16 PM · Restricted Project, Restricted Project

Apr 18 2022

davidxl added a reviewer for D120232: [SelectOpti][4/5] Loop Heuristics: davidxl.
Apr 18 2022, 10:31 PM · Restricted Project, Restricted Project
davidxl accepted D120231: [SelectOpti][3/5] Base Heuristics.

lgtm

Apr 18 2022, 10:30 PM · Restricted Project, Restricted Project
davidxl added inline comments to D120231: [SelectOpti][3/5] Base Heuristics.
Apr 18 2022, 10:15 PM · Restricted Project, Restricted Project
davidxl added a comment to D123748: [ValueTracking] Added support to deduce PHI Nodes values being a power of 2.

The patch looks reasonable. Please wait for other reviewers (Sanjay ) to comment as well.

Apr 18 2022, 9:51 PM · Restricted Project, Restricted Project

Apr 15 2022

davidxl added a comment to D123865: [LoopPeel] Allow partial unrolling for profile-based peeling.

Our test was for AArch64, so there should be no effects like loop alignment, if I understand you right.

Apr 15 2022, 2:00 PM · Restricted Project, Restricted Project
davidxl added a comment to D123865: [LoopPeel] Allow partial unrolling for profile-based peeling.

Have you analyzed the root cause of the performance regression (with performance counters)? Is it due to other missed optimizations without the peeling or is it due to side effects such as loop alignment?

Apr 15 2022, 12:01 PM · Restricted Project, Restricted Project
davidxl added a comment to D123865: [LoopPeel] Allow partial unrolling for profile-based peeling.

For loops with trip count between maxPeelCount and 2*maxPeelCount, this patch enables partial peeling of the loop for maxPeelCount iterations. This feels very aggressive and can be detrimental to performance: 1) increased icache footprint; 2) more pressure to BPU and 3) the back branch of the remaining loop may become less predictable.

Apr 15 2022, 11:02 AM · Restricted Project, Restricted Project

Apr 14 2022

davidxl accepted D123834: [PGO] Remove legacy PM passes.

thanks for the cleanup. LGTM

Apr 14 2022, 10:06 PM · Restricted Project, Restricted Project
davidxl added inline comments to D123748: [ValueTracking] Added support to deduce PHI Nodes values being a power of 2.
Apr 14 2022, 11:48 AM · Restricted Project, Restricted Project

Apr 13 2022

davidxl added a comment to D123748: [ValueTracking] Added support to deduce PHI Nodes values being a power of 2.

Can you add back the original list of reviewers and mark the places where the prior comments are addressed?

Apr 13 2022, 5:46 PM · Restricted Project, Restricted Project
davidxl added a comment to D123164: [CoverageMapping] Remove dots from paths inside the profile.

Is it possible to generate the covmapping file instead of checking a binary blob?

Apr 13 2022, 1:01 PM · Restricted Project, Restricted Project

Apr 11 2022

davidxl added inline comments to D120231: [SelectOpti][3/5] Base Heuristics.
Apr 11 2022, 11:07 PM · Restricted Project, Restricted Project

Apr 5 2022

davidxl added inline comments to D123071: [ValueTracking] Added support to deduce PHI Nodes values being a power of 2.
Apr 5 2022, 2:59 PM · Restricted Project, Restricted Project
davidxl added inline comments to D120231: [SelectOpti][3/5] Base Heuristics.
Apr 5 2022, 2:31 PM · Restricted Project, Restricted Project

Apr 4 2022

davidxl added a reviewer for D123071: [ValueTracking] Added support to deduce PHI Nodes values being a power of 2: Carrot.
Apr 4 2022, 1:26 PM · Restricted Project, Restricted Project

Apr 1 2022

davidxl added a reviewer for D120231: [SelectOpti][3/5] Base Heuristics: davidxl.
Apr 1 2022, 1:21 PM · Restricted Project, Restricted Project

Mar 31 2022

davidxl added inline comments to D120231: [SelectOpti][3/5] Base Heuristics.
Mar 31 2022, 4:11 PM · Restricted Project, Restricted Project
davidxl accepted D122785: chore: add watermark test for llvm-cov.

lgtm

Mar 31 2022, 8:36 AM · Restricted Project, Restricted Project

Mar 25 2022

davidxl accepted D122259: [SelectOpti][2/5] Select-to-branch base transformation.

Perhaps modify the test case to include debug statement for better coverage. Otherwise LGTM.

Mar 25 2022, 3:21 PM · Restricted Project, Restricted Project
davidxl added inline comments to D122336: [InstrProfiling] No runtime hook for unused funcs.
Mar 25 2022, 11:44 AM · Restricted Project, Restricted Project, Restricted Project

Mar 24 2022

davidxl added inline comments to D122259: [SelectOpti][2/5] Select-to-branch base transformation.
Mar 24 2022, 9:52 PM · Restricted Project, Restricted Project
davidxl added a comment to D122336: [InstrProfiling] No runtime hook for unused funcs.

Should this be fixed in Clang (not generating coverage record)? Adding the logic here is a little confusing.

Mar 24 2022, 10:11 AM · Restricted Project, Restricted Project, Restricted Project

Mar 22 2022

davidxl added a reviewer for D122259: [SelectOpti][2/5] Select-to-branch base transformation: davidxl.
Mar 22 2022, 2:27 PM · Restricted Project, Restricted Project
davidxl added inline comments to D122259: [SelectOpti][2/5] Select-to-branch base transformation.
Mar 22 2022, 2:26 PM · Restricted Project, Restricted Project

Mar 21 2022

davidxl added a comment to D121862: [ProfSampleLoader] When disable-sample-loader-inlining is true, merge profiles of inlined instances to outlining versions..

When the option was introduced in https://reviews.llvm.org/D120344, I think the intention is to use it to disable both sampleloader inlining (pre/postlink), so perhaps we can just tighten the comment for the option and document that the profiles are merged back.

Mar 21 2022, 4:34 PM · Restricted Project, Restricted Project

Mar 17 2022

davidxl added a comment to D121862: [ProfSampleLoader] When disable-sample-loader-inlining is true, merge profiles of inlined instances to outlining versions..

What's your actual use case that motivated this change?

When this was originally added, we used it to disable only pre-link sample loader inlining, in which case we don't want to merge the profiles even when inlining there is skipped. This is because during post-link, sample loader inlining will still happen, so merging profiles during pre-link as if relevant inlining will never happen may cause over-optimize (it was measurable in code size). +@hoy

Good point about the post-link sample loader inlinling. The intention of this internal flag is actually to disable both sample loader inlining for performance experiment purpose (e.g exercising more cost/benefit analysis of regular inlining) -- related effort is kazu@ is doing analysis on context similarity analysis to try to tune down the sample loader inlining.

Can you shed more light on the context similarity analysis? Thanks.

Mar 17 2022, 10:21 AM · Restricted Project, Restricted Project
davidxl added a comment to D121862: [ProfSampleLoader] When disable-sample-loader-inlining is true, merge profiles of inlined instances to outlining versions..

related effort is kazu@ is doing analysis on context similarity analysis to try to tune down the sample loader inlining.

Curious to hear more about this. Is tuning down sample loader inlining for performance or something else (e.g. compile time)? The problem of SCC inliner being bottom-up makes it less selective, which is a disadvantage comparing to early top-down inlining, so we were actually hoping to move more inlining from SCC to sample loader. Or are you shifting inlining to new ModuleInliner?

Mar 17 2022, 9:58 AM · Restricted Project, Restricted Project
davidxl added a comment to D121862: [ProfSampleLoader] When disable-sample-loader-inlining is true, merge profiles of inlined instances to outlining versions..

What's your actual use case that motivated this change?

When this was originally added, we used it to disable only pre-link sample loader inlining, in which case we don't want to merge the profiles even when inlining there is skipped. This is because during post-link, sample loader inlining will still happen, so merging profiles during pre-link as if relevant inlining will never happen may cause over-optimize (it was measurable in code size). +@hoy

Mar 17 2022, 9:35 AM · Restricted Project, Restricted Project

Mar 14 2022

davidxl added a comment to D120230: [SelectOpti][1/5] Setup new select-optimize pass.

On that internal workload, we've got 6% less cmov with this pass turned on for IRPGO (it works, no correctness issue :-) ). But perf-wise it's neutral (we can measure 0.2% perf movement on that workload with high confidence).

Does BOLT's cmov optimization improve performance for this workload?

I didn't measure it yet, but unlikely (see comment below).

Say we end up with cmov in one of the sample PGO iterations (either due to lack of profile, or profile indicating branch being unbiased), we would lose the control flow profile that is needed to tell how biased that original branch is, because we've turned that control flow into data flow. Unless we never use cmov for branches without profile info, we could keep generating cmov in future iterations even if branch becomes more biased later because we will never get control flow profile again.

If we indeed never use cmov for branch without profile, that turn this problem into a typical sample PGO oscillations. That is not the case before this patch set, are we changing the behavior now? I'm also not sure if such oscillation is as easily mitigable as other oscillations like those from speculative ICP.

Regarding BOLT's usage for this problem -- does it mean the profile data is not collected from production binary but collected using pre-BOLD binary in a training run?

Yes, the profile data should be collected from pre-BOLT binary.

If this is the setup, compiler can choose to minimze cmov generation for the sake of better profilling.

The compiler can indeed choose to minimize cmov generation – I've recently added an LLVM knob to force-expand all cmov's in D119777 (x86-cmov-converter-force-all).

However, the data collected with (non-PGO, non-LTO) clang binary suggests that x86-cmov-converter-force-all introduces a significant perf regression that BOLT's CMOV conversion with default heuristics is unable to recover from.

Mar 14 2022, 10:49 PM · Restricted Project, Restricted Project
davidxl added a comment to D120230: [SelectOpti][1/5] Setup new select-optimize pass.

On that internal workload, we've got 6% less cmov with this pass turned on for IRPGO (it works, no correctness issue :-) ). But perf-wise it's neutral (we can measure 0.2% perf movement on that workload with high confidence).

Does BOLT's cmov optimization improve performance for this workload?

This is being worked on now and we don't have data yet. The numbers above didn't have BOLT interfering with cmov.

Say we end up with cmov in one of the sample PGO iterations (either due to lack of profile, or profile indicating branch being unbiased), we would lose the control flow profile that is needed to tell how biased that original branch is, because we've turned that control flow into data flow. Unless we never use cmov for branches without profile info, we could keep generating cmov in future iterations even if branch becomes more biased later because we will never get control flow profile again.

If we indeed never use cmov for branch without profile, that turn this problem into a typical sample PGO oscillations. That is not the case before this patch set, are we changing the behavior now? I'm also not sure if such oscillation is as easily mitigable as other oscillations like those from speculative ICP.

Regarding BOLT's usage for this problem -- does it mean the profile data is not collected from production binary but collected using pre-BOLD binary in a training run? If this is the setup, compiler can choose to minimze cmov generation for the sake of better profilling.

David

Right, making compiler conservative to preserve branch so we can have control flow profile for BOLT to make final decision is the experiment we're doing. Pseudo-probe for sample PGO can also be tuned a bit more intrusive to disallow cmov for better profile in that setup.

Mar 14 2022, 10:36 PM · Restricted Project, Restricted Project
davidxl added a comment to D120230: [SelectOpti][1/5] Setup new select-optimize pass.

On that internal workload, we've got 6% less cmov with this pass turned on for IRPGO (it works, no correctness issue :-) ). But perf-wise it's neutral (we can measure 0.2% perf movement on that workload with high confidence).

Mar 14 2022, 10:14 PM · Restricted Project, Restricted Project

Mar 13 2022

davidxl added a comment to D120230: [SelectOpti][1/5] Setup new select-optimize pass.

FSAFDO allows late profile loading so that the matching of branches should be less of an issue.

Mar 13 2022, 8:35 PM · Restricted Project, Restricted Project

Mar 11 2022

davidxl updated subscribers of D120230: [SelectOpti][1/5] Setup new select-optimize pass.

Right. I mean pass placement.

Mar 11 2022, 9:15 PM · Restricted Project, Restricted Project

Mar 10 2022

davidxl added a comment to D120230: [SelectOpti][1/5] Setup new select-optimize pass.

The patch looks good to me. Adding Teresa to look at the pass ordering change.

Mar 10 2022, 4:01 PM · Restricted Project, Restricted Project
davidxl added a reviewer for D120230: [SelectOpti][1/5] Setup new select-optimize pass: tejohnson.
Mar 10 2022, 4:01 PM · Restricted Project, Restricted Project

Mar 7 2022

davidxl added inline comments to D120230: [SelectOpti][1/5] Setup new select-optimize pass.
Mar 7 2022, 9:41 PM · Restricted Project, Restricted Project
davidxl added a comment to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.

As I mentioned, the cost of investigating performance regressions is high. However if y'all think it is better to shift the cost, I won't stay in the way.

Mar 7 2022, 10:59 AM · Restricted Project, Restricted Project
davidxl accepted D121084: [NewPM][Inliner] Make inlined calls to functions in same SCC as callee exponentially expensive.
Mar 7 2022, 10:58 AM · Restricted Project, Restricted Project
davidxl added inline comments to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.
Mar 7 2022, 10:11 AM · Restricted Project, Restricted Project
davidxl added inline comments to D121084: [NewPM][Inliner] Make inlined calls to functions in same SCC as callee exponentially expensive.
Mar 7 2022, 10:09 AM · Restricted Project, Restricted Project
davidxl updated subscribers of D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.

If this is doable, it feels like a winning strategy to me.

Mar 7 2022, 9:10 AM · Restricted Project, Restricted Project

Mar 6 2022

davidxl added a comment to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.

There's a couple of reasons why a default-disabled option is not great:

  1. It's not compatible with linker plugin LTO. People would have to be experts on LLVM internals and know that they need to add something like -Wl,-mllvm,-enable-unbounded-cross-scc-inlining=0 to their build system if they are linking rust object files.
  2. This problem is not Rust-specific. According to https://discourse.llvm.org/t/rust-newpm-blocker-catastrophic-inlining/6171/2, Apple uses the earlier version of this patch, though not sure for which toolchain(s). IIRC Mozilla previously encountered this with C++ code.
  3. While I agree that correctness and performance is just another tradeoff, we generally always trade off in favor of correctness, unless we expect a widespread performance impact, in which case may temporarily trade off in favor of performance. There is no evidence that this patch has any widespread impact, but there is evidence against it: For rustc performance tests, this patch is entirely performance neutral (while the previous version had minor negative impact) and the entirety of llvm-test-suite has only two programs even showing codegen changes with this patch. There might be an impact in some isolated cases (actually, this is pretty much guaranteed with any inliner change), but all evidence points towards no widespread impact, and as such I don't think there is justification for favoring performance over correctness.

It is reasonable to expect either 1) a proper/complete fix;

I do think this is a proper fix.

Mar 6 2022, 3:44 PM · Restricted Project, Restricted Project
davidxl added a comment to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.

LLVM user base is large, the only way to get an answer to that is to turn it on and let user report -- but according to my experience, that is a very disruptive process and user may spend long time figuring out the source of regressions. From the nature of the change, the chances of that happening is quite high.

Mar 6 2022, 2:56 PM · Restricted Project, Restricted Project
davidxl added a comment to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.

I would not object turning it on by default when the more elaborate solution is there.

Mar 6 2022, 2:01 PM · Restricted Project, Restricted Project
davidxl added a comment to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.

Adding an option is a way to to forward, but I think it should be off by default. If it is on by default, it is very likely to cause some performance regressions.

Mar 6 2022, 10:12 AM · Restricted Project, Restricted Project

Mar 4 2022

davidxl added inline comments to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.
Mar 4 2022, 1:48 PM · Restricted Project, Restricted Project
davidxl added inline comments to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.
Mar 4 2022, 10:49 AM · Restricted Project, Restricted Project

Mar 3 2022

davidxl added inline comments to D120584: [NewPM][Inliner] Mark inlined calls to functions in same SCC as callee as noinline.
Mar 3 2022, 11:09 PM · Restricted Project, Restricted Project
davidxl accepted D120430: [memprof] Symbolize and cache stack frames..

lgtm

Mar 3 2022, 10:19 AM · Restricted Project, Restricted Project

Mar 1 2022

davidxl added a comment to D120335: [llvm-profgen] Generating probe-based non-CS profile..

What is the size overhead on binary with probe? What is the impact on source drifting tolerance level?

Mar 1 2022, 1:39 PM · Restricted Project, Restricted Project

Feb 27 2022

davidxl added a reviewer for D118421: [compiler-rt][profile] Compile as C++: MaskRay.
Feb 27 2022, 4:36 PM
davidxl added inline comments to D120504: [AST] RAV doesn't traverse explicitly instantiated function bodies by default.
Feb 27 2022, 4:34 PM · Restricted Project, Restricted Project
davidxl added inline comments to D120430: [memprof] Symbolize and cache stack frames..
Feb 27 2022, 4:25 PM · Restricted Project, Restricted Project

Feb 23 2022

davidxl accepted D120344: [SampleProf][Inliner] Add an option to turn off inliner in sample-profile pass..

lgtm

Feb 23 2022, 2:03 PM · Restricted Project

Feb 22 2022

davidxl accepted D120295: [SampleProfile] [ICP] Handle the case when the option `MaxNumPromotions` is zero..

lgtm

Feb 22 2022, 5:42 PM · Restricted Project
davidxl added a comment to D120344: [SampleProf][Inliner] Add an option to turn off inliner in sample-profile pass..

Probably add a test case in file test/Transforms/SampleProfile/profile-context-tracker.ll

Feb 22 2022, 2:45 PM · Restricted Project
davidxl added a comment to D120295: [SampleProfile] [ICP] Handle the case when the option `MaxNumPromotions` is zero..

Is it possible to check this in the caller context?

Feb 22 2022, 1:24 PM · Restricted Project
davidxl added inline comments to D120344: [SampleProf][Inliner] Add an option to turn off inliner in sample-profile pass..
Feb 22 2022, 12:18 PM · Restricted Project

Feb 18 2022

davidxl accepted D120147: [memprof] Remove packed qualifier for MemprofRecord::Frame..

lgtm

Feb 18 2022, 12:14 PM · Restricted Project

Feb 17 2022

davidxl accepted D120093: [memprof] Fix frame deserialization on big endian systems..

lgtm

Feb 17 2022, 3:28 PM · Restricted Project
davidxl added inline comments to D120093: [memprof] Fix frame deserialization on big endian systems..
Feb 17 2022, 3:21 PM · Restricted Project

Feb 9 2022

davidxl accepted D118653: [memprof] Extend the index prof format to include memory profiles..

lgtm

Feb 9 2022, 4:17 PM · Restricted Project, Restricted Project

Feb 4 2022

davidxl added inline comments to D118653: [memprof] Extend the index prof format to include memory profiles..
Feb 4 2022, 9:23 PM · Restricted Project, Restricted Project

Jan 28 2022

davidxl added a reviewer for D118066: [SimplifyCFG] Don't speculatively execute preductably-taken block: apostolakis.
Jan 28 2022, 9:38 AM · Restricted Project
davidxl accepted D118390: [InstrProf] Make the IndexedInstrProf header backwards compatible..

lgtm

Jan 28 2022, 9:36 AM · Restricted Project

Jan 27 2022

davidxl added inline comments to D116180: [InstrProf] Add single byte coverage mode.
Jan 27 2022, 11:10 AM · Restricted Project, Restricted Project
davidxl added a comment to D118390: [InstrProf] Make the IndexedInstrProf header backwards compatible..

The refactoring looks good. However in what situation do we need to modify the main header? There might be other ways to achieve the functionality.

Jan 27 2022, 9:47 AM · Restricted Project

Jan 24 2022

davidxl added a comment to D116876: [llvm-cov] New parameters to set coverage watermark.

Please add a test case. The documentation for llvm-cov tool also needs update.

Jan 24 2022, 10:59 AM · Restricted Project, Restricted Project

Jan 19 2022

davidxl accepted D117509: [PartialInline] Bail out on asm-goto/callbr.

lgtm

Jan 19 2022, 8:06 AM · Restricted Project

Jan 18 2022

davidxl added inline comments to D117509: [PartialInline] Bail out on asm-goto/callbr.
Jan 18 2022, 11:48 AM · Restricted Project

Jan 17 2022

davidxl added inline comments to D117509: [PartialInline] Bail out on asm-goto/callbr.
Jan 17 2022, 1:40 PM · Restricted Project
davidxl accepted D117261: [InstrProf][NFC] Add InstrProfInstBase base.

lgtm

Jan 17 2022, 1:31 PM · Restricted Project

Jan 13 2022

davidxl added inline comments to D117261: [InstrProf][NFC] Add InstrProfInstBase base.
Jan 13 2022, 5:15 PM · Restricted Project