Page MenuHomePhabricator

wenlei (Wenlei He)
User

Projects

User does not belong to any projects.

User Details

User Since
Apr 3 2019, 5:55 PM (28 w, 3 d)

Recent Activity

Wed, Oct 16

wenlei accepted D68901: [SampleFDO] Add profile remapping support for profile on-demand loading used by ExtBinary format profile.

Thanks for making the changes, LGTM!

Wed, Oct 16, 7:04 PM · Restricted Project
wenlei added inline comments to D68901: [SampleFDO] Add profile remapping support for profile on-demand loading used by ExtBinary format profile.
Wed, Oct 16, 10:46 AM · Restricted Project

Mon, Oct 14

wenlei added inline comments to D68901: [SampleFDO] Add profile remapping support for profile on-demand loading used by ExtBinary format profile.
Mon, Oct 14, 11:30 AM · Restricted Project

Sat, Oct 12

Herald added a project to D52845: Update entry count for cold calls: Restricted Project.

@wmi @davidxl FYI, we have an internal patch to address this issue further. This patch only scales up entry count of outline function for cold callsites, thus losing context-sensitiveness. But ideally, we could reuse the entire profile from that inline context, and merge it all back to the outlined function. We changed the function annotation/inlining order to be top-down, then as we decide to not inline a cold call site, we merge the entire FunctionSamples of that inlinee back to outline counterpart, and since we process functions in top-down order (ignoring recursion for a second), annotation of the outline callee is able to use post merge FunctionSamples which is more accurate than simple scaling of entry count. (Doing context-sensitive inlining top-down has other benefits too, as it allows specialization whiling inlining which helps maximize the benefit of having context sensitive profile)

Sat, Oct 12, 11:13 AM · Restricted Project

Tue, Oct 8

wenlei accepted D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format.

Symbol list can be provided to llvm-profdata in a plain text file.

Tue, Oct 8, 3:45 PM · Restricted Project

Mon, Oct 7

wenlei added a comment to D68601: [SampleFDO] Add indexing for function profiles so they can be loaded on demand in ExtBinary format.

Thanks for making indexing available for extended binary format. We've recently adding indexing to binary format as well in an internal patch, and also observed similar (~30%) build time reduction for some large services.

Mon, Oct 7, 11:21 PM · Restricted Project
wenlei committed rGb3342e180e9c: [llvm-profdata] Minor format fix (authored by wenlei).
[llvm-profdata] Minor format fix
Mon, Oct 7, 10:15 PM
wenlei committed rL373917: [llvm-profdata] Minor format fix.
[llvm-profdata] Minor format fix
Mon, Oct 7, 10:15 PM
wenlei closed D68440: [llvm-profdata] Minor format fix.
Mon, Oct 7, 10:15 PM · Restricted Project

Thu, Oct 3

wenlei added a comment to D68440: [llvm-profdata] Minor format fix.

Before:

...
  13.1: inlined callee: _ZN6StringD2Ev: 0, 0, 0 sampled lines
    No samples collected in the function's body
    Samples collected in inlined callsites {
      0: inlined callee: _ZN6String4freeEv: 0, 0, 6 sampled lines
        Samples collected in the function's body {
          2: 0
          4: 0
          5: 0
          6: 0
          7: 0
          8: 0
        }
        Samples collected in inlined callsites {
          6: inlined callee: my_free: 0, 0, 1 sampled lines
            Samples collected in the function's body {
              4: 0
            }
            No inlined callsites in this function
}
}
}

After:

...
  13.1: inlined callee: _ZN6StringD2Ev: 0, 0, 0 sampled lines
    No samples collected in the function's body
    Samples collected in inlined callsites {
      0: inlined callee: _ZN6String4freeEv: 0, 0, 6 sampled lines
        Samples collected in the function's body {
          2: 0
          4: 0
          5: 0
          6: 0
          7: 0
          8: 0
        }
        Samples collected in inlined callsites {
          6: inlined callee: my_free: 0, 0, 1 sampled lines
            Samples collected in the function's body {
              4: 0
            }
            No inlined callsites in this function
        }
    }
}
Thu, Oct 3, 9:31 PM · Restricted Project
wenlei created D68440: [llvm-profdata] Minor format fix.
Thu, Oct 3, 7:03 PM · Restricted Project

Aug 20 2019

wenlei committed rG5adace352d5e: [AutoFDO] Make call targets order deterministic for sample profile (authored by wenlei).
[AutoFDO] Make call targets order deterministic for sample profile
Aug 20 2019, 1:53 PM
wenlei committed rL369440: [AutoFDO] Make call targets order deterministic for sample profile.
[AutoFDO] Make call targets order deterministic for sample profile
Aug 20 2019, 1:52 PM
wenlei closed D66191: [AutoFDO] Make call targets order deterministic for sample profil.
Aug 20 2019, 1:52 PM · Restricted Project
wenlei added inline comments to D66191: [AutoFDO] Make call targets order deterministic for sample profil.
Aug 20 2019, 1:45 PM · Restricted Project
wenlei updated the diff for D66191: [AutoFDO] Make call targets order deterministic for sample profil.

address review feedback

Aug 20 2019, 1:45 PM · Restricted Project
wenlei updated the diff for D66191: [AutoFDO] Make call targets order deterministic for sample profil.

rebase

Aug 20 2019, 8:41 AM · Restricted Project

Aug 18 2019

wenlei updated the diff for D66191: [AutoFDO] Make call targets order deterministic for sample profil.

Add getSortedCallTargets and SortCallTargets, per review feedback.

Aug 18 2019, 11:41 AM · Restricted Project

Aug 15 2019

wenlei added inline comments to D66191: [AutoFDO] Make call targets order deterministic for sample profil.
Aug 15 2019, 4:20 PM · Restricted Project
wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.

Thanks for reply, @asbirlea. Now I see where the non-determinism comes from. I thought that BFI query APIs all go through a translation from BasicBlock* to BlockNode, thus in query APIs, we use BasicBlock* just as a look up key without actually dereferencing it. This is what makes me think removing basic block is fine (we'll have dead entry, but it won't be accessed). But if the dangling BasicBlock* pointer happens to point to newly allocated BasicBlock* later, the count we get for that new block can be non-deterministic. This is quite tricky.. I'm not sure how crash can happen still as it seems we don't dereference BasicBlock* in query APIs, though non-determinism is enough of a problem.

Aug 15 2019, 2:51 PM · Restricted Project
wenlei updated the diff for D66191: [AutoFDO] Make call targets order deterministic for sample profil.

rebase

Aug 15 2019, 9:24 AM · Restricted Project
wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.

Thanks for the context, @chandlerc. Agreed that ideally BFI should be preserved by all loop passes. Not to defend this change, but I have a few questions just to make sure I understand the cause of miscompile/non-determinism and there's no other lurking issues around this.

Aug 15 2019, 12:39 AM · Restricted Project

Aug 13 2019

wenlei updated the diff for D66191: [AutoFDO] Make call targets order deterministic for sample profil.

update description

Aug 13 2019, 5:52 PM · Restricted Project
wenlei created D66191: [AutoFDO] Make call targets order deterministic for sample profil.
Aug 13 2019, 5:52 PM · Restricted Project
wenlei committed rGd328954467f4: [llvm-profdata] Profile dump for compact binary format (authored by wenlei).
[llvm-profdata] Profile dump for compact binary format
Aug 13 2019, 10:56 AM
wenlei committed rL368731: [llvm-profdata] Profile dump for compact binary format.
[llvm-profdata] Profile dump for compact binary format
Aug 13 2019, 10:55 AM
wenlei closed D65162: [llvm-profdata] Profile dump for compact binary format.
Aug 13 2019, 10:55 AM · Restricted Project

Aug 12 2019

wenlei committed rG4b99b58a847c: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin… (authored by wenlei).
[ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin…
Aug 12 2019, 10:46 AM
wenlei committed rL368596: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin….
[ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin…
Aug 12 2019, 10:45 AM
wenlei closed D65848: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends.
Aug 12 2019, 10:45 AM · Restricted Project
wenlei updated the diff for D65848: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends.

add message for assertion.

Aug 12 2019, 10:27 AM · Restricted Project

Aug 11 2019

wenlei committed rGcb5a90fd314a: Fix pass dependency for LICM (authored by wenlei).
Fix pass dependency for LICM
Aug 11 2019, 3:54 PM
wenlei committed rL368542: Fix pass dependency for LICM.
Fix pass dependency for LICM
Aug 11 2019, 3:54 PM

Aug 10 2019

wenlei committed rG7e71aa24bc07: [LICM] Make Loop ICM profile aware (authored by wenlei).
[LICM] Make Loop ICM profile aware
Aug 10 2019, 11:06 PM
wenlei committed rL368526: [LICM] Make Loop ICM profile aware.
[LICM] Make Loop ICM profile aware
Aug 10 2019, 11:06 PM
wenlei closed D65060: [LICM] Make Loop ICM profile aware.
Aug 10 2019, 11:06 PM · Restricted Project
wenlei committed rGd664072dd5ec: Revert "test commit" (authored by wenlei).
Revert "test commit"
Aug 10 2019, 11:04 PM
wenlei committed rL368525: Revert "test commit".
Revert "test commit"
Aug 10 2019, 10:58 PM
wenlei committed rG7bd327da0098: test commit (authored by wenlei).
test commit
Aug 10 2019, 10:51 PM
wenlei committed rL368524: test commit.
test commit
Aug 10 2019, 10:50 PM

Aug 8 2019

wenlei added inline comments to D65848: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends.
Aug 8 2019, 9:20 PM · Restricted Project
wenlei updated the diff for D65848: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends.

let GUIDToFuncNameMapper populate GUIDToFuncNameMap field of FunctionSamples.

Aug 8 2019, 9:12 PM · Restricted Project
wenlei updated the diff for D65848: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends.

avoid TLS and move GUIDToFuncNameMapper to class member

Aug 8 2019, 9:34 AM · Restricted Project
wenlei added a comment to D65848: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends.
In D65848#1621227, @wmi wrote:

Thanks for working on the issue.

From reading the comment of LLVM_THREAD_LOCAL in include/llvm/Support/Compiler.h, it seems we should only use TLS for POD if we want it to be functioning as we expect on all platforms.

How about change GUIDToFuncNameMap and CurrentModule as member fields in class GUIDToFuncNameMapper so we can create them separately for each module? Add a unique_ptr of GUIDToFuncNameMapper in SampleProfileLoader and a pointer field of type "GUIDToFuncNameMapper *" for each FunctionSamples object, so FunctionSamples object can access GUIDToFuncNameMap and CurrentModule via the pointer of GUIDToFuncNameMapper. The initialization of GUIDToFuncNameMapper pointer of each FunctionSamples object can be done in GUIDToFuncNameMapper's constructor (need to pass Reader as a param to get all the FunctionSamples).

Aug 8 2019, 8:56 AM · Restricted Project

Aug 6 2019

wenlei created D65848: [ThinLTO][AutoFDO] Fix memory corruption due to race condition from thin backends.
Aug 6 2019, 11:31 PM · Restricted Project

Aug 1 2019

wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.
In D65060#1596177, @vsk wrote:

Thanks for explaining. I don't have any other concerns about the patch as-written, but am interested in how it fits in with LoopSink (per David's comment).

Aug 1 2019, 9:36 AM · Restricted Project

Jul 29 2019

wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.

gentle ping.. @davidxl, let me know if the last update addressed your comments. thanks!

Jul 29 2019, 9:36 AM · Restricted Project

Jul 24 2019

wenlei added inline comments to D65060: [LICM] Make Loop ICM profile aware.
Jul 24 2019, 2:35 PM · Restricted Project
wenlei updated the diff for D65060: [LICM] Make Loop ICM profile aware.

add hasProfileData check, expose tuning knob per review feedback.

Jul 24 2019, 2:35 PM · Restricted Project
wenlei added inline comments to D65162: [llvm-profdata] Profile dump for compact binary format.
Jul 24 2019, 11:33 AM · Restricted Project
wenlei updated the diff for D65162: [llvm-profdata] Profile dump for compact binary format.

address feedback

Jul 24 2019, 11:24 AM · Restricted Project
wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.

It is also better to introduce a coldness factor parameter 'F', i.e, if the source block's count is less than 1/F of the header count, suppress the hoisting.

Jul 24 2019, 11:08 AM · Restricted Project

Jul 23 2019

wenlei created D65162: [llvm-profdata] Profile dump for compact binary format.
Jul 23 2019, 11:50 AM · Restricted Project

Jul 22 2019

wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.

Hal probably remembered more details. IIRC, the argument was that LICM provides IR canonicalization which can simplify analysis -- there were some examples about exposed vectorization opportunities etc.

Jul 22 2019, 10:59 PM · Restricted Project
wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.

This was tried before. IIRC, the conclusion was to implement look sinking pass to undo non-profitable LICM. The loop sinking pass was in D22778. Can the loop sinking pass be enhanced to handle the case here ( I have not looked in details) ?

Jul 22 2019, 12:07 PM · Restricted Project
wenlei added inline comments to D65060: [LICM] Make Loop ICM profile aware.
Jul 22 2019, 11:51 AM · Restricted Project
wenlei added inline comments to D65060: [LICM] Make Loop ICM profile aware.
Jul 22 2019, 11:23 AM · Restricted Project
wenlei updated the diff for D65060: [LICM] Make Loop ICM profile aware.

Address review feedbacks.

Jul 22 2019, 11:16 AM · Restricted Project

Jul 21 2019

wenlei added a comment to D65060: [LICM] Make Loop ICM profile aware.

Extra note: we saw a case from one of the big services (at facebook) that's very similar to the example in LICM/sink.ll, where there're cold paths in the loop. And by checking block frequency to avoid hoisting to hotter blocks, we got good improvements (~7% for a CPU related metric).

Jul 21 2019, 11:04 AM · Restricted Project
wenlei created D65060: [LICM] Make Loop ICM profile aware.
Jul 21 2019, 10:51 AM · Restricted Project

Jul 15 2019

wenlei added a comment to D64784: Skip zero column for inline sites.

@dblaikie, thanks for review. Could you help land this change? I don't have commit permission yet..

Jul 15 2019, 9:29 PM · Restricted Project
wenlei added a comment to D64784: Skip zero column for inline sites.

Test coverage? (you could add another inlined call to the test case you added for the feature - but just leave the inlined-at location with a zero column for that particular call - so you can still test both positive/negatiev in the one file)

Jul 15 2019, 9:03 PM · Restricted Project
wenlei updated the diff for D64784: Skip zero column for inline sites.

Tweak test case to add coverage for zero-coloumn inline site

Jul 15 2019, 8:58 PM · Restricted Project
wenlei created D64784: Skip zero column for inline sites.
Jul 15 2019, 8:21 PM · Restricted Project
wenlei added a comment to D64033: Add column info for inline sites.

What's the motivation for this?

Jul 15 2019, 8:09 PM · Restricted Project

Jul 12 2019

wenlei added a comment to D64033: Add column info for inline sites.

LGTM. Thank you!

Jul 12 2019, 11:59 AM · Restricted Project

Jul 11 2019

wenlei added a comment to D64033: Add column info for inline sites.

If the change from DW_AT_MIPS_linkage_name to DW_AT_linkage_name made you think the older .o files were MIPS, more likely it is a change in the default DWARF version; older versions of DWARF did not have DW_AT_linkage_name.

I think the .ll tests are sufficient to show we emit the call_column so no need to update the odr-member-functions tests IMO.

Jul 11 2019, 11:03 AM · Restricted Project
wenlei updated the diff for D64033: Add column info for inline sites.

Undo changes to odr-member-functions test.

Jul 11 2019, 10:52 AM · Restricted Project

Jul 10 2019

wenlei added a comment to D64033: Add column info for inline sites.

I guess we can leave the .s file tests alone. Most of them look like they were derived from real -S output and it would be pedantically consistent to update them, but they don't have to be.
@JDevlieghere once you sort out the odr-member-functions the patch is fine with me. I am mildly curious why all three .o files got smaller.

I applied the patch and that obviously doesn't include the binaries, my bad. Do we need to regenerate these binaries at all? I'm tempted to just leave this test as-is.

Jul 10 2019, 12:03 AM · Restricted Project

Jul 9 2019

wenlei added a comment to D64033: Add column info for inline sites.

@JDevlieghere once you sort out the odr-member-functions the patch is fine with me. I am mildly curious why all three .o files got smaller.

Jul 9 2019, 11:59 PM · Restricted Project
wenlei added a comment to D64033: Add column info for inline sites.

I suspect there are more tests that must or should be updated; grepping for "DW_AT_call_" turns up for example:
llvm/test/DebugInfo/NVPTX/debug-info.ll
llvm/test/DebugInfo/X86/dbg-value-inlined-parameter.ll
llvm/test/tools/dsymutil/odr-member-functions.cpp

and there are a variety of .s files with hand-generated DWARF that probably should be updated as well, under llvm/test/tools/llvm-dwarfdump.

Jul 9 2019, 1:43 PM · Restricted Project
wenlei updated the diff for D64033: Add column info for inline sites.

Update more tests.

Jul 9 2019, 1:26 PM · Restricted Project

Jul 1 2019

wenlei created D64033: Add column info for inline sites.
Jul 1 2019, 2:00 PM · Restricted Project

Jun 3 2019

wenlei added inline comments to rG082d99f58cbe: Unbreak non-PIC builds after r362390 / D62720.
Jun 3 2019, 9:58 PM

Apr 30 2019

wenlei added a comment to D38061: Set AnonymousTagLocations false for ASTContext if column info is expected not to be used.

Seems like the right thing would be for the DWARF code that wants a rendered type name to pass its own printing policy, rather than changing some relatively global one.

(though also I have my doubts about the whole approach - macro expansion can change the line number as well as the column number, so only suppressing column number would be insufficient - and this also reduces the usability for users (because the file/line/column of an anonymous type is a useful debugging aid). Seems to me like this should be opt-in if it's supported at all - though ideally build systems would use -frewrite-includes rather than full preprocessing, and then macros and lines/columns would be preserved, I think)

Apr 30 2019, 9:53 AM · Restricted Project