wmi (Wei Mi)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 20 2015, 10:57 AM (221 w, 1 d)

Recent Activity

Tue, Apr 23

wmi added a reviewer for D60088: [GlobalOpt][SampleFDO] Add an option to control whether to rename alias target: eraman.
Tue, Apr 23, 10:26 AM · Restricted Project

Mon, Apr 22

wmi committed rG01f8d556aa72: [PGO/SamplePGO][NFC] Move the function updateProfWeight from Instruction to… (authored by wmi).
Mon, Apr 22, 10:03 AM
wmi committed rL358900: [PGO/SamplePGO][NFC] Move the function updateProfWeight from Instruction.
Mon, Apr 22, 10:03 AM
wmi closed D60911: [PGO/SamplePGO][NFC] Move the function updateProfWeight from Instruction to CallInst.
Mon, Apr 22, 10:02 AM · Restricted Project
wmi added a comment to D60911: [PGO/SamplePGO][NFC] Move the function updateProfWeight from Instruction to CallInst.

Ran internal FDO testing; the speedup is within the expected range.

Mon, Apr 22, 8:30 AM · Restricted Project

Fri, Apr 19

wmi created D60911: [PGO/SamplePGO][NFC] Move the function updateProfWeight from Instruction to CallInst.
Fri, Apr 19, 11:57 AM · Restricted Project

Apr 19 2019

wmi added a comment to D60903: [SampleFDO] Never set profile entry count to 0.

Looked at Instruction::updateProfWeight(): the part that updates branch_weights seems bogus; there is no need to scale branch weights at all.

VP data needs to be scaled of course, but if 'S' or 'T' is zero, just zero them out.

Apr 19 2019, 10:35 AM · Restricted Project
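
The zero-guard suggested in the comment above can be sketched as follows. This is a minimal, self-contained illustration and not LLVM's actual Instruction::updateProfWeight: the helper name scaleValueProfile is hypothetical, and it assumes that S and T are the numerator and denominator of the scaling ratio, as the comment implies.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an LLVM API): scales each value-profile count by
// S/T. When S or T is zero, every count is zeroed out, so the division below
// is never reached with a zero denominator.
std::vector<uint64_t> scaleValueProfile(const std::vector<uint64_t> &Counts,
                                        uint64_t S, uint64_t T) {
  std::vector<uint64_t> Scaled(Counts.size(), 0);
  if (S == 0 || T == 0)
    return Scaled; // Zero them out instead of dividing.
  for (std::size_t I = 0; I < Counts.size(); ++I) {
    // Assumption: Counts[I] * S fits in 64 bits; overflow handling is
    // omitted to keep the sketch short.
    Scaled[I] = Counts[I] * S / T;
  }
  return Scaled;
}
```

Handling the zero case at the point of division keeps a zero entry count representable elsewhere in the profile, which is the alternative raised in the D60903 discussion.
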
wmi added a comment to D60903: [SampleFDO] Never set profile entry count to 0.

It seems we did that all the time, even before https://reviews.llvm.org/rL352001. (Since rL352001, we have done the update even in the sample profile loader pass, when a cold inline instance was not inlined by the sample profile loader.) Here is the code snippet before the refactoring involved in rL352001.

Apr 19 2019, 10:19 AM · Restricted Project
wmi added a comment to D60903: [SampleFDO] Never set profile entry count to 0.

It is possible for an entry to have a zero count, as for instance in instrumentation PGO. Should it be fixed in the place where the division by zero happens?

Apr 19 2019, 9:49 AM · Restricted Project
wmi created D60903: [SampleFDO] Never set profile entry count to 0.
Apr 19 2019, 9:10 AM · Restricted Project

Apr 3 2019

wmi added a comment to D57591: fix weights for promoted indirect calls.

No major improvement or regression was found in our test.

Apr 3 2019, 10:06 AM · Restricted Project
wmi accepted D59835: [ProfileSummary] Count callsite samples when computing total samples..

Did the test. No major improvement or regression was found. LGTM.

Apr 3 2019, 10:06 AM · Restricted Project

Apr 2 2019

wmi added a comment to D60086: [SampleProfile] Check entry count instead of total count to decide if inlined callsite is hot..

@wmi Thank you for the concrete example! I think what we need for your example is context-sensitive profiling and function specialization, not inlining. Admittedly we don't have an infrastructure in LLVM to support context-sensitive profiling for non-inlined case and we don't perform context sensitive function specialization...

Again, my bigger concern is with using PSI->isHotCount to check the function hotness. If we want to stay with the function total count based heuristic, wouldn't it make more sense if we have something like ProfileSummaryInfo::isFunctionHotInCallGraph, which actually checks the hotness of the function itself, and use it?

Apr 2 2019, 4:20 PM · Restricted Project
wmi added a comment to D60086: [SampleProfile] Check entry count instead of total count to decide if inlined callsite is hot..

I can understand the value of context-sensitive profile information, but wouldn't it be only valuable if the callsite is actually worth inlining?

Apr 2 2019, 3:12 PM · Restricted Project
wmi added a comment to D60086: [SampleProfile] Check entry count instead of total count to decide if inlined callsite is hot..

@wmi Thanks for the reply! I can totally understand that entry count is not as precise as total count, but I still don't think the current implementation is the right way to address the issue. As I mentioned in the summary, it compares two different things (an instruction-level counter vs. a function-level counter), opens up the possibility of optimizing for the wrong function (e.g. a long and cold function), and makes it hard to find the root cause of a performance issue.

If we can't have a precise entry count, the right way to address the issue would be to replace the PSI-based heuristic with one that actually considers the total count of the function.

Apr 2 2019, 10:20 AM · Restricted Project
wmi added a comment to D59835: [ProfileSummary] Count callsite samples when computing total samples..

That is a reasonable change to me. Thanks. However, it probably needs careful performance evaluation, since some tuning may be needed for that change. I will do that on our benchmarks. What does the performance impact of this change look like on your side?

Apr 2 2019, 9:44 AM · Restricted Project
wmi added a comment to D57591: fix weights for promoted indirect calls.

That is a good improvement. Thanks! I will evaluate how much impact it will have in our benchmarks.

Apr 2 2019, 9:29 AM · Restricted Project

Apr 1 2019

wmi added a comment to D60086: [SampleProfile] Check entry count instead of total count to decide if inlined callsite is hot..

Theoretically, as you said, checking the entry sample count rather than the total sample count makes more sense. However, the entry sample count of a callsite is not as precise as the function entry sample count, which is obtained from LBR directly. For a callsite, the entry sample count can be very wrong because of missing debug information after optimization. That is why we chose the total sample count instead of the entry sample count when evaluating the hotness of a callsite.

Apr 1 2019, 4:30 PM · Restricted Project
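
The trade-off debated in D60086 can be illustrated with a toy model. The struct and function names below are hypothetical and do not correspond to LLVM's ProfileSummaryInfo interface; the sketch only assumes that hotness is decided by comparing a sample count against a threshold.

```cpp
#include <cstdint>

// Hypothetical summary of the samples attributed to one inlined callsite.
struct CallsiteSamples {
  uint64_t EntrySamples; // samples attributed to the entry of the inlined body
  uint64_t TotalSamples; // samples summed over the whole inlined body
};

// Hypothetical stand-in for a profile-summary hotness query.
bool isHotCount(uint64_t Count, uint64_t HotThreshold) {
  return Count >= HotThreshold;
}

// Heuristic A: judge the callsite by its entry sample count.
bool isHotByEntry(const CallsiteSamples &CS, uint64_t HotThreshold) {
  return isHotCount(CS.EntrySamples, HotThreshold);
}

// Heuristic B: judge the callsite by its total sample count.
bool isHotByTotal(const CallsiteSamples &CS, uint64_t HotThreshold) {
  return isHotCount(CS.TotalSamples, HotThreshold);
}
```

With a callsite whose entry samples were lost to missing debug information (for example EntrySamples of 0 and TotalSamples of 5000 against a threshold of 1000), heuristic A reports cold while heuristic B reports hot. The opposing concern in the thread also shows up here: heuristic B can report a long but rarely entered body as hot, because it compares a summed count against a per-instruction threshold.
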
wmi accepted D59940: [SampleProfile] Repeat indirect call promotion only when the target is actually hot..

Thanks! The change makes a lot of sense to me.

Apr 1 2019, 4:13 PM · Restricted Project
wmi accepted D59078: memcpy is not tailcalled.

LGTM.

Apr 1 2019, 2:50 PM
wmi created D60088: [GlobalOpt][SampleFDO] Add an option to control whether to rename alias target.
Apr 1 2019, 12:09 PM · Restricted Project

Mar 27 2019

wmi added a comment to D59869: [NewPM] Fix a nasty bug with analysis invalidation in the new PM..

Thanks for the detailed answer for how to make the choices!

Mar 27 2019, 4:05 PM · Restricted Project
wmi added a comment to D59869: [NewPM] Fix a nasty bug with analysis invalidation in the new PM..

Thanks for working on the patch, which will fix several build failures found internally! I believe the patch is already the result of a lot of balancing, and here are just some questions out of curiosity.

Mar 27 2019, 2:27 PM · Restricted Project

Mar 11 2019

wmi accepted D58832: [SampleFDO] add suffix elision control for fcn names.

LGTM. Thanks for working on it!

Mar 11 2019, 2:34 PM · Restricted Project
wmi added inline comments to D58832: [SampleFDO] add suffix elision control for fcn names.
Mar 11 2019, 11:56 AM · Restricted Project

Mar 8 2019

wmi added a comment to D58832: [SampleFDO] add suffix elision control for fcn names.
In D58832#1422205, @wmi wrote:

We need function attribute only when we want to use different strategies for different functions. Do we need the function granularity control over the suffix elision policy?

We want to be able to eventually support ThinLTO in which you have a mix of C and Go functions (it is fairly common for Go programs to call out into C code). Without such a change we'd have to pick one policy or another and hope that it works for code generated by every front end.

Mar 8 2019, 4:16 PM · Restricted Project
wmi added inline comments to D59143: [RegisterCoalescer] Limit the number of joins for large live interval with many valnos. .
Mar 8 2019, 3:34 PM · Restricted Project
wmi committed rG98214347c4ac: Rename a local variable counter to Counter. (authored by wmi).
Mar 8 2019, 3:32 PM
wmi committed rL355759: Rename a local variable counter to Counter..
Mar 8 2019, 3:32 PM
wmi committed rGfb9693d1c9cc: [RegisterCoalescer][NFC] bind a DenseMap access to a reference to avoid… (authored by wmi).
Mar 8 2019, 3:29 PM
wmi committed rL355757: [RegisterCoalescer][NFC] bind a DenseMap access to a reference to avoid.
Mar 8 2019, 3:29 PM
wmi committed rG72ec6801b5b2: [RegisterCoalescer] Limit the number of joins for large live interval with many… (authored by wmi).
Mar 8 2019, 11:27 AM
wmi committed rL355714: [RegisterCoalescer] Limit the number of joins for large live interval with.
Mar 8 2019, 11:25 AM
wmi closed D59143: [RegisterCoalescer] Limit the number of joins for large live interval with many valnos. .
Mar 8 2019, 11:25 AM · Restricted Project
wmi added a comment to D59143: [RegisterCoalescer] Limit the number of joins for large live interval with many valnos. .

Thanks for the fast response!

Mar 8 2019, 10:46 AM · Restricted Project
wmi created D59143: [RegisterCoalescer] Limit the number of joins for large live interval with many valnos. .
Mar 8 2019, 10:32 AM · Restricted Project

Mar 7 2019

wmi added a comment to D58832: [SampleFDO] add suffix elision control for fcn names.

We need function attribute only when we want to use different strategies for different functions. Do we need the function granularity control over the suffix elision policy?

Mar 7 2019, 3:27 PM · Restricted Project

Feb 28 2019

wmi accepted D58589: [ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM.

LGTM

Feb 28 2019, 9:23 PM · Restricted Project

Feb 27 2019

wmi added inline comments to D58589: [ConstantHoisting] Call cleanup() in ConstantHoistingPass::runImpl to avoid dangling elements in ConstIntInfoVec for new PM.
Feb 27 2019, 2:51 PM · Restricted Project

Feb 20 2019

wmi committed rG500606f270ff: [Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled is… (authored by wmi).
Feb 20 2019, 6:59 PM
wmi committed rL354542: [Inliner] Pass nullptr for the ORE param of getInlineCost if RemarkEnabled.
Feb 20 2019, 6:58 PM
wmi closed D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.
Feb 20 2019, 6:58 PM · Restricted Project
wmi added a comment to D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.

Thanks for the review!

Feb 20 2019, 6:45 PM · Restricted Project

Feb 19 2019

wmi updated the diff for D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.

Add a TODO comment.

Feb 19 2019, 6:44 PM · Restricted Project
wmi updated the diff for D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.

Address Chandler's comment.

Feb 19 2019, 6:39 PM · Restricted Project
wmi added inline comments to D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.
Feb 19 2019, 6:38 PM · Restricted Project
wmi updated the diff for D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.

Address Easwaran and Chandler's comments.

Feb 19 2019, 4:26 PM · Restricted Project
wmi added inline comments to D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.
Feb 19 2019, 4:18 PM · Restricted Project
wmi added inline comments to D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.
Feb 19 2019, 3:44 PM · Restricted Project
wmi updated the diff for D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.

Address Easwaran and Chandler's comments.

Feb 19 2019, 3:22 PM · Restricted Project
wmi added a comment to D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.

One option is to call DiagnosticHandler::isMissedOptRemarkEnabled in Inliner and pass the result (bool) when creating the CallAnalyzer object. This can be ORed with whether the missed opt remark is enabled for inline-cost.

Feb 19 2019, 3:20 PM · Restricted Project
wmi added inline comments to D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.
Feb 19 2019, 12:35 PM · Restricted Project
wmi created D58399: [Inliner] Don't initialize ComputeFullInlineCost to be always true because of ORE.
Feb 19 2019, 11:26 AM · Restricted Project

Feb 11 2019

wmi accepted D58064: [ThinLTO] Record in index whether IR used a flattened sample PGO profile.

LGTM.

Feb 11 2019, 1:35 PM · Restricted Project

Feb 7 2019

wmi edited reviewers for D57929: [InstrProf] Implement static profdata registration, added: xur; removed: wmi.
Feb 7 2019, 3:48 PM · Restricted Project, Restricted Project

Feb 4 2019

wmi committed rG4901f371a280: [SamplePGO][NFC] Minor improvement to replace a temporary vector with a brace… (authored by wmi).
Feb 4 2019, 4:58 PM
wmi committed rL353129: [SamplePGO][NFC] Minor improvement to replace a temporary vector with a.
Feb 4 2019, 4:57 PM
wmi closed D57726: [SamplePGO][NFC] Minor improvement to replace a temporary vector with a brace-enclosed init list.
Feb 4 2019, 4:57 PM · Restricted Project
wmi created D57726: [SamplePGO][NFC] Minor improvement to replace a temporary vector with a brace-enclosed init list.
Feb 4 2019, 4:52 PM · Restricted Project
wmi added a comment to D57706: [SamplePGO] Minor efficiency improvement in samplePGO ICP.

LGTM. There is a similar place in the SampleProfileLoader pass. I will do the same for it.

Feb 4 2019, 4:29 PM · Restricted Project
wmi accepted D57705: [SamplePGO] More pipeline changes when flattened profile used in ThinLTO postlink.

LGTM.

Feb 4 2019, 4:16 PM · Restricted Project

Jan 23 2019

wmi accepted D52845: Update entry count for cold calls.

Sorry, it took me some time to test the change on our major benchmarks. The results are neutral.

Jan 23 2019, 2:08 PM

Jan 17 2019

wmi committed rL351476: [SampleFDO] Skip profile reading when flattened profile used in ThinLTO postlink.
Jan 17 2019, 12:52 PM
wmi closed D54819: [SampleFDO] Skip profile reading when flatten profile is used in ThinLTO postlink phase.
Jan 17 2019, 12:52 PM
wmi updated the diff for D54819: [SampleFDO] Skip profile reading when flatten profile is used in ThinLTO postlink phase.

Address Teresa's comments.

Jan 17 2019, 9:31 AM

Jan 16 2019

wmi committed rL351397: Fix a mistake in rL351392..
Jan 16 2019, 3:35 PM
wmi committed rL351392: [PGO] Make pgo related options in opt more consistent..
Jan 16 2019, 3:23 PM
wmi closed D56749: [NFC] Make pgo related options in opt more consistent. .
Jan 16 2019, 3:22 PM
wmi added inline comments to D56749: [NFC] Make pgo related options in opt more consistent. .
Jan 16 2019, 3:22 PM

Jan 15 2019

wmi created D56749: [NFC] Make pgo related options in opt more consistent. .
Jan 15 2019, 3:23 PM
wmi added inline comments to D54819: [SampleFDO] Skip profile reading when flatten profile is used in ThinLTO postlink phase.
Jan 15 2019, 2:52 PM
wmi added inline comments to D52845: Update entry count for cold calls.
Jan 15 2019, 10:28 AM
wmi accepted D56491: treat invoke like call.

LGTM.

Jan 15 2019, 9:48 AM
wmi accepted D56435: We can improve the performance (generally) by memo-izing the action to map a debug location to its function summary..

LGTM.

Jan 15 2019, 9:13 AM

Jan 14 2019

wmi added inline comments to D52845: Update entry count for cold calls.
Jan 14 2019, 4:28 PM
wmi added inline comments to D52845: Update entry count for cold calls.
Jan 14 2019, 12:12 PM
wmi added a comment to D56435: We can improve the performance (generally) by memo-izing the action to map a debug location to its function summary..

It seems the compile-time saving comes mainly from multiple instructions sharing the same debug location. Is my understanding correct?

Seems incorrect. findFunctionSamples may be called for the same instruction multiple times due to multiple iterations of hot functions inlining or profile propagation.

Jan 14 2019, 11:58 AM
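
The memoization discussed for D56435 can be sketched in isolation as below. This is a generic cache, not the code from that patch: the class name, the integer location key, and the placeholder slow lookup are all hypothetical stand-ins for mapping a debug location to its function samples.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a per-function sample record.
struct FunctionSamplesSketch {
  uint64_t TotalSamples;
};

// Caches the result of an expensive location-to-samples lookup so that
// repeated queries for the same location key do the slow work only once.
class SamplesLookup {
public:
  const FunctionSamplesSketch *find(uint64_t LocKey) {
    auto It = Cache.find(LocKey);
    if (It != Cache.end())
      return It->second; // Cache hit, including cached "not found" results.
    const FunctionSamplesSketch *FS = slowLookup(LocKey);
    Cache.emplace(LocKey, FS);
    return FS;
  }

  // Number of times the slow path actually ran.
  unsigned slowLookups() const { return SlowLookups; }

private:
  // Placeholder for the expensive walk; here even keys "have samples" and
  // odd keys do not, purely for illustration.
  const FunctionSamplesSketch *slowLookup(uint64_t LocKey) {
    ++SlowLookups;
    static const FunctionSamplesSketch Dummy{42};
    return (LocKey % 2 == 0) ? &Dummy : nullptr;
  }

  std::unordered_map<uint64_t, const FunctionSamplesSketch *> Cache;
  unsigned SlowLookups = 0;
};
```

The cache pays off in both situations raised in the thread: several instructions that share one location key, and the same instruction being queried again across repeated inlining or propagation iterations. Negative results are cached as well, so repeated misses are also cheap.
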
wmi accepted D55094: Ignore PhiNodes when mapping sample profile data.

LGTM.

Jan 14 2019, 10:19 AM
wmi added a comment to D56491: treat invoke like call.

Could you have a testcase for it?

Jan 14 2019, 10:10 AM
wmi added a comment to D56435: We can improve the performance (generally) by memo-izing the action to map a debug location to its function summary..

It seems the compile-time saving comes mainly from multiple instructions sharing the same debug location. Is my understanding correct?

Jan 14 2019, 9:18 AM

Jan 11 2019

wmi updated the diff for D54819: [SampleFDO] Skip profile reading when flatten profile is used in ThinLTO postlink phase.

Address Teresa's comments.

Jan 11 2019, 6:46 PM
wmi added a comment to D54819: [SampleFDO] Skip profile reading when flatten profile is used in ThinLTO postlink phase.

Yes, the support is also helpful for the old pass manager. Added some code to make it easier to enable the SampleProfileLoader pass in the old pass manager pipeline through opt, mainly for testing purposes.

Jan 11 2019, 6:44 PM

Jan 7 2019

wmi committed rL350586: [RegisterCoalescer] dst register's live interval needs to be updated when.
Jan 7 2019, 4:30 PM
wmi closed D55867: [RegisterCoalescer] dst register's live interval needs to be updated when merging a src register in ToBeUpdated set.
Jan 7 2019, 4:30 PM
wmi added a comment to D55867: [RegisterCoalescer] dst register's live interval needs to be updated when merging a src register in ToBeUpdated set.

Friendly ping. Please take another look.

Jan 7 2019, 12:35 PM

Jan 2 2019

wmi committed rL350223: [PowerPC] Remove SeenUse check when optimizing conditional branch in.
Jan 2 2019, 9:11 AM
wmi closed D56041: [PowerPC] Fix a bug when optimizing conditional branch in PPCPreEmitPeephole pass.
Jan 2 2019, 9:10 AM

Dec 21 2018

wmi created D56041: [PowerPC] Fix a bug when optimizing conditional branch in PPCPreEmitPeephole pass.
Dec 21 2018, 5:25 PM
wmi added inline comments to D55867: [RegisterCoalescer] dst register's live interval needs to be updated when merging a src register in ToBeUpdated set.
Dec 21 2018, 11:36 AM
wmi updated the diff for D55867: [RegisterCoalescer] dst register's live interval needs to be updated when merging a src register in ToBeUpdated set.

Address Quentin's comment.

Dec 21 2018, 11:34 AM

Dec 18 2018

wmi updated the diff for D55867: [RegisterCoalescer] dst register's live interval needs to be updated when merging a src register in ToBeUpdated set.

A minor update to the comment.

Dec 18 2018, 4:53 PM
wmi created D55867: [RegisterCoalescer] dst register's live interval needs to be updated when merging a src register in ToBeUpdated set.
Dec 18 2018, 4:39 PM

Dec 17 2018

wmi added a comment to D55681: [llvm] API for encoding/decoding DWARF discriminators..

LGTM from my limited knowledge of discriminators. I added dblaikie to help with the review.

Dec 17 2018, 11:22 AM
wmi added a reviewer for D55681: [llvm] API for encoding/decoding DWARF discriminators.: dblaikie.
Dec 17 2018, 9:38 AM

Dec 13 2018

wmi committed rL349088: [SampleFDO] handle ProfileSampleAccurate when initializing function entry count.
Dec 13 2018, 1:55 PM
wmi closed D55660: [SampleFDO] handle ProfileSampleAccurate when initializing function entry count.
Dec 13 2018, 1:54 PM
wmi updated the diff for D55660: [SampleFDO] handle ProfileSampleAccurate when initializing function entry count.

Fix test/Transforms/SampleProfile/inline-cold-callsite-samplepgo.ll, which was oversimplified in the last revision.

Dec 13 2018, 11:27 AM
wmi updated the diff for D55660: [SampleFDO] handle ProfileSampleAccurate when initializing function entry count.

Address Easwaran and Teresa's comments.

Dec 13 2018, 11:16 AM
wmi added a comment to D55660: [SampleFDO] handle ProfileSampleAccurate when initializing function entry count.

Could you add a test case or augment an existing test to check that the function entry count is 0 if -sample-profile and -profile-sample-accurate are specified?

Dec 13 2018, 11:14 AM
wmi created D55660: [SampleFDO] handle ProfileSampleAccurate when initializing function entry count.
Dec 13 2018, 9:35 AM