Page MenuHomePhabricator

danielcdh (Dehao Chen)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 26 2015, 2:20 PM (172 w, 4 d)

Recent Activity

Mon, Nov 19

danielcdh accepted D54659: [LoopSink] Add preheader to alias set.
Mon, Nov 19, 9:07 AM

Sep 14 2018

danielcdh accepted D51863: [SampleFDO] Add FunctionOffsetTable in compact binary format profile..
Sep 14 2018, 11:47 AM

Sep 10 2018

danielcdh accepted D51863: [SampleFDO] Add FunctionOffsetTable in compact binary format profile..
Sep 10 2018, 10:20 AM

Sep 6 2018

danielcdh accepted D51643: [SampleFDO] Make sample profile loader unaware of compact format change.
Sep 6 2018, 2:38 PM
danielcdh added inline comments to D51643: [SampleFDO] Make sample profile loader unaware of compact format change.
Sep 6 2018, 10:36 AM
danielcdh added inline comments to D51643: [SampleFDO] Make sample profile loader unaware of compact format change.
Sep 6 2018, 10:30 AM

Aug 7 2018

danielcdh accepted D50370: [SampleFDO] Fix a bug in getOrCompHotCountThreshold/getOrCompColdCountThreshold.
Aug 7 2018, 8:42 AM

Jun 8 2018

danielcdh accepted D47955: [SampleFDO] Add a new compact binary format for sample profile.

This is great stuff!

Jun 8 2018, 4:26 PM

May 10 2018

danielcdh accepted D45377: [SampleFDO] Don't let inliner treat warm callsite with inline instance in the profile as cold.
May 10 2018, 1:09 PM

Apr 30 2018

danielcdh added inline comments to D45377: [SampleFDO] Don't let inliner treat warm callsite with inline instance in the profile as cold.
Apr 30 2018, 6:41 PM
danielcdh accepted D46275: [JumpThreading] ComputeValueKnownInPredecessors only handles the case when phi is on lhs of a comparison op.
Apr 30 2018, 6:35 PM

Apr 16 2018

danielcdh added a comment to D45377: [SampleFDO] Don't let inliner treat warm callsite with inline instance in the profile as cold.
In D45377#1068853, @wmi wrote:

Thanks for the fix!

I think maybe a preferred way to fix this is to change SampleProfileLoader::inlineHotFunctions to inline these "warm" inlined callsites early. The current algorithm uses callsiteIsHot, which compares inline instance's total count to the caller's total count, which could be misleading if the caller is super large/hot. A better algorithm should compare inline instance's total count to PSI to get a global hotness. In this way, if the profile annotator thinks a callsite is not hot, the later inliner should *not* even try to inline it. This makes the design cleaner and more stable. WDYT?

I tried the idea to compute the inline instance's total count divided by its bb count, and compare the division result to PSI hot threshold. That improved the regression benchmark but did not recover the whole regression. That is why I choosed to keep the current callsiteIsHot check in early inliner unchanged because I guessed regular inliner may have a better position to decide whether to inline such warm/medium size callsite.

Apr 16 2018, 9:42 AM

Apr 15 2018

danielcdh added a comment to D45377: [SampleFDO] Don't let inliner treat warm callsite with inline instance in the profile as cold.

Thanks for the fix!

Apr 15 2018, 6:33 PM

Mar 6 2018

danielcdh accepted D44161: [SampleFDO] Extend SampleProfReader to handle demangled names.

Just to confirm, this only applies to text format. Binary format should not have this problem, right?

Mar 6 2018, 11:02 AM

Dec 22 2017

danielcdh added a comment to D41466: [Unroll][DebugInfo] Propagate loop body's debug location to epilog preheader.

The change makes sense to me. It should not affect SamplePGO as terminator instructions is not used in profile annotation.

Dec 22 2017, 8:01 AM · debug-info

Dec 15 2017

danielcdh accepted D41307: [PGO] Fix handling of cold entry count for instrumented PGO.

LGTM, I'm curious what is the code size impact on PGO binaries (I'd expect it to reduce code size significantly)

Dec 15 2017, 9:33 PM

Dec 4 2017

danielcdh added inline comments to D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Dec 4 2017, 9:15 PM
danielcdh added a reviewer for D40413: [CodeExtractor] Add debug locations for new call and branch instrs.: wmi.
Dec 4 2017, 2:12 PM
danielcdh accepted D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

The patch LGTM. Notes about SamplePGO use cases below, which may be addressed by separate patch/discussion, if necessary.

Dec 4 2017, 2:12 PM

Nov 10 2017

danielcdh accepted D39923: [ThinLTO] Handle -fdebug-pass-manager for backend invocations via clang.

Thanks for the fix.

Nov 10 2017, 2:15 PM

Nov 6 2017

danielcdh closed D38763: Include already promoted counts when computing SUM for VP..
Nov 6 2017, 11:53 AM

Nov 1 2017

danielcdh closed D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..
Nov 1 2017, 1:27 PM
danielcdh updated the diff for D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

update

Nov 1 2017, 11:40 AM
danielcdh added inline comments to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..
Nov 1 2017, 11:40 AM
danielcdh updated the diff for D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

update

Nov 1 2017, 11:15 AM
danielcdh added a comment to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

If the indirect calls along this chain were promoted and then inlined into foo in the profiled binary, we cannot do the promotion/inlining of the call back to A.c:baz() during the compile step since it is intra-module, but we need to be able to annotate the call somewhere. And apparently we can't do this on the VP metadata for the indirect call in B.c:bar() since it was inlined into foo() in the profiled binary (Dehao - is that correct?).

That's correct

Nov 1 2017, 11:15 AM

Oct 31 2017

danielcdh added a comment to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Because we need to handle the indirect call chain too. E.g. foo() indirectly calls bar(), which indirectly calls baz(). Assuming these 2 calls are ICPed and inlined in the profiling binary, when processing foo's indirect callsite, marking bar's GUID in the VP is not enough, but adding baz's GUID to foo's callsite does not make sense as baz is not its call target. That's why we need to keep the transitive closure as function level metadata.

But the case here is when they are in fact in the same module. I'm assuming the situation you describe here already works when bar and baz are in different modules than foo. In this case we have foo and bar in the same module, and were missing the VP metadata showing that foo calls bar indirectly. If baz is also in the same module then the same situation applies to the callsite in bar. If it is in a different module then wouldn't the existing logic handle it? I.e. it would be on bar's function entry metadata?

Oct 31 2017, 3:34 PM
danielcdh added a comment to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Oct 31 2017, 3:12 PM
danielcdh created D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..
Oct 31 2017, 2:46 PM

Oct 26 2017

danielcdh closed D39343: Do not add discriminator encoding for debug intrinsics..
Oct 26 2017, 2:21 PM
danielcdh created D39343: Do not add discriminator encoding for debug intrinsics..
Oct 26 2017, 1:23 PM

Oct 10 2017

danielcdh created D38763: Include already promoted counts when computing SUM for VP..
Oct 10 2017, 2:21 PM
danielcdh closed D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 10 2017, 2:13 PM
danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

extract the change that update the SUM of VP.

Oct 10 2017, 10:39 AM

Oct 9 2017

danielcdh added inline comments to D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 9 2017, 9:26 AM
danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

update comment

Oct 9 2017, 9:26 AM

Oct 6 2017

danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

rebase

Oct 6 2017, 10:09 AM
danielcdh closed D38603: Directly return promoted direct call instead of rely on stripPointerCast..
Oct 6 2017, 10:06 AM
danielcdh updated the diff for D38603: Directly return promoted direct call instead of rely on stripPointerCast..

update comments

Oct 6 2017, 8:54 AM
danielcdh added a comment to D38603: Directly return promoted direct call instead of rely on stripPointerCast..

Thanks.

Oct 6 2017, 8:54 AM

Oct 5 2017

danielcdh created D38603: Directly return promoted direct call instead of rely on stripPointerCast..
Oct 5 2017, 2:50 PM
danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

rebase

Oct 5 2017, 1:18 PM
danielcdh closed D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 5 2017, 1:17 PM
danielcdh updated the summary of D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 5 2017, 12:38 PM

Oct 3 2017

danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

update comment

Oct 3 2017, 12:48 PM
danielcdh added inline comments to D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 3 2017, 12:48 PM
danielcdh added inline comments to D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 3 2017, 11:28 AM
danielcdh updated the diff for D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..

update comment

Oct 3 2017, 11:28 AM

Oct 2 2017

danielcdh created D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 2 2017, 1:52 PM
danielcdh created D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 2 2017, 1:35 PM
danielcdh closed D37877: Update getMergedLocation to check the instruction type and merge properly..
Oct 2 2017, 11:14 AM
danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update comments

Oct 2 2017, 10:33 AM

Sep 30 2017

danielcdh closed D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..
Sep 30 2017, 10:26 PM
danielcdh updated the diff for D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

update

Sep 30 2017, 2:51 PM
danielcdh updated the diff for D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

rebase and address Teresa's comments

Sep 30 2017, 2:07 PM
danielcdh added a comment to D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

Is this patch NFC? It looks like there is some new handling for indirect call targets in findImportedFunctions that makes it non-NFC. Could you split this into an NFC refactoring patch and a separate non-NFC patch with the enhancements if so? I think that would be clearer and easier to review.

Sep 30 2017, 2:06 PM

Sep 29 2017

danielcdh added a comment to D37877: Update getMergedLocation to check the instruction type and merge properly..

Adrian, please let me know if you have further concerns about this patch.

Sep 29 2017, 5:39 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

I see your point now. My concern is performance: if we allow hoisting of atomic load, but not allow sinking it, we may end up with bad performance as we may have too much redundant atomic loads in the preheader. Any suggestions on how to solve that?

Not really. I can say that we've been running performance tests for months in this configuration (LICM hoisting unordered loads, no LoopSink) without noticing any problems. I'm not immediately concerned.

Sep 29 2017, 5:31 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

I see your point now. My concern is performance: if we allow hoisting of atomic load, but not allow sinking it, we may end up with bad performance as we may have too much redundant atomic loads in the preheader. Any suggestions on how to solve that?

Sep 29 2017, 4:36 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

In the following example:

Sep 29 2017, 12:35 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

How about the following case:

Sep 29 2017, 9:20 AM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

Is it legal to hoist an atomic load out of the loop?

Sep 29 2017, 9:04 AM
danielcdh accepted D38405: [ThinLTO] Use decimal suffix for promoted values to match demanglers.
Sep 29 2017, 8:56 AM

Sep 20 2017

danielcdh updated the diff for D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

refactor

Sep 20 2017, 12:38 PM
danielcdh created D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..
Sep 20 2017, 12:27 PM
danielcdh accepted D38086: [ThinLTO] Fix dead stripping analysis for SamplePGO.

LGTM, Thanks!

Sep 20 2017, 9:38 AM

Sep 19 2017

danielcdh closed D36637: Import all inlined indirect call targets for SamplePGO..
Sep 19 2017, 2:19 PM
danielcdh added inline comments to D36637: Import all inlined indirect call targets for SamplePGO..
Sep 19 2017, 1:39 PM
danielcdh updated the diff for D36637: Import all inlined indirect call targets for SamplePGO..

update

Sep 19 2017, 1:39 PM
danielcdh closed D38018: Handle profile mismatch correctly for SamplePGO..
Sep 19 2017, 11:28 AM
danielcdh updated the diff for D38018: Handle profile mismatch correctly for SamplePGO..

update

Sep 19 2017, 10:48 AM
danielcdh added inline comments to D38018: Handle profile mismatch correctly for SamplePGO..
Sep 19 2017, 10:47 AM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 19 2017, 9:14 AM

Sep 18 2017

danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update

Sep 18 2017, 7:02 PM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 18 2017, 7:02 PM
danielcdh created D38018: Handle profile mismatch correctly for SamplePGO..
Sep 18 2017, 6:26 PM

Sep 15 2017

danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update the API

Sep 15 2017, 6:23 PM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 15 2017, 6:23 PM

Sep 14 2017

danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update test

Sep 14 2017, 5:52 PM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 14 2017, 5:52 PM
danielcdh created D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 14 2017, 5:00 PM
danielcdh closed D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..
Sep 14 2017, 10:31 AM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

fix the msan uninitialized memory access error.

Sep 14 2017, 10:31 AM

Sep 13 2017

danielcdh closed D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..
Sep 13 2017, 2:24 PM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

update

Sep 13 2017, 2:23 PM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

rebase

Sep 13 2017, 2:09 PM
danielcdh accepted D37819: [Inliner] Add another way to compute full inline cost..
Sep 13 2017, 12:31 PM

Sep 12 2017

danielcdh accepted D37783: [ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin link.

Thanks for fixing this!

Sep 12 2017, 11:56 PM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

update

Sep 12 2017, 4:57 PM
danielcdh created D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..
Sep 12 2017, 4:27 PM
danielcdh closed D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:57 PM
danielcdh added inline comments to D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:57 PM
danielcdh updated the diff for D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..

update

Sep 12 2017, 2:51 PM
danielcdh updated the summary of D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:51 PM
danielcdh created D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:24 PM

Sep 5 2017

danielcdh added a comment to D37463: Fix miscompile in LoopSink pass.

Thanks for raising the issue.

Sep 5 2017, 9:20 AM

Sep 3 2017

danielcdh added a comment to D37364: [DWARF] Line 0 should not have a discriminator.

The change looks good to me. But the test is quite large and hard to understand. I don't think it's necessary to have the test generated from a source file, just forge a test with discriminator attached to line 0, and then checks if discriminators is omitted in the generated code. I guess it should suffice with < 20 LOC

Sep 3 2017, 5:35 PM

Aug 29 2017

danielcdh accepted D37246: [PGO] Fixed non-determinism with DenseSet storing function importing info..

The sorting is not a concern as it's only called once perf function, and the list should not be large.

Aug 29 2017, 9:21 AM
danielcdh closed D37252: Add null check for promoted direct call.
Aug 29 2017, 8:29 AM

Aug 28 2017

danielcdh updated the diff for D37252: Add null check for promoted direct call.

update

Aug 28 2017, 11:41 PM