danielcdh (Dehao Chen)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 26 2015, 2:20 PM (119 w, 6 d)

Recent Activity

Mon, Dec 4

danielcdh added inline comments to D40413: [CodeExtractor] Add debug locations for new call and branch instrs..
Mon, Dec 4, 9:15 PM
danielcdh added a reviewer for D40413: [CodeExtractor] Add debug locations for new call and branch instrs.: wmi.
Mon, Dec 4, 2:12 PM
danielcdh accepted D40413: [CodeExtractor] Add debug locations for new call and branch instrs..

The patch LGTM. Notes about SamplePGO use cases below, which may be addressed by separate patch/discussion, if necessary.

Mon, Dec 4, 2:12 PM

Nov 10 2017

danielcdh accepted D39923: [ThinLTO] Handle -fdebug-pass-manager for backend invocations via clang.

Thanks for the fix.

Nov 10 2017, 2:15 PM

Nov 6 2017

danielcdh closed D38763: Include already promoted counts when computing SUM for VP..
Nov 6 2017, 11:53 AM

Nov 1 2017

danielcdh closed D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..
Nov 1 2017, 1:27 PM
danielcdh updated the diff for D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

update

Nov 1 2017, 11:40 AM
danielcdh added inline comments to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..
Nov 1 2017, 11:40 AM
danielcdh updated the diff for D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

update

Nov 1 2017, 11:15 AM
danielcdh added a comment to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

If the indirect calls along this chain were promoted and then inlined into foo in the profiled binary, we cannot do the promotion/inlining of the call back to A.c:baz() during the compile step since it is intra-module, but we need to be able to annotate the call somewhere. And apparently we can't do this on the VP metadata for the indirect call in B.c:bar() since it was inlined into foo() in the profiled binary (Dehao - is that correct?).

That's correct

Nov 1 2017, 11:15 AM

Oct 31 2017

danielcdh added a comment to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Because we need to handle the indirect call chain too. E.g. foo() indirectly calls bar(), which indirectly calls baz(). Assuming these 2 calls are ICPed and inlined in the profiling binary, when processing foo's indirect callsite, marking bar's GUID in the VP is not enough, but adding baz's GUID to foo's callsite does not make sense as baz is not its call target. That's why we need to keep the transitive closure as function level metadata.

But the case here is when they are in fact in the same module. I'm assuming the situation you describe here already works when bar and baz are in different modules than foo. In this case we have foo and bar in the same module, and were missing the VP metadata showing that foo calls bar indirectly. If baz is also in the same module then the same situation applies to the callsite in bar. If it is in a different module then wouldn't the existing logic handle it? I.e. it would be on bar's function entry metadata?

Oct 31 2017, 3:34 PM
danielcdh added a comment to D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..

Does VP metadata get synthesized from the SamplePGO profile? The original bug report had some of the targets on VP metadata attached to the indirect call, but was missing others. Why not include all the targets on that VP metadata rather than marking them for import which seems more hacky.

Oct 31 2017, 3:12 PM
danielcdh created D39480: Include GUIDs from the same module when computing GUIDs that needs to be imported..
Oct 31 2017, 2:46 PM

Oct 26 2017

danielcdh closed D39343: Do not add discriminator encoding for debug intrinsics..
Oct 26 2017, 2:21 PM
danielcdh created D39343: Do not add discriminator encoding for debug intrinsics..
Oct 26 2017, 1:23 PM

Oct 10 2017

danielcdh created D38763: Include already promoted counts when computing SUM for VP..
Oct 10 2017, 2:21 PM
danielcdh closed D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 10 2017, 2:13 PM
danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

extract the change that update the SUM of VP.

Oct 10 2017, 10:39 AM

Oct 9 2017

danielcdh added inline comments to D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 9 2017, 9:26 AM
danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

update comment

Oct 9 2017, 9:26 AM

Oct 6 2017

danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

rebase

Oct 6 2017, 10:09 AM
danielcdh closed D38603: Directly return promoted direct call instead of rely on stripPointerCast..
Oct 6 2017, 10:06 AM
danielcdh updated the diff for D38603: Directly return promoted direct call instead of rely on stripPointerCast..

update comments

Oct 6 2017, 8:54 AM
danielcdh added a comment to D38603: Directly return promoted direct call instead of rely on stripPointerCast..

Thanks.

Oct 6 2017, 8:54 AM

Oct 5 2017

danielcdh created D38603: Directly return promoted direct call instead of rely on stripPointerCast..
Oct 5 2017, 2:50 PM
danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

rebase

Oct 5 2017, 1:18 PM
danielcdh closed D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 5 2017, 1:17 PM
danielcdh updated the summary of D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 5 2017, 12:38 PM

Oct 3 2017

danielcdh updated the diff for D38478: Use the first instruction's count to estimate the funciton's entry frequency..

update comment

Oct 3 2017, 12:48 PM
danielcdh added inline comments to D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 3 2017, 12:48 PM
danielcdh added inline comments to D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 3 2017, 11:28 AM
danielcdh updated the diff for D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..

update comment

Oct 3 2017, 11:28 AM

Oct 2 2017

danielcdh created D38478: Use the first instruction's count to estimate the funciton's entry frequency..
Oct 2 2017, 1:52 PM
danielcdh created D38477: Annotate VP prof on indirect call if it is ICPed in the profiled binary..
Oct 2 2017, 1:35 PM
danielcdh closed D37877: Update getMergedLocation to check the instruction type and merge properly..
Oct 2 2017, 11:14 AM
danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update comments

Oct 2 2017, 10:33 AM

Sep 30 2017

danielcdh closed D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..
Sep 30 2017, 10:26 PM
danielcdh updated the diff for D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

update

Sep 30 2017, 2:51 PM
danielcdh updated the diff for D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

rebase and address Teresa's comments

Sep 30 2017, 2:07 PM
danielcdh added a comment to D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

Is this patch NFC? It looks like there is some new handling for indirect call targets in findImportedFunctions that makes it non-NFC. Could you split this into an NFC refactoring patch and a separate non-NFC patch with the enhancements if so? I think that would be clearer and easier to review.

Sep 30 2017, 2:06 PM

Sep 29 2017

danielcdh added a comment to D37877: Update getMergedLocation to check the instruction type and merge properly..

Adrian, please let me know if you have further concerns about this patch.

Sep 29 2017, 5:39 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

I see your point now. My concern is performance: if we allow hoisting of atomic load, but not allow sinking it, we may end up with bad performance as we may have too much redundant atomic loads in the preheader. Any suggestions on how to solve that?

Not really. I can say that we've been running performance tests for months in this configuration (LICM hoisting unordered loads, no LoopSink) without noticing any problems. I'm not immediately concerned.

Sep 29 2017, 5:31 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

I see your point now. My concern is performance: if we allow hoisting of atomic load, but not allow sinking it, we may end up with bad performance as we may have too much redundant atomic loads in the preheader. Any suggestions on how to solve that?

Sep 29 2017, 4:36 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

In the following example:

Sep 29 2017, 12:35 PM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

How about the following case:

Sep 29 2017, 9:20 AM
danielcdh added a comment to D38392: Disallow sinking of unordered atomic loads into loops.

Is it legal to hoist an atomic load out of the loop?

Sep 29 2017, 9:04 AM
danielcdh accepted D38405: [ThinLTO] Use decimal suffix for promoted values to match demanglers.
Sep 29 2017, 8:56 AM

Sep 20 2017

danielcdh updated the diff for D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..

refactor

Sep 20 2017, 12:38 PM
danielcdh created D38094: Separate the logic when handling indirect calls in SamplePGO ThinLTO compile phase and other phases..
Sep 20 2017, 12:27 PM
danielcdh accepted D38086: [ThinLTO] Fix dead stripping analysis for SamplePGO.

LGTM, Thanks!

Sep 20 2017, 9:38 AM

Sep 19 2017

danielcdh closed D36637: Import all inlined indirect call targets for SamplePGO..
Sep 19 2017, 2:19 PM
danielcdh added inline comments to D36637: Import all inlined indirect call targets for SamplePGO..
Sep 19 2017, 1:39 PM
danielcdh updated the diff for D36637: Import all inlined indirect call targets for SamplePGO..

update

Sep 19 2017, 1:39 PM
danielcdh closed D38018: Handle profile mismatch correctly for SamplePGO..
Sep 19 2017, 11:28 AM
danielcdh updated the diff for D38018: Handle profile mismatch correctly for SamplePGO..

update

Sep 19 2017, 10:48 AM
danielcdh added inline comments to D38018: Handle profile mismatch correctly for SamplePGO..
Sep 19 2017, 10:47 AM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 19 2017, 9:14 AM

Sep 18 2017

danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update

Sep 18 2017, 7:02 PM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 18 2017, 7:02 PM
danielcdh created D38018: Handle profile mismatch correctly for SamplePGO..
Sep 18 2017, 6:26 PM

Sep 15 2017

danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update the API

Sep 15 2017, 6:23 PM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 15 2017, 6:23 PM

Sep 14 2017

danielcdh updated the diff for D37877: Update getMergedLocation to check the instruction type and merge properly..

update test

Sep 14 2017, 5:52 PM
danielcdh added inline comments to D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 14 2017, 5:52 PM
danielcdh created D37877: Update getMergedLocation to check the instruction type and merge properly..
Sep 14 2017, 5:00 PM
danielcdh closed D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..
Sep 14 2017, 10:31 AM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

fix the msan uninitialized memory access error.

Sep 14 2017, 10:31 AM

Sep 13 2017

danielcdh closed D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..
Sep 13 2017, 2:24 PM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

update

Sep 13 2017, 2:23 PM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

rebase

Sep 13 2017, 2:09 PM
danielcdh accepted D37819: [Inliner] Add another way to compute full inline cost..
Sep 13 2017, 12:31 PM

Sep 12 2017

danielcdh accepted D37783: [ThinLTO] For SamplePGO, need to handle ICP targets consistently in thin link.

Thanks for fixing this!

Sep 12 2017, 11:56 PM
danielcdh updated the diff for D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..

update

Sep 12 2017, 4:57 PM
danielcdh created D37779: Invoke GetInlineCost for legality check before inline functions in SampleProfileLoader..
Sep 12 2017, 4:27 PM
danielcdh closed D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:57 PM
danielcdh added inline comments to D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:57 PM
danielcdh updated the diff for D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..

update

Sep 12 2017, 2:51 PM
danielcdh updated the summary of D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:51 PM
danielcdh created D37773: Refactor the code to pass down ACT to SampleProfileLoader correctly..
Sep 12 2017, 2:24 PM

Sep 5 2017

danielcdh added a comment to D37463: Fix miscompile in LoopSink pass.

Thanks for raising the issue.

Sep 5 2017, 9:20 AM

Sep 3 2017

danielcdh added a comment to D37364: [DWARF] Line 0 should not have a discriminator.

The change looks good to me. But the test is quite large and hard to understand. I don't think it's necessary to have the test generated from a source file, just forge a test with discriminator attached to line 0, and then checks if discriminators is omitted in the generated code. I guess it should suffice with < 20 LOC

Sep 3 2017, 5:35 PM

Aug 29 2017

danielcdh accepted D37246: [PGO] Fixed non-determinism with DenseSet storing function importing info..

The sorting is not a concern as it's only called once perf function, and the list should not be large.

Aug 29 2017, 9:21 AM
danielcdh closed D37252: Add null check for promoted direct call.
Aug 29 2017, 8:29 AM

Aug 28 2017

danielcdh updated the diff for D37252: Add null check for promoted direct call.

update

Aug 28 2017, 11:41 PM
danielcdh created D37252: Add null check for promoted direct call.
Aug 28 2017, 11:37 PM
danielcdh added a comment to D37246: [PGO] Fixed non-determinism with DenseSet storing function importing info..

does this change any tests?

Aug 28 2017, 9:13 PM

Aug 24 2017

danielcdh closed D37091: Expose -mllvm -accurate-sample-profile to clang..
Aug 24 2017, 2:38 PM
danielcdh closed D37113: Move accurate-sample-profile into the function attribute..
Aug 24 2017, 2:38 PM
danielcdh updated the diff for D37091: Expose -mllvm -accurate-sample-profile to clang..

update

Aug 24 2017, 1:36 PM
danielcdh updated the diff for D37113: Move accurate-sample-profile into the function attribute..

update

Aug 24 2017, 1:19 PM
danielcdh added a comment to D37113: Move accurate-sample-profile into the function attribute..

How can we test it works for ThinLTO? some end-to-end test is also desired (given that this becomes an external feature).

Aug 24 2017, 1:19 PM
danielcdh updated the diff for D37091: Expose -mllvm -accurate-sample-profile to clang..

Add an end-to-end test.

Aug 24 2017, 1:19 PM
danielcdh updated the diff for D37091: Expose -mllvm -accurate-sample-profile to clang..

update

Aug 24 2017, 1:11 PM
danielcdh created D37113: Move accurate-sample-profile into the function attribute..
Aug 24 2017, 9:33 AM
danielcdh updated the diff for D37091: Expose -mllvm -accurate-sample-profile to clang..

updated the patch to put it into function attribute so that it works with ThinLTO

Aug 24 2017, 9:27 AM

Aug 23 2017

danielcdh updated the diff for D37091: Expose -mllvm -accurate-sample-profile to clang..

add document and test

Aug 23 2017, 7:52 PM
danielcdh created D37091: Expose -mllvm -accurate-sample-profile to clang..
Aug 23 2017, 6:52 PM
danielcdh closed D37084: Add test to cover accurate-sample-profile..
Aug 23 2017, 4:21 PM
danielcdh added a comment to D37084: Add test to cover accurate-sample-profile..

What happens if caller is hot, and the callee (missing profile) is unconditionally called by the caller? Will it still be treated as cold?

Aug 23 2017, 4:10 PM
danielcdh created D37084: Add test to cover accurate-sample-profile..
Aug 23 2017, 4:00 PM