danielcdh (Dehao Chen)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 26 2015, 2:20 PM (87 w, 3 d)

Recent Activity

Fri, Apr 28

danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

move the pass till the end of the optimizations.

Fri, Apr 28, 10:51 AM
danielcdh retitled D32563: Add LiveRangeShrink pass to shrink live range within BB. from Improve code placement algorithm in Reassociate pass. to Add LiveRangeShrink pass to shrink live range within BB..
Fri, Apr 28, 10:17 AM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

Update the patch to have a separate pass to handle live range shrinking within BB.

Fri, Apr 28, 10:13 AM

Thu, Apr 27

danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Can you please describe the actual algorithm you are trying to use for placement here?

Also, do you have performance numbers?

Thu, Apr 27, 1:01 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

Thu, Apr 27, 1:00 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Can you please describe the actual algorithm you are trying to use for placement here?

Thu, Apr 27, 12:59 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

Thu, Apr 27, 12:58 PM

Wed, Apr 26

danielcdh updated the summary of D32563: Add LiveRangeShrink pass to shrink live range within BB..
Wed, Apr 26, 3:10 PM
danielcdh created D32563: Add LiveRangeShrink pass to shrink live range within BB..
Wed, Apr 26, 2:46 PM

Thu, Apr 20

danielcdh updated the diff for D32315: Introduce a new DWARFContext::getInliningInfoForAddress API to expose pointers to strings stored in DWARF file..

set initial value for local variables.

Thu, Apr 20, 3:15 PM
danielcdh created D32315: Introduce a new DWARFContext::getInliningInfoForAddress API to expose pointers to strings stored in DWARF file..
Thu, Apr 20, 3:11 PM

Wed, Apr 19

danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

updated in NFC commit r300753.

Wed, Apr 19, 2:05 PM
danielcdh closed D32177: Using address range map to speedup finding inline stack for address..
Wed, Apr 19, 1:22 PM
danielcdh added inline comments to D32236: PR32710: Disable using PMADDWD for unsigned short..
Wed, Apr 19, 1:21 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update

Wed, Apr 19, 1:07 PM
danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

Will have a separate patch to address other concerns.

Wed, Apr 19, 1:07 PM
danielcdh closed D32236: PR32710: Disable using PMADDWD for unsigned short..
Wed, Apr 19, 1:03 PM
danielcdh updated the diff for D32236: PR32710: Disable using PMADDWD for unsigned short..

update test

Wed, Apr 19, 1:03 PM
danielcdh added a comment to D32236: PR32710: Disable using PMADDWD for unsigned short..

Hi Dehao (and Michael),

Test madd.ll has wrong CHECK-LABEL comments. For example, I see that "label" checks are lower-case. Those should be all upper-case.
I think that those checks are never used in practice.

To avoid problems with FileCheck, I also suggest adding an explicit check line for the 'ret' statement. That way, we know exactly the range of instructions where (v)pmaddwd should not appear.

-Andrea

Wed, Apr 19, 1:02 PM
danielcdh created D32236: PR32710: Disable using PMADDWD for unsigned short..
Wed, Apr 19, 11:41 AM
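
The revision title does not say why PMADDWD is unsafe for unsigned short. One likely reason is that the instruction interprets its 16-bit inputs as signed, so any unsigned value of 0x8000 or above is read as negative. The following is a minimal scalar sketch of one output lane under that assumption. It is not code from D32236, and the function names are hypothetical.

```cpp
#include <cstdint>

// Reinterpret a 16-bit pattern as a signed value, the way PMADDWD reads it.
constexpr int64_t as_signed16(uint16_t v) {
  return v >= 0x8000 ? int64_t(v) - 0x10000 : int64_t(v);
}

// Scalar model of one PMADDWD output lane: two signed 16x16 products, added.
// Computed in 64 bits so the model itself never overflows; the one input
// combination that wraps in hardware (all four inputs 0x8000) is ignored.
constexpr int64_t pmaddwd_lane(uint16_t a0, uint16_t b0,
                               uint16_t a1, uint16_t b1) {
  return as_signed16(a0) * as_signed16(b0) + as_signed16(a1) * as_signed16(b1);
}

// What a loop over unsigned short operands actually needs.
constexpr int64_t unsigned_dot2(uint16_t a0, uint16_t b0,
                                uint16_t a1, uint16_t b1) {
  return int64_t(a0) * b0 + int64_t(a1) * b1;
}
```

With inputs below 0x8000 the two functions agree. With a0 = 40000 and b0 = 2 (other inputs zero), pmaddwd_lane gives -51072 while unsigned_dot2 gives 80000.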
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

Updated the logic and removed the assertion. Added a unittest to cover the symbolization of the padding zone.

Wed, Apr 19, 11:20 AM
danielcdh reopened D32177: Using address range map to speedup finding inline stack for address..

Reverted as this breaks buildbot: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/3369

Wed, Apr 19, 11:19 AM
danielcdh closed D32177: Using address range map to speedup finding inline stack for address..
Wed, Apr 19, 8:03 AM

Tue, Apr 18

danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

Now the assertion is enabled to ensure the behavior does not change. PTAL. Thanks!

Tue, Apr 18, 4:05 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update the code to ignore 0-sized ranges.

Tue, Apr 18, 4:02 PM
danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

hold on... the assertion actually triggers... looking into why...

Tue, Apr 18, 2:33 PM
danielcdh added inline comments to D32177: Using address range map to speedup finding inline stack for address..
Tue, Apr 18, 2:32 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

remove unintended change...

Tue, Apr 18, 2:32 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update

Tue, Apr 18, 2:31 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update comment

Tue, Apr 18, 2:05 PM
danielcdh added inline comments to D32177: Using address range map to speedup finding inline stack for address..
Tue, Apr 18, 2:02 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update

Tue, Apr 18, 2:02 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update with more comments.

Tue, Apr 18, 12:44 PM
danielcdh added inline comments to D32177: Using address range map to speedup finding inline stack for address..
Tue, Apr 18, 10:34 AM
danielcdh created D32177: Using address range map to speedup finding inline stack for address..
Tue, Apr 18, 10:02 AM

Mon, Apr 17

danielcdh closed D31952: Build SymbolMap in SampleProfileLoader to help matchin function names with suffix..
Mon, Apr 17, 3:35 PM
danielcdh added inline comments to D31952: Build SymbolMap in SampleProfileLoader to help matchin function names with suffix..
Mon, Apr 17, 2:54 PM
danielcdh updated the diff for D31952: Build SymbolMap in SampleProfileLoader to help matchin function names with suffix..

update

Mon, Apr 17, 2:54 PM
danielcdh added inline comments to D31952: Build SymbolMap in SampleProfileLoader to help matchin function names with suffix..
Mon, Apr 17, 2:33 PM
danielcdh updated the diff for D31952: Build SymbolMap in SampleProfileLoader to help matchin function names with suffix..

add comment

Mon, Apr 17, 2:33 PM
danielcdh closed D32134: Add GNU_discriminator support for inline callsites in llvm-symbolizer..
Mon, Apr 17, 1:23 PM
danielcdh created D32134: Add GNU_discriminator support for inline callsites in llvm-symbolizer..
Mon, Apr 17, 1:14 PM
danielcdh updated the diff for D31952: Build SymbolMap in SampleProfileLoader to help matchin function names with suffix..

rebase

Mon, Apr 17, 10:38 AM

Fri, Apr 14

danielcdh accepted D32008: [SampleProfile] Skip intrinsic calls when visiting callsites in InlineHotFunctions. NFC..
Fri, Apr 14, 7:47 PM

Thu, Apr 13

danielcdh closed D31950: SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name.
Thu, Apr 13, 1:04 PM
danielcdh added inline comments to D31950: SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name.
Thu, Apr 13, 10:13 AM
danielcdh updated the diff for D31950: SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name.

update

Thu, Apr 13, 10:02 AM
danielcdh added inline comments to D31950: SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name.
Thu, Apr 13, 10:02 AM

Tue, Apr 11

danielcdh accepted D31900: [AddDiscriminators] Assign discriminators to memset/memcpy/memmove intrinsic calls..
Tue, Apr 11, 11:30 AM
danielcdh created D31952: Build SymbolMap in SampleProfileLoader to help matchin function names with suffix..
Tue, Apr 11, 11:19 AM
danielcdh created D31950: SamplePGO: convert callsite samples map key from callsite_location to callsite_location+callee_name.
Tue, Apr 11, 10:30 AM
danielcdh added a comment to D31900: [AddDiscriminators] Assign discriminators to memset/memcpy/memmove intrinsic calls..

Thanks for the explanation. This makes sense. We try to avoid IntrinsicInst for two reasons:

  1. avoid non-deterministic discriminator assignment across different debug levels
  2. minimize the number of base discriminators

Apparently, bypassing all IntrinsicInst is overkill for solving #1; in fact, it causes the bug shown in the unittest, which your patch fixes. But for #2, your patch introduces extra base discriminators that are unnecessary. I suggest keeping only the first callsite of shouldHaveDiscriminator and reverting the 2nd callsite to just check IntrinsicInst. Please add a comment at both places to explain the motivation (the first place, i.e. the call to shouldHaveDiscriminator, aims to address #1; the 2nd place, i.e. checking IntrinsicInst, aims to address #1 and #2). Also, in the unittest, please check explicitly that the discriminator for the memcpy intrinsic is the same as that of the other instructions in that BB.

Thanks for the feedback.
I will add the comments you have requested and I will restore the 2nd callsite to the check for IntrinsicInst.

Do you want me to keep test memcpy-discriminator.ll in the final patch?

About the unit test: is there a template that I can use for this particular case? I have never had to write an llvm unittest before, so I don't know where to look for examples.

Tue, Apr 11, 9:49 AM
danielcdh added a comment to D31900: [AddDiscriminators] Assign discriminators to memset/memcpy/memmove intrinsic calls..

Thanks for the explanation. This makes sense. We try to avoid IntrinsicInst for two reasons:

Tue, Apr 11, 9:02 AM
danielcdh added a comment to D31900: [AddDiscriminators] Assign discriminators to memset/memcpy/memmove intrinsic calls..

Thanks for explaining. My understanding is that memcpy will introduce a new basic block, which, without this patch, will share the same discriminator with another basic block. As a result, that basic block is incorrectly annotated, so the block placement is suboptimal. If my understanding is correct, could you include that in the unittest?

Tue, Apr 11, 6:52 AM

Mon, Apr 10

danielcdh added a comment to D31900: [AddDiscriminators] Assign discriminators to memset/memcpy/memmove intrinsic calls..

I don't quite understand why you want to set a new discriminator for the memcpy builtin. What optimization would it enable? For normal function calls, we need a discriminator to distinguish callsites in the same BB so that we can annotate the inlined callee correctly. But for the memcpy case, it looks like adding a new discriminator does not help downstream optimizations like inlining?

Mon, Apr 10, 7:19 PM
danielcdh closed D31826: Emit less compiler optimization remarks in samplepgo to reduce a call to findCalleeFunctionSamples which is going to be refactored..
Mon, Apr 10, 2:01 PM

Fri, Apr 7

danielcdh created D31826: Emit less compiler optimization remarks in samplepgo to reduce a call to findCalleeFunctionSamples which is going to be refactored..
Fri, Apr 7, 12:38 PM
danielcdh closed D31679: Use PMADDWD to expand reduction in a loop.
Fri, Apr 7, 8:54 AM

Thu, Apr 6

danielcdh updated the diff for D31679: Use PMADDWD to expand reduction in a loop.

update

Thu, Apr 6, 7:25 PM
danielcdh added inline comments to D31679: Use PMADDWD to expand reduction in a loop.
Thu, Apr 6, 7:25 PM
danielcdh updated the diff for D31679: Use PMADDWD to expand reduction in a loop.

simplify test

Thu, Apr 6, 8:31 AM
danielcdh added a comment to D31679: Use PMADDWD to expand reduction in a loop.
In D31679#719786, @zvi wrote:

Thanks for working on this patch. Regarding support for PMADDUBSW, can we match something like the following?

for (int i = 0; i < count; i++) {
  a = saturate(a + x[i] * y[i]);
}
Thu, Apr 6, 8:31 AM

Tue, Apr 4

danielcdh retitled D31679: Use PMADDWD to expand reduction in a loop from Support PMADDWD and PMADDUBSW to Use PMADDWD to expand reduction in a loop.
Tue, Apr 4, 5:02 PM
danielcdh added inline comments to D31679: Use PMADDWD to expand reduction in a loop.
Tue, Apr 4, 5:00 PM
danielcdh updated the diff for D31679: Use PMADDWD to expand reduction in a loop.

remove the support for PMADDUBSW as it cannot handle the overflow case.

Tue, Apr 4, 5:00 PM
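
The overflow concern in the note above can be illustrated with a scalar model of one PMADDUBSW output lane: the instruction multiplies unsigned bytes by signed bytes and adds adjacent products with signed saturation to 16 bits, so a plain (non-saturating) reduction cannot be mapped onto it for every input. This is a minimal sketch, not code from D31679, and the function names are hypothetical.

```cpp
#include <cstdint>

// Exact sum of two u8 x s8 products, computed in 32 bits (cannot overflow).
constexpr int32_t exact_dot2(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return int32_t(a0) * b0 + int32_t(a1) * b1;
}

// Clamp a 32-bit value to the signed 16-bit range.
constexpr int32_t saturate_s16(int32_t v) {
  return v > 32767 ? 32767 : (v < -32768 ? -32768 : v);
}

// Scalar model of one PMADDUBSW output lane: the pair sum is saturated
// to 16 bits, so large inputs lose information.
constexpr int32_t pmaddubsw_lane(uint8_t a0, int8_t b0, uint8_t a1, int8_t b1) {
  return saturate_s16(exact_dot2(a0, b0, a1, b1));
}
```

For small inputs the two agree, but pmaddubsw_lane(255, 127, 255, 127) yields 32767 while the exact sum is 64770.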
danielcdh created D31679: Use PMADDWD to expand reduction in a loop.
Tue, Apr 4, 2:05 PM

Mar 31 2017

danielcdh closed D31344: Fix the InstCombine to reserve the VP metadata and sets correct call count..
Mar 31 2017, 9:12 AM
danielcdh updated the diff for D31344: Fix the InstCombine to reserve the VP metadata and sets correct call count..

rebase and update

Mar 31 2017, 9:12 AM

Mar 24 2017

danielcdh created D31344: Fix the InstCombine to reserve the VP metadata and sets correct call count..
Mar 24 2017, 9:47 AM

Mar 23 2017

danielcdh closed D31310: Fix trellis layout to avoid mis-identify triangle..
Mar 23 2017, 4:40 PM
danielcdh closed D31143: Set the prof weight correctly for call instructions in DeadArgumentElimination..
Mar 23 2017, 4:38 PM
danielcdh closed D31225: Use isFunctionHotInCallGraph to set the function section prefix..
Mar 23 2017, 4:26 PM
danielcdh retitled D31310: Fix trellis layout to avoid mis-identify triangle. from Fix trellis layout when there is triangle. to Fix trellis layout to avoid mis-identify triangle..
Mar 23 2017, 4:26 PM
danielcdh updated the diff for D31310: Fix trellis layout to avoid mis-identify triangle..

update

Mar 23 2017, 4:09 PM
danielcdh closed D31219: Update the SamplePGO test to verify that unroll/icp is not invoked in thinlto compile phase..
Mar 23 2017, 2:32 PM
danielcdh closed D31217: Disable loop unrolling and icp in SamplePGO ThinLTO compile phase.
Mar 23 2017, 2:32 PM
danielcdh created D31310: Fix trellis layout to avoid mis-identify triangle..
Mar 23 2017, 2:20 PM
danielcdh closed D31228: Do not set branch weight if the branch weight annotation is present..
Mar 23 2017, 7:55 AM

Mar 22 2017

danielcdh updated the diff for D31225: Use isFunctionHotInCallGraph to set the function section prefix..

update

Mar 22 2017, 5:20 PM
danielcdh updated the diff for D31143: Set the prof weight correctly for call instructions in DeadArgumentElimination..

update

Mar 22 2017, 5:14 PM
danielcdh updated the diff for D31225: Use isFunctionHotInCallGraph to set the function section prefix..

update

Mar 22 2017, 4:18 PM
danielcdh added inline comments to D31225: Use isFunctionHotInCallGraph to set the function section prefix..
Mar 22 2017, 3:59 PM
danielcdh updated the diff for D31225: Use isFunctionHotInCallGraph to set the function section prefix..

update

Mar 22 2017, 3:59 PM
danielcdh added a comment to D31225: Use isFunctionHotInCallGraph to set the function section prefix..

Updated the patch to change for the cold prefix too.

Mar 22 2017, 2:31 PM
danielcdh updated the diff for D31225: Use isFunctionHotInCallGraph to set the function section prefix..

update

Mar 22 2017, 2:30 PM
danielcdh updated the diff for D31143: Set the prof weight correctly for call instructions in DeadArgumentElimination..

update

Mar 22 2017, 2:06 PM
danielcdh added inline comments to D31143: Set the prof weight correctly for call instructions in DeadArgumentElimination..
Mar 22 2017, 2:06 PM

Mar 21 2017

danielcdh created D31228: Do not set branch weight if the branch weight annotation is present..
Mar 21 2017, 6:15 PM
danielcdh created D31225: Use isFunctionHotInCallGraph to set the function section prefix..
Mar 21 2017, 5:33 PM
danielcdh created D31219: Update the SamplePGO test to verify that unroll/icp is not invoked in thinlto compile phase..
Mar 21 2017, 3:37 PM
danielcdh added inline comments to D31217: Disable loop unrolling and icp in SamplePGO ThinLTO compile phase.
Mar 21 2017, 3:17 PM
danielcdh created D31217: Disable loop unrolling and icp in SamplePGO ThinLTO compile phase.
Mar 21 2017, 2:55 PM
danielcdh closed D31213: Add support for -fno-auto-profile and -fno-profile-sample-use.
Mar 21 2017, 2:53 PM
danielcdh updated the diff for D31213: Add support for -fno-auto-profile and -fno-profile-sample-use.

add more test

Mar 21 2017, 2:51 PM
danielcdh created D31213: Add support for -fno-auto-profile and -fno-profile-sample-use.
Mar 21 2017, 2:36 PM
danielcdh closed D31202: Clang change: Do not inline hot callsites for samplepgo in thinlto compile phase..
Mar 21 2017, 1:07 PM
danielcdh closed D31201: Do not inline hot callsites for samplepgo in thinlto compile phase..
Mar 21 2017, 1:07 PM
danielcdh created D31202: Clang change: Do not inline hot callsites for samplepgo in thinlto compile phase..
Mar 21 2017, 12:34 PM
danielcdh created D31201: Do not inline hot callsites for samplepgo in thinlto compile phase..
Mar 21 2017, 12:33 PM
danielcdh closed D31154: Use ProfileSummary:getProfileCount to get ScaledCount for ModuleSummary.
Mar 21 2017, 10:34 AM