danielcdh (Dehao Chen)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 26 2015, 2:20 PM (95 w, 4 d)

Recent Activity

Wed, Jun 21

danielcdh closed D33341: Enable vectorizer-maximize-bandwidth by default..
Wed, Jun 21, 3:02 PM
danielcdh closed D34456: Do not inline recursive direct calls in sample loader pass..
Wed, Jun 21, 10:58 AM
danielcdh created D34456: Do not inline recursive direct calls in sample loader pass..
Wed, Jun 21, 8:48 AM

Tue, Jun 20

danielcdh added a comment to D33850: Inlining: Don't re-map simplified cloned instructions..

https://reviews.llvm.org/D34017 prevents recursive inline from happening. But even with recursive inlining, it should not trigger compiler error. I think this patch is still needed to fix the underlying issue?

Tue, Jun 20, 10:17 AM

Wed, Jun 14

danielcdh accepted D34219: Allow -profile-guided-section-prefix more than once.
Wed, Jun 14, 1:16 PM
danielcdh updated the diff for D33341: Enable vectorizer-maximize-bandwidth by default..

update tests. PTAL. Thanks!

Wed, Jun 14, 1:02 PM

Tue, Jun 13

danielcdh added a comment to D33341: Enable vectorizer-maximize-bandwidth by default..

Any update on this patch?

Dehao indicated on the llvm-dev thread that he planned to land it by end of day today unless someone else hit issues. Marking as LGTM just as a formality...

Tue, Jun 13, 11:04 AM
danielcdh added a comment to D33341: Enable vectorizer-maximize-bandwidth by default..

As discussed in llvm-dev:

Tue, Jun 13, 11:03 AM

Thu, Jun 8

danielcdh closed D34017: Do not early-inline recursive calls in sample profile loader..
Thu, Jun 8, 1:12 PM

Wed, Jun 7

danielcdh updated the diff for D34017: Do not early-inline recursive calls in sample profile loader..

add comments

Wed, Jun 7, 5:30 PM
danielcdh added a comment to D34017: Do not early-inline recursive calls in sample profile loader..

So the profile is from GCC built binary ?

Wed, Jun 7, 5:19 PM
danielcdh added a comment to D34017: Do not early-inline recursive calls in sample profile loader..

Does it miss opportunities to profile annotate inline instance of indirect target?

Wed, Jun 7, 4:49 PM
danielcdh added a comment to D34017: Do not early-inline recursive calls in sample profile loader..

It doesn't have to blow it up exponentially. Can we clone the function to be inlined, re-writing the recursive calls to call the clone, and then inline that? Similar to a worker-wrapper transformation

Wed, Jun 7, 4:34 PM
danielcdh created D34017: Do not early-inline recursive calls in sample profile loader..
Wed, Jun 7, 4:09 PM

Fri, Jun 2

danielcdh added inline comments to D32451: Improve profile-guided heuristics to use estimated trip count..
Fri, Jun 2, 1:13 PM
danielcdh added inline comments to D32451: Improve profile-guided heuristics to use estimated trip count..
Fri, Jun 2, 12:44 PM

Thu, Jun 1

danielcdh accepted D33746: [PGO] Dump branch condition type, branch probability and branch hotness as optimization remarks.
Thu, Jun 1, 9:25 AM
danielcdh accepted D33757: [Profile] Fix builtin_expect lowering bug.
Thu, Jun 1, 9:09 AM

Wed, May 31

danielcdh closed D32563: Add LiveRangeShrink pass to shrink live range within BB..
Wed, May 31, 4:25 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update test

Wed, May 31, 1:44 PM

Tue, May 30

danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

Tue, May 30, 5:48 PM
danielcdh added inline comments to D32563: Add LiveRangeShrink pass to shrink live range within BB..
Tue, May 30, 5:23 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

Tue, May 30, 5:23 PM

May 19 2017

danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

clang-format...

May 19 2017, 4:01 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

Updated the implementation and tests:

May 19 2017, 4:00 PM

May 18 2017

danielcdh reopened D32563: Add LiveRangeShrink pass to shrink live range within BB..

The patch was reverted because it breaks tests at runtime.

May 18 2017, 6:26 PM
danielcdh created D33341: Enable vectorizer-maximize-bandwidth by default..
May 18 2017, 2:44 PM

May 17 2017

danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Could you send me an IR file or snippet that could trigger this performance issue? I can include it in the patch as unittest.

May 17 2017, 2:19 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

patch to move the pass to x86 is proposed: https://reviews.llvm.org/D33294

May 17 2017, 12:08 PM
danielcdh created D33294: Only enable LiveRangeShrink for x86..
May 17 2017, 12:05 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Sorry about the regression. As Quentin pointed out, moving instructions close to phi should not be a problem as the algorithm checks the moved instruction uses >1 single-use defs. Looking at the example of the regression case, the transformation looks reasonable if we only consider shrinking live-ranges. Could you provide an IR that I can try reproduce the issue and see why register allocation is worse?

May 17 2017, 11:34 AM

May 12 2017

danielcdh closed D32563: Add LiveRangeShrink pass to shrink live range within BB..
May 12 2017, 12:42 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

I'll submit the patch now to allow rest of today to catch potential buildbot breakage. Thanks again for all the reviews.

May 12 2017, 12:41 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 12 2017, 8:35 AM

May 11 2017

danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Thanks all for the reviews. Any other comments/concerns? If not, I'll commit the patch tomorrow.

May 11 2017, 5:12 PM
danielcdh closed D33111: Change sample profile writer to make it deterministic..
May 11 2017, 4:56 PM
danielcdh created D33111: Change sample profile writer to make it deterministic..
May 11 2017, 4:23 PM
danielcdh added inline comments to D32563: Add LiveRangeShrink pass to shrink live range within BB..
May 11 2017, 2:25 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 11 2017, 2:25 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Thanks for the reviews!

May 11 2017, 8:39 AM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 11 2017, 8:39 AM

May 10 2017

danielcdh added inline comments to D32563: Add LiveRangeShrink pass to shrink live range within BB..
May 10 2017, 1:56 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 10 2017, 1:56 PM

May 9 2017

danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 9 2017, 2:40 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Thanks for the review.

May 9 2017, 2:39 PM
danielcdh accepted D33012: [ProfileSummary] Make getProfileCount a non-static member function..

oops, sent to the wrong patch review...

May 9 2017, 2:39 PM
danielcdh added a comment to D33012: [ProfileSummary] Make getProfileCount a non-static member function..

Thanks for the review.

May 9 2017, 2:38 PM
danielcdh added inline comments to D32563: Add LiveRangeShrink pass to shrink live range within BB..
May 9 2017, 8:15 AM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 9 2017, 8:15 AM

May 8 2017

danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

rebase and update

May 8 2017, 6:09 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Can you collect spill code stats and benchmark scores from the test suite?
Can you get someone to run benchmarks on arm64 as well?

May 8 2017, 6:09 PM
danielcdh accepted D32983: Fix code section prefix for proper layout.
May 8 2017, 3:53 PM

May 5 2017

danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 5 2017, 4:18 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Sorry, did not address this in the last patch. Done now. Note that I do not want to touch unnecessary files in this change. So I simply copied the code from MachineSink.cpp, and will refactor it to a utility function in MachineInstr.h/.cpp in a follow-up patch.

No problem. Thanks!

About your new test case. I think a MIR test would be even better. That said, I don't have a strong opinion (I am okay with your new test).
In case, could you please regenerate the CHECK lines using `update_llc_test_checks.py? The test should be small, so I don't expect too many CHECK lines inserted by the script.

Done, but I'm not sure if that's the right thing to do. update_llc_test_checks makes the test too fragile: any subtle change of the codegen would break it. And it hides what the author really want to test and follow-up patch writers may simply run the script and regenerate the test (without understanding what it's testing, or they need to spend much time on what the original purpose is). BTW, what's MIR test? could you point me to one example? Thanks.

Right, I think it your test is okay.

About MIR tests, you can have a look at this page: http://llvm.org/docs/MIRLangRef.html
In particular, section "MIR Testing Guide" (and the next two sub-sections) are quite useful imho.

For example, you could do something like:

llc my-new-test.ll -stop-after=machine-sink -o test.mir

Then, test.mir with a RUN line which only tests your new pass
For example:

llc -o - %s -mtriple=x86_64-- -run-pass=lrshrink | FileCheck %s

The advantage is that you test your new pass in isolation.
But, as I wrote, I think your test is okay.

May 5 2017, 4:17 PM
danielcdh added a comment to D32877: Restrict call metadata based hotness detection to Sample PGO mode.

I feel the underlying problem here is the use of VP metadata to get counts in sample PGO mode. Dehao, even when VP metadata is available, can we add branch_weights to the call? Then, only sample PGO will annotate branch weights to calls and that won't affect instrumented PGO.

That sounds like potentially a good longer term change. But I'd still like for this to go in as an immediate fix for the issue, which is causing a noticeable impact on instrumentation PGO performance, and because there is no reason AFAIK for anything other than Sample PGO to need to look at call metadata for determining hotness.

May 5 2017, 3:23 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 5 2017, 11:26 AM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

When you move an instruction, you should also move its associated debug values.

See for example how this is done by MachineSink in method MachineSinking::SinkInstruction() (see file MachineSink.cpp).
For every instruction INSN that is a candidate for sinking, MachineSink collects all the DBG_VALUE instructions which are associated to INSN.
When INSN is moved to a different location, all the associated debug values (if any) are moved too.
I think you should do something similar.

May 5 2017, 11:26 AM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

I had a quick look at the code, and I noticed that you never check for the presence of DBG_VALUE instructions in a basic block.
I think that your pass would not work as expected with debug info.

More in general:

  • When you check the number of uses of a register, you should skip debug values.
  • When you move an instruction, you should also move its associated debug values.

    A lot of tests have changed as a result of this patch. However, I think there is a value in adding your original test cases involving function calls. Could you please add a new MIR test with code from (at least one of) your original small test cases?
May 5 2017, 10:06 AM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

May 5 2017, 10:05 AM

May 4 2017

danielcdh closed D32773: Update VP prof metadata during inlining..
May 4 2017, 6:00 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

I've updated the patch to implement this in Machine Level. The down side is that it needs to update a lot of unittests especially those autogenerated llc tests.

May 4 2017, 5:08 PM
danielcdh updated the diff for D32773: Update VP prof metadata during inlining..

update

May 4 2017, 5:03 PM
danielcdh updated the diff for D32773: Update VP prof metadata during inlining..

update

May 4 2017, 4:30 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

change to do live range shrinking at machine level.

May 4 2017, 4:21 PM

May 3 2017

danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

The time to shrink live ranges is after all the global machine code
motion is done if the goal is handle as many cases as possible.

Doing code motion in machine code is tricky because you have to worry
about physical registers, flags and such. But those are exactly the
things you need to worry about to avoid generating terrible code. For
example, you don't want to hoist something over a call that defines a
physreg or is used by a physreg copy!

That said, if reassociate is making silly decisions about where to
place operations, I wouldn't see a problem fixing that directly.

May 3 2017, 8:44 AM

May 2 2017

danielcdh updated the diff for D32773: Update VP prof metadata during inlining..

update

May 2 2017, 6:15 PM
danielcdh created D32773: Update VP prof metadata during inlining..
May 2 2017, 4:36 PM

May 1 2017

danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Yes, adding -pre-RA-sched=list-ilp can produce the expected result without this patch for my testcase. I'm not familiar with -pre-RA-sched, any quick insight why this option is not on by default?

May 1 2017, 9:23 AM

Apr 28 2017

danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

move the pass till the end of the optimizations.

Apr 28 2017, 10:51 AM
danielcdh retitled D32563: Add LiveRangeShrink pass to shrink live range within BB. from Improve code placement algorithm in Reassociate pass. to Add LiveRangeShrink pass to shrink live range within BB..
Apr 28 2017, 10:17 AM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

Update the patch to have a separate pass to handle live range shrinking within BB.

Apr 28 2017, 10:13 AM

Apr 27 2017

danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Can you please describe the actual algorithm you are trying to use for placement here?

Also, do you have performance numbers?

Apr 27 2017, 1:01 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

Apr 27 2017, 1:00 PM
danielcdh added a comment to D32563: Add LiveRangeShrink pass to shrink live range within BB..

Can you please describe the actual algorithm you are trying to use for placement here?

Apr 27 2017, 12:59 PM
danielcdh updated the diff for D32563: Add LiveRangeShrink pass to shrink live range within BB..

update

Apr 27 2017, 12:58 PM

Apr 26 2017

danielcdh updated the summary of D32563: Add LiveRangeShrink pass to shrink live range within BB..
Apr 26 2017, 3:10 PM
danielcdh created D32563: Add LiveRangeShrink pass to shrink live range within BB..
Apr 26 2017, 2:46 PM

Apr 20 2017

danielcdh updated the diff for D32315: Introduce a new DWARFContext::getInliningInfoForAddress API to expose pointers to strings stored in DWARF file..

set initial value for local variables.

Apr 20 2017, 3:15 PM
danielcdh created D32315: Introduce a new DWARFContext::getInliningInfoForAddress API to expose pointers to strings stored in DWARF file..
Apr 20 2017, 3:11 PM

Apr 19 2017

danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

updated in NFC commit r300753.

Apr 19 2017, 2:05 PM
danielcdh closed D32177: Using address range map to speedup finding inline stack for address..
Apr 19 2017, 1:22 PM
danielcdh added inline comments to D32236: PR32710: Disable using PMADDWD for unsigned short..
Apr 19 2017, 1:21 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update

Apr 19 2017, 1:07 PM
danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

Will have separate patch to address other concerns.

Apr 19 2017, 1:07 PM
danielcdh closed D32236: PR32710: Disable using PMADDWD for unsigned short..
Apr 19 2017, 1:03 PM
danielcdh updated the diff for D32236: PR32710: Disable using PMADDWD for unsigned short..

update test

Apr 19 2017, 1:03 PM
danielcdh added a comment to D32236: PR32710: Disable using PMADDWD for unsigned short..

Hi Dehao (and Michael),

Test madd.ll has wrong CHECK-LABEL comments. For example, I see that "label" checks are lower-case. Those should be all upper-case.
I think that those checks are never used in practice.

To avoid problems with FileCheck, I also suggest to add an explicit check line for the 'ret' statement. That way, we know exactly what is the range of instructions where (v)pmaddwd should not appear.

-Andrea

Apr 19 2017, 1:02 PM
danielcdh created D32236: PR32710: Disable using PMADDWD for unsigned short..
Apr 19 2017, 11:41 AM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

Updated the logic and remove the assertion. Add a unittest to cover the symbolization of the padding zone.

Apr 19 2017, 11:20 AM
danielcdh reopened D32177: Using address range map to speedup finding inline stack for address..

Reverted as this breaks buildbot: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/3369

Apr 19 2017, 11:19 AM
danielcdh closed D32177: Using address range map to speedup finding inline stack for address..
Apr 19 2017, 8:03 AM

Apr 18 2017

danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

Now the assertion is enabled to ensure the behavior does not change. PTAL. Thanks!

Apr 18 2017, 4:05 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update the code to ignore 0-sized ranges.

Apr 18 2017, 4:02 PM
danielcdh added a comment to D32177: Using address range map to speedup finding inline stack for address..

hold on... the assertion actually triggers... looking into why...

Apr 18 2017, 2:33 PM
danielcdh added inline comments to D32177: Using address range map to speedup finding inline stack for address..
Apr 18 2017, 2:32 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

remove unintended change...

Apr 18 2017, 2:32 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update

Apr 18 2017, 2:31 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update comment

Apr 18 2017, 2:05 PM
danielcdh added inline comments to D32177: Using address range map to speedup finding inline stack for address..
Apr 18 2017, 2:02 PM
danielcdh updated the diff for D32177: Using address range map to speedup finding inline stack for address..

update

Apr 18 2017, 2:02 PM