andrewng (Andrew Ng)
User

Projects

User does not belong to any projects.

User Details

User Since
Mar 2 2017, 6:37 AM (60 w, 8 h)

Recent Activity

Fri, Apr 20

andrewng created D45871: [DebugInfo] Use WithColor for more debug line warnings.
Fri, Apr 20, 4:06 AM
andrewng created D45869: [DebugInfo] Fix for split dwarf test on Windows (NFC).
Fri, Apr 20, 3:31 AM

Mar 27 2018

andrewng abandoned D44898: [ELF] Fix incorrect deduplication of MergeSyntheticSection's.

Abandoned in favour of D44923.

Mar 27 2018, 8:10 AM
andrewng updated the diff for D44923: [ELF] Disable ICF for synthetic sections.

Updated to address review comments.

Mar 27 2018, 6:49 AM
andrewng added a comment to D44898: [ELF] Fix incorrect deduplication of MergeSyntheticSection's.

2.8%. Somehow clang-fsds got 2.39% faster.

That speedup is interesting. I don't have a good theory to explain it though.

Even if this patch is somewhat performance-neutral, we still should avoid extra memory allocation if we can. Let's try ignoring synthetic sections in ICF. That's I think better.

Mar 27 2018, 6:07 AM
andrewng created D44923: [ELF] Disable ICF for synthetic sections.
Mar 27 2018, 6:04 AM

Mar 26 2018

andrewng added a comment to D44898: [ELF] Fix incorrect deduplication of MergeSyntheticSection's.

How much did Scylla get slower?

Maybe the best option is just disabling ICF of synthetic sections?

That sounds like a better way of fixing the issue.

Mar 26 2018, 3:07 PM
andrewng updated the diff for D44898: [ELF] Fix incorrect deduplication of MergeSyntheticSection's.

Updated to fix issue with clang compilation.

Mar 26 2018, 9:40 AM
andrewng updated the diff for D44682: [ELF] Fix X86 & X86_64 PLT retpoline padding.

Removed assertions and tidied up white space a little.

Mar 26 2018, 9:25 AM
andrewng updated the diff for D44898: [ELF] Fix incorrect deduplication of MergeSyntheticSection's.

Updated test to address review comments.

Mar 26 2018, 9:14 AM
andrewng added a comment to D44898: [ELF] Fix incorrect deduplication of MergeSyntheticSection's.

Seems reasonable. I will just do a quick check on the performance impact.

Mar 26 2018, 8:56 AM
andrewng created D44898: [ELF] Fix incorrect deduplication of MergeSyntheticSection's.
Mar 26 2018, 8:31 AM

Mar 25 2018

andrewng added a comment to D44682: [ELF] Fix X86 & X86_64 PLT retpoline padding.

I think the PLT writing should handle the padding itself. It certainly feels more correct. It looks though D44775 is heading in that direction too.

Mar 25 2018, 10:15 AM

Mar 21 2018

andrewng added a comment to D44682: [ELF] Fix X86 & X86_64 PLT retpoline padding.

Instead of doing this, we should fill executable segments with trap instructions even if linker script is in use, shouldn't we?

Mar 21 2018, 3:27 AM

Mar 20 2018

andrewng updated subscribers of D44682: [ELF] Fix X86 & X86_64 PLT retpoline padding.
Mar 20 2018, 8:27 AM
andrewng created D44682: [ELF] Fix X86 & X86_64 PLT retpoline padding.
Mar 20 2018, 8:21 AM

Mar 15 2018

andrewng added a comment to D44168: [ELF] Add .eh_frame pieces to map file.

Looking at the EhFrameSection writing and finalize code, it would seem that
some kind of alignment is being applied. This aligning is also the reason
why for x86_64 the mapping of CIE/FDE's to the output is so fragmented,
because all input we have seen is 4-byte aligned, but LLD uses 8-byte
alignment for the output.

Mar 15 2018, 8:08 AM

Mar 9 2018

andrewng added a comment to D44168: [ELF] Add .eh_frame pieces to map file.

Apart from the typo and the value for the alignment, LGTM.

Mar 9 2018, 6:32 AM
andrewng abandoned D42960: [ELF] Add .eh_frame pieces to map file.

Abandoning this revision in favour of D44168.

Mar 9 2018, 6:24 AM

Mar 8 2018

andrewng added a comment to D42960: [ELF] Add .eh_frame pieces to map file.

There are also some slight differences in the output between this implementation and the one in D44168. This patch does not output the "+0x0" for a zero offset and outputs the input section's alignment for the "Align" column. Which style of output is preferred?

Mar 8 2018, 2:22 AM

Mar 7 2018

andrewng updated the diff for D42960: [ELF] Add .eh_frame pieces to map file.

Updated to address review comments.

Mar 7 2018, 5:24 AM

Mar 6 2018

andrewng added a comment to D44168: [ELF] Add .eh_frame pieces to map file.

I agree that this change is simpler, but it also doesn't produce the same output as D42960. Part of the reason for the "complexity" in my change in D42960 was to try to prevent the map file output from getting too big/verbose.

Mar 6 2018, 11:14 AM
andrewng updated the diff for D42960: [ELF] Add .eh_frame pieces to map file.

Updated to address comments.

Mar 6 2018, 9:50 AM
andrewng retitled D42960: [ELF] Add .eh_frame pieces to map file from [ELF] DEMO: Example for adding .eh_frame pieces to map file to [ELF] Add .eh_frame pieces to map file.
Mar 6 2018, 9:42 AM

Mar 1 2018

andrewng updated the diff for D42960: [ELF] Add .eh_frame pieces to map file.

Rebased and updated with minor change to make the output offset hex rather than decimal. Modified existing map file test to demonstrate the proposed .eh_frame output.

Mar 1 2018, 7:43 AM
andrewng commandeered D42960: [ELF] Add .eh_frame pieces to map file.
Mar 1 2018, 7:39 AM

Nov 22 2017

andrewng added a comment to D40366: [Support][Parallel] ThreadPoolExecutor fixes for Windows.

I absolutely do not want to disable support for ConcRT under any circumstances unless we find that it is a completely broken library, which I highly doubt. Even then, I would want to ask Microsoft to fix it.

What is the exact problem you're running into?

Nov 22 2017, 11:08 AM
andrewng created D40366: [Support][Parallel] ThreadPoolExecutor fixes for Windows.
Nov 22 2017, 10:35 AM

Nov 16 2017

andrewng added a comment to D40119: [Support][CachePruning] Fix regression in pruning interval.
Nov 16 2017, 9:28 AM

Oct 16 2017

andrewng abandoned D38964: [TableGen] Workaround for VS 2015 Update 3 RelWithDebInfo build.

Workaround no longer required. This issue is also fixed by commit r315932.

Oct 16 2017, 1:55 PM
andrewng created D38964: [TableGen] Workaround for VS 2015 Update 3 RelWithDebInfo build.
Oct 16 2017, 10:47 AM

Oct 10 2017

andrewng closed D38690: [LLD] Fix findOrphanPos to consistently ignore "dead" OutputSection's.

Committed in rL315292: [LLD] Fix findOrphanPos to consistently ignore "dead" OutputSection's.

Oct 10 2017, 3:16 AM

Oct 9 2017

andrewng created D38690: [LLD] Fix findOrphanPos to consistently ignore "dead" OutputSection's.
Oct 9 2017, 8:44 AM

Oct 6 2017

andrewng abandoned D38539: [LLD] Do not unlink OutputFile prematurely.

Abandoned in favour of change https://reviews.llvm.org/D38571.

Oct 6 2017, 1:46 AM

Oct 5 2017

andrewng added a comment to D38539: [LLD] Do not unlink OutputFile prematurely.
In D38539#889005, @ruiu wrote:

Can you try https://reviews.llvm.org/D38571 to see if it will fix your problem? Thanks.

Oct 5 2017, 5:09 AM

Oct 4 2017

andrewng added a comment to D38539: [LLD] Do not unlink OutputFile prematurely.

What is the problem? std::thread::detach is a part of C++11, and it is hard to believe that we shouldn't use it at all.

Oct 4 2017, 10:25 AM
andrewng added a comment to D38539: [LLD] Do not unlink OutputFile prematurely.
In D38539#888333, @ruiu wrote:

From my understanding of FileOutputBuffer::commit in its current form, it does a call to sys::fs::rename for a regular file. So there is no direct association to the "unlink" performance issue described in the comment to unlinkAsync. Although, it could be that the implementation of rename on Linux could suffer from similar performance problems (i.e. it effectively still performs an unlink). Previously, FileOutputBuffer::create did remove (unlink) the target file. I believe that you recently removed this behaviour and it was this unlink that the unlinkAsync was trying to improve.

If you try it yourself, you'd realize that rename(2)'ing onto an existing file is as slow as calling unlink(2) on that file. My previous change to FileOutputBuffer doesn't fix the correctness issue in the usual use case, but it also improved the speed of lld, because lld no longer tries to unlink a file in the main thread. Before making that change, it tried to do that in the Driver by tryCreateFile.

Oct 4 2017, 9:53 AM
andrewng added a comment to D38539: [LLD] Do not unlink OutputFile prematurely.

Hi Rui,

Oct 4 2017, 9:31 AM
andrewng created D38539: [LLD] Do not unlink OutputFile prematurely.
Oct 4 2017, 6:44 AM

Sep 6 2017

andrewng updated the diff for D37462: [LLD] Fix padding of .eh_frame when in executable segment.

Updated based on suggestion. I still think that keeping the common "AlignSize" is better and safer.

Sep 6 2017, 3:08 AM

Sep 5 2017

andrewng added a comment to D37462: [LLD] Fix padding of .eh_frame when in executable segment.

A similar fix was required for the string table, see https://reviews.llvm.org/D36267.

Sep 5 2017, 3:38 AM
andrewng created D37462: [LLD] Fix padding of .eh_frame when in executable segment.
Sep 5 2017, 3:33 AM

Jul 5 2017

andrewng added a comment to D34976: [lit] Fix unit test discovery for Visual Studio builds..

Is this patch good to land?

Jul 5 2017, 10:45 AM

Jul 4 2017

andrewng added a comment to D34976: [lit] Fix unit test discovery for Visual Studio builds..

This patch fixes a Windows Visual Studio build lit issue introduced by rL306895 (D34855).

Jul 4 2017, 1:49 AM
andrewng updated subscribers of D34976: [lit] Fix unit test discovery for Visual Studio builds..
Jul 4 2017, 1:44 AM
andrewng created D34976: [lit] Fix unit test discovery for Visual Studio builds..
Jul 4 2017, 1:43 AM
andrewng added a comment to D34855: [lit] Factor out listdir logic shared by different test formats..

Ah, I just noticed the D34853, it fixes the issue I am observing.

Jul 4 2017, 12:53 AM
andrewng added a comment to D34853: Fix (benignly) incorrect GoogleTest specs in various lit configs..

I believe that this "build mode" is intended for the Visual Studio MSVC build. This build is special in that it can produce builds for multiple configurations, e.g. Debug, Release & RelWithDebInfo, within the same top level build output directory. It is this configuration type that defines the "build mode". This means that the unit tests will only pick up the configuration that matches that of the lit that was run. Without the "build mode" I believe lit might end up running all configurations of unit tests that have been built, which is probably not the intended behaviour.

Jul 4 2017, 12:35 AM · Restricted Project, lld

Jun 29 2017

andrewng added a comment to D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..

Hi Rafael,

Is this patch now good to land?

Cheers,
Andrew

Jun 29 2017, 6:30 AM

Jun 26 2017

andrewng added a comment to D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..

Hi Rafael,

Jun 26 2017, 10:02 AM
andrewng updated the diff for D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..

Updated for review comments.

Jun 26 2017, 2:22 AM
andrewng updated the summary of D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..
Jun 26 2017, 2:19 AM

Jun 23 2017

andrewng added a comment to D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..

Is there any more feedback on this patch or is it good to land?

Jun 23 2017, 9:36 AM

Jun 21 2017

andrewng added a comment to D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..
In D34204#786240, @ruiu wrote:

Can you change your program?

Jun 21 2017, 11:06 AM

Jun 20 2017

andrewng added a comment to D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..
In D34204#784448, @ruiu wrote:

Hi Andrew,

What are you actually trying to address with this patch? I'd like to know more about the background as to why you want this. Is there any reason you can't fix your program rather than the linker?

Jun 20 2017, 3:12 AM

Jun 19 2017

andrewng added a comment to D34203: [LLD][LinkerScript] Add support for segment NONE..

Is this patch good to land?

Jun 19 2017, 3:04 AM

Jun 16 2017

andrewng removed reviewers for D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments.: jhenderson, bd1976bris, edd, gbreynoo.
Jun 16 2017, 1:51 AM
andrewng removed reviewers for D34203: [LLD][LinkerScript] Add support for segment NONE.: jhenderson, bd1976bris, edd, gbreynoo.
Jun 16 2017, 1:49 AM

Jun 14 2017

andrewng created D34204: [LLD][LinkerScript] Allow non-alloc sections to be assigned to segments..
Jun 14 2017, 6:41 AM
andrewng created D34203: [LLD][LinkerScript] Add support for segment NONE..
Jun 14 2017, 6:29 AM

May 18 2017

andrewng added inline comments to D33112: Optimize orphan plament in a general way.
May 18 2017, 4:52 AM

May 5 2017

andrewng added a comment to D32910: [Lit] Fix to prevent creation of "%SystemDrive%" directory on Windows..

I don't disbelieve you, but do you have any thoughts about why I've never seen this when running tests locally?

May 5 2017, 8:39 AM
andrewng created D32910: [Lit] Fix to prevent creation of "%SystemDrive%" directory on Windows..
May 5 2017, 8:20 AM

Apr 27 2017

andrewng updated the diff for D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..

As suggested by Adrian, I have made prependDIExpr a static member function of DIExpression (and also appendOffset used by prependDIExpr). This is then used to apply the LEA address shift to the replacement debug value.

Apr 27 2017, 8:03 AM

Apr 26 2017

andrewng added inline comments to D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..
Apr 26 2017, 1:08 PM

Apr 25 2017

andrewng added inline comments to D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..
Apr 25 2017, 10:03 AM
andrewng added a comment to D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..

Sorry for the slight delay but only just got back from holiday.

Apr 25 2017, 6:22 AM

Apr 21 2017

andrewng added inline comments to D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..
Apr 21 2017, 3:02 AM
andrewng added a comment to D31755: [DebugInfo][X86] Fix handling of DBG_VALUE's in post-RA scheduler..

Rather than scanning forward for DBG_VALUEs, is there a more systematic way to find them? E.g., by iterating over USEs? If not, doing it this way is probably fine.

Apr 21 2017, 2:56 AM

Apr 18 2017

andrewng updated the diff for D31806: [SimplifyLibCalls] Fix infinite loop with fast-math optimization..

Updated test to focus only on the instcombine pass.

Apr 18 2017, 11:06 PM

Apr 15 2017

andrewng updated the diff for D31806: [SimplifyLibCalls] Fix infinite loop with fast-math optimization..

I've updated the comments to explicitly mention the MinGW-w64 case that encounters this issue and removed the unnecessary FunctionType related checks.

Apr 15 2017, 10:05 PM

Apr 12 2017

andrewng added inline comments to D31806: [SimplifyLibCalls] Fix infinite loop with fast-math optimization..
Apr 12 2017, 6:53 PM

Apr 11 2017

andrewng added a comment to D31806: [SimplifyLibCalls] Fix infinite loop with fast-math optimization..

gcc transforms the given C code to a call to expf(), just like LLVM, but then it looks like their inliner works a bit differently so the generated code ends up with a call to expf() rather than an infinite loop. With your patch, LLVM also generates a call to expf(). This seems weird... does MinGW's C library actually have an expf() function?

Apr 11 2017, 10:42 PM

Apr 9 2017

andrewng added a comment to D31806: [SimplifyLibCalls] Fix infinite loop with fast-math optimization..

Can you explain a little more how exactly we end up in this situation? Your example has undefined behavior according to the C standard.

Apr 9 2017, 3:21 AM

Apr 7 2017

andrewng created D31806: [SimplifyLibCalls] Fix infinite loop with fast-math optimization..
Apr 7 2017, 5:03 AM

Apr 6 2017

andrewng created D31755: [DebugInfo][X86] Fix handling of DBG_VALUE's in post-RA scheduler..
Apr 6 2017, 5:27 AM

Apr 4 2017

andrewng updated the diff for D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..

Removed DWARF version checks as suggested by Adrian (thanks) and updated comments.

Apr 4 2017, 5:38 AM

Apr 3 2017

andrewng added inline comments to D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..
Apr 3 2017, 10:57 AM
andrewng created D31604: [DebugInfo][X86] Improve X86 Optimize LEAs handling of debug values..
Apr 3 2017, 8:35 AM

Mar 17 2017

andrewng added a comment to D30835: [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values.

Thanks for the review.

Mar 17 2017, 9:57 AM
andrewng added a comment to D30835: [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values.

I have now split this patch and updated this review to contain the first part which fixes the codegen issue but doesn't preserve the debug values. I will work on a separate patch which will attempt to preserve the debug values.

Mar 17 2017, 7:35 AM
andrewng updated the diff for D30835: [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values.
Mar 17 2017, 7:31 AM

Mar 16 2017

andrewng added a comment to D30835: [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values.

Looks like this change does two things - fixes the "debug info affects optimization" and also does some work to improve optimized debug info quality? Might be worth separating this into two changes to make sure there's good test coverage/easier to review/etc?

I'm about to leave work now but I will look on Monday into whether separating this into two makes sense.

On further consideration, I think it makes sense to keep the patch as is because otherwise the first patch would be "lossy", i.e. the output would lose debug information. So in fixing the optimisation, ideally the debug information should be correct and preserved. However, if there are concerns with the "patching" of the debug information, then it might make sense to split the patch. What are your thoughts?

Mar 16 2017, 9:22 AM

Mar 13 2017

andrewng added a comment to D30835: [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values.

Looks like this change does two things - fixes the "debug info affects optimization" and also does some work to improve optimized debug info quality? Might be worth separating this into two changes to make sure there's good test coverage/easier to review/etc?

I'm about to leave work now but I will look on Monday into whether separating this into two makes sense.

Mar 13 2017, 6:06 AM

Mar 10 2017

andrewng added a comment to D30835: [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values.

Looks like this change does two things - fixes the "debug info affects optimization" and also does some work to improve optimized debug info quality? Might be worth separating this into two changes to make sure there's good test coverage/easier to review/etc?

Mar 10 2017, 11:10 AM
andrewng created D30835: [DebugInfo][X86] Teach Optimize LEAs pass to handle debug values.
Mar 10 2017, 10:01 AM