
weiwang (Wei Wang)
User

Projects

User does not belong to any projects.

User Details

User Since
Jul 20 2020, 2:20 PM (102 w, 1 d)

Recent Activity

Apr 27 2022

weiwang committed rG26a0d53b1544: [CHR] Skip region containing llvm.coro.id (authored by weiwang).
[CHR] Skip region containing llvm.coro.id
Apr 27 2022, 10:28 AM · Restricted Project, Restricted Project
weiwang closed D124418: [CHR] Skip region containing llvm.coro.id.
Apr 27 2022, 10:27 AM · Restricted Project, Restricted Project
weiwang updated the diff for D124418: [CHR] Skip region containing llvm.coro.id.

minor update to test

Apr 27 2022, 10:25 AM · Restricted Project, Restricted Project
weiwang updated the diff for D124418: [CHR] Skip region containing llvm.coro.id.

clang-format

Apr 27 2022, 9:57 AM · Restricted Project, Restricted Project
weiwang updated the diff for D124418: [CHR] Skip region containing llvm.coro.id.

comments and fix typo

Apr 27 2022, 9:55 AM · Restricted Project, Restricted Project

Apr 26 2022

weiwang added a comment to D124418: [CHR] Skip region containing llvm.coro.id.

Hi, could you elaborate on what CHR does and why we need to skip coro.id? I don't understand what is happening here.

Apr 26 2022, 1:53 PM · Restricted Project, Restricted Project

Apr 25 2022

weiwang added a reviewer for D124418: [CHR] Skip region containing llvm.coro.id: ChuanqiXu.
Apr 25 2022, 4:49 PM · Restricted Project, Restricted Project
weiwang updated the summary of D124418: [CHR] Skip region containing llvm.coro.id.
Apr 25 2022, 2:55 PM · Restricted Project, Restricted Project
weiwang requested review of D124418: [CHR] Skip region containing llvm.coro.id.
Apr 25 2022, 2:01 PM · Restricted Project, Restricted Project

Mar 30 2022

weiwang updated the diff for D122759: [time-report] Add timers to codegen actions.

fix typo

Mar 30 2022, 2:07 PM · Restricted Project, Restricted Project
weiwang updated the summary of D122759: [time-report] Add timers to codegen actions.
Mar 30 2022, 2:04 PM · Restricted Project, Restricted Project
weiwang requested review of D122759: [time-report] Add timers to codegen actions.
Mar 30 2022, 1:09 PM · Restricted Project, Restricted Project

Jan 21 2022

weiwang committed rG55d887b83364: [time-trace] Add optimizer and codegen regions to NPM (authored by weiwang).
[time-trace] Add optimizer and codegen regions to NPM
Jan 21 2022, 8:21 PM
weiwang closed D117605: [time-trace] Add optimizer and codegen regions to NPM.
Jan 21 2022, 8:21 PM · Restricted Project
weiwang added a reviewer for D117605: [time-trace] Add optimizer and codegen regions to NPM: bruno.
Jan 21 2022, 11:17 AM · Restricted Project

Jan 18 2022

weiwang requested review of D117605: [time-trace] Add optimizer and codegen regions to NPM.
Jan 18 2022, 2:12 PM · Restricted Project

Nov 19 2021

weiwang committed rGa075d6722283: [Sema] fix nondeterminism in ASTContext::getDeducedTemplateSpecializationType (authored by weiwang).
[Sema] fix nondeterminism in ASTContext::getDeducedTemplateSpecializationType
Nov 19 2021, 1:30 PM
weiwang closed D112481: [Sema] fix nondeterminism in ASTContext::getDeducedTemplateSpecializationType.
Nov 19 2021, 1:30 PM · Restricted Project, Restricted Project

Oct 27 2021

weiwang added a comment to D112481: [Sema] fix nondeterminism in ASTContext::getDeducedTemplateSpecializationType.

This issue has been blocking our internal module re-enablement for some time now, and we would really appreciate any feedback. We also wonder whether only DeducedTemplateSpecializationType is affected or whether this could also happen to other types.

Oct 27 2021, 10:09 PM · Restricted Project, Restricted Project

Oct 25 2021

weiwang committed rGb283d55c90dd: [openmp] Emit deferred diag only when device compilation presents (authored by weiwang).
[openmp] Emit deferred diag only when device compilation presents
Oct 25 2021, 11:19 AM
weiwang closed D109175: [openmp] Emit deferred diag only when device compilation presents.
Oct 25 2021, 11:19 AM · Restricted Project

Sep 15 2021

weiwang added a comment to D109175: [openmp] Emit deferred diag only when device compilation presents.

I agree with Johannes and Alexey that deferred diags are only needed when !LangOpts.OMPTargetTriples.empty(). However, I am not sure whether they are only needed in device compilation.

For other offloading languages like CUDA/HIP it is needed in both device and host compilation.

Technically, we might even want to delay in host only mode for OpenMP, but that is something we can revisit (e.g., by dynamically setting a flag based on the directives we've seen).
@yaxunl Should we for now check if there is any associated offload job?

Shall we go ahead and get this change in, and think about a longer-term solution later?

LGTM. This patch should be sufficient to limit deferred diags to OpenMP with offloading. Device compilation is covered by OpenMPIsDevice and host compilation is covered by !LangOpts.OMPTargetTriples.empty(). I will leave the decision to Johannes.
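The condition described in this comment can be sketched as a small predicate. This is only an illustration under stated assumptions: `ToyLangOpts` and `needsDeferredDiags` are hypothetical names, the struct merely mirrors the two `LangOpts` fields named above (it is not the real `clang::LangOptions`), and the sketch covers OpenMP only, not CUDA/HIP.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the LangOpts fields named in the comment above;
// this is not the real clang::LangOptions class.
struct ToyLangOpts {
  bool OpenMP = false;                       // -fopenmp was given
  bool OpenMPIsDevice = false;               // this is a device compilation
  std::vector<std::string> OMPTargetTriples; // values of -fopenmp-targets=
};

// Returns true when deferred diagnostics are worth recording: OpenMP is on
// and offloading is involved, either because this is the device side or
// because the host side was given at least one offload target.
bool needsDeferredDiags(const ToyLangOpts &Opts) {
  if (!Opts.OpenMP)
    return false;
  return Opts.OpenMPIsDevice || !Opts.OMPTargetTriples.empty();
}

// Small usage example covering the three cases discussed in the thread.
void exampleDeferredDiagPredicate() {
  ToyLangOpts HostOnly;
  HostOnly.OpenMP = true; // plain -fopenmp, no offload targets
  assert(!needsDeferredDiags(HostOnly));

  ToyLangOpts HostWithTargets = HostOnly;
  HostWithTargets.OMPTargetTriples.push_back("amdgcn-amd-amdhsa");
  assert(needsDeferredDiags(HostWithTargets));

  ToyLangOpts Device = HostOnly;
  Device.OpenMPIsDevice = true;
  assert(needsDeferredDiags(Device));
}
```

With plain `-fopenmp` and no offload targets, the predicate is false, which matches the case in this thread where bypassing deferred diags was the goal.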

Sep 15 2021, 9:59 AM · Restricted Project

Sep 8 2021

weiwang added a comment to D109175: [openmp] Emit deferred diag only when device compilation presents.

I agree with Johannes and Alexey that deferred diags are only needed when !LangOpts.OMPTargetTriples.empty(). However, I am not sure whether they are only needed in device compilation.

For other offloading languages like CUDA/HIP it is needed in both device and host compilation.

Technically, we might even want to delay in host only mode for OpenMP, but that is something we can revisit (e.g., by dynamically setting a flag based on the directives we've seen).
@yaxunl Should we for now check if there is any associated offload job?

Sep 8 2021, 10:06 AM · Restricted Project

Sep 3 2021

weiwang retitled D109175: [openmp] Emit deferred diag only when device compilation presents from [openmp] Add clang cc1 option -fopenmp-skip-deferred-diags to [openmp] Emit deferred diag only when device compilation presents.
Sep 3 2021, 11:18 AM · Restricted Project
weiwang updated the diff for D109175: [openmp] Emit deferred diag only when device compilation presents.

update as discussed.

Sep 3 2021, 11:17 AM · Restricted Project

Sep 2 2021

weiwang added a comment to D109175: [openmp] Emit deferred diag only when device compilation presents.

Why do we need this flag, is the absence of -fopenmp-targets not sufficient?

Just double-checked; these are all the OpenMP-related options currently in use:

"-fopenmp"
"-fopenmp-version=31"
"-fopenmp-version=31"
"-fopenmp-cuda-parallel-target-regions"

We saw a huge number of DECLS_TO_CHECK_FOR_DEFERRED_DIAGS records. I don't know if this has anything to do with the OpenMP version being 31, since prior to 5.0 everything is available on the host.

I don't think we are selective right now. As I was saying, disable deferred parsing if -fopenmp-targets is missing; there is no need for this option.

Sep 2 2021, 2:00 PM · Restricted Project
weiwang added a comment to D109175: [openmp] Emit deferred diag only when device compilation presents.

Why do we need this flag, is the absence of -fopenmp-targets not sufficient?

Just double-checked; these are all the OpenMP-related options currently in use:

"-fopenmp"
"-fopenmp-version=31"
"-fopenmp-version=31"
"-fopenmp-cuda-parallel-target-regions"

We saw a huge number of DECLS_TO_CHECK_FOR_DEFERRED_DIAGS records. I don't know if this has anything to do with the OpenMP version being 31, since prior to 5.0 everything is available on the host.

Sep 2 2021, 1:24 PM · Restricted Project
weiwang added a comment to D109175: [openmp] Emit deferred diag only when device compilation presents.

Our internal codebase never uses the target directive. Once deferred diags are bypassed, we observe an 18% end-to-end build time improvement.

Is that with -fopenmp or without?
That seems like a lot more than I would have expected.
Perhaps there are other ways to reduce the overhead besides this approach?

Sep 2 2021, 11:45 AM · Restricted Project
weiwang added a comment to D109175: [openmp] Emit deferred diag only when device compilation presents.

Our internal codebase never uses the target directive. Once deferred diags are bypassed, we observe an 18% end-to-end build time improvement.

Sep 2 2021, 10:57 AM · Restricted Project
weiwang updated subscribers of D109175: [openmp] Emit deferred diag only when device compilation presents.
Sep 2 2021, 10:50 AM · Restricted Project
weiwang added a reviewer for D109175: [openmp] Emit deferred diag only when device compilation presents: yaxunl.
Sep 2 2021, 10:49 AM · Restricted Project
weiwang requested review of D109175: [openmp] Emit deferred diag only when device compilation presents.
Sep 2 2021, 10:47 AM · Restricted Project

Aug 26 2021

weiwang added a comment to D70172: [CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese.

Hi @yaxunl! I'm working on upgrading a large codebase from LLVM-9 to LLVM-12. I noticed an average 10% compilation speed regression that seems to be caused by this change. We use Clang modules and historically provide the -fopenmp compiler flag by default. The problem seems to be that compiling and importing modules is now slower, with the size of the generated modules increased by 2X. The llvm-bcanalyzer tool shows that it's dominated by DECLS_TO_CHECK_FOR_DEFERRED_DIAGS. If I understand it right, your change is only relevant when target offloading is used. I inspected all of our #pragma omp directives and can confirm that we don't use it.

I see that most of this code is gated by the OpenMP flag. I wonder if there is a finer-grained way to enable OpenMP parallel code generation without target offloading? Would it make sense to extend this code to check if -fopenmp-targets is set before recording DECLS_TO_CHECK_FOR_DEFERRED_DIAGS?

Note, this was measured with @weiwang's https://reviews.llvm.org/D101793.

Aug 26 2021, 10:01 AM · Restricted Project

May 21 2021

weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

Thanks for the approval!

I just want to understand the list of "decls to check for deferred diagnostics" better: where do these decls come from, and why do they need to be checked for warnings? I see decls from libc in the list, but I have no idea why they are selected.

For offloading languages, e.g. OpenMP/CUDA/HIP, there are apparent errors in functions shared between host and device. However, unless these functions are sure to be emitted on device or host, these errors should not be emitted. These errors are so-called deferred error messages. The function decls which need to be checked are recorded. After the AST is finalized, the ASTs of these functions are iterated. If a function is found to be definitely emitted, the deferred error messages in it are emitted.
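The record-then-flush flow described above can be modeled with a toy example. It is a sketch under loud assumptions: `ToyFunction`, `ToyDeferredDiags`, and their methods are hypothetical and do not exist in clang, and whether a function is "sure to be emitted" is simply supplied by the caller here rather than computed from the AST.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Toy model only: none of these types exist in clang.
struct ToyFunction {
  std::string Name;
  std::vector<std::string> DeferredErrors; // errors found while parsing
};

class ToyDeferredDiags {
public:
  // Called during parsing: remember a function whose errors are deferred.
  void record(ToyFunction F) { ToCheck.push_back(std::move(F)); }

  // Called once it is known that a function will be emitted.
  void markEmitted(const std::string &Name) { Emitted.insert(Name); }

  // Called after the AST is final: walk the recorded functions in recording
  // order and report errors only for functions known to be emitted.
  std::vector<std::string> flush() const {
    std::vector<std::string> Out;
    for (const ToyFunction &F : ToCheck)
      if (Emitted.count(F.Name))
        for (const std::string &E : F.DeferredErrors)
          Out.push_back(F.Name + ": " + E);
    return Out;
  }

private:
  std::vector<ToyFunction> ToCheck;
  std::unordered_set<std::string> Emitted;
};

// Usage example: two functions have deferred errors, only one is emitted.
std::vector<std::string> exampleDeferredFlush() {
  ToyDeferredDiags D;
  D.record({"shared_helper", {"uses an unsupported type"}});
  D.record({"never_emitted", {"calls a host-only function"}});
  D.markEmitted("shared_helper");
  std::vector<std::string> Reported = D.flush();
  assert(Reported.size() == 1);
  return Reported;
}
```

Because the recorded list is walked in order, a list that contains the same decl many times would be checked many times, which is why its size matters in the discussion that follows.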

May 21 2021, 10:05 AM · Restricted Project

May 20 2021

weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

Thanks for the approval!

May 20 2021, 4:14 PM · Restricted Project
weiwang committed rGe6b8320c0a63: [clang][AST] Improve AST Reader/Writer memory footprint (authored by weiwang).
[clang][AST] Improve AST Reader/Writer memory footprint
May 20 2021, 3:35 PM
weiwang closed D101793: [clang][AST] Improve AST Reader/Writer memory footprint.
May 20 2021, 3:35 PM · Restricted Project
weiwang updated the diff for D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

make both ASTReader::DeclsToCheckForDeferredDiags and Sema::DeclsToCheckForDeferredDiags SmallSetVector

May 20 2021, 3:00 PM · Restricted Project
weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

I tried making Sema::DeclsToCheckForDeferredDiags an llvm::SmallSetVector. The heap RSS did drop significantly (from a peak of 100GB to 59GB), but not as much as with the current fix (peak 26GB), which makes ASTReader::DeclsToCheckForDeferredDiags an llvm::SmallSetVector.

I think the reason is that the duplicated decls are read from multiple module file sources (ASTReader::ReadAST() -> ASTReader::ReadASTBlock()), then stored into ASTReader::DeclsToCheckForDeferredDiags, and then go into Sema::DeclsToCheckForDeferredDiags in ASTReader::ReadDeclsToCheckForDeferredDiags(). Deduplicating at the early stage, when the decls have just been read in ASTReader, is more effective at reducing RSS.

What if you use SmallSetVector for both Sema::DeclsToCheckForDeferredDiags and ASTReader::DeclsToCheckForDeferredDiags? Does it cause extra memory usage compared to using it only for ASTReader::DeclsToCheckForDeferredDiags? Thanks.

May 20 2021, 1:25 PM · Restricted Project

May 19 2021

weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

I tried making Sema::DeclsToCheckForDeferredDiags an llvm::SmallSetVector. The heap RSS did drop significantly (from a peak of 100GB to 59GB), but not as much as with the current fix (peak 26GB), which makes ASTReader::DeclsToCheckForDeferredDiags an llvm::SmallSetVector.

May 19 2021, 11:53 AM · Restricted Project

May 17 2021

weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

I think the root cause might be that duplicate decls are added to Sema::DeclsToCheckForDeferredDiags, defined in

https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Sema/Sema.h#L1789

When compiling source code, a decl is added only once. However, if modules are imported, duplicate decls may be added.

We need to avoid adding duplicate decls to Sema::DeclsToCheckForDeferredDiags. However, we cannot simply change it to a set, since the order is important; otherwise the error message for later code may show up earlier, causing confusion for users. I would suggest changing its type to SetVector, which keeps the order and also avoids duplicates.
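To illustrate why a SetVector-style container fits here, the following is a minimal standard-library sketch of the idea: a vector keeps first-insertion order while a set rejects repeats. `OrderedUniqueIds` is a hypothetical name used for illustration; it is not `llvm::SetVector` and does not reproduce its API.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Minimal stand-in for the idea behind a SetVector: a vector that keeps
// first-insertion order plus a set that rejects repeated values.
class OrderedUniqueIds {
public:
  // Returns true if Id was new and has been appended.
  bool insert(std::uint64_t Id) {
    if (!Seen.insert(Id).second)
      return false; // already present, order is left unchanged
    Order.push_back(Id);
    return true;
  }

  const std::vector<std::uint64_t> &ids() const { return Order; }

private:
  std::vector<std::uint64_t> Order;
  std::unordered_set<std::uint64_t> Seen;
};

// Usage example: the same decl IDs arrive repeatedly, as they might when
// several imported modules list the same decl.
void exampleOrderedDedup() {
  OrderedUniqueIds Decls;
  const std::vector<std::uint64_t> Incoming = {7, 3, 7, 7, 9, 3};
  for (std::uint64_t Id : Incoming)
    Decls.insert(Id);
  assert((Decls.ids() == std::vector<std::uint64_t>{7, 3, 9}));
}
```

The trade-off is one extra set entry per unique element, in exchange for not storing each repeat.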

May 17 2021, 3:51 PM · Restricted Project

May 13 2021

weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

I have finally dealt with the other issues I needed to take care of. Let's resume the discussion.

May 13 2021, 10:59 AM · Restricted Project

May 11 2021

weiwang added a comment to D102229: [libomptarget][nfc] Add hook to easily disable building amdgcn bclib.

LGTM.

May 11 2021, 9:09 AM · Restricted Project
weiwang added a comment to D101509: An attempt to abandon omptarget out-of-tree builds..

Thanks for helping on this issue!

May 11 2021, 8:49 AM · Restricted Project

May 10 2021

weiwang added a comment to D101509: An attempt to abandon omptarget out-of-tree builds..

Yes, you can disable the device RTL build using this control.

It looks like your scripts define one of these PATH variables. And it has something to do with -ffreestanding. I was able to reproduce it with the following example:

#include <stdint.h>

uint64_t foo() {
  return UINT64_C(0x1);
}

$ CPLUS_INCLUDE_PATH=<path>/x86_64-linux-gnu/9.2.0/include clang test.cpp -c -ffreestanding

test.cpp:4:10: error: use of undeclared identifier '__UINT64_C'
  return UINT64_C(0x1);
         ^
<path>/x86_64-linux-gnu/9.2.0/include/stdint-gcc.h:254:21: note: expanded from macro 'UINT64_C'
#define UINT64_C(c) __UINT64_C(c)
                    ^
1 error generated.

I am not sure whether clang is supposed to work under these conditions or not, but the error is not specific to the amdgcn device RTL build. Basically, any target that uses the in-tree clang may fail the same way (e.g. LIT tests).

May 10 2021, 9:28 PM · Restricted Project
weiwang added a comment to D101509: An attempt to abandon omptarget out-of-tree builds..

It looks like -DLIBOMPTARGET_AMDGCN_GFXLIST="" would disable the device RTL build, but still build the rest of libomptarget.

May 10 2021, 8:48 PM · Restricted Project
weiwang added a comment to D101509: An attempt to abandon omptarget out-of-tree builds..

My guess is that clang-resource-headers are not yet built at the point where we invoke clang. Can you please check this workaround in openmp/libomptarget/deviceRTLs/amdgcn/CMakeLists.txt?

add_custom_command(
  OUTPUT ${bc1_filename}
  COMMAND ${cu_cmd} ${file} -o ${bc1_filename}
  DEPENDS ${file} ${h_files} clang-resource-headers)

I added clang-resource-headers dependency after ${h_files}.

May 10 2021, 8:31 PM · Restricted Project
weiwang added a comment to D101509: An attempt to abandon omptarget out-of-tree builds..

@weiwang, I hope you do not mind if I ask you to run some experiments on your side? Otherwise, I am not sure how to proceed :)

Can you please run the command that fails, pass -E to it and check where the header files are coming from? I.e. run this:

cd /data/users/wangwei/tp2/llvm-build/platform009/build_nopic/projects/openmp/libomptarget/deviceRTLs/amdgcn && /data/users/wangwei/tp2/llvm-build/platform009/build_nopic/bin/clang-13 -xc++ -c -std=c++14 -ffreestanding -target amdgcn-amd-amdhsa -emit-llvm -Xclang -aux-triple -Xclang x86_64-unknown-linux-gnu -fopenmp -fopenmp-cuda-mode -Xclang -fopenmp-is-device -D__AMDGCN__ -Xclang -target-cpu -Xclang gfx700 -fvisibility=default -Wno-unused-value -nogpulib -O2 -I/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/amdgcn/src -I/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/common/include -I/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs /home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip -E

Regarding your question "how this change would change anything", can you please check for "Not building AMDGCN device RTL: AOMP not found" message in your "old" builds? I suppose my change for find_package invocation might have caused different behavior in your setup. Before my change we were looking for LLVM in the following paths:

$ENV{AOMP}
$ENV{HOME}/rocm/aomp
/opt/rocm/aomp
/usr/lib/rocm/aomp
${LIBOMPTARGET_NVPTX_CUDA_COMPILER_DIR}
${LIBOMPTARGET_NVPTX_CUDA_LINKER_DIR}
${CMAKE_CXX_COMPILER_DIR}

Now we look for LLVM in all paths that cmake examines by default: https://cmake.org/cmake/help/latest/command/find_package.html#search-procedure

May 10 2021, 7:06 PM · Restricted Project
weiwang added a comment to D101509: An attempt to abandon omptarget out-of-tree builds..

We are getting build errors internally with this change. They are all related to libomptarget. Our internal build script uses gcc to build llvm.

One example:

[108/122] Generating target_impl.gfx700.bc
FAILED: projects/openmp/libomptarget/deviceRTLs/amdgcn/target_impl.gfx700.bc
cd /data/users/wangwei/tp2/llvm-build/platform009/build_nopic/projects/openmp/libomptarget/deviceRTLs/amdgcn && /data/users/wangwei/tp2/llvm-build/platform009/build_nopic/bin/clang-13 -xc++ -c -std=c++14 -ffreestanding -target amdgcn-amd-amdhsa -emit-llvm -Xclang -aux-triple -Xclang x86_64-unknown-linux-gnu -fopenmp -fopenmp-cuda-mode -Xclang -fopenmp-is-device -D__AMDGCN__ -Xclang -target-cpu -Xclang gfx700 -fvisibility=default -Wno-unused-value -nogpulib -O2 -I/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/amdgcn/src -I/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/common/include -I/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs /home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip -o target_impl.gfx700.bc
/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip:185:25: error: use of undeclared identifier '__UINT64_C'
  lo = (uint32_t)(val & UINT64_C(0x00000000FFFFFFFF));
                        ^
/home/wangwei/fbsource/fbcode/third-party2/gcc/9.x/centos7-native/3bed279/lib/gcc/x86_64-redhat-linux-gnu/9.x/include/stdint-gcc.h:254:21: note: expanded from macro 'UINT64_C'
#define UINT64_C(c) __UINT64_C(c)
                    ^
/home/wangwei/local/llvm-project/openmp/libomptarget/deviceRTLs/amdgcn/src/target_impl.hip:186:26: error: use of undeclared identifier '__UINT64_C'
  hi = (uint32_t)((val & UINT64_C(0xFFFFFFFF00000000)) >> 32);
                         ^
/home/wangwei/fbsource/fbcode/third-party2/gcc/9.x/centos7-native/3bed279/lib/gcc/x86_64-redhat-linux-gnu/9.x/include/stdint-gcc.h:254:21: note: expanded from macro 'UINT64_C'
#define UINT64_C(c) __UINT64_C(c)
                    ^
2 errors generated.

Hello @weiwang, thank you for reporting this. Can you please provide some details on how to reproduce it?

One strange thing is that the pre-built /data/users/wangwei/tp2/llvm-build/platform009/build_nopic/bin/clang-13 includes stdint-gcc.h, where it takes the definition for UINT64_C. In my builds clang takes the definition from its own lib/clang/13.0.0/include/stdint.h.

Thanks for the quick response. It may not be easily reproducible, since the build script that triggers this sets up its own environment. This is part of the company's internal build system. In my local attempts, a clang-built clang always works, but the build script uses gcc to build clang. Maybe gcc inserts its own library headers into the search path, and this could cause some confusion about the order of include paths? But again, we have always used gcc to build clang, and it never had an issue until now. I am not sure how this change would change anything.

If you do not need libomptarget for your package, you may pass -DOPENMP_ENABLE_LIBOMPTARGET=OFF to cmake.

With -DOPENMP_ENABLE_LIBOMPTARGET=OFF, the error is gone. I'll check internally to see if libomptarget can be disabled. Meanwhile, it would still be great to know what went wrong.

May 10 2021, 6:34 PM · Restricted Project
weiwang added a comment to D101509: An attempt to abandon omptarget out-of-tree builds..

We are getting build errors internally with this change. They are all related to libomptarget. Our internal build script uses gcc to build llvm.

May 10 2021, 5:19 PM · Restricted Project

May 6 2021

weiwang added reviewers for D101793: [clang][AST] Improve AST Reader/Writer memory footprint: rsmith, riccibruno.
May 6 2021, 10:12 AM · Restricted Project
weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

Decls in Sema::DeclsToCheckForDeferredDiags are supposed to be unique. Therefore, the fact that '1,734,387,685 out of 1,734,404,000 elements are the same' is surprising. Did this happen when you compiled the source code and wrote the AST? What language was the source code: C++, OpenMP, or CUDA? What was the decl that got duplicated? Thanks.

May 6 2021, 10:11 AM · Restricted Project

May 3 2021

weiwang added a comment to D101793: [clang][AST] Improve AST Reader/Writer memory footprint.

We've seen a huge memory footprint from the AST Reader/Writer in a single CU with modules enabled in internal workloads. Upon further analysis, the content of the vector DeclsToCheckForDeferredDiags seems mostly redundant. In one case, 1,734,387,685 out of 1,734,404,000 elements are the same. While this may indicate something wrong with the source itself, it also suggests that the compiler would do better to deduplicate this type of Decl ID.

May 3 2021, 4:11 PM · Restricted Project
weiwang added a reviewer for D101793: [clang][AST] Improve AST Reader/Writer memory footprint: yaxunl.
May 3 2021, 4:07 PM · Restricted Project
weiwang requested review of D101793: [clang][AST] Improve AST Reader/Writer memory footprint.
May 3 2021, 4:03 PM · Restricted Project

Apr 28 2021

weiwang added inline comments to D101374: [LV] Consider Loop Unroll Hints When Making Interleave Decisions.
Apr 28 2021, 11:02 PM · Restricted Project, Restricted Project

Feb 12 2021

weiwang committed rG80dc0661bd8b: [LTO] Perform DSOLocal propagation in combined index (authored by weiwang).
[LTO] Perform DSOLocal propagation in combined index
Feb 12 2021, 11:11 PM
weiwang closed D96398: [LTO] Perform DSOLocal propagation in combined index.
Feb 12 2021, 11:10 PM · Restricted Project
weiwang added inline comments to D96398: [LTO] Perform DSOLocal propagation in combined index.
Feb 12 2021, 10:56 PM · Restricted Project
weiwang updated the diff for D96398: [LTO] Perform DSOLocal propagation in combined index.

update comment and remove one unintended change.

Feb 12 2021, 10:52 PM · Restricted Project
weiwang updated the diff for D96398: [LTO] Perform DSOLocal propagation in combined index.

fixed test failures due to bit check

Feb 12 2021, 1:56 PM · Restricted Project
weiwang added inline comments to D96398: [LTO] Perform DSOLocal propagation in combined index.
Feb 12 2021, 10:46 AM · Restricted Project
weiwang updated the diff for D96398: [LTO] Perform DSOLocal propagation in combined index.
  1. serialize and de-serialize the flag.
  2. update related test cases.
Feb 12 2021, 10:44 AM · Restricted Project

Feb 11 2021

weiwang added inline comments to D96398: [LTO] Perform DSOLocal propagation in combined index.
Feb 11 2021, 5:29 PM · Restricted Project
weiwang updated the diff for D96398: [LTO] Perform DSOLocal propagation in combined index.

update according to comment.

Feb 11 2021, 5:22 PM · Restricted Project
weiwang retitled D96398: [LTO] Perform DSOLocal propagation in combined index from [LTO] Pre-populate IsDSOLocal result in combined index to [LTO] Perform DSOLocal propagation in combined index.
Feb 11 2021, 3:44 PM · Restricted Project
weiwang updated the diff for D96398: [LTO] Perform DSOLocal propagation in combined index.

Use the propagation approach.

Feb 11 2021, 3:41 PM · Restricted Project
weiwang added inline comments to D96398: [LTO] Perform DSOLocal propagation in combined index.
Feb 11 2021, 10:58 AM · Restricted Project

Feb 10 2021

weiwang added a comment to D96398: [LTO] Perform DSOLocal propagation in combined index.

The overhead of repeatedly scanning all summaries can be quite high in the current ValueInfo::isDSOLocal implementation, especially in large projects. This change helped reduce the overall ThinLTO link time of a major internal workload by 20 minutes.
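The general idea of trading a per-query scan for a one-time propagation pass can be sketched roughly as follows. This is not the patch itself: `ToySummary`, `ToyValueInfo`, and the functions below are hypothetical stand-ins, and the real types in LLVM's module summary index differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy types for illustration; the real summary index types differ.
struct ToySummary {
  bool DSOLocal;
};

struct ToyValueInfo {
  std::vector<ToySummary> Summaries; // one per module that has a copy

  // Per-query scan: every call walks all copies of the value.
  bool isDSOLocalByScan() const {
    return !Summaries.empty() &&
           std::all_of(Summaries.begin(), Summaries.end(),
                       [](const ToySummary &S) { return S.DSOLocal; });
  }
};

// One-time pass over the index: if any copy is not DSO-local, clear the
// flag on every copy so that all copies agree afterwards.
void propagateDSOLocal(std::vector<ToyValueInfo> &Index) {
  for (ToyValueInfo &VI : Index) {
    if (VI.isDSOLocalByScan())
      continue;
    for (ToySummary &S : VI.Summaries)
      S.DSOLocal = false;
  }
}

// After propagation every copy carries the same answer, so a query only
// needs to look at the first copy.
bool isDSOLocalAfterPropagation(const ToyValueInfo &VI) {
  return !VI.Summaries.empty() && VI.Summaries.front().DSOLocal;
}

// Usage example: one value is local in all copies, one is not.
void exampleDSOLocalPropagation() {
  std::vector<ToyValueInfo> Index(2);
  Index[0].Summaries = {ToySummary{true}, ToySummary{true}};
  Index[1].Summaries = {ToySummary{true}, ToySummary{false}};
  propagateDSOLocal(Index);
  assert(isDSOLocalAfterPropagation(Index[0]));
  assert(!isDSOLocalAfterPropagation(Index[1]));
}
```

The scan cost is paid once per value during propagation, after which each query is constant time regardless of how many modules define the value.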

Feb 10 2021, 9:25 AM · Restricted Project

Feb 9 2021

weiwang added a reviewer for D96398: [LTO] Perform DSOLocal propagation in combined index: tejohnson.
Feb 9 2021, 10:49 PM · Restricted Project
weiwang requested review of D96398: [LTO] Perform DSOLocal propagation in combined index.
Feb 9 2021, 10:43 PM · Restricted Project

Dec 1 2020

weiwang added a comment to D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.

Ah, I am sorry. Thanks for fixing it.

Dec 1 2020, 9:07 AM · Restricted Project, Restricted Project

Nov 30 2020

weiwang committed rG93dc1b5b8cb2: [Remarks][2/2] Expand remarks hotness threshold option support in more tools (authored by weiwang).
[Remarks][2/2] Expand remarks hotness threshold option support in more tools
Nov 30 2020, 9:58 PM
weiwang committed rG3acda91742b7: [Remarks][1/2] Expand remarks hotness threshold option support in more tools (authored by weiwang).
[Remarks][1/2] Expand remarks hotness threshold option support in more tools
Nov 30 2020, 9:58 PM
weiwang closed D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.
Nov 30 2020, 9:57 PM · Restricted Project, Restricted Project
weiwang closed D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.
Nov 30 2020, 9:57 PM · Restricted Project
weiwang added a comment to D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.

lgtm with a couple of minor nits noted below that you can fix before submitting

Nov 30 2020, 9:53 PM · Restricted Project, Restricted Project
weiwang updated the diff for D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.
  1. Fix typo.
  2. Minor order adjustment in testcase.
Nov 30 2020, 9:51 PM · Restricted Project, Restricted Project
weiwang updated the diff for D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.

rebase

Nov 30 2020, 11:24 AM · Restricted Project, Restricted Project
weiwang updated the diff for D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.

rebase

Nov 30 2020, 11:24 AM · Restricted Project
weiwang added a comment to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.

@tejohnson @MaskRay Do you have other comments?

Nov 30 2020, 9:52 AM · Restricted Project
weiwang added a comment to D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.

@tejohnson @MaskRay Do you have other comments?

Nov 30 2020, 9:51 AM · Restricted Project, Restricted Project

Nov 18 2020

weiwang added a comment to D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.

Thanks for adding the Driver test. I was thinking of something to test the CompilerInvocation changes, similar to your test using opt, that ensures the option has the desired behavior when invoked via clang. Looks like there is an existing test clang/test/Frontend/optimization-remark-with-hotness.c that perhaps could be extended or leveraged?

Nov 18 2020, 4:25 PM · Restricted Project, Restricted Project
weiwang updated the diff for D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.
  1. Add clang test with remarks output.
  2. Fix a missing dependency on PSI in legacy pass manager.
Nov 18 2020, 4:23 PM · Restricted Project, Restricted Project

Nov 17 2020

weiwang updated the diff for D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.

revert back MSVC fix

Nov 17 2020, 5:50 PM · Restricted Project
weiwang added inline comments to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.
Nov 17 2020, 5:39 PM · Restricted Project
weiwang updated the diff for D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.

update to address MSVC build error

Nov 17 2020, 5:25 PM · Restricted Project
weiwang updated the diff for D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.

update test case

Nov 17 2020, 5:08 PM · Restricted Project
weiwang added inline comments to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.
Nov 17 2020, 5:08 PM · Restricted Project
weiwang added inline comments to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.
Nov 17 2020, 3:38 PM · Restricted Project
weiwang added inline comments to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.
Nov 17 2020, 3:31 PM · Restricted Project
weiwang added inline comments to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.
Nov 17 2020, 2:35 PM · Restricted Project
weiwang updated the diff for D85808: [Remarks][2/2] Expand remarks hotness threshold option support in more tools.

update test case for clang option pass-through

Nov 17 2020, 1:19 PM · Restricted Project, Restricted Project
weiwang committed rG3279347da05e: [BPI] Look through bitcasts in calcZeroHeuristic (authored by weiwang).
[BPI] Look through bitcasts in calcZeroHeuristic
Nov 17 2020, 9:35 AM
weiwang closed D91450: [BPI] Look through bitcasts in calcZeroHeuristic.
Nov 17 2020, 9:34 AM · Restricted Project
weiwang added a comment to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.

Thanks! I will wait for his input.

Nov 17 2020, 9:29 AM · Restricted Project

Nov 16 2020

weiwang added a comment to D85809: [Remarks][1/2] Expand remarks hotness threshold option support in more tools.

@MaskRay @tejohnson, do you have other comments regarding this change and its dependent?

Nov 16 2020, 4:33 PM · Restricted Project
weiwang updated the diff for D91450: [BPI] Look through bitcasts in calcZeroHeuristic.

add test case

Nov 16 2020, 10:26 AM · Restricted Project

Nov 13 2020

weiwang updated the summary of D91450: [BPI] Look through bitcasts in calcZeroHeuristic.
Nov 13 2020, 11:27 AM · Restricted Project
weiwang added a reviewer for D91450: [BPI] Look through bitcasts in calcZeroHeuristic: samparker.
Nov 13 2020, 11:22 AM · Restricted Project