- User Since
- Jul 20 2020, 2:20 PM (139 w, 1 d)
Thu, Mar 16
Wed, Mar 15
Tue, Feb 28
add target triple
Mon, Feb 27
add test case and some comments
Thu, Feb 23
Sep 13 2022
Sep 12 2022
Aug 2 2022
Thanks for the quick feedback!
Your concerns are valid. I encountered them as well during local testing.
- Our internal build system (buck) applies -nostdlib throughout linking, so this requirement is satisfied.
- There is no direct support for extracting specific members from an archive. We are updating the build system to turn all regular archives into thin archives (--[start|end]-lib). With that done, archive members can be accessed directly. And you are right about elf-init.oS. During my local testing, I had to extract it manually, so some work needs to be done here. As a more general approach, formatting the index as ("archivename", "object") pairs and using a separate flag to parse the index (like "--remapping-file=") is also doable.
- The index file contains a line-separated list of input files and can be used directly as a response file in the final link (I should've just used @thinlto.index.full in the test case provided).
The current implicit ThinLTO takes the simplest approach by inserting lto.tmp at the end, breaking ordering.
I don't think that scheme is set in stone, and changing distributed ThinLTO to behave like it is probably not the right direction.
On the other hand, it's non-trivial to fix its ordering. Therefore, I think making distributed ThinLTO and implicit ThinLTO have very similar symbol resolution behaviors is a stretch goal. If --thinlto-index= doesn't behave like implicit ThinLTO, I don't think it is a design flaw.
Jul 22 2022
Jul 21 2022
Apr 27 2022
minor update to test
comments and fix typo
Apr 26 2022
Apr 25 2022
Mar 30 2022
Jan 21 2022
Jan 18 2022
Nov 19 2021
Oct 27 2021
This issue has been blocking our internal module re-enablement for some time now, and we would really appreciate any feedback. We also wonder whether only DeducedTemplateSpecializationType is affected or whether it could also happen to other types.
Oct 25 2021
Sep 15 2021
Sep 8 2021
Sep 3 2021
update as discussed.
Sep 2 2021
Just double-checked; these are the full OpenMP-related options currently in use:
"-fopenmp" "-fopenmp-version=31" "-fopenmp-version=31" "-fopenmp-cuda-parallel-target-regions"
We saw a huge number of DECLS_TO_CHECK_FOR_DEFERRED_DIAGS records. I don't know if this has anything to do with the OpenMP version being 31, since prior to 5.0, everything is available on the host.
Our internal codebase never uses the target directive. Once the deferred diags are bypassed, we observe an 18% e2e build time improvement.
Aug 26 2021
May 21 2021
May 20 2021
Thanks for the approval!
make both ASTReader::DeclsToCheckForDeferredDiags and Sema::DeclsToCheckForDeferredDiags SmallSetVector
May 19 2021
Tried making Sema::DeclsToCheckForDeferredDiags an llvm::SmallSetVector. The heap RSS did drop significantly (from a peak of 100GB to 59GB), but that is not as good as the current fix (peak 26GB), which makes ASTReader::DeclsToCheckForDeferredDiags an llvm::SmallSetVector.
May 17 2021
May 13 2021
Finally dealt with the other issues I needed to take care of. Let's resume the discussion.
May 11 2021
Thanks for helping with this issue!
May 10 2021
Looks like -DLIBOMPTARGET_AMDGCN_GFXLIST="" would disable the device RTL build but still build the rest of libomptarget.
Thanks for the quick response. It may not be easily reproducible, since the build script that triggers this sets up its own environment. This is part of the company's internal build system. In my local attempts, a clang-built clang always works, but the build script uses gcc to build clang. Maybe gcc inserts its own library headers into the search path, and this could cause some confusion about the order of include paths? But again, we have always used gcc to build clang, and it never had issues until now. I am not sure how this change would affect that.
If you do not need libomptarget for your package, you may pass -DOPENMP_ENABLE_LIBOMPTARGET=OFF to cmake.
With -DOPENMP_ENABLE_LIBOMPTARGET=OFF, the error is gone. I'll check internally to see if libomptarget can be disabled. Meanwhile, it would still be great to know what went wrong.
We are getting build errors internally with this change. They are all related to libomptarget. Our internal build script uses gcc to build llvm.
May 6 2021
May 3 2021
We've seen a huge memory footprint from the AST Reader/Writer in a single CU with modules enabled in internal workloads. Upon further analysis, the content of the vector DeclsToCheckForDeferredDiags seems mostly redundant. In one case, 1,734,387,685 out of 1,734,404,000 elements are duplicates. While this may indicate something wrong with the source itself, it also suggests that the compiler would do better to deduplicate this type of Decl ID.
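To illustrate the deduplication idea, here is a minimal sketch using only the C++ standard library. It is not the actual patch: OrderedIdSet and dedupPreservingOrder are hypothetical names, and the class is only a rough stand-in for an insertion-ordered set such as llvm::SmallSetVector. Plain 64-bit integers are assumed in place of real Decl IDs.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an insertion-ordered set (in the spirit of
// llvm::SmallSetVector): each ID is stored once, in first-seen order.
class OrderedIdSet {
public:
  // Returns true if the ID was newly inserted, false if it was a duplicate.
  bool insert(std::uint64_t Id) {
    if (!Seen.insert(Id).second)
      return false;
    Order.push_back(Id);
    return true;
  }

  std::size_t size() const { return Order.size(); }

  // Unique IDs in the order they were first inserted.
  const std::vector<std::uint64_t> &ids() const { return Order; }

private:
  std::unordered_set<std::uint64_t> Seen; // membership check
  std::vector<std::uint64_t> Order;       // deterministic iteration order
};

// Hypothetical helper: collapse a redundant ID list while keeping first-seen order.
inline std::vector<std::uint64_t>
dedupPreservingOrder(const std::vector<std::uint64_t> &Raw) {
  OrderedIdSet Set;
  for (std::uint64_t Id : Raw)
    Set.insert(Id);
  return Set.ids();
}
```

With a container like this, a list that is almost entirely duplicates shrinks to its unique entries, and iteration order stays deterministic, which matters when the IDs are later serialized.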
Apr 28 2021
Feb 12 2021
update comment and remove one unintended change.
fixed test failures due to bit check
- serialize and de-serialize the flag.
- update related test cases.
Feb 11 2021
update according to comment.
Use the propagation approach.