This is an archive of the discontinued LLVM Phabricator instance.

[PGO] Extend the value profile buckets for mem op sizes.
ClosedPublic

Authored by hjyamauchi on Jun 11 2020, 12:11 PM.

Details

Summary

Extend the memop value profile buckets to be more flexible (could accommodate a
mix of individual values and ranges) and to cover more value ranges (from 11 to
22 buckets).

Disabled behind a flag (to be enabled separately) and the existing code to be
removed later.
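
For illustration, the bucket layout described above can be sketched in plain C. This is a minimal sketch, not the patch's code: `range_rep_value`, `count_buckets` and `check_examples` are hypothetical names. The boundaries are an assumption taken from the comments and examples in the diff quoted later in this thread: sizes 0 through 8 and powers of two up to 512 are tracked individually, the gaps between them collapse into ranges, and everything from 513 up shares one last range.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the patch's InstrProfGetRangeRepValue: maps an
 * observed memop size to the representative (lowest) value of its bucket. */
static uint64_t range_rep_value(uint64_t value) {
  if (value <= 8)
    return value;                 /* 0..8 are tracked individually */
  if (value >= 513)
    return 513;                   /* last bucket covers 513 and up */
  if ((value & (value - 1)) == 0)
    return value;                 /* a power of two is its own bucket */
  uint64_t pow2 = 1;
  while ((pow2 << 1) < value)     /* largest power of two below value */
    pow2 <<= 1;
  return pow2 + 1;                /* previous power of two, plus one */
}

/* Counts distinct representatives over sizes 0..limit. The mapping is
 * non-decreasing, so comparing against the previous one is enough. */
static int count_buckets(uint64_t limit) {
  int buckets = 1;                /* the bucket that holds size 0 */
  uint64_t prev = range_rep_value(0);
  for (uint64_t v = 1; v <= limit; ++v) {
    uint64_t rep = range_rep_value(v);
    if (rep != prev) {
      ++buckets;
      prev = rep;
    }
  }
  return buckets;
}

/* The examples listed in the comment above InstrProfGetRangeRepValue. */
static void check_examples(void) {
  assert(range_rep_value(5) == 5);
  assert(range_rep_value(22) == 17);
  assert(range_rep_value(99) == 65);
  assert(range_rep_value(256) == 256);
  assert(range_rep_value(1001) == 513);
}
```

Under these assumptions `count_buckets(1024)` returns 22, which matches the bucket count in the summary.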

Diff Detail

Event Timeline

hjyamauchi created this revision. Jun 11 2020, 12:11 PM
Herald added projects: Restricted Project, Restricted Project. Jun 11 2020, 12:11 PM
Herald added subscribers: Restricted Project, hiraditya, mgorny.

the approach looks sound.

compiler-rt/include/profile/InstrProfData.inc
168

add FIXME

170

unwanted format change?

810

should 5,6, and 7 be merged into one range?

compiler-rt/lib/profile/InstrProfilingValue.c
254

add FIXME

llvm/include/llvm/ProfileData/InstrProf.h
78

FIXME

hjyamauchi marked 6 inline comments as done.

Address comments.

compiler-rt/include/profile/InstrProfData.inc
810

This is as intended because some mem ops could have separate code paths for 5, 6, and 7 (at least our memcmp does in fact) and because the old buckets were 0-9 individually. I'd like the new buckets to be flexible to accommodate those and to supersede the old buckets.

davidxl added inline comments. Jun 19 2020, 10:05 AM
compiler-rt/include/profile/InstrProfData.inc
836

The name of the interface is too long.

InstrProfGetRangeRepValue would be good enough.

855

InstrProfIsSingleValRange(uint64_t RepValue)

llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
620

this branch can be merged with the above -- though it is to be deprecated.

hjyamauchi marked 4 inline comments as done.

Address comments.

hjyamauchi added inline comments. Jun 22 2020, 11:43 AM
llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
620

They can be merged into something like:

    Type *ParamTypes[] = {
#define VALUE_PROF_FUNC_PARAM(ParamType, ParamName, ParamLLVMType) ParamLLVMType
#include "llvm/ProfileData/InstrProfData.inc"
    };

    Type *RangeParamTypes[] = {
#define VALUE_RANGE_PROF 1
#define VALUE_PROF_FUNC_PARAM(ParamType, ParamName, ParamLLVMType) ParamLLVMType
#include "llvm/ProfileData/InstrProfData.inc"
#undef VALUE_RANGE_PROF
    };

    auto *ValueProfilingCallTy =
        CallType != ValueProfilingCallType::OldMemOp
            ? FunctionType::get(ReturnTy, makeArrayRef(ParamTypes), false)
            : FunctionType::get(ReturnTy, makeArrayRef(RangeParamTypes), false);

    StringRef FuncName = CallType != ValueProfilingCallType::MemOp
                             ? getInstrProfValueProfFuncName()
                             : getInstrProfValueProfMemOpFuncName();

    return M.getOrInsertFunction(FuncName, ValueProfilingCallTy, AL);

which would be a few lines shorter and fine.

But I like the current code in that it already makes it more clear-cut what the new code would look like and what will be removed in a future patch that removes the deprecated code (the if statement along with the else block will be simply removed.)

davidxl accepted this revision. Jun 24 2020, 1:40 PM

lgtm

This revision is now accepted and ready to land. Jun 24 2020, 1:40 PM
This revision was automatically updated to reflect the committed changes.

Build error

C:\PROGRA~2\MIB055~1\2017\PROFES~1\VC\Tools\MSVC\1416~1.270\bin\Hostx64\x64\cl.exe /nologo /TP -DGTEST_HAS_RTTI=0 -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -DSTDC_CONSTANT_MACROS -DSTDC_FORMAT_MACROS -DSTDC_LIMIT_MACROS -Ilib\Transforms\Instrumentation -IC:\b\slave\sanitizer-windows\llvm-project\llvm\lib\Transforms\Instrumentation -Iinclude -IC:\b\slave\sanitizer-windows\llvm-project\llvm\include /DWIN32 /D_WINDOWS /Zc:inline /Zi /Zc:strictStrings /Oi /Zc:rvalueCast /W4 -wd4141 -wd4146 -wd4244 -wd4267 -wd4291 -wd4351 -wd4456 -wd4457 -wd4458 -wd4459 -wd4503 -wd4624 -wd4722 -wd4100 -wd4127 -wd4512 -wd4505 -wd4610 -wd4510 -wd4702 -wd4245 -wd4706 -wd4310 -wd4701 -wd4703 -wd4389 -wd4611 -wd4805 -wd4204 -wd4577 -wd4091 -wd4592 -wd4319 -wd4709 -wd4324 -w14062 -we4238 /Gw /MD /O2 /Ob2 /EHs-c- /GR- -UNDEBUG /showIncludes /Folib\Transforms\Instrumentation\CMakeFiles\LLVMInstrumentation.dir\PGOMemOPSizeOpt.cpp.obj /Fdlib\Transforms\Instrumentation\CMakeFiles\LLVMInstrumentation.dir\ /FS -c C:\b\slave\sanitizer-windows\llvm-project\llvm\lib\Transforms\Instrumentation\PGOMemOPSizeOpt.cpp
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ProfileData/InstrProfData.inc(843): error C3861: '__builtin_popcountll': identifier not found
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ProfileData/InstrProfData.inc(848): error C3861: '__builtin_clzll': identifier not found
C:\b\slave\sanitizer-windows\llvm-project\llvm\include\llvm/ProfileData/InstrProfData.inc(859): error C3861: '__builtin_popcountll': identifier not found

hjyamauchi reopened this revision. Jun 25 2020, 1:25 PM
This revision is now accepted and ready to land. Jun 25 2020, 1:25 PM

Update to fix MSVC build errors.

Here's the raw diff for the last update

diff --git a/compiler-rt/include/profile/InstrProfData.inc b/compiler-rt/include/profile/InstrProfData.inc
index f4cb2524ad7..8c6e6cffe95 100644
--- a/compiler-rt/include/profile/InstrProfData.inc
+++ b/compiler-rt/include/profile/InstrProfData.inc
@@ -830,6 +830,47 @@ typedef struct InstrProfValueData {
  * metadata. For example, it's 2 for [2, 2] and 64 for [65, 127].
  */
 
+/*
+ * Clz and Popcount. This code was copied from
+ * compiler-rt/lib/fuzzer/{FuzzerBuiltins.h,FuzzerBuiltinsMsvc.h}.
+ */
+#if defined(_MSC_VER) && !defined(__clang__)
+
+#include <intrin.h>
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+uint32_t InstProfClzll(uint64_t X) {
+  unsigned long LeadZeroIdx = 0;
+#if !defined(_M_ARM) && !defined(_M_X64)
+  // Scan the high 32 bits.
+  if (_BitScanReverse(&LeadZeroIdx, static_cast<unsigned long>(X >> 32)))
+    return static_cast<int>(63 - (LeadZeroIdx + 32)); // Create a bit offset
+                                                      // from the MSB.
+  // Scan the low 32 bits.
+  if (_BitScanReverse(&LeadZeroIdx, static_cast<unsigned long>(X)))
+    return static_cast<int>(63 - LeadZeroIdx);
+#else
+  if (_BitScanReverse64(&LeadZeroIdx, X)) return 63 - LeadZeroIdx;
+#endif
+  return 64;
+}
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+int InstProfPopcountll(unsigned long long X) {
+#if !defined(_M_ARM) && !defined(_M_X64)
+  return __popcnt(X) + __popcnt(X >> 32);
+#else
+  return __popcnt64(X);
+#endif
+}
+
+#else
+
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+uint32_t InstProfClzll(unsigned long long X) { return __builtin_clzll(X); }
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+int InstProfPopcountll(unsigned long long X) { return __builtin_popcountll(X); }
+
+#endif  /* defined(_MSC_VER) && !defined(__clang__) */
+
 /* Map an (observed) memop size value to the representative value of its range.
  * For example, 5 -> 5, 22 -> 17, 99 -> 65, 256 -> 256, 1001 -> 513. */
 INSTR_PROF_VISIBILITY INSTR_PROF_INLINE uint64_t
@@ -840,12 +881,12 @@ InstrProfGetRangeRepValue(uint64_t Value) {
   else if (Value >= 513)
     // The last range is mapped to its lowest value.
     return 513;
-  else if (__builtin_popcountll(Value) == 1)
+  else if (InstProfPopcountll(Value) == 1)
     // If it's a power of two, use it as is.
     return Value;
   else
     // Otherwise, take to the previous power of two + 1.
-    return (1 << (64 - __builtin_clzll(Value) - 1)) + 1;
+    return (1 << (64 - InstProfClzll(Value) - 1)) + 1;
 }
 
 /* Return true if the range that an (observed) memop size value belongs to has
@@ -856,7 +897,7 @@ InstrProfIsSingleValRange(uint64_t Value) {
   if (Value <= 8)
     // The first ranges are individually tracked.
     return 1;
-  else if (__builtin_popcountll(Value) == 1)
+  else if (InstProfPopcountll(Value) == 1)
     // If it's a power of two, there's only one value.
     return 1;
   else
diff --git a/llvm/include/llvm/ProfileData/InstrProfData.inc b/llvm/include/llvm/ProfileData/InstrProfData.inc
index f4cb2524ad7..8c6e6cffe95 100644
--- a/llvm/include/llvm/ProfileData/InstrProfData.inc
+++ b/llvm/include/llvm/ProfileData/InstrProfData.inc
@@ -830,6 +830,47 @@ typedef struct InstrProfValueData {
  * metadata. For example, it's 2 for [2, 2] and 64 for [65, 127].
  */
 
+/*
+ * Clz and Popcount. This code was copied from
+ * compiler-rt/lib/fuzzer/{FuzzerBuiltins.h,FuzzerBuiltinsMsvc.h}.
+ */
+#if defined(_MSC_VER) && !defined(__clang__)
+
+#include <intrin.h>
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+uint32_t InstProfClzll(uint64_t X) {
+  unsigned long LeadZeroIdx = 0;
+#if !defined(_M_ARM) && !defined(_M_X64)
+  // Scan the high 32 bits.
+  if (_BitScanReverse(&LeadZeroIdx, static_cast<unsigned long>(X >> 32)))
+    return static_cast<int>(63 - (LeadZeroIdx + 32)); // Create a bit offset
+                                                      // from the MSB.
+  // Scan the low 32 bits.
+  if (_BitScanReverse(&LeadZeroIdx, static_cast<unsigned long>(X)))
+    return static_cast<int>(63 - LeadZeroIdx);
+#else
+  if (_BitScanReverse64(&LeadZeroIdx, X)) return 63 - LeadZeroIdx;
+#endif
+  return 64;
+}
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+int InstProfPopcountll(unsigned long long X) {
+#if !defined(_M_ARM) && !defined(_M_X64)
+  return __popcnt(X) + __popcnt(X >> 32);
+#else
+  return __popcnt64(X);
+#endif
+}
+
+#else
+
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+uint32_t InstProfClzll(unsigned long long X) { return __builtin_clzll(X); }
+INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
+int InstProfPopcountll(unsigned long long X) { return __builtin_popcountll(X); }
+
+#endif  /* defined(_MSC_VER) && !defined(__clang__) */
+
 /* Map an (observed) memop size value to the representative value of its range.
  * For example, 5 -> 5, 22 -> 17, 99 -> 65, 256 -> 256, 1001 -> 513. */
 INSTR_PROF_VISIBILITY INSTR_PROF_INLINE uint64_t
@@ -840,12 +881,12 @@ InstrProfGetRangeRepValue(uint64_t Value) {
   else if (Value >= 513)
     // The last range is mapped to its lowest value.
     return 513;
-  else if (__builtin_popcountll(Value) == 1)
+  else if (InstProfPopcountll(Value) == 1)
     // If it's a power of two, use it as is.
     return Value;
   else
     // Otherwise, take to the previous power of two + 1.
-    return (1 << (64 - __builtin_clzll(Value) - 1)) + 1;
+    return (1 << (64 - InstProfClzll(Value) - 1)) + 1;
 }
 
 /* Return true if the range that an (observed) memop size value belongs to has
@@ -856,7 +897,7 @@ InstrProfIsSingleValRange(uint64_t Value) {
   if (Value <= 8)
     // The first ranges are individually tracked.
     return 1;
-  else if (__builtin_popcountll(Value) == 1)
+  else if (InstProfPopcountll(Value) == 1)
     // If it's a power of two, there's only one value.
     return 1;
   else

davidxl added a reviewer: rnk. Jun 25 2020, 2:19 PM

Added Reid to help review the MS compiler relevant part: https://reviews.llvm.org/D81682/new

Further refined the MSVC related code.

Based on https://godbolt.org/z/BxlSWX,

  • __popcnt and __popcnt64 aren't available on arm/arm64 and not on all x86/x86-64 CPUs. So, go back to bit twiddling.
  • _BitScanReverse64 is available on arm64 and x86-64 only.
  • Removed static_cast as this is included in a .c file.

The relevant MSVC related code now looks like

/*
 * Clz and Popcount. This code was copied from
 * compiler-rt/lib/fuzzer/{FuzzerBuiltins.h,FuzzerBuiltinsMsvc.h} and
 * llvm/include/llvm/Support/MathExtras.h.
 */
#if defined(_MSC_VER) && !defined(__clang__)

#include <intrin.h>
INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
int InstProfClzll(unsigned long long X) {
  unsigned long LeadZeroIdx = 0;
#if !defined(_M_ARM64) && !defined(_M_X64)
  // Scan the high 32 bits.
  if (_BitScanReverse(&LeadZeroIdx, (unsigned long)(X >> 32)))
    return (int)(63 - (LeadZeroIdx + 32)); // Create a bit offset from the MSB.
  // Scan the low 32 bits.
  if (_BitScanReverse(&LeadZeroIdx, (unsigned long)(X)))
    return (int)(63 - LeadZeroIdx);
#else
  if (_BitScanReverse64(&LeadZeroIdx, X)) return 63 - LeadZeroIdx;
#endif
  return 64;
}
INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
int InstProfPopcountll(unsigned long long X) {
  // This code originates from https://reviews.llvm.org/rG30626254510f.
  unsigned long long v = X;
  v = v - ((v >> 1) & 0x5555555555555555ULL);
  v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
  v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  return (int)((unsigned long long)(v * 0x0101010101010101ULL) >> 56);
}

#else

INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
int InstProfClzll(unsigned long long X) { return __builtin_clzll(X); }
INSTR_PROF_VISIBILITY INSTR_PROF_INLINE
int InstProfPopcountll(unsigned long long X) { return __builtin_popcountll(X); }

#endif  /* defined(_MSC_VER) && !defined(__clang__) */
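
As a sanity check on the fallback above, the bit-twiddling popcount can be compared against a naive loop in portable C. This is a minimal sketch: `popcount_swar`, `popcount_naive`, `clz_portable` and `check_bit_helpers` are hypothetical names, not part of the patch. The body of `popcount_swar` mirrors the InstProfPopcountll quoted above, while `clz_portable` is a plain loop rather than the intrinsic-based code. One difference worth noting: `clz_portable` returns 64 for zero, like the MSVC branch above, whereas the result of `__builtin_clzll(0)` is undefined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SWAR popcount, same arithmetic as the InstProfPopcountll fallback. */
static int popcount_swar(uint64_t x) {
  uint64_t v = x;
  v = v - ((v >> 1) & 0x5555555555555555ULL);
  v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
  v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
  return (int)((v * 0x0101010101010101ULL) >> 56);
}

/* Reference implementation: clear the lowest set bit until none remain. */
static int popcount_naive(uint64_t x) {
  int n = 0;
  for (; x != 0; x &= x - 1)
    ++n;
  return n;
}

/* Loop-based count of leading zeros; defined as 64 for an input of zero. */
static int clz_portable(uint64_t x) {
  if (x == 0)
    return 64;
  int n = 0;
  while ((x & (1ULL << 63)) == 0) {
    x <<= 1;
    ++n;
  }
  return n;
}

static void check_bit_helpers(void) {
  const uint64_t samples[] = {0, 1, 2, 8, 9, 22, 512, 513, 1001, UINT64_MAX};
  for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); ++i)
    assert(popcount_swar(samples[i]) == popcount_naive(samples[i]));
  assert(clz_portable(1) == 63);
  assert(clz_portable(22) == 59);
  assert(clz_portable(UINT64_MAX) == 0);
}
```

With these helpers, `(1 << (64 - clz_portable(22) - 1)) + 1` evaluates to 17, the representative value the diff's comment gives for 22.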

davidxl added inline comments. Jul 7 2020, 8:20 PM
compiler-rt/include/profile/InstrProfData.inc
839
842

Since these helpers are only used by the runtime on the target, not by the host compiler, they should be moved to InstrProfilingUtil.c instead, as InstrProfData.inc is shared by the runtime and the compiler.

compiler-rt/lib/profile/InstrProfilingValue.c
274

Ideally, this function should be inline expanded by the compiler at instrumentation time -- but that can be done separately.

hjyamauchi marked 3 inline comments as done. Jul 8 2020, 2:38 PM
hjyamauchi added inline comments.
compiler-rt/include/profile/InstrProfData.inc
839

Unfortunately, no. I already tried this. (Note "__popcnt and __popcnt64 aren't available on arm/arm64 and not on all x86/x86-64 CPUs" in the older update message.)

I can confirm it on https://godbolt.org/z/bn5PEy

842

InstrProfIsSingleValRange (hence InstProfPopcountll) is used by the compiler in lib/Transforms/Instrumentation/PGOMemOPSizeOpt.cpp.

compiler-rt/lib/profile/InstrProfilingValue.c
274

True.

davidxl accepted this revision. Jul 14 2020, 9:04 AM

lgtm

dmajor added a subscriber: dmajor. Jul 22 2020, 3:11 PM

The re-landing of this patch in 4a539faf74b9b4c25ee3b880e4007564bd5139b0 causes our PGO build of clang to have thousands of

LLVM Profile Warning: Unable to track new values: Running out of static counters.  Consider using option -mllvm -vp-counters-per-site=<n> to allocate more value profile counters at compile time.

Is that expected? Should the cmake set a higher value of -vp-counters-per-site in self-hosted PGO builds?

MaskRay reopened this revision. Jul 22 2020, 4:19 PM
MaskRay added a subscriber: MaskRay.

Chatted with @yamauchi in another channel. Reverted in 27650ec5541cd604a5027ad63895e0badfd35efe

It caused a __llvm_profile_instrument_range related crash in the PGO-instrumented clang (on many source files):

  1│ Dump of assembler code for function __llvm_profile_instrument_target:
  2│    0x000055555dba0820 <+0>:     push   %rbp
  3│    0x000055555dba0821 <+1>:     mov    %rsp,%rbp
  4│    0x000055555dba0824 <+4>:     push   %r15
  5│    0x000055555dba0826 <+6>:     push   %r14
  6│    0x000055555dba0828 <+8>:     push   %r12
  7│    0x000055555dba082a <+10>:    push   %rbx
  8│    0x000055555dba082b <+11>:    test   %rsi,%rsi
  9│    0x000055555dba082e <+14>:    je     0x55555dba098c <__llvm_profile_instrument_target+364>
 10│    0x000055555dba0834 <+20>:    mov    %edx,%r14d
 11│    0x000055555dba0837 <+23>:    mov    %rsi,%rbx
 12│    0x000055555dba083a <+26>:    mov    %rdi,%r15
 13│    0x000055555dba083d <+29>:    mov    0x20(%rsi),%r12
 14│    0x000055555dba0841 <+33>:    test   %r12,%r12
 15│    0x000055555dba0844 <+36>:    je     0x55555dba08bd <__llvm_profile_instrument_target+157>
 16│    0x000055555dba0846 <+38>:    mov    %r14d,%r14d
 17│    0x000055555dba0849 <+41>:    mov    (%r12,%r14,8),%rsi
 18│    0x000055555dba084d <+45>:    test   %rsi,%rsi
 19│    0x000055555dba0850 <+48>:    je     0x55555dba091a <__llvm_profile_instrument_target+250>
 20│    0x000055555dba0856 <+54>:    mov    $0xffffffffffffffff,%rdx
 21│    0x000055555dba085d <+61>:    xor    %ecx,%ecx
 22│    0x000055555dba085f <+63>:    xor    %eax,%eax
 23│    0x000055555dba0861 <+65>:    nopw   %cs:0x0(%rax,%rax,1)
 24│    0x000055555dba086b <+75>:    nopl   0x0(%rax,%rax,1)
 25│    0x000055555dba0870 <+80>:    mov    %rsi,%rbx
 26│    0x000055555dba0873 <+83>:    mov    0x8(%rsi),%rsi    ####### %rsi=-1 on this line
 27│    0x000055555dba0877 <+87>:    cmp    %r15,(%rbx)
 28│    0x000055555dba087a <+90>:    je     0x55555dba0976 <__llvm_profile_instrument_target+342>
 29│    0x000055555dba0880 <+96>:    cmp    %rdx,%rsi
 30│    0x000055555dba0883 <+99>:    cmovb  %rbx,%rax
 31│    0x000055555dba0887 <+103>:   cmovb  %rsi,%rdx
 32│    0x000055555dba088b <+107>:   add    $0x1,%cl
 33│    0x000055555dba088e <+110>:   mov    0x10(%rbx),%rsi
 34│    0x000055555dba0892 <+114>:   test   %rsi,%rsi
 35│    0x000055555dba0895 <+117>:   jne    0x55555dba0870 <__llvm_profile_instrument_target+80>
 36│    0x000055555dba0897 <+119>:   movzbl %cl,%ecx
 37│    0x000055555dba089a <+122>:   cmp    %ecx,0x6cffac(%rip)        # 0x55555e27084c <VPMaxNumValsPerSite>
** Dump of assembler code for function __llvm_profile_instrument_target: (55555dba0820 - 55555dba0a37) **                                                                                                                               
(gdb) i r rsi
rsi            0xffffffffffffffff  -1
(gdb)
(gdb) bt
#0  0x000055555dba0873 in __llvm_profile_instrument_target ()
#1  0x000055555d88de3f in llvm::APInt::udiv(llvm::APInt const&) const ()
#2  0x000055555d2e203d in getRangeForAffineARHelper(llvm::APInt, llvm::ConstantRange const&, llvm::APInt const&, unsigned int, bool) ()
#3  0x000055555d2e1585 in llvm::ScalarEvolution::getRangeForAffineAR(llvm::SCEV const*, llvm::SCEV const*, llvm::SCEV const*, unsigned int) ()
#4  0x000055555d2dfc38 in llvm::ScalarEvolution::getRangeRef(llvm::SCEV const*, llvm::ScalarEvolution::RangeSignHint) ()

This revision is now accepted and ready to land. Jul 22 2020, 4:19 PM
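
To relate the disassembly above to source, here is a simplified C model of the per-site list walk that the loop at <+80> through <+117> appears to correspond to. It is a sketch under assumptions, not the runtime's code: the node layout (value at offset 0, count at offset 8, next pointer at offset 0x10) is inferred from the loads in the dump, and `Node`, `find_value` and `check_list_walk` are hypothetical names. In this model a node pointer of -1, whether it is the list head or a `next` field, passes the null check and then faults on the first field load, which is consistent with the annotated instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed node layout, inferred from the offsets used in the dump. */
typedef struct Node {
  uint64_t value;    /* offset 0x00: profiled value */
  uint64_t count;    /* offset 0x08: hit count for that value */
  struct Node *next; /* offset 0x10: next node at this value site */
} Node;

/* Walks one site's list. Returns the node holding `target`, or NULL on a
 * miss. Only on a miss, *min_out receives the lowest-count node (NULL for an
 * empty list) and *len_out the number of nodes visited. */
static Node *find_value(Node *head, uint64_t target, Node **min_out,
                        uint8_t *len_out) {
  uint64_t min_count = UINT64_MAX;
  Node *min_node = NULL;
  uint8_t len = 0;
  for (Node *cur = head; cur != NULL; cur = cur->next) {
    if (cur->value == target)
      return cur;                 /* hit: caller would bump cur->count */
    if (cur->count < min_count) { /* remember the least-used node */
      min_count = cur->count;
      min_node = cur;
    }
    ++len;
  }
  *min_out = min_node;
  *len_out = len;
  return NULL;
}

static void check_list_walk(void) {
  Node c = {513, 9, NULL};
  Node b = {65, 2, &c};
  Node a = {17, 5, &b};
  Node *min = NULL;
  uint8_t len = 0;
  assert(find_value(&a, 65, &min, &len) == &b);
  assert(find_value(&a, 4, &min, &len) == NULL);
  assert(min == &b && len == 3);
}
```

The walk trusts every non-null pointer it is handed, so it has no way to detect a corrupted head or `next` field before dereferencing it.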

This has been reverted for now. There was a related issue.

This revision was landed with ongoing or failed builds. Aug 3 2020, 11:04 AM
This revision was automatically updated to reflect the committed changes.
In D81682#2168134, @yamauchi wrote:

This has been reverted for now. There was a related issue.

Our build bots still show lots of this warning, is that expected?

rnk removed a reviewer: rnk. Oct 13 2020, 12:30 PM

It may actually be that -vp-counters-per-site needs adjusting. How do you reproduce?

This is with clang trying to PGO itself. Build with -DLLVM_BUILD_INSTRUMENTED=IR -DLLVM_BUILD_RUNTIME=No and use the result to build the LLVM tree again.

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=318445929&repo=ash&lineNumber=17322

It seems that I can't reproduce it. Here's my script. Can you reproduce with it or modify it so it reproduces?

# cd into the root directory "llvm-project"
LLVMROOT=`pwd`

rm -rf $LLVMROOT/clang-bootstrap
mkdir $LLVMROOT/clang-bootstrap
STAGE1=$LLVMROOT/clang-bootstrap/stage1
STAGE2_PROF_GEN=$LLVMROOT/clang-bootstrap/stage2-prof-gen
STAGE2_TRAIN=$LLVMROOT/clang-bootstrap/stage2-train

CMAKE="cmake -G Ninja $LLVMROOT/llvm -DLLVM_ENABLE_PROJECTS=clang;compiler-rt -DLLVM_TARGETS_TO_BUILD=X86 -DCMAKE_BUILD_TYPE=Release -DLLVM_PARALLEL_LINK_JOBS=6"

mkdir $STAGE1
cd $STAGE1
$CMAKE -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ \
  -DCMAKE_INSTALL_PREFIX=$STAGE1/install
ninja install -v

mkdir -p $STAGE2_PROF_GEN
cd $STAGE2_PROF_GEN
$CMAKE -DCMAKE_C_COMPILER=$STAGE1/install/bin/clang \
  -DCMAKE_CXX_COMPILER=$STAGE1/install/bin/clang++ \
  -DLLVM_BUILD_INSTRUMENTED=IR \
  -DLLVM_BUILD_RUNTIME=No \
  -DCMAKE_INSTALL_PREFIX=$STAGE2_PROF_GEN/install
ninja install -v

mkdir $STAGE2_TRAIN
cd $STAGE2_TRAIN
$CMAKE -DCMAKE_C_COMPILER=$STAGE2_PROF_GEN/install/bin/clang \
  -DCMAKE_CXX_COMPILER=$STAGE2_PROF_GEN/install/bin/clang++ \
  -DCMAKE_INSTALL_PREFIX=$STAGE2_TRAIN/install
ninja -v

Just a quick status update, I don't reproduce the issue with your script. I'm attempting to adjust the script to match our CI, but I'm distracted by other work.

In D81682#2331245, @yamauchi wrote:

It seems that I can't reproduce it. Here's my script. Can you reproduce with it or modify it so it reproduces?

I'm sorry this took so long, I finally narrowed down the minimal required change. The second stage should have -DLLVM_LINK_LLVM_DYLIB=ON.