Details
Diff Detail
Event Timeline
Yes, this does need a test, but I'm also not sure why this belongs in the hot/cold splitting pass.
It's possible to mark functions with the cold attribute outside of the splitting pass. Shouldn't all of these cold functions should have a .unlikely section prefix? If so, setting the section prefix should be done later (say, in PreISelLowering).
It's normally done in CodeGenPrepare.cpp, search there for the calls to setSectionPrefix. It's based on the profile counts, and since the entry count is being set to 0 it should behave as expected, unless this is being done without PGO.
Because my goal is to improve outlining without PGO (with static analysis), I'm wondering what would be the best place to put this other than HotColdSplitting.
Perhaps the code in CodeGenPrepare.cpp should fall back to looking for the Cold attribute if there is no profile (since this attribute can be used independent of splitting).
It is happening without my changes with -codegenprepare. So just keeping the testcase.
btw, I had this patch in mind:
diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp index 5edc696c29a..60726320cf1 100644 --- a/llvm/lib/CodeGen/CodeGenPrepare.cpp +++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp @@ -434,11 +434,11 @@ bool CodeGenPrepare::runOnFunction(Function &F) { ProfileSummaryInfo *PSI = &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI(); + if (!PSI->hasProfileSummary()) { + auto EC = F.getEntryCount(); + if (EC.hasValue() && (F.getEntryCount().getCount() == 0)) + F.setSectionPrefix(".unlikely"); + } + }
If we have an entry count then presumably we have profile data, so this shouldn't be needed. I was thinking the Cold attribute, that gets added in the splitting cold regardless of whether there is profile data.
@hiraditya The testcase is passing. This should be ready to be merged.
FYI, opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S < coldentrycount.ll produces the following output:
; ModuleID = '<stdin>' source_filename = "<stdin>" target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-apple-macosx10.14.0" define void @fun() !prof !14 !section_prefix !15 { entry: br i1 undef, label %if.then, label %codeRepl if.then: ; preds = %entry ret void codeRepl: ; preds = %entry call void @fun.cold.1() #2 br label %if.else.ret if.else.ret: ; preds = %codeRepl ret void } ; Function Attrs: cold declare void @sink() #0 ; Function Attrs: cold minsize define internal void @fun.cold.1() #1 !prof !16 !section_prefix !17 { newFuncRoot: br label %if.else if.else.ret.exitStub: ; preds = %if.else ret void if.else: ; preds = %newFuncRoot call void @sink() br label %if.else.ret.exitStub } attributes #0 = { cold } attributes #1 = { cold minsize } attributes #2 = { noinline } !llvm.module.flags = !{!0} !0 = !{i32 1, !"ProfileSummary", !1} !1 = !{!2, !3, !4, !5, !6, !7, !8, !9} !2 = !{!"ProfileFormat", !"InstrProf"} !3 = !{!"TotalCount", i64 10000} !4 = !{!"MaxCount", i64 10} !5 = !{!"MaxInternalCount", i64 1} !6 = !{!"MaxFunctionCount", i64 1000} !7 = !{!"NumCounts", i64 3} !8 = !{!"NumFunctions", i64 3} !9 = !{!"DetailedSummary", !10} !10 = !{!11, !12, !13} !11 = !{i32 10000, i64 100, i32 1} !12 = !{i32 999000, i64 100, i32 1} !13 = !{i32 999999, i64 1, i32 2} !14 = !{!"function_entry_count", i64 100} !15 = !{!"function_section_prefix", !".hot"} !16 = !{!"function_entry_count", i64 0} !17 = !{!"function_section_prefix", !".unlikely"}
@lebedev.ri Just a suggestion, but maybe the title should now be "[HotColdSplit] Add test case for unlikely attribute in outlined function"? Since this patch is now only adding a test case.
This test case has been failing on many of the ARM and AArch64 bots all weekend, e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/19408, would you mind looking into this?
Here's the log of one of the failures (from http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/19408/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Acoldentrycount.ll):
******************** TEST 'LLVM :: Transforms/HotColdSplit/coldentrycount.ll' FAILED ******************** Script: -- : 'RUN: at line 4'; /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S < /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll | /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/FileCheck /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll -- Exit Code: 2 Command Output (stderr): -- LLVM ERROR: Trying to construct TargetPassConfig without a target machine. Scheduling a CodeGen pass without a target triple set? PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S FileCheck error: '<stdin>' is empty. FileCheck command line: /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/FileCheck /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll -- ********************
Our team is seeing this too. I suspect the addition of '-codegenprepare' requires the creation of the x86_64-apple target, which is not necessarily built into every compilation of LLVM.
A sure-fire way to know if this is the case would be to build LLVM with -DLLVM_TARGETS_TO_BUILD=ARM and run this test.
If it's not possible for this test to run in this case, then there should be a REQUIRES clause here, asserting that the correct target is enabled.
Seems like this is caused by target datalayout and target triple. See https://reviews.llvm.org/D85215 for proposed fix.