This is an archive of the discontinued LLVM Phabricator instance.

[HotColdSplit] Add unlikely attribute to outlined function
ClosedPublic

Authored by hiraditya on Oct 24 2019, 6:55 AM.

Diff Detail

Event Timeline

hiraditya created this revision.Oct 24 2019, 6:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 24 2019, 6:55 AM
vsk added a comment.Oct 24 2019, 10:45 AM

Yes, this does need a test, but I'm also not sure why this belongs in the hot/cold splitting pass.

It's possible to mark functions with the cold attribute outside of the splitting pass. Shouldn't all of these cold functions should have a .unlikely section prefix? If so, setting the section prefix should be done later (say, in PreISelLowering).

In D69384#1720142, @vsk wrote:

Yes, this does need a test, but I'm also not sure why this belongs in the hot/cold splitting pass.

It's possible to mark functions with the cold attribute outside of the splitting pass. Shouldn't all of these cold functions should have a .unlikely section prefix? If so, setting the section prefix should be done later (say, in PreISelLowering).

It's normally done in CodeGenPrepare.cpp, search there for the calls to setSectionPrefix. It's based on the profile counts, and since the entry count is being set to 0 it should behave as expected, unless this is being done without PGO.

Because my goal is to improve outlining without PGO (with static analysis), I'm wondering what would be the best place to put this other than HotColdSplitting.

Because my goal is to improve outlining without PGO (with static analysis), I'm wondering what would be the best place to put this other than HotColdSplitting.

Perhaps the code in CodeGenPrepare.cpp should fall back to looking for the Cold attribute if there is no profile (since this attribute can be used independent of splitting).

hiraditya updated this revision to Diff 226329.EditedOct 24 2019, 2:23 PM

test case

Because my goal is to improve outlining without PGO (with static analysis), I'm wondering what would be the best place to put this other than HotColdSplitting.

Perhaps the code in CodeGenPrepare.cpp should fall back to looking for the Cold attribute if there is no profile (since this attribute can be used independent of splitting).

+1

hiraditya updated this revision to Diff 226335.Oct 24 2019, 2:49 PM

It is happening without my changes with -codegenprepare. So just keeping the testcase.

hiraditya updated this revision to Diff 226336.EditedOct 24 2019, 2:50 PM

btw, I had this patch in mind:

diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 5edc696c29a..60726320cf1 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -434,11 +434,11 @@ bool CodeGenPrepare::runOnFunction(Function &F) {
   ProfileSummaryInfo *PSI =
       &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
 

+  if (!PSI->hasProfileSummary()) {
+    auto EC = F.getEntryCount();
+     if (EC.hasValue() && (F.getEntryCount().getCount() == 0))
+      F.setSectionPrefix(".unlikely");
+  }
+  }
hiraditya updated this revision to Diff 226339.Oct 24 2019, 2:57 PM

Add check for both hot and cold prefixes.

btw, I had this patch in mind:

diff --git a/llvm/lib/CodeGen/CodeGenPrepare.cpp b/llvm/lib/CodeGen/CodeGenPrepare.cpp
index 5edc696c29a..60726320cf1 100644
--- a/llvm/lib/CodeGen/CodeGenPrepare.cpp
+++ b/llvm/lib/CodeGen/CodeGenPrepare.cpp
@@ -434,11 +434,11 @@ bool CodeGenPrepare::runOnFunction(Function &F) {
   ProfileSummaryInfo *PSI =
       &getAnalysis<ProfileSummaryInfoWrapperPass>().getPSI();
 

+  if (!PSI->hasProfileSummary()) {
+    auto EC = F.getEntryCount();
+     if (EC.hasValue() && (F.getEntryCount().getCount() == 0))
+      F.setSectionPrefix(".unlikely");
+  }
+  }

If we have an entry count then presumably we have profile data, so this shouldn't be needed. I was thinking the Cold attribute, that gets added in the splitting cold regardless of whether there is profile data.

rjf added a subscriber: rjf.Jul 21 2020, 8:53 AM
rjf added a comment.Jul 30 2020, 11:07 PM

@hiraditya The testcase is passing. This should be ready to be merged.


FYI, opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S < coldentrycount.ll produces the following output:

; ModuleID = '<stdin>'
source_filename = "<stdin>"
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-apple-macosx10.14.0"

define void @fun() !prof !14 !section_prefix !15 {
entry:
  br i1 undef, label %if.then, label %codeRepl

if.then:                                          ; preds = %entry
  ret void

codeRepl:                                         ; preds = %entry
  call void @fun.cold.1() #2
  br label %if.else.ret

if.else.ret:                                      ; preds = %codeRepl
  ret void
}

; Function Attrs: cold
declare void @sink() #0

; Function Attrs: cold minsize
define internal void @fun.cold.1() #1 !prof !16 !section_prefix !17 {
newFuncRoot:
  br label %if.else

if.else.ret.exitStub:                             ; preds = %if.else
  ret void

if.else:                                          ; preds = %newFuncRoot
  call void @sink()
  br label %if.else.ret.exitStub
}

attributes #0 = { cold }
attributes #1 = { cold minsize }
attributes #2 = { noinline }

!llvm.module.flags = !{!0}

!0 = !{i32 1, !"ProfileSummary", !1}
!1 = !{!2, !3, !4, !5, !6, !7, !8, !9}
!2 = !{!"ProfileFormat", !"InstrProf"}
!3 = !{!"TotalCount", i64 10000}
!4 = !{!"MaxCount", i64 10}
!5 = !{!"MaxInternalCount", i64 1}
!6 = !{!"MaxFunctionCount", i64 1000}
!7 = !{!"NumCounts", i64 3}
!8 = !{!"NumFunctions", i64 3}
!9 = !{!"DetailedSummary", !10}
!10 = !{!11, !12, !13}
!11 = !{i32 10000, i64 100, i32 1}
!12 = !{i32 999000, i64 100, i32 1}
!13 = !{i32 999999, i64 1, i32 2}
!14 = !{!"function_entry_count", i64 100}
!15 = !{!"function_section_prefix", !".hot"}
!16 = !{!"function_entry_count", i64 0}
!17 = !{!"function_section_prefix", !".unlikely"}
lebedev.ri retitled this revision from Add unlikely attribute to outlined function to [HotColdSplit] Add unlikely attribute to outlined function.Jul 31 2020, 3:57 AM
rjf added a subscriber: lebedev.ri.Jul 31 2020, 9:12 AM

@lebedev.ri Just a suggestion, but maybe the title should now be "[HotColdSplit] Add test case for unlikely attribute in outlined function"? Since this patch is now only adding a test case.

This revision was not accepted when it landed; it landed in state Needs Review.Aug 1 2020, 10:35 PM
This revision was automatically updated to reflect the committed changes.

This test case has been failing on many of the ARM and AArch64 bots all weekend, e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/19408, would you mind looking into this?

Here's the log of one of the failures (from http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/19408/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Acoldentrycount.ll):

******************** TEST 'LLVM :: Transforms/HotColdSplit/coldentrycount.ll' FAILED ********************
Script:
--
: 'RUN: at line 4';   /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S < /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll | /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/FileCheck /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll
--
Exit Code: 2

Command Output (stderr):
--
LLVM ERROR: Trying to construct TargetPassConfig without a target machine. Scheduling a CodeGen pass without a target triple set?
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S 
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/FileCheck /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll

--

********************

This test case has been failing on many of the ARM and AArch64 bots all weekend, e.g. http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/19408, would you mind looking into this?

Here's the log of one of the failures (from http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/19408/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Acoldentrycount.ll):

******************** TEST 'LLVM :: Transforms/HotColdSplit/coldentrycount.ll' FAILED ********************
Script:
--
: 'RUN: at line 4';   /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S < /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll | /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/FileCheck /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll
--
Exit Code: 2

Command Output (stderr):
--
LLVM ERROR: Trying to construct TargetPassConfig without a target machine. Scheduling a CodeGen pass without a target triple set?
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/opt -hotcoldsplit -hotcoldsplit-threshold=0 -codegenprepare -S 
FileCheck error: '<stdin>' is empty.
FileCheck command line:  /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/stage1/bin/FileCheck /home/tcwg-buildslave/worker/clang-cmake-armv7-quick/llvm/llvm/test/Transforms/HotColdSplit/coldentrycount.ll

--

********************

Our team is seeing this too. I suspect the addition of '-codegenprepare' requires the creation of the x86_64-apple target, which is not necessarily built into every compilation of LLVM.
A sure-fire way to know if this is the case would be to build LLVM with -DLLVM_TARGETS_TO_BUILD=ARM and run this test.

If it's not possible for this test to run in this case, then there should be a REQUIRES clause here, asserting that the correct target is enabled.

Looking into this. Seems like removing target triple should fix it.

Fix available in: D85215

rjf added a comment.EditedAug 4 2020, 9:03 AM

Seems like this is caused by target datalayout and target triple. See https://reviews.llvm.org/D85215 for proposed fix.