Download Raw Diff

Details

Reviewers

hans
rengolin
compnerd
rjmccall
hfinkel

Commits

rG2b0e424b2f71: [Frontend] Fix mcount inlining bug
rC280355: [Frontend] Fix mcount inlining bug
rL280355: [Frontend] Fix mcount inlining bug

Summary

Since some profiling tools, such as gprof, ftrace, and uftrace, use
-pg option to generate a mcount function call at the entry of each
function. Function invocation can be detected by this hook function.

But mcount insertion is done before function inlining phase in clang,
sometime a function that already has a mcount call can be inlined in the
middle of another function.

This patch adds an attribute "counting-function" to each function
rather than emitting the mcount call directly in frontend so that this
attribute can be processed in backend. Then the mcount calls can be
properly inserted in backend after all the other optimizations are
completed.

Link: https://llvm.org/bugs/show_bug.cgi?id=28660
Signed-off-by: Honggyu Kim <hong.gyu.kim@lge.com>

Diff Detail

Repository: rL LLVM

Event Timeline

Hi John and Hans,

I don't know if I found right reviewers for this patch, but could you please have a look at it?
I found this problem while I was trying to profile a binary that is compiled with -pg and -O2 as I reported in bugzilla.
https://llvm.org/bugs/show_bug.cgi?id=28660

I know the better approach is to do function inlining first, then insert mcount calls for each function remained. But this can fix the problem in a simpler way as of now. If anyone can give me some guide or someone can move mcount insertion phase later, that would be also good.

Thanks,
Honggyu

What's the actual desired behavior here? Should the inliner strip out calls to mcount() as it inlines?

In D22666#493026, @hfinkel wrote:

What's the actual desired behavior here? Should the inliner strip out calls to mcount() as it inlines?

I think they basically just want a late pass (as in immediately prior to codegen) to insert the mcount calls.

In D22666#493109, @rjmccall wrote:

In D22666#493026, @hfinkel wrote:

What's the actual desired behavior here? Should the inliner strip out calls to mcount() as it inlines?

I think they basically just want a late pass (as in immediately prior to codegen) to insert the mcount calls.

You're right. I wanted to move mcount insertion phase myself but it seems not that simple as I expected because mcount call is inserted in bitcode anyway, then can be optimized using opt tool to do function inlining later on. In this case, function inliner must strip mcount calls at the entry of each function.

Otherwise, we shouldn't generate mcount calls inside bitcode so that opt can simply do function inlining as it does, then code generator, llc, has to insert mcount just before code generation is done. I think this is not that simple as well.

To sum up, we have two approaches.

inliner must strip mcount calls when it does function inlining.
llc has to insert mcount calls just before code generation.

We have to choose one of these approaches. But I think this patch fixes the problem in a simpler way as of now.

In D22666#493706, @honggyu.kim wrote:

In D22666#493109, @rjmccall wrote:

In D22666#493026, @hfinkel wrote:

What's the actual desired behavior here? Should the inliner strip out calls to mcount() as it inlines?

I think they basically just want a late pass (as in immediately prior to codegen) to insert the mcount calls.

You're right. I wanted to move mcount insertion phase myself but it seems not that simple as I expected because mcount call is inserted in bitcode anyway, then can be optimized using opt tool to do function inlining later on. In this case, function inliner must strip mcount calls at the entry of each function.

Otherwise, we shouldn't generate mcount calls inside bitcode so that opt can simply do function inlining as it does, then code generator, llc, has to insert mcount just before code generation is done. I think this is not that simple as well.

To sum up, we have two approaches.

inliner must strip mcount calls when it does function inlining.

llc has to insert mcount calls just before code generation.

We have to choose one of these approaches. But I think this patch fixes the problem in a simpler way as of now.

But it will also severely hurt performance, and that's not good either.

Adding an early IR-level pass to llc to insert calls to mcount seems straightforward. We could, for example, add some function attribute in the frontend which causes the pass to actually insert the mcount calls. There is some frontend knowledge currently regarding exactly what the mcount function is named (lib/Basic/Targets.cpp sets the MCountName variable) that we'd either need to pass in the attribute or move to the backend.

Another option is to add a new function attribute, drop_when_inlining (modulo bikeshedding), add that to the mcount function in the frontend, and teach the inliner to drop functions with that attribute when inlining.

Note that the presence of the mcount call itself will significantly impact inlining and, really, the entire optimization pipeline. Having this be a late pass seems more in keeping with what I assume is a goal that this instrumentation doesn't drastically change the generated code.

In D22666#493708, @rjmccall wrote:

Note that the presence of the mcount call itself will significantly impact inlining and, really, the entire optimization pipeline. Having this be a late pass seems more in keeping with what I assume is a goal that this instrumentation doesn't drastically change the generated code.

I agree that the best solution is to move mcount insertion phase in the late pass so llc has to insert mcount calls for each function. But I don't have clear idea how much work has to be done for this. I confess that I don't have much experience on current LLVM code base. That's why I wrote this workaround patch to fix the bug first, then hopefully someone can move mcount insertion phase as you mentioned, or someone give me some guide how to do that. Writing a new attribute "drop_when_inlining" seems to be a bit more comfortable for me as Hal suggested.
Thanks for the comments.

honggyu.kim added a reviewer: hfinkel.Jul 26 2016, 6:39 AM

honggyu.kim removed a subscriber: hfinkel.

One more problem is that -finstrument-functions has the same inlining problem. The code in llvm/tools/clang/lib/CodeGen/CodeGenFunction.cpp shows that.

void CodeGenFunction::StartFunction(GlobalDecl GD,
                                    QualType RetTy,
                                    llvm::Function *Fn,
                                    const CGFunctionInfo &FnInfo,
                                    const FunctionArgList &Args,
                                    SourceLocation Loc,
                                    SourceLocation StartLoc) {
    ...
  if (ShouldInstrumentFunction())
    EmitFunctionInstrumentation("__cyg_profile_func_enter");

  if (CGM.getCodeGenOpts().InstrumentForProfiling)
    EmitMCountInstrumentation();
    ...
}

In D22666#495978, @honggyu.kim wrote:

In D22666#493708, @rjmccall wrote:

Note that the presence of the mcount call itself will significantly impact inlining and, really, the entire optimization pipeline. Having this be a late pass seems more in keeping with what I assume is a goal that this instrumentation doesn't drastically change the generated code.

I agree that the best solution is to move mcount insertion phase in the late pass so llc has to insert mcount calls for each function. But I don't have clear idea how much work has to be done for this. I confess that I don't have much experience on current LLVM code base. That's why I wrote this workaround patch to fix the bug first, then hopefully someone can move mcount insertion phase as you mentioned, or someone give me some guide how to do that. Writing a new attribute "drop_when_inlining" seems to be a bit more comfortable for me as Hal suggested.
Thanks for the comments.

I agree with John; the right solution here is to insert the functions in the backend. As I understand it, the backend part is pretty simple. I'll write that part shortly so we can experiment.

hfinkel mentioned this in D22825: Add a counter-function insertion pass.Jul 26 2016, 10:58 AM

In D22666#496218, @hfinkel wrote:

In D22666#495978, @honggyu.kim wrote:

In D22666#493708, @rjmccall wrote:

Note that the presence of the mcount call itself will significantly impact inlining and, really, the entire optimization pipeline. Having this be a late pass seems more in keeping with what I assume is a goal that this instrumentation doesn't drastically change the generated code.

I agree that the best solution is to move mcount insertion phase in the late pass so llc has to insert mcount calls for each function. But I don't have clear idea how much work has to be done for this. I confess that I don't have much experience on current LLVM code base. That's why I wrote this workaround patch to fix the bug first, then hopefully someone can move mcount insertion phase as you mentioned, or someone give me some guide how to do that. Writing a new attribute "drop_when_inlining" seems to be a bit more comfortable for me as Hal suggested.
Thanks for the comments.

I agree with John; the right solution here is to insert the functions in the backend. As I understand it, the backend part is pretty simple. I'll write that part shortly so we can experiment.

Here's a patch for the backend part: D22825 - If this makes sense to everyone, we can change the frontend to add the corresponding attribute instead of inserting the function calls directly.

In D22666#496355, @hfinkel wrote:

Here's a patch for the backend part: D22825 - If this makes sense to everyone, we can change the frontend to add the corresponding attribute instead of inserting the function calls directly.

I appreciate your help. I think I can write the frontend part that inserts 'counting-function' attribute when -pg is enabled. Please let me do this work!

In D22666#497085, @honggyu.kim wrote:

In D22666#496355, @hfinkel wrote:

Here's a patch for the backend part: D22825 - If this makes sense to everyone, we can change the frontend to add the corresponding attribute instead of inserting the function calls directly.

I appreciate your help.

No problem.

I think I can write the frontend part that inserts 'counting-function' attribute when -pg is enabled. Please let me do this work!

Great, please do :-)

I've updated the diff that can be working with D22825. I appreciate your work in backend, Hal.
Please correct me if I'm wrong.

honggyu.kim added a parent revision: D22825: Add a counter-function insertion pass.Jul 27 2016, 11:38 PM

honggyu.kim updated this object.

I just wrote how I tested this in the bugzilla report page below:
https://llvm.org/bugs/show_bug.cgi?id=28660

Comments about the comment, but otherwise, LGTM.

lib/CodeGen/CodeGenFunction.cpp
789 ↗	(On Diff #65879)	emitting mcount call -> emitting the mcount call optimization phase -> optimizations
790 ↗	(On Diff #65879)	mark -> add
791 ↗	(On Diff #65879)	Attribute -> The attribute
792 ↗	(On Diff #65879)	remove ", and is processed by CountingFunctionInserter". That's an implementation detail of the backend, which might change, and is irrelevant to the frontend.

This revision is now accepted and ready to land.Jul 28 2016, 12:12 PM

LGTM, too.

honggyu.kim removed a parent revision: D22825: Add a counter-function insertion pass.Jul 28 2016, 6:33 PM

In D22666#499583, @hfinkel wrote:

Comments about the comment, but otherwise, LGTM.

Just updated the comment as you mentioned. Thanks for correcting my English :)

Should we also modify clang/test/CodeGen/mcount.c as well? I'm not actually familiar with test infra.

$ cat llvm/tools/clang/test/CodeGen/mcount.c
// RUN: %clang_cc1 -pg -triple i386-unknown-unknown -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -pg -triple powerpc-unknown-gnu-linux -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple powerpc64-unknown-gnu-linux -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple powerpc64le-unknown-gnu-linux -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple i386-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple x86_64-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple arm-netbsd-eabi -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple aarch64-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple mips-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple powerpc-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple powerpc64-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple powerpc64le-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple sparc-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
// RUN: %clang_cc1 -pg -triple sparc64-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
void foo(void) {
// CHECK: call void @mcount()
// CHECK-PREFIXED: call void @_mcount()
}

In D22666#500326, @honggyu.kim wrote:

Should we also modify clang/test/CodeGen/mcount.c as well? I'm not actually familiar with test infra.

Yes, you'll need to modify it to test for the attribute instead.

In D22666#500327, @rjmccall wrote:

In D22666#500326, @honggyu.kim wrote:

Should we also modify clang/test/CodeGen/mcount.c as well? I'm not actually familiar with test infra.

Yes, you'll need to modify it to test for the attribute instead.

Thanks. As I mentioned, I don't know about the test infra well so can you please give me some guide to check the attribute? I will learn this time and do it myself next time.

In D22666#500328, @honggyu.kim wrote:

In D22666#500327, @rjmccall wrote:

In D22666#500326, @honggyu.kim wrote:

Should we also modify clang/test/CodeGen/mcount.c as well? I'm not actually familiar with test infra.

Yes, you'll need to modify it to test for the attribute instead.

Thanks. As I mentioned, I don't know about the test infra well so can you please give me some guide to check the attribute? I will learn this time and do it myself next time.

In this case, I think that making a simple test (changing the current test to be) like test/CodeGen/stackrealign.c would be fine. If you have any questions, please feel free to ask.

In D22666#500329, @hfinkel wrote:

In this case, I think that making a simple test (changing the current test to be) like test/CodeGen/stackrealign.c would be fine. If you have any questions, please feel free to ask.

Thanks. Please correct me if I'm wrong.

diff --git a/test/CodeGen/mcount.c b/test/CodeGen/mcount.c
index 99dc907..31b5288 100644
--- a/test/CodeGen/mcount.c
+++ b/test/CodeGen/mcount.c
@@ -13,6 +13,7 @@
 // RUN: %clang_cc1 -pg -triple sparc-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
 // RUN: %clang_cc1 -pg -triple sparc64-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
 void foo(void) {
-// CHECK: call void @mcount()
-// CHECK-PREFIXED: call void @_mcount()
 }
+
+// CHECK: attributes #{{[0-9]+}} = { "counting-function"="mcount" {{.*}} }
+// CHECK-PREFIXED: attributes #{{[0-9]+}} = { "counting-function"="_mcount" {{.*}} }

In D22666#500338, @honggyu.kim wrote:
In D22666#500329, @hfinkel wrote:

In this case, I think that making a simple test (changing the current test to be) like test/CodeGen/stackrealign.c would be fine. If you have any questions, please feel free to ask.

Thanks. Please correct me if I'm wrong.
diff --git a/test/CodeGen/mcount.c b/test/CodeGen/mcount.c
index 99dc907..31b5288 100644
--- a/test/CodeGen/mcount.c
+++ b/test/CodeGen/mcount.c
@@ -13,6 +13,7 @@
 // RUN: %clang_cc1 -pg -triple sparc-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
 // RUN: %clang_cc1 -pg -triple sparc64-netbsd -emit-llvm -o - %s | FileCheck -check-prefix=CHECK-PREFIXED %s
 void foo(void) {
-// CHECK: call void @mcount()
-// CHECK-PREFIXED: call void @_mcount()
 }
+
+// CHECK: attributes #{{[0-9]+}} = { "counting-function"="mcount" {{.*}} }

Both of the RUN lines have --check=CHECK-PREFIXED; does anything actually use the above CHECK: line? Maybe you just want to have one CHECK: line, associated with the comment with -pg, and one CHECK-NOT: line, associated with a command line without -pg.

+// CHECK-PREFIXED: attributes #{{[0-9]+}} = { "counting-function"="_mcount" {{.*}} }

I've updated the testcase again it adds 2 more tests for -O2 and without -pg.
Please have a look at it.

$ llvm-lit mcount.c
-- Testing: 1 tests, 1 threads --
PASS: Clang :: CodeGen/mcount.c (1 of 1)
Testing Time: 32.61s
  Expected Passes    : 1

honggyu.kim added a reviewer: rengolin.Aug 4 2016, 10:51 PM

honggyu.kim added subscribers: shenhan, compnerd.

Hi Renato and Saleem,

I found one more testcase that has to be changed. It's in llvm/tools/clang/test/Frontend/gnu-mcount.c.
But strangely, some cases provide mcount name with prefix '\01' such as below:

class ARMTargetInfo : public TargetInfo {
  ...
public:
  ARMTargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts,
                bool IsBigEndian)
      : TargetInfo(Triple), FPMath(FP_Default), IsAAPCS(true), LDREX(0),
        HW_FP(0) {
    ...
    if (Triple.getOS() == llvm::Triple::Linux ||
        Triple.getOS() == llvm::Triple::UnknownOS)
      this->MCountName =
          Opts.EABIVersion == "gnu" ? "\01__gnu_mcount_nc" : "\01mcount";
  ...
  }
  ...
};

class AArch64TargetInfo : public TargetInfo {
  ...
  AArch64TargetInfo(const llvm::Triple &Triple, const TargetOptions &Opts)
      : TargetInfo(Triple), ABI("aapcs") {
    ...
    if (Triple.getOS() == llvm::Triple::Linux ||
        Triple.getOS() == llvm::Triple::UnknownOS)
      this->MCountName = Opts.EABIVersion == "gnu" ? "\01_mcount" : "mcount";
    ...
  }
  ...
};

The modified testcase does not pass because of '\01' prefix. The following example is reduced version of the testcase.

$ cat test-mcount.c
// RUN: %clang -target armv7-unknown-none-eabi -pg -meabi gnu -S -emit-llvm -o - %s | FileCheck %s -check-prefix CHECK -check-prefix CHECK-ARM-EABI-MEABI-GNU

int f() {
  return 0;
}

// CHECK-LABEL: f
// CHECK-ARM-EABI-MEABI-GNU: attributes #{{[0-9]+}} = { {{.*}}"counting-function"="\01__gnu_mcount_nc"{{.*}} }

This is not passed only because of '\01' prefix. It seems that '\01' prefix is analyzed in a strange way when string match is performed by llvm-lit.
If I remove \01 prefix in MCountName in llvm/tools/clang/test/Frontend/gnu-mcount.c file and recompile it, then test it again, the testcase passes.
Can you please check why '\01' prefix is needed?

The \01 is to prevent the mangler from touching the function name. If you noticed the check tags in the quoted test, some targets expect it to be decorated and others do not. What exactly do you mean that lit has a strings matching with \01? Its not a string \01, its the octal character 1.

In D22666#506884, @compnerd wrote:

The \01 is to prevent the mangler from touching the function name. If you noticed the check tags in the quoted test, some targets expect it to be decorated and others do not. What exactly do you mean that lit has a strings matching with \01? Its not a string \01, its the octal character 1.

Thanks for the explanation. I mean the following statement is not properly matched by llvm-lit even if frontend generate mcount name as "\01__gnu_mcount_nc".

// CHECK-ARM-EABI-MEABI-GNU: attributes #{{[0-9]+}} = { {{.*}}"counting-function"="\01__gnu_mcount_nc"{{.*}} }

If I remove '\01' prefix in frontend then generate just "__gnu_mcount_nc", the following statement that also doesn't have '\01' prefix is matched without problem.

// CHECK-ARM-EABI-MEABI-GNU: attributes #{{[0-9]+}} = { {{.*}}"counting-function"="__gnu_mcount_nc"{{.*}} }

Do you have any idea why this problem happens?

I see the difference now.
Below is the originally generated IR without this patch. It shows \01 clearly in call void @"\01__gnu_mcount_nc"().

$ clang -target armv7-unknown-none-eabi -pg -meabi gnu -S -emit-llvm -o - test-mcount.c
; ModuleID = 'mcount.c'
source_filename = "mcount.c"
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7-unknown-none-eabi"

; Function Attrs: nounwind
define i32 @f() #0 {
  call void @"\01__gnu_mcount_nc"() #1
  ret i32 0
}

declare void @"\01__gnu_mcount_nc"()

attributes #0 = { nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a8" "target-features"="+dsp,+neon,+strict-align,+vfp3" "unsafe-fp-math"="false" "use-soft-float"="false" }
attributes #1 = { nounwind }
  ...

Here is the IR result with this patch. But it doesn't show \01 in front of __gnu_mcount_nc in attribute list. It shows only "counting-function"="__gnu_mcount_nc" here but if I dump the output into a file then open it, it shows as "counting-function"="^A__gnu_mcount_nc". Maybe there's some mishandling of \01 prefix whiling passing string that I didn't find yet.

$ clang -target armv7-unknown-none-eabi -pg -meabi gnu -S -emit-llvm -o - test-mcount.c
; ModuleID = 'mcount.c'
source_filename = "mcount.c"
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7-unknown-none-eabi"

; Function Attrs: nounwind
define i32 @f() #0 {
  ret i32 0
}

attributes #0 = { nounwind "counting-function"="__gnu_mcount_nc" "disable-tail-calls"="false" "less-precise-fpmad"="false" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a8" "target-features"="+dsp,+neon,+strict-align,+vfp3" "unsafe-fp-math"="false" "use-soft-float"="false" }
  ...

In D22666#506884, @compnerd wrote:

The \01 is to prevent the mangler from touching the function name. If you noticed the check tags in the quoted test, some targets expect it to be decorated and others do not. What exactly do you mean that lit has a strings matching with \01? Its not a string \01, its the octal character 1.

Hi Saleem,

It seems that name mangling is done before mcount call insertion so I think we don't have to put \01 prefix for mcount names. I have tested on x86-64 and ARM. Are there any other targets that work in a different order?
I've checked this clang::CodeGen::CodeGenModule::getMangledName() is always executed before clang::CodeGen::CodeGenFunction::StartFunction(), which actually inserts mcount calls.
Please correct me if I'm wrong

Theres a frontend mangler and a backend mangler. You should try this on Windows GNU environments :-).

In D22666#509904, @compnerd wrote:

Theres a frontend mangler and a backend mangler. You should try this on Windows GNU environments :-).

Thanks for the confirmation :)
If so, would it be okay if I put one more backslash for each mcount name with '\0' prefix? It shows the output properly with '\0' in IR generation. For example,

@@ -4920,7 +4920,7 @@ public:
     if (Triple.getOS() == llvm::Triple::Linux ||
         Triple.getOS() == llvm::Triple::UnknownOS)
       this->MCountName =
-          Opts.EABIVersion == "gnu" ? "\01__gnu_mcount_nc" : "\01mcount";
+          Opts.EABIVersion == "gnu" ? "\\01__gnu_mcount_nc" : "\\01mcount";
   }

   StringRef getABI() const override { return ABI; }

test-mcount.c test above can be passed with this modification.

No, the inserted character is a literal SOH, not the string "\01".

In D22666#513465, @compnerd wrote:

No, the inserted character is a literal SOH, not the string "\01".

I don't know why the string is passed in a different way but '\01' is shown in IR previously as below

$ clang -target armv7-unknown-none-eabi -pg -meabi gnu -S -emit-llvm -o - test-mcount.c
  ...
define i32 @f() #0 {
  call void @"\01__gnu_mcount_nc"() #1
  ret i32 0
}
  ...

But it doesn't show '\01' in IR with this modifiction.

$ clang -target armv7-unknown-none-eabi -pg -meabi gnu -S -emit-llvm -o - test-mcount.c
  ...
attributes #0 = { nounwind "counting-function"="__gnu_mcount_nc" ... }
  ...

The expected result is as follows. '\01' is clearly shown in IR.

$ clang -target armv7-unknown-none-eabi -pg -meabi gnu -S -emit-llvm -o - test-mcount.c
  ...
attributes #0 = { nounwind "counting-function"="\01__gnu_mcount_nc" ... }
  ...

Since I couldn't find why string passing is handled in a different way with previous version, I just put double backslash in "\\01__gnu_mcount_nc" because it generates the expected IR output. I know that we should have clear reason for this.
Do you have any idea why this problem happens?

I have fixed SOH character problem by D23792. I've verified that mcount.c and gnu-mcount.c tests are passed now.

$ llvm-lit tools/clang/test/CodeGen/mcount.c
-- Testing: 1 tests, 1 threads --
PASS: Clang :: CodeGen/mcount.c (1 of 1)
Testing Time: 0.81s
  Expected Passes    : 1

$ llvm-lit tools/clang/test/Frontend/gnu-mcount.c
-- Testing: 1 tests, 1 threads --
PASS: Clang :: Frontend/gnu-mcount.c (1 of 1)
Testing Time: 10.36s
  Expected Passes    : 1

I think it's almost ready to be commited now. So please have a look at it with D23792.

honggyu.kim mentioned this in D23792: IR: Properly handle escape characters in Attribute::getAsString().Aug 29 2016, 5:07 AM

Closed by commit rL280355: [Frontend] Fix mcount inlining bug (authored by honggyu.kim). · Explain WhySep 1 2016, 4:37 AM

This revision was automatically updated to reflect the committed changes.

Diff 69986

cfe/trunk/lib/CodeGen/CodeGenFunction.cpp

Show First 20 Lines • Show All 422 Lines • ▼ Show 20 Lines	void CodeGenFunction::EmitFunctionInstrumentation(const char *Fn) {
llvm::Value *args[] = {		llvm::Value *args[] = {
llvm::ConstantExpr::getBitCast(CurFn, PointerTy),		llvm::ConstantExpr::getBitCast(CurFn, PointerTy),
CallSite		CallSite
};		};

EmitNounwindRuntimeCall(F, args);		EmitNounwindRuntimeCall(F, args);
}		}

void CodeGenFunction::EmitMCountInstrumentation() {
llvm::FunctionType *FTy = llvm::FunctionType::get(VoidTy, false);

llvm::Constant *MCountFn =
CGM.CreateRuntimeFunction(FTy, getTarget().getMCountName());
EmitNounwindRuntimeCall(MCountFn);
}

// OpenCL v1.2 s5.6.4.6 allows the compiler to store kernel argument		// OpenCL v1.2 s5.6.4.6 allows the compiler to store kernel argument
// information in the program executable. The argument information stored		// information in the program executable. The argument information stored
// includes the argument name, its type, the address and access qualifiers used.		// includes the argument name, its type, the address and access qualifiers used.
static void GenOpenCLArgMetadata(const FunctionDecl FD, llvm::Function Fn,		static void GenOpenCLArgMetadata(const FunctionDecl FD, llvm::Function Fn,
CodeGenModule &CGM, llvm::LLVMContext &Context,		CodeGenModule &CGM, llvm::LLVMContext &Context,
CGBuilderTy &Builder, ASTContext &ASTCtx) {		CGBuilderTy &Builder, ASTContext &ASTCtx) {
// Create MDNodes that represent the kernel arg metadata.		// Create MDNodes that represent the kernel arg metadata.
// Each MDNode is a list in the form of "key", N number of values which is		// Each MDNode is a list in the form of "key", N number of values which is
▲ Show 20 Lines • Show All 342 Lines • ▼ Show 20 Lines	if (CGDebugInfo *DI = getDebugInfo()) {
QualType FnType = getContext().getFunctionType(		QualType FnType = getContext().getFunctionType(
RetTy, ArgTypes, FunctionProtoType::ExtProtoInfo(CC));		RetTy, ArgTypes, FunctionProtoType::ExtProtoInfo(CC));
DI->EmitFunctionStart(GD, Loc, StartLoc, FnType, CurFn, Builder);		DI->EmitFunctionStart(GD, Loc, StartLoc, FnType, CurFn, Builder);
}		}

if (ShouldInstrumentFunction())		if (ShouldInstrumentFunction())
EmitFunctionInstrumentation("__cyg_profile_func_enter");		EmitFunctionInstrumentation("__cyg_profile_func_enter");

		// Since emitting the mcount call here impacts optimizations such as function
		// inlining, we just add an attribute to insert a mcount call in backend.
		// The attribute "counting-function" is set to mcount function name which is
		// architecture dependent.
if (CGM.getCodeGenOpts().InstrumentForProfiling)		if (CGM.getCodeGenOpts().InstrumentForProfiling)
EmitMCountInstrumentation();		Fn->addFnAttr("counting-function", getTarget().getMCountName());

if (RetTy->isVoidType()) {		if (RetTy->isVoidType()) {
// Void type; nothing to return.		// Void type; nothing to return.
ReturnValue = Address::invalid();		ReturnValue = Address::invalid();

// Count the implicit return.		// Count the implicit return.
if (!endsWithReturn(D))		if (!endsWithReturn(D))
++NumReturnExprs;		++NumReturnExprs;
▲ Show 20 Lines • Show All 1,203 Lines • Show Last 20 Lines

cfe/trunk/test/CodeGen/mcount.c

	// RUN: %clang_cc1 -pg -triple i386-unknown-unknown -emit-llvm -o - %s \| FileCheck %s			// RUN: %clang_cc1 -pg -triple i386-unknown-unknown -emit-llvm -o - %s \| FileCheck %s
				// RUN: %clang_cc1 -pg -triple i386-unknown-unknown -emit-llvm -O2 -o - %s \| FileCheck %s
	// RUN: %clang_cc1 -pg -triple powerpc-unknown-gnu-linux -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple powerpc-unknown-gnu-linux -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple powerpc64-unknown-gnu-linux -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple powerpc64-unknown-gnu-linux -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple powerpc64le-unknown-gnu-linux -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple powerpc64le-unknown-gnu-linux -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple i386-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple i386-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple x86_64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple x86_64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple arm-netbsd-eabi -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple arm-netbsd-eabi -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple aarch64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple aarch64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple mips-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple mips-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple powerpc-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple powerpc-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple powerpc64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple powerpc64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple powerpc64le-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple powerpc64le-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple sparc-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple sparc-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	// RUN: %clang_cc1 -pg -triple sparc64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s			// RUN: %clang_cc1 -pg -triple sparc64-netbsd -emit-llvm -o - %s \| FileCheck -check-prefix=CHECK-PREFIXED %s
	void foo(void) {			// RUN: %clang_cc1 -emit-llvm -o - %s \| FileCheck %s -check-prefix=NO-MCOUNT
	// CHECK: call void @mcount()
	// CHECK-PREFIXED: call void @_mcount()			int bar(void) {
				return 0;
	}			}

				int foo(void) {
				return bar();
				}

				int main(void) {
				return foo();
				}

				// CHECK: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
				// CHECK-PREFIXED: attributes #{{[0-9]+}} = { {{.}}"counting-function"="_mcount"{{.}} }
				// NO-MCOUNT-NOT: attributes #{{[0-9]}} = { {{.}}"counting-function"={{.}} }

cfe/trunk/test/Frontend/gnu-mcount.c

	Show All 38 Lines
	// RUN: %clang -target aarch64-unknown-cloudabi-gnueabihf -pg -S -emit-llvm -o - %s \| FileCheck %s -check-prefix CHECK -check-prefix CHECK-ARM64-EABI-CLOUDABI			// RUN: %clang -target aarch64-unknown-cloudabi-gnueabihf -pg -S -emit-llvm -o - %s \| FileCheck %s -check-prefix CHECK -check-prefix CHECK-ARM64-EABI-CLOUDABI
	// RUN: %clang -target aarch64-unknown-cloudabi-gnueabihf -meabi gnu -pg -S -emit-llvm -o - %s \| FileCheck %s -check-prefix CHECK -check-prefix CHECK-ARM64-EABI-CLOUDABI			// RUN: %clang -target aarch64-unknown-cloudabi-gnueabihf -meabi gnu -pg -S -emit-llvm -o - %s \| FileCheck %s -check-prefix CHECK -check-prefix CHECK-ARM64-EABI-CLOUDABI

	int f() {			int f() {
	return 0;			return 0;
	}			}

	// CHECK-LABEL: f			// CHECK-LABEL: f
	// CHECK-ARM-IOS-NOT: call void @_mcount()			// CHECK-ARM-IOS-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="_mcount"{{.}} }
	// CHECK-ARM-IOS-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-IOS-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI: call void @"\01mcount"()			// CHECK-ARM-EABI: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01mcount"{{.}} }
	// CHECK-ARM-EABI-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM64-EABI: call void @mcount()			// CHECK-ARM64-EABI: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM64-EABI-MEABI-GNU: call void @"\01_mcount"()			// CHECK-ARM64-EABI-MEABI-GNU: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01_mcount"{{.}} }
	// CHECK-ARM64-EABI-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM64-EABI-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI-FREEBSD: call void @__mcount()			// CHECK-ARM-EABI-FREEBSD: attributes #{{[0-9]+}} = { {{.}}"counting-function"="__mcount"{{.}} }
	// CHECK-ARM-EABI-FREEBSD-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-FREEBSD-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM64-EABI-FREEBSD: call void @.mcount()			// CHECK-ARM64-EABI-FREEBSD: attributes #{{[0-9]+}} = { {{.}}"counting-function"=".mcount"{{.}} }
	// CHECK-ARM64-EABI-FREEBSD-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM64-EABI-FREEBSD-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI-NETBSD: call void @_mcount()			// CHECK-ARM-EABI-NETBSD: attributes #{{[0-9]+}} = { {{.}}"counting-function"="_mcount"{{.}} }
	// CHECK-ARM-EABI-NETBSD-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-NETBSD-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI-OPENBSD: call void @__mcount()			// CHECK-ARM-EABI-OPENBSD: attributes #{{[0-9]+}} = { {{.}}"counting-function"="__mcount"{{.}} }
	// CHECK-ARM-EABI-OPENBSD-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-OPENBSD-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM64-EABI-OPENBSD: call void @mcount()			// CHECK-ARM64-EABI-OPENBSD: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM64-EABI-OPENBSD-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM64-EABI-OPENBSD-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI-MEABI-GNU-NOT: call void @mcount()			// CHECK-ARM-EABI-MEABI-GNU-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM-EABI-MEABI-GNU: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-MEABI-GNU: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI-BITRIG: call void @__mcount()			// CHECK-ARM-EABI-BITRIG: attributes #{{[0-9]+}} = { {{.}}"counting-function"="__mcount"{{.}} }
	// CHECK-ARM-EABI-BITRIG-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-BITRIG-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM54-EABI-BITRIG: call void @mcount()			// CHECK-ARM54-EABI-BITRIG: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM54-EABI-BITRIG-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM54-EABI-BITRIG-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI-RTEMS: call void @mcount()			// CHECK-ARM-EABI-RTEMS: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM-EABI-RTEMS-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-RTEMS-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM64-EABI-RTEMS: call void @mcount()			// CHECK-ARM64-EABI-RTEMS: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM64-EABI-RTEMS-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM64-EABI-RTEMS-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM-EABI-CLOUDABI: call void @mcount()			// CHECK-ARM-EABI-CLOUDABI: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM-EABI-CLOUDABI-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM-EABI-CLOUDABI-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }
	// CHECK-ARM64-EABI-CLOUDABI: call void @mcount()			// CHECK-ARM64-EABI-CLOUDABI: attributes #{{[0-9]+}} = { {{.}}"counting-function"="mcount"{{.}} }
	// CHECK-ARM64-EABI-CLOUDABI-NOT: call void @"\01__gnu_mcount_nc"()			// CHECK-ARM64-EABI-CLOUDABI-NOT: attributes #{{[0-9]+}} = { {{.}}"counting-function"="\01__gnu_mcount_nc"{{.}} }

This is an archive of the discontinued LLVM Phabricator instance.

Frontend: Fix mcount inlining bug
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 69986

cfe/trunk/lib/CodeGen/CodeGenFunction.cpp

cfe/trunk/test/CodeGen/mcount.c

cfe/trunk/test/Frontend/gnu-mcount.c

This is an archive of the discontinued LLVM Phabricator instance.

Frontend: Fix mcount inlining bugClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 69986

cfe/trunk/lib/CodeGen/CodeGenFunction.cpp

cfe/trunk/test/CodeGen/mcount.c

cfe/trunk/test/Frontend/gnu-mcount.c

Frontend: Fix mcount inlining bug
ClosedPublic