Page MenuHomePhabricator

Add a counter-function insertion pass
ClosedPublic

Authored by hfinkel on Jul 26 2016, 10:58 AM.

Details

Summary

As discussed in D22666, our current mechanism to support -pg profiling, where we insert calls to mcount(), or some similar function, is fundamentally broken. We insert these calls in the frontend, which means they get duplicated when inlining, and so the accumulated execution counts for the inlined-into functions are wrong.

Because we don't want the presence of these functions to affect optimizaton, they should be inserted in the backend. Here's a pass which would do just that. The knowledge of the name of the counting function lives in the frontend, so I'm passing it here as a function attribute. This should be easy to add in the frontend, work correctly with LTO, etc.

Diff Detail

Event Timeline

hfinkel updated this revision to Diff 65556.Jul 26 2016, 10:58 AM
hfinkel retitled this revision from to Add a counter-function insertion pass.
hfinkel updated this object.
hfinkel added reviewers: rjmccall, honggyu.kim, majnemer.
hfinkel added a subscriber: llvm-commits.
hfinkel updated this revision to Diff 65566.Jul 26 2016, 11:31 AM

@majnemer pointed out on IRC that this pass does not preserve all analyses. Here's a more-refined version (it currently does not matter what we do here, since CGP will run afterward, and that does not preserve anything; since we use AA in the backend, I've just included GlobalsAA preservation for good measure).

rjmccall edited edge metadata.Jul 26 2016, 12:54 PM

This seems like the right approach to me; I won't review the pass parts, though. Should we do this + the change to actually start setting the flags in one patch?

lib/CodeGen/TargetPassConfig.cpp
474

This should only be added if the flag to add the attributes is enabled, right?

hfinkel added inline comments.Jul 26 2016, 1:38 PM
lib/CodeGen/TargetPassConfig.cpp
474

No, it always adds the pass. The pass just dosen't do anything if the function does not have the attribute. We don't want to predicate it on some CodeGen state because then it won't work with LTO.

This seems like the right approach to me; I won't review the pass parts, though. Should we do this + the change to actually start setting the flags in one patch?

Do you prefer that? I generally prefer when people split the Clang and LLVM patches into separate reviews (especially when they're functionally separable). On other other hand, if we move everything to one git repository, single patches will be easier...

Ah, no, doing it separately is fine, then.

honggyu.kim edited edge metadata.Jul 26 2016, 7:00 PM

This seems like the right approach to me; I won't review the pass parts, though. Should we do this + the change to actually start setting the flags in one patch?

Do you prefer that? I generally prefer when people split the Clang and LLVM patches into separate reviews (especially when they're functionally separable). On other other hand, if we move everything to one git repository, single patches will be easier...

Ah, no, doing it separately is fine, then.

Hi, I have tested it and it works great for the backend. Thanks for the patch!

And I think we should consider -finstrument-functions option as well. It inserts two functions both at the entry and exit. We maybe better to change the name 'CountingFunction' to something else to cover both cases. 'CountingEntryFunction' and 'CountingExitFunction' seems to be okay as well.

And I think we should consider -finstrument-functions option as well. It inserts two functions both at the entry and exit. We maybe better to change the name 'CountingFunction' to something else to cover both cases. 'CountingEntryFunction' and 'CountingExitFunction' seems to be okay as well.

No, those are different. They're supposed to be duplicated upon inlining. The GCC manual says:

"This instrumentation is also done for functions expanded inline in other functions. The profiling calls will indicate where, conceptually, the inline function is entered and exited. This means that addressable versions of such functions must be available. If all your uses of a function are expanded inline, this may mean an additional expansion of code size. If you use `extern inline' in your C code, an addressable version of such functions must be provided."

So these can be be inserted in the frontend as they are now.

And I think we should consider -finstrument-functions option as well. It inserts two functions both at the entry and exit. We maybe better to change the name 'CountingFunction' to something else to cover both cases. 'CountingEntryFunction' and 'CountingExitFunction' seems to be okay as well.

No, those are different. They're supposed to be duplicated upon inlining. The GCC manual says:

"This instrumentation is also done for functions expanded inline in other functions. The profiling calls will indicate where, conceptually, the inline function is entered and exited. This means that addressable versions of such functions must be available. If all your uses of a function are expanded inline, this may mean an additional expansion of code size. If you use `extern inline' in your C code, an addressable version of such functions must be provided."

So these can be be inserted in the frontend as they are now.

I see. Thanks for the correction.

honggyu.kim added inline comments.Jul 27 2016, 6:57 PM
test/Transforms/CountingFunctionInserter/mcount.ll
26

Is there a reason that we have to explicitly provide name of "counting-function" attribute? Would it be better to get the name from TargetInfo? I need to find a way to get TargetInfo in FunctionPass first thought. I think we better not to increase target-dependency in IR level even if it's just attributes because mcount name is kind of target-dependent.

hfinkel added inline comments.Jul 27 2016, 7:41 PM
test/Transforms/CountingFunctionInserter/mcount.ll
26

Is there a reason that we have to explicitly provide name of "counting-function" attribute? Would it be better to get the name from TargetInfo?

Because the functionality might be useful for cases outside of -pg (and might even be useful for a tracing JIT), and there's no need to make a trivially-generalizable functionality artificially specific, I think it makes sense to make the function name something the IR-generator can specify.

I think we better not to increase target-dependency in IR level even if it's just attributes because mcount name is kind of target-dependent.

I'm fine with having a target-specific default. We'd move the current logic for setting the mcount name into initialize in lib/Analysis/TargetLibraryInfo.cpp - see, for example, the calls to TLI.setAvailableWithName in that function.

honggyu.kim accepted this revision.EditedAug 2 2016, 5:55 PM
honggyu.kim edited edge metadata.

It works fine with D22666 now. Thanks for your help :)

This revision is now accepted and ready to land.Aug 2 2016, 5:55 PM

Hi Hal,

I've just got commit access from Chris. Can you please commit this patch? I will also commit D22666 and D23792 as soon as you commit this.

Thanks,
Honggyu

This revision was automatically updated to reflect the committed changes.

I've also committed related patches. I really appreciate your help, Hal!

Honggyu