This is an archive of the discontinued LLVM Phabricator instance.

add support for -fno-instrument-functions and -finstrument-functions-exclude-{file,function}-list=<arg1,arg2,...> to match gcc options.
Needs ReviewPublic

Authored by choikwa on Sep 8 2017, 7:21 AM.

Details

Summary

Builds on previous Differential D2219

Changes include:

  • Using unordered_map with SourceLocation.ID (raw encoding) as key
  • Demangle only if !isExternC. Used dyn_cast<FunctionDecl>((Decl*)CurFuncDecl) for this
  • Modified an existing C testcase to test for options.

Event Timeline

choikwa created this revision.Sep 8 2017, 7:21 AM
aaron.ballman added inline comments.
lib/CodeGen/CodeGenFunction.cpp
425

This parameter can be const.

436

Can use const auto * here.

444

This static has me worried. Does this data need to be cached across codegen modules? If not, perhaps this variable can be hung onto the CodeGenModule instead?

The variable should be named Cache instead of cache. Also, is an unordered_map the correct data structure to use? Would a DenseMap make more sense because the keys and values are both small (PresumedLoc::getFilename() returns a const char * that I believe can be used).

447

key -> Key

449

ctx -> CTX, and I think this can be a const ref.

460–461

You can use a range-based for loop here instead, and then get rid of the typedef for CIt.

467

You can avoid the copy here by assigning to a StringRef instead.

472

status -> Status

473

result -> Result

486

Can use a range-based for loop here.

choikwa updated this revision to Diff 114380.Sep 8 2017, 9:40 AM

addressed code review. made doc consistent with functionality.

Forgot to hang Cache to CodeGenModule, will do that shortly

choikwa updated this revision to Diff 114388.Sep 8 2017, 10:24 AM

renamed and moved Cache to SourceLocToFileNameMap in CodeGenModule

Please also add a C++ test to check the mangling-related features.

lib/CodeGen/CodeGenFunction.cpp
419

Using the system demangler here seems like the wrong solution. You have the current function declaration, CurFuncDecl, and from it you should be able to extract the full name.

I think that this will do it:

std::string FullFuncName = PredefinedExpr::ComputeName(
        PredefinedExpr::PrettyFunctionNoVirtual, CurFuncDecl);
lib/CodeGen/CodeGenModule.cpp
4601 ↗(On Diff #114388)

Make this function inline in the class definition.

choikwa updated this revision to Diff 117264.Sep 30 2017, 2:34 PM

Addressing Hal's feedback

majnemer added inline comments.Sep 30 2017, 5:44 PM
lib/CodeGen/CodeGenFunction.cpp
470

Is this comment still correct?

lib/CodeGen/CodeGenModule.h
503 ↗(On Diff #117264)

Pointers lean right.

1212 ↗(On Diff #117264)

inline is redundant here.

choikwa updated this revision to Diff 117267.Sep 30 2017, 6:02 PM
  • - Address formating feedback, remove redundant inline
hfinkel added inline comments.Sep 30 2017, 6:49 PM
test/CodeGenCXX/instrument-functions.cpp
8 ↗(On Diff #117267)

I'm a bit confused about this test. Are you trying to match against the mangled names, or the demangled names, or both?

choikwa updated this revision to Diff 117268.Sep 30 2017, 7:07 PM
  • add comment to CPP test to explain usage
  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

choikwa updated this revision to Diff 117269.Sep 30 2017, 7:55 PM
  • add more CPP tests: func overload, template special
majnemer added inline comments.Sep 30 2017, 7:58 PM
lib/CodeGen/CodeGenFunction.cpp
462

Space after if.

480

Ditto.

choikwa added a comment.EditedSep 30 2017, 8:34 PM
  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

test/CodeGenCXX/instrument-functions.cpp
8 ↗(On Diff #117267)

It's matching demangled names, so Z5test3 would not match and insert instrumentation calls to test3(int). Since this wasn't clear, I will add comment in the test.

  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Interesting. Can you tell what GCC is doing w.r.t. namespace names, class names, etc. and template parameters?

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

choikwa added a comment.EditedOct 1 2017, 12:13 AM
  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Interesting. Can you tell what GCC is doing w.r.t. namespace names, class names, etc. and template parameters?

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

Given A::B::C<D>(T a), only 'C<D>' is meaningful in g++'s matcher. Adding anything else escapes the match [ 'B::C', 'C<D>(' ].
It seems like g++ will also not match single quotes as a whole, ie. int fooboo() is matched by 'foo,boo'.

  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Interesting. Can you tell what GCC is doing w.r.t. namespace names, class names, etc. and template parameters?

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

Given A::B::C<D>(T a), only 'C<D>' is meaningful in g++'s matcher. Adding anything else escapes the match [ 'B::C', 'C<D>(' ].
It seems like g++ will also not match single quotes as a whole, ie. int fooboo() is matched by 'foo,boo'.

Can you get more information on what GCC actually implemented and why? It's not clear to me that ignoring the namespaces is the most-useful way to do this. I don't want to emulate GCC bugs, but maybe there's a good reason why their implementation works this way.

Can you get more information on what GCC actually implemented and why? It's not clear to me that ignoring the namespaces is the most-useful way to do this. I don't want to emulate GCC bugs, but maybe there's a good reason why their implementation works this way.

This is what I found, https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00473.html
Diff shows it doesn't seem to specially treat single quotes.

+ case OPT_finstrument_functions_include_function_list_:
+ add_comma_separated_to_vector
+ (&opts->x_flag_instrument_functions_include_functions, arg);
+ break;

Can you get more information on what GCC actually implemented and why? It's not clear to me that ignoring the namespaces is the most-useful way to do this. I don't want to emulate GCC bugs, but maybe there's a good reason why their implementation works this way.

This is what I found, https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00473.html

To state the obvious, that patch has zero C++ test cases. Can try emailing the author of the feature in GCC and try to get some clarity on what the intended behavior is on their end? I'm trying to figure out what's useful here. I think we do want to match namespaces.

Diff shows it doesn't seem to specially treat single quotes.

+ case OPT_finstrument_functions_include_function_list_:
+ add_comma_separated_to_vector
+ (&opts->x_flag_instrument_functions_include_functions, arg);
+ break;

choikwa updated this revision to Diff 120252.Oct 25 2017, 7:43 AM

rebase to trunk

apazos added a subscriber: apazos.Aug 23 2018, 4:26 PM

Hello folks, is there a plan to merge this feature still?

Hello folks, is there a plan to merge this feature still?

There's still a desire for the underlying functionality. We still need, as I recall, to figure out what's supposed to happen for C++ functions.

Hi Folks, don't know if this review is still active but I can provide some context on the GCC behavior with respect to namespaces, template params and function arguments, they were bugs, fixed in 2019.

Associated bug reports for namespace/class, and template parameter and argument mishandling:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90816
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90809

Is there a plan to revive this? this is quite necessary for one of my use cases.
can we at least have '-fno-instrument-functions'?

Herald added a project: Restricted Project. · View Herald TranscriptMar 11 2022, 4:04 PM
rnk edited reviewers, added: aaron.ballman; removed: rnk.Mar 14 2022, 12:32 PM

+1, someone just asked me if this feature exists and then found this patch. If this could be revived, I think it would help a lot of folks out. :)

+1, someone just asked me if this feature exists and then found this patch. If this could be revived, I think it would help a lot of folks out. :)

I took a quick pass over it and the changes look reasonable at a glance. If someone rebases this to trunk, I can look into it more thoroughly and rope in the other code owners.