Page MenuHomePhabricator

add support for -fno-instrument-functions and -finstrument-functions-exclude-{file,function}-list=<arg1,arg2,...> to match gcc options.
Needs ReviewPublic

Authored by choikwa on Sep 8 2017, 7:21 AM.

Details

Reviewers
rnk
majnemer
Summary

Builds on previous Differential D2219

Changes include:

  • Using unordered_map with SourceLocation.ID (raw encoding) as key
  • Demangle only if !isExternC. Used dyn_cast<FunctionDecl>((Decl*)CurFuncDecl) for this
  • Modified an existing C testcase to test for options.

Event Timeline

choikwa created this revision.Sep 8 2017, 7:21 AM
aaron.ballman added inline comments.
lib/CodeGen/CodeGenFunction.cpp
427

This parameter can be const.

437

Can use const auto * here.

445

This static has me worried. Does this data need to be cached across codegen modules? If not, perhaps this variable can be hung onto the CodeGenModule instead?

The variable should be named Cache instead of cache. Also, is an unordered_map the correct data structure to use? Would a DenseMap make more sense because the keys and values are both small (PresumedLoc::getFilename() returns a const char * that I believe can be used).

448

key -> Key

450

ctx -> CTX, and I think this can be a const ref.

461–462

You can use a range-based for loop here instead, and then get rid of the typedef for CIt.

468

You can avoid the copy here by assigning to a StringRef instead.

473

status -> Status

474

result -> Result

487

Can use a range-based for loop here.

choikwa updated this revision to Diff 114380.Sep 8 2017, 9:40 AM

addressed code review. made doc consistent with functionality.

Forgot to hang Cache to CodeGenModule, will do that shortly

choikwa updated this revision to Diff 114388.Sep 8 2017, 10:24 AM

renamed and moved Cache to SourceLocToFileNameMap in CodeGenModule

Please also add a C++ test to check the mangling-related features.

lib/CodeGen/CodeGenFunction.cpp
427

Using the system demangler here seems like the wrong solution. You have the current function declaration, CurFuncDecl, and from it you should be able to extract the full name.

I think that this will do it:

std::string FullFuncName = PredefinedExpr::ComputeName(
        PredefinedExpr::PrettyFunctionNoVirtual, CurFuncDecl);
lib/CodeGen/CodeGenModule.cpp
4601 ↗(On Diff #114388)

Make this function inline in the class definition.

choikwa updated this revision to Diff 117264.Sep 30 2017, 2:34 PM

Addressing Hal's feedback

majnemer added inline comments.Sep 30 2017, 5:44 PM
lib/CodeGen/CodeGenFunction.cpp
471

Is this comment still correct?

lib/CodeGen/CodeGenModule.h
503

Pointers lean right.

1230

inline is redundant here.

choikwa updated this revision to Diff 117267.Sep 30 2017, 6:02 PM
  • - Address formating feedback, remove redundant inline
hfinkel added inline comments.Sep 30 2017, 6:49 PM
test/CodeGenCXX/instrument-functions.cpp
8

I'm a bit confused about this test. Are you trying to match against the mangled names, or the demangled names, or both?

choikwa updated this revision to Diff 117268.Sep 30 2017, 7:07 PM
  • add comment to CPP test to explain usage
  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

choikwa updated this revision to Diff 117269.Sep 30 2017, 7:55 PM
  • add more CPP tests: func overload, template special
majnemer added inline comments.Sep 30 2017, 7:58 PM
lib/CodeGen/CodeGenFunction.cpp
463

Space after if.

481

Ditto.

choikwa added a comment.EditedSep 30 2017, 8:34 PM
  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

test/CodeGenCXX/instrument-functions.cpp
8

It's matching demangled names, so Z5test3 would not match and insert instrumentation calls to test3(int). Since this wasn't clear, I will add comment in the test.

  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Interesting. Can you tell what GCC is doing w.r.t. namespace names, class names, etc. and template parameters?

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

choikwa added a comment.EditedOct 1 2017, 12:13 AM
  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Interesting. Can you tell what GCC is doing w.r.t. namespace names, class names, etc. and template parameters?

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

Given A::B::C<D>(T a), only 'C<D>' is meaningful in g++'s matcher. Adding anything else escapes the match [ 'B::C', 'C<D>(' ].
It seems like g++ will also not match single quotes as a whole, ie. int fooboo() is matched by 'foo,boo'.

  • add comment to CPP test to explain usage

Thanks. Please also add some tests showing matching overloaded functions, functions with template parameters, etc.

Do we need to strip whitespace before trying to match the demangled names?

Some cursory testing with g++ shows that only the 'test5' of 'test5(float, int, int*)' is matched. 'test5(' or 'test5 (' is not matched. It seems weird that arguments are not matched.

g++ man page shows

"The function name to be matched is its user-visible name, such as "vector<int> blah(const vector<int> &)", not the internal mangled name"

but it doesn't seem to be including parameters.

Interesting. Can you tell what GCC is doing w.r.t. namespace names, class names, etc. and template parameters?

Also uncovered a bug where sub argument list containing comma needs to be surrounded by single quote, but clang seems to ignores single quote.
I'll try to dig around ArgList implementation to see if it can return argument surrounded by single-quote as a whole.

Given A::B::C<D>(T a), only 'C<D>' is meaningful in g++'s matcher. Adding anything else escapes the match [ 'B::C', 'C<D>(' ].
It seems like g++ will also not match single quotes as a whole, ie. int fooboo() is matched by 'foo,boo'.

Can you get more information on what GCC actually implemented and why? It's not clear to me that ignoring the namespaces is the most-useful way to do this. I don't want to emulate GCC bugs, but maybe there's a good reason why their implementation works this way.

Can you get more information on what GCC actually implemented and why? It's not clear to me that ignoring the namespaces is the most-useful way to do this. I don't want to emulate GCC bugs, but maybe there's a good reason why their implementation works this way.

This is what I found, https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00473.html
Diff shows it doesn't seem to specially treat single quotes.

+ case OPT_finstrument_functions_include_function_list_:
+ add_comma_separated_to_vector
+ (&opts->x_flag_instrument_functions_include_functions, arg);
+ break;

Can you get more information on what GCC actually implemented and why? It's not clear to me that ignoring the namespaces is the most-useful way to do this. I don't want to emulate GCC bugs, but maybe there's a good reason why their implementation works this way.

This is what I found, https://gcc.gnu.org/ml/gcc-patches/2015-12/msg00473.html

To state the obvious, that patch has zero C++ test cases. Can try emailing the author of the feature in GCC and try to get some clarity on what the intended behavior is on their end? I'm trying to figure out what's useful here. I think we do want to match namespaces.

Diff shows it doesn't seem to specially treat single quotes.

+ case OPT_finstrument_functions_include_function_list_:
+ add_comma_separated_to_vector
+ (&opts->x_flag_instrument_functions_include_functions, arg);
+ break;

choikwa updated this revision to Diff 120252.Oct 25 2017, 7:43 AM

rebase to trunk

apazos added a subscriber: apazos.Aug 23 2018, 4:26 PM

Hello folks, is there a plan to merge this feature still?

Hello folks, is there a plan to merge this feature still?

There's still a desire for the underlying functionality. We still need, as I recall, to figure out what's supposed to happen for C++ functions.