This is an archive of the discontinued LLVM Phabricator instance.

[GCOV] Add options to filter files which must be instrumented.
ClosedPublic

Authored by calixte on Sep 13 2018, 7:10 AM.

Details

Summary

When making code coverage, a lot of files (like the ones coming from /usr/include) are removed when post-processing gcno/gcda so finally they doen't need to be instrumented nor to appear in gcno/gcda.
The goal of the patch is to be able to filter the files we want to instrument, there are several advantages to do that:

  • improve speed (no overhead due to instrumentation on files we don't care)
  • reduce gcno/gcda size
  • it gives the possibility to easily instrument only few files (e.g. ones modified in a patch) without changing the build system
  • need to accept this patch to be enabled in clang: https://reviews.llvm.org/D52034

Diff Detail

Repository
rL LLVM

Event Timeline

calixte created this revision.Sep 13 2018, 7:10 AM
calixte edited the summary of this revision. (Show Details)Sep 13 2018, 7:19 AM
vsk added inline comments.Sep 13 2018, 12:28 PM
lib/Transforms/Instrumentation/GCOVProfiling.cpp
102 ↗(On Diff #165278)

Just reading the signature here, it's not clear what getRegex does. Perhaps a more specific name would help? Maybe 'createRegexesFromFilterString'?

Also, it's an llvm convention to rely on NRVO instead of having a single out-parameter (so, simply returning the std::vector here is fine).

103 ↗(On Diff #165278)

StringRef is cheap to copy, no need for a const reference here.

135 ↗(On Diff #165278)

StringMap<bool> sounds like it's just StringSet?

451 ↗(On Diff #165278)

Nit, maybe 'doesFilenameMatchARegex' would be a clearer name.

476 ↗(On Diff #165278)

This might not be portable. Try llvm::sys::fs::real_path.

479 ↗(On Diff #165278)

Use a more specific name, perhaps 'bool ShouldInstrument'.

487 ↗(On Diff #165278)

Does this do the right thing if both sets of filters are empty?

Stepping back a bit, what is the motivation for having both inclusion & exclusion filters? Why not just have one?

Is it so that you can, say, exclude "/usr/*", but also include "/usr/local/*"? In that case the user needs to know the order in which you apply filters, because they're not commutative.

calixte added inline comments.Sep 13 2018, 1:32 PM
lib/Transforms/Instrumentation/GCOVProfiling.cpp
135 ↗(On Diff #165278)

I need to know if the filename has been already processed to avoid to do everything again.

476 ↗(On Diff #165278)

Thx for that, I looked after it and didn't find it.

487 ↗(On Diff #165278)

If both are empty then we returned at the beginning of the function.

Having only one is quite difficult to use.
The 2 common cases I thought about:

  • exclude /usr/include/ stuff
  • include only foo.cpp, bar.cpp (to make code coverage on modified files in a patch)

Having only one will induce that it will be more difficult to write regex.

When we've both then a file is included when it matches one of the inc regex and when it doesn't match all the exc regex.
So exc /usr/* and inc /usr/local/* means that all files in /usr/ are rejected, if you want only the /usr/local/ just don't put exc stuff.
And exc /usr/local/* and inc /usr/* means that we take all files in /usr except the ones in /usr/local.
But definitely, I must make a clearer doc.

calixte updated this revision to Diff 165373.Sep 13 2018, 1:40 PM

Fix the nits

calixte updated this revision to Diff 165648.Sep 15 2018, 10:49 AM

Fix stupid error when checking the returned value of real_path

marco-c accepted this revision.Sep 20 2018, 4:19 AM
marco-c added inline comments.
include/llvm/Transforms/Instrumentation.h
69 ↗(On Diff #165648)

Nit: maybe you could give a better name to these options, e.g. something like FilterFiles and ExcludeFiles.

lib/Transforms/Instrumentation/GCOVProfiling.cpp
434 ↗(On Diff #165648)

Nit: the name is a little confusing as it makes it look like it only applied to Filter and not to Exclude

Maybe you could rename it simply createRegexFromString?

This revision is now accepted and ready to land.Sep 20 2018, 4:19 AM
vsk added inline comments.Sep 20 2018, 4:33 PM
lib/Transforms/Instrumentation/GCOVProfiling.cpp
474 ↗(On Diff #165648)

If it's safe to use Filename when there's an error getting the real_path, I'm not sure why it's necessary to use the real_path at all. Why not fall back to the default behavior when there's an error?

487 ↗(On Diff #165278)

I see, what I was looking for is the rule you specified for when inclusion + exclusion lists exist. Sounds good.

calixte added inline comments.Sep 25 2018, 9:31 AM
lib/Transforms/Instrumentation/GCOVProfiling.cpp
474 ↗(On Diff #165648)

When compiling Firefox, most of the paths are already absolute paths with two exceptions:

  • /usr/lib/gcc/x86_64-linux-gnu/8/../../../../include/c++/8/bits/***: we need to have real_path here
  • hb-ot-shape-complex-indic-machine.rl: real_path is failing but if the exclude regex is hb.*\.rl we can exclude them
calixte added inline comments.Sep 25 2018, 11:57 PM
include/llvm/Transforms/Instrumentation.h
69 ↗(On Diff #165648)

The two variables here have the same name as the options (ie coverage-filter & coverage-exclude)

calixte updated this revision to Diff 172751.Nov 6 2018, 6:14 AM

Regexs separator is ':' for Unix and ';' for Windows

calixte updated this revision to Diff 172760.Nov 6 2018, 6:48 AM

Fix plural form of regex and just use ';' as separator.

This revision was automatically updated to reflect the committed changes.