This is an archive of the discontinued LLVM Phabricator instance.

[Coverage] Store compilation dir separately in coverage mapping
ClosedPublic

Authored by phosek on Jan 30 2021, 5:15 PM.

Details

Summary

We currently always store absolute filenames in coverage mapping. This
is problematic for several reasons. It poses a problem for distributed
compilation as source location might vary across machines. We are also
duplicating the path prefix potentially wasting space.

This change modifies how we store filenames in coverage mapping. Rather
than absolute paths, it stores the compilation directory and file paths
as given to the compiler, either relative or absolute. Later when
reading the coverage mapping information, we recombine relative paths
with the working directory. This approach is similar to handling
of DW_AT_comp_dir in DWARF.

Finally, we also provide a new option, -fprofile-compilation-dir akin
to -fdebug-compilation-dir which can be used to manually override the
compilation directory which is useful in distributed compilation cases.

Diff Detail

Event Timeline

phosek requested review of this revision.Jan 30 2021, 5:15 PM
phosek created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 30 2021, 5:15 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript

@vsk @davidxl I'd be interested in your opinion. I've made this change unconditionally to match the compiler behavior in cases like debug info or __FILE__ expansion, but I'm also happy to conditionalize it on a flag (for example -f[no]-coverage-mapping-abspath) if you'd like to avoid changing the current behavior.

vsk added a comment.Feb 1 2021, 9:38 AM

For context, the reason abspaths are used here is because users don't expect coverage results to go missing if the build directory changes (e.g. given cd A; cc x.c; cd ../B; cc x.c, the expectation is that coverage for two distinct x.c's is reported).

I think it's important that this continues to work by default. Default-enabling the suggested -fcoverage-mapping-abspath option sounds like a workable solution. I haven't kept up with the work on -f(debug-)?compilation-dir, but if that option were specified, it could make sense to make coverage filenames "absolute" (relative to chroot <compilation-dir>).

In D95753#2534061, @vsk wrote:

For context, the reason abspaths are used here is because users don't expect coverage results to go missing if the build directory changes (e.g. given cd A; cc x.c; cd ../B; cc x.c, the expectation is that coverage for two distinct x.c's is reported).

I think it's important that this continues to work by default. Default-enabling the suggested -fcoverage-mapping-abspath option sounds like a workable solution. I haven't kept up with the work on -f(debug-)?compilation-dir, but if that option were specified, it could make sense to make coverage filenames "absolute" (relative to chroot <compilation-dir>).

That's a useful context, thank you!

In the case of debug info, we store the compilation directory (DW_AT_comp_dir) separately from the file (DW_AT_decl_file) so even if we use relative path for the file by default, you can still get the absolute path by concatenating it with the directory which is absolute by default. In the case of coverage, there's no separate directory field, hence using the absolute path by default.

To produce a fully relocatable debug info, you can use -fdebug-compilation-dir ., but then it's the responsibility of the developer/build system to ensure that processing is done from the correct build directory to avoid the issue you mentioned. Given that, I think it makes sense to similarly gate the use of relative paths in coverage mapping on a flag.

vsk added a comment.Feb 1 2021, 2:06 PM

Thanks for pointing DW_AT_comp_dir out. For coverage, gating the use of relative paths on a flag sounds reasonable to me. It should also be possible to extend the .covmapping format to include something like DW_AT_comp_dir: compared to adding a flag, this would take a lot more work.

phosek updated this revision to Diff 322609.Feb 10 2021, 12:46 AM
phosek retitled this revision from [CoverageMapping] Don't absolutize source paths to Store compilation dir separately in coverage mapping.
phosek edited the summary of this revision. (Show Details)
In D95753#2534895, @vsk wrote:

Thanks for pointing DW_AT_comp_dir out. For coverage, gating the use of relative paths on a flag sounds reasonable to me. It should also be possible to extend the .covmapping format to include something like DW_AT_comp_dir: compared to adding a flag, this would take a lot more work.

I've implemented what's effectively an equivalent of DW_AT_comp_dir by storing the current directory as the first entry in coverage mapping filenames as discussed over email, let me know if this makes sense.

vsk added a comment.Feb 10 2021, 9:21 AM

This looks great, that turned out to be a lot cleaner than I expected.

clang/lib/CodeGen/CoverageMappingGen.cpp
1717

There's some nice alignment here with your change that gets rid of DecompressedData. If CoverageFilenamesSectionWriter were to accept an ArrayRef<string>, this strange FilenameRefs container can go away.

clang/lib/Driver/ToolChains/Clang.cpp
5302

Perhaps leave a comment explaining why the compdir flag differs for these GPU targets?

llvm/include/llvm/ProfileData/Coverage/CoverageMappingReader.h
178

This is a nice cleanup that should probably happen before/independently of the format change.

llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
149

This would need to be gated on the CovMapVersion. For context, the format needs to maintain backwards compatibility so that newer IDE's can inspect binaries built by older compilers.

phosek updated this revision to Diff 323106.Feb 11 2021, 11:52 AM
phosek marked 2 inline comments as done.

I have addressed all the comments and fixed all tests. I'm happy to slice this into multiple patches if you prefer landing the refactoring separately.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 11 2021, 11:52 AM
Herald added a subscriber: Restricted Project. · View Herald Transcript
vsk added a comment.Feb 11 2021, 1:47 PM

Two minor missing items:

  • Please include a binary test that verifies llvm-cov can prepare a report for a binary containing a Version6 coverage blob (for compatibility testing).
  • Please also include a short blurb in docs/CoverageMappingFormat.rst explaining the format change.

Everything else looks fine.

phosek updated this revision to Diff 323231.Feb 12 2021, 12:06 AM

Documentation updated and test included.

vsk accepted this revision.Feb 12 2021, 9:38 AM

Thanks, lgtm. The pre-merge checks appear to have flagged some issues, but I don't anticipate any major revision.

This revision is now accepted and ready to land.Feb 12 2021, 9:38 AM
vsk added inline comments.Feb 12 2021, 9:39 AM
llvm/docs/CoverageMappingFormat.rst
277 ↗(On Diff #323231)

I think the above bullet point could go in this section?

(I haven't read through the discussion.)

-fprofile-prefix-map has been added. Does this option has overlap with that option?

MaskRay added inline comments.Feb 15 2021, 3:20 PM
clang/include/clang/Driver/Options.td
1107

Using fcoverage_compilation_dir_EQ for both driver option and CC1 option is preferred.

Can you drop fcoverage_compilation_dir ?

tunz added a subscriber: tunz.Feb 15 2021, 3:23 PM
keith added a subscriber: keith.Feb 16 2021, 9:45 AM
phosek updated this revision to Diff 324224.Feb 17 2021, 1:10 AM
phosek marked an inline comment as done.
phosek retitled this revision from Store compilation dir separately in coverage mapping to [Coverage] Store compilation dir separately in coverage mapping.
phosek added inline comments.Feb 17 2021, 10:40 AM
clang/include/clang/Driver/Options.td
1107

I've been following -fdebug-compilation-dir which supports both variants. Should we drop -fdebug-compilation-dir as well? I'd like to keep these consistent.

MaskRay added inline comments.Feb 17 2021, 12:53 PM
clang/include/clang/Driver/Options.td
1107

We cannot drop the driver option -fdebug-compilation-dir X now for compatibility reasons: D96886.

Newer options should stick with the convention, not adding more Separate< forms.

phosek updated this revision to Diff 324423.Feb 17 2021, 2:14 PM
phosek marked 2 inline comments as done.
phosek updated this revision to Diff 324733.Feb 18 2021, 12:26 PM
phosek edited the summary of this revision. (Show Details)
This revision was landed with ongoing or failed builds.Feb 18 2021, 12:28 PM
This revision was automatically updated to reflect the committed changes.
phosek reopened this revision.Feb 18 2021, 1:28 PM
This revision is now accepted and ready to land.Feb 18 2021, 1:28 PM
phosek updated this revision to Diff 324755.Feb 18 2021, 1:29 PM
phosek updated this revision to Diff 324759.Feb 18 2021, 1:40 PM
This revision was landed with ongoing or failed builds.Feb 18 2021, 2:34 PM
This revision was automatically updated to reflect the committed changes.

Hello, we are seeing failures like below:

Note: Google Test filter = ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from ParameterizedCovMapTest/CoverageMappingTest
[ RUN ] ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0
/scratch/build_jenkins/workspace/Downstream_Changes/llvm_cgt/llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp:889: Failure
Expected: 3U
Which is: 3
To be equal to: NumFuncs
Which is: 5
[ FAILED ] ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0, where GetParam() = (false, false) (0 ms)

Do you have any suggestions on why we would start seeing above failures after your commit? Was passing before.

Thanks
Greg

Hello, we are seeing failures like below:

Note: Google Test filter = ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from ParameterizedCovMapTest/CoverageMappingTest
[ RUN ] ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0
/scratch/build_jenkins/workspace/Downstream_Changes/llvm_cgt/llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp:889: Failure
Expected: 3U
Which is: 3
To be equal to: NumFuncs
Which is: 5
[ FAILED ] ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0, where GetParam() = (false, false) (0 ms)

Do you have any suggestions on why we would start seeing above failures after your commit? Was passing before.

Thanks
Greg

That's really unexpected. What's your build configuration? Is there any way for me to reproduce this failure locally? I'd like to take a closer look.

jsji added a subscriber: jsji.EditedFeb 23 2021, 8:23 AM

Hello, we are seeing failures like below:

Note: Google Test filter = ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0
[==========] Running 1 test from 1 test case.
[----------] Global test environment set-up.
[----------] 1 test from ParameterizedCovMapTest/CoverageMappingTest
[ RUN ] ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0
/scratch/build_jenkins/workspace/Downstream_Changes/llvm_cgt/llvm-project/llvm/unittests/ProfileData/CoverageMappingTest.cpp:889: Failure
Expected: 3U
Which is: 3
To be equal to: NumFuncs
Which is: 5
[ FAILED ] ParameterizedCovMapTest/CoverageMappingTest.skip_duplicate_function_record/0, where GetParam() = (false, false) (0 ms)

Do you have any suggestions on why we would start seeing above failures after your commit? Was passing before.

Thanks
Greg

That's really unexpected. What's your build configuration? Is there any way for me to reproduce this failure locally? I'd like to take a closer look.

Also seeing similar failure on PowerPC RHEL buildbot, which should have reported this earilier, but failed to report due to Texas power outage...

http://lab.llvm.org:8011/#/builders/57/builds/4516/steps/7/logs/FAIL__LLVM-Unit__1

vsk added a comment.Feb 23 2021, 9:38 AM

Just skimming, haven't had a chance to look closely.

llvm/unittests/ProfileData/CoverageMappingTest.cpp
149 ↗(On Diff #324778)

assert(R->second > 0 && "Got index of compilation dir")?

165 ↗(On Diff #324778)

Ditto.

166 ↗(On Diff #324778)

Does this need a +1?

jsji added inline comments.Feb 23 2021, 3:03 PM
llvm/unittests/ProfileData/CoverageMappingTest.cpp
211 ↗(On Diff #324778)

Somehow previous value of Data.Filenames will get corrupted after current call to Reader.read.
So we will actually get garbage filenames at the end, hence all getting different hash, so treated as not duplicated.

jsji added a comment.Feb 23 2021, 9:11 PM

https://reviews.llvm.org/D97353 should fix the unitest failures.