This is an archive of the discontinued LLVM Phabricator instance.

[Coverage] Add comment to skipped regions
ClosedPublic

Authored by zequanwu on Jul 10 2020, 3:41 PM.

Details

Summary

Bug filled here: https://bugs.llvm.org/show_bug.cgi?id=45757.
Add comment to skipped regions so we don't track execution count for lines containing only comments.

Diff Detail

Event Timeline

zequanwu created this revision.Jul 10 2020, 3:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 10 2020, 3:41 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
zequanwu edited the summary of this revision. (Show Details)Jul 10 2020, 3:53 PM
hans added a comment.Jul 13 2020, 10:40 AM

Not sure if this is good idea to untrack comments, it breaks many tests under CoverageMapping.

I didn't look, but I'm surprised there are many coverage tests that have comments in them. If the comments are not important for those tests, maybe they could be removed? As long as we keep one that tracks the intended behavior of coverage on comments.

clang/lib/Parse/Parser.cpp
37 ↗(On Diff #277157)

I don't think this is the right way to do it. It seems this callback is intended for #if macros that exclude part of the file from preprocessing.

zequanwu updated this revision to Diff 277518.Jul 13 2020, 11:52 AM
zequanwu marked an inline comment as done.

Classfiying comments as SkippedRegion, https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h#L217
This will result overlapping SkippedRegion for comments and macro. Maybe, we shuold merge the overlapped SkippedRegion at the end.
Example:

File 0, 1:12 -> 6:2 = #0
 Skipped,File 0, 2:3 -> 4:9 = 0
 Skipped,File 0, 2:15 -> 2:25 = 0
   1|      1|int main() {
   2|       |  #ifdef TEST // comment
   3|       |  int x = 1;
   4|       |  #endif
   5|      1|  return 0;
   6|      1|}
zequanwu added inline comments.Jul 13 2020, 11:52 AM
clang/lib/Parse/Parser.cpp
37 ↗(On Diff #277157)

I believe we want to classify comments as SkippedRegion, https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h#L217.
Added a new method CommentSkipped to add comment range to skipped ranges.

zequanwu added a comment.EditedJul 13 2020, 11:54 AM

Not sure if this is good idea to untrack comments, it breaks many tests under CoverageMapping.

I didn't look, but I'm surprised there are many coverage tests that have comments in them. If the comments are not important for those tests, maybe they could be removed? As long as we keep one that tracks the intended behavior of coverage on comments.

The comments in those coverage tests are used for FileCheck, like //CHECK:. So, we can't remove those ones.

zequanwu updated this revision to Diff 277538.Jul 13 2020, 12:34 PM

Merge overlapped SkippedRegion.

Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2020, 12:34 PM
vsk added a comment.Jul 13 2020, 1:34 PM

@zequanwu thanks for working on this. Instead of threading CommentSkipped through PPCallbacks, wdyt of taking advantage of the existing CommentHandler interface? To simplify things, we can add a static method like 'CoverageMappingGen::setupPreprocessorCallbacks(Preprocessor &)' and call that from CodeGenAction::CreateASTConsumer.

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
152 ↗(On Diff #277538)

This seems like a lot of complexity to add to handle a narrow case. Is it necessary to merge skipped regions early in the process? Note that llvm's SegmentBuilder takes care of merging regions at the very end.

183 ↗(On Diff #277538)

This is stack use after free, right? This needs to release ownership of the 'MergedRegions' container.

zequanwu marked an inline comment as done.Jul 13 2020, 3:37 PM
In D83592#2148376, @vsk wrote:

taking advantage of the existing CommentHandler interface? To simplify things, we can add a static method like 'CoverageMappingGen::setupPreprocessorCallbacks(Preprocessor &)' and call that from CodeGenAction::CreateASTConsumer.

Sorry, I don't quite understand this. I am already using ActionCommentHandler to handle this one inside HandleComment.

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
152 ↗(On Diff #277538)

Merging at here is to avoid duplicate output for option -dump-coverage-mapping. Like the following example to avoid Skipped,File 0, 2:3 -> 4:9 = 0 and Skipped,File 0, 2:15 -> 2:25 = 0 to be outputted at the same time. They should be merged into 1 region Skipped,File 0, 2:3 -> 4:9 = 0

File 0, 1:12 -> 6:2 = #0
 Skipped,File 0, 2:3 -> 4:9 = 0
 Skipped,File 0, 2:15 -> 2:25 = 0
   1|      1|int main() {
   2|       |  #ifdef TEST // comment
   3|       |  int x = 1;
   4|       |  #endif
   5|      1|  return 0;
   6|      1|}

So, we need to do it early.

vsk added a comment.Jul 13 2020, 4:01 PM
In D83592#2148376, @vsk wrote:

taking advantage of the existing CommentHandler interface? To simplify things, we can add a static method like 'CoverageMappingGen::setupPreprocessorCallbacks(Preprocessor &)' and call that from CodeGenAction::CreateASTConsumer.

Sorry, I don't quite understand this. I am already using ActionCommentHandler to handle this one inside HandleComment.

ActionCommentHandler seems a bit specific to Sema. This is what I had in mind: https://github.com/vedantk/llvm-project/commit/5101dd5dcd92742805069dba6b3d6d6846ef3755.

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
152 ↗(On Diff #277538)

I see that there are duplicate 'Skipped' regions emitted in this case, but I'm not sure what problems this causes. For testing, we could check that both skipped regions are emitted, or perhaps only check that the outermost skipped region is emitted and ignore others. Is there some other kind of hard error (like an assert) caused by having nested skipped regions? If not, perhaps it's not worth it to merge them.

zequanwu updated this revision to Diff 277605.Jul 13 2020, 4:28 PM

Vedant, thanks for your suggestions.

  • Remove merging function.
  • Address comments.
zequanwu marked an inline comment as done.Jul 13 2020, 4:29 PM
zequanwu added inline comments.
llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
152 ↗(On Diff #277538)

No, it doesn't cause any error.

zequanwu updated this revision to Diff 277606.Jul 13 2020, 4:32 PM

Remove old code.

zequanwu updated this revision to Diff 277611.Jul 13 2020, 4:36 PM

Accidentally uploaded binary.. removed now.

vsk added a comment.Jul 13 2020, 4:49 PM

Before updating any tests, maybe it's worth doing a quick experiment with comments placed in different contexts, to see whether adding these skipped regions is really sufficient. For example, given:

1|  for (auto x : collection) {
2|    // Explain the loop.
3|  }

The loop region covers lines 1-3. If we skip the comment range on line 2, does an execution count from the loop region still get picked up? I'd expect it to. It's possible that we need more information from the preprocessor about whether the line is fully comment/whitespace-only.

zequanwu added a comment.EditedJul 13 2020, 4:57 PM
In D83592#2148833, @vsk wrote:

Before updating any tests, maybe it's worth doing a quick experiment with comments placed in different contexts, to see whether adding these skipped regions is really sufficient. For example, given:

1|  for (auto x : collection) {
2|    // Explain the loop.
3|  }

The loop region covers lines 1-3. If we skip the comment range on line 2, does an execution count from the loop region still get picked up? I'd expect it to. It's possible that we need more information from the preprocessor about whether the line is fully comment/whitespace-only.

Here is what I got:

$ clang -fcoverage-mapping -fprofile-instr-generate /tmp/a.c -Xclang -dump-coverage-mapping && ./a.out && llvm-profdata merge -sparse default.profraw -o a.profdata && llvm-cov show ./a.out -instr-profile=a.profdata
main:
  File 0, 1:12 -> 6:2 = #0
  File 0, 2:19 -> 2:25 = (#0 + #1)
  File 0, 2:27 -> 2:30 = #1
  Gap,File 0, 2:31 -> 2:32 = #1
  File 0, 2:32 -> 4:4 = #1
  Skipped,File 0, 3:5 -> 3:15 = 0
    1|      1|int main() {
    2|     11|  for (int i = 0; i < 10; i++) {
    3|       |    // comment
    4|     10|  }
    5|      1|  return 0;
    6|      1|}

For those failed test cases, they are caused by extra Skipped, File ... lines which invalidate some CHECK-NEXT.
I could either change CHECK-NEXT to CHECK or moving those checks inside function to above the function, since comments outside functions will not be tracked.

hans added a comment.Jul 14 2020, 3:58 AM

The comments in those coverage tests are used for FileCheck, like //CHECK:. So, we can't remove those ones.

Oh, I didn't think about that :-)

It's a bit unusual and annoying that the test expectations end up affecting the test output. For the comments that are not really part of the test, maybe we could do something to exclude them, like running 'sed' before compiling, or wrapping them in #if 0 blocks or something.

But figuring that out seems less important than figuring out how to implement the feature, and I think the current approach looks promising.

zequanwu updated this revision to Diff 277909.Jul 14 2020, 11:16 AM

Fix failed test cases with sed to remove comments.

vsk added a comment.EditedJul 14 2020, 12:05 PM

Could you add an end-to-end llvm-cov test (see e.g. compiler-rt/test/profile/Linux/coverage_ctors.cpp)? Here are some important cases I think we should check:

  • /* comment at the start of a line */ expr;
  • expr; /* comment at the end of a line */
  • expr; // comment at the end of a line
  • // comment spanning entire line

In all but the last case, llvm-cov should report an execution count for the line. Testing multi-line variations of those cases would also be helpful.

clang/test/CoverageMapping/break.c
1–2

Could you introduce a lit substitution in clang/test/lit.cfg.py for this (say, '%strip_comments'?), so that there's just one copy of the sed command?

zequanwu updated this revision to Diff 277986.Jul 14 2020, 2:49 PM

Add lit substitution %strip_comments -> sed 's/\/\/.*//' %s > %T/%basename_t.

zequanwu updated this revision to Diff 277988.Jul 14 2020, 2:53 PM

%strip_comments -> sed 's/[ \t]*\/\/.*//' %s > %T/%basename_t.

In D83592#2151217, @vsk wrote:

Could you add an end-to-end llvm-cov test (see e.g. compiler-rt/test/profile/Linux/coverage_ctors.cpp)? Here are some important cases I think we should check:

  • /* comment at the start of a line */ expr;
  • expr; /* comment at the end of a line */
  • expr; // comment at the end of a line
  • // comment spanning entire line

In all but the last case, llvm-cov should report an execution count for the line. Testing multi-line variations of those cases would also be helpful.

llvm-cov doesn't report the execution count for the first 3 liens... I will try to fix that.

In D83592#2151217, @vsk wrote:

Could you add an end-to-end llvm-cov test (see e.g. compiler-rt/test/profile/Linux/coverage_ctors.cpp)? Here are some important cases I think we should check:

  • /* comment at the start of a line */ expr;
  • expr; /* comment at the end of a line */
  • expr; // comment at the end of a line
  • // comment spanning entire line

In all but the last case, llvm-cov should report an execution count for the line. Testing multi-line variations of those cases would also be helpful.

llvm-cov doesn't report the execution count for the first 3 liens... I will try to fix that.

The problem is that simply marking comment regions as SkippedRegion doesn't give enough information for llvm-cov about if there is code area before or after comments in the same line.

$ clang++ -fcoverage-mapping -fprofile-instr-generate /tmp/a.cpp -Xclang -dump-coverage-mapping && ./a.out && llvm-profdata merge -sparse default.profraw -o a.profdata && llvm-cov show ./a.out -instr-profile=a.profdata -debug-only="coverage-mapping"
main:
  File 0, 1:12 -> 7:2 = #0
  Skipped,File 0, 2:3 -> 2:39 = 0
  Skipped,File 0, 3:15 -> 3:49 = 0
  Skipped,File 0, 4:15 -> 4:46 = 0
  Skipped,File 0, 5:3 -> 5:34 = 0
Counter in file 0 1:12 -> 7:2, #0
Counter in file 0 2:3 -> 2:39, 0
Counter in file 0 3:15 -> 3:49, 0
Counter in file 0 4:15 -> 4:46, 0
Counter in file 0 5:3 -> 5:34, 0
Emitting segments for file: /tmp/a.cpp
Combined regions:
  1:12 -> 7:2 (count=1)
  2:3 -> 2:39 (count=0)
  3:15 -> 3:49 (count=0)
  4:15 -> 4:46 (count=0)
  5:3 -> 5:34 (count=0)
Segment at 1:12 (count = 1), RegionEntry
Segment at 2:3 (count = 0), RegionEntry, Skipped
Segment at 2:39 (count = 1)
Segment at 3:15 (count = 0), RegionEntry, Skipped
Segment at 3:49 (count = 1)
Segment at 4:15 (count = 0), RegionEntry, Skipped
Segment at 4:46 (count = 1)
Segment at 5:3 (count = 0), RegionEntry, Skipped
Segment at 5:34 (count = 1)
Segment at 7:2 (count = 0), Skipped
    1|      1|int main() {
    2|       |  /* comment at the start of a line */ int x = 0;
    3|       |  int y = 0;  /* comment at the end of a line */
    4|       |  int z = 0;  // comment at the end of a line
    5|       |  // comment spanning entire line
    6|      1|  return 0;
    7|      1|}
    8|       |

One way I could think is probably when we visit decl in CounterCoverageMappingBuilder, check for if there is SkippedRegion in the same line and then mark the decl range as CodeRegion. For the example, we will have 3 more CodeRegion which covers the ranges of int x = 0, int y = 0 and int z = 0.

vsk added a comment.Jul 16 2020, 3:16 PM

! In D83592#2156758, @zequanwu wrote:
One way I could think is probably when we visit decl in CounterCoverageMappingBuilder, check for if there is SkippedRegion in the same line and then mark the decl range as CodeRegion. For the example, we will have 3 more CodeRegion which covers the ranges of int x = 0, int y = 0 and int z = 0.

I don't think that will work well, because a decl can span multiple lines (which in turn can include comments).

The key issue seems to be that -- even having SourceRanges for all comments -- we don't know whether a line has non-whitespace, non-comment tokens. If it does, the line should have an execution count (and we don't even need to bother with emitting a skipped region covering the line). It sounds like we need a different hook in the preprocessor that reports SourceRanges for lines with no non-comment, non-whitespace tokens. Practically, I think this can be wired up just like a CommentHandler (maybe it could be called WhitespaceHandler?). One side-benefit of introducing a new hook like this is that coverage will handle whitespace-only lines the same way as comment-only lines.

There's probably more than one way to approach this. One thing to be aware of: llvm's LineCoverageStats class is the one responsible for determining whether a line is "mapped" (i.e. whether it has an execution count). It assumes that when a line begins with a skipped region, the subsequent portion of the line must also be skipped. If we only emit skipped regions for fully whitespace-like lines, that's fine. If not, depending on how C-style comments are handled, that may need updating.

In D83592#2157130, @vsk wrote:

! In D83592#2156758, @zequanwu wrote:
One way I could think is probably when we visit decl in CounterCoverageMappingBuilder, check for if there is SkippedRegion in the same line and then mark the decl range as CodeRegion. For the example, we will have 3 more CodeRegion which covers the ranges of int x = 0, int y = 0 and int z = 0.

I don't think that will work well, because a decl can span multiple lines (which in turn can include comments).

The key issue seems to be that -- even having SourceRanges for all comments -- we don't know whether a line has non-whitespace, non-comment tokens. If it does, the line should have an execution count (and we don't even need to bother with emitting a skipped region covering the line). It sounds like we need a different hook in the preprocessor that reports SourceRanges for lines with no non-comment, non-whitespace tokens. Practically, I think this can be wired up just like a CommentHandler (maybe it could be called WhitespaceHandler?). One side-benefit of introducing a new hook like this is that coverage will handle whitespace-only lines the same way as comment-only lines.

There's probably more than one way to approach this. One thing to be aware of: llvm's LineCoverageStats class is the one responsible for determining whether a line is "mapped" (i.e. whether it has an execution count). It assumes that when a line begins with a skipped region, the subsequent portion of the line must also be skipped. If we only emit skipped regions for fully whitespace-like lines, that's fine. If not, depending on how C-style comments are handled, that may need updating.

Thanks for the reply. I will work on that.

zequanwu updated this revision to Diff 279329.Jul 20 2020, 1:04 PM

Rebase and address comments.

  • Keep track of token locations before and after comment regions by onToken in Preprocessor so when emitting SkippedRegion we can determine if the region should be emitted or not.
  • Add an end-to-end llvm-cov test case.
Herald added a project: Restricted Project. · View Herald TranscriptJul 20 2020, 1:04 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
zequanwu updated this revision to Diff 279331.Jul 20 2020, 1:09 PM

arc diff --update automatically formatted test case. Reverted.

vsk added a comment.Jul 20 2020, 2:00 PM

@zequanwu at a high level, I think this is the right approach and it looks nice overall. I do have some feedback (inline). Thanks!

clang/lib/CodeGen/CoverageMappingGen.cpp
305

Could you restructure this to avoid mutating a SpellingRegion? This can aid debugging (the unmodified struct remains available for inspection). One way to do this might be to call the function "adjustSkippedRange" and have it return an Optional<SpellingRegion>.

306

What exactly is 'Range' in updateSkippedRange? It seems like it's the {prev,next}TokLoc pair from SkippedRegion, but it's not obvious based on a cursory read. It would help to rename 'Range' to something more specific ('InnermostNonCommentRange'? 'OverlappingExecutedTokenRange'?).

clang/lib/CodeGen/CoverageMappingGen.h
50

It'd be more self-descriptive to replace the pair with some struct SkippedRange { SourceRange SkippedRange; SourceLocation PrevTokLoc, NextTokLoc; }.

51

Optional<unsigned> LastCommentIndex would be slightly more idiomatic.

54–60

Please leave short inline comments explaining the respective roles of PrevTokLoc and BeforeCommentLoc.

clang/test/lit.cfg.py
96

Bit of a nitpick, but I don't think it'll necessarily end up being more convenient to have "%T/%basename_t" hardcoded in this substitution (strawman, but: you can imagine a test that needs to strip comments from two different files). It'd be nice to be more explicit and write the run lines like:

RUN: %strip_comments %s > %t.stripped.c
RUN: clang ... %t.stripped.c

compiler-rt/test/profile/Linux/coverage_comments.cpp
1 ↗(On Diff #279331)

I don't think there's anything Linux or gold-specific about this test. Could you move the test up one directory, so we can run it on Darwin/Windows/etc.?

zequanwu updated this revision to Diff 279394.Jul 20 2020, 8:17 PM
zequanwu marked 7 inline comments as done.

Address comments.

zequanwu updated this revision to Diff 279395.Jul 20 2020, 8:22 PM

Missed something. Updated

vsk added a comment.Jul 21 2020, 11:28 AM

The testing looks really good. Just a few more requests for documentation (inline).

clang/lib/CodeGen/CoverageMappingGen.cpp
304

This could use some description of what adjustment is done, e.g. "This shrinks the skipped range if it spans a line that contains a non-comment token. If shrinking the skipped range would make it empty, this returns None."

clang/lib/CodeGen/CoverageMappingGen.h
36

It'd be helpful to leave inline documentation for these fields as well. It's clear from context that 'Range' is the skipped source range, but it's not as clear what {Prev,Next}TokLoc are.

49

Please end sentences with a '.'

54–60

How does "Location of the token parsed before HandleComment is called. This is updated every time Preprocessor::Lex lexes a new token." sound?

zequanwu updated this revision to Diff 279595.Jul 21 2020, 11:40 AM
zequanwu marked 2 inline comments as done.
zequanwu retitled this revision from [Parser] Add comment to skipped regions to [Coverage] Add comment to skipped regions.
zequanwu edited the summary of this revision. (Show Details)
  • Address comments
  • Update summary.

New tests should use [[#@LINE]] instead of [[@LINE]]

https://llvm.org/docs/CommandGuide/FileCheck.html

To support legacy uses of @LINE as a special string variable, FileCheck also accepts the following uses of @LINE with string substitution block syntax: [[@LINE]], [[@LINE+<offset>]] and [[@LINE-<offset>]] without any spaces inside the brackets and where offset is an integer.

zequanwu updated this revision to Diff 279605.Jul 21 2020, 12:09 PM

Change @LINE to # @LINE.

zequanwu updated this revision to Diff 279661.Jul 21 2020, 4:28 PM

remove unnecessary header.

vsk accepted this revision.Jul 21 2020, 5:22 PM

Thanks, lgtm.

This revision is now accepted and ready to land.Jul 21 2020, 5:22 PM
MaskRay added inline comments.Jul 21 2020, 5:43 PM
compiler-rt/test/profile/coverage_comments.cpp
10

I think you can just use CHECK-NEXT: for each line and remove duplicated code on the right side. See some recent gcov-* tests I added.

zequanwu marked an inline comment as done.Jul 21 2020, 6:02 PM
zequanwu added inline comments.
compiler-rt/test/profile/coverage_comments.cpp
10

Thanks, I removed the duplicated code when fixed a coverage test failure.

thakis added a subscriber: thakis.Jul 21 2020, 6:41 PM

Looks like this breaks tests on Windows: http://45.33.8.238/win/20315/step_7.txt

Please take a look, and if it takes a while to investigate please revert while you look.

Looks like this breaks tests on Windows: http://45.33.8.238/win/20315/step_7.txt

Please take a look, and if it takes a while to investigate please revert while you look.

Fixed

hans added a comment.Jul 22 2020, 8:11 AM

This caused asserts in Chromium's coverage builds: https://bugs.chromium.org/p/chromium/issues/detail?id=1108352
I've reverted it in the meantime (238bbd48c5a5f84deca36e5df980241578f7c1df).

It also looks like the commit landed with the wrong Phabricator review number.

(This was committed within 10 minutes after it was approved. As a large change with updates to some sensitive places like clang/Lex/Preprocessor.h, this probably should have been waited a bit longer.)

vsk added a comment.Jul 22 2020, 2:22 PM

@zequanwu I'm not sure whether this is something you've already tried, but for frontend changes, it can be helpful to verify that a stage2 clang self-host with assertions enabled completes successfully. For coverage specifically, it's possible to do that by setting '-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On' during the stage2 cmake step.

In D83592#2167926, @vsk wrote:

@zequanwu I'm not sure whether this is something you've already tried, but for frontend changes, it can be helpful to verify that a stage2 clang self-host with assertions enabled completes successfully. For coverage specifically, it's possible to do that by setting '-DLLVM_BUILD_INSTRUMENTED_COVERAGE=On' during the stage2 cmake step.

I didn't know that before. I will try.

When including some standard libraries, the assertion assert(SM.isWrittenInSameFile(LocStart, LocEnd) && "region spans multiple files"); fails, like <stdlib.h> or <cmath> etc, still investigating.

I found the problem is that I was trying to merge two skipped regions when they are adjacent.
So, the following code will fail at assertion, because the skipped regions are in different files, but are merged.

$ cat a.h
// comment
$ cat a.c
#include "a.h"
// comment
int main() {}
zequanwu updated this revision to Diff 280005.EditedJul 22 2020, 7:40 PM

Fix the bug about merging skipped regions. Simple don't merge, because adjustSkippedRange will handle those cases if NextTokLoc is invalid, which will just skip the second if condition.

zequanwu reopened this revision.Jul 22 2020, 7:40 PM
This revision is now accepted and ready to land.Jul 22 2020, 7:40 PM

Just tried the cmake option -DLLVM_BUILD_INSTRUMENTED_COVERAGE=On, the result still shows count for comments... but it pass the coverage_comments.cpp test case, need more investigation.

zequanwu updated this revision to Diff 280283.EditedJul 23 2020, 4:02 PM
zequanwu edited the summary of this revision. (Show Details)

In Preprocessor.cpp, don't increment TokenCount if LexLevel == 0.

I tested this with chromium crypto_unittests. It doesn't track comments regions.

I don't why when using the cmake option -DLLVM_BUILD_INSTRUMENTED_COVERAGE=On to test coverage for clang itself, it does tracking comments.
Also, with that option on, llvm-cov crashes at assertion:llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp:578: virtual Expected<const char *> (anonymous namespace)::VersionedCovMapFuncRecordReader<llvm::coverage::Version4, unsigned long, llvm::support::little>::readCoverageHeader(const char *, const char *, BinaryCoverageReader::DecompressedData &) [Version = llvm::coverage::Version4, IntPtrT = unsigned long, Endian = llvm::support::little]: Assertion (CovMapVersion)CovHeader->getVersion<Endian>() == Version' failed.`

zequanwu marked an inline comment as done.Jul 23 2020, 4:10 PM
zequanwu added inline comments.
clang/lib/Lex/Preprocessor.cpp
973

@hans , can you take a look on this part? I saw TokenCount was introduced for a warning -Wmax-tokens.

hans added inline comments.Jul 24 2020, 2:04 AM
clang/lib/Lex/Preprocessor.cpp
973

Can you explain why things are changing here?

The idea of TokenCount is to simply count the preprocessor tokens. At this point I think you have more knowledge here than I did when I added it :)

zequanwu marked an inline comment as done.Jul 24 2020, 9:45 AM
zequanwu added inline comments.
clang/lib/Lex/Preprocessor.cpp
973

In CodeGen::CoverageMappingModuleGen::setUpCoverageCallbacks, I set OnToken to a lambda to update PrevTokLoc after lexing a new token. But, the original condition if (LexLevel == 0 && !Result.getFlag(Token::IsReinjected)) will not call OnToken if it's lexing a directive, like #if, #define etc, because the LexLevel is greater than 0. In order to update PrevTokLoc even when lexing directive, I add PreprocessToken which will be set to true in CodeGen::CoverageMappingModuleGen::setUpCoverageCallbacks just to invoke OnToken without increment TokenCount when lexing directives.

zequanwu updated this revision to Diff 280504.Jul 24 2020, 9:49 AM

Change newSR.ColumnEnd = 1 to newSR.ColumnEnd = SR.ColumnStart + 1 to ensure ColumnEnd >= ColumnStart.

hans added inline comments.Jul 24 2020, 10:18 AM
clang/lib/Lex/Preprocessor.cpp
973

Ah, I see. That sounds okay to me.

vsk added a comment.Jul 26 2020, 2:15 PM

I don't why when using the cmake option -DLLVM_BUILD_INSTRUMENTED_COVERAGE=On to test coverage for clang itself, it does tracking comments.
Also, with that option on, llvm-cov crashes at assertion:llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp:578: virtual Expected<const char *> (anonymous namespace)::VersionedCovMapFuncRecordReader<llvm::coverage::Version4, unsigned long, llvm::support::little>::readCoverageHeader(const char *, const char *, BinaryCoverageReader::DecompressedData &) [Version = llvm::coverage::Version4, IntPtrT = unsigned long, Endian = llvm::support::little]: Assertion (CovMapVersion)CovHeader->getVersion<Endian>() == Version' failed.`

Just to double-check: are you using llvm-{cov,profdata} binaries from the stage1 llvm build to produce reports for stage2 (coverage-instrumented) binaries?

In D83592#2174600, @vsk wrote:

I don't why when using the cmake option -DLLVM_BUILD_INSTRUMENTED_COVERAGE=On to test coverage for clang itself, it does tracking comments.
Also, with that option on, llvm-cov crashes at assertion:llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp:578: virtual Expected<const char *> (anonymous namespace)::VersionedCovMapFuncRecordReader<llvm::coverage::Version4, unsigned long, llvm::support::little>::readCoverageHeader(const char *, const char *, BinaryCoverageReader::DecompressedData &) [Version = llvm::coverage::Version4, IntPtrT = unsigned long, Endian = llvm::support::little]: Assertion (CovMapVersion)CovHeader->getVersion<Endian>() == Version' failed.`

Just to double-check: are you using llvm-{cov,profdata} binaries from the stage1 llvm build to produce reports for stage2 (coverage-instrumented) binaries?

I didn't config correctly before. I just tried. It works correctly now, not tracking comments.

@vsk Do you think it's ok to reland this commit?

vsk added a comment.Jul 28 2020, 12:37 PM

@zequanwu the change that dropped merging of adjacent comments looks good. If no further issues cropped up during stage2 testing, please feel free to re-land. Thanks!

This revision was automatically updated to reflect the committed changes.
Via Conduit