Page MenuHomePhabricator

[Coverage] Add empty line regions to SkippedRegions
ClosedPublic

Authored by zequanwu on Jul 30 2020, 3:53 PM.

Details

Summary

Add a hook to track empty line regions when lexing.

Diff Detail

Event Timeline

zequanwu created this revision.Jul 30 2020, 3:53 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJul 30 2020, 3:53 PM
zequanwu requested review of this revision.Jul 30 2020, 3:53 PM
zequanwu updated this revision to Diff 285811.EditedAug 14 2020, 8:55 PM
  • Rebase.
  • Merging adjcent skipped regions, if the regions are in the same file, could reduce binary size.
  • The new empty lines regions break existing coverage tests, and don't know a good way to fix those. (probably add a null statement ; at each empty line?)
  • For stage2 build of clang, I didn't notice a performance slowdown this time.
Herald added a project: Restricted Project. · View Herald TranscriptAug 14 2020, 8:55 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
zequanwu edited the summary of this revision. (Show Details)Aug 14 2020, 8:58 PM
zequanwu added a reviewer: arphaman.
vsk added a comment.Aug 17 2020, 10:03 AM

Hi @zequanwu, are you looking for review for this patch? I wasn't sure because of the WIP label.

zequanwu retitled this revision from WIP [Coverage] Add empty line regions to SkippedRegions to [Coverage] Add empty line regions to SkippedRegions.Aug 17 2020, 10:09 AM
In D84988#2221805, @vsk wrote:

Hi @zequanwu, are you looking for review for this patch? I wasn't sure because of the WIP label.

Yes, I removed the label.

vsk added inline comments.Aug 17 2020, 10:30 AM
clang/include/clang/Lex/Lexer.h
131

Could you leave a comment describing what this is?

clang/include/clang/Lex/Preprocessor.h
957

const

clang/lib/CodeGen/CoverageMappingGen.cpp
336

It looks like this assumes there is some guarantee that the skipped range (as given by SR) is in the same file as {Prev,Next}TokLoc. If that's correct, can we go ahead and assert that?

clang/lib/Lex/Lexer.cpp
3290

Is NewLinePtr supposed to point to the start of the most recent newline sequence? For "\r\n", is it supposed to be "\r<NewLinePtr>\n" or "<NewLinePtr>\r\n"?

llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

Why is this deletion necessary? Do we need to introduce 0-length regions which alter the count? An example would help.

595–597

I don't think this relaxation is correct, since it permits duplicate segments. This is confusing for reporting tools because it's not possible to work out which segment applies at a given source location.

zequanwu updated this revision to Diff 286123.Aug 17 2020, 1:09 PM
zequanwu marked 3 inline comments as done.
zequanwu edited the summary of this revision. (Show Details)

Address comments.

clang/lib/CodeGen/CoverageMappingGen.cpp
336

Oh, it's a bug in https://reviews.llvm.org/D83592. There is no guarantee that they are in the same file.

clang/lib/Lex/Lexer.cpp
3290

I didn't consider this. Updated.
NewLinePtr is supposed to point to the '\n' character. For "\r\n", it will point to '\n'.

llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

Because a single empty line will be a 0 length region. I don't know why is this condition necessary before. Does zero-length region exists before this change?

example:

int main() {

  return 0;
}

Before, llvm-cov gives the following.

Counter in file 0 1:12 -> 4:2, #0
Counter in file 0 2:1 -> 2:1, 0
Emitting segments for file: /tmp/a.c
Combined regions:
  1:12 -> 4:2 (count=1)
  2:1 -> 2:1 (count=0)
Segment at 1:12 (count = 1), RegionEntry
Segment at 2:1 (count = 0), RegionEntry, Skipped
Segment at 4:2 (count = 0), Skipped
    1|      1|int main() {
    2|       |
    3|       |    return 0;
    4|       |}

After:

Counter in file 0 1:12 -> 4:2, #0
Counter in file 0 2:1 -> 2:1, 0
Emitting segments for file: /tmp/a.c
Combined regions:
  1:12 -> 4:2 (count=1)
  2:1 -> 2:1 (count=0)
Segment at 1:12 (count = 1), RegionEntry
Segment at 2:1 (count = 0), RegionEntry, Skipped
Segment at 2:1 (count = 1)
Segment at 4:2 (count = 0), Skipped
    1|      1|int main() {
    2|       |
    3|      1|    return 0;
    4|      1|}
595–597

I don't remember why I made this change. Reverting it seems nothing changed.

zequanwu added inline comments.Aug 17 2020, 1:12 PM
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
595–597

Oh, since single empty line will be a 0 length regions. L.Col could equal to R.Col.

vsk added inline comments.Aug 18 2020, 12:34 PM
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

It looks like we do occasionally see 0-length regions, possibly due to bugs in the frontend (http://lab.llvm.org:8080/coverage/coverage-reports/coverage/Users/buildslave/jenkins/workspace/coverage/llvm-project/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp.html#L485).

I don't expect the reporting tools to be able to handle duplicate segments in a robust way. Having duplicate segments was the source of "messed up" coverage bugs in the past, due to the contradiction inherent in having two different segments begin at the same source location.

Do you see some other way to represent empty lines? Perhaps, if we emit a skipped region for an empty line, we can emit a follow-up segment that restores the previously-active region starting on the next line? So in this case:

Segment at 1:12 (count = 1), RegionEntry
Segment at 2:1 (count = 0), RegionEntry, Skipped
Segment at 3:1 (count = 1)
Segment at 4:2 (count = 0), Skipped

zequanwu updated this revision to Diff 286622.Aug 19 2020, 11:38 AM

Add a wrapped segment at location of zero-length segment with last pushed region count.

zequanwu added inline comments.Aug 19 2020, 11:41 AM
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

I think we should have the following, because the wrapped segment's count will be used in next line (e.g. line 3).

Segment at 1:12 (count = 1), RegionEntry
Segment at 2:1 (count = 0), RegionEntry, Skipped
Segment at 2:1 (count = 1)
Segment at 4:2 (count = 0), Skipped
zequanwu updated this revision to Diff 286688.Aug 19 2020, 5:03 PM

minor fix.

Friendly ping.

vsk added inline comments.Sep 1 2020, 8:14 AM
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

I think one issue with that output is that it contains two segments that start at the same location (2:1). Historically, this sort of duplication was a source of bugs/inconsistencies (e.g two entry regions beginning at the same location with different counts), and I’m concerned that re-allowing multiple segments with the same start location could lead to regressions down the road.

OTOH, your change is consistent with how non-zero length segments are handled, and it could be fragile to look for an alternative start location (like 3:1) that restores the count from before the skipped region.

I’d be curious to know your thoughts on how to prevent regressions related to segments which share the same start location. Maybe they could only be allowed in this limited case?

zequanwu updated this revision to Diff 289582.Sep 2 2020, 3:05 PM

Emit second segment as wrapped segment only when first segment is RegionEntry

zequanwu added inline comments.Sep 2 2020, 3:11 PM
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

Yes, they are two segments with the same location, but only the first is a RegionEntry (might not cause the same bug you saw before). So the second one is only used as wrapped segment, and will not conflict with first segment RegionEntry.
We could only emit the second segment when the first segment is skipped segment.

vsk added a comment.Sep 2 2020, 4:59 PM

I'm not as familiar with the preprocessor bits, but this looks like it's in good shape.

llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

That carveout sounds reasonable to me.

595–597

Could you keep the original check and simply 'continue' if L.Col == R.Col && L.IsSkipped?

zequanwu updated this revision to Diff 289617.EditedSep 2 2020, 6:20 PM

address comment.
emit second segment when first segment is skipped and activeregions is not empty.
I don't know who can be added as reviewers for the preprocessor part

zequanwu added inline comments.Sep 2 2020, 6:33 PM
llvm/lib/ProfileData/Coverage/CoverageMapping.cpp
483–498

Since this broke a unit test, which has 5 zero-length regions points to same location, I think it's reasonable only emitting the second segment when active regions is not empty.

595–597

if (L.Line == R.Line && L.Col == R.Col && !L.HasCount) is more specific.

vsk accepted this revision.Sep 9 2020, 12:35 PM

I took a closer look at the lexer changes, and think that these look fine. Thanks again for working on this. LGTM with a minor change, described inline.

clang/lib/Lex/Lexer.cpp
2230

It'd be more consistent to write this the way lastNewLine is initialized earlier in SkipWhitespace, e.g. as:

if (*CurPtr == '\n') {
  lastNewLine = CurPtr;
  if (!NewLinePtr)
    NewLinePtr = CurPtr;
}

Maybe this could even be factored into a helper lambda, like setLastNewLine(char *Ptr) -- I'm not sure whether that'd be cleaner.

This revision is now accepted and ready to land.Sep 9 2020, 12:35 PM

Thanks for reviewing. One last thing I need to resolve is to handle the coverage test case as skipped regions are emitted for empty lines, and haven't thought a good way other than adding checks for those new skipped regions.

vsk added a comment.Sep 9 2020, 2:04 PM

Thanks for reviewing. One last thing I need to resolve is to handle the coverage test case as skipped regions are emitted for empty lines, and haven't thought a good way other than adding checks for those new skipped regions.

How does it sound to add a testing-only cl::opt that disables whitespace skipped regions? Tests that aren't focused on validating skipped regions can set this cl::opt.

zequanwu updated this revision to Diff 291118.Sep 10 2020, 8:07 PM
  • Address comment.
  • Add cl::opt to disable emitting skipped regions for empty lines and comments (used on test case only), and update test cases under CoverageMapping with -mllvm -emptyline-comment-coverage=false.
  • Remove the logic which only emits skipped regions for comments and empty lines parsing function body, because can't tell if it's inside class member function.
  • Always set SR.ColumnStart to 1 in adjustSkippedRange.
vsk added inline comments.Sep 14 2020, 11:29 AM
clang/lib/CodeGen/CoverageMappingGen.cpp
38

We might consider marking this as cl::Hidden.

zequanwu updated this revision to Diff 291630.Sep 14 2020, 11:40 AM

Rename option EmptyLineCoverage to EmptyLineCommentCoverage and mark it cl::Hidden.

zequanwu marked an inline comment as done.Sep 14 2020, 11:40 AM

Friendly ping.

vsk added a comment.Sep 18 2020, 1:50 PM

@zequanwu FTR, I don't have any outstanding concerns (I understand you might be asking for a second reviewer to chime in though).

In D84988#2282849, @vsk wrote:

@zequanwu FTR, I don't have any outstanding concerns (I understand you might be asking for a second reviewer to chime in though).

Thanks for reviewing. Then, I think it might be ready to land.

This revision was automatically updated to reflect the committed changes.