This is an archive of the discontinued LLVM Phabricator instance.

llvm-cov: __cxx_global_var_init functions disrupt coverage
AbandonedPublic

Authored by MaskRay on Aug 4 2023, 8:29 AM.

Details

Summary

My llvm-cov reports show phantom executable lines in some cases.
It's 100 reproducible.
It appears on random places (see attached images), even for unexistent lines.
My investigations show that it appears on __cxx_global_var_init* processing.

For example here is an output

    -:  397:/*------------------------------------------------------------------------------------------------*/
    -:  398:template<typename T>
#####:  399:void putGlmVec(boost::property_tree::ptree* const  parameter_node,
    -:  400:               boost::any                   const& value,
#####:  401:               std::string                  const& type)
    -:  402:/*------------------------------------------------------------------------------------------------*/
    -:  403:{

Line 401 is definitely false-positive. And here is debug data for this file for line 401

> gdb --args llvm-project/build/bin/llvm-cov gcov ./src/Settings/PersistenceAdapterSample/src/CMakeFiles/C_MXR_Settings_PersistenceAdapterSample_Base_Impl.dir/CPersistenceAdapterImpl.cpp.gcno
(gdb) b GCOV.cpp:689
(gdb) condition 1 lineNum==401
(gdb) r
(gdb) p si.filename.Data
$4 = 0x19f3020 "src/Settings/PersistenceAdapterSample/src/CPersistenceAdapterImpl.cpp"
(gdb) p f.Name
$7 = {static npos = 18446744073709551615, Data = 0x7ffff784faa0 "__cxx_global_var_init.44", Length = 24}
(gdb) p lineNum 
$8 = 401
(gdb) p f.blocks[4]->lines[0]
$29 = (unsigned int &) @0x1aa0300: 401
(gdb) p f.startLine
$9 = 0
(gdb) p f.startColumn 
$10 = 0
(gdb) p f.endLine 
$11 = 0

So, at line 401 llvm-cov found executable source code of function __cxx_global_var_init.44, while there is actually a definition of putGlmVec().
It's definitely bug. For project with ~100K lines I have many similar cases.
Probably, my patch is not the best way to fix it and there are better ways to filter such functions.
If so, please direct me.

Diff Detail

Event Timeline

tenta4 created this revision.Aug 4 2023, 8:29 AM
tenta4 requested review of this revision.Aug 4 2023, 8:29 AM
tenta4 edited the summary of this revision. (Show Details)Aug 4 2023, 11:04 AM
tenta4 edited the summary of this revision. (Show Details)Aug 6 2023, 2:40 AM

@bogner please review 🙏

If the right answer is to skip functions that don't have line info we should probably do that in Context::print where we loop over file.functions or maybe just at the beginning of collectFunction - doing this on a per block basis inside of the function doesn't really make sense.

That said, I think there needs to be a bit more investigation here, because the fact that there's a block implies we have debug information that should be meaningful. A couple of ideas for where to look to see if we can get a better idea of what's going on:

  • Do f.file and f.getFilename() make sense given what you see in si?
  • Is something going wrong in GCOVFile::readGCNO when we're handling GCOV_TAG_BLOCKS, such that we're associating a block with the wrong function or something?
  • Is the data in the .gcno file itself corrupt?
  • Is the .gcno a newer version of the format that adds new information in a way that we don't handle it?
tenta4 added a comment.EditedAug 15 2023, 9:06 AM

Thank you.
Yes, I agree that this is a wrong place for filtering.
And, furthermore it's wrong approach to filter out something weird just because we don't understand the reasons of its appearance.

I'll be able to proceed investigations on next week
Now I'll just try to answer your questions:
1 - There are the same pointers:

p f.getFilename().Data == si.filename.Data
$6 = true

2 - Is something going wrong in GCOVFile::readGCNO..? - Didn't investigate so deeper yet
3 - Is the data in the .gcno file itself corrupt? - Not sure how to check it
4 - Is the .gcno a newer version of the format that adds new information in a way that we don't handle it? - I almost sure they are compatible. I use clang-18 to generate gcno and latest llvm-project to analyze it (currently its version also 18)

MaskRay added a subscriber: MaskRay.EditedAug 15 2023, 9:45 AM

I just noticed this patch. This problem was known, see // TODO Unhandled in GCOVFile::readGCNO from 01899bb4e41178af6f4cf7b32833e661fe1e3322 (2020).

We do not implement the feature attributing a block from another file. I mitigated the problem in 406e81b79d26dae6838cc69d10a3e22635da09ef (re-landed yesterday).
We do not need mitigation on the llvm-cov gcov side.

Wow, I just checked coverage with fresh clang and it seems that problem was fixed.
Thanks @MaskRay, you saved so much of my time

bogner resigned from this revision.Aug 15 2023, 4:47 PM

Wow, I just checked coverage with fresh clang and it seems that problem was fixed.
Thanks @MaskRay, you saved so much of my time

Thanks. Then I think this patch can be abandoned. You may create a feature request to cherry pick 406e81b79d26dae6838cc69d10a3e22635da09ef to the release/17.x branch, if you think it is useful. (https://discourse.llvm.org/t/llvm-17-x-release-branch-created/72280)

Wow, I just checked coverage with fresh clang and it seems that problem was fixed.
Thanks @MaskRay, you saved so much of my time

Thanks. Then I think this patch can be abandoned. You may create a feature request to cherry pick 406e81b79d26dae6838cc69d10a3e22635da09ef to the release/17.x branch, if you think it is useful. (https://discourse.llvm.org/t/llvm-17-x-release-branch-created/72280)

Created release/17.x backport request: https://github.com/llvm/llvm-project/issues/64831

MaskRay commandeered this revision.Aug 19 2023, 3:07 PM
MaskRay abandoned this revision.
MaskRay added a reviewer: tenta4.