This is an archive of the discontinued LLVM Phabricator instance.

[Coverage] Revise format to reduce binary size
ClosedPublic

Authored by vsk on Oct 26 2019, 7:02 PM.

Details

Summary

Revise the coverage mapping format to reduce binary size by:

  1. Naming function records and marking them linkonce_odr, and
  2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

  • With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are *duplicate* placeholders for functions visible-but-not-used in TUs. Placeholders *are* used to show under-covered functions, but duplicate placeholders waste space.
  • We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using linkonce_odr to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side.
  • Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Diff Detail

Event Timeline

vsk created this revision.Oct 26 2019, 7:02 PM
vsk updated this revision to Diff 226582.Oct 27 2019, 12:20 PM
  • Improve wording in CoverageMappingFormat.rst.
dblaikie added inline comments.
clang/lib/CodeGen/CoverageMappingGen.h
56–65

I'd probably skip all these & let it be a plain struct (& use {} init to construct it, as good as the ctor provided I think).

llvm/lib/Transforms/Instrumentation/InstrProfiling.cpp
74

Declaring things from other files without a common header's not ideal. (makes it easy to violate ODR - accidentally change one declaration without the other, etc)

vsk updated this revision to Diff 226828.Oct 28 2019, 6:30 PM
vsk marked 2 inline comments as done.

Declare DoInstrProfNameCompression in a shared header, address remaining feedback and rebase.

rnk added a comment.Oct 29 2019, 3:39 PM

I remember the direction from the BoF back in 2017. I'm glad you were able to find the time to work on this and that it ultimately worked out. Thanks so much!

I thought maybe you would need to do more to find .lcovfun on Windows, but I saw this comment:
https://github.com/llvm/llvm-project/blob/e73ae9a/compiler-rt/lib/profile/InstrProfilingPlatformWindows.c#L19

/* Do *NOT* merge .lprfn and .lcovmap into .rdata. llvm-cov must be able to find
 * after the fact.
 */

So, I guess it will work as is.

vsk added a comment.Nov 7 2019, 2:44 PM

Would it help review to split the NFC portion of the change in the coverage::accessors namespace into its own patch?

vsk added subscribers: hans, efriedma.Dec 2 2019, 8:22 AM

Gentle ping. I'm not in a rush to land this, but it'd help to have some feedback about whether/how this should patch should be split up.

rnk added a comment.Dec 2 2019, 4:22 PM

I'm not listed as a reviewer, but it looks good to me.

clang/lib/CodeGen/CoverageMappingGen.cpp
1331

I think this needs to be namespaced somehow, something like __covrec_0afeec...u. The meaning is more obvious, and there is a (admittedly small) risk of colliding with other random 64-bit hex symbols without a double-underscore prefix.

1372

I guess we really do want llvm.used, we really want these records to be retained in the binary by the linker even though nothing references them.

clang/test/Profile/def-assignop.cpp
21

I guess 6 was just stale/wrong.

llvm/docs/CoverageMappingFormat.rst
228

This section of the document isn't word wrapped to 80 cols, but the rest is.

246

Do you need to highlight "dummy" records? I thought all function records are linkonce_odr, so that any duplicate records can be deduplicated, not just the dummy ones for unused inline functions.

hans added a comment.Dec 3 2019, 5:14 AM

Looks good as far as I can tell. I think it can be simplified by asserting that zlib::compress doesn't fail, though.

llvm/docs/CoverageMappingFormat.rst
295

missing 'the' before "function name's"?

llvm/include/llvm/ProfileData/Coverage/CoverageMapping.h
727

s/consider/considered/

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
39

It seems the only way this can fail is if zlib::compress fails, which should not be possible except if it runs out of memory, which we generally don't diagnose.

I think we could assert that zlib::compress succeeds, simplify the interface of this function, and remove the need for the new err_code_coverage_mapping_failed diag, as well as the compression_failed enumerator.

vsk updated this revision to Diff 232005.Dec 3 2019, 3:56 PM
vsk marked 10 inline comments as done.
vsk added reviewers: rnk, hans.
vsk removed subscribers: hans, rnk.

Address review feedback.

clang/lib/CodeGen/CoverageMappingGen.cpp
1372

Yes, this ensures that the records can be read back when generating a report. The record names can be stripped out post-link to reduce bloat in the linkedit section.

llvm/docs/CoverageMappingFormat.rst
246

I think it's helpful to highlight "dummy" records, as these motivate the linkonce_odr idea. I'll point out that all function records are linkonce_odr.

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
39

Thanks, sgtm.

efriedma added inline comments.Dec 3 2019, 4:27 PM
llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
56

llvm::report_bad_alloc_error?

rnk accepted this revision.Dec 3 2019, 4:42 PM

I think it looks good, but I'd wait for @hans too.

BTW, after this we should remove -limited-coverage-experimental:
https://github.com/llvm/llvm-project/blob/bbc328c/clang/lib/CodeGen/CodeGenModule.cpp#L67

llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
572–574

While working on CodeView, I think we came up with a good code pattern that could be used here to make this easier to read and less error prone. Check out llvm::codeview::consume:
https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/DebugInfo/CodeView/RecordSerialization.h#L64

Ignore the use of BinaryStreamReader, that exists to abstract over discontiguous inputs, and isn't relevant to this code here.

You could make a helper that looks like this and use it in various places to avoid all the reinterpret_casts:

template <typename T>
bool consumeObject(StringRef &Buf, const T *&Obj) {
  if (Buf.size() < sizeof(T))
    return false;
  Obj = reinterpret_cast<const T*>(Buf.data());
  Buf = Buf.drop_front(sizeof(T));
  return true;
}

And then instead of dealing in pointers and pointer arithmetic, you slice up StringRef buffers into sub buffers. It becomes much harder to create array OOB bugs.

This is just a suggestion for future work, it seems out of scope for this patch.

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
53–56

I think the shorter pattern for this is cantFail(zlib::compress(...)).

This revision is now accepted and ready to land.Dec 3 2019, 4:42 PM
hans accepted this revision.Dec 4 2019, 4:01 AM

lgtm

vsk marked 3 inline comments as done.Dec 4 2019, 9:49 AM
vsk added inline comments.
llvm/lib/ProfileData/Coverage/CoverageMappingReader.cpp
572–574

Thanks for the suggestion, I'll keep this in mind for future refactors.

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
53–56

Both sound reasonable, I'll pick the shorter spelling.

vsk updated this revision to Diff 232168.Dec 4 2019, 9:53 AM
vsk marked an inline comment as done.
This revision was automatically updated to reflect the committed changes.
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 4 2019, 10:19 AM
Herald added subscribers: Restricted Project, cfe-commits. · View Herald Transcript
efriedma added inline comments.Dec 4 2019, 2:18 PM
llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
53–56

cantFail and report_bad_alloc_error have substantially different semantics. In particular, for non-Asserts builds, cantFail is a no-op, which doesn't seem appropriate for an API that fails on out-of-memory.

vsk reopened this revision.Dec 4 2019, 3:05 PM
vsk marked an inline comment as done.

I reverted this due to failures on Windows that I did not encounter in local testing. I suspect that there's an error in the coverage parsing logic, as the same binary coverage data parses successfully locally, but not on some armv7 bots.

llvm/lib/ProfileData/Coverage/CoverageMappingWriter.cpp
53–56

I see, I'll revise this to use report_bad_alloc_error in the next update.

This revision is now accepted and ready to land.Dec 4 2019, 3:05 PM
vsk added a comment.Dec 11 2019, 3:11 PM

Sorry for the delay here. An update: I tested this patch with a sanitized build on a Linux install, but could not reproduce the "malformed coverage data" errors seen on the Windows bots. At this point, it seems likely that there is a bug in CoverageMappingReader (this would explain why the .covmapping testing bundles fail to parse), and it may be that there is a separate issue on the producer side (this could explain why the 'instrprof-merging.cpp' test from check-profile failed). I think I need a Windows set up to debug further, and will try to find one. Meanwhile, if anyone has the bandwidth to attempt to reproduce on Windows, and to share a backtrace from where the CoverageMapError constructor is called, I would greatly appreciate it.

rnk added a comment.Dec 11 2019, 6:16 PM

I have a Windows build directory and am motivated to debug this. I'll try to do it tomorrow, but I have a couple of other deadlines so I can't make a very firm promise.

rnk added a comment.Feb 14 2020, 3:16 PM

I haven't forgotten about this, even though it's been two months. Hopefully I get to it soon.

vsk updated this revision to Diff 244790.Feb 14 2020, 4:09 PM

Rebase.

Sorry this hasn't seen any update: at this point in our release cycle I've had to put this on the back burner.

rnk added a comment.Feb 19 2020, 4:30 PM

Compiling with -fdump-record-layouts revealed the problem:

*** Dumping AST Record Layout
         0 | struct llvm::coverage::CovMapFunctionRecordV3
         0 |   struct llvm::coverage::accessors::FuncHashAndDataSize<struct llvm::coverage::CovMapFunctionRecordV3> (base) (empty)
         1 |   struct llvm::coverage::accessors::HashedNameRef<struct llvm::coverage::CovMapFunctionRecordV3> (base) (empty)
         1 |   const int64_t NameRef
         9 |   const uint32_t DataSize
        13 |   const uint64_t FuncHash
        21 |   const uint64_t FilenamesRef
        29 |   const char CoverageMapping
           | [sizeof=30, align=1,
           |  nvsize=30, nvalign=1]

Everything is off-by-one because the empty bases are not zero sized. The MSVC record layout algorithm is just different in this area. =/

So, I think this patch would be fine if you refactor it to avoid the accessor classes. I took a stab at it, but it's not straightforward.

In D69471#1883912, @rnk wrote:

Everything is off-by-one because the empty bases are not zero sized. The MSVC record layout algorithm is just different in this area. =/

Do all the MSVCs we support building with support __declspec(empty_bases)?
https://devblogs.microsoft.com/cppblog/optimizing-the-layout-of-empty-base-classes-in-vs2015-update-2-3/

So, I think this patch would be fine if you refactor it to avoid the accessor classes. I took a stab at it, but it's not straightforward.

Another option is to chain the accessor classes:

template <class T, class Base = void> struct MyAccessor1 : Base { /* ... */ };
template <class T> struct MyAccessor1<void> { /* ... */ };
template <class T, class Base = void> struct MyAccessor2 : Base { /* ... */ };
template <class T> struct MyAccessor2<void> { /* ... */ };

Then you can:

struct MyFormat : MyAccessor1<MyFormat, MyAccessor2<MyFormat>> { /* ... */ };
rnk added a comment.Feb 20 2020, 2:17 PM
In D69471#1883912, @rnk wrote:

Everything is off-by-one because the empty bases are not zero sized. The MSVC record layout algorithm is just different in this area. =/

Do all the MSVCs we support building with support __declspec(empty_bases)?
https://devblogs.microsoft.com/cppblog/optimizing-the-layout-of-empty-base-classes-in-vs2015-update-2-3/

Yes, we can use it. Clang supports it too.


However, you might want this class to follow the rules for a standard layout type:
http://www.cplusplus.com/reference/type_traits/is_standard_layout/
" ... has .... at most one base class with non-static data members, or has no base classes with non-static data members."
So the use of multiple inheritance makes a type non-standard layout. Nevermind that MSVC accepts the following program:

#include <type_traits>
struct EmptyA {};
struct EmptyB {};
struct Foo : EmptyA, EmptyB { int x, y; };
static_assert(std::is_standard_layout<Foo>::value, "asdf");

Why not use free function templates like these?

template <typename T> uint64_t getFoo(T *p) { return p->Foo; }
template <typename T> uint64_t getBar(T *p) { return p->Bar; }

These accessors don't seem like they have to be methods. It actually makes them a bit more awkward to use:

uint32_t DataSize = CFR->template getDataSize<Endian>();
vsk planned changes to this revision.Feb 20 2020, 2:31 PM

@rnk Thanks for chasing this down. I'll update the function record structs to use free functions instead of multiple inheritance.

I don't plan on getting rid of the awkward method calls at this point. The coverage reader is still templated by CovMapFunctionRecordX via CovMapTraits, so we'd need to untangle that first.

vsk updated this revision to Diff 245756.EditedFeb 20 2020, 3:16 PM

Get rid of multiple inheritance in the coverage::accessors namespace. This is ninja check-{llvm, clang, profile} clean on macOS. I'll wait on this until next week and re-land it unless any objections / new issues pop up.

This revision is now accepted and ready to land.Feb 20 2020, 3:16 PM
rnk accepted this revision.Feb 20 2020, 3:34 PM

Sounds like a plan, thanks!

This revision was automatically updated to reflect the committed changes.