This is an archive of the discontinued LLVM Phabricator instance.

[MC] Support writing ELFCOMPRESS_ZSTD compressed debug info sections
ClosedPublic

Authored by MaskRay on Jul 28 2022, 11:48 AM.

Diff Detail

Event Timeline

MaskRay created this revision.Jul 28 2022, 11:48 AM
MaskRay requested review of this revision.Jul 28 2022, 11:48 AM
Herald added a project: Restricted Project. · View Herald Transcript

@ckissane This also serves as an example splitting a big change like D130667 and adding accompanying tests:)

(Note: I've gradually cleaned up code and removed zlib-gnu stuff over the previous months to make the code more ready for zstd.)

dblaikie added inline comments.Jul 28 2022, 12:18 PM
llvm/lib/MC/ELFObjectWriter.cpp
861–878

Proliferation of these sort of switch statements is why I'd like to resolve the design discussions before further patches are committed. They're not hugely expensive to fix/change later, but neither are we in a rush to add this functionality, nor is the design discussion especially hard to resolve.
Code something like:

if (CompressionAlgorithm *C = getCompressionAlgorithm(MAI->compressDebugSections()) {
  ChType = C->getELFType(); // alternatively have an array to lookup ELF type from the enum type
  C->compress(Uncompressed, Compressed);
}

Seems like it'd be quite achievable without great complexity & simplify consumers.

Possibly even having 'compressDebugSections' carry the algorithm pointer directly? (since the checking's already done up-front, and the pointer would be to an immutable/thread safe/singleton/eternal object anyway)

llvm/tools/llvm-mc/llvm-mc.cpp
413–433

This could potentially be replaced by something like:

CompressionAlgorithm *C = getCompressionAlgorithm(CompressDebugSections);
if (C && !C->isAvailable()) {
  WithColor::error(errs(), ProgName) << "build LLVM with " << C->getBuildFlag() << " to enable " << CompressDebugSections.is_there_some_way_to_query_the_opts_full_spelling();
  C = nullptr;
}
MAI->setCompressDebugSections(C);

I guess the desire to get the build flag name might be enough to justify getCompressionAlgorithm returning non-null even for non-available algorithms, so it can provide that info? (unlike my previous comment/example above that implies getCompressionAlgorithm returns null for non-available algorithms)

MaskRay added inline comments.Jul 28 2022, 12:52 PM
llvm/lib/MC/ELFObjectWriter.cpp
861–878

I wish that 15.0.0 would have the llvm-objcopy change. This helps distributions which install llvm binary utilities in /usr/bin which may use newer userland llvm utilities.
MC/clang -gz support certainly can go into 16.0.0.

After this MC change, the only remaining place using DebugCompressionType is clang -gz. We don't have many ELF object producers, and the object-oriented CompressionAlgorithm design kinda makes me feel some over-engineering.

Note: in many (among the few DebugCompressionType uses) use cases isAvailable and compress are called apart.

dblaikie added inline comments.Jul 28 2022, 1:13 PM
llvm/lib/MC/ELFObjectWriter.cpp
861–878

I don't think there's any need to rush to get this into any particular LLVM release (compression isn't enabled by default in any case & I don't think this is a pressing issue, is it?).

I don't think the complexity needed to implement the sort of framing above should be too great (the current patches add a bunch of other things that I think are overly complicating it & making the comparison weight differently - which is why I'm trying to start with the syntax and work backwards to the implementation necessary to provide that interface and not more than that).

It'd be nice if they didn't have to be so separate (between isAvailable and compress) - so the invariants were expressed at the API level rather than by assertions that something was checked somewhere else.

MaskRay added inline comments.Jul 28 2022, 1:33 PM
llvm/lib/MC/ELFObjectWriter.cpp
861–878

Say, someone maintains XXXBSD and has /usr/bin llvm binary utilities (base image) from an older release. If I get a binary built from a newer machine with newer clang, I'd hope that I have a way to decompress the debug sections with /usr/bin/llvm-objcopy . Having zstd decompression support at 15.0.0 will help such a use case. Whether LLVM_ENABLE_ZSTD is enabled or not by default in the upstream doesn't really matter. A distribution typically enables many LLVM/CLANG/etc CMake variables to build a toolchain customized to their needs.

As mentioned on another thread, the proposed object-oriented design seems to have too much boilerplate. Free functions seem to work quite well. (This is just the classical expression problem and I think different people may have different opinions.)

dblaikie added inline comments.Jul 28 2022, 6:18 PM
llvm/lib/MC/ELFObjectWriter.cpp
861–878

Oh, sorry, by not enabled by default, I was thinking about clang -gz not being the default. So I guess there's 3 (maybe 4) levels of "this is not the default" - LLVM_ENABLE_ZSTD isn't enabled by default, -gz isn't clang's default, and -gz=zstd isn't the default even when compression is requested (& the 4th - -g isn't the default either, but probably has enough users that we don't count that case).

If someone gets a new LLVM-based tool that isn't a full LLVM release, it won't compress by default and it won't use zstd by default - I think it's OK if their system-installed tools don't handle that novel case (similarly we implemented -gdwarf-5 long before system debuggers could handle it, we even changed to `-gdwarf-5 by default before system gdb's could handle it, I think).

I don't think there's any need to rush this.

MaskRay added inline comments.Jul 28 2022, 9:01 PM
llvm/lib/MC/ELFObjectWriter.cpp
861–878

-gz: There are Linux distributions defaulting to -gz (probably by patching gcc driver).

-g: Many Linux distributions build packages with -g and ship the non-debug part and debug symbols in separate packages. -g is likely never a default but this argument does not affect distribution usage.

MaskRay updated this revision to Diff 458277.Sep 6 2022, 2:06 PM

rebase on D130458 (llvm-objcopy)

MaskRay marked 3 inline comments as done.Sep 6 2022, 2:08 PM
MaskRay added inline comments.
llvm/lib/MC/ELFObjectWriter.cpp
861–878

Resolving all comments since this does not catch 15.0.0 and the latest API design discussion happened in other places (D130506 and alternative patches).

I think the switch is useful as it makes clear the code intentionally supports the related compression formats.

dblaikie accepted this revision.Sep 7 2022, 4:29 PM

Sounds good

This revision is now accepted and ready to land.Sep 7 2022, 4:29 PM
This revision was landed with ongoing or failed builds.Sep 8 2022, 12:03 AM
This revision was automatically updated to reflect the committed changes.
dyung added a subscriber: dyung.Sep 8 2022, 6:47 AM

FYI one of the tests is failing on at least this bot:
https://lab.llvm.org/buildbot/#/builders/42/builds/7040

The PS5 Windows bot is also failing for the same reason:
https://lab.llvm.org/buildbot/#/builders/216/builds/9486

The error message on Windows seems to be different from what is expected.

thakis added a subscriber: thakis.Sep 8 2022, 7:47 AM

This breaks tests on Windows when building without zlib: http://45.33.8.238/win/65893/step_11.txt

Please take a look and revert for now if it takes a while to fix.