Page MenuHomePhabricator

Parse section ranges when verifying DWARF so we can exclude addresses that should have been stripped from DWARF.
Needs ReviewPublic

Authored by clayborg on Mon, Jun 29, 9:58 PM.

Details

Summary

This change gets all of the section address ranges with executable permissions and registers these valid .text ranges with the verifier. This allows the verifier to ignore any address ranges that should have been stripped so they don't cause false overlapping errors. This is meant as a starting point to a solution for https://bugs.llvm.org/show_bug.cgi?id=46453. Hopefully we can discuss the pros and cons of this approach and see if this approach is worth persuing.

Diff Detail

Unit TestsFailed

TimeTest
70 mslinux > LLVM.DebugInfo/X86::template.ll
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llc -mtriple=x86_64-linux -O0 -filetype=obj < /mnt/disks/ssd0/agent/llvm-project/llvm/test/DebugInfo/X86/template.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/llvm-dwarfdump -v -debug-info - | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/DebugInfo/X86/template.ll
30 mslinux > LLVM.tools/llvm-dwarfdump/X86::verify_die_ranges.s
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/bin/llvm-mc /mnt/disks/ssd0/agent/llvm-project/llvm/test/tools/llvm-dwarfdump/X86/verify_die_ranges.s -filetype obj -triple x86_64-apple-darwin -o - | /mnt/disks/ssd0/agent/llvm-project/build/bin/not /mnt/disks/ssd0/agent/llvm-project/build/bin/llvm-dwarfdump -verify - | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/tools/llvm-dwarfdump/X86/verify_die_ranges.s
9,320 mslinux > libomp.env::kmp_set_dispatch_buf.c
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -I /mnt/disks/ssd0/agent/llvm-project/openmp/runtime/test -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -I /mnt/disks/ssd0/agent/llvm-project/openmp/runtime/test/ompt /mnt/disks/ssd0/agent/llvm-project/openmp/runtime/test/env/kmp_set_dispatch_buf.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/test/env/Output/kmp_set_dispatch_buf.c.tmp -lm -latomic && env KMP_DISP_NUM_BUFFERS=0 /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/test/env/Output/kmp_set_dispatch_buf.c.tmp
1,610 mslinux > libomp.worksharing/for::kmp_set_dispatch_buf.c
Script: -- : 'RUN: at line 1'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -I /mnt/disks/ssd0/agent/llvm-project/openmp/runtime/test -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -I /mnt/disks/ssd0/agent/llvm-project/openmp/runtime/test/ompt /mnt/disks/ssd0/agent/llvm-project/openmp/runtime/test/worksharing/for/kmp_set_dispatch_buf.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/test/worksharing/for/Output/kmp_set_dispatch_buf.c.tmp -lm -latomic && /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/test/worksharing/for/Output/kmp_set_dispatch_buf.c.tmp 7

Event Timeline

clayborg created this revision.Mon, Jun 29, 9:58 PM

This approach is very verbose at the moment so we can see each time an address range is ignored. The idea would be to omit the messages unless "--verbose" is supplied, but for now a warning message is emitted each time a DIE has one or more invalid ranges just so we can see everything that happens. If this approach works, I will make a full set of tests.

I'm probably not best-placed to review this, as I haven't really used the verifier in practice, partly because of the exact problem this is trying to solve. In principle, I quite like not needing to use specific tombstone values here, but I wonder whether a "dead" range that isn't a tombstone value should itself be a verification failure, regardless of where it ends up?

Another point: there can be dead data too, and this approach shouldn't be text specific. After all, you can have -fdata-sections leading to discarded data via --gc-sections just as easily as dead text sections from -ffunction-sections.

I'm probably not best-placed to review this, as I haven't really used the verifier in practice, partly because of the exact problem this is trying to solve. In principle, I quite like not needing to use specific tombstone values here, but I wonder whether a "dead" range that isn't a tombstone value should itself be a verification failure, regardless of where it ends up?

Can you give an example of what you are saying above? We can easily ignore tombstone values (-1, -2) in addition to what we are doing here. Zero is more problematic and more common from what I have seen, and that is what this patch really helps weed out. At address zero in a ELF file is right at the ELF header, so not many shared libraries actually have functions with a virtual address of zero from what I have seen.

Another point: there can be dead data too, and this approach shouldn't be text specific. After all, you can have -fdata-sections leading to discarded data via --gc-sections just as easily as dead text sections from -ffunction-sections.

This is definitely possible. I don't think we do any verification on data right now since there isn't much we can do other than emit a warning that the DIE should have been stripped.

I added the extra "ObjectFile *Obj" argument to the DIContext::verify(...) function currently, though I wonder if the DwarfContext should get this information when it is constructed. It might be handy to have the text ranges available so that standard DWARF queries on the DWARFContext could filter out stripped information when the user looks up information in the DWARF. This could be extended to data sections as well. Thoughts?

I'm probably not best-placed to review this, as I haven't really used the verifier in practice, partly because of the exact problem this is trying to solve. In principle, I quite like not needing to use specific tombstone values here, but I wonder whether a "dead" range that isn't a tombstone value should itself be a verification failure, regardless of where it ends up?

Can you give an example of what you are saying above? We can easily ignore tombstone values (-1, -2) in addition to what we are doing here. Zero is more problematic and more common from what I have seen, and that is what this patch really helps weed out. At address zero in a ELF file is right at the ELF header, so not many shared libraries actually have functions with a virtual address of zero from what I have seen.

I mean a range that isn't referenced by any function/data in the final output. For example, if I had the range [0x20, 0x30) in my .debug_ranges, but that range is outside the assigned addresses for the program (which might be, say [0x4000, 0x5000) and [0x10000, 0x20000), could it be a failure? There's no overlapping involved, but the range isn't useful and indicates something might have gone wrong.

aprantl added inline comments.Wed, Jul 1, 9:36 AM
llvm/include/llvm/DebugInfo/DIContext.h
233

Perhaps add a Doxygen comment that explains what the extra argument is for?

How does everyone feel about using this approach? If this looks good to people, I will add doxygen comments and then add tests. But I wanted to get some feedback prior to proceeding.

I think maybe this is sort of orthogonal to 46453... maybe not, but kind of.

Seems like we should filter out known-tombstoned ranges (the only ones we can know for sure are the new -1/-2 tombstones - all the others have ambiguities). Then we should maybe flag maybe-tombstones with a little "eh, maybe?". Then we should warn for anything left that's even partially outside the .text range (this patch), then we should warn for overlaps/etc on the remaining ones?

But as @jhenderson said, maybe those first ones come later & we use the .text range to determine which things to look at for overlap first, then add new verifier checks for "things outside .text that aren't clearly tombstoned" knowing that some of those are expected limitations of (at least gold's) previous tombstoning strategies.

(I'd sort of like to avoid actually looking at the object's executable sections - but I can't really fault the strategy & even if we added all the other verifier checks/warnings/etc, it'd still be super reasonable to warn about ranges that are otherwise totally valid, but extend beyond/are entirely outside the actual executable .text)

I think maybe this is sort of orthogonal to 46453... maybe not, but kind of.

Seems like we should filter out known-tombstoned ranges (the only ones we can know for sure are the new -1/-2 tombstones - all the others have ambiguities). Then we should maybe flag maybe-tombstones with a little "eh, maybe?". Then we should warn for anything left that's even partially outside the .text range (this patch), then we should warn for overlaps/etc on the remaining ones?

So for this patch, anything that isn't in text becomes a warning and not an error? Or do we want to add an option to "llvm-dwarfdump --verify" to enforce the text ranges as feature that is disabled by default? --ignore-invalid-text-ranges?

But as @jhenderson said, maybe those first ones come later & we use the .text range to determine which things to look at for overlap first, then add new verifier checks for "things outside .text that aren't clearly tombstoned" knowing that some of those are expected limitations of (at least gold's) previous tombstoning strategies.

(I'd sort of like to avoid actually looking at the object's executable sections - but I can't really fault the strategy & even if we added all the other verifier checks/warnings/etc, it'd still be super reasonable to warn about ranges that are otherwise totally valid, but extend beyond/are entirely outside the actual executable .text)

Since zero is so prevalent, it is nice to get that noise out of the error checking since it creates so many false errors at the moment. It makes the --verify option less useful and way too noisy if we don't do something. We can also just not do the .text ranges for object files since they typically have relocations on each address. We already avoid looking at ranges in many cases for .o files.

I think maybe this is sort of orthogonal to 46453... maybe not, but kind of.

Seems like we should filter out known-tombstoned ranges (the only ones we can know for sure are the new -1/-2 tombstones - all the others have ambiguities). Then we should maybe flag maybe-tombstones with a little "eh, maybe?". Then we should warn for anything left that's even partially outside the .text range (this patch), then we should warn for overlaps/etc on the remaining ones?

So for this patch, anything that isn't in text becomes a warning and not an error? Or do we want to add an option to "llvm-dwarfdump --verify" to enforce the text ranges as feature that is disabled by default? --ignore-invalid-text-ranges?

I think my goal was to suggest implementing filter known-tombstones first (now we have a good/known tombstone) so that "is not in .text" doesn't unduly warn on correctly tombstoned ranges/addresses (honestly bfd's tombstoning should be fairly good - since it creates empty ranges at least in debug_ranges that don't use base address selection entries - . Then we could maybe warn or error to varying degrees on the things in the middle (not certainly tombstoned, not entirely in .text... )

Sorry, didn't mean to muddy the waters with "warning V error" discussion or need to add more flags, etc - folks who implemented/have more ownership over "verify" should chime in on this, but for myself - yeah, I think I'm coming around to "let's just ignore anything that's even partially outside .text for now" & eventually maybe someone implements the specific tombstone support - and then we warn/error/something on "it's not tombstone, but it's outside .text" which would be a separate issue & a problem, even if it's non-overlapping. Then only the "is in .text" bits would be tested for overlapping.

But as @jhenderson said, maybe those first ones come later & we use the .text range to determine which things to look at for overlap first, then add new verifier checks for "things outside .text that aren't clearly tombstoned" knowing that some of those are expected limitations of (at least gold's) previous tombstoning strategies.

(I'd sort of like to avoid actually looking at the object's executable sections - but I can't really fault the strategy & even if we added all the other verifier checks/warnings/etc, it'd still be super reasonable to warn about ranges that are otherwise totally valid, but extend beyond/are entirely outside the actual executable .text)

Since zero is so prevalent, it is nice to get that noise out of the error checking since it creates so many false errors at the moment. It makes the --verify option less useful and way too noisy if we don't do something. We can also just not do the .text ranges for object files since they typically have relocations on each address. We already avoid looking at ranges in many cases for .o files.

Yup.

In theory all this stuff should be supported for object files too.

llvm/include/llvm/DebugInfo/DWARF/DWARFVerifier.h
365

(non-member static shouldn't be used in headers - I fixed the op< to not do this)

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
743

Not sure why this uses UndefSection - be nice to support this in object files and track distinct .text sections. Might be that pre-building a table doesn't suit that strategy - not sure.

(& maybe even non-.text sections, in case someone decides to use attribute((section(".my_section"))) for their functions, for instance)