This is an archive of the discontinued LLVM Phabricator instance.

[DWP][DWARF] Detect and error on debug info offset overflow
ClosedPublic

Authored by ayermolo on Jul 22 2022, 1:11 PM.

Details

Summary

Right now we silently overflow uint32_t for debug_indfo sections. Added a check
and error out.

Diff Detail

Event Timeline

ayermolo created this revision.Jul 22 2022, 1:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2022, 1:11 PM
ayermolo requested review of this revision.Jul 22 2022, 1:11 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 22 2022, 1:11 PM

Could /maybe/ test this with an assembly test, but it'd still have to generate something like a 4GB file... which probably isn't a great idea. Maybe have a test that's checked in and can be run manually, but I guess no one would ever know it was there/know to run it.

But could you copy/paste some terminal execution that shows this failing/erroring correctly?

(& I guess you plan to add an error for the string offset overflow too?)

Could /maybe/ test this with an assembly test, but it'd still have to generate something like a 4GB file... which probably isn't a great idea. Maybe have a test that's checked in and can be run manually, but I guess no one would ever know it was there/know to run it.

But could you copy/paste some terminal execution that shows this failing/erroring correctly?

Yeah I wasn't sure how to test it. Checking in 4GB test seemed excessive.

Base case:

[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e bzip2 -o bzip2.dwp
[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-readelf --sections bzip2.dwp
There are 10 section headers, starting at offset 0x14ad8:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .strtab           STRTAB          0000000000000000 014a40 000091 00      0   0  1
  [ 2] .debug_loclists.dwo PROGBITS      0000000000000000 000040 009a92 00   E  0   0  1
  [ 3] .debug_abbrev.dwo PROGBITS        0000000000000000 009ad2 0010cb 00   E  0   0  1
  [ 4] .debug_rnglists.dwo PROGBITS      0000000000000000 00ab9d 0004e4 00   E  0   0  1
  [ 5] .debug_str.dwo    PROGBITS        0000000000000000 00b081 001833 01 MSE  0   0  1
  [ 6] .debug_str_offsets.dwo PROGBITS   0000000000000000 00c8b4 0010fc 00   E  0   0  1
  [ 7] .debug_info.dwo   PROGBITS        0000000000000000 00d9b0 006e50 00   E  0   0  1
  [ 8] .debug_cu_index   PROGBITS        0000000000000000 014800 000224 00      0   0  1
  [ 9] .symtab           SYMTAB          0000000000000000 014a28 000018 18      1   1  8
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  R (retain), l (large), p (processor specific)

Error case:

[~/local/ErrorCase] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e error_binary -o error_binary.dwp
error: debug information section offset is greater than 4GB

Regarding our discussion https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902. Basically nothing can be done at this point, beyond waiting for DWARF6 spec?

I can add error for string (different diff), but I don't have a test case that will trigger it.

Could /maybe/ test this with an assembly test, but it'd still have to generate something like a 4GB file... which probably isn't a great idea. Maybe have a test that's checked in and can be run manually, but I guess no one would ever know it was there/know to run it.

But could you copy/paste some terminal execution that shows this failing/erroring correctly?

Yeah I wasn't sure how to test it. Checking in 4GB test seemed excessive.

Base case:

[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e bzip2 -o bzip2.dwp
[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-readelf --sections bzip2.dwp
There are 10 section headers, starting at offset 0x14ad8:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .strtab           STRTAB          0000000000000000 014a40 000091 00      0   0  1
  [ 2] .debug_loclists.dwo PROGBITS      0000000000000000 000040 009a92 00   E  0   0  1
  [ 3] .debug_abbrev.dwo PROGBITS        0000000000000000 009ad2 0010cb 00   E  0   0  1
  [ 4] .debug_rnglists.dwo PROGBITS      0000000000000000 00ab9d 0004e4 00   E  0   0  1
  [ 5] .debug_str.dwo    PROGBITS        0000000000000000 00b081 001833 01 MSE  0   0  1
  [ 6] .debug_str_offsets.dwo PROGBITS   0000000000000000 00c8b4 0010fc 00   E  0   0  1
  [ 7] .debug_info.dwo   PROGBITS        0000000000000000 00d9b0 006e50 00   E  0   0  1
  [ 8] .debug_cu_index   PROGBITS        0000000000000000 014800 000224 00      0   0  1
  [ 9] .symtab           SYMTAB          0000000000000000 014a28 000018 18      1   1  8
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  R (retain), l (large), p (processor specific)

Error case:

[~/local/ErrorCase] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e error_binary -o error_binary.dwp
error: debug information section offset is greater than 4GB

Cool, thanks.

Regarding our discussion https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902. Basically nothing can be done at this point, beyond waiting for DWARF6 spec?

I can add error for string (different diff), but I don't have a test case that will trigger it.

Yeah, followed up on that thread. There's no real user-extension point in the DWARF spec around these fields/records/sections, so I don't immediately see a way to extend things (certainly can't be backwards compatible - it'd mean creating indexes that couldn't be read by other consumers, but I guess that's not the worst thing since the current alternative is not being able to produce something that can be read by any consumer). I'd guess we could invent a "custom" DWARF version (like pick version 200) for the index with the new layout that allows specifying DWARF64, etc. Or we could use a different section name. .debug_{cu,tu}_llvm_index and have llvm-dwp produce that when it'd otherwise overflow and teach lldb to be able to read that.

Could /maybe/ test this with an assembly test, but it'd still have to generate something like a 4GB file... which probably isn't a great idea. Maybe have a test that's checked in and can be run manually, but I guess no one would ever know it was there/know to run it.

But could you copy/paste some terminal execution that shows this failing/erroring correctly?

Yeah I wasn't sure how to test it. Checking in 4GB test seemed excessive.

Base case:

[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e bzip2 -o bzip2.dwp
[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-readelf --sections bzip2.dwp
There are 10 section headers, starting at offset 0x14ad8:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .strtab           STRTAB          0000000000000000 014a40 000091 00      0   0  1
  [ 2] .debug_loclists.dwo PROGBITS      0000000000000000 000040 009a92 00   E  0   0  1
  [ 3] .debug_abbrev.dwo PROGBITS        0000000000000000 009ad2 0010cb 00   E  0   0  1
  [ 4] .debug_rnglists.dwo PROGBITS      0000000000000000 00ab9d 0004e4 00   E  0   0  1
  [ 5] .debug_str.dwo    PROGBITS        0000000000000000 00b081 001833 01 MSE  0   0  1
  [ 6] .debug_str_offsets.dwo PROGBITS   0000000000000000 00c8b4 0010fc 00   E  0   0  1
  [ 7] .debug_info.dwo   PROGBITS        0000000000000000 00d9b0 006e50 00   E  0   0  1
  [ 8] .debug_cu_index   PROGBITS        0000000000000000 014800 000224 00      0   0  1
  [ 9] .symtab           SYMTAB          0000000000000000 014a28 000018 18      1   1  8
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  R (retain), l (large), p (processor specific)

Error case:

[~/local/ErrorCase] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e error_binary -o error_binary.dwp
error: debug information section offset is greater than 4GB

Cool, thanks.

Regarding our discussion https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902. Basically nothing can be done at this point, beyond waiting for DWARF6 spec?

I can add error for string (different diff), but I don't have a test case that will trigger it.

Yeah, followed up on that thread. There's no real user-extension point in the DWARF spec around these fields/records/sections, so I don't immediately see a way to extend things (certainly can't be backwards compatible - it'd mean creating indexes that couldn't be read by other consumers, but I guess that's not the worst thing since the current alternative is not being able to produce something that can be read by any consumer). I'd guess we could invent a "custom" DWARF version (like pick version 200) for the index with the new layout that allows specifying DWARF64, etc. Or we could use a different section name. .debug_{cu,tu}_llvm_index and have llvm-dwp produce that when it'd otherwise overflow and teach lldb to be able to read that.

! In D130395#3677034, @dblaikie wrote:

! In D130395#3676909, @ayermolo wrote:

! In D130395#3675026, @dblaikie wrote:

Could /maybe/ test this with an assembly test, but it'd still have to generate something like a 4GB file... which probably isn't a great idea. Maybe have a test that's checked in and can be run manually, but I guess no one would ever know it was there/know to run it.

But could you copy/paste some terminal execution that shows this failing/erroring correctly?

Yeah I wasn't sure how to test it. Checking in 4GB test seemed excessive.

Base case:

[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e bzip2 -o bzip2.dwp
[~/local/bzip2_DF] ~/local/llvm-build-upstream-release/bin/llvm-readelf --sections bzip2.dwp
There are 10 section headers, starting at offset 0x14ad8:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .strtab           STRTAB          0000000000000000 014a40 000091 00      0   0  1
  [ 2] .debug_loclists.dwo PROGBITS      0000000000000000 000040 009a92 00   E  0   0  1
  [ 3] .debug_abbrev.dwo PROGBITS        0000000000000000 009ad2 0010cb 00   E  0   0  1
  [ 4] .debug_rnglists.dwo PROGBITS      0000000000000000 00ab9d 0004e4 00   E  0   0  1
  [ 5] .debug_str.dwo    PROGBITS        0000000000000000 00b081 001833 01 MSE  0   0  1
  [ 6] .debug_str_offsets.dwo PROGBITS   0000000000000000 00c8b4 0010fc 00   E  0   0  1
  [ 7] .debug_info.dwo   PROGBITS        0000000000000000 00d9b0 006e50 00   E  0   0  1
  [ 8] .debug_cu_index   PROGBITS        0000000000000000 014800 000224 00      0   0  1
  [ 9] .symtab           SYMTAB          0000000000000000 014a28 000018 18      1   1  8
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  R (retain), l (large), p (processor specific)

Error case:

[~/local/ErrorCase] ~/local/llvm-build-upstream-release/bin/llvm-dwp -e error_binary -o error_binary.dwp
error: debug information section offset is greater than 4GB

Cool, thanks.

Regarding our discussion https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902. Basically nothing can be done at this point, beyond waiting for DWARF6 spec?

I can add error for string (different diff), but I don't have a test case that will trigger it.

Yeah, followed up on that thread. There's no real user-extension point in the DWARF spec around these fields/records/sections, so I don't immediately see a way to extend things (certainly can't be backwards compatible - it'd mean creating indexes that couldn't be read by other consumers, but I guess that's not the worst thing since the current alternative is not being able to produce something that can be read by any consumer). I'd guess we could invent a "custom" DWARF version (like pick version 200) for the index with the new layout that allows specifying DWARF64, etc. Or we could use a different section name. .debug_{cu,tu}_llvm_index and have llvm-dwp produce that when it'd otherwise overflow and teach lldb to be able to read that.

That will be a big change. It's not only lldb, but also all other llvm tools: dwarfdump, profgen, gsymutil, etc. Let's see if we can get unblocked internally by some other way, but good to know it's not a complete dead end. :)
For this diff, can this land so at least it doesn't succeed silently?

dblaikie accepted this revision.Jul 25 2022, 3:24 PM

That will be a big change. It's not only lldb, but also all other llvm tools: dwarfdump, profgen, gsymutil, etc.

Yep

Let's see if we can get unblocked internally by some other way, but good to know it's not a complete dead end. :) For this diff, can this land so at least it doesn't succeed silently?

Yeah

This revision is now accepted and ready to land.Jul 25 2022, 3:24 PM
ayermolo updated this revision to Diff 447497.Jul 25 2022, 3:35 PM

rebase + clangformat

This revision was landed with ongoing or failed builds.Jul 26 2022, 8:20 AM
This revision was automatically updated to reflect the committed changes.