This is an archive of the discontinued LLVM Phabricator instance.

[DebugInfo] Allow parsing line tables aligned to 4 or 8-byte boundaries
ClosedPublic

Authored by benmxwl-arm on Feb 7 2023, 9:24 AM.

Details

Summary

This allows the DWARFDebugLine::SectionParser to try parsing line tables
at 4 or 8-byte boundaries if the unaligned offset appears invalid. If
aligning the offset does not reduce errors the offset is used unchanged.

This is needed for llvm-dwarfdump to be able to extract the line tables
(with --debug-lines) from binaries produced by certain compilers that
like to align each line table in the .debug_line section. Note that this
alignment does not seem to be invalid since the units do point to the
correct line table offsets via the DW_AT_stmt_list attribute.

Diff Detail

Event Timeline

benmxwl-arm created this revision.Feb 7 2023, 9:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2023, 9:24 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
benmxwl-arm requested review of this revision.Feb 7 2023, 9:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 7 2023, 9:24 AM

Currently this patch is using a heuristic due to the following comment, and because the unit tests currently don't create the associated units for the line tables.

// We want to supply the Unit associated with a .debug_line[.dwo] table when
// we dump it, if possible, but still dump the table even if there isn't a Unit.
// Therefore, collect up handles on all the Units that point into the
// line-table section.

Corrected error handling

Any thoughts on this or suggestions for reviewers?

probinson added a subscriber: probinson.

+debug-info tag.

It's true that DWARF doesn't require the per-unit .debug_line chunks to be adjacent, which makes life difficult for dumpers who want to read just the .debug_line section (without parsing .debug_info headers). Debuggers with an edit-and-continue feature can dynamically replace bits of the line table, though, and in those cases padding is handy. That would usually be a bigger lump than just alignment, though. Just curious, what compiler(s) are aligning line-tables?

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
1546

Just curious, what compiler(s) are aligning line-tables?

This was something noticed with binaries produced by the ARM C/C++ compiler. It aligns each line table entry and pads out the section size, which currently confuses llvm-dwarfdump.

Sorry for letting this sit so long! Code looks okay, still needs a test. For something like this you probably need an assembler source with a hand-made .debug_line section.

Added test case for aligned line tables (took a little while to get round to!)

benmxwl-arm marked an inline comment as done.Mar 3 2023, 9:05 AM
probinson accepted this revision.Mar 3 2023, 9:54 AM

Tidy up the test and LGTM.

llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
14

This CHECK-NOT is unnecessary. If you've matched the 28 offset, that shows you've found the correct place.

17

Again this CHECK-NOT is unnecessary. You could modify the previous CHECK line to end with 3a{{$}} to emphasize that nothing follows it.

20

Same thing here; make the previous check 2{{$}} (so it won't match something like 256) and remove the CHECK-NOT.

This revision is now accepted and ready to land.Mar 3 2023, 9:54 AM
jhenderson added inline comments.Mar 6 2023, 12:32 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
1508

We know we're in the DWARFDebugLine class, so we can simplify this function name somewhat. Perhaps to headerHasValidVersion or even simply hasValidVersion.

1514–1517

I'm not convinced you should be throwing away the error here. At the very least, there should be a comment explaining why it is reasonable to do so.

1539–1540
1549–1550

Can you use alignTo to do this alignment?

1555

Does this return imply that the PossibleAlignments must be in ascending order? That's fine for now, but I wonder if it's a bit brittle (I guess you could argue that if there's not enough space for alignment, it's not going to be big enough for a line table header, so it's moot, but if you want to rely on that, it should at least be explained in the comment, in case something changes in the future).

llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
5–7

It seems a little odd to be mixing comment marker styles (# and //). In newer tests, I tend to go with # for lit directives (like you have done) and ## for real comments.

23

This CHECK-NOT will be quite fragile - if somebody changes the warning wording for whatever reason, it will fail to catch the warning, should it be emitted. Instead, I'd just do CHECK-NOT: warning:.

Address various review comments.

(I'm still using // in the test since llvm-mc with the arm-none-eabi triple does not accept
\# or \#\# as a comment).

benmxwl-arm marked 9 inline comments as done.Mar 8 2023, 9:43 AM

(I'm still using // in the test since llvm-mc with the arm-none-eabi triple does not accept
\# or \#\# as a comment).

I'm confused. The RUN lines and CHECK lines are using # as a comment introducer; if llvm-mc won't accept that, why does it work?

It seems like they are removed by the test runner before passing the file to llvm-mc (but it only removes lines that start with a directive it recognises), though I can't point to precisely where this happens.
I'll just change the test to use // for the checks too for consistency.

MaskRay added a comment.EditedMar 8 2023, 1:20 PM

# is a universal comment marker at the line beginning (ignoring whitespace). If there are non-space tokens before the comment, use // (for arm-* @ works as well but I don't think it is common).
// is a universal comment marker in GNU assembler.

% clang --target=arm-linux-gnueabi -c a.s  # aarch64 is similar
a.s:1:10: error: unexpected token
.long 5  # Unit
MaskRay added inline comments.Mar 8 2023, 1:29 PM
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
25

Replace the magic number with a label difference.

You can define two temporary labels .L and use a subtraction. See how Clang generates the section.

27

Use a label difference.

MaskRay added inline comments.Mar 8 2023, 2:16 PM
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
55

.long 58 is at offset 40, which is a multiple of 8 (also a multiple of 4).
I suggest that you create at least 3 RUN lines to cover the multiple-of-4, multiple-of-8, not-multiple-of-4 cases.

.ifdef XXX
.asciz "test.c"
.else
.asciz "testxxxx.c"  // appropriate number of padding bytes
.endif

If you do this, my earlier comment about using a label difference will show its advantage...

MaskRay added inline comments.Mar 8 2023, 2:19 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
1548

Certain compilers => ARM C/C++ Compiler.

Be more specific when available information is useful. It helps decide whether the code is a workaround and when it can be removed.

You can adjust the previous comment or this comment to say that we are testing 4 and 8 as possible alignments.

It seems like they are removed by the test runner before passing the file to llvm-mc (but it only removes lines that start with a directive it recognises), though I can't point to precisely where this happens.

I am 99% certain that this is incorrect. The %s that is used for the llvm-mc test input is simply a lit substitution which expands to the current file's path. In other words, llvm-mc sees the entire test file verbatim, with no modification.

I'll just change the test to use // for the checks too for consistency.

This is fine from my point of view.

llvm/lib/DebugInfo/DWARF/DWARFDebugLine.cpp
1539–1540

You only did half my suggestion. The "Otherwise" should be the start of a new sentence.

  • Tidied up FileChecks
  • Use labels for unit/header lengths
  • Test 8 byte multiple offset, 4 (not 8) byte multiple offset, and invalid padding
  • Applied various suggestions
benmxwl-arm marked 5 inline comments as done.Mar 9 2023, 11:38 AM
MaskRay added inline comments.Mar 10 2023, 12:40 PM
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
86

The number of padding will be fragile if the above directives are changed.

You may use llvm-mc ... --save-temp-labels to retain .L* symbols, then add // RUN: llvm-nm %t.o | FileCheck ... to test the st_value of .Ltable0_end. This will clearly show the needed padding bytes.

  • Verify values of .Ltable_end0 using llvm-nm
benmxwl-arm marked 2 inline comments as done.Mar 13 2023, 11:39 AM
jhenderson added inline comments.Mar 17 2023, 1:04 AM
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
14–16

FWIW, I'd use //// here, much like I'd use ## to distinguish it from test code. Same goes with other test comments later on.

20

Here and throughout, can you use CHECK-NEXT (well, MULT4-NEXT etc)?

30

Why are you specifically testing that there is no warning at this specific point in the prologue printing?

50

I recommend you match the full warning message here. Otherwise, it might match another warning that isn't relevant. Also, it helps readers to know what warning you are actually testing for.

Updated test to use CHECK-NEXT and clearer comment styles.
(There was no reason the CHECK-NOT: warning: was where it was other than a misconception on my part)

P.s. Thanks for all the feedback on this test here; I've not written many manual file checks before,
so it's been very helpful in learning the system.

benmxwl-arm marked 4 inline comments as done.Mar 20 2023, 5:34 AM
jhenderson added inline comments.Mar 21 2023, 2:12 AM
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
19

So, in case it wasn't clear, this will only check that there is no warning between the two CHECK lines that are used either side of it. If you want to check that no warning appears in the output at all, you're better off adding --implicit-check-not='warning:' to the FileCheck command-line.

20

Not required, but as FileCheck by default doesn't treat whitespace as significant, when I have a line like this followed by -NEXT lines, I add a few extra spaces between the end of the CHECK and the actual to-be-matched data, so that everything lines up. In this case, I'd do:

// MULT4:      Address            Line   Column File   ISA Discriminator Flags

Make use of --implicit-check-not='warning:' and some minor formatting changes.

benmxwl-arm marked 2 inline comments as done.Mar 21 2023, 6:30 AM
jhenderson accepted this revision.Mar 21 2023, 6:49 AM

Looks good.

MaskRay accepted this revision.Mar 21 2023, 10:45 PM
MaskRay added inline comments.
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
14–16

/// to distinguish non-CHECK-non-RUN comments should be fine :)

MatzeB added a subscriber: MatzeB.Mar 22 2023, 11:54 AM
MatzeB added inline comments.
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
1

This test needs something like // REQUIRES: arm-registered-target

MaskRay added inline comments.Mar 22 2023, 12:56 PM
llvm/test/tools/llvm-dwarfdump/ARM/aligned_line_tables.s
1

Thanks. I added lit.local.cfg to exclude the test for non-ARM builds instead.