This is an archive of the discontinued LLVM Phabricator instance.

[DWARF5] Added support for emission of debug_macro section.
ClosedPublic

Authored by SouraVX on Jan 16 2020, 2:55 AM.

Details

Summary

This patch adds support for emission of following DWARFv5 macro forms in debug_macro section.

  1. DW_MACRO_start_file
  2. DW_MACRO_end_file
  3. DW_MACRO_define_strp
  4. DW_MACRO_undef_strp.

Diff Detail

Event Timeline

SouraVX created this revision.Jan 16 2020, 2:55 AM
SouraVX edited the summary of this revision. (Show Details)Jan 16 2020, 2:56 AM

Unit tests: fail. 61911 tests passed, 1 failed and 783 were skipped.

failed: LLVM.MC/WebAssembly/debug-info.ll

clang-tidy: unknown.

clang-format: pass.

Build artifacts: diff.json, clang-format.patch, CMakeCache.txt, console-log.txt, test-results.xml

aprantl added inline comments.Jan 16 2020, 8:48 AM
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1187

this check is oddly redundant?

2772

///

SouraVX updated this revision to Diff 238532.Jan 16 2020, 9:44 AM

Thank you @aprantl , for reviewing this.
Addressed minor review comment by @aprantl

SouraVX marked 3 inline comments as done.Jan 16 2020, 9:46 AM
SouraVX added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
1187

Oh Sorry, that must've crept up/left while segregating the dwarf-dump and this patch.

Please implement the dwarfdump support first (this patch can be reviewed during that time - but shouldn't be committed without tests & ideally those tests should be written using dwarfdump (sometimes there are exceptions for especially complicated dumping situations, where raw assembly testing is used instead, but I don't think it's necessary to make that tradeoff here))

& I'm surprised this patch causes updates to the tests - what changes in the patch caused changes in these tests? (stringpool and dbg-stack-value-range.mir) - ah, because debug_str now needs to be emitted after debug_macro because the macro section might introduce new strings? OK. Could you separate that ordering change/test update out into a standalone patch. It'll make the rest of this more targeted when reviewing.

llvm/lib/BinaryFormat/Dwarf.cpp
464–475 ↗(On Diff #238448)

Build this with Dwarf.def rather than writing it out explicitly. Something like this:

return StringSwitch<unsigned>(MacroString)
#define HANDLE_DW_MACRO(ID, NAME)
  .Case("DW_MACRO_" #NAME, ID)
#include "llvm/BinaryFormat/Dwarf.def"
  .Default(DW_MACINFO_invalid);
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2779

This should probably be 4, not 3, since it's a bit field column. (if you tried to | in OPCODE_OPERANDS_TABLE of 3 into the table, you'd overwrite bits 1 and 2 (changing the value of offset_size and debug_line_offset, rather than specifying the value of opcode_operands))

2785–2786

This looks to be backwards - since OFFSET_SIZE_32 is zero, these two lines don't have any effect (& 32 bit is the default - if the bit is zero).

So it should /probably/ be:

if (offset byte size == 8)
  Flags |= OFFSET_SIZE_32;
2787–2788

Seems this line offset is optional and the DWARF spec doesn't mention what it might be useful for. Do you have any specific use in mind? Otherwise I'd leave it out. (@aprantl @probinson )

2789–2790

I don't think this is a FIXME situation - LLVM has no extensions here, so there's no need for the opcode_operands_table.

2819–2827

Perhaps refactor this to only have "AddComment / EMitULEB128" written once, but parameterized on the enum, eg:

unsigned Type = M.getMacinfoType() == dwarf::DW_MACINFO_define ? dwarf::DW_MACRO_define_strp : dwarf::DW_MACRO_undef_strp;
Asm->OutStreamer->AddComment(dwarf::MacroString(Type));
Asm->EmitULEB128(Type);
2868

Parsing constant strings back into constants is a bit suboptimal (both in terms of compile time performance, chance of mistakes (the string is checked at runtime whereas a reference to an enum would be checked at compile time)) & is curiously inverted compared to the line before - which uses an enum, then converts that to a string for the comment. There are a few other cases of this - in general, don't write string constants like this due to the lack of error/type checking, prefer using the enums & calling stringifying functions if needed.

It'd be nice to simplify this code a bit in general - the fact that this adds comments to the new debug_macro paths that are then missing from the debug_macinfo paths, etc, isn't ideal.

I'm wondering if it'd make sense to instead phrase this whole function parameterized on the type of enums, so it'd be implemented a bit like this:

if (UseMacro) {
  emitMacroFileImpl(F, U, dwarf::DW_MACRO_start_file, dwarf::DW_MACRO_end_file, dwarf::MacroString);
} else {
  emitMacroFileImpl(F, U, dwarf::DW_MACINFO_start_file, dwarf::DW_MACINFO_end_file, dwarf::MacinfoString);
}

So the implementation doesn't have conditionals in several places, keeps it consistent, etc.

2895–2899

This might be more legible using the conditional (?:) operator to pass the specific debug section - so it's obvious to the reader that that's the only difference (that they're not calling subtly differently named functions, etc - just calling the same function with a different argument), eg:

auto &ObjLower = Asm->getObjFileLowering();
emitDebugMacinfoImpl(getDwarfVersion() >= 5 ? ObjLower.getDwarfMacroSection() : ObjLower.getDwarfMacInfoSection());
SouraVX marked 5 inline comments as done.Jan 16 2020, 11:32 AM

Please implement the dwarfdump support first (this patch can be reviewed during that time - but shouldn't be committed without tests & ideally those tests should be written using dwarfdump (sometimes there are exceptions for especially complicated dumping situations, where raw assembly testing is used instead, but I don't think it's necessary to make that tradeoff here))

I'm trying to get that patch ready. As for this, this patch doesn't have dependency on dwarfdump except that testing the section contents part. I've verified integrity of section content using objdump and lldb/gdb. My rationale behind this was this patch can get in, till we discuss and agree on dwarfdump extension to accommodate macro support.

& I'm surprised this patch causes updates to the tests - what changes in the patch caused changes in these tests? (stringpool and dbg-stack-value-range.mir) - ah, because debug_str now needs to be emitted after debug_macro because the macro section might introduce new strings? OK. Could you separate that ordering change/test update out into a standalone patch. It'll make the rest of this more targeted when reviewing.

Yes, that' s intentional. I've shifted the string emission to post macro emission to put macro content in string section. BTW are you suggesting here to do that change/patch commit separately ??

Thanks David for reviewing/sharing your thoughts on this.
Refactoring suggestion, I will address those in next revision.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2785–2786

I may be wrong here. Are you trying to suggest this ??

if (offset byte size == 8)
 Flags |= OFFSET_SIZE_64;

I think we only need that in DWARF64 case, which this patch does not support.
So for consistency/readibility, I intentionally set OFFSET_byte_size_32 that even that has no effect.
Should we leave it as it is OR remove it and add comment that in default case DWARF32 Offset size will be 32.

2787–2788

Yes, One crucial thing I noticed while verifying section content is that, consumers objdump/lldb/gdb agrees on this representation and expect offset of debug_line section within macro section. So if we omit that, I'm afraid we might loose debuggability.

probinson added inline comments.Jan 16 2020, 12:50 PM
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2785–2786

If you're not supporting 64-bit offsets (which is entirely reasonable) you should have an assert() that the offset size is 4.

2787–2788

@dblaikie The reference to .debug_line is required for DW_MACRO_start_file, because that opcode has an operand that is a file number from the line-number program's file table. It's optional if that opcode is not used.

@SouraVX it is inconsistent to conditionalize setting the flag bit, and then unconditionally emit the debug_line_offset field. I have no problem with always emitting the field, but in that case you should unconditionally set the flag bit as well.

2875

DWARF does not define symbols for the flag bits, but I'd still expect to see these in Dwarf.def where they could be shared with the dumper code.
Also, OPCODE_OPERANDS_TABLE should be 4 not 3; these are masks, not bit numbers.

llvm/test/DebugInfo/X86/stringpool.ll
43 ↗(On Diff #238532)

This captures the string offset into YYYY, but never uses the captured value. The old version of the test did verify the offset was used, presumably somewhere in the __debug_info section. So, this change has lost coverage.

SouraVX updated this revision to Diff 238723.Jan 17 2020, 2:43 AM
SouraVX marked 2 inline comments as done.

Thanks @probinson for sharing thoughts.
Refactored code based on @dblaikie review
Addressed and implemented @probinson concerns

SouraVX marked 8 inline comments as done.Jan 17 2020, 2:48 AM
SouraVX added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2875

Thanks for sharing this, I've added forms in Dwarf.def and exapanded the enum here.
Will try to reuse this in corresponding dumping part.

llvm/test/DebugInfo/X86/stringpool.ll
43 ↗(On Diff #238532)

Thanks for pointing this out.
Added corresponding debug info variable. Now we have back reference to info section from str section.

SouraVX marked 2 inline comments as done.Jan 17 2020, 2:56 AM
SouraVX added inline comments.
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2920

Since my emitMacroFileImpl causes the comments to be emitted in asm for both macinfo and macro section which is good. But this is sort of incomplete, since we're not emitting comments for macinfo type.
I'm planning to do a small factor on macinfo part also. To emit macro form comments for DW_MACINFO_define and others.
Making asm more readeable.

My comments have been addressed, but leaving final approval to @dblaikie.

Corresponding llvm-dwarfdump revision D73086

Gentle, friendly Ping! to all reviewers!
Child Revision - D73086

SouraVX updated this revision to Diff 240430.Jan 26 2020, 4:34 AM
SouraVX added a reviewer: jini.susan.george.

Comments emission in ASM for macinfo section. completed! + Rebase.

Gentle/Friendly Ping! @dblaikie @aprantl and all other reviewers!

lgtm if David is happy.

lgtm if David is happy.

Thank You, @aprantl, for your acceptance. I agree, that part of macro implementation needs refactoring/In my opinion/. That was the reason, I was delaying and created separate patch for llvm-dwarfdump D73086. So that, even if we disagree/want changes, for a better implementation of dumping part.
We can still have emission part i.e this patch in LLVM {/tested/beneficial from consumer perspective GDB or LLDB/}

Ping!
Hi @dblaikie, Could you please share your thoughts/acceptance on this. All other reviewers had shared their acceptance.

This doesn't seem to be tested - the couple of test updates are to keep them passing after changing the ordering of sections - ideally that change (changing the ordering of sections, so strings can come after macros) should be committed separately (just change the order and update the few tests that need it) & explicit/full test coverage for this feature should be added and should use llvm-dwarfdump (so this patch should come after the one that adds functionality to llvm-dwarfdump for dumping this section).

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2916

Could this be Name + ' ' + Value? (perhaps it could be (Name + ' ' + Value).str() if it needs to be a string in the end - that would be a bit more efficient than the current code that stringifies the Name and Value separately)

2929

Probably skip the "= + Value" part here (the same isn't done for the "Macro String"/name above - because the string itself is probably legible in the assembly? So the description is useful, but the value itself will be evident in the assembly without needing to be repeated in the comment)

2945

Rename "Func" to a more informative name - "MacToString", perhaps?

SouraVX updated this revision to Diff 243084.Feb 6 2020, 11:02 PM

Incorporated @dblaikie comments!
Committed string emission part separately.
Incorporated, other minor comments.

SouraVX marked an inline comment as done.Feb 6 2020, 11:03 PM

Ping this when the llvm-dwarfdump support for debug_macro is in and this patch has been updated with test coverage using that support.

Hi @dblaikie , it's already in pipeline. I mentioned in this revision. Corresponding dwarfdump emission D73086.

ikudrin added inline comments.
llvm/include/llvm/BinaryFormat/Dwarf.def
830 ↗(On Diff #243084)

Remove OFFSET_SIZE_32, as this value does not define any real flag.

831 ↗(On Diff #243084)

This should be just OFFSET_SIZE to correspond to the other flags.

833 ↗(On Diff #243084)

The value for OPCODE_OPERANDS_TABLE should be 0x04. It looks like @dblaikie already said that for one of previous versions of the patch.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2879

This crashes on a Debug build, probably because DwarfParams is not initialized.

SouraVX updated this revision to Diff 243764.Feb 11 2020, 1:33 AM
SouraVX marked 4 inline comments as done.

Incorporated @ikudrin comments,
Removed a redundant check and a FIXME:Not emitting opcode_operands_table

SouraVX added inline comments.Feb 11 2020, 3:04 AM
llvm/include/llvm/BinaryFormat/Dwarf.def
830 ↗(On Diff #243084)

Yes, theirs no particular flag in Spec. Spec says about "offset_size_flag" and how it's value is interpreted
0 - DWARF32 1 - DWARF64. I created these flags for readable/informative purpose.

833 ↗(On Diff #243084)

Sorry for the trouble, I forgot to replace this with "0x03". Since, their was a discussion to keep this or not in first place because we are not emitting opcode_operands_table, So this flag is not used, at least right now.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2879

Oh.., Yeah I also noticed this, even after initalizing dwarf::FormParams DwarfParams({5, 8, DWARF32}).
Still
assert(DwarfParams.getDwarfOffsetByteSize() == 4 && "DWARF64 not supported!"); not on release build.
this is failing, some how getDwarfOffsetByteSize() is not returning correctly.
Anyway I found this check to be a bit redundant in it's own way. So removed it.

ikudrin requested changes to this revision.Feb 18 2020, 10:11 PM
ikudrin added inline comments.
llvm/include/llvm/BinaryFormat/Dwarf.def
830 ↗(On Diff #243084)

This is false readability. You still have to remember that the value is not real and treat it accordingly. There is no distinct value for DWARF32 in the specs. The value of 0 means that the contribution has DWARF32 format AND does not have a debug_line_offset field AND does not have an opcode_operands_table. Moreover, you do not define pairs of values for the other flags, which makes all the definition inconsistent. So, please, remove that fictitious value and make the set of flags follow the documentation.

This revision now requires changes to proceed.Feb 18 2020, 10:11 PM
SouraVX updated this revision to Diff 248413.Mar 5 2020, 2:12 AM

Revised and Addressed comments by @ikudrin.
Changes summary -
-No dependency on D73086, just the Macro flags defined in Dwarf.def are reused here while writing macro header flags field.
-Contents verification has been done using llvm-dwarfdump generated from D73086.

SouraVX marked 2 inline comments as done.Mar 5 2020, 2:15 AM

While this can be applied to master cleanly, it cannot be built, because it requires D73086; it applies after D73086 with some conflicts. Please, prepare a diff which will be applied after D73086 cleanly and add D73086 as a parent revision to show the dependency.

llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2882

Flags = MACRO_FLAG_DEBUG_LINE_OFFSET;

While this can be applied to master cleanly, it cannot be built, because it requires D73086; it applies after D73086 with some conflicts. Please, prepare a diff which will be applied after D73086 cleanly and add D73086 as a parent revision to show the dependency.

Hi @ikudrin and other reviewers, How about committing Dwarf.def and Dwarf.h changes/Mostly Flag encodings/ and stringify encoding functionalties as separate commit. That will resolve most of false dependencies. In my opinion that part adding those flags and encoding is independent of these patches(for instance, most of other encodings/flags described by dwarfv5 spec are their, even when LLVM/consumer might not be using them at present) and should be committed separately.
Any thoughts on this?

SouraVX updated this revision to Diff 248688.Mar 6 2020, 3:30 AM

Addressed nits by @ikudrin

While this can be applied to master cleanly, it cannot be built, because it requires D73086; it applies after D73086 with some conflicts. Please, prepare a diff which will be applied after D73086 cleanly and add D73086 as a parent revision to show the dependency.

Hi @ikudrin and other reviewers, How about committing Dwarf.def and Dwarf.h changes/Mostly Flag encodings/ and stringify encoding functionalties as separate commit. That will resolve most of false dependencies. In my opinion that part adding those flags and encoding is independent of these patches(for instance, most of other encodings/flags described by dwarfv5 spec are their, even when LLVM/consumer might not be using them at present) and should be committed separately.
Any thoughts on this?

That would be perfectly fine.

In my understanding, this should still depend on D73086 because this revision requires the dumper for the test(s).

SouraVX updated this revision to Diff 248951.Mar 7 2020, 11:56 AM

Rebase!
@ikudrin this still depends on D73086, I've not removed parent-child relationship of revisions.

SouraVX updated this revision to Diff 249557.Mar 10 2020, 11:10 PM

nit: Modified macro header comment emission. Previously we were emitting lineptr.
Based on @ikudrin comment on D73086 this is modified to debug_line_offset for consistency and correlation with v5 macro definition in spec.
Thanks for this!

SouraVX added a comment.EditedMar 25 2020, 7:20 AM

Since most of this patch is reviewed already and this was blocking on D73086. Now D73086 is mostly ready to land so could you guys @dblaikie and @ikudrin can share some final comments on this one also.

dblaikie added inline comments.Apr 2 2020, 6:28 PM
llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
2856–2861

Rather than duplicating the code here from DwarfCompileUnit::initStmtList, please refactor so that this value is computed once and shared from there. (perhaps DwarfCompileUnit can have a getLineTableStartSym function that can be used from here.

SouraVX updated this revision to Diff 255190.Apr 5 2020, 1:11 PM
SouraVX edited the summary of this revision. (Show Details)

Addressed review comments by @dblaikie. Thanks for this!
-Refactored and rebased on trunk.
-Added a brief descriptive comment stating intention of the test case.
-Edited summary to remove previously stated objdump based testing statement.

SouraVX marked an inline comment as done.Apr 5 2020, 1:12 PM
dblaikie accepted this revision.Apr 5 2020, 3:21 PM

Looks good to me - please wait for @ikudrin to circle back on this and approve it as well.

ikudrin added inline comments.Apr 6 2020, 2:56 AM
llvm/test/DebugInfo/X86/debug-macro-v5.ll
6

The test fails because this line lacks the RUN: prefix.

28–42

The function @main() and its dependencies are not required for the test. Please, shrink the test so that it includes only that is necessary.

SouraVX updated this revision to Diff 255286.Apr 6 2020, 4:30 AM

Addressed comments by @ikudrin. Thanks for this!
-Reduced the test case to minimal.

SouraVX marked 2 inline comments as done.Apr 6 2020, 4:34 AM
SouraVX added inline comments.
llvm/test/DebugInfo/X86/debug-macro-v5.ll
6

Sorry, somehow the presence of line continuation(Previously inserted since the RUN line was long) was causing this test to be UnResolved.

Test has unterminated run lines (with '\')
Unresolved Tests (1):
    LLVM :: DebugInfo/X86/debug-macro-v5.ll

Anyways removed the line continuation, Now this is passing.

SouraVX marked 2 inline comments as done.Apr 6 2020, 4:35 AM
ikudrin accepted this revision.Apr 6 2020, 5:06 AM

LGTM.

This revision is now accepted and ready to land.Apr 6 2020, 5:06 AM
This revision was automatically updated to reflect the committed changes.