This is an archive of the discontinued LLVM Phabricator instance.

[DWARF5]: Added support for dumping strx forms in llvm-dwarfdump
ClosedPublic

Authored by SouraVX on Apr 23 2020, 11:19 AM.

Details

Summary

This patch adds support for dumping DW_MACRO_define_strx, DW_MACRO_undef_strx in llvm-dwarfdump. These forms are currently supported only in debug_macro section.

  • Added test containing multiple CUs(2 CU) contributing to macro section via DW_MACRO_define_strx form. llvm-dwarfdump is able to parse/dump contents correctly.

This patch is separated from D78500 based on feedback from @dblaikie , @ikudrin, Thanks for this!

Support for these forms in a debug_macro.dwo section will be done in a separate patch.

Diff Detail

Event Timeline

SouraVX created this revision.Apr 23 2020, 11:19 AM
Herald added a project: Restricted Project. · View Herald Transcript
SouraVX marked an inline comment as done.Apr 23 2020, 11:36 AM
SouraVX added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
117

This check some how crept in while rebasing and separating. I'll remove this.

SouraVX updated this revision to Diff 259676.Apr 23 2020, 12:26 PM

Removed un-necessary if condition left behind.

probinson added inline comments.Apr 23 2020, 12:29 PM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugMacro.h
12

Typo //, also please keep includes in correct alphabetical order.

llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
92

It looks like StringExtractor should not be Optional?

SouraVX marked 2 inline comments as done.Apr 23 2020, 12:40 PM
SouraVX added inline comments.
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugMacro.h
12

Ah, Sorry! I'll keep this in mind now onwards.
Thank You for this!

llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
92

I marked it optional since, for cases of debug_macinfo[.dwo] . String Extractor is not needed since the macro info are represented in macinfo[.dwo] section itself.

The test should be reduced.

llvm/lib/DebugInfo/DWARF/DWARFContext.cpp
273

You do not need type units to parse the macro section. It is better to pass compile_units().

llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
103

Not all units may reference the macro section. You should skip such units.

168–173

DWARFUnit provides a method for that, getStringOffsetSectionItem(). Please, use it.

By the way, the string extractor can also be obtained from DWARFUnit using the getStringExtractor() method.

SouraVX updated this revision to Diff 259936.EditedApr 24 2020, 11:11 AM

Addressed @ikudrin comments thanks for this!

  • replaced normal_units() with compile_units()
  • Skipped the units which doesn't contribute to macro section.
  • Removed getStringOffsetsExtractor from arguments and used the corresponding getStringOffsetSectionItem() function to retrieve contents.
SouraVX marked 3 inline comments as done.Apr 24 2020, 11:22 AM
SouraVX added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
168–173

By the way, the string extractor can also be obtained from DWARFUnit using the getStringExtractor() method.

There's small caveat here, consider a case of _strp forms. For this case we don't need any CU information to parse/dump the associated macro content.
So we if try this way --

E.MacroStr = GetContributionUnit()->getStringExtractor().getCStr(&StrOffset);

Then cases which doesn't have CU in first place will end up crashing the dwarfdump.
I tried this approach and ended with crashes for these test cases, as you may notice these test cases only contains the minimum needed information to dump _strp forms which is debug_str and debug_macro section --

Failing Tests (2):
  LLVM :: DebugInfo/X86/debug-macro-macinfo.s
  LLVM :: DebugInfo/X86/debug-macro-v5.s

Shall we continue with earlier approach(pass the String Extractor explicitly) just for the cases of _strp form ?

SouraVX marked an inline comment as done.Apr 24 2020, 11:32 AM

The test should be reduced.

Sorry I saw this a bit late. Just to be on same page WRT this.
I'm planning to reduce it more(previously removed language producer ... attributes) by removing following attributes - DW_AT_stmt_list, DW_AT_comp_dir and remove corresponding temporary labels.
I think rest of the stuff is needed, Please add to this if I'm still missing out.
Thanks a lot!

SouraVX updated this revision to Diff 260127.Apr 25 2020, 11:19 AM

Reduced the test case.

SouraVX marked an inline comment as done.Apr 25 2020, 11:19 AM

Please, do not split definitions for sections in the test. It adds no value, just makes it harder to read.

llvm/include/llvm/DebugInfo/DWARF/DWARFDebugMacro.h
104–113

It starts looking that it is better to have two distinct methods in the public interface, parseMacinfo() and parseMacro(), than adding lots of the optional parameters. They can still share the implementation, though.

llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
104

I would probably use try_emplace() instead of insert() here. It is the same but does not require curly brackets.

117

The lambda is simple and is used only in one place. Just move the code there.

119
  • What if there is no record for that offset in the map?
  • Why not just store a pointer to DWARFUnit in the map?
163–190

StringExtractor->getCStr(). But, please, check that StringExtractor has a value.

llvm/test/DebugInfo/X86/debug-macro-multi-cu-strx.s
68–70

Add comments for these values.

84

This second part of the .debug_abbrev section is not used.

SouraVX updated this revision to Diff 260342.Apr 27 2020, 8:46 AM

Addressed @ikudrin comments, Thanks for this!

  • Addressed the case when there are no records in map. By asserting it before we parsing the _strx form.
  • Addressed test case related comments. Thanks for this, now test is more reduced and more readeable.
  • lambda removal, using DWARFUnit, using try_emplace.
SouraVX marked 4 inline comments as done.Apr 27 2020, 8:57 AM

It starts looking that it is better to have two distinct methods in the public interface, parseMacinfo() and parseMacro(), than adding lots of the optional parameters. They can still share the implementation, though.

Yes, I'm also noticing it. But this can be tricky(since we want code sharing too) these macro and macinfo is using 4 common forms. I'll be separately investigating this to cleanly refactor it so that the public interface should be more clean/intuitive. I think a private member function parseImpl to be called by public parseMacinfo/Macro would be the best fit?

  • This is just an abstract idea, have to dig-in more for implementation. May be 1/2 other patches ?
SouraVX marked 2 inline comments as done.Apr 27 2020, 9:08 AM
SouraVX marked an inline comment as done.Apr 27 2020, 9:19 AM
SouraVX added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
170

Ah, Small typo! this should be MacroContributionOffset

SouraVX updated this revision to Diff 260374.Apr 27 2020, 10:43 AM

Removed a typo! + Rebase!

It starts looking that it is better to have two distinct methods in the public interface, parseMacinfo() and parseMacro(), than adding lots of the optional parameters. They can still share the implementation, though.

Yes, I'm also noticing it. But this can be tricky(since we want code sharing too) these macro and macinfo is using 4 common forms. I'll be separately investigating this to cleanly refactor it so that the public interface should be more clean/intuitive. I think a private member function parseImpl to be called by public parseMacinfo/Macro would be the best fit?

  • This is just an abstract idea, have to dig-in more for implementation. May be 1/2 other patches ?

We can go with a private parseImpl() for now. It may be improved later.

ikudrin added inline comments.Apr 28 2020, 4:53 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
163–190

assert should not be used for the cases when malicious user input may trigger it.

171

The same. Please review other lines in this block to avoid runtime errors/crashes on unexpected input.

llvm/test/DebugInfo/X86/debug-macro-multi-cu-strx.s
83

The definitions for both CUs should lay under one .section directive. In that case, a reader can easily see that there are two CUs in the test. If there are several .section directives for one section, it makes reading the test more difficult.

The same for other sections.

It starts looking that it is better to have two distinct methods in the public interface, parseMacinfo() and parseMacro(), than adding lots of the optional parameters. They can still share the implementation, though.

Yes, I'm also noticing it. But this can be tricky(since we want code sharing too) these macro and macinfo is using 4 common forms. I'll be separately investigating this to cleanly refactor it so that the public interface should be more clean/intuitive. I think a private member function parseImpl to be called by public parseMacinfo/Macro would be the best fit?

  • This is just an abstract idea, have to dig-in more for implementation. May be 1/2 other patches ?

We can go with a private parseImpl() for now. It may be improved later.

Thanks for sharing your thoughts @ikudrin!
Should we do this refactoring after these 4 patches are in and macro infrastructure is all set ? Or should we delay these patches and focus on refactoring first ? Advantage of the former approach would be that we'll have all ingredients(macro macinfo split & non-split) already and then doing the refactoring.
Could you please share your thoughts on this too! Thanks!

SouraVX added inline comments.Apr 28 2020, 5:12 AM
llvm/test/DebugInfo/X86/debug-macro-multi-cu-strx.s
83

Thanks for clearing this!, I will make sure to follow this practice in all test cases and revision.

SouraVX updated this revision to Diff 260650.Apr 28 2020, 8:47 AM

Addressed review comments by @ikudrin . Thanks for this!
Following changes have been made to this revision-

  • Refactoring front based on comment by @ikudrin -
    • In an effort to simplify the public interfacing of macro/macinfo,( which is currently used by DWARFContext). public member function parse and private member function parseImpl(to do actual parsing) are introduced in class DWARFDebugMacro. This approach simplifies the interfacing to DWARFContext to some extent(atleast parameter passing and readeability).
    • Also included a FIXME, as a hint that we should introduce a better abstraction.

On testing front -

  • Previously we're asserting for a malformed user provided input, that has been switched to error handling. As a result 2 test cases were also added to verify these error messages.
  • Furthermore test case is further reduced based on @ikudrin suggestion. Thanks for this, it helped a lot, now test case is more readable asm then previous.
  • Lastly, I'll be revising other patches specifically D78500, After we reach consensus on the changes in this revision.
SouraVX marked 5 inline comments as done.Apr 28 2020, 8:51 AM
SouraVX added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
163–190

This String Extractor is innocuous in a sense, it is passed as parameters by functions up in the stack. So I left this `assert as it is. ? Are you okay with this.
Other asserts, I've replaced with error handling.

jdoerfert resigned from this revision.Apr 30 2020, 6:09 AM
SouraVX marked an inline comment as done.May 5 2020, 11:32 AM

Ping! @all reviewers!

ikudrin added inline comments.May 6 2020, 8:46 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
163–190

What if a malformed .debug_macinfo section contains a DW_MACRO_define_strp entry? Does the code have any protection against that?

SouraVX added inline comments.May 6 2020, 9:40 AM
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
163–190

Ah, thanks for noting that. I think we should consider assert for these cases DW_MACRO_define_strp and DW_MACRO_define_strx since these forms belongs and should only be present when section is debug_macro. + Test cases to test these cases.
Does this sounds fair to you ?

SouraVX updated this revision to Diff 262447.May 6 2020, 12:20 PM

Addressed @ikudrin comments, Thanks for this!

  • Added assert while parsing _strp/_strx forms. Since DW_MACRO_define_strp and DW_MACRO_define_strx forms are only allowed in macro and macro[.dwo] sections respectively.
SouraVX marked 2 inline comments as done.May 6 2020, 12:20 PM
SouraVX marked an inline comment as done.May 6 2020, 12:26 PM
SouraVX added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
150

This different indentation for both assert statements is after running git-clang-format )

ikudrin added inline comments.May 6 2020, 6:36 PM
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
163–190

As I have already said, asserts should not be used to handle invalid user input.

SouraVX updated this revision to Diff 262549.May 6 2020, 10:14 PM

Addressed @ikudrin comments, Thanks for this!

  • Replaced asserts with error handling and added 2 test cases to test these error creation and dumping.
SouraVX marked an inline comment as done.May 6 2020, 10:15 PM
ikudrin added inline comments.May 7 2020, 8:35 AM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugMacro.h
108

The method looks weird. It would be better to add two separate methods to parse macro and macinfo sections respectively.

llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
176

This probably should be a recoverable error. It would be good to report the offset of the macro table.

178

You've already found the record, no need to search again.

183

This also should be a recoverable error. It would be good to report the problematic offset.

SouraVX added inline comments.May 7 2020, 9:55 AM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugMacro.h
108

I thought we already had a discussion/agreement on this:
https://reviews.llvm.org/D78736#2007450

llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
176

I guess it should be non-recoverable error since:

  • Once an Macro Offset is not found, rest of the macro unit or a particular contribution of the CU(containing strx forms) becomes non-parseable(Since we cannot establish the mapping b/w macro contribution to String contribution).

This is based on premises that the contribution is homogeneous, in a sense, contribution contains single type(in this case strx) form to represent the macro information. This is expected since most producers don't use multiple forms(strp strx or pre-v5 forms such as define ) when producing macro info for a CU.
This also holds true for the 2nd case/comment where a String Contribution is not found.
This seems recoverable error in a mixed form(strp or DW_MACRO_define) contribution scenario: but this is hand crafted case.

debug_macro: 
Macro Header
DW_MACRO_define_strx
....
DW_MACRO_define
ikudrin added inline comments.May 7 2020, 10:40 PM
llvm/include/llvm/DebugInfo/DWARF/DWARFDebugMacro.h
108

It looks like I was not clear enough there. I meant to move the implementation into a private method parseImpl() and add two public methods parseMacinfo() and parseMacro(). This will make the public interface of the class clear while the implementation may be improved later.

llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
151

For consistency, the actions should be the same as in the default case. From the perspective of MACINFO sections, all these encodings are unknown.

176

In general, if an invalid record can be safely skipped, we prefer to do that, generate a recoverable error, and continue parsing. Otherwise, recoverable errors would not exist at all.

Anyway, that is not very critical for this patch. Handling error cases may be improved later when/if someone needs that.

llvm/test/DebugInfo/X86/debug-macro-multi-cu-strx.s
103

That is a bit weird that units with 3 and 2 string offsets both have the same unit length.

SouraVX updated this revision to Diff 262825.May 8 2020, 12:37 AM

Addressed @ikudrin comments, Thanks for this!

  • Simplified the public interface further, as per suggestion. Now parseMacro and parseMacinfo is the public interfacing of parsing macro and macinfo respectively.
  • For the case of strp or strx forms found in macinfo[.dwo] section, halt parsing(default behavior since these forms are invalid WRT debug_macinfo[.dwo] sections.)
  • Corrected the unit length in test case.
SouraVX marked 5 inline comments as done.May 8 2020, 12:38 AM

LGTM with a nit. Please wait a couple of days in case other reviewers have anything to say,

llvm/test/DebugInfo/X86/debug-macro-empty-str-offset.s
2

Remove trailing whitespace in this line.

LGTM with a nit. Please wait a couple of days in case other reviewers have anything to say,

Thank You @ikudrin for investing time in reviewing this. I'll wait for a while(till Monday) to have other reviewers(including @dblaikie , @aprantl and others) opinion on this.
I'm planning to revise rest of the revisions after this one lands.

dblaikie accepted this revision.May 11 2020, 3:54 PM
dblaikie added inline comments.
llvm/lib/DebugInfo/DWARF/DWARFDebugMacro.cpp
176

+1 here - while I don't feel super invested in the error handling here, I'd like us to stick to consistent habits here - please at least provide a follow-up patch in short-order after this one lands that changes this to a non-fatal error & tests that.

177–180

This variable name and error message seem a bit off. This isn't about the fact that the "macro contribution" for "the unit" couldn't be found - but somewhat the opposite "No compile unit found for this macro contribution (needed for discovering the str_offsets contribution when decoding a strx form in the macro contribution)" or something like that, perhaps.

184

Prefer just the straight boolean test:

if (!StrOffset)
190

No need for the extra () here, just:

&*StrOffset
This revision is now accepted and ready to land.May 11 2020, 3:54 PM
This revision was automatically updated to reflect the committed changes.
SouraVX marked 2 inline comments as done.May 11 2020, 10:08 PM

(the usual convention is to address outstanding comments that are made when a patch is approved (such as the naming and parentheses I mentioned) before committing (this is done to save an extra feedback loop when the changes seem sufficiently obvious to not need another check to ensure they were done as requested) - so if you could do those in a follow-up commit (no need to send it for review, but please follow-up here and mention the commit hashes of the commits that address the feedback))

SouraVX added a comment.EditedMay 11 2020, 10:49 PM

(the usual convention is to address outstanding comments that are made when a patch is approved (such as the naming and parentheses I mentioned) before committing (this is done to save an extra feedback loop when the changes seem sufficiently obvious to not need another check to ensure they were done as requested) - so if you could do those in a follow-up commit (no need to send it for review, but please follow-up here and mention the commit hashes of the commits that address the feedback))

Ah Sorry! But I've addressed/incorporated if (!StrOffset) and (&(*StrOffset) related comments Correctly right?.
Only the Variable renaming and Error message is remaining which I plan to address separately(Since making the error recoverable won't be trivial because, in case of error(MacroContribution missing or StrOffsets missing) we have to correctly skip the entire CU macro contribution and start parsing the next CU if it exist.).
Right now I'm rebasing rest of the 3 revision on top of this.

(the usual convention is to address outstanding comments that are made when a patch is approved (such as the naming and parentheses I mentioned) before committing (this is done to save an extra feedback loop when the changes seem sufficiently obvious to not need another check to ensure they were done as requested) - so if you could do those in a follow-up commit (no need to send it for review, but please follow-up here and mention the commit hashes of the commits that address the feedback))

Ah Sorry! But I've addressed/incorporated if (!StrOffset) and (&(*StrOffset) related comments Correctly right?.

Oh, right, so you did! My mistake! (I thought I'd checked the Phab close email & thought it didn't mention any further changes - but it did! totally my mistake)

Only the Variable renaming and Error message is remaining which I plan to address separately(Since making the error recoverable won't be trivial because, in case of error(MacroContribution missing or StrOffsets missing) we have to correctly skip the entire CU macro contribution and start parsing the next CU if it exist.).
Right now I'm rebasing rest of the 3 revision on top of this.

*nod* Sounds good.