This is an archive of the discontinued LLVM Phabricator instance.

Support DW_FORM_strx* in llvm-dwp.
ClosedPublic

Authored by kimanh on Mar 2 2020, 3:14 PM.

Details

Summary

Currently llvm-dwp only handled DW_FORM_string and DW_FORM_GNU_str_index; with this patch it also starts to handle DW_FORM_strx[1-4]?

Diff Detail

Event Timeline

tamur created this revision.Mar 2 2020, 3:14 PM
dblaikie added inline comments.Mar 3 2020, 1:46 PM
llvm/test/tools/llvm-dwp/X86/handle_strx.test
7–21

This looks a bit more complicated than it needs to be - you can dwp any two .o files (so you don't need headers, function calls/cross references, etc), or even a single .o file (& a singel one might be sufficient for the purposes of this test case).

26–30

What sort of incompletenes problems does this test case trip over? What /should/ be tested but isn't because it's broken in other ways?

It might be that this patch should be held off on until otehr fixes are made so we're not adding code that can't be well tested, etc.

tamur updated this revision to Diff 248079.Mar 3 2020, 5:55 PM

Simplified the test. llvm-dwp and llvm-dwarfdump works fine for this simple test, but in any case I don't want to overspecify the output.

tamur marked 2 inline comments as done.Mar 3 2020, 5:59 PM
tamur added inline comments.
llvm/test/tools/llvm-dwp/X86/handle_strx.test
26–30

IIRC, dwarfdump had a non-fatal complaint, but they both work fine for this simple test. Still, I don't want to overspecify the output (and I don't want to make the test depend too much on llvm-dwarfdump), so I think just checking that llvm-dwp works correctly and the output has a DW_FORM_strx1 should suffice?

dblaikie added inline comments.Mar 4 2020, 9:34 AM
llvm/test/tools/llvm-dwp/X86/handle_strx.test
26–30

IIRC, dwarfdump had a non-fatal complaint, but they both work fine for this simple test.

Still, I don't want to overspecify the output (and I don't want to make the test depend too much on llvm-dwarfdump),

Not overspecifying is a worthy goal (& we do lots of things to ensure that - symbolic matches, etc) - though llvm-dwarfdump itself exists predominantly for writing test cases, so that dependence in general is intentional/not something to avoid.

so I think just checking that llvm-dwp works correctly and the output has a DW_FORM_strx1 should suffice?

Specifically the reason llvm-dwp tries to read strings is to produce error messages using the DW_AT_name and DW_AT_dwo_name - so it's probably appropriate to specifically test that such an error message has the right string value (that the parsing was correct, not just that that codepath didn't produce an error).

See llvm/test/tools/llvm-dwp/X86/duplicate.test for existing test coverage.

Also, though I don't feel too strongly about this, I think the current leaning is to write such tests in assembly (though still include source/compile commands, etc - but just not checking in binary object files) and have the test use llvm-mc to assemble them within the test.

tamur updated this revision to Diff 248216.EditedMar 4 2020, 9:34 AM

I deleted "#include " line in handle_strx.test and ran clang-format on llvm-dwp.cpp

tamur updated this revision to Diff 248361.Mar 4 2020, 5:33 PM
tamur marked an inline comment as not done.
  • Introduce a CompileUnitHeader structure, it is needed to know the dwarf version, because, for example, debug_str_offsets sectoins have a header only when dwarf version is 5.
  • Introduce a function to parse CompileUnitHeader.
  • Introduce a function to compute the size of debug_str_offsets section headers.
  • Fix writeStringsAndOffsets to copy debug_str_offsets section headers correctly.
  • Get getIndexedString funciton to handle DW_FORM_strx forms correctly.
tamur marked 3 inline comments as done.Mar 4 2020, 5:38 PM
tamur added inline comments.
llvm/test/tools/llvm-dwp/X86/handle_strx.test
26–30

OK, it turns out debug_str_offsets section had problems with respect to header parsing, and dwp did not produce correct strings. With my latest changes, I do see correct strings now.

Almost all tests in llvm/test/tools/llvm-dwp/X86 start from a .dwo file, and I'd like to do it the same way, if that's ok with you.

tamur marked 2 inline comments as done.Mar 4 2020, 5:39 PM
dblaikie added inline comments.Mar 5 2020, 4:28 PM
llvm/tools/llvm-dwp/llvm-dwp.cpp
194–197

This should probably be:

Out.emitBytes(Data.getBytes(&Offset, HeaderSize));

I think?

292–293

unnecessary reformatting, maybe?

Ah, looks like this overlaps a bit with https://reviews.llvm.org/D74312 - might be worth checking there if there's any necessary issues (rather than orthogonal pieces) that patch addresses that this one doesn't, etc.

tamur updated this revision to Diff 248631.Mar 5 2020, 5:04 PM

Done reviewer suggestions and fixed clang -format complaints.

tamur marked 2 inline comments as done.Mar 5 2020, 5:04 PM
tamur added a comment.Mar 5 2020, 5:32 PM

Ah, looks like this overlaps a bit with https://reviews.llvm.org/D74312 - might be worth checking there if there's any necessary issues (rather than orthogonal pieces) that patch addresses that this one doesn't, etc.

Thank you, I was not aware https://reviews.llvm.org/D74312 ; I looked at it, yes it seems this patch and that mostly solves the same problems; it seems parts of that patch have already been checked in (maybe as another patch?) but other parts have not seen action for some time. I added the author @SouraVX as a reviewer for comment.

Ah, looks like this overlaps a bit with https://reviews.llvm.org/D74312 - might be worth checking there if there's any necessary issues (rather than orthogonal pieces) that patch addresses that this one doesn't, etc.

Thank you, I was not aware https://reviews.llvm.org/D74312 ; I looked at it, yes it seems this patch and that mostly solves the same problems; it seems parts of that patch have already been checked in (maybe as another patch?) but other parts have not seen action for some time. I added the author @SouraVX as a reviewer for comment.

Hi thanks for doing this, I was occupied by my other reviews so the debug_str_offsets changes didn't went in. D74425 only addresses info section(Mostly the changes in Header) parsing and writing to dwp.
BTW, have you tested the resultant dwp for debuggability with GDB ?? I also tried to apply locally D74312(info + str_offsets support), As I recall loading of dwp was fine. But their were some complaints reported when you try running/set break etc. If possible can you confirm this ?

Your changes LGTM! but you can wait for @dblaikie comments/approval.

tamur added a comment.Mar 6 2020, 11:54 AM

BTW, have you tested the resultant dwp for debuggability with GDB ?? I also tried to apply locally D74312(info + str_offsets support), As I recall loading of dwp was fine. But their were some complaints reported when you try running/set break etc. If possible can you confirm this ?

Hi, with this patch I am able to run gdb successsfully on very basic examples. But I am using a custom gdb, that includes some pending patches upstream. llvm-dwp does not have support for loclists and range lists, and I'd not be surprised if there are other problems I am yet unaware of. I am planning to tackle loclists and range lists next, so I don't think we should expect a good gdb experience with .dwp yet.

dblaikie added inline comments.Mar 6 2020, 6:15 PM
llvm/test/tools/llvm-dwp/X86/handle_strx.test
9–12

I'd probably simplify this to something like:

void f1() {
}

That'll be shorter DWARF and machine code (no need for a function call, a return type, etc).

14

Remove this line that's no longer needed/relevant.

26–30

Could you add a DWARFv5 variation of the duplicate.test (either in that test, or in another/similar one alongside) - this will test that the strings are being parsed correctly by llvm-dwp when it needs to parse them (to extract the dwo_name and source file name).

llvm/tools/llvm-dwp/llvm-dwp.cpp
192

Any idea if this assertion could be hit by malformed input? If that's the case it shouldn't be an assert & should be full error handling - but you could leave it as a FIXME for now.

ikudrin added inline comments.
llvm/tools/llvm-dwp/llvm-dwp.cpp
75

There is a named constant for that: DW_LENGTH_DWARF64.

112
  • The function should be static.
  • The function should check if Info has enough data in it for the full header.
116

Please, Consider using DWARFDataExtractor::getInitialLength() here.

tamur updated this revision to Diff 249541.Mar 10 2020, 7:18 PM
tamur marked 7 inline comments as done.
tamur added inline comments.
llvm/test/tools/llvm-dwp/X86/handle_strx.test
26–30

I added string parsing to this test.

tamur marked an inline comment as done.Mar 12 2020, 7:17 AM

Friendly ping?

dblaikie added inline comments.Mar 23 2020, 3:52 PM
llvm/tools/llvm-dwp/llvm-dwp.cpp
137

That's probably a fairly opaque error message compared to returning Err here (possibly extracting the string from Err and adding a prefix, one example of that is here: https://github.com/llvm-mirror/clang/blob/master/lib/Tooling/Execution.cpp#L76 - or the Error could be passed up unmodified, potentially, though that might give a less legible message to the user). Same goes for other error failures in this function.

Also, and I know it's annoying to make them, but each of these failure paths should have tests.

SouraVX added inline comments.Mar 30 2020, 8:49 AM
llvm/test/tools/llvm-dwp/X86/handle_strx.test
2

How about instead of putting raw objects files dw5.dwo, put it as asm file and use llvm-mc to generate object and llvm-dwarfdump and so forth.

dblaikie added inline comments.Mar 30 2020, 11:28 AM
llvm/test/tools/llvm-dwp/X86/handle_strx.test
2

This was already discussed here: https://reviews.llvm.org/D75485#inline-688717

kimanh added a subscriber: kimanh.Mar 24 2021, 12:13 AM

We recently ran into this issue too. It seems as if the thread was forgotten about, and if some comments were not addressed yet. Ping @tamur @dblaikie : Are there any plans for this?

pfaffe added a subscriber: pfaffe.Mar 24 2021, 3:24 AM
kimanh updated this revision to Diff 338007.Apr 16 2021, 12:22 AM

Picking up patch and updating outstanding issues:

  • Adding test cases for each error case in parseCompileUnitHeader
  • Changing the error message to reuse the message from Error in parseCompileUnitHeader
  • Adapted tests (minor change: test error message, and added required header info to unsupported_cu_index_version)
kimanh commandeered this revision.Apr 16 2021, 12:26 AM
kimanh added a reviewer: tamur.

Commandeering revision.

kimanh marked an inline comment as done.Apr 16 2021, 12:46 AM

Answering leftover TODOs except of the FIXME one.

This change also was rebased to origin/main, so some changes to llvm-dwp are highlighted that are not mine (lightgreen).

llvm/test/tools/llvm-dwp/X86/handle_strx.test
2

I kept tamur's test case as discussed in https://reviews.llvm.org/D75485#inline-688717 but used .s files for the new test cases. Let me know if you want me to change anything.

ikudrin added inline comments.Apr 16 2021, 5:13 AM
llvm/test/tools/llvm-dwp/X86/invalid_string_form.test
2

Probably, the line was added accidentally.

llvm/test/tools/llvm-dwp/X86/unsupported_cu_index_version.s
4–5

The comment is outdated with the changes in .debug_info.dwo

dblaikie added inline comments.Apr 18 2021, 11:14 AM
llvm/test/tools/llvm-dwp/X86/invalid_cu_header_unittype.s
5 ↗(On Diff #338007)

The other messages are specific about what they can't parse - the unit version or the unit length. Hmm, I see because this uses the generic "check the error after parsing a bunch of fields".

Maybe it'd be worth checking that the length is potentially correct? That the length is long enough to include the header, and that the section is long enough to fit the length that's been parsed?

llvm/tools/llvm-dwp/llvm-dwp.cpp
125

I think the norm is probably not to include a newline in the error string - but wherever it's printed can choose where the newline should go.
Also, I think the norm is to not capitalize error messages (think about compiler error messages "error: unexpected thing") - so probably best to change "Cannot" to "cannot" in all these.

kimanh updated this revision to Diff 338512.Apr 19 2021, 7:08 AM
kimanh marked 3 inline comments as done.

Addressing reviewer comments:

  • adding test on the header length, which changed test set
  • removing accidentally added line in test
  • removing obsolete comment
  • update error messages to start with lower letter, and no new line
kimanh updated this revision to Diff 338514.Apr 19 2021, 7:16 AM
kimanh marked an inline comment as done.

Re-uploading as merged commit.

dblaikie accepted this revision.Apr 22 2021, 3:39 PM

Seems OK, I think. Few minor wording/coding changes to deal with.

llvm/test/tools/llvm-dwp/X86/handle_strx.test
2

Yeah, I wouldn't mind if tamur's test was changed to .s, but I don't insist on it.

llvm/tools/llvm-dwp/llvm-dwp.cpp
125

This suggests the header itself is too large for the debug_info section, but what it's testing is that the length specified in the header is too large - so maybe rephrasing this to something like: "compile unit exceeds .debug_info section range"?

160

This is an assertion because the length has been checked ahead of time, yeah? So there's no failure expected?

I guess the assertion was added to suppress an llvm::Error about the Error being unchecked? The problem is that in a non-assertions build that checking will fail because the Error won't be checked because the assertion isn't executed.

Perhaps it'd be OK to call the extraction functions without an Error parameter? (getU32(&Offset) etc) since the bounds have already been checked

273

(are you using/interested in DWARF type units (-fdebug-types-section)? In DWARFv5 those were moved into the .debug_info section, so this would need to be updated to handle them at some ponit)

This revision is now accepted and ready to land.Apr 22 2021, 3:39 PM
kimanh updated this revision to Diff 339945.Apr 23 2021, 1:58 AM
kimanh marked an inline comment as done.
  • removed Err parameter after already checking the header size
  • updated error message for compile unit length
  • adapted abbrev offset size (and thus also header size) for dwarf32/dwarf64

Thanks a lot for the reviews!

After reading up more on the DWARF specs, I noticed that debug abbrev offsets were not updated for dwarf 32-bit, and 64-bit. I've changed that now in the latest patch.

llvm/test/tools/llvm-dwp/X86/handle_strx.test
2

Alright, in that case I prefer to keep it. I'll be updating the code again for type units, so in case your opinion changes I'll take it on in a follow up patch!

llvm/tools/llvm-dwp/llvm-dwp.cpp
160

Yes right, I added the assertion because no failure is expected, and you're right that LLVM would throw an error if that's not checked. That makes sense to remove the Error parameter in that case, thanks!

273

Thanks for pointing this out! I'm currently working on a patch to add support for type units, and also for writing v5 cu/tu index sections.

Thanks a lot for the reviews!

After reading up more on the DWARF specs, I noticed that debug abbrev offsets were not updated for dwarf 32-bit, and 64-bit. I've changed that now in the latest patch.

Pretty sure the dwp code doesn't handle DWARF64 at all - might be best to leave that code to a separate patch with dedicated testing. (if you really want to include it in this patch - please include tests for it)

kimanh updated this revision to Diff 340441.Apr 26 2021, 12:44 AM
  • Removing 64-bit DWARF parts
kimanh marked 2 inline comments as done.EditedApr 26 2021, 12:48 AM

Ok, sounds good, removed the 64-bit DWARF parts now. If everything looks fine, could you help me to land this patch?

This revision was automatically updated to reflect the committed changes.

Thanks for the review and the help with landing the patch :)!