Currently llvm-dwp only handled DW_FORM_string and DW_FORM_GNU_str_index; with this patch it also starts to handle DW_FORM_strx[1-4]?
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
7–21 | This looks a bit more complicated than it needs to be - you can dwp any two .o files (so you don't need headers, function calls/cross references, etc), or even a single .o file (& a singel one might be sufficient for the purposes of this test case). | |
26–30 | What sort of incompletenes problems does this test case trip over? What /should/ be tested but isn't because it's broken in other ways? It might be that this patch should be held off on until otehr fixes are made so we're not adding code that can't be well tested, etc. |
Simplified the test. llvm-dwp and llvm-dwarfdump works fine for this simple test, but in any case I don't want to overspecify the output.
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
26–30 | IIRC, dwarfdump had a non-fatal complaint, but they both work fine for this simple test. Still, I don't want to overspecify the output (and I don't want to make the test depend too much on llvm-dwarfdump), so I think just checking that llvm-dwp works correctly and the output has a DW_FORM_strx1 should suffice? |
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
26–30 |
Not overspecifying is a worthy goal (& we do lots of things to ensure that - symbolic matches, etc) - though llvm-dwarfdump itself exists predominantly for writing test cases, so that dependence in general is intentional/not something to avoid.
Specifically the reason llvm-dwp tries to read strings is to produce error messages using the DW_AT_name and DW_AT_dwo_name - so it's probably appropriate to specifically test that such an error message has the right string value (that the parsing was correct, not just that that codepath didn't produce an error). See llvm/test/tools/llvm-dwp/X86/duplicate.test for existing test coverage. Also, though I don't feel too strongly about this, I think the current leaning is to write such tests in assembly (though still include source/compile commands, etc - but just not checking in binary object files) and have the test use llvm-mc to assemble them within the test. |
- Introduce a CompileUnitHeader structure, it is needed to know the dwarf version, because, for example, debug_str_offsets sectoins have a header only when dwarf version is 5.
- Introduce a function to parse CompileUnitHeader.
- Introduce a function to compute the size of debug_str_offsets section headers.
- Fix writeStringsAndOffsets to copy debug_str_offsets section headers correctly.
- Get getIndexedString funciton to handle DW_FORM_strx forms correctly.
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
26–30 | OK, it turns out debug_str_offsets section had problems with respect to header parsing, and dwp did not produce correct strings. With my latest changes, I do see correct strings now. Almost all tests in llvm/test/tools/llvm-dwp/X86 start from a .dwo file, and I'd like to do it the same way, if that's ok with you. |
Ah, looks like this overlaps a bit with https://reviews.llvm.org/D74312 - might be worth checking there if there's any necessary issues (rather than orthogonal pieces) that patch addresses that this one doesn't, etc.
Thank you, I was not aware https://reviews.llvm.org/D74312 ; I looked at it, yes it seems this patch and that mostly solves the same problems; it seems parts of that patch have already been checked in (maybe as another patch?) but other parts have not seen action for some time. I added the author @SouraVX as a reviewer for comment.
Hi thanks for doing this, I was occupied by my other reviews so the debug_str_offsets changes didn't went in. D74425 only addresses info section(Mostly the changes in Header) parsing and writing to dwp.
BTW, have you tested the resultant dwp for debuggability with GDB ?? I also tried to apply locally D74312(info + str_offsets support), As I recall loading of dwp was fine. But their were some complaints reported when you try running/set break etc. If possible can you confirm this ?
Your changes LGTM! but you can wait for @dblaikie comments/approval.
BTW, have you tested the resultant dwp for debuggability with GDB ?? I also tried to apply locally D74312(info + str_offsets support), As I recall loading of dwp was fine. But their were some complaints reported when you try running/set break etc. If possible can you confirm this ?
Hi, with this patch I am able to run gdb successsfully on very basic examples. But I am using a custom gdb, that includes some pending patches upstream. llvm-dwp does not have support for loclists and range lists, and I'd not be surprised if there are other problems I am yet unaware of. I am planning to tackle loclists and range lists next, so I don't think we should expect a good gdb experience with .dwp yet.
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
9–12 | I'd probably simplify this to something like: void f1() { } That'll be shorter DWARF and machine code (no need for a function call, a return type, etc). | |
14 | Remove this line that's no longer needed/relevant. | |
26–30 | Could you add a DWARFv5 variation of the duplicate.test (either in that test, or in another/similar one alongside) - this will test that the strings are being parsed correctly by llvm-dwp when it needs to parse them (to extract the dwo_name and source file name). | |
llvm/tools/llvm-dwp/llvm-dwp.cpp | ||
192 | Any idea if this assertion could be hit by malformed input? If that's the case it shouldn't be an assert & should be full error handling - but you could leave it as a FIXME for now. |
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
26–30 | I added string parsing to this test. |
llvm/tools/llvm-dwp/llvm-dwp.cpp | ||
---|---|---|
137 | That's probably a fairly opaque error message compared to returning Err here (possibly extracting the string from Err and adding a prefix, one example of that is here: https://github.com/llvm-mirror/clang/blob/master/lib/Tooling/Execution.cpp#L76 - or the Error could be passed up unmodified, potentially, though that might give a less legible message to the user). Same goes for other error failures in this function. Also, and I know it's annoying to make them, but each of these failure paths should have tests. |
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
2 | How about instead of putting raw objects files dw5.dwo, put it as asm file and use llvm-mc to generate object and llvm-dwarfdump and so forth. |
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
2 | This was already discussed here: https://reviews.llvm.org/D75485#inline-688717 |
Picking up patch and updating outstanding issues:
- Adding test cases for each error case in parseCompileUnitHeader
- Changing the error message to reuse the message from Error in parseCompileUnitHeader
- Adapted tests (minor change: test error message, and added required header info to unsupported_cu_index_version)
Answering leftover TODOs except of the FIXME one.
This change also was rebased to origin/main, so some changes to llvm-dwp are highlighted that are not mine (lightgreen).
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
2 | I kept tamur's test case as discussed in https://reviews.llvm.org/D75485#inline-688717 but used .s files for the new test cases. Let me know if you want me to change anything. |
llvm/test/tools/llvm-dwp/X86/invalid_cu_header_unittype.s | ||
---|---|---|
5 ↗ | (On Diff #338007) | The other messages are specific about what they can't parse - the unit version or the unit length. Hmm, I see because this uses the generic "check the error after parsing a bunch of fields". Maybe it'd be worth checking that the length is potentially correct? That the length is long enough to include the header, and that the section is long enough to fit the length that's been parsed? |
llvm/tools/llvm-dwp/llvm-dwp.cpp | ||
125 | I think the norm is probably not to include a newline in the error string - but wherever it's printed can choose where the newline should go. |
Addressing reviewer comments:
- adding test on the header length, which changed test set
- removing accidentally added line in test
- removing obsolete comment
- update error messages to start with lower letter, and no new line
Seems OK, I think. Few minor wording/coding changes to deal with.
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
2 | Yeah, I wouldn't mind if tamur's test was changed to .s, but I don't insist on it. | |
llvm/tools/llvm-dwp/llvm-dwp.cpp | ||
125 | This suggests the header itself is too large for the debug_info section, but what it's testing is that the length specified in the header is too large - so maybe rephrasing this to something like: "compile unit exceeds .debug_info section range"? | |
160 | This is an assertion because the length has been checked ahead of time, yeah? So there's no failure expected? I guess the assertion was added to suppress an llvm::Error about the Error being unchecked? The problem is that in a non-assertions build that checking will fail because the Error won't be checked because the assertion isn't executed. Perhaps it'd be OK to call the extraction functions without an Error parameter? (getU32(&Offset) etc) since the bounds have already been checked | |
273 | (are you using/interested in DWARF type units (-fdebug-types-section)? In DWARFv5 those were moved into the .debug_info section, so this would need to be updated to handle them at some ponit) |
- removed Err parameter after already checking the header size
- updated error message for compile unit length
- adapted abbrev offset size (and thus also header size) for dwarf32/dwarf64
Thanks a lot for the reviews!
After reading up more on the DWARF specs, I noticed that debug abbrev offsets were not updated for dwarf 32-bit, and 64-bit. I've changed that now in the latest patch.
llvm/test/tools/llvm-dwp/X86/handle_strx.test | ||
---|---|---|
2 | Alright, in that case I prefer to keep it. I'll be updating the code again for type units, so in case your opinion changes I'll take it on in a follow up patch! | |
llvm/tools/llvm-dwp/llvm-dwp.cpp | ||
160 | Yes right, I added the assertion because no failure is expected, and you're right that LLVM would throw an error if that's not checked. That makes sense to remove the Error parameter in that case, thanks! | |
273 | Thanks for pointing this out! I'm currently working on a patch to add support for type units, and also for writing v5 cu/tu index sections. |
Pretty sure the dwp code doesn't handle DWARF64 at all - might be best to leave that code to a separate patch with dedicated testing. (if you really want to include it in this patch - please include tests for it)
Ok, sounds good, removed the 64-bit DWARF parts now. If everything looks fine, could you help me to land this patch?
How about instead of putting raw objects files dw5.dwo, put it as asm file and use llvm-mc to generate object and llvm-dwarfdump and so forth.