DWARFListTableHeader::length() handles the zero value of HeaderData.Length in a special way, which makes the result different from the calculated value of FullLength, which leads to triggering the assertion. The patch moves the assertion a bit later when FullLength is already checked for minimal allowed value.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/DebugInfo/DWARF/DWARFListTable.cpp | ||
---|---|---|
38 | seems reasonable. Could you add an && "some explanation" to the assert that explains what went wrong? |
llvm/test/DebugInfo/X86/dwarfdump-rnglists-format-mix.s | ||
---|---|---|
1 ↗ | (On Diff #274506) | I think you can get rid of the "with enabled assertions" comment here. The test also shows that llvm-dwarfdump without assertions doesn't crash! As this is testing the library code, rather than something to do with llvm-dwarfdump specifically, perhaps you can just drop dwarfdump- from the test name. |
5–7 ↗ | (On Diff #274506) | Two thoughts, and not asking you to change anything if you prefer the current approach:
# RUN: llvm-mc -triple x86_64 %s -filetype=obj -o - | \ ## Indentation shows this is a continuation; llvm-dwarfdump shows it's a new executable being called # RUN: not llvm-dwarfdump -debug-info - 2>&1 | \ # RUN: FileCheck %s |
6 ↗ | (On Diff #274506) | Do you actually need to dump the debug info, or could you just dump the .debug_rnglists section explicitly? That would allow you to simplify your test input accordingly. |
- Added an explanatory comment to the assert.
- Simplified the test.
llvm/test/DebugInfo/X86/dwarfdump-rnglists-format-mix.s | ||
---|---|---|
1 ↗ | (On Diff #274506) |
I'll update the wording, thanks!
We have a group of tests here with assembly inputs that check various aspects of the library. I wanted the new test to be in the same group, next to other tests which also start with "dwarfdump-rnglists". |
5–7 ↗ | (On Diff #274506) | Will do. Thanks. |
6 ↗ | (On Diff #274506) | The test was based on the sample I run into when worked on producing DWARF64 debugging info. Probably you are right, the test may be simplified to check only the specific issue. |
Is this assertion really valuable? "length()" does the same addition that is done to create the value that length() is being compared to - it's pretty trivial? Perhaps "FullLength" should be initialized with a call to length instead of duplicating that addition?
In that case, we have a small inconsistency in the error message. In most cases, we report the full length of the data, which is the value of the unit length field plus the size of the field itself, but for the unit length of 0, we would not add the size of the field and would report only the raw value. This inconsistency is the only reason we calculate FullLength directly.
Is that difference necessary? I tried removing the length == 0 special case from "length()" and no tests fail. Perhaps we could go that route instead?
For example, dumpRnglistsSection() in DWARFContext.cpp terminates the loop when length() returns 0. With a specially constructed input, your variant would result in several additional unsuccessful reads with additional error messages:
.section .debug_rnglists,"",@progbits .long 0xffffffff .long 0xffffffff .byte 0xff ... error: parsing .debug_rnglists table at offset 0x0: unexpected end of data at offset 0xb while reading [0x4, 0xc) error: parsing .debug_rnglists table at offset 0x4: unexpected end of data at offset 0xb while reading [0x8, 0x10) error: parsing .debug_rnglists table at offset 0x8: unexpected end of data at offset 0xb while reading [0x8, 0xc)
Ah, thanks! Would be handy to have a test case for that & perhaps some other way to communicate "end of list" that's a bit more explicit?
Hmm, I'm not sure why this produce the repetition - if length() accurately returned the length that was read rather than zero, then it'd go to the end and stop, right?
For my understanding, that is not yet broken, so does not need to be fixed.
Hmm, I'm not sure why this produce the repetition - if length() accurately returned the length that was read rather than zero, then it'd go to the end and stop, right?
0xffffffff is a DWARF64 mark, so than it is read, the library expects to read the next 8 bytes.
I'm not suggesting it needs to be fixed - but that that codepath (the one that returns zero) is untested - so when it was committed, it was committed without test coverage. It'd be good to add test coverage where it is missing like this.
Hmm, I'm not sure why this produce the repetition - if length() accurately returned the length that was read rather than zero, then it'd go to the end and stop, right?
0xffffffff is a DWARF64 mark, so than it is read, the library expects to read the next 8 bytes.
Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?
Isn't adding that test coverage orthogonal to this particular patch?
Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?
I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me, but you may provide the patch if you feel like the code will be improved.
Oh yeah, for sure! Didn't mean to suggest it should go here!
Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?
I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me,
Not sure I follow - 10 bytes of the length (as many bytes as were available to read) were read from the section, right? Ah, you mean the length bytes did not describe a length of 10, but 10 bytes were read. *nod* I can see how that's a bit ambiguous, indeed.
Well, I've created D83673 to add some tests. Please, take a look.
Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?
I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me,
Not sure I follow - 10 bytes of the length (as many bytes as were available to read) were read from the section, right? Ah, you mean the length bytes did not describe a length of 10, but 10 bytes were read. *nod* I can see how that's a bit ambiguous, indeed.
In fact, these 10 bytes even were never read. Only the first 4 bytes were actually read, after which the check for insufficient space triggered an error. Also note, that getInitialLength() preserves the value of Offset in case of an error, not sets it to the position of the error.
Looks good! (though perhaps a full functionality test would be good - but I guess if we end up changing this API contract at some point, the test'll fail & hopefully someone'll then look at the end-to-end situation)
Right - but what I mean is if there's only 10 bytes, as in your example - it reads the 4 bytes of DWARF64 mark, then 6 bytes out of the desired 8 - if the length was then reported as 10 (with an error saying the length was garbled/the contents terminated earlier than expected), would that be adequate to no longer need the zero length special case?
I am OK with the current convention that if it is not possible to read the length field the code returns zero as the total length. It might be better to make the result Optional and return None in that case, but I really doubt it is worth investing time in that. Reporting something that was not read from the section (10 in your example) smells not good for me,
Not sure I follow - 10 bytes of the length (as many bytes as were available to read) were read from the section, right? Ah, you mean the length bytes did not describe a length of 10, but 10 bytes were read. *nod* I can see how that's a bit ambiguous, indeed.
In fact, these 10 bytes even were never read. Only the first 4 bytes were actually read, after which the check for insufficient space triggered an error. Also note, that getInitialLength() preserves the value of Offset in case of an error, not sets it to the position of the error.
*nod* sure enough - got some mildly mixed feelings still, but seems OK. Thanks for walking me through it!
seems reasonable. Could you add an && "some explanation" to the assert that explains what went wrong?