Page MenuHomePhabricator

Fix dumping of characters with non-standard sizes

Authored by petpav01 on Sep 29 2017, 2:50 AM.


  • Prevent dumping of characters in DumpDataExtractor() with item_byte_size bigger than 8 bytes. This case is not supported by the code and results in a crash because the code calls DataExtractor::GetMaxU64Bitfield() -> GetMaxU64() that asserts for byte size > 8 bytes.
  • Teach DataExtractor::GetMaxU64(), GetMaxU32() and GetMaxS64() how to handle byte sizes that are not a multiple of 2. This allows DumpDataExtractor() to dump characters and booleans with item_byte_size in the interval of [1, 8] bytes. Values that are not a multiple of 2 would previously result in a crash because they were not handled by GetMaxU64().

Examples of two commands that previously resulted in a crash when debugging an AArch64 target, and their new behaviour:

(lldb) register read --format character v0
      v0 = error: unsupported byte size (16) for char format
(lldb) memory read --format boolean --size 7 $sp
0x7ffffffd70: false
0x7ffffffd77: false

Diff Detail

Event Timeline

petpav01 created this revision.Sep 29 2017, 2:50 AM
jingham edited edge metadata.Sep 29 2017, 4:20 PM

This looks fine to me. I'd give Greg a little time to weigh in, this is much more his code than mine. But I don't see any problem with this, and thanks for the tests!


Should this consume the weird input we couldn't print? I actually don't have a good feel for which would be better.


This is trivial, and you didn't change what was there, but this message makes it sound like this is just something we haven't gotten to yet. It's really "You passed in an illegal byte size"... Might be clearer if the message said that.

jingham accepted this revision.Sep 29 2017, 4:20 PM
This revision is now accepted and ready to land.Sep 29 2017, 4:20 PM

Thank you for the initial review.


The behaviour implemented in DumpDataExtractor() for other formats, such as eFormatBoolean or eFormatComplexInteger, is to report an error and do not advance the offset. The approach that the patch takes is to make eFormatChar (and its variants) consistent with this behaviour.


I was not sure what is the expected behaviour when the input byte_size exceeds the size of the return type of each of these GetMax...() methods. The current behaviour is to assert this situation but comments describing the methods (in both DataExtractor.cpp and DataExtractor.h) say that nothing should get extracted in these cases and zero is returned.

Maybe the patch should go a bit further and clean this up as follows:

  • Remove duplicated comments in DataExtractor.cpp for DataExtractor::GetMaxU32() and GetMaxU64() and keep only their Doxygen versions in DataExtractor.h.
  • Update comments in DataExtractor.h for DataExtractor::GetMaxU32(), GetMaxU64(), GetMaxS64(), GetMaxU64Bitfield() and GetMaxS64Bitfield() to match the current implementation.
  • Change assertion text in DataExtractor::GetMaxU32() and GetMaxU64() from "unhandled case" to "invalid byte size".

Does this sound reasonable?

See inlined comments.


The doc's say an error in the size "will result in nothing being extracted". That seems like a roundabout way of saying that offset won't be advanced, but I probably should have figured it out from there. Thanks for checking on the error behavior of the other dumpers! You are doing the right thing.


The released versions of lldb - at least the ones Apple releases - have asserts disabled. This isn't unique to lldb, clang does the same thing.

I do my day-to-day debugging using a TOT build with asserts enabled, and we run the testsuite that way so the asserts catch errors at this stage. But for the general public, the function will behave as described. It would be great to remove the duplicated docs - that's just begging for one or the other to get out of date. But the descriptions are functionally correct. And then changing the text to "invalid byte size" also seems good to me.

zturner added a subscriber: zturner.Oct 3 2017, 9:16 AM
zturner added inline comments.

Being pedantic, this *is* a functionality change. Previously, we would assert on a size of 3 or 0, with this change we will allow those cases through.

petpav01 added inline comments.Oct 4 2017, 2:49 AM

To explain myself better, what I was thinking about is that e.g. GetMaxU64() should have part:

"\a byte_size should have a value greater than or equal to one and less than or equal to eight since the return value is 64 bits wide. Any \a byte_size values less than 1 or greater than 8 will result in nothing being extracted, and zero being returned."

changed to:

"\a byte_size must have a value greater than or equal to one and less than or equal to eight since the return value is 64 bits wide. The behaviour is undefined for any \a byte_size values less than 1 or greater than 8."

This way the comment provides information that does not depend on whether assertions are enabled or not. The behaviour for byte_size > 8 is said to be undefined in the updated description because it either results in an assertion failure or some undefined behaviour if asserts are disabled.

If the behaviour for byte_size > 4/8 with assertions disabled should actually be that these methods still return 0 and do not advance the offset then the patch has two bugs:

  • The general case added in GetMaxU64() is not correct. It returns an unexpected value for byte_size > 8 and advances the offset.
  • GetMaxU32() needs to have if (byte_size > 4) return 0; added before it calls GetMaxU64() to avoid the same problem for any byte_size > 4.

An additional thing is that the patch causes that byte_size == 0 is now fully valid and does not assert. This might not be the best idea given that the current descriptions say that byte_size values should be in interval [1, 4/8]. I will add the assertion for byte_size == 0 back in the updated patch so the changes affect/enable only byte_size in range [1, 4/8] (which are clear to be valid) and the zero corner case has its behaviour unchanged.

clayborg edited edge metadata.Oct 9 2017, 8:20 AM

Looks good. Would be nice to add support for byte sizes of 3, 5 and 7 to the unchecked version as noted in inline comments, or remove the function if no one is using this function. Just a few quick fixes and this will be good to go.


use lldbassert if the function will function correctly with the assert removed. I know the previous code was always asserting, but we should change it to use lldbassert to make sure we don't crash the debugger in release builds.


Shouldn't we handle the 3, 5 and 7 sizes here too?


add a test for the unchecked version here?

petpav01 updated this revision to Diff 118375.Oct 10 2017, 6:33 AM

Updated patch contains the following changes:

  • Remove comments for DataExtractor::GetMaxU32() and GetMaxU64() from DataExtractor.cpp and keep only the Doxygen ones in the header file.
  • Restore assertion for byte_size == 0 in GetMaxU32() and GetMaxU64(), change the assert text from "unhandled case" to "invalid byte_size" and replace assert() by lldbassert().
  • Update Doxygen documentation for GetMaxU32(), GetMaxU64(), GetMaxS64(), GetMaxU64Bitfield() and GetMaxS64Bitfield() to say that byte_size must be in interval [1,4/8] and remove that the methods return 0 if byte_size is bigger than 4/8 because that no longer holds. The patch retains the behaviour that LLDB does not crash in such cases but the returned value can be arbitrary. Note: This is something that I am not certain if I addressed properly. It seems to me this should be ok because the code now uses lldbassert() and so there will be always some error that something went wrong. An alternative is to add extra code that checks for byte_size > 4/8 and returns 0 in such cases. Please let me know if that would be preferred.
  • Enable GetMaxU64_unchecked() to also handle any byte_size in range [1,8] and add testing for this method in DataExtractorTest.cpp.
clayborg accepted this revision.Oct 10 2017, 8:04 AM

Looks good

Thanks, will commit shortly.

petpav01 closed this revision.Oct 11 2017, 1:56 AM

Landed as r315444. Closing manually because I forgot to include "Differential Revision: ..." in the commit message.