This is an archive of the discontinued LLVM Phabricator instance.

[Support] Make DataExtractor error messages more clear
ClosedPublic

Authored by labath on Apr 21 2020, 5:46 AM.

Details

Summary

This is a result of the discussion at D78113. Previously we would be
only giving the current offset at which the error was detected. However,
this was phrased somewhat ambiguously (as it could also mean that end of
data was at that offset). The new error message includes the current
offset as well as the extent of the data being read.

I've changed a couple of file-level static functions into private member
functions in order to avoid passing a bunch of new arguments everywhere.

Diff Detail

Event Timeline

labath created this revision.Apr 21 2020, 5:46 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2020, 5:46 AM
jhenderson added inline comments.Apr 21 2020, 6:09 AM
llvm/lib/Support/DataExtractor.cpp
27

My instinct says that this needs to be Data.size() + Offset. I'm reading some of the changed messages in the tests and getting confused by how they line up with the range Example: "offset 0x48 while reading [0xbadbeef, 0xbadbef7)" - what is the offset 0x48 referring to? I think "offset 0xbadbef6 while reading..." (or whatever the value actually should be) might be clearer.

llvm/test/DebugInfo/X86/dwarfdump-debug-loc-error-cases2.s
16

Does this error message need updating?

llvm/test/DebugInfo/X86/dwarfdump-debug-loclists-error-cases2.s
16

Same as above.

jhenderson added inline comments.Apr 21 2020, 6:18 AM
llvm/include/llvm/Support/DataExtractor.h
696

Any particular reason this is protected, not private?

labath marked an inline comment as done.Apr 21 2020, 6:39 AM
labath added inline comments.
llvm/lib/Support/DataExtractor.cpp
27

Yeah, that one is funny. What happens there is that the very first read starts at a completely bogus offset. In this case that happens because the value of DW_AT_location is (deliberately) corrupted, but it could happen when following other kinds of indirect references (e.g. DW_FORM_strp), depending on how exactly is that implemented.

What would you say to checking for Data.size() < Offset and emitting a different error message in that case ("reading offset 0xdeadbeef beyond the end of data at 0x48") ?

labath marked an inline comment as done.Apr 21 2020, 6:41 AM
labath added inline comments.
llvm/include/llvm/Support/DataExtractor.h
696

There is, though it's fairly arbitrary -- it seemed like it could be useful to subclasses like DWARFDataExtractor.

jhenderson added inline comments.Apr 21 2020, 7:06 AM
llvm/include/llvm/Support/DataExtractor.h
696

Fair enough. I'll leave it up to you to decide what would be best.

llvm/lib/Support/DataExtractor.cpp
27

That seems like a reasonable improvement. It might mean I'm less likely to get completely thrown by the error message in the future!

labath marked 7 inline comments as done.May 14 2020, 8:53 AM
labath added inline comments.
llvm/include/llvm/Support/DataExtractor.h
696

Actually, I'll just make it private. It's easy to make it protected if needed.

labath updated this revision to Diff 264010.May 14 2020, 8:53 AM
labath marked an inline comment as done.
labath removed a subscriber: cmtice.
  • impove error message, update test expectations

Sorry about the delay.

jhenderson added inline comments.May 15 2020, 1:02 AM
llvm/lib/Support/DataExtractor.cpp
30

FWIW, I'm not convinced illegal_byte_sequence is really the right thing to use. I had a discussion with someone else recently about this, but in every example usage I found online outside LLVM, it is for unicode encoding issues. The standard is silent on the exact purpose of it so the argument is that it's perfectly reasonable to use it for other bad sequences. I personally would use invalid_argument here, since you're trying to read using an invalid Offset.

35–38

If I'm not mistaken, this case needs a unit test?

labath marked 3 inline comments as done.Jun 1 2020, 7:05 AM
labath added inline comments.
llvm/lib/Support/DataExtractor.cpp
30

Yes, invalid_argument looks better here.

35–38

Test added.

labath updated this revision to Diff 267609.Jun 1 2020, 7:05 AM
labath marked an inline comment as done.
  • change errc value
  • add unit test
jhenderson accepted this revision.Jun 2 2020, 1:25 AM

LGTM, with one suggestion.

llvm/lib/Support/DataExtractor.cpp
30

Looks like you only changed the second case here? Potentially, Size could be invalid causing the error. What do you think?

This revision is now accepted and ready to land.Jun 2 2020, 1:25 AM
labath marked an inline comment as done.Jun 2 2020, 3:42 AM
labath added inline comments.
llvm/lib/Support/DataExtractor.cpp
30

Yes, that was intentional. I don't think we should blame the Size variable here. Since that is a long lived member variable, I think we should be assuming that it is correct. The first thing to blame should be the Offset argument, but in this case that also looks plausible, so we blame the data contents. For that illegal_byte_sequence seems appropriate (with the unicode caveat).

I think this is similar to how e.g. passing a bogus pointer to a system call results in EFAULT, but passing a pointer to incorrect data results in a more specific error code.

jhenderson added inline comments.Jun 2 2020, 3:56 AM
llvm/lib/Support/DataExtractor.cpp
30

Okay, happy either way.

This revision was automatically updated to reflect the committed changes.