This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] Fix buffer overflow problem in DWARFExpression::Evaluate.
ClosedPublic

Authored by cmtice on Jun 27 2023, 12:36 AM.

Details

Summary

In two calls to ReadMemory in DWARFExpression.cpp, the buffer size passed to ReadMemory is not actually the size of the buffer (I suspect a copy/paste error where the variable name was not properly updated). This caused a buffer overflow bug, which we found through Address Sanitizer. This patch fixes the problem by passing the correct buffer size to the calls to ReadMemory (and to the DataExtractor).

Diff Detail

Event Timeline

cmtice created this revision.Jun 27 2023, 12:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 27 2023, 12:36 AM
cmtice requested review of this revision.Jun 27 2023, 12:36 AM
kastiglione accepted this revision.Jun 28 2023, 10:39 AM
kastiglione added inline comments.
lldb/source/Expression/DWARFExpression.cpp
1141

could use sizeof to get greater assurance the size matches the buffer.

This revision is now accepted and ready to land.Jun 28 2023, 10:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 28 2023, 11:48 AM

I updated the version that I committed to use 'sizeof' as recommended.

I'm not sure if this is the right fix - these reads are for implementing DW_OP_deref_size, by the looks of it - so I think it does make sense that the size read is not the size of the address, but the size specified in the DW_OP_deref_size. There is a requirement that DW_OP_deref_size's size may not be larger than the system address - so maybe the input that hit this is incorrect, and lldb should've failed earlier (validating the size retrieved at line 1082 is within the bounds)?

I have to agree with David, I don't see how the old code could overflow if DW_OP_deref_size's maximum size is the pointer size in the target, and we are reading it in to an 8-byte buffer, unless the target had addresses larger than 8 bytes, or the dwarf was malformed. The DWARF5 doc says

"In the DW_OP_deref_size operation, however, the size in bytes of the data retrieved from the dereferenced address is specified by
the single operand. This operand is a 1-byte unsigned integral constant whose value may not be larger than the size of the generic type. The data retrieved is zero extended to the size of an address on the target machine before being pushed onto the expression stack"

If there was an invalid dwarf input file that had a DW_OP_deref_size with a size of 16, we will only read the first 8 bytes of it into our buffer, and then try to extract 16 bytes with a call to DerefSizeExtractDataHelper, which will only read an Address sized object out of the buffer so it won't exceed the length.

I think this change may be hiding a real bug in the DWARF @cmtice ? Do you have a record of what the input file that caused the address sanitizer to trip was?

cmtice marked an inline comment as done.Jul 5 2023, 9:39 AM

Hi Jason,

I had been talking more with David, and yes, I had come to the conclusion that you are both right and that this was not the right fix. I am planning on reverting this, but I am trying to figure out the right fix to replace it with. I can't share the source that was causing the bug to manifest, because it's in proprietary code, but David is looking at it and I believe he has come to the conclusion that there is a bug in the DWARF code generation -- we were getting a size of 16, which is absolutely not right. The question is, in the case of bad DWARF being generated, what (if anything) should the LLDB code here be doing? Should we check the size as soon as we read it in, and assert that it must be <= 8? Or something else? Or just leave the LLDB code entirely alone?

What do you (and other reviewers) think is the right thing to do here?

Hi Jason,

I had been talking more with David, and yes, I had come to the conclusion that you are both right and that this was not the right fix. I am planning on reverting this, but I am trying to figure out the right fix to replace it with. I can't share the source that was causing the bug to manifest, because it's in proprietary code, but David is looking at it and I believe he has come to the conclusion that there is a bug in the DWARF code generation -- we were getting a size of 16, which is absolutely not right. The question is, in the case of bad DWARF being generated, what (if anything) should the LLDB code here be doing? Should we check the size as soon as we read it in, and assert that it must be <= 8? Or something else? Or just leave the LLDB code entirely alone?

What do you (and other reviewers) think is the right thing to do here?

While it's likely generally under-tested/under-covered, debuggers shouldn't crash on invalid/problematic DWARF. So this code should probably abort parsing an invalid expression like this - there's probably some codepaths through here that do some kind of error handling like the "Failed to dereference pointer from" a few lines later. I'd expect a similar error handling should be introduced if size is invalid (not sure if "invalid" should be > sizeof(lldb::addr_t) or maybe something more specific (like it should check the address size in the DWARF, perhaps - I don't know/recall the /exact/ DWARF spec wording about the size limitations)).

lldb/source/Expression/DWARFExpression.cpp
1088

FWIW, I think this code is differently invalid, but figured I'd mention it while I'm here - this is probably illegal load widening. If we're processing DW_OP_deref_size(1) maybe that 1 byte is at the end of a page - reading sizeof(void*) would read more data than would be valid, and be a problem?