This is an archive of the discontinued LLVM Phabricator instance.

Differential D69273

ValueObject: Fix a crash related to children address type computation
Closed · Public

Authored by labath on Oct 21 2019, 10:55 AM.

Details

Reviewers

jingham
clayborg

Commits

rG96601ec28b7e: ValueObject: Fix a crash related to children address type computation

Summary

This patch fixes a crash encountered when debugging optimized code. If a
variable has been completely optimized out, but its value is nonetheless known,
the compiler can replace it with a DWARF expression computing its value.
Evaluating such an expression results in an eValueTypeHostAddress Value object, as
its contents are computed into an lldb buffer. However, any value that is
obtained by dereferencing pointers in this object should no longer have the
"host" address type.

Lldb had code to account for this, but it was only present in the
ValueObjectVariable class. This wasn't enough when the object being described
was a struct, as then the object holding the actual pointer was a
ValueObjectChild. This caused lldb to dereference the contained pointer in the
context of the host process and crash.
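
To make the "computed into an lldb buffer" point concrete, here is a minimal sketch of a piece-composing evaluator. It is not lldb's DWARF expression evaluator: it handles only the three opcodes that appear in the location expression of the test added by this revision (opcode 0x0e with its 8-byte operand, which the DWARF standard names DW_OP_const8u, plus DW_OP_stack_value and DW_OP_piece). The names EvaluatePieces and kResetLocation are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Opcode values as defined by the DWARF standard.
constexpr uint8_t DW_OP_const8u = 0x0e;
constexpr uint8_t DW_OP_piece = 0x93;
constexpr uint8_t DW_OP_stack_value = 0x9f;

// Hypothetical helper: evaluates an expression made only of
// const8u / stack_value / piece and returns the composed bytes.
// The result is host memory; nothing here reads from a debuggee.
std::vector<uint8_t> EvaluatePieces(const std::vector<uint8_t> &expr) {
  std::vector<uint8_t> host_buffer;
  uint64_t top = 0; // a one-entry "stack" is enough for this sketch
  size_t i = 0;
  while (i < expr.size()) {
    const uint8_t op = expr[i++];
    switch (op) {
    case DW_OP_const8u:
      assert(i + 8 <= expr.size());
      top = 0;
      for (int b = 0; b < 8; ++b) // little-endian operand
        top |= static_cast<uint64_t>(expr[i + b]) << (8 * b);
      i += 8;
      break;
    case DW_OP_stack_value:
      // The value is the stack top itself; there is no address to read.
      break;
    case DW_OP_piece: {
      // The operand is ULEB128; this sketch assumes it fits in one byte.
      assert(i < expr.size() && expr[i] < 0x80);
      const uint8_t size = expr[i++];
      assert(size <= 8);
      for (int b = 0; b < size; ++b)
        host_buffer.push_back(static_cast<uint8_t>(top >> (8 * b)));
      break;
    }
    default:
      assert(false && "opcode not handled by this sketch");
    }
  }
  return host_buffer;
}

// Location expression of the variable "reset" in the test case.
const std::vector<uint8_t> kResetLocation = {
    0x0e, 0x0d, 0xf0, 0xad, 0xba, 0xef, 0xbe, 0xad, 0xde, 0x9f, 0x93, 0x08,
    0x0e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x9f, 0x93, 0x08};
```

Calling EvaluatePieces(kResetLocation) yields a 16-byte buffer owned by the debugger. The first eight bytes hold a pointer value, but that pointer refers to the debuggee's address space, which is why its pointee must not be treated as a host address.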

Though I am not an expert on ValueObjects, it seems to me that this children
address type logic should apply to all types of objects (and indeed,
applying the same logic to ValueObjectChild fixes the crash). Therefore, I move
this code to the base class, and arrange for it to be run every time the value is
updated.
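
The rule being hoisted can be summarized as a pure function. The sketch below is a simplified model, not lldb code: the two enums are stand-ins for lldb's Value::ValueType and AddressType, and ChildrenAddressType is a hypothetical free function that mirrors the switch in the new ValueObject::UpdateChildrenAddressType().

```cpp
#include <cassert>

// Simplified stand-ins for lldb's enums (assumption: names shortened here).
enum class ValueType { Scalar, Vector, FileAddress, LoadAddress, HostAddress };
enum class AddressType { File, Load, Host };

// Hypothetical helper: which address type do the children of a value get?
AddressType ChildrenAddressType(ValueType value_type, bool is_pointer_or_ref,
                                bool process_is_alive) {
  switch (value_type) {
  case ValueType::FileAddress:
    // Following a pointer only lands in process memory if there is a process.
    return (process_is_alive && is_pointer_or_ref) ? AddressType::Load
                                                   : AddressType::File;
  case ValueType::HostAddress:
    // Struct members stay in the host copy; pointees were never copied.
    return is_pointer_or_ref ? AddressType::Load : AddressType::Host;
  case ValueType::LoadAddress:
  case ValueType::Scalar:
  case ValueType::Vector:
    return AddressType::Load;
  }
  assert(false && "unhandled ValueType");
  return AddressType::Load;
}
```

For the crashing case, the struct itself is a host value that is not a pointer, so its members are host addresses. The pointer member is a ValueObjectChild that is a host value of pointer type, so its pointee must be a load address. Before this patch that second step was skipped because the rule only ran for ValueObjectVariable.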

The test case is a reduced and simplified version of the original debug info
triggering the crash. Originally we were dealing with a local variable, but as
these require a running process to display, I changed it to use a global one
instead.
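
For orientation, the debug info in the test describes roughly the following C++. This is a reconstruction from the DWARF (type and member names, member offsets, the 16-byte size, and the two constant values), not the original source, which is not part of this review. The static_asserts assume an ABI with 8-byte pointers such as the x86_64 Linux target the test is assembled for.

```cpp
#include <cstddef>
#include <cstdint>

struct auto_reset {
  bool *ptr; // DW_AT_data_member_location 0
  bool prev; // DW_AT_data_member_location 8
};

// Layout facts taken from the test's DWARF; checked only on 64-bit targets.
static_assert(sizeof(void *) != 8 || sizeof(auto_reset) == 16,
              "DW_AT_byte_size of auto_reset is 16");
static_assert(sizeof(void *) != 8 || offsetof(auto_reset, prev) == 8,
              "prev follows the 8-byte pointer");

// In the test the variable has no storage at all; this global only shows the
// values that the location expression encodes.
auto_reset reset = {
    reinterpret_cast<bool *>(static_cast<std::uintptr_t>(0xdeadbeefbaadf00dULL)),
    false};
```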

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

labath created this revision. Oct 21 2019, 10:55 AM

Herald added a subscriber: aprantl. Oct 21 2019, 10:55 AM

Harbormaster completed remote builds in B39872: Diff 225921. Oct 21 2019, 10:55 AM

Looks good!

This revision is now accepted and ready to land. Oct 21 2019, 11:05 AM

Except for the inline note about the code comment, this looks fine.

I think the Host -> Load address code isn't quite right, though it looks like that's not your doing.

The main reason you would copy a ValueObject into Host memory is to freeze-dry it as a ConstResult. Since that is supposed to represent the object at that point in time, it should only be valid to ask the ValueObject to produce its children if the process is at the same StopID as when the ValueObject was made. Then the ValueObject system should fetch that extra data and include it in the freeze-dried object. But if the StopID has moved on, you should just give an error: "Can't travel back in time to fetch that extra data".

At the ValueObject level, however, we only know how the data is stored (in Host or Process) and not why. So it's harder to get this behavior right.

As I said, however, I don't think this planned design was ever carried out successfully, so I don't think this will have broken anything.
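
The intended freeze-dry behavior described above can be sketched with a toy model. Everything here is hypothetical (ToyProcess, FrozenPointer and FetchPointee are not lldb classes or methods); the sketch only shows the "same StopID or error" rule, and by the reviewer's own account this design was never fully implemented in lldb.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Toy process: some memory plus a counter that is bumped on every stop.
struct ToyProcess {
  uint32_t stop_id = 1;
  std::map<uint64_t, uint64_t> memory; // address -> 8-byte word
};

// Hypothetical frozen result: a copied pointer value plus the stop ID at
// which the copy was made.
class FrozenPointer {
public:
  FrozenPointer(const ToyProcess &process, uint64_t pointer_value)
      : m_capture_stop_id(process.stop_id), m_pointer_value(pointer_value) {}

  // Fetches the pointee. New data may only be read while the process is
  // still at the capture stop; data fetched earlier stays available.
  bool FetchPointee(const ToyProcess &process, uint64_t &out,
                    std::string &error) {
    if (m_has_pointee) {
      out = m_pointee; // already part of the frozen object
      return true;
    }
    if (process.stop_id != m_capture_stop_id) {
      error = "Can't travel back in time to fetch that extra data";
      return false;
    }
    auto it = process.memory.find(m_pointer_value);
    if (it == process.memory.end()) {
      error = "unreadable address";
      return false;
    }
    m_pointee = it->second;
    m_has_pointee = true;
    out = m_pointee;
    return true;
  }

private:
  uint32_t m_capture_stop_id;
  uint64_t m_pointer_value;
  bool m_has_pointee = false;
  uint64_t m_pointee = 0;
};
```

A frozen object that fetched its pointee before the process resumed keeps returning the old value; one that never did reports the error once the stop ID has changed.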

source/Core/ValueObject.cpp
Lines 162–163 (on Diff #225921): "remain a file address if it was a file address."?

davide added a subscriber: davide. Oct 24 2019, 12:30 PM

davide removed a subscriber: davide.

davide added a subscriber: davide.

Closed by commit rG96601ec28b7e: ValueObject: Fix a crash related to children address type computation (authored by labath). Oct 25 2019, 10:53 AM

This revision was automatically updated to reflect the committed changes.

labath marked an inline comment as done.

Herald added a project: Restricted Project. Oct 25 2019, 10:53 AM

In Swift you have to ask the runtime for most of the layout details of objects so getting a const result object, stepping, then asking it to reevaluate itself would lead to passing the runtime incorrect data and potentially undoing a correct type decision that you had made when you fetched the result on the first stop. That's what it sounded like from Davide's description, but I haven't looked at it more deeply yet.

All the ValueObjects have a ModID, so for const objects you should always check that before updating. I think the problem is that this just never happened before because the code was not in the ValueObjectConst inheritance path. By hoisting that code into ValueObject, now it was, but it was not correct to do that without consulting the ModID. Note that for cases like this we check the last user stop, not the actual last stop. We really can't avoid running expressions to get dynamic types and some other jobs, so we really want to treat running expressions as not changing the essential state of the program. This hasn't caused problems in practice, and if it were to become a problem we could classify expressions as state changing or not state changing, and move the user stop ID forward for the former as well...
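
The distinction between the last user stop and the actual last stop can be sketched as two counters. This is a toy model under stated assumptions: ToyModID and ToyConstResult are hypothetical names, and lldb's real bookkeeping differs in detail.

```cpp
#include <cstdint>

// Hypothetical counters for process stops.
struct ToyModID {
  uint32_t stop_id = 0;      // bumped on every stop, expression stops included
  uint32_t user_stop_id = 0; // bumped only when the user resumed the process

  void StopAfterUserResume() {
    ++stop_id;
    ++user_stop_id;
  }
  // Running an expression resumes and stops the process too, but is treated
  // as not changing the essential state of the program.
  void StopAfterExpression() { ++stop_id; }
};

// Hypothetical const result that remembers the user stop it was made at.
struct ToyConstResult {
  uint32_t captured_user_stop_id;

  // A const result may only refresh itself while the user-visible state is
  // still the one it was captured in.
  bool MayUpdate(const ToyModID &now) const {
    return now.user_stop_id == captured_user_stop_id;
  }
};
```

With this model, evaluating an expression after capturing a result does not invalidate it, while a step or continue does.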

In D69273#1732582, @jingham wrote:

In Swift you have to ask the runtime for most of the layout details of objects so getting a const result object, stepping, then asking it to reevaluate itself would lead to passing the runtime incorrect data and potentially undoing a correct type decision that you had made when you fetched the result on the first stop. That's what it sounded like from Davide's description, but I haven't looked at it more deeply yet.

All the ValueObjects have a ModID, so for const objects you should always check that before updating. I think the problem is that this just never happened before because the code was not in the ValueObjectConst inheritance path. By hoisting that code into ValueObject, now it was, but it was not correct to do that without consulting the ModID. Note that for cases like this we check the last user stop, not the actual last stop. We really can't avoid running expressions to get dynamic types and some other jobs, so we really want to treat running expressions as not changing the essential state of the program. This hasn't caused problems in practice, and if it were to become a problem we could classify expressions as state changing or not state changing, and move the user stop ID forward for the former as well...

Hmm.. that's interesting. But what is the thing that's causing the layout to be "reevaluated"? Is it the GetCompilerType() call on line 141? Should we maybe just have ValueObjectConstResult override GetCompilerType so that it "freeze-dries" the type too ? From your description, it sounds like that's exactly the semantics we want here...

This seems to have fizzled out without any sort of a conclusion.. Did you guys do anything about the crashes you were seeing on the swift side?

We've been off all the past week. I'll circle back with Jim about this once I get to the office.

In D69273#1765235, @davide wrote:

We've been off all the past week. I'll circle back with Jim about this once I get to the office.

Sorry, I've been busy with other things.

In answer to Pavel's direct question "What did we do about the swift crashes" the current answer is "we backed out the patch from the swift-lldb sources". But that's clearly not the good answer...

The thing that's surprising about the crash in the swift testsuite is that it happens because, in the process of building the dynamic ValueObject of the synthetic child for the expression result ValueObject for a simple swift expression, with this code in place one of the Values we are using that is in fact a load address type gets mislabeled as a host address, and then we crash accessing it in our address space. IIUC, that's the opposite of what this patch was trying to do...

BTW, I also think this patch is formally wrong for const result objects, because once the stop ID has moved on, const result object values should never be converted back to load address types. They are supposed to represent the state at the time of the capture, so checking the state of the process after that time can't be right. But that wasn't the failure we were seeing in the Swift testsuite.

ConstResult objects are still implicated in this crash, because "expr var" crashes but "frame var" for this swift variable works. The difference between the two cases is that in the succeeding case the root ValueObject is a ValueObjectVariable, and in the crashing case a ValueObjectConstResult. But it doesn't have to do with the const result getting into an inconsistent state because it updates itself when it shouldn't. It seems like there's just something in the logic of UpdateChildrenAddressType that is wrong for ValueObjectConstResult. But it seems to take a fairly complex chain of values - ConstResult->ConstResultChild->SyntheticValue->DynamicValue to trigger the crash, and the crash is actually in getting the backing data for a ValueObjectDynamicValue...

It will take me some more head scratching to figure out why this is going wrong but I'm currently in the process of hastening my eventual balding. Hopefully, once I've figured that out I can get a fix and if I'm lucky a C/C++ based test case that shows the same error.

Thanks for the update Jim. I'm putting some of my thoughts inline.

In D69273#1840029, @jingham wrote:

Sorry, I've been busy with other things.

In answer to Pavel's direct question "What did we do about the swift crashes" the current answer is "we backed out the patch from the swift-lldb sources". But that's clearly not the good answer...

The thing that's surprising about the crash in the swift testsuite is that it happens because, in the process of building the dynamic ValueObject of the synthetic child for the expression result ValueObject for a simple swift expression, with this code in place one of the Values we are using that is in fact a load address type gets mislabeled as a host address, and then we crash accessing it in our address space. IIUC, that's the opposite of what this patch was trying to do...

Yes, that is the opposite of what this is trying to do, though I can see how this could trigger something like that when synthetic children come into the game. The logic children_type = am_i_pointer ? load : host is only correct for real children, and if this somehow fires on synthetic children of a non-pointer ValueObject, but the children are actually obtained via dereferencing some pointers, then they might get mislabelled as host pointers.
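
The suspected failure mode can be written down as a small model. It is only an illustration of the hypothesis in the paragraph above, not a diagnosis of the Swift crash: ToyObject, ChildKind and the helper functions are hypothetical, and the parent is assumed to be a value whose bytes live in host memory.

```cpp
enum class AddressType { Load, Host };

// A host-resident parent value; only its static type matters here.
struct ToyObject {
  bool is_pointer;
};

enum class ChildKind {
  Member,    // real child: stored inside the parent's own bytes
  ViaPointer // synthetic child: found by following a pointer in the parent
};

// The rule quoted above: children_type = am_i_pointer ? load : host.
AddressType RuleForChildren(const ToyObject &parent) {
  return parent.is_pointer ? AddressType::Load : AddressType::Host;
}

// Where the child's bytes actually are in this model.
AddressType ActualLocation(const ToyObject &parent, ChildKind kind) {
  if (parent.is_pointer || kind == ChildKind::ViaPointer)
    return AddressType::Load; // reached through a debuggee pointer
  return AddressType::Host;   // part of the host copy of the parent
}

bool RuleIsCorrect(const ToyObject &parent, ChildKind kind) {
  return RuleForChildren(parent) == ActualLocation(parent, kind);
}
```

In this model the rule is right for both kinds of real children and wrong only for a synthetic child of a non-pointer parent, which gets labeled as a host address although its data is in the debuggee.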

However, I don't think things could be this simple, as otherwise this would also reproduce on the synthetic children of std::map for instance. So there must be something more here...

BTW, I also think this patch is formally wrong for const result objects, because once the stop ID has moved on, const result object values should never be converted back to load address types. They are supposed to represent the state at the time of the capture, so checking the state of the process after that time can't be right. But that wasn't the failure we were seeing in the Swift testsuite.

I get the "state at the time of the capture" argument, but it's not clear to me what would be the behavior without this patch(?) IIUC, we currently don't have any logic which would "freeze-dry" the pointed-to values (and it's tricky to do so, because those values can contain other pointers, etc.). So, this seems correct at least in the sense that those pointer values will get the correct address class (assuming the synthetic children issue above is sorted out) and we won't crash while trying to dereference them.

I also think the "state at the time of capture" thing needs to be more nuanced. For instance, if I had a variable like int *ptr = &some_int;, I wouldn't expect that p ptr followed by a p *$1 would show me the old/stale value of some_int. That would certainly be the case if the pointer is part of an "implementation detail" of something (like the const char * inside std::string), but it seems like there are at least some cases where it would be reasonable to follow the pointer values into live memory.
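
The two expectations can be contrasted in plain C++, outside of any debugger. This is an analogy only: CapturedPointer stands in for a result like $1 that keeps just the pointer value, and CapturedWithPointee for a result whose pointee was copied at capture time. Both struct names are hypothetical.

```cpp
int some_int = 1;
int *ptr = &some_int;

// Stand-in for "$1" after `p ptr`: only the pointer value is captured.
struct CapturedPointer {
  int *value;
  int Deref() const { return *value; } // reads live memory
};

// Stand-in for a result that treats the pointee as an implementation detail
// and copies it when the result is created.
struct CapturedWithPointee {
  int pointee_copy;
  int Deref() const { return pointee_copy; } // reads the frozen copy
};
```

If some_int changes after both captures, CapturedPointer::Deref() observes the new value while CapturedWithPointee::Deref() keeps returning the old one. Which of the two a user expects depends on whether the pointer is part of the value's interface or of its implementation.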

ConstResult objects are still implicated in this crash, because "expr var" crashes but "frame var" for this swift variable works. The difference between the two cases is that in the succeeding case the root ValueObject is a ValueObjectVariable, and in the crashing case a ValueObjectConstResult. But it doesn't have to do with the const result getting into an inconsistent state because it updates itself when it shouldn't. It seems like there's just something in the logic of UpdateChildrenAddressType that is wrong for ValueObjectConstResult. But it seems to take a fairly complex chain of values - ConstResult->ConstResultChild->SyntheticValue->DynamicValue to trigger the crash, and the crash is actually in getting the backing data for a ValueObjectDynamicValue...

It will take me some more head scratching to figure out why this is going wrong but I'm currently in the process of hastening my eventual balding.

Let's hope it doesn't come to that. :) Let me know if there's anything I can do to help.

labath mentioned this in D83450: Delegate UpdateChildrenPointerType to the Root ValueObject. Jul 9 2020, 9:24 AM

Revision Contents

Path                                                     Size
lldb/include/lldb/Core/ValueObject.h                     1 line
lldb/source/Core/ValueObject.cpp                         53 lines
lldb/source/Core/ValueObjectVariable.cpp                 45 lines
lldb/test/Shell/SymbolFile/DWARF/DW_OP_piece-struct.s    113 lines

Diff 226458

lldb/include/lldb/Core/ValueObject.h

@@ 979 lines not shown @@ const char *GetLocationAsCStringImpl(const Value &value,
                                        const DataExtractor &data);

   bool IsChecksumEmpty();

   void SetPreferredDisplayLanguageIfNeeded(lldb::LanguageType);

 private:
   virtual CompilerType MaybeCalculateCompleteType();
+  void UpdateChildrenAddressType();

   lldb::ValueObjectSP GetValueForExpressionPath_Impl(
       llvm::StringRef expression_cstr,
       ExpressionPathScanEndReason *reason_to_stop,
       ExpressionPathEndResultType *final_value_type,
       const GetValueForExpressionPathOptions &options,
       ExpressionPathAftermath *final_task_on_target);

@@ 47 lines not shown @@

lldb/source/Core/ValueObject.cpp

@@ 127 lines not shown @@ : UserID(++g_value_obj_uid), // Unique identifier for every value object
       m_is_synthetic_children_generated(false) {
   m_manager = new ValueObjectManager();
   m_manager->ManageObject(this);
 }

 // Destructor
 ValueObject::~ValueObject() {}

+void ValueObject::UpdateChildrenAddressType() {
+  Value::ValueType value_type = m_value.GetValueType();
+  ExecutionContext exe_ctx(GetExecutionContextRef());
+  Process *process = exe_ctx.GetProcessPtr();
+  const bool process_is_alive = process && process->IsAlive();
+  const uint32_t type_info = GetCompilerType().GetTypeInfo();
+  const bool is_pointer_or_ref =
+      (type_info & (lldb::eTypeIsPointer | lldb::eTypeIsReference)) != 0;
+
+  switch (value_type) {
+  case Value::eValueTypeFileAddress:
+    // If this type is a pointer, then its children will be considered load
+    // addresses if the pointer or reference is dereferenced, but only if
+    // the process is alive.
+    //
+    // There could be global variables like in the following code:
+    // struct LinkedListNode { Foo* foo; LinkedListNode* next; };
+    // Foo g_foo1;
+    // Foo g_foo2;
+    // LinkedListNode g_second_node = { &g_foo2, NULL };
+    // LinkedListNode g_first_node = { &g_foo1, &g_second_node };
+    //
+    // When we aren't running, we should be able to look at these variables
+    // using the "target variable" command. Children of the "g_first_node"
+    // always will be of the same address type as the parent. But children
+    // of the "next" member of LinkedListNode will become load addresses if
+    // we have a live process, or remain a file address if it was a file
+    // address.
+    if (process_is_alive && is_pointer_or_ref)
+      SetAddressTypeOfChildren(eAddressTypeLoad);
+    else
+      SetAddressTypeOfChildren(eAddressTypeFile);
+    break;
+  case Value::eValueTypeHostAddress:
+    // Same as above for load addresses, except children of pointer or refs
+    // are always load addresses. Host addresses are used to store freeze
+    // dried variables. If this type is a struct, the entire struct
+    // contents will be copied into the heap of the
+    // LLDB process, but we do not currently follow any pointers.
+    if (is_pointer_or_ref)
+      SetAddressTypeOfChildren(eAddressTypeLoad);
+    else
+      SetAddressTypeOfChildren(eAddressTypeHost);
+    break;
+  case Value::eValueTypeLoadAddress:
+  case Value::eValueTypeScalar:
+  case Value::eValueTypeVector:
+    SetAddressTypeOfChildren(eAddressTypeLoad);
+    break;
+  }
+}
+
 bool ValueObject::UpdateValueIfNeeded(bool update_format) {

   bool did_change_formats = false;

   if (update_format)
     did_change_formats = UpdateFormatsIfNeeded();

   // If this is a constant value, then our success is predicated on whether we
@@ 46 lines not shown @@ if (IsInScope()) {
                   old_checksum.begin());
       }

       bool success = UpdateValue();

       SetValueIsValid(success);

       if (success) {
+        UpdateChildrenAddressType();
         const uint64_t max_checksum_size = 128;
         m_data.Checksum(m_value_checksum, max_checksum_size);
       } else {
         need_compare_checksums = false;
         m_value_checksum.clear();
       }

       assert(!need_compare_checksums ||
@@ 3,165 lines not shown @@

lldb/source/Core/ValueObjectVariable.cpp

@@ 162 lines not shown @@ if (expr.Evaluate(&exe_ctx, nullptr, loclist_base_load_addr, nullptr,
       CompilerType compiler_type = GetCompilerType();
       if (compiler_type.IsValid())
         m_value.SetCompilerType(compiler_type);

       Value::ValueType value_type = m_value.GetValueType();

       Process *process = exe_ctx.GetProcessPtr();
       const bool process_is_alive = process && process->IsAlive();
-      const uint32_t type_info = compiler_type.GetTypeInfo();
-      const bool is_pointer_or_ref =
-          (type_info & (lldb::eTypeIsPointer | lldb::eTypeIsReference)) != 0;
-
-      switch (value_type) {
-      case Value::eValueTypeFileAddress:
-        // If this type is a pointer, then its children will be considered load
-        // addresses if the pointer or reference is dereferenced, but only if
-        // the process is alive.
-        //
-        // There could be global variables like in the following code:
-        // struct LinkedListNode { Foo* foo; LinkedListNode* next; };
-        // Foo g_foo1;
-        // Foo g_foo2;
-        // LinkedListNode g_second_node = { &g_foo2, NULL };
-        // LinkedListNode g_first_node = { &g_foo1, &g_second_node };
-        //
-        // When we aren't running, we should be able to look at these variables
-        // using the "target variable" command. Children of the "g_first_node"
-        // always will be of the same address type as the parent. But children
-        // of the "next" member of LinkedListNode will become load addresses if
-        // we have a live process, or remain what a file address if it what a
-        // file address.
-        if (process_is_alive && is_pointer_or_ref)
-          SetAddressTypeOfChildren(eAddressTypeLoad);
-        else
-          SetAddressTypeOfChildren(eAddressTypeFile);
-        break;
-      case Value::eValueTypeHostAddress:
-        // Same as above for load addresses, except children of pointer or refs
-        // are always load addresses. Host addresses are used to store freeze
-        // dried variables. If this type is a struct, the entire struct
-        // contents will be copied into the heap of the
-        // LLDB process, but we do not currently follow any pointers.
-        if (is_pointer_or_ref)
-          SetAddressTypeOfChildren(eAddressTypeLoad);
-        else
-          SetAddressTypeOfChildren(eAddressTypeHost);
-        break;
-      case Value::eValueTypeLoadAddress:
-      case Value::eValueTypeScalar:
-      case Value::eValueTypeVector:
-        SetAddressTypeOfChildren(eAddressTypeLoad);
-        break;
-      }

       switch (value_type) {
       case Value::eValueTypeVector:
       // fall through
       case Value::eValueTypeScalar:
         // The variable value is in the Scalar value inside the m_value. We can
         // point our m_data right to it.
         m_error =
@@ 153 lines not shown @@

lldb/test/Shell/SymbolFile/DWARF/DW_OP_piece-struct.s

This file was added.

				# RUN: llvm-mc -filetype=obj -o %t -triple x86_64-pc-linux %s
				# RUN: %lldb %t -o "target variable reset" -b | FileCheck %s

				# CHECK: (lldb) target variable reset
				# CHECK: (auto_reset) reset = {
				# CHECK: ptr = 0xdeadbeefbaadf00d
				# CHECK: prev = false
				# CHECK: }

				.section .debug_abbrev,"",@progbits
				.byte 1 # Abbreviation Code
				.byte 17 # DW_TAG_compile_unit
				.byte 1 # DW_CHILDREN_yes
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 2 # Abbreviation Code
				.byte 52 # DW_TAG_variable
				.byte 0 # DW_CHILDREN_no
				.byte 3 # DW_AT_name
				.byte 8 # DW_FORM_string
				.byte 73 # DW_AT_type
				.byte 19 # DW_FORM_ref4
				.byte 2 # DW_AT_location
				.byte 24 # DW_FORM_exprloc
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 3 # Abbreviation Code
				.byte 36 # DW_TAG_base_type
				.byte 0 # DW_CHILDREN_no
				.byte 3 # DW_AT_name
				.byte 8 # DW_FORM_string
				.byte 62 # DW_AT_encoding
				.byte 11 # DW_FORM_data1
				.byte 11 # DW_AT_byte_size
				.byte 11 # DW_FORM_data1
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 4 # Abbreviation Code
				.byte 19 # DW_TAG_structure_type
				.byte 1 # DW_CHILDREN_yes
				.byte 3 # DW_AT_name
				.byte 8 # DW_FORM_string
				.byte 11 # DW_AT_byte_size
				.byte 11 # DW_FORM_data1
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 5 # Abbreviation Code
				.byte 13 # DW_TAG_member
				.byte 0 # DW_CHILDREN_no
				.byte 3 # DW_AT_name
				.byte 8 # DW_FORM_string
				.byte 73 # DW_AT_type
				.byte 19 # DW_FORM_ref4
				.byte 56 # DW_AT_data_member_location
				.byte 11 # DW_FORM_data1
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 6 # Abbreviation Code
				.byte 15 # DW_TAG_pointer_type
				.byte 0 # DW_CHILDREN_no
				.byte 73 # DW_AT_type
				.byte 19 # DW_FORM_ref4
				.byte 0 # EOM(1)
				.byte 0 # EOM(2)
				.byte 0 # EOM(3)

				.section .debug_info,"",@progbits
				.Lcu_begin0:
				.long .Lcu_end-.Lcu_start # Length of Unit
				.Lcu_start:
				.short 4 # DWARF version number
				.long .debug_abbrev # Offset Into Abbrev. Section
				.byte 8 # Address Size (in bytes)
				.byte 1 # Abbrev [1] 0xb:0x6c DW_TAG_compile_unit
				.Lbool:
				.byte 3 # Abbrev [3] 0x33:0x7 DW_TAG_base_type
				.asciz "bool" # DW_AT_name
				.byte 2 # DW_AT_encoding
				.byte 1 # DW_AT_byte_size
				.byte 2 # Abbrev [2] 0x3a:0x15 DW_TAG_variable
				.asciz "reset" # DW_AT_name
				.long .Lstruct # DW_AT_type
				.byte 2f-1f # DW_AT_location
				1:
				.byte 0xe # DW_OP_const8u
				.quad 0xdeadbeefbaadf00d
				.byte 0x9f # DW_OP_stack_value
				.byte 0x93 # DW_OP_piece
				.uleb128 8
				.byte 0xe # DW_OP_const8u
				.quad 0
				.byte 0x9f # DW_OP_stack_value
				.byte 0x93 # DW_OP_piece
				.uleb128 8
				2:
				.Lstruct:
				.byte 4 # Abbrev [4] 0x4f:0x22 DW_TAG_structure_type
				.asciz "auto_reset" # DW_AT_name
				.byte 16 # DW_AT_byte_size
				.byte 5 # Abbrev [5] 0x58:0xc DW_TAG_member
				.asciz "ptr" # DW_AT_name
				.long .Lbool_ptr # DW_AT_type
				.byte 0 # DW_AT_data_member_location
				.byte 5 # Abbrev [5] 0x64:0xc DW_TAG_member
				.asciz "prev" # DW_AT_name
				.long .Lbool # DW_AT_type
				.byte 8 # DW_AT_data_member_location
				.byte 0 # End Of Children Mark
				.Lbool_ptr:
				.byte 6 # Abbrev [6] 0x71:0x5 DW_TAG_pointer_type
				.long .Lbool # DW_AT_type
				.byte 0 # End Of Children Mark
				.Lcu_end: