This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Fix Type::GetByteSize for pointer types
ClosedPublic

Authored by labath on Aug 24 2020, 2:36 AM.

Details

Summary

The function was returning an incorrect (empty) value on the first
invocation. Given that this only affected the first invocation, this
bug/typo went mostly unaffected. DW_AT_const_value were particularly
badly affected by this as the GetByteSize call is
SymbolFileDWARF::ParseVariableDIE is likely to be the first call of this
function, and its effects cannot be undone by retrying.

Depends on D86348.

Diff Detail

Event Timeline

labath created this revision.Aug 24 2020, 2:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 24 2020, 2:36 AM
labath requested review of this revision.Aug 24 2020, 2:36 AM
davide accepted this revision.Aug 24 2020, 9:26 AM

I came to the same conclusion when analyzing https://bugs.llvm.org/show_bug.cgi?id=47257 (but you beat me to the punch).

This revision is now accepted and ready to land.Aug 24 2020, 9:26 AM
shafik added inline comments.Aug 24 2020, 10:06 AM
lldb/source/Symbol/Type.cpp
378

Wouldn't it be better to turn m_byte_size into an Optional? As this fix shows this having this additional state we carry around is error prone.

labath added inline comments.Aug 24 2020, 10:15 AM
lldb/source/Symbol/Type.cpp
378

IIRC, this was done this way to reduce sizeof(Type). Note that the external interface is Optional<uint64_t> and the only place that knows (should know?) about this encoding is this function.

davide added inline comments.Aug 24 2020, 10:15 AM
lldb/source/Symbol/Type.cpp
378

This is a much larger project, probably worth considering.
With that in mind, it wouldn't have prevented this bug from happening, as this falls through and returns an Optional<T> anyway.
Yay fuzz testing (that found this), I would say.

shafik added inline comments.Aug 24 2020, 10:18 AM
lldb/source/Symbol/Type.cpp
378

It would have prevented this bug as you just return m_byte_size no matter what.

davide added a subscriber: teemperor.EditedAug 24 2020, 10:23 AM

@labath something I noticed when finding this (and related bugs) is that frame var carries a decent diagnostic

(int *) l_125 = <empty constant data>

and the expression parser returns something not particularly useful:

(lldb) p l_125 
error: <lldb wrapper prefix>:43:31: no member named 'l_125' in namespace '$__lldb_local_vars'
    using $__lldb_local_vars::l_125;
          ~~~~~~~~~~~~~~~~~~~~^
error: <user expression 0>:1:1: use of undeclared identifier 'l_125'
l_125

From my testing infrastructure/fuzzing perspective the two are indistinguishable, as the script I've written chokes on both, but it would be better from an ergonomics point of view if p would return something meaningful, if possible (even if there's a bug in lldb). Do you think it's worth filing a PR? (also, cc: @teemperor for ideas as he spent a fair amount of time working on the expression parser)

davide added inline comments.Aug 24 2020, 10:26 AM
lldb/source/Symbol/Type.cpp
378

If you think it's an improvement, please consider submitting a review, I would be eager to look at it!

@labath something I noticed when finding this (and related bugs) is that frame var carries a decent diagnostic

(int *) l_125 = <empty constant data>

and the expression parser returns something not particularly useful:

(lldb) p l_125 
error: <lldb wrapper prefix>:43:31: no member named 'l_125' in namespace '$__lldb_local_vars'
    using $__lldb_local_vars::l_125;
          ~~~~~~~~~~~~~~~~~~~~^
error: <user expression 0>:1:1: use of undeclared identifier 'l_125'
l_125

From my testing infrastructure/fuzzing perspective the two are indistinguishable, as the script I've written chokes on both, but it would be better from an ergonomics point of view if p would return something meaningful, if possible (even if there's a bug in lldb). Do you think it's worth filing a PR? (also, cc: @teemperor for ideas as he spent a fair amount of time working on the expression parser)

For this particular case, I think the best diagnostic would be one of those module-level warnings we print to stderr, as I think that a DW_AT_const_value attribute with no data means the debug info is broken.

Speaking more generally, it may be interesting to try to present variables with no value (e.g. optimized out) to clang as undefined (extern) variables. That way we could get past the "compile" stage, and handle this stuff in the "link" stage. Here, I believe we have more control so it may be easier to print something like "Cannot evaluate expression because variable 'foo' is optimized out. But I am not really an expert here so I am just guessing...
I'm not really an expert here, and we are sort of limited by what error messages we can get clang to produce. The current error message is not completely unreasonable because it basically says "this variable does not exist"

lldb/source/Symbol/Type.cpp
378

FWIW, there is a way to make this more robust without the memory size increase due to llvm::Optional:

if (!m_byte_size_has_value) {
  if (Optional<uint64_t> byte_size = ComputeByteSize()) {
    m_byte_size_has_value = true;
    m_byte_size = *byte_size;
  }
}
if (m_byte_size_has_value)
  return m_byte_size;
return None;

That way the optional-to-flat conversion is fully contained in 9 lines of code, which can be easily audited for correctness.

This revision was landed with ongoing or failed builds.Aug 27 2020, 6:41 AM
This revision was automatically updated to reflect the committed changes.