Page MenuHomePhabricator

Fix (and improve) the support for C99 variable length array types
ClosedPublic

Authored by aprantl on Oct 22 2018, 2:39 PM.

Details

Summary

Clang recently improved its DWARF support for C VLA types. The DWARF now looks like this:

0x00000051:         DW_TAG_variable [4]  
                     DW_AT_location( fbreg -32 )
                     DW_AT_name( "__vla_expr" )
                     DW_AT_type( {0x000000d3} ( long unsigned int ) )
                     DW_AT_artificial( true )
...
0x000000da:     DW_TAG_array_type [10] *
                 DW_AT_type( {0x000000cc} ( int ) )

0x000000df:         DW_TAG_subrange_type [11]  
                     DW_AT_type( {0x000000e9} ( __ARRAY_SIZE_TYPE__ ) )
                     DW_AT_count( {0x00000051} )

Without this patch LLDB will naively interpret the DIE offset 0x51 as the static size of the array, which is clearly wrong.
This patch uses LLDB's dynamic type mechanism to re-parse VLA types with an optional execution context, to dynamically resolve the size of the array correctly. These dynamic types are not being cached, since they are only valid in a single execution context.

See the testcase for an example:

   4   int foo(int a) {
   5   	  int vla[a];
   6   	  for (int i = 0; i < a; ++i)
   7   	    vla[i] = i;
   8   	
-> 9   	  pause(); // break here
   10  	  return vla[a-1];
   11  	}
   12  	

(lldb) fr v vla
(int [4]) vla = ([0] = 0, [1] = 1, [2] = 2, [3] = 3)
(lldb) quit

Diff Detail

Repository
rL LLVM

Event Timeline

aprantl created this revision.Oct 22 2018, 2:39 PM
aprantl edited the summary of this revision. (Show Details)Oct 22 2018, 4:04 PM

This seems good to me, but Greg is more expert than I so I'd wait on his okay.

source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
1901–1907 ↗(On Diff #170496)

You have to do this twice, maybe there should be a method to add the results to the DIE to Type map?

aprantl updated this revision to Diff 170526.Oct 22 2018, 4:57 PM

Factor out caching of types as suggested. Thanks!

clayborg requested changes to this revision.EditedOct 23 2018, 9:21 AM

Parsing a type shouldn't need an execution context and we shouldn't be re-parsing a type over and over for each frame. We should be encoding the array expression somewhere that we can access it when we need to get the number of children using the current execution context.

The way I would prefer to see this:

  • Revert all SymbolFile changes that added an execution context to type parsing
  • Change the type parsing logic in SymbolFileDWARF to store the array count expression in some format in the TypeSystem (ClangASTContext.cpp) that is associated with the clang opaque type via clang type metadata maybe?
  • Change "virtual uint32_t TypeSystem::GetNumChildren(...);" and "virtual CompilerType TypeSystem::GetChildCompilerTypeAtIndex(...);" to take an execution context that defaults to nothing like you did with the type parsing in the first patch. This execution context can be used to evaluate the count expression as needed. We can attempt to grab the count expression from the clang opaque type that we stored in the above step, and if one exists, use the current frame to evaluate it correctly.
source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.cpp
213–229 ↗(On Diff #170526)

We must abstract this via the TypeSystem stuff and make this happen via the CompilerType interface. What happens when "in_value" is a swift type or other type from the type system? Crash

This revision now requires changes to proceed.Oct 23 2018, 9:21 AM
davide added a subscriber: davide.Oct 25 2018, 10:57 AM
davide added inline comments.
source/Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.cpp
213–229 ↗(On Diff #170526)

My understanding is that it is not possible today to get here with a swift type. In fact, GetDynamicTypeAndAddress in swift will be always called on a SwiftLanguageRuntime.

That said, I'm not necessarily advocating against the abstraction, just pointing out that this is just a theoretical concern, and given this is an internal API in the debugger, it could be possibly enforced by an assertion to make sure people don't change the status quo accidentally.

I now spent (way too much :-) time experimenting with alternative implementations of this patch. In https://reviews.llvm.org/D53961 there is a version that changes GetNumChildren() to take an execution context. I don't think that doing it this way is a good solution either and here's why:

If we move the logic to GetNumChildren; we can print the array elements, but the type remains unchanged, which makes for a worse user experience.

With VLAs implemented as dynamic types we get

(lldb) fr v vla
(int [4]) vla = ([0] = 1, [1] = 2)

if we only change GetNumChildren() we get

(lldb) fr v vla
(int []) vla = ([0] = 1, [1] = 2)

which is less nice, since we no longer see the size of the array.

What i'm proposing is to keep the implementation as a dynamic type in ItaniumABILanguageRuntime.cpp but reduce the surface area of the execution context pass-through in DWARFASTParserClang.cpp. Similar to what I did in https://reviews.llvm.org/D53961 we can add a ResolveDynamicArrayUID(uid, exe_ctx) method that only creates an array type. Since we need to re-create a fresh clang type at every execution context anyway, going through the single-function array-only DWARF parser seems like the right trade-off to me. Let me know what you think!

The only other thing you would need to change to get the usability back in check when doing things in GetNumChildren() would be to have the function that gets the typename take on optional execution context for dynamic types. The ValueObject can easily pass its execution context when getting the typename. Anyone who doesn't would continue to get the "int []" as the typename which isn't really lying either way. Thoughts?

To me a VLA fulfills all properties of a dynamic type so not modeling it as a dynamic type feels backwards to me. But not having to deal with temporary clang types might be worth the trade-off.

The only other thing you would need to change to get the usability back in check when doing things in GetNumChildren() would be to have the function that gets the typename take on optional execution context for dynamic types. The ValueObject can easily pass its execution context when getting the typename. Anyone who doesn't would continue to get the "int []" as the typename which isn't really lying either way. Thoughts?

I didn't realize that the string int [] is produced by ValueObject itself; I think this makes this option more palatable to me. I'll give it a try.

I didn't realize that the string int [] is produced by ValueObject itself; I think this makes this option more palatable to me. I'll give it a try.

So, it turns out it isn't. The only way to get the length into the typename is to hack ClangASTContext::GetTypeName to replace the training "[]" of what clang returns as the typename with a different string. The main problem with this is that this will only work for outermost types.

Something that has been requested in the past is to support C structs with trailing array members, such as

struct variable_size {
  unsigned length;
  int __attribute__((size=.length)) elements[]; // I just made up this attribute, but you get the basic idea.
};

in a similar fashion. When printing such a struct, there's no way of safely injecting the size into array type string any more.
If we dynamically created the correctly-sized array type instead, this would work just fine.

I haven't yet understood the motivation behind why overriding GetNumChildren/GetTypeName/GetChildAtIndex is preferable over creating a dynamic type in the language runtime. Is there something that I need to know?

I didn't realize that the string int [] is produced by ValueObject itself; I think this makes this option more palatable to me. I'll give it a try.

So, it turns out it isn't. The only way to get the length into the typename is to hack ClangASTContext::GetTypeName to replace the training "[]" of what clang returns as the typename with a different string. The main problem with this is that this will only work for outermost types.

Something that has been requested in the past is to support C structs with trailing array members, such as

struct variable_size {
  unsigned length;
  int __attribute__((size=.length)) elements[]; // I just made up this attribute, but you get the basic idea.
};

in a similar fashion. When printing such a struct, there's no way of safely injecting the size into array type string any more.
If we dynamically created the correctly-sized array type instead, this would work just fine.

I haven't yet understood the motivation behind why overriding GetNumChildren/GetTypeName/GetChildAtIndex is preferable over creating a dynamic type in the language runtime. Is there something that I need to know?

Creating a new dynamic type every time you stop seems like a lot of work and a possible waste of memory. You will need to see if you already have such a type ("int[4]" already exists) and use it if it does each time you stop. Any type that can change dynamically should easily be able to be detected by the type system in GetNumChildren(...) with the right execution context and and a dynamic type name can also easily be calculated in the type system. One option would be to just display "int[]" and have a summary for any dynamic array types that says "size = 4". Then we don't have to touch the typename.

aprantl updated this revision to Diff 172388.Nov 2 2018, 10:04 AM

Version that only overrides GetNumChildren to avoid creating dynamic clang types.

Fair point. So here's a version that only overrides GetNumChildren(). I'll leave the type summary for a follow-up patch.

aprantl updated this revision to Diff 172390.Nov 2 2018, 10:08 AM

Fix a bug in the testcase.

clayborg accepted this revision.Nov 5 2018, 7:37 AM

Thanks for making the changes!

This revision is now accepted and ready to land.Nov 5 2018, 7:37 AM
This revision was automatically updated to reflect the committed changes.

Thanks for your patience!

TestVla fails on Windows: http://lab.llvm.org:8014/builders/lldb-x64-windows-ninja/builds/1129/steps/test/logs/stdio

AssertionError: False is not True : 'frame var vla' returns expected result, got '(int []) vla = {}'

I will have time to look in more detail later this week, but if you have any ideas in the mean time, that would be great.

TestVla fails on Windows: http://lab.llvm.org:8014/builders/lldb-x64-windows-ninja/builds/1129/steps/test/logs/stdio

AssertionError: False is not True : 'frame var vla' returns expected result, got '(int []) vla = {}'

I will have time to look in more detail later this week, but if you have any ideas in the mean time, that would be great.

Is your bot using DWARF or CodeView as a debug info format? Because this test definitely requires DWARF.

TestVla fails on Windows: http://lab.llvm.org:8014/builders/lldb-x64-windows-ninja/builds/1129/steps/test/logs/stdio

AssertionError: False is not True : 'frame var vla' returns expected result, got '(int []) vla = {}'

I will have time to look in more detail later this week, but if you have any ideas in the mean time, that would be great.

Is your bot using DWARF or CodeView as a debug info format? Because this test definitely requires DWARF.

I was under the impression that each test can control how to build the necessary payloads. Is that not the case?

For dotest style tests, the debug format to test is chosen by the test driver. All the tests should run with any format, but sometimes there are bugs in one reader or another (or in one version of DWARF or another) so you can skip or xfail a test based on format. Sounds like this test should be skipped when the debug format is PDB, since it seems like either the PDB format doesn't express a way to find the actual size of the vla (or the current PDB reader doesn't process those records).

Jim

There is a regression in trunk clang+lldb on Fedora 29 x86_64. Using either 7.0 clang or 7.0 lldb makes the testcases PASS but both trunk clang and trunk lldb makes the testcase FAIL.
This commit (rL346165) is the regressing one for LLDB, commit rL349207 by @dblaikie is the regressing one for clang.

TestChar1632T.py

FAIL:

Failure-TestChar1632T-Char1632TestCase-test_dwo-x86_64-redhat-llvm-monorepo-clang-bin-clang-8.log
runCmd: frame variable as16 as32
output: (char16_t []) as16 = {}
(char32_t []) as32 = {}

PASS:

Success-TestChar1632T-Char1632TestCase-test_dwo-x86_64-redhat-llvm-monorepo-clang-bin-clang-8.log
runCmd: frame variable as16 as32
output: (char16_t [64]) as16 = u"ﺸﺵۻ"
(char32_t [64]) as32 = U"ЕЙРГЖО"
clang-1lldb-1PASS
clang-1lldbPASS
clanglldb-1PASS
clanglldbFAIL

monorepo:
clang-1=d162fb426deb61f897038e07ebd2fcaae8e51c01^=b36eb520c704507990b039011cc02c3da181c461
clang =d162fb426deb61f897038e07ebd2fcaae8e51c01
lldb -1=0154738f9a1f73b5fe108671f61554b94eabf3fb^=91e3d2bffe38dbde2976cbb76cecce97a3e94af6
lldb =0154738f9a1f73b5fe108671f61554b94eabf3fb

git:
clang = b4b260e17979210464c395ee00e6c985f305a16e = https://reviews.llvm.org/rL349207
lldb = 8fea0d8957787360d7e5abefc7b6a94991c07f85 = https://reviews.llvm.org/rL346165 = https://reviews.llvm.org/D53530

GDB seems to handle the clang output OK:

gdb -batch -ex 'b 42' -ex r -ex 'p as16' -ex 'p as32' ./lldb-test-build.noindex/lang/cpp/char1632_t/TestChar1632T.test_dwo/a.out
...
$1 = u"色ハ匂ヘト散リヌルヲ\000\000\000\000\000\000y\000\000\000\xde08@\000\000᯿\001\000\000ᰰ\001\000\000ミ\xffff\xffff\xffffѰ\000\000\000ᇁ\000Ҡ\000ᕭ@\000\000ῠ翿\000\000\000\000\000ᔠ@\000\000ၐ@\000"
$2 = U"෴\000РГЖО\000\000\000\000\x6562b026\000\xf7ffea80翿\xf7b89061翿\xf7c55c60翿ɐ\000\xf7c564f8翿\xf7b1d5ad翿\002\000\xf7b198d7翿\000\000\x405000\000ɰ\000\xfffff000\xffffffff\000\000\000\000\000\000\000\000\000\000\xf7c55cc0翿#\000\000\000\xffffff90\xffffffff\xf7c55c60翿ɐ\000\xf7b1a909翿ɀ\000\000"

Using for the test:

PYTHONPATH=$PWD/lib64/python2.7/site-packages:$PWD/lib64/python2.7/site-packages/lldb" LD_LIBRARY_PATH=$PWD/lib:$PWD/lib64/python2.7/site-packages/lldb" ../llvm-monorepo/lldb/test/dotest.py --executable /PATH/TO/bin/lldb -C $PWD/bin/clang --log-success -t ../llvm-monorepo/lldb/packages/Python/lldbsuite/test/  -p TestChar1632T.py
jankratochvil added inline comments.
lldb/trunk/source/Plugins/SymbolFile/DWARF/DWARFASTParserClang.cpp
3476

Here it could accidentally follow a number as DIE reference, addressed by: D56068