Page MenuHomePhabricator

Add possibility to get module from SBType
ClosedPublic

Authored by fallkrum on Sep 29 2020, 6:07 AM.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
fallkrum requested review of this revision.Sep 29 2020, 6:07 AM
mib added a subscriber: mib.Sep 29 2020, 7:21 AM

Spotted some little things here and there ... This patch lacks a test and would need some reformatting (git clang-format HEAD~) but other than that, looks ok to me.

lldb/include/lldb/Symbol/Type.h
118 ↗(On Diff #294949)

Typo

lldb/source/API/SBType.cpp
504 ↗(On Diff #294949)

It would be nice to record the return value for the reproducers.

507 ↗(On Diff #294949)

Same here

In D88483#2300716, @mib wrote:

Spotted some little things here and there ... This patch lacks a test and would need some reformatting (git clang-format HEAD~) but other than that, looks ok to me.

Can I format and add tests in the very end? The issue of getting module from type turned out a bit tricky and I'm not sure I'v done it the right way. Would like to see opinions first so as not to format and rewrite tests in vain.

JDevlieghere requested changes to this revision.Oct 14 2020, 8:35 AM
JDevlieghere added inline comments.
lldb/source/API/SBType.cpp
504 ↗(On Diff #294949)

Not just nice, it's mandatory.

This revision now requires changes to proceed.Oct 14 2020, 8:35 AM
fallkrum updated this revision to Diff 298184.Oct 14 2020, 10:44 AM
fallkrum marked 4 inline comments as done.
fallkrum updated this revision to Diff 298445.Oct 15 2020, 1:02 PM

API test added.

mib requested changes to this revision.EditedOct 15 2020, 6:26 PM

Could you please follow PEP8 style guide regarding variable names :

Function names should be lowercase, with words separated by underscores as necessary to improve readability.

Variable names follow the same convention as function names.

I also left some inline suggestions regarding the test.
After changing that, it should be good for me.

Thanks in advance.

lldb/test/API/functionalities/type_get_module/TestTypeGetModule.py
36
39
41–46
This revision now requires changes to proceed.Oct 15 2020, 6:26 PM
JDevlieghere added inline comments.Oct 15 2020, 8:35 PM
lldb/include/lldb/Symbol/Type.h
122 ↗(On Diff #298184)

These should be /// doxygen-style comments.

fallkrum updated this revision to Diff 298596.Oct 16 2020, 4:40 AM
fallkrum set the repository for this revision to rG LLVM Github Monorepo.
fallkrum marked 3 inline comments as done.Oct 16 2020, 4:42 AM
mib accepted this revision.Oct 16 2020, 6:46 AM

LGTM provided that you address my last comments :)

Also, please make sure that your patch is well formatted (git clang-format HEAD~) before submitting it.

Thanks.

lldb/include/lldb/Symbol/Type.h
111–114 ↗(On Diff #298596)

Can you change this to be /// (triple slash) Doxygen-style comments please ?

117–120 ↗(On Diff #298596)

Same here

fallkrum updated this revision to Diff 299004.Oct 19 2020, 4:34 AM
fallkrum marked 2 inline comments as done.Oct 19 2020, 4:40 AM

Did not noticed it must be 3 slashes instead of 2 :)

Also, please make sure that your patch is well formatted (git clang-format HEAD~) before submitting it.

git-clang-format HEAD~ says the patch is well formatted.

Can you please give me a couple of words on what m_static_type and m_dynamic_type is in TypeImpl class, in what cases can it be initialized with both of those types?

jingham added a comment.EditedOct 19 2020, 10:19 AM

Can you please give me a couple of words on what m_static_type and m_dynamic_type is in TypeImpl class, in what cases can it be initialized with both of those types?

ValueObjectDynamic value uses a TypeImpl to store its type, and it stores both the static type (the one returned by the parsed expression or was in the DWARF variable records for some local value) and the dynamic type it figured out by looking at the value in memory in this TypeImpl. See for instance ValueObjectDynamicValue::UpdateValue. ValueObjectDynamicValue could also get its static type from its parent. I'm not sure why it doesn't do that and stores it in its TypeImpl instead; presumably this was more convenient somewhere.

Can you please give me a couple of words on what m_static_type and m_dynamic_type is in TypeImpl class, in what cases can it be initialized with both of those types?

ValueObjectDynamic value uses a TypeImpl to store its type, and it stores both the static type (the one returned by the parsed expression or was in the DWARF variable records for some local value) and the dynamic type it figured out by looking at the value in memory in this TypeImpl. See for instance ValueObjectDynamicValue::UpdateValue. ValueObjectDynamicValue could also get its static type from its parent. I'm not sure why it doesn't do that and stores it in its TypeImpl instead; presumably this was more convenient somewhere.

Thanks for explanation.
Could please tell if I understood you correctly: from the start we load into some module’s type system some type for example from DWARF, but during runtime it may be altered and as a consequence ‘original’ static type becomes dynamic in-memory type? It still seems to me I do not completely understand relation between static and it’s dynamic part, if it is correctly to say so. Would like to understand the origin of this.

Can you please give me a couple of words on what m_static_type and m_dynamic_type is in TypeImpl class, in what cases can it be initialized with both of those types?

ValueObjectDynamic value uses a TypeImpl to store its type, and it stores both the static type (the one returned by the parsed expression or was in the DWARF variable records for some local value) and the dynamic type it figured out by looking at the value in memory in this TypeImpl. See for instance ValueObjectDynamicValue::UpdateValue. ValueObjectDynamicValue could also get its static type from its parent. I'm not sure why it doesn't do that and stores it in its TypeImpl instead; presumably this was more convenient somewhere.

Thanks for explanation.
Could please tell if I understood you correctly: from the start we load into some module’s type system some type for example from DWARF, but during runtime it may be altered and as a consequence ‘original’ static type becomes dynamic in-memory type? It still seems to me I do not completely understand relation between static and it’s dynamic part, if it is correctly to say so. Would like to understand the origin of this.

This is confusing, in part because TypeImpl is mixing two concepts, Types and Values.

A Type can really only have one type, itself... Types may have base-class and derived class Types, but it is itself a definite type.

But a Value though originally known by its declared type, but can also have a more specific type based on how it was created.

The most common example of this is w.r.t. base & derived classes. For instance, if you stopped are in a method of a base class in some class hierarchy, the Value representing the "this" pointer has a static type which is that of the base class. That's pretty obvious. But if a derived class object is the one actually calling the base class method at the point where you are stopped in that method, then the full (which in lldb we call the "dynamic") type of the "this" Value is the derived class. In C++, the dynamic type is determined by looking at the vtable pointer in the object, which always points at the most specific class of the object. In ObjC we look at the "isa" pointer, which again is always the most specific class. As these examples show, to have a dynamic type you have to have a Value whose vtable or isa pointer you can look up.

And dynamic values can change over time:

BaseClass *a = new BaseClass();
...
free(a);
a = new DerivedClass();

In this case, the static type of a is always BaseClass *, and at first it's dynamic type is also BaseClass *. But after we free & re-initialize it, the dynamic type becomes DerivedClass *

I'm assuming TypeImpl picked up a "dynamic" type field because that allowed us to represent the Type of a ValueObject with static and dynamic types using one entity. Then, for instance, if we failed to find the dynamic value for some reason, this would automatically fall back to the static type. But dynamic types are really a concept that pertains to values not to types.

Thanks a lot Jim for explanations, now it makes sense to me.
Have one more question on class Type. Writing API test for the patch defined 2 functions like this:

void *func1(int) {
        return NULL;
}

void *func2(int) {
        return NULL;
}

Tried to find function type using target.FindFirstType('func1') and got nothing as a result. Call to module.GetTypes().GetTypeAtIndex(0).GetName() contained:
"void *(int)". Looking through clang::Decls found out that clang::FunctionDecl is child of clang::ValueDecl and it finally became clear to me that "func1", "func2" are not types at all but simply symbols (variables?) of the same type "void *(int)". Therefore module where above 2 functions were declared contained only 1 type "void *(int)". Investigating how DWARFASTParser works noticed that it do generates 2 Type instances for 2 defined functions with m_name variable set to "func1", "func2" respectively and m_compiler_type referencing the same "void *(int)" (but not in it's string representation of course).
So the question is: Does class Type represents not only a type but also a variable (symbol or identifier, don't know how to name it correctly in this situation)?

Thanks a lot Jim for explanations, now it makes sense to me.
Have one more question on class Type. Writing API test for the patch defined 2 functions like this:

void *func1(int) {
        return NULL;
}

void *func2(int) {
        return NULL;
}

Tried to find function type using target.FindFirstType('func1') and got nothing as a result. Call to module.GetTypes().GetTypeAtIndex(0).GetName() contained:
"void *(int)". Looking through clang::Decls found out that clang::FunctionDecl is child of clang::ValueDecl and it finally became clear to me that "func1", "func2" are not types at all but simply symbols (variables?) of the same type "void *(int)". Therefore module where above 2 functions were declared contained only 1 type "void *(int)". Investigating how DWARFASTParser works noticed that it do generates 2 Type instances for 2 defined functions with m_name variable set to "func1", "func2" respectively and m_compiler_type referencing the same "void *(int)" (but not in it's string representation of course).
So the question is: Does class Type represents not only a type but also a variable (symbol or identifier, don't know how to name it correctly in this situation)?

My understanding is that functions have a type but aren't types themselves. I'm not sure where you are seeing us make Type's for functions, can you say more about that?

Thanks a lot Jim for explanations, now it makes sense to me.
Have one more question on class Type. Writing API test for the patch defined 2 functions like this:

void *func1(int) {
        return NULL;
}

void *func2(int) {
        return NULL;
}

Tried to find function type using target.FindFirstType('func1') and got nothing as a result. Call to module.GetTypes().GetTypeAtIndex(0).GetName() contained:
"void *(int)". Looking through clang::Decls found out that clang::FunctionDecl is child of clang::ValueDecl and it finally became clear to me that "func1", "func2" are not types at all but simply symbols (variables?) of the same type "void *(int)". Therefore module where above 2 functions were declared contained only 1 type "void *(int)". Investigating how DWARFASTParser works noticed that it do generates 2 Type instances for 2 defined functions with m_name variable set to "func1", "func2" respectively and m_compiler_type referencing the same "void *(int)" (but not in it's string representation of course).
So the question is: Does class Type represents not only a type but also a variable (symbol or identifier, don't know how to name it correctly in this situation)?

My understanding is that functions have a type but aren't types themselves. I'm not sure where you are seeing us make Type's for functions, can you say more about that?

I can describe how to see it really happens. Attached a compiled executable with debug info in a separate file. Defined only 1 function in function_type.c:

int main (int argc, const char * argv[]) {
    return 0;
}

Steps:

  1. launch lldb with executable from attachment (i do it from command line: lldb /path/to/function_type)
  2. attach to lldb with another debugger.
  3. in another debugger set breakpoint on Type constructor:

lldb_private::Type::Type(unsigned long long, lldb_private::SymbolFile*, lldb_private::ConstString, llvm::Optional<unsigned long long>, lldb_private::SymbolContextScope*, unsigned long long, lldb_private::Type::EncodingDataType, lldb_private::Declaration const&, lldb_private::CompilerType const&, lldb_private::Type::ResolveState, unsigned int)
(i set it slightly below, on "if (byte_size) {", on my version of sources line 154)

  1. continue another debugger
  2. in lldb with function_type target execute command:

lldb.target.GetModuleAtIndex(0).GetTypes().GetSize()

  1. After several continuations it should stop inside Type constructor with input name = "main" and compiler_type for function type.
  2. e compiler_type.GetTypeName() will give result:

"int (int, const char **)"

Hope this will help.

labath added a subscriber: labath.Oct 21 2020, 12:18 AM

<drive-by>
It sounds like that could be an artefact from how we parse function types from dwarf. I haven't checked if that's what happens in this particular case, but I know that we parse function types by just taking the function DIE (which describes the whole function, not just its type) and let the generic ParseTypeFromDWARF function loose on it. It could be that this function picks up the DW_AT_name attribute from there (which is the function name in this case, but for other kinds of types it would actually be the type name). We could probably change that function to ignore the DW_AT_name for functions, though it's not clear to me whether the current behavior has any adverse effects (the final type name seems to come out alright...)
</drive-by>

<drive-by>
It sounds like that could be an artefact from how we parse function types from dwarf. I haven't checked if that's what happens in this particular case, but I know that we parse function types by just taking the function DIE (which describes the whole function, not just its type) and let the generic ParseTypeFromDWARF function loose on it. It could be that this function picks up the DW_AT_name attribute from there (which is the function name in this case, but for other kinds of types it would actually be the type name). We could probably change that function to ignore the DW_AT_name for functions, though it's not clear to me whether the current behavior has any adverse effects (the final type name seems to come out alright...)
</drive-by>

Thanks for reply, it seems it works the way you tell. Added to my above example one more function definition, so the source file function_type.c looks like:

int main1 (int argc, const char * argv[]) {
    return 0;
}

int main (int argc, const char * argv[]) {
    return 0;
}

At the end of dwarf parsing content of m_die_to_type is:

(lldb_private::ConstString) $27 = (m_string = "main1")
(lldb_private::ConstString) $28 = (m_string = "int (int, const char **)")

(lldb_private::ConstString) $29 = (m_string = "main")
(lldb_private::ConstString) $30 = (m_string = "int (int, const char **)")

(lldb_private::ConstString) $31 = (m_string = "int")
(lldb_private::ConstString) $32 = (m_string = "int")

(lldb_private::ConstString) $33 = (m_string = 0x0000000000000000)
(lldb_private::ConstString) $34 = (m_string = "const char **")

(lldb_private::ConstString) $35 = (m_string = 0x0000000000000000)
(lldb_private::ConstString) $36 = (m_string = "const char *")

(lldb_private::ConstString) $37 = (m_string = "char")
(lldb_private::ConstString) $38 = (m_string = "char")

Where first and second m_string in each block are m_name and m_compiler_type->GetTypeName() of Type correspondingly. Later on in SymbolFileDWARF::GetTypes type duplicates are filtered out:

std::set<CompilerType> compiler_type_set;
  for (Type *type : type_set) {
    CompilerType compiler_type = type->GetForwardCompilerType();
    if (compiler_type_set.find(compiler_type) == compiler_type_set.end()) {
      compiler_type_set.insert(compiler_type);
      type_list.Insert(type->shared_from_this());
    }
  }

Ending up with resulting type list as follows:

(lldb_private::ConstString) $27 = (m_string = "main1")
(lldb_private::ConstString) $28 = (m_string = "int (int, const char **)")

(lldb_private::ConstString) $31 = (m_string = "int")
(lldb_private::ConstString) $32 = (m_string = "int")

(lldb_private::ConstString) $33 = (m_string = 0x0000000000000000)
(lldb_private::ConstString) $34 = (m_string = "const char **")

(lldb_private::ConstString) $35 = (m_string = 0x0000000000000000)
(lldb_private::ConstString) $36 = (m_string = "const char *")

(lldb_private::ConstString) $37 = (m_string = "char")
(lldb_private::ConstString) $38 = (m_string = "char")

Maybe of course that's not a problem but for me it was at least confusing, Type's m_name actually not a type's name but identifier in a source file. :)

JDevlieghere accepted this revision.Oct 27 2020, 10:40 AM

LGTM with the few nits addressed.

lldb/include/lldb/Symbol/Type.h
269 ↗(On Diff #299004)
lldb/test/API/functionalities/type_get_module/Makefile
1 ↗(On Diff #299004)

spurious newline?

lldb/test/API/functionalities/type_get_module/compile_unit1.c
1 ↗(On Diff #299004)

spurious newline?

lldb/test/API/functionalities/type_get_module/compile_unit2.c
1 ↗(On Diff #299004)

spurious newline?

lldb/test/API/functionalities/type_get_module/main.c
1 ↗(On Diff #299004)

spurious newline?

This revision is now accepted and ready to land.Oct 27 2020, 10:40 AM
fallkrum updated this revision to Diff 301271.Oct 28 2020, 7:12 AM
fallkrum marked 5 inline comments as done.

Please commit it for me.
Ilya Bukonkin fallkrum@yahoo.com

@JDevlieghere @jingham api test failed after my commit:
http://lab.llvm.org:8011/#/builders/68/builds/1040
Had no such an error on my computer, please help to figure out what went wrong.

jingham added a comment.EditedOct 29 2020, 2:32 PM

@JDevlieghere @jingham api test failed after my commit:
http://lab.llvm.org:8011/#/builders/68/builds/1040
Had no such an error on my computer, please help to figure out what went wrong.

The test is failing because

type_name = comp_unit.GetTypes().GetTypeAtIndex(0).GetName()

is returning an empty string. I can't tell from the error whether that's because comp_unit is not valid in this case, or because GetTypeAtIndex(0) is not valid, or because for some reason this compile unit has an anonymous type injected as the first type in the CU.

You are also assuming that the compile units will come in order with main as the 0-th. That seems reasonable, but there's no law about what order the compiler has to insert compile units into the debug information, so the CU at index 2 could also be the wrong CU.

You are also assuming that these simple files have only one type. That seems reasonable but there's no guarantee the compiler won't emit debug info for maybe some builtin type or whatever you don't know into the CU. Since that's not what you are testing, you should look through all the types to see if the one you expect is there, rather than asserting that it has to be the first type in the CU...

Particularly when writing tests, I try never to chain calls together like this and instead to check each entity I care about as I get it. Otherwise you get these "happens on a remote machine I can't access" failures and you can't tell from the failure site what was actually failing.

Can you rewrite the test to remove all these unnecessary assumptions, and to check the entities you get back as you get them? If you make up that patch we can resubmit this change and the next time it fails you should get more information about why.

Could you explain how to commit that Differential revision is closed by commit?
Commited like described in here https://llvm.org/docs/Phabricator.html#committing-someone-s-change-from-phabricator

git pull --rebase https://github.com/llvm/llvm-project.git master
git show # Ensure the patch looks correct.
ninja check-$whatever # Rerun the appropriate tests if needed.
git push https://github.com/llvm/llvm-project.git HEAD:master

And nothing happened, as far as I see will have to close revision manually.
Thanks in advance.

fallkrum updated this revision to Diff 301768.Oct 29 2020, 3:09 PM

Api test improvements.

@JDevlieghere @jingham api test failed after my commit:
http://lab.llvm.org:8011/#/builders/68/builds/1040
Had no such an error on my computer, please help to figure out what went wrong.

Checked test on my machine again, it simply skips DWO:

libstdcxx tests will not be run because: Don't know how to build with libstdcxx on macosx
Skipping following debug info categories: ['dwo']

Session logs for test failures/errors/unexpected successes will go into directory '/Users/ilya/Documents/Projects/llvm-project/build/Ninja/lldb-test-traces'
PASS: LLDB (/Users/ilya/Documents/Projects/llvm-project/build/Ninja/bin/clang-x86_64) :: test_dsym (TestTypeGetModule.TestTypeGetModule)
PASS: LLDB (/Users/ilya/Documents/Projects/llvm-project/build/Ninja/bin/clang-x86_64) :: test_dwarf (TestTypeGetModule.TestTypeGetModule)
UNSUPPORTED: LLDB (/Users/ilya/Documents/Projects/llvm-project/build/Ninja/bin/clang-x86_64) :: test_dwo (TestTypeGetModule.TestTypeGetModule) (test case does not fall in any category of interest for this run) 
PASS: LLDB (/Users/ilya/Documents/Projects/llvm-project/build/Ninja/bin/clang-x86_64) :: test_gmodules (TestTypeGetModule.TestTypeGetModule)
}

@jingham Are there special tricks to enable DWO testing?

This test is still assuming (1) that main.c is CompUnit(0), etc. You don't know that's true. And that there is only one type in each CU (and type 0 is the one you want). Neither of these is guaranteed. Can you check all these things, I'd rather not have to go a bunch of rounds on this.

fallkrum updated this revision to Diff 301783.Oct 29 2020, 4:14 PM

Api test fix.

This test is still assuming (1) that main.c is CompUnit(0), etc. You don't know that's true. And that there is only one type in each CU (and type 0 is the one you want). Neither of these is guaranteed. Can you check all these things, I'd rather not have to go a bunch of rounds on this.

Sorry, did not pay due attention on this. Think this issue is solved in my latest update.

Sorry to be picky, but you can compress this by putting the IsValid checks into your find_* routines. And the:

self.assertEqual(cu_type.GetName(), type2_name)

calls aren't necessary, since you already found the type by matching the name. So I think you can simplify this by moving all the checks into the helper routines.

Note, you are also assuming that SBTarget::FindFirstType goes through the modules starting with the executable module for the tests at the end of the file. That is true but is not part of the contract for FindFirstType. It doesn't seem a good idea to rely on that either.

I pushed this version so we can hopefully either clear up the bot failures or better see why they are happening. But can you clean up the patch as suggested above for the final version? Thanks.

The test failed in the new form at line 77. So we got the CU for compile_unit1.c and it was valid. But it didn't contain an SBType for compile_unit1_type. Probably the same thing is true in compile_unit2.c you just didn't get there.

Maybe your executable is too artificial? You define a type in easy .c file, and you make a CU level variable with that type. But neither of those variables are used by anything. So some compiler that's being a little bit too smart for our good could discern that it can elide all those types. Maybe make those struct elements extern and refer to them in main.c so that the compiler will have to keep them around?

fallkrum updated this revision to Diff 301796.Oct 29 2020, 5:49 PM

make types extern.
@jingham please commit this patch to check if it will fix DWO issue, the rest of your recommendations will add later.

This revision was landed with ongoing or failed builds.Oct 29 2020, 6:29 PM
This revision was automatically updated to reflect the committed changes.

make types extern.
@jingham please commit this patch to check if it will fix DWO issue, the rest of your recommendations will add later.

Jim asked me to follow up on this so I've committed this for you.

It was still failing so I've temporarily XFAIL'ed it in 30e7df0d58542ad35d517eace70a4cea40e6fa7a

It looks like the test has found an actual bug in handling of the dwo format. 8485ee781f fixes that.

Refactored test according to recommendations given during review 1267bb2e416e

labath added a comment.Nov 2 2020, 2:38 AM

The num_compile_units==3 assumption is also fragile, as files like crtbegin/end, etc. can come with debug info too (ran into that downstream). Changed the assertion to >=3 in fd06f6441.

labath added a comment.Nov 2 2020, 6:44 AM

In e3645fdff, that is (the first has is a local commit I forgot to push).