This is an archive of the discontinued LLVM Phabricator instance.

Differential D74637

Separate DIERef vs. user_id_t: m_function_scope_qualified_name_map
Closed, Public

Authored by jankratochvil on Feb 14 2020, 12:48 PM.

Details

Reviewers

labath
clayborg

Commits

rG217808887918: Separate DIERef vs. user_id_t: m_function_scope_qualified_name_map

Summary

As discussed in D73206 there is both DIERef and user_id_t and sometimes (for DWZ) we need to encode Main CU into them and sometimes we cannot as it is unavailable at that point and at the same time not even needed.
I have also noticed DIERef and user_id_t in fact contain the same information which can be seen in SymbolFileDWARF::GetUID.
(Offtopic: Moreover UserID is also another form of user_id_t.)
SB* API/ABI is already using user_id_t and it needs to encode Main CU for DWZ. Therefore what about making DIERef the identifier not containing Main CU and user_id_t the identifier containing Main CU?
@labath also said:

> I think it would be good to have only one kind of "user id". What are the cases where you need a main-cu-less user id?

This patch does that on a small scale. Is it the proper way forward?
(Personally I think the non-MainCU and plus-MainCU forms should be the same, either both user_id_t or both DIERef - and to drop the other form completely - but that can be always easily refactored any time.)
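
The observation that DIERef and user_id_t carry the same information can be illustrated with a round trip through a 64-bit integer. This is only a sketch: the field widths follow the DIERef.h diff further down, but the bit layout of the packed value and every `Sketch*` name are assumptions made for illustration, not the layout used by SymbolFileDWARF::GetUID.

```cpp
#include <cstdint>

// Hypothetical stand-in for DIERef, with the same field widths as in the diff.
struct SketchDIERef {
  uint32_t dwo_num : 30;
  uint32_t dwo_num_valid : 1;
  uint32_t section : 1;
  uint32_t die_offset;
};
static_assert(sizeof(SketchDIERef) == 8, "fits into 64 bits");

using sketch_user_id_t = uint64_t;

// Assumed layout (illustration only):
//   bit 63: section, bit 62: dwo_num_valid, bits 61..32: dwo_num,
//   bits 31..0: die_offset.
inline sketch_user_id_t EncodeSketch(SketchDIERef ref) {
  return (sketch_user_id_t(ref.section) << 63) |
         (sketch_user_id_t(ref.dwo_num_valid) << 62) |
         (sketch_user_id_t(ref.dwo_num) << 32) |
         sketch_user_id_t(ref.die_offset);
}

// Inverse of EncodeSketch: no information is lost in either direction.
inline SketchDIERef DecodeSketch(sketch_user_id_t uid) {
  SketchDIERef ref{};
  ref.section = uint32_t(uid >> 63) & 1u;
  ref.dwo_num_valid = uint32_t(uid >> 62) & 1u;
  ref.dwo_num = uint32_t(uid >> 32) & 0x3fffffffu;
  ref.die_offset = uint32_t(uid);
  return ref;
}
```

Because both forms are 8 bytes, adding a Main CU to one of them means taking bits away from something else, which is what the discussion below turns on.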

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jankratochvil created this revision. Feb 14 2020, 12:48 PM

Herald added a subscriber: aprantl. Feb 14 2020, 12:48 PM

jankratochvil retitled this revision from Unify DIERef vs. user_id_t to Separate DIERef vs. user_id_t: m_function_scope_qualified_name_map. Feb 16 2020, 6:43 AM

jankratochvil mentioned this in D74690: Separate DIERef vs. user_id_t: GetForwardDeclClangTypeToDie(). Feb 16 2020, 6:50 AM

+ @JDevlieghere for a reason why it would be nice to have cross platform "debug map" test(s).

> I have also noticed DIERef and user_id_t in fact contain the same information which can be seen in SymbolFileDWARF::GetUID.

I am afraid the situation is not that simple. Your statement is true for all "elf" forms of dwarf (regular, dwo, dwp, type units, etc.), and also for mac dSYM files, but it is not true in case of mac "debug map" scenario. In this case, a user_id_t also encodes the symbol file / module ID. You can best see this in the function doing the opposite transformation (SymbolFileDWARF::DecodeUID), which returns a SymbolFileDWARF in addition to a DIERef.

The macos debug map works by creating multiple modules for each .o file, and then mangling them so that they appear to come from a single module. From five miles up, this is pretty similar to what "split dwarf" does, but it has one important distinction -- in split dwarf, the dwo files can only be interpreted together with the main executable file, which contains the linked portions of the debug info (addresses, mainly), while in case of a debug map, the .o files contain fully standalone dwarf, and the "relinking" that we do is outside of the scope of the dwarf spec and works by the linker leaving breadcrumbs about what it has done in a custom format.

For this reason the debug map and the split dwarf features were implemented on different levels inside SymbolFileDWARF, and one of them is included in DIERef, while the other isn't. The main confusing part about this is that split dwarf creates multiple SymbolFile objects (SymbolFileDWARFDwo, SymbolFileDWARFDwp, SymbolFileDWARFDwoDwp :/), even though all of these things are really a part of a single SymbolFile object that happens to be spread across multiple files -- this is something I am working on fixing (first by removing the Dwp flavours), and why I want to use your MainCU concept for split dwarf too (hopefully that would rid us of SymbolFileDWARFDwo).

The reason I am saying all of this is to illustrate why I think you can't make user_id_t and DIERef the same thing -- the former needs to be globally unique, whereas the latter is local to a single "symbol file dwarf" (with big quotes).
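
The distinction between a globally unique id and a file-local reference can be sketched as follows. Everything here is hypothetical: the `Sketch*` types, the function names, and the choice of putting a symbol-file index into the upper 32 bits are assumptions for illustration. The only point taken from the comment above is that decoding a global id has to yield both a symbol file and a local reference, as SymbolFileDWARF::DecodeUID does.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one per-.o symbol file in a debug map setup.
struct SketchSymbolFile {
  const char *name;
};

// Result of decoding a global id: the owning file plus a file-local offset.
struct SketchDecodedUID {
  const SketchSymbolFile *file; // nullptr when the file index is unknown
  uint32_t die_offset;
};

// Assumed layout: upper 32 bits select the file, lower 32 bits the DIE.
inline uint64_t MakeGlobalID(uint32_t file_index, uint32_t die_offset) {
  return (uint64_t(file_index) << 32) | die_offset;
}

inline SketchDecodedUID
DecodeGlobalID(const std::vector<SketchSymbolFile> &files, uint64_t uid) {
  uint32_t index = uint32_t(uid >> 32);
  if (index >= files.size())
    return {nullptr, 0};
  return {&files[index], uint32_t(uid)};
}
```

Two DIEs at the same offset in different .o files have equal local references but different global ids, which is why the two kinds of identifier cannot simply be merged.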

However, I am not sure what all of this says about this patch. In principle, I don't see a big problem with changing the type of this field. In fact, this field used to hold a DIERef until I changed that in D63322. However, there wasn't a strong reason for that -- I did it because it was a) convenient; b) more memory-efficient. We can change it back if it helps you in achieving your goal (and btw, thank you for doing this in small steps). It's just that currently it's not at all clear to me what that goal is. Maybe you could give a rough outline of where are you going with this. For example, how will the user_id_t decoding process look like in the end?

In D74637#1878655, @labath wrote:

> In fact, this field used to hold a DIERef until I changed that in D63322.

I did not expect that. OK, this D74637 + my D74690 try to just revert D63322.

> I did it because it was
> b) more memory-efficient.

Why? Both DIERef and user_id_t sizeof is 8.

> Maybe you could give a rough outline of where are you going with this.

I am trying to reduce user_id_t usage as much as possible. And then to add MainCU to user_id_t (but no longer to DIERef). As construction of user_id_t with MainCU needs additional information no longer contained in DIERef it will need some additional parameter in the caller chain like I did in D73206. As the new parameter is used only for DWZ, its addition is not very popular, so it should be limited to as few cases as possible.

> For example, how will the user_id_t decoding process look like in the end?

In my dwz branch the MainCU encoding is now combined for both DIERef and user_id_t but the plan is to encode it only into user_id_t. It still fits into 64-bits reusing the DWO field (which is never used together with DWZ). Current implementation:

[[ https://github.com/jankratochvil/llvm-project/blob/fa83503513696b53c9cc2d20964ee27c375c2ef7/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp#L1226 | DIERef->user_id_t ]]
[[ https://github.com/jankratochvil/llvm-project/blob/fa83503513696b53c9cc2d20964ee27c375c2ef7/lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp#L1262 | user_id_t->DecodedUID ]]
[[ https://github.com/jankratochvil/llvm-project/blob/fa83503513696b53c9cc2d20964ee27c375c2ef7/lldb/source/Plugins/SymbolFile/DWARF/DWARFBaseDIE.cpp#L26 | DWARFDIE->DIERef ]]
[[ https://github.com/jankratochvil/llvm-project/blob/fa83503513696b53c9cc2d20964ee27c375c2ef7/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp#L157 | DIERef->DWARFUnit * ]] wrt DWZ common file
[[ https://github.com/jankratochvil/llvm-project/blob/fa83503513696b53c9cc2d20964ee27c375c2ef7/lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfo.cpp#L168 | DIERef->DWARFUnit * ]] for MainCU

In D74637#1878704, @jankratochvil wrote:

> > I did it because it was
> > b) more memory-efficient.
>
> Why? Both DIERef and user_id_t sizeof is 8.

Ah, sorry, I misremembered that (and confused DIERef with DWARFDIE). I think what happened is that at the time I was writing that patch, I was planning to increase the size of DIERef. But in the end, that did not materialize (we chose to drop the somewhat redundant cu_offset field instead).

> > Maybe you could give a rough outline of where are you going with this.
>
> I am trying to reduce user_id_t usage as much as possible. And then to add MainCU to user_id_t (but no longer to DIERef). As construction of user_id_t with MainCU needs additional information no longer contained in DIERef it will need some additional parameter in the caller chain like I did in D73206.

Ok, this part makes sense. It's hard for me to evaluate the rest, as the code you're linking to still assumes that the MainCU is stored in the DIERef, which you now say you want to change. Suppose these patches are accepted (let's call them tentatively accepted). What would be the next steps?

In D74637#1878720, @labath wrote:

> the MainCU is stored in the DIERef, which you now say you want to change.

Just some wording: If MainCU was in DIERef then MainCU needs to be also in DWARFDIE which means DWARFDIE must grow 16->24 bytes which you do not want. So transitively yes, I cannot store MainCU into DIERef.
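
The 16 -> 24 byte growth mentioned above follows from pointer arithmetic alone. The sketch below assumes that DWARFDIE is essentially a pair of pointers (a unit and a debug info entry), which matches the 16-byte figure on a 64-bit host. All `Sketch*` names are hypothetical.

```cpp
#include <cstddef>

struct SketchUnit;  // stands in for a DWARF unit
struct SketchEntry; // stands in for a debug info entry

// Assumed current shape: two pointers, 16 bytes on a 64-bit host.
struct SketchDIE {
  SketchUnit *cu;
  SketchEntry *die;
};

// Shape if a Main CU pointer had to travel with every DIE: 24 bytes.
struct SketchDIEWithMainCU {
  SketchUnit *cu;
  SketchEntry *die;
  SketchUnit *main_cu;
};

static_assert(sizeof(SketchDIE) == 2 * sizeof(void *), "two pointers");
static_assert(sizeof(SketchDIEWithMainCU) == 3 * sizeof(void *),
              "three pointers");
```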

> Suppose these patches are accepted (let's call them tentatively accepted). What would be the next steps?

I find this patch as a NFC cleanup to the codebase - to satisfy a new premise user_id_t is used as little as possible and thus only for external interfaces which must not deal with MainCU in any way.
Its larger goal is to satisfy this item of the big DWZ plan you made:

> I think it would be good to have only one kind of "user id". What are the cases where you need a main-cu-less user id?

Sure, thanks for all the reviews.

In D74637#1878748, @jankratochvil wrote:

> In D74637#1878720, @labath wrote:
>
> > Suppose these patches are accepted (let's call them tentatively accepted). What would be the next steps?
>
> I find this patch as a NFC cleanup to the codebase - to satisfy a new premise user_id_t is used as little as possible and thus only for external interfaces which must not deal with MainCU in any way.

Ok, I think I can go with that. I wanted to get a feel for where you're going (some mid-level plan between this patch and the "big DWZ plan"), but I think we can accept this just on the basis that it is a revert of D63322, which turned out to be a dud. And the changes are not that big, so it's not a big deal if we need to modify this.

This revision is now accepted and ready to land. Feb 17 2020, 5:21 AM

Closed by commit rG217808887918: Separate DIERef vs. user_id_t: m_function_scope_qualified_name_map (authored by jankratochvil). Feb 17 2020, 7:37 AM

This revision was automatically updated to reflect the committed changes.

jankratochvil mentioned this in rGaa3e99dc859f: [lldb] [nfc] Separate DIERef vs. user_id_t: GetForwardDeclClangTypeToDie(). Feb 18 2020, 9:12 AM

jankratochvil mentioned this in D73206: Pass `CompileUnit *` along `DWARFDIE` for DWZ. Mar 30 2020, 4:26 AM

Revision Contents

Path                                                      Size
lldb/source/Plugins/SymbolFile/DWARF/DIERef.h             10 lines
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h    2 lines
lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp  6 lines

Diff 244973

lldb/source/Plugins/SymbolFile/DWARF/DIERef.h

[... 38 lines not shown ...]
     if (m_dwo_num_valid)
       return m_dwo_num;
     return llvm::None;
   }
 
   Section section() const { return static_cast<Section>(m_section); }
 
   dw_offset_t die_offset() const { return m_die_offset; }
 
+  bool operator<(DIERef other) const {
+    if (m_dwo_num_valid != other.m_dwo_num_valid)
+      return m_dwo_num_valid < other.m_dwo_num_valid;
+    if (m_dwo_num_valid && (m_dwo_num != other.m_dwo_num))
+      return m_dwo_num < other.m_dwo_num;
+    if (m_section != other.m_section)
+      return m_section < other.m_section;
+    return m_die_offset < other.m_die_offset;
+  }
+
 private:
   uint32_t m_dwo_num : 30;
   uint32_t m_dwo_num_valid : 1;
   uint32_t m_section : 1;
   dw_offset_t m_die_offset;
 };
 static_assert(sizeof(DIERef) == 8, "");

[... 9 lines not shown ...]
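
The operator< added to DIERef is what allows it to be used as a std::set key in the SymbolFileDWARF.h change. A minimal sketch of the same ordering on a hypothetical `SketchRef` type is shown here. It uses plain members instead of bit-fields so that std::tie can bind references to them, and it treats the dwo number as zero when it is not valid, so that an unused field never influences the order.

```cpp
#include <cstdint>
#include <set>
#include <tuple>

// Hypothetical stand-in for DIERef with ordinary (non-bit-field) members.
struct SketchRef {
  bool dwo_num_valid;
  uint32_t dwo_num;
  uint32_t section;
  uint32_t die_offset;

  // Lexicographic order: validity, dwo number (only if valid), section,
  // offset. This mirrors the comparison sequence in the diff above.
  bool operator<(const SketchRef &other) const {
    uint32_t lhs_dwo = dwo_num_valid ? dwo_num : 0;
    uint32_t rhs_dwo = other.dwo_num_valid ? other.dwo_num : 0;
    return std::tie(dwo_num_valid, lhs_dwo, section, die_offset) <
           std::tie(other.dwo_num_valid, rhs_dwo, other.section,
                    other.die_offset);
  }
};

using SketchRefSet = std::set<SketchRef>;
```

With this ordering, two references that differ only in an invalid dwo number compare as equivalent, so a set keeps just one of them.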

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h

[... 494 lines not shown ...]
   typedef std::unordered_map<lldb::offset_t, lldb_private::DebugMacrosSP>
       DebugMacrosMap;
   DebugMacrosMap m_debug_macros_map;
 
   ExternalTypeModuleMap m_external_type_modules;
   std::unique_ptr<lldb_private::DWARFIndex> m_index;
   bool m_fetched_external_modules : 1;
   lldb_private::LazyBool m_supports_DW_AT_APPLE_objc_complete_type;
 
-  typedef std::set<lldb::user_id_t> DIERefSet;
+  typedef std::set<DIERef> DIERefSet;
   typedef llvm::StringMap<DIERefSet> NameToOffsetMap;
   NameToOffsetMap m_function_scope_qualified_name_map;
   std::unique_ptr<DWARFDebugRanges> m_ranges;
   UniqueDWARFASTTypeMap m_unique_ast_type_map;
   DIEToTypePtr m_die_to_type;
   DIEToVariableSP m_die_to_variable_sp;
   DIEToClangType m_forward_decl_die_to_clang_type;
   ClangTypeToDIE m_forward_decl_clang_type_to_die;
   llvm::DenseMap<dw_offset_t, lldb_private::FileSpecList>
       m_type_unit_support_files;
   std::vector<uint32_t> m_lldb_cu_to_dwarf_unit;
 };
 
 #endif // SymbolFileDWARF_SymbolFileDWARF_h_

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp

[... 2,375 lines not shown ...]
   for (uint32_t i = 0; i < num_comp_units; i++) {
     if (cu == nullptr)
       continue;
 
     SymbolFileDWARFDwo *dwo = cu->GetDwoSymbolFile();
     if (dwo)
       dwo->GetMangledNamesForFunction(scope_qualified_name, mangled_names);
   }
 
-  for (lldb::user_id_t uid :
+  for (DIERef die_ref :
        m_function_scope_qualified_name_map.lookup(scope_qualified_name)) {
-    DWARFDIE die = GetDIE(uid);
+    DWARFDIE die = GetDIE(die_ref);
     mangled_names.push_back(ConstString(die.GetMangledName()));
   }
 }
 
 void SymbolFileDWARF::FindTypes(
     ConstString name, const CompilerDeclContext *parent_decl_ctx,
     uint32_t max_matches,
     llvm::DenseSet<lldb_private::SymbolFile *> &searched_symbol_files,
[... 631 lines not shown ...]
   if (type_sp) {
     GetTypeList().Insert(type_sp);
 
     if (die.Tag() == DW_TAG_subprogram) {
       std::string scope_qualified_name(GetDeclContextForUID(die.GetID())
                                            .GetScopeQualifiedName()
                                            .AsCString(""));
       if (scope_qualified_name.size()) {
         m_function_scope_qualified_name_map[scope_qualified_name].insert(
-            die.GetID());
+            *die.GetDIERef());
       }
     }
   }
 
   return type_sp;
 }
 
 size_t SymbolFileDWARF::ParseTypes(const SymbolContext &sc,
[... 974 lines not shown ...]
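
The two SymbolFileDWARF.cpp hunks are the write side and the read side of one name-to-references map. A rough standalone sketch of that pattern, using standard containers in place of llvm::StringMap and a plain offset in place of DIERef, might look like this. `SketchNameMap` and its methods are hypothetical names. The lookup returns an empty set for an unknown name, so the caller can iterate over the result without a separate existence check.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Stand-in for std::set<DIERef>; each element is just a DIE offset here.
using SketchOffsetSet = std::set<uint32_t>;

class SketchNameMap {
public:
  // Write side: record a function DIE under its scope-qualified name.
  // Empty names are skipped, as in the size() check in the diff.
  void Insert(const std::string &scope_qualified_name, uint32_t die_offset) {
    if (!scope_qualified_name.empty())
      m_map[scope_qualified_name].insert(die_offset);
  }

  // Read side: return a copy of the set, or an empty set if the name is
  // unknown.
  SketchOffsetSet Lookup(const std::string &scope_qualified_name) const {
    auto it = m_map.find(scope_qualified_name);
    return it == m_map.end() ? SketchOffsetSet() : it->second;
  }

private:
  std::map<std::string, SketchOffsetSet> m_map;
};
```

Since the map is private to one symbol file, a file-local key is sufficient for it, which is the argument made in the review for storing DIERef rather than user_id_t.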