This is an archive of the discontinued LLVM Phabricator instance.

ObjectFilePECOFF: Create a "container" section spanning the entire module image
ClosedPublic

Authored by labath on Jan 10 2019, 4:02 AM.

Details

Reviewers

zturner
amccarth
stella.stamenova
clayborg
lemo

Commits

rG7db8b5c4bdea: ObjectFilePECOFF: Create a "container" section spanning the entire module image
rLLDB353916: ObjectFilePECOFF: Create a "container" section spanning the entire module image
rL353916: ObjectFilePECOFF: Create a "container" section spanning the entire module image

Summary

This is coming from the discussion in D55356 (the most interesting part
happened on the mailing list, so it isn't reflected on the review page).

In short the issue is that lldb assumes that all bytes of a module image
in memory will be backed by a "section". This isn't the case for PECOFF
files because the initial bytes of the module image will contain the
file header, which does not correspond to any normal section in the
file. In particular, this means it is not possible to implement the
GetBaseAddress function for PECOFF files, because that is supposed to
point to the first byte of that header.

If my (limited) understanding of how PECOFF files work is correct, then
the OS is expected to load the entire module into one contiguous chunk
of memory. The address of that chunk (+/- ASLR) is given by the "image
base" field in the COFF optional header, and its size by "image size".
All of the COFF sections are then loaded into this range.

If that's true, then we can model this behavior in lldb by creating a
"container" section to represent the entire module image, and then place
the other sections inside it. This would be consistent with how MachO
and ELF files are modelled (except that those can have multiple
top-level containers, as they can be loaded into multiple noncontiguous
chunks of memory).

This change required a small number of fixups in the PDB plugins, which
assumed a certain order of sections within the object file (which
obviously changes now). I fix this by changing the lookup code to use
section IDs (which are unchanged) instead of indexes. This has the nice
benefit of removing spurious -1s in the plugins, as the section IDs in
the PDBs match the 1-based section IDs in the COFF plugin.
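
To illustrate why index-based lookup breaks once a container is introduced, here is a minimal sketch. `ToySection`, `FindByID`, and `MakeImageLayout` are hypothetical names made up for this example and are not lldb APIs; the sketch assumes that an ID lookup descends into child sections, which is what the fix relies on.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for a section; NOT the real lldb_private::Section.
struct ToySection {
  uint64_t id;
  std::string name;
  std::vector<std::shared_ptr<ToySection>> children;
};
using ToySectionSP = std::shared_ptr<ToySection>;

// Find a section by ID, descending into child sections (depth-first).
ToySectionSP FindByID(const std::vector<ToySectionSP> &list, uint64_t id) {
  for (const ToySectionSP &s : list) {
    assert(s && "section lists never hold null entries in this sketch");
    if (s->id == id)
      return s;
    if (ToySectionSP child = FindByID(s->children, id))
      return child;
  }
  return nullptr;
}

// Layout after the patch, mirroring subsections.yaml: a single top-level
// container (ID ~0) that holds .text (ID 1) and .data (ID 2).
std::vector<ToySectionSP> MakeImageLayout() {
  auto image = std::make_shared<ToySection>(ToySection{~uint64_t(0), "", {}});
  image->children.push_back(
      std::make_shared<ToySection>(ToySection{1, ".text", {}}));
  image->children.push_back(
      std::make_shared<ToySection>(ToySection{2, ".data", {}}));
  return {image};
}
```

With this layout the top-level list has a single entry (the container), so indexing it with `pdb_section - 1` can no longer find `.text` or `.data`, while a lookup by the 1-based ID still can.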

Besides making the implementation of GetBaseAddress possible, this also
improves the lookup of addresses in the gaps between the object file
sections, which will now be correctly resolved as belonging to the
object file.
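
The effect on address lookup can be sketched as follows. This is a toy model, not lldb code: `ToyRange`, `ToyImage`, `Resolve`, and `MakeExampleImage` are hypothetical names, and the addresses are taken from the subsections.yaml test in this diff (image base 0x40000000, image size 12288).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical address range; NOT an lldb type.
struct ToyRange {
  std::string name;
  uint64_t base;
  uint64_t size;
  bool Contains(uint64_t addr) const {
    return addr >= base && addr - base < size;
  }
};

struct ToyImage {
  ToyRange whole;                 // container: [image base, base + image size)
  std::vector<ToyRange> sections; // COFF sections placed inside the container

  void AddSection(const ToyRange &r) {
    // Every section is expected to lie entirely inside the image range.
    assert(r.size > 0 && whole.Contains(r.base) &&
           whole.Contains(r.base + r.size - 1));
    sections.push_back(r);
  }
};

// Returns the name of the section containing addr, "<image>" if addr falls
// in the header or in a gap between sections, or "" if it is outside the
// module image altogether.
std::string Resolve(const ToyImage &image, uint64_t addr) {
  if (!image.whole.Contains(addr))
    return "";
  for (const ToyRange &s : image.sections)
    if (s.Contains(addr))
      return s.name;
  return "<image>";
}

ToyImage MakeExampleImage() {
  ToyImage image{{"<image>", 0x40000000, 12288}, {}};
  image.AddSection({".text", 0x40001000, 64});
  image.AddSection({".data", 0x40002000, 64});
  return image;
}
```

Without the container, the first and third kinds of address (header bytes and gaps) would not resolve to anything belonging to the module.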

Diff Detail

Repository: rL LLVM

Event Timeline

labath created this revision. Jan 10 2019, 4:02 AM

Herald added subscribers: abidh, JDevlieghere. Jan 10 2019, 4:02 AM

labath added a child revision: D58050: PECOFF: Implement GetBaseAddress. Feb 11 2019, 5:24 AM

Herald added a project: Restricted Project. Feb 11 2019, 5:24 AM

Could you please take a look at this? This is currently blocking me from making progress with breakpad symbol files.

My knowledge of PECOFF is even more limited, but it's my understanding that the ImageBase is the _preferred_ address for the module. That doesn't guarantee that's the actual address it would be loaded at. Not just because of ASLR but also because there may be conflicts with modules that have already been loaded. Is GetBaseAddress supposed to return the actual base address or the preferred one?

clayborg added inline comments. Feb 11 2019, 9:26 AM

source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
1370 ↗	(On Diff #181022)	How fast is this? Do we need a local cache so we aren't looking up a section for each symbol? Maybe a locally cached vector since sections are represented by indexes?

In D56537#1393202, @amccarth wrote:

My knowledge of PECOFF is even more limited, but it's my understanding that the ImageBase is the _preferred_ address for the module. That doesn't guarantee that's the actual address it would be loaded at. Not just because of ASLR but also because there may be conflicts with modules that have already been loaded. Is GetBaseAddress supposed to return the actual base address or the preferred one?

Well, it kind of returns both. The returned Address object stores the address in a section-relative form. So, if you ask it for the "file address", it will return the "address, as known to the object file", or the "preferred load address", or whatever you want to call it. OTOH, you can also ask it for the "load address" for a specific target, which will consult the target for the load address of the section and return the actual load address in that target. This is the main reason why I needed to create the extra container section sitting on top of everything (though that has other benefits too).
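
A rough sketch of the section-relative idea described above, using made-up types (`ToySection`, `ToyTarget`, `ToyAddress`, and `kInvalidAddress` are hypothetical and only loosely modelled on how lldb behaves; they are not its API). The example load address is arbitrary.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Sentinel for "no address available" (an assumption of this sketch).
const uint64_t kInvalidAddress = UINT64_MAX;

// Hypothetical section: only knows its address as recorded in the file.
struct ToySection {
  std::string name;
  uint64_t file_addr;
};

// Hypothetical target: records where each section actually got loaded.
struct ToyTarget {
  std::map<const ToySection *, uint64_t> loaded;

  uint64_t GetSectionLoadAddress(const ToySection *s) const {
    auto it = loaded.find(s);
    return it == loaded.end() ? kInvalidAddress : it->second;
  }
};

// Section-relative address: a section plus an offset into it.
struct ToyAddress {
  const ToySection *section;
  uint64_t offset;

  // "Preferred" address, as known to the object file.
  uint64_t GetFileAddress() const {
    assert(section);
    return section->file_addr + offset;
  }

  // Actual address in a given target, if the section is loaded there.
  uint64_t GetLoadAddress(const ToyTarget &target) const {
    assert(section);
    uint64_t base = target.GetSectionLoadAddress(section);
    return base == kInvalidAddress ? kInvalidAddress : base + offset;
  }
};
```

In this model the base address of a module is simply offset 0 in the container section, which is why a section covering the header bytes is needed for the address to be expressible at all.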

source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
1370 ↗	(On Diff #181022)	It's not particularly fast (linear search), but the number of sections is generally small. FindSectionByID is also used in other places for Symtab construction (e.g. ObjectFileELF; ObjectFileMachO does some complicated thing, which I believe involves caching), so it doesn't seem to be too bad. If it turns out we need to speed up the lookup here, then I think it should be done a bit more generically, so that all users can benefit from this.
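
For reference, the kind of cache being discussed could look roughly like the sketch below. This is not something the patch does; `ToySection` and `SectionIndex` are hypothetical names, and a real version would presumably live next to the section list itself so that every caller benefits.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical flat section record; NOT an lldb type.
struct ToySection {
  uint64_t id;
  std::string name;
};

// Builds an ID -> section map once, so repeated lookups avoid a linear scan.
// The index stores pointers into `sections`, so that vector must outlive the
// index and must not be resized while the index is in use.
class SectionIndex {
public:
  explicit SectionIndex(const std::vector<ToySection> &sections) {
    for (const ToySection &s : sections) {
      bool inserted = m_by_id.emplace(s.id, &s).second;
      assert(inserted && "section IDs are expected to be unique");
      (void)inserted;
    }
  }

  // Returns nullptr when no section has the given ID.
  const ToySection *Find(uint64_t id) const {
    auto it = m_by_id.find(id);
    return it == m_by_id.end() ? nullptr : it->second;
  }

private:
  std::unordered_map<uint64_t, const ToySection *> m_by_id;
};
```

Given how few sections a typical object file has, the linear search is likely fine in practice, as noted above.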

clayborg accepted this revision. Feb 12 2019, 7:04 AM

clayborg added inline comments.

source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp
1370 ↗	(On Diff #181022)	I think this is actually ok because PECOFF files generally don't have many (if any) symbols in them, unless we built the symbol table from another location in the file that isn't the standard PECOFF symbol table.

This revision is now accepted and ready to land. Feb 12 2019, 7:04 AM

lemo accepted this revision. Feb 12 2019, 10:30 AM

Closed by commit rL353916: ObjectFilePECOFF: Create a "container" section spanning the entire module image (authored by labath). Feb 12 2019, 11:17 PM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Feb 12 2019, 11:17 PM

Herald added a subscriber: llvm-commits.

labath added a reverting change: D69100: COFF: Create a separate "section" for the file header. Oct 17 2019, 4:54 AM

labath added a reverting change: rG73a7a55c0ec9: lldb/COFF: Create a separate "section" for the file header. Oct 25 2019, 3:15 PM

Revision Contents

Path	Size
lldb/trunk/lit/Modules/PECOFF/export-dllfunc.yaml	2 lines
lldb/trunk/lit/Modules/PECOFF/subsections.yaml	70 lines
lldb/trunk/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp	34 lines
lldb/trunk/source/Plugins/SymbolFile/NativePDB/DWARFLocationExpression.cpp	8 lines
lldb/trunk/source/Plugins/SymbolFile/PDB/PDBLocationToDWARFExpression.cpp	6 lines
lldb/trunk/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp	6 lines

Diff 186587

lldb/trunk/lit/Modules/PECOFF/export-dllfunc.yaml

	# REQUIRES: system-windows			# REQUIRES: system-windows
	# RUN: yaml2obj < %s > %t.obj			# RUN: yaml2obj < %s > %t.obj
	#			#
	# RUN: lld-link /machine:x64 /out:%t.dll /noentry /nodefaultlib /dll %t.obj /export:DllFunc			# RUN: lld-link /machine:x64 /out:%t.dll /noentry /nodefaultlib /dll %t.obj /export:DllFunc
	#			#
	# RUN: lldb-test object-file %t.dll | FileCheck -check-prefix=BASIC-CHECK %s			# RUN: lldb-test object-file %t.dll | FileCheck -check-prefix=BASIC-CHECK %s
	# RUN: lldb-test object-file -dep-modules %t.dll | FileCheck -check-prefix=DEPS %s			# RUN: lldb-test object-file -dep-modules %t.dll | FileCheck -check-prefix=DEPS %s


	# BASIC-CHECK: Showing 3 sections			# BASIC-CHECK: Showing 3 subsections
	# BASIC-CHECK: Index: 0			# BASIC-CHECK: Index: 0
	# BASIC-CHECK: Name: .text			# BASIC-CHECK: Name: .text
	# BASIC-CHECK: Type: code			# BASIC-CHECK: Type: code
	# BASIC-CHECK: VM size: 22			# BASIC-CHECK: VM size: 22
	# BASIC-CHECK: File size: 512			# BASIC-CHECK: File size: 512
	#			#
	# BASIC-CHECK: Index: 1			# BASIC-CHECK: Index: 1
	# BASIC-CHECK: Name: .rdata			# BASIC-CHECK: Name: .rdata

lldb/trunk/lit/Modules/PECOFF/subsections.yaml

				# RUN: yaml2obj %s > %t
				# RUN: lldb-test object-file %t | FileCheck %s


				# CHECK: Showing 1 sections
				# CHECK-NEXT: Index: 0
				# CHECK-NEXT: ID: 0xffffffffffffffff
				# CHECK-NEXT: Name:
				# CHECK-NEXT: Type: container
				# CHECK-NEXT: Permissions: ---
				# CHECK-NEXT: Thread specific: no
				# CHECK-NEXT: VM address: 0x40000000
				# CHECK-NEXT: VM size: 12288
				# CHECK-NEXT: File size: 0
				# CHECK-NEXT: Showing 2 subsections
				# CHECK-NEXT: Index: 0
				# CHECK-NEXT: ID: 0x1
				# CHECK-NEXT: Name: .text
				# CHECK-NEXT: Type: code
				# CHECK-NEXT: Permissions: ---
				# CHECK-NEXT: Thread specific: no
				# CHECK-NEXT: VM address: 0x40001000
				# CHECK-NEXT: VM size: 64
				# CHECK-NEXT: File size: 512
				# CHECK-EMPTY:
				# CHECK-NEXT: Index: 1
				# CHECK-NEXT: ID: 0x2
				# CHECK-NEXT: Name: .data
				# CHECK-NEXT: Type: data
				# CHECK-NEXT: Permissions: ---
				# CHECK-NEXT: Thread specific: no
				# CHECK-NEXT: VM address: 0x40002000
				# CHECK-NEXT: VM size: 64
				# CHECK-NEXT: File size: 512


				--- !COFF
				OptionalHeader:
				AddressOfEntryPoint: 4616
				ImageBase: 1073741824
				SectionAlignment: 4096
				FileAlignment: 512
				MajorOperatingSystemVersion: 6
				MinorOperatingSystemVersion: 0
				MajorImageVersion: 0
				MinorImageVersion: 0
				MajorSubsystemVersion: 6
				MinorSubsystemVersion: 0
				Subsystem: IMAGE_SUBSYSTEM_WINDOWS_CUI
				DLLCharacteristics: [ IMAGE_DLL_CHARACTERISTICS_HIGH_ENTROPY_VA, IMAGE_DLL_CHARACTERISTICS_DYNAMIC_BASE, IMAGE_DLL_CHARACTERISTICS_NX_COMPAT, IMAGE_DLL_CHARACTERISTICS_TERMINAL_SERVER_AWARE ]
				SizeOfStackReserve: 1048576
				SizeOfStackCommit: 4096
				SizeOfHeapReserve: 1048576
				SizeOfHeapCommit: 4096
				header:
				Machine: IMAGE_FILE_MACHINE_AMD64
				Characteristics: [ IMAGE_FILE_EXECUTABLE_IMAGE, IMAGE_FILE_LARGE_ADDRESS_AWARE ]
				sections:
				- Name: .text
				Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
				VirtualAddress: 4096
				VirtualSize: 64
				SectionData: DEADBEEFBAADF00D
				- Name: .data
				Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ ]
				VirtualAddress: 8192
				VirtualSize: 64
				SectionData: DEADBEEFBAADF00D
				symbols: []
				...

lldb/trunk/source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp

void ObjectFilePECOFF::CreateSections(SectionList &unified_section_list) {		void ObjectFilePECOFF::CreateSections(SectionList &unified_section_list) {
if (m_sections_up)		if (m_sections_up)
return;		return;
m_sections_up.reset(new SectionList());		m_sections_up.reset(new SectionList());

ModuleSP module_sp(GetModule());		ModuleSP module_sp(GetModule());
if (module_sp) {		if (module_sp) {
std::lock_guard<std::recursive_mutex> guard(module_sp->GetMutex());		std::lock_guard<std::recursive_mutex> guard(module_sp->GetMutex());

		SectionSP image_sp = std::make_shared<Section>(
		module_sp, this, ~user_id_t(0), ConstString(), eSectionTypeContainer,
		m_coff_header_opt.image_base, m_coff_header_opt.image_size,
		/*file_offset*/ 0, /*file_size*/ 0, m_coff_header_opt.sect_alignment,
		/*flags*/ 0);
		m_sections_up->AddSection(image_sp);
		unified_section_list.AddSection(image_sp);

const uint32_t nsects = m_sect_headers.size();		const uint32_t nsects = m_sect_headers.size();
ModuleSP module_sp(GetModule());		ModuleSP module_sp(GetModule());
for (uint32_t idx = 0; idx < nsects; ++idx) {		for (uint32_t idx = 0; idx < nsects; ++idx) {
ConstString const_sect_name(GetSectionName(m_sect_headers[idx]));		ConstString const_sect_name(GetSectionName(m_sect_headers[idx]));
static ConstString g_code_sect_name(".code");		static ConstString g_code_sect_name(".code");
static ConstString g_CODE_sect_name("CODE");		static ConstString g_CODE_sect_name("CODE");
static ConstString g_data_sect_name(".data");		static ConstString g_data_sect_name(".data");
static ConstString g_DATA_sect_name("DATA");		static ConstString g_DATA_sect_name("DATA");
[86 lines not shown; still inside: for (uint32_t idx = 0; idx < nsects; ++idx) {]
} else if (m_sect_headers[idx].flags &		} else if (m_sect_headers[idx].flags &
llvm::COFF::IMAGE_SCN_CNT_UNINITIALIZED_DATA) {		llvm::COFF::IMAGE_SCN_CNT_UNINITIALIZED_DATA) {
if (m_sect_headers[idx].size == 0)		if (m_sect_headers[idx].size == 0)
section_type = eSectionTypeZeroFill;		section_type = eSectionTypeZeroFill;
else		else
section_type = eSectionTypeData;		section_type = eSectionTypeData;
}		}

// Use a segment ID of the segment index shifted left by 8 so they
// never conflict with any of the sections.
SectionSP section_sp(new Section(		SectionSP section_sp(new Section(
		image_sp, // Parent section
module_sp, // Module to which this section belongs		module_sp, // Module to which this section belongs
this, // Object file to which this section belongs		this, // Object file to which this section belongs
idx + 1, // Section ID is the 1 based segment index shifted right by		idx + 1, // Section ID is the 1 based section index.
// 8 bits as not to collide with any of the 256 section IDs
// that are possible
const_sect_name, // Name of this section		const_sect_name, // Name of this section
section_type, // This section is a container of other sections.		section_type,
m_coff_header_opt.image_base +
m_sect_headers[idx].vmaddr, // File VM address == addresses as		m_sect_headers[idx].vmaddr, // File VM address == addresses as
// they are found in the object file		// they are found in the object file
m_sect_headers[idx].vmsize, // VM size in bytes of this section		m_sect_headers[idx].vmsize, // VM size in bytes of this section
m_sect_headers[idx]		m_sect_headers[idx]
.offset, // Offset to the data for this section in the file		.offset, // Offset to the data for this section in the file
m_sect_headers[idx]		m_sect_headers[idx]
.size, // Size in bytes of this section as found in the file		.size, // Size in bytes of this section as found in the file
m_coff_header_opt.sect_alignment, // Section alignment		m_coff_header_opt.sect_alignment, // Section alignment
m_sect_headers[idx].flags)); // Flags for this section		m_sect_headers[idx].flags)); // Flags for this section

// section_sp->SetIsEncrypted (segment_is_encrypted);		image_sp->GetChildren().AddSection(std::move(section_sp));

unified_section_list.AddSection(section_sp);
m_sections_up->AddSection(section_sp);
}		}
}		}
}		}

UUID ObjectFilePECOFF::GetUUID() { return UUID(); }		UUID ObjectFilePECOFF::GetUUID() { return UUID(); }

uint32_t ObjectFilePECOFF::ParseDependentModules() {		uint32_t ObjectFilePECOFF::ParseDependentModules() {
ModuleSP module_sp(GetModule());		ModuleSP module_sp(GetModule());

lldb/trunk/source/Plugins/SymbolFile/NativePDB/DWARFLocationExpression.cpp

[Context: DWARFExpression lldb_private::npdb::MakeGlobalLocationExpression(]

return MakeLocationExpressionInternal(		return MakeLocationExpressionInternal(
module, [&](Stream &stream, RegisterKind &register_kind) -> bool {		module, [&](Stream &stream, RegisterKind &register_kind) -> bool {
stream.PutHex8(llvm::dwarf::DW_OP_addr);		stream.PutHex8(llvm::dwarf::DW_OP_addr);

SectionList *section_list = module->GetSectionList();		SectionList *section_list = module->GetSectionList();
assert(section_list);		assert(section_list);

// Section indices in PDB are 1-based, but in DWARF they are 0-based, so		auto section_ptr = section_list->FindSectionByID(section);
// we need to subtract 1.
uint32_t section_idx = section - 1;
if (section_idx >= section_list->GetSize())
return false;

auto section_ptr = section_list->GetSectionAtIndex(section_idx);
if (!section_ptr)		if (!section_ptr)
return false;		return false;

stream.PutMaxHex64(section_ptr->GetFileAddress() + offset,		stream.PutMaxHex64(section_ptr->GetFileAddress() + offset,
stream.GetAddressByteSize(), stream.GetByteOrder());		stream.GetAddressByteSize(), stream.GetByteOrder());

return true;		return true;
});		});

lldb/trunk/source/Plugins/SymbolFile/PDB/PDBLocationToDWARFExpression.cpp

[Context: DWARFExpression ConvertPDBLocationToDWARFExpression(]
case PDB_LocType::Static:		case PDB_LocType::Static:
case PDB_LocType::TLS: {		case PDB_LocType::TLS: {
stream.PutHex8(DW_OP_addr);		stream.PutHex8(DW_OP_addr);

SectionList *section_list = module->GetSectionList();		SectionList *section_list = module->GetSectionList();
if (!section_list)		if (!section_list)
return DWARFExpression(nullptr);		return DWARFExpression(nullptr);

uint32_t section_idx = symbol.getAddressSection() - 1;		uint32_t section_id = symbol.getAddressSection();
if (section_idx >= section_list->GetSize())
return DWARFExpression(nullptr);

auto section = section_list->GetSectionAtIndex(section_idx);		auto section = section_list->FindSectionByID(section_id);
if (!section)		if (!section)
return DWARFExpression(nullptr);		return DWARFExpression(nullptr);

uint32_t offset = symbol.getAddressOffset();		uint32_t offset = symbol.getAddressOffset();
stream.PutMaxHex64(section->GetFileAddress() + offset, address_size,		stream.PutMaxHex64(section->GetFileAddress() + offset, address_size,
byte_order);		byte_order);

is_constant = false;		is_constant = false;

lldb/trunk/source/Plugins/SymbolFile/PDB/SymbolFilePDB.cpp

[Context: void SymbolFilePDB::AddSymbols(lldb_private::Symtab &symtab) {]
if (!results)		if (!results)
return;		return;

auto section_list = m_obj_file->GetSectionList();		auto section_list = m_obj_file->GetSectionList();
if (!section_list)		if (!section_list)
return;		return;

while (auto pub_symbol = results->getNext()) {		while (auto pub_symbol = results->getNext()) {
auto section_idx = pub_symbol->getAddressSection() - 1;		auto section_id = pub_symbol->getAddressSection();
if (section_idx >= section_list->GetSize())
continue;

auto section = section_list->GetSectionAtIndex(section_idx);		auto section = section_list->FindSectionByID(section_id);
if (!section)		if (!section)
continue;		continue;

auto offset = pub_symbol->getAddressOffset();		auto offset = pub_symbol->getAddressOffset();

auto file_addr = section->GetFileAddress() + offset;		auto file_addr = section->GetFileAddress() + offset;
if (sym_addresses.find(file_addr) != sym_addresses.end())		if (sym_addresses.find(file_addr) != sym_addresses.end())
continue;		continue;