This is an archive of the discontinued LLVM Phabricator instance.

Should we use the GUID from the COFF Debug Directory instead? It certainly seems more appropriate, if it's there. The UUID's purpose is to match symbols to the executable, so using a hash of the path might solve this one problem, but it won't solve the general case.

I would definitely encourage using something better than the file checksum as UUID, if at all possible.

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
181	I think this should be `fromData`. The `Optional` is there to give special meaning to an all-zero checksum, but you don't need that here. Just because the md5 checksum comes out as all-zero, it doesn't mean it is not valid. PS: I am responsible for the existence of this function, so I am to blame for any confusion. If you have any ideas for making this API clearer, I'd like to hear them.
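
The distinction drawn in this comment can be illustrated with a minimal standalone sketch. `MiniUUID` and its two factories are hypothetical stand-ins, not the lldb `UUID` class; they only model the behavior described above, where the `Optional` variant treats an all-zero buffer as "no UUID" and the plain variant accepts any bytes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a UUID type; not the real lldb class.
class MiniUUID {
public:
  MiniUUID() = default;

  // Accept the bytes as-is: an all-zero value is still a valid identifier.
  static MiniUUID fromData(const uint8_t *bytes, size_t size) {
    assert(bytes || size == 0);
    MiniUUID uuid;
    uuid.m_bytes.assign(bytes, bytes + size);
    return uuid;
  }

  // Treat an all-zero buffer as "no identifier present".
  static MiniUUID fromOptionalData(const uint8_t *bytes, size_t size) {
    assert(bytes || size == 0);
    const bool all_zero =
        std::all_of(bytes, bytes + size, [](uint8_t b) { return b == 0; });
    return all_zero ? MiniUUID() : fromData(bytes, size);
  }

  bool isValid() const { return !m_bytes.empty(); }

private:
  std::vector<uint8_t> m_bytes;
};
```

For a checksum, where an all-zero result is merely unlikely rather than meaningless, the first factory is the one that matches the intent.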

Not quite sure, but correct me if I am wrong.

(1) I think the Debug Directory is optional for COFF if it does have debug information and pdb to match with.

(2) The Debug Directory does not contain the COFF timestamp.

Using md5 seems very tentative. Could you elaborate on how to leverage both the COFF contents and the existing GUID you mentioned?

In D56229#1346869, @Hui wrote:

Not quite sure, but correct me if I am wrong.

(1) I think the Debug Directory is optional for COFF if it does have debug information and pdb to match with.

(2) The Debug Directory does not contain the COFF timestamp.

Using md5 seems very tentative. Could you elaborate on how to leverage both the COFF contents and the existing GUID you mentioned?

Well, I guess I would ask what you want to do with the GUID? If you want to match it to a debug information file, then the Debug Directory is the correct way to do that, and using a hash of the file path will not even be helpful.

Another option would be to check for a debug directory of type IMAGE_DEBUG_TYPE_REPRO, and if that exists, then it means that the COFF timestamp is a hash of the binary, so it should be stable.

If neither of these is present, then I think we should simply return false from this function and not mislead the caller. The caller might wish to use special logic if the function returns false that says "if I couldn't get a UUID from the file, then try hashing the path and doing some kind of lookup based on that", but I don't think that should be part of this function.
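
The policy proposed in the three paragraphs above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CoffIdentity` and `getStableUUID` are hypothetical names, the parsing that would fill the struct is omitted, and the byte order chosen for the timestamp is arbitrary.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical summary of what a parser found in a PE/COFF image.
struct CoffIdentity {
  std::vector<uint8_t> debug_dir_guid; // CodeView GUID bytes; empty if absent
  bool has_repro_entry = false;        // an IMAGE_DEBUG_TYPE_REPRO entry exists
  uint32_t timestamp = 0;              // TimeDateStamp from the COFF header
};

// Fills `uuid` and returns true only when the file itself carries a stable
// identifier. Returning false leaves any path-hash fallback to the caller.
bool getStableUUID(const CoffIdentity &id, std::vector<uint8_t> &uuid) {
  if (!id.debug_dir_guid.empty()) {
    uuid = id.debug_dir_guid;
    return true;
  }
  if (id.has_repro_entry) {
    // With a REPRO entry the timestamp is a hash of the binary, so it is
    // stable. The big-endian byte order used here is an arbitrary choice.
    uuid.clear();
    for (int shift = 24; shift >= 0; shift -= 8)
      uuid.push_back(static_cast<uint8_t>(id.timestamp >> shift));
    return true;
  }
  return false;
}
```

Keeping the "nothing found" case as a plain `false` means a caller can still decide to hash the path, without that hash ever being mistaken for a real UUID.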

The UUID that is used in ELF and Mach-O is designed to be something that is stable in a binary after it has been linked, and it should be the same before and after any kind of post-production (stripping symbols, stripping section content to make a standalone symbol file, etc.). When someone types "target symbols add /path/to/symbol/file/a.out" we will grab its UUID and try to match it up with an existing object file.

In D56229#1346941, @zturner wrote:

Well, I guess I would ask what you want to do with the GUID? If you want to match it to a debug information file, then the Debug Directory is the correct way to do that, and using a hash of the file path will not even be helpful.

Another option would be to check for a debug directory of type IMAGE_DEBUG_TYPE_REPRO, and if that exists, then it means that the COFF timestamp is a hash of the binary, so it should be stable.

If neither of these is present, then I think we should simply return false from this function and not mislead the caller. The caller might wish to use special logic if the function returns false that says "if I couldn't get a UUID from the file, then try hashing the path and doing some kind of lookup based on that", but I don't think that should be part of this function.

I agree with the above statement. A very long time ago (for reasons which are not very important now), we decided to return a CRC checksum as the UUID for ELF files without a build-id, and that decision has been haunting us ever since. I think it would be good to not repeat the same mistake for COFF files. The problems with the ELF CRC checksum are:

- it does not help with (in fact, it gets in the way of) matching build-id-less ELF files, because both files appear to have a valid-but-different UUID
- checksumming whole files is slow, and we regularly get bug reports about lldb being slow from people whose default config does not include build-ids

So I'd propose to have this function return just the debug directory GUID, if it is there. Possibly the COFF timestamp could be used as a fallback, but I don't know enough about Windows/COFF to say for sure. The case where we need to verify whether we have a local copy of a module for the remote debugging scenario can be handled at a different level. E.g., it already looks like the qModuleInfo packet we use distinguishes, to some degree, between the "uuid" and "md5" cases https://github.com/llvm-mirror/lldb/blob/master/source/Plugins/Process/gdb-remote/GDBRemoteCommunicationClient.cpp#L3539. Right now they are both used to fill in the UUID field of the module spec, but we could change that and treat each field slightly differently. For matching exes, pdbs and minidumps we would just use the UUID field, while for checking object identity we would also fall back to the md5 checksum if the UUID field is not present.
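
The split suggested at the end of this comment could look roughly like the sketch below. `ModuleInfo`, `matchesForDebugInfo` and `isSameObject` are hypothetical names introduced only for illustration; they are not lldb's `ModuleSpec` API, and identifiers are kept as strings purely for brevity.

```cpp
#include <string>

// Hypothetical module description; an empty string means "field not present".
struct ModuleInfo {
  std::string uuid; // debug-directory GUID, if the file has one
  std::string md5;  // checksum of the file contents, if it was computed
};

// Matching exes, pdbs and minidumps: only a real UUID counts.
bool matchesForDebugInfo(const ModuleInfo &a, const ModuleInfo &b) {
  return !a.uuid.empty() && a.uuid == b.uuid;
}

// Checking whether a local copy is the same object as the remote one:
// prefer the UUID, and fall back to the checksum only when it is missing.
bool isSameObject(const ModuleInfo &a, const ModuleInfo &b) {
  if (!a.uuid.empty() && !b.uuid.empty())
    return a.uuid == b.uuid;
  return !a.md5.empty() && a.md5 == b.md5;
}
```

With the two fields kept apart, a file without a GUID never appears to have a valid-but-different UUID, which is the problem described for the ELF CRC above.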

Herald added a project: Restricted Project. Apr 10 2019, 12:41 AM

asmith updated this revision to Diff 195282. Apr 15 2019, 6:01 PM

asmith edited the summary of this revision.

s/@zturner/@amccarth, as Zach probably won't have time to review this

lit/Modules/PECOFF/export-dllfunc.yaml
11–12	Since the U(G)UID needs to be stable and match the value computed from other sources, it would be good to have a test where we check that a file has some exact UUID. Is there any way to use yaml2obj to generate such a file? For instance, if we drop the `lld-link` step and yamlify the resulting `dll` file instead. Would that work?
source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
60	Can `getDebugPDBInfo` succeed and still return a null pdb_info? If not, can we delete the second part? Instead I believe you should check the CVSignature field of the returned struct to see that it indeed contains a PDB70 record.
890	ObjectFilePECOFF already has a `llvm::object::Binary` created for the underlying file. I think it's super-wasteful (and potentially racy, etc.) to create a second one just to read out its GUID. If you make a second version of this function, taking a `Binary` (and have the FileSpec version delegate to that), then you can avoid this.

labath added inline comments. Apr 16 2019, 5:04 AM

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
61	Is there a specific reason you used this particular encoding of the UUID, or did you do that just because it was the easiest? I am asking because I have a reason to have this use a somewhat different encoding. :) Let me elaborate:

I think there are two things we can want from the UUID here. The first is for it to match the UUID encoding we get from other sources (so that they can agree on whether they are talking about the same binary). The second is for the UUID encoding to match the "native" UUID format of the given platform.

Right now, this implementation achieves neither. :) It fails the second criterion because the UUID string comes out in a different endianness than what the Windows native tools produce (I'm using `dumpbin` as the reference here). And it also fails the first one, because e.g. the minidump reading code parses the UUID differently <D60501>.

Now, for Windows, these two criteria are slightly at odds with one another. In order to fully match the dumpbin format, we would need to have some kind of a special field for the "age" bit. But lldb has no such concept, and there doesn't seem to be much need to introduce it. However, including the "age" field in the "uuid" seems like the right thing to do, as two files with different "ages" should be considered different for debug info matching purposes (at least, that's what my limited understanding of pdbs tells me; if some of this is wrong, please let me know).

So, in <D60501> I made a somewhat arbitrary decision to attach the age field to the UUID in big endian. That's the format that made most sense to me, though that can certainly be changed (the most important thing is for these things to stay in sync). So, if you have a reason to use a different encoding, please let me know, so we can agree on a consistent implementation. Otherwise, could you change this to use the same UUID format as the minidump parser?
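
As a worked illustration of the encoding discussed in this comment, the sketch below byte-swaps the three integer fields of the GUID, copies the trailing 8 bytes unchanged, and appends the age in big endian. `encodeCvUUID` and `formatUUID` are hypothetical helpers, not functions from lldb or llvm, and the dash grouping simply mirrors the expected string in the `uuid.yaml` test of this revision.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Re-encode a PDB70 signature (the GUID as stored in the file) plus the age
// into 20 bytes: Data1, Data2, Data3 and the age become big-endian, while
// Data4 is copied unchanged.
std::array<uint8_t, 20> encodeCvUUID(const uint8_t (&sig)[16], uint32_t age) {
  std::array<uint8_t, 20> out{};
  // Data1: 4 bytes, little-endian on disk.
  out[0] = sig[3]; out[1] = sig[2]; out[2] = sig[1]; out[3] = sig[0];
  // Data2 and Data3: 2 bytes each, little-endian on disk.
  out[4] = sig[5]; out[5] = sig[4];
  out[6] = sig[7]; out[7] = sig[6];
  // Data4: a plain byte array, no swapping.
  for (size_t i = 8; i < 16; ++i)
    out[i] = sig[i];
  // Age, appended in big endian.
  out[16] = static_cast<uint8_t>(age >> 24);
  out[17] = static_cast<uint8_t>(age >> 16);
  out[18] = static_cast<uint8_t>(age >> 8);
  out[19] = static_cast<uint8_t>(age);
  return out;
}

// Upper-case hex with a 4-2-2-2-6-4 byte grouping.
std::string formatUUID(const std::array<uint8_t, 20> &bytes) {
  std::string text;
  char buf[3];
  for (size_t i = 0; i < bytes.size(); ++i) {
    if (i == 4 || i == 6 || i == 8 || i == 10 || i == 16)
      text += '-';
    std::snprintf(buf, sizeof(buf), "%02X", bytes[i]);
    text += buf;
  }
  return text;
}
```

Feeding in the signature bytes from the test's `.rdata` section (`E092B2141AD8F1B44C4C44205044422E`) with age 1 yields `14B292E0-D81A-B4F1-4C4C-44205044422E-00000001`, the value the test checks for.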

Hui added inline comments. Apr 16 2019, 7:54 AM

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
60	If the exe/dll is built without any debug info, the function succeeds and still returns null pdb_info.
61	You are right. The encodings of the MS struct GUID and of PDB70DebugInfo::Signature are different. Could the UUID format, and the method the minidump parser uses to produce it, be made available in class COFFObjectFile?
890	In addition, it would be possible to simplify the ObjectFilePECOFF::GetModuleSpecifications API with such a Binary. In that case, arguments like data_sp and data_offset would no longer be used.

labath added inline comments. Apr 16 2019, 8:24 AM

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
60	Ah, ok. Thanks for explaining.
61	I don't think you can put that in the COFFObjectFile, as it lives in llvm, and the UUID class is an lldb concept. It might be possible to put some utility function into llvm to help with that, but it's not clear to me how exactly that would look (and that would need to be a separate patch with a separate review). What would kind of make sense is to add another factory function to the `UUID` class in lldb (`UUID::fromCvRecord`?), which both ObjectFilePECOFF and ProcessMinidump could call into. However, the problem with that is that the definition of the CvRecord is in llvm/Object, and it seems silly to have lldb/Utility depend on that just to pull in a single struct. I think it would make sense to move this struct into llvm/BinaryFormat (which lldb/Utility already depends on) and then everything would be fine. If you want to try that out, then go ahead, but I don't think that's really necessary here (swapping the bytes around should be just a couple of LOC).
890	Yeah, I noticed that too, but I didn't want to throw too many things into this patch. However, if you feel like trying it out, then please go ahead.

asmith updated this revision to Diff 195504. Apr 16 2019, 7:02 PM

asmith edited the summary of this revision.

asmith marked 10 inline comments as done. Apr 16 2019, 8:22 PM

asmith marked an inline comment as done.

Thanks. I have a couple of small comments, but I think this is basically done.

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
43	llvm style is to only use the anonymous namespaces for class declarations (and use the `static` keyword for functions). What you've done here is particularly confusing, as you've combined this with `using namespace llvm`, which gives off the impression that the `using` declaration is somehow local to the anonymous namespace (which it isn't). In this case, I'd probably just get rid of the anonymous namespace and move everything (the struct definition and using declarations, if you really want it) into the now-static GetCoffUUID function.
45	`typedef struct` is very C-like. Just use plain `struct`.
117	You should keep the MagicBytesMatch call (if you want to llvm-ize it, you could replace that with a call to `llvm::identify_magic`). Every plugin's GetModuleSpecifications is called each time lldb does anything with an object file (any object file), and the idea is to avoid reading the whole file until we are reasonably certain that it is the file we are looking for. That's the reason this function gets the initial bytes of the file in the data_sp argument. This way, all of the object file plugins can quickly inspect that data to see if they care about the file, and only one of them will attempt an actual parse.
890–891	I don't think this is necessary as `CreateInstance` will refuse to return the ObjectFile instance if the creation of the coff binary object failed. (You could theoretically assert that the binary is really there if you want extra security).
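
The cheap pre-check described in the comment on line 117 can be sketched as below. This is a standalone illustration, not the lldb implementation: `looksLikePECOFF` is a hypothetical name, and it only tests for the DOS "MZ" signature that the patch defines as `IMAGE_DOS_SIGNATURE` (0x5A4D).

```cpp
#include <cstddef>
#include <cstdint>

// Cheap pre-check: does the buffer start with the DOS "MZ" signature?
// Only the first two bytes are inspected, so no full parse is needed.
bool looksLikePECOFF(const uint8_t *data, size_t size) {
  if (!data || size < 2)
    return false;
  // Read the first two bytes as a little-endian 16-bit value.
  const uint16_t magic = static_cast<uint16_t>(data[0] | (data[1] << 8));
  return magic == 0x5A4D; // "MZ"
}
```

A plugin that fails this test can return immediately, so only the plugin whose magic matches goes on to open and parse the whole file.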

Hui added inline comments. Apr 17 2019, 7:49 AM

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
890–891	There is no cached binary for a memory instance (created by CreateMemoryInstance). Is there any chance that JIT-ed code will call a module- or UUID-related API?

labath added inline comments. Apr 17 2019, 7:55 AM

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
890–891	Ah, interesting. Yes, if you manage to create a memory instance of ObjectFilePECOFF, then there's a very high chance that someone will call GetUUID on it. I strongly doubt that anyone is creating memory instances, or that they even work in the first place, but I suppose we can leave this check just in case.

asmith updated this revision to Diff 196874. Apr 26 2019, 9:53 AM

asmith marked 5 inline comments as done. Apr 26 2019, 9:55 AM

labath accepted this revision. Apr 28 2019, 11:56 PM

labath added inline comments.

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp
81	this cast is unneeded as the function takes a `void*`
176–177	If you change the `dyn_cast` into a plain `cast` then you can drop the assert (as it will do the asserting for you).

This revision is now accepted and ready to land. Apr 28 2019, 11:56 PM

asmith updated this revision to Diff 197239. Apr 29 2019, 6:38 PM

asmith closed this revision. Apr 29 2019, 6:39 PM

Revision Contents

Path	Size
lit/Modules/PECOFF/export-dllfunc.yaml	6 lines
lit/Modules/PECOFF/uuid.yaml	90 lines
source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.h	1 line
source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp	129 lines

Diff 197239

lit/Modules/PECOFF/export-dllfunc.yaml

	# REQUIRES: lld			# REQUIRES: lld
	# RUN: yaml2obj < %s > %t.obj			# RUN: yaml2obj < %s > %t.obj
	#			#
	# RUN: lld-link /machine:x64 /out:%t.dll /noentry /nodefaultlib /dll %t.obj /export:DllFunc			# RUN: lld-link /machine:x64 /out:%t.dll /noentry /nodefaultlib /debug /dll %t.obj /export:DllFunc
	#			#
	# RUN: lldb-test object-file %t.dll | FileCheck -check-prefix=BASIC-CHECK %s			# RUN: lldb-test object-file %t.dll | FileCheck -check-prefix=BASIC-CHECK %s
	# RUN: lldb-test object-file -dep-modules %t.dll | FileCheck -check-prefix=DEPS %s			# RUN: lldb-test object-file -dep-modules %t.dll | FileCheck -check-prefix=DEPS %s

				# BASIC-CHECK: Plugin name: pe-coff

				# UUID should not be empty if the module is built with debug info.
				# BASIC-CHECK-DAG: UUID: {{[0-9A-F]{7,}[0-9A-F]}}-{{.*}}

	# BASIC-CHECK: Showing 3 subsections			# BASIC-CHECK: Showing 3 subsections
	# BASIC-CHECK: Index: 0			# BASIC-CHECK: Index: 0
	# BASIC-CHECK: Name: .text			# BASIC-CHECK: Name: .text
	# BASIC-CHECK: Type: code			# BASIC-CHECK: Type: code
	# BASIC-CHECK: VM size: 22			# BASIC-CHECK: VM size: 22
	# BASIC-CHECK: File size: 512			# BASIC-CHECK: File size: 512
	#			#

lit/Modules/PECOFF/uuid.yaml

This file was added.

				# REQUIRES: lld
				# RUN: yaml2obj %s > %t.obj
				# RUN: lldb-test object-file %t.obj | FileCheck %s

				# CHECK-DAG: UUID: 14B292E0-D81A-B4F1-4C4C-44205044422E-00000001

				--- !COFF
				OptionalHeader:
				  AddressOfEntryPoint: 0
				  ImageBase: 2147483648
				  SectionAlignment: 4096
				  FileAlignment: 512
				  MajorOperatingSystemVersion: 6
				  MinorOperatingSystemVersion: 0
				  MajorImageVersion: 0
				  MinorImageVersion: 0
				  MajorSubsystemVersion: 6
				  MinorSubsystemVersion: 0
				  Subsystem: IMAGE_SUBSYSTEM_WINDOWS_GUI
				  DLLCharacteristics: [ IMAGE_DLL_CHARACTERISTICS_HIGH_ENTROPY_VA, IMAGE_DLL_CHARACTERISTICS_DYNAMIC_BASE, IMAGE_DLL_CHARACTERISTICS_NX_COMPAT ]
				  SizeOfStackReserve: 1048576
				  SizeOfStackCommit: 4096
				  SizeOfHeapReserve: 1048576
				  SizeOfHeapCommit: 4096
				  ExportTable:
				    RelativeVirtualAddress: 8327
				    Size: 90
				  ImportTable:
				    RelativeVirtualAddress: 0
				    Size: 0
				  ResourceTable:
				    RelativeVirtualAddress: 0
				    Size: 0
				  ExceptionTable:
				    RelativeVirtualAddress: 12288
				    Size: 12
				  CertificateTable:
				    RelativeVirtualAddress: 0
				    Size: 0
				  BaseRelocationTable:
				    RelativeVirtualAddress: 0
				    Size: 0
				  Debug:
				    RelativeVirtualAddress: 8192
				    Size: 28
				  Architecture:
				    RelativeVirtualAddress: 0
				    Size: 0
				  GlobalPtr:
				    RelativeVirtualAddress: 0
				    Size: 0
				  TlsTable:
				    RelativeVirtualAddress: 0
				    Size: 0
				  LoadConfigTable:
				    RelativeVirtualAddress: 0
				    Size: 0
				  BoundImport:
				    RelativeVirtualAddress: 0
				    Size: 0
				  IAT:
				    RelativeVirtualAddress: 0
				    Size: 0
				  DelayImportDescriptor:
				    RelativeVirtualAddress: 0
				    Size: 0
				  ClrRuntimeHeader:
				    RelativeVirtualAddress: 0
				    Size: 0
				header:
				  Machine: IMAGE_FILE_MACHINE_AMD64
				  Characteristics: [ IMAGE_FILE_EXECUTABLE_IMAGE, IMAGE_FILE_LARGE_ADDRESS_AWARE, IMAGE_FILE_DLL ]
				sections:
				  - Name: .text
				    Characteristics: [ IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ ]
				    VirtualAddress: 4096
				    VirtualSize: 22
				    SectionData: 50894C24048B4C24040FAF4C2404890C248B042459C3
				  - Name: .rdata
				    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ ]
				    VirtualAddress: 8192
				    VirtualSize: 236
				    SectionData: 00000000A565B65C00000000020000006B0000001C2000001C06000052534453E092B2141AD8F1B44C4C44205044422E01000000443A5C757073747265616D5C6275696C645C746F6F6C735C6C6C64625C6C69745C4D6F64756C65735C5045434F46465C4F75747075745C6578706F72742D646C6C66756E632E79616D6C2E746D702E70646200000000000000000000000000AF200000000000000200000001000000CB200000D3200000D72000006578706F72742D646C6C66756E632E79616D6C2E746D702E646C6C000000000000100000D92000000100446C6C46756E63000000000101010001020000
				  - Name: .pdata
				    Characteristics: [ IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ ]
				    VirtualAddress: 12288
				    VirtualSize: 12
				    SectionData: '0010000016100000E4200000'
				symbols: []
				...

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.h

private:
coff_header_t m_coff_header;		coff_header_t m_coff_header;
coff_opt_header_t m_coff_header_opt;		coff_opt_header_t m_coff_header_opt;
SectionHeaderColl m_sect_headers;		SectionHeaderColl m_sect_headers;
lldb::addr_t m_image_base;		lldb::addr_t m_image_base;
lldb_private::Address m_entry_point_address;		lldb_private::Address m_entry_point_address;
llvm::Optional<lldb_private::FileSpecList> m_deps_filespec;		llvm::Optional<lldb_private::FileSpecList> m_deps_filespec;
typedef llvm::object::OwningBinary<llvm::object::Binary> OWNBINType;		typedef llvm::object::OwningBinary<llvm::object::Binary> OWNBINType;
llvm::Optional<OWNBINType> m_owningbin;		llvm::Optional<OWNBINType> m_owningbin;
		lldb_private::UUID m_uuid;
};		};

#endif // liblldb_ObjectFilePECOFF_h_		#endif // liblldb_ObjectFilePECOFF_h_

source/Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.cpp

#define IMAGE_DOS_SIGNATURE 0x5A4D // MZ		#define IMAGE_DOS_SIGNATURE 0x5A4D // MZ
#define IMAGE_NT_SIGNATURE 0x00004550 // PE00		#define IMAGE_NT_SIGNATURE 0x00004550 // PE00
#define OPT_HEADER_MAGIC_PE32 0x010b		#define OPT_HEADER_MAGIC_PE32 0x010b
#define OPT_HEADER_MAGIC_PE32_PLUS 0x020b		#define OPT_HEADER_MAGIC_PE32_PLUS 0x020b

using namespace lldb;		using namespace lldb;
using namespace lldb_private;		using namespace lldb_private;

		struct CVInfoPdb70 {
		  // 16-byte GUID
		  struct _Guid {
		    llvm::support::ulittle32_t Data1;
		    llvm::support::ulittle16_t Data2;
		    llvm::support::ulittle16_t Data3;
		    uint8_t Data4[8];
		  } Guid;

		  llvm::support::ulittle32_t Age;
		};

		static UUID GetCoffUUID(llvm::object::COFFObjectFile *coff_obj) {
		  if (!coff_obj)
		    return UUID();

		  const llvm::codeview::DebugInfo *pdb_info = nullptr;
		  llvm::StringRef pdb_file;

		  // This part is similar with what has done in minidump parser.
		  if (!coff_obj->getDebugPDBInfo(pdb_info, pdb_file) && pdb_info) {
		    if (pdb_info->PDB70.CVSignature == llvm::OMF::Signature::PDB70) {
		      using llvm::support::endian::read16be;
		      using llvm::support::endian::read32be;

		      const uint8_t *sig = pdb_info->PDB70.Signature;
		      struct CVInfoPdb70 info;
		      info.Guid.Data1 = read32be(sig);
		      sig += 4;
		      info.Guid.Data2 = read16be(sig);
		      sig += 2;
		      info.Guid.Data3 = read16be(sig);
		      sig += 2;
		      memcpy(info.Guid.Data4, sig, 8);

		      // Return 20-byte UUID if the Age is not zero
		      if (pdb_info->PDB70.Age) {
		        info.Age = read32be(&pdb_info->PDB70.Age);
		        return UUID::fromOptionalData(&info, sizeof(info));
		      }
		      // Otherwise return 16-byte GUID
		      return UUID::fromOptionalData(&info.Guid, sizeof(info.Guid));
		    }
		  }

		  return UUID();
		}

void ObjectFilePECOFF::Initialize() {		void ObjectFilePECOFF::Initialize() {
PluginManager::RegisterPlugin(		PluginManager::RegisterPlugin(
GetPluginNameStatic(), GetPluginDescriptionStatic(), CreateInstance,		GetPluginNameStatic(), GetPluginDescriptionStatic(), CreateInstance,
CreateMemoryInstance, GetModuleSpecifications, SaveCore);		CreateMemoryInstance, GetModuleSpecifications, SaveCore);
}		}

void ObjectFilePECOFF::Terminate() {		void ObjectFilePECOFF::Terminate() {
PluginManager::UnregisterPlugin(CreateInstance);		PluginManager::UnregisterPlugin(CreateInstance);
ObjectFile *ObjectFilePECOFF::CreateMemoryInstance(
return nullptr;		return nullptr;
}		}

size_t ObjectFilePECOFF::GetModuleSpecifications(		size_t ObjectFilePECOFF::GetModuleSpecifications(
const lldb_private::FileSpec &file, lldb::DataBufferSP &data_sp,		const lldb_private::FileSpec &file, lldb::DataBufferSP &data_sp,
lldb::offset_t data_offset, lldb::offset_t file_offset,		lldb::offset_t data_offset, lldb::offset_t file_offset,
lldb::offset_t length, lldb_private::ModuleSpecList &specs) {		lldb::offset_t length, lldb_private::ModuleSpecList &specs) {
const size_t initial_count = specs.GetSize();		const size_t initial_count = specs.GetSize();
		if (!data_sp || !ObjectFilePECOFF::MagicBytesMatch(data_sp))
		return initial_count;

if (ObjectFilePECOFF::MagicBytesMatch(data_sp)) {		auto binary = llvm::object::createBinary(file.GetPath());
DataExtractor data;		if (!binary)
data.SetData(data_sp, data_offset, length);		return initial_count;
data.SetByteOrder(eByteOrderLittle);

dos_header_t dos_header;		if (!binary->getBinary()->isCOFF() &&
coff_header_t coff_header;		!binary->getBinary()->isCOFFImportFile())
		return initial_count;

if (ParseDOSHeader(data, dos_header)) {		auto COFFObj =
lldb::offset_t offset = dos_header.e_lfanew;		llvm::cast<llvm::object::COFFObjectFile>(binary->getBinary());
uint32_t pe_signature = data.GetU32(&offset);
if (pe_signature != IMAGE_NT_SIGNATURE)		ModuleSpec module_spec(file);
return false;		ArchSpec &spec = module_spec.GetArchitecture();
if (ParseCOFFHeader(data, &offset, coff_header)) {		lldb_private::UUID &uuid = module_spec.GetUUID();
ArchSpec spec;		if (!uuid.IsValid())
if (coff_header.machine == MachineAmd64) {		uuid = GetCoffUUID(COFFObj);

		switch (COFFObj->getMachine()) {
		case MachineAmd64:
spec.SetTriple("x86_64-pc-windows");		spec.SetTriple("x86_64-pc-windows");
specs.Append(ModuleSpec(file, spec));		specs.Append(module_spec);
} else if (coff_header.machine == MachineX86) {		break;
		case MachineX86:
spec.SetTriple("i386-pc-windows");		spec.SetTriple("i386-pc-windows");
specs.Append(ModuleSpec(file, spec));		specs.Append(module_spec);
spec.SetTriple("i686-pc-windows");		spec.SetTriple("i686-pc-windows");
specs.Append(ModuleSpec(file, spec));		specs.Append(module_spec);
} else if (coff_header.machine == MachineArmNt) {		break;
		case MachineArmNt:
spec.SetTriple("arm-pc-windows");		spec.SetTriple("arm-pc-windows");
specs.Append(ModuleSpec(file, spec));		specs.Append(module_spec);
}		break;
}		default:
}		break;
}		}

return specs.GetSize() - initial_count;		return specs.GetSize() - initial_count;
}		}

bool ObjectFilePECOFF::SaveCore(const lldb::ProcessSP &process_sp,		bool ObjectFilePECOFF::SaveCore(const lldb::ProcessSP &process_sp,
const lldb_private::FileSpec &outfile,		const lldb_private::FileSpec &outfile,
lldb_private::Status &error) {		lldb_private::Status &error) {
for (uint32_t idx = 0; idx < nsects; ++idx) {
m_coff_header_opt.sect_alignment, // Section alignment		m_coff_header_opt.sect_alignment, // Section alignment
m_sect_headers[idx].flags)); // Flags for this section		m_sect_headers[idx].flags)); // Flags for this section

image_sp->GetChildren().AddSection(std::move(section_sp));		image_sp->GetChildren().AddSection(std::move(section_sp));
}		}
}		}
}		}

UUID ObjectFilePECOFF::GetUUID() { return UUID(); }		UUID ObjectFilePECOFF::GetUUID() {
		if (m_uuid.IsValid())
		return m_uuid;

		if (!CreateBinary())
		labath: ObjectFilePECOFF already has a `llvm::object::Binary` created for the underlying file. I think it's super-wasteful (and potentially racy, etc.) to create a second one just to read out its GUID. If you make a second version of this function, taking a `Binary` (and have the FileSpec version delegate to that), then you can avoid this.
		Hui: In addition, it is possible to simplify the ObjectFilePECOFF::GetModuleSpecifications API with such a Binary. In that case, none of the arguments, like data_sp and data_offset, would be used.
		labath: Yeah, I noticed that too, but I didn't want to throw too many things into this patch. However, if you feel like trying it out, then please go ahead.
		return UUID();
		labath: I don't think this is necessary, as `CreateInstance` will refuse to return the ObjectFile instance if the creation of the COFF binary object failed. (You could theoretically assert that the binary is really there if you want extra security.)
		Hui: There is no cached binary for a memory instance (created by CreateMemoryInstance). Is there any chance that JIT-ed code will call module- or UUID-related APIs?
		labath: Ah, interesting. Yes, if you manage to create a memory instance of ObjectFilePECOFF, then there's a very high chance that someone will call GetUUID on it. I strongly doubt that anyone is creating memory instances, or that they even work in the first place, but I suppose we can leave this check just in case.

		auto COFFObj =
		llvm::cast<llvm::object::COFFObjectFile>(m_owningbin->getBinary());

		m_uuid = GetCoffUUID(COFFObj);
		return m_uuid;
		}
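
The review thread above suggests two things: a UUID helper that takes an already-parsed binary (with the file-path version delegating to it), and a `GetUUID()` that reuses the cached binary and memoizes the result. A minimal standalone sketch of that shape is below. All names here (`SketchBinary`, `ParseBinary`, `GetSketchUUID`, `SketchObjectFile`) are hypothetical stand-ins, and the "GUID" is a dummy value derived from the path length; the real code would use `llvm::object::Binary` and LLDB's `FileSpec` and `UUID` types.

```cpp
#include <cstdint>
#include <memory>
#include <string>

// Hypothetical stand-in for a parsed object file.
struct SketchBinary {
  uint64_t guid; // Dummy identifier, pretend it was read from the file.
};

static int g_parse_count = 0; // Counts how often a file gets parsed.

// Stand-in for the expensive step of opening and parsing the file.
std::unique_ptr<SketchBinary> ParseBinary(const std::string &path) {
  ++g_parse_count;
  return std::make_unique<SketchBinary>(
      SketchBinary{static_cast<uint64_t>(path.size())});
}

// Core version: works on a binary that has already been parsed.
uint64_t GetSketchUUID(const SketchBinary &bin) { return bin.guid; }

// Convenience version: parses the file, then delegates to the core version.
uint64_t GetSketchUUID(const std::string &path) {
  return GetSketchUUID(*ParseBinary(path));
}

class SketchObjectFile {
public:
  explicit SketchObjectFile(const std::string &path)
      : m_owningbin(ParseBinary(path)) {}

  // Computes the UUID from the cached binary once and remembers it.
  uint64_t GetUUID() {
    if (!m_uuid_valid) {
      m_uuid = GetSketchUUID(*m_owningbin); // No second parse needed.
      m_uuid_valid = true;
    }
    return m_uuid;
  }

private:
  std::unique_ptr<SketchBinary> m_owningbin;
  uint64_t m_uuid = 0;
  bool m_uuid_valid = false;
};
```

With this split, repeated `GetUUID()` calls on one object never re-parse the file, while callers that only have a path can still use the delegating overload.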

uint32_t ObjectFilePECOFF::ParseDependentModules() {		uint32_t ObjectFilePECOFF::ParseDependentModules() {
ModuleSP module_sp(GetModule());		ModuleSP module_sp(GetModule());
if (!module_sp)		if (!module_sp)
return 0;		return 0;

std::lock_guard<std::recursive_mutex> guard(module_sp->GetMutex());		std::lock_guard<std::recursive_mutex> guard(module_sp->GetMutex());
if (m_deps_filespec)		if (m_deps_filespec)
return m_deps_filespec->GetSize();		return m_deps_filespec->GetSize();

// Cache coff binary if it is not done yet.		// Cache coff binary if it is not done yet.
if (!CreateBinary())		if (!CreateBinary())
return 0;		return 0;

Log *log(GetLogIfAllCategoriesSet(LIBLLDB_LOG_OBJECT));		Log *log(GetLogIfAllCategoriesSet(LIBLLDB_LOG_OBJECT));
if (log)		if (log)
log->Printf("%p ObjectFilePECOFF::ParseDependentModules() module = %p "		log->Printf("%p ObjectFilePECOFF::ParseDependentModules() module = %p "
"(%s), binary = %p (Bin = %p)",		"(%s), binary = %p (Bin = %p)",
static_cast<void *>(this), static_cast<void *>(module_sp.get()),		static_cast<void *>(this), static_cast<void *>(module_sp.get()),
module_sp->GetSpecificationDescription().c_str(),		module_sp->GetSpecificationDescription().c_str(),
static_cast<void *>(m_owningbin.getPointer()),		static_cast<void *>(m_owningbin.getPointer()),
m_owningbin ? static_cast<void *>(m_owningbin->getBinary())		static_cast<void *>(m_owningbin->getBinary()));
: nullptr);

auto COFFObj =		auto COFFObj =
llvm::dyn_cast<llvm::object::COFFObjectFile>(m_owningbin->getBinary());		llvm::dyn_cast<llvm::object::COFFObjectFile>(m_owningbin->getBinary());
if (!COFFObj)		if (!COFFObj)
return 0;		return 0;

m_deps_filespec = FileSpecList();		m_deps_filespec = FileSpecList();

[44 unchanged lines hidden]	if (!ParseHeader() || !IsExecutable())
return m_entry_point_address;		return m_entry_point_address;

SectionList *section_list = GetSectionList();		SectionList *section_list = GetSectionList();
addr_t file_addr = m_coff_header_opt.entry + m_coff_header_opt.image_base;		addr_t file_addr = m_coff_header_opt.entry + m_coff_header_opt.image_base;

if (!section_list)		if (!section_list)
m_entry_point_address.SetOffset(file_addr);		m_entry_point_address.SetOffset(file_addr);
else		else
m_entry_point_address.ResolveAddressUsingFileSections(file_addr, section_list);		m_entry_point_address.ResolveAddressUsingFileSections(file_addr,
		section_list);
return m_entry_point_address;		return m_entry_point_address;
}		}

Address ObjectFilePECOFF::GetBaseAddress() {		Address ObjectFilePECOFF::GetBaseAddress() {
return Address(GetSectionList()->GetSectionAtIndex(0), 0);		return Address(GetSectionList()->GetSectionAtIndex(0), 0);
}		}

// Dump		// Dump
[238 unchanged lines hidden]


[PECOFF] Implementation of ObjectFilePECOFF::GetUUID() (Closed, Public)

Details

Diff Detail