This is an archive of the discontinued LLVM Phabricator instance.

Differential D55214

Introduce ObjectFileBreakpad
ClosedPublic

Authored by labath on Dec 3 2018, 5:27 AM.

Details

Reviewers

clayborg
zturner
lemo
amccarth
markmentovai

Commits

rG1f6b247717c9: Re-commit "Introduce ObjectFileBreakpad"
rGd6e6e232ec90: Introduce ObjectFileBreakpad
rLLDB348773: Re-commit "Introduce ObjectFileBreakpad"
rL348773: Re-commit "Introduce ObjectFileBreakpad"
rLLDB348592: Introduce ObjectFileBreakpad
rL348592: Introduce ObjectFileBreakpad

Summary

This patch adds the scaffolding necessary for lldb to recognise symbol
files generated by breakpad. These (textual) files contain just enough
information to be able to produce a backtrace from a crash
dump. This information includes:

UUID, architecture and name of the module
line tables
list of symbols
unwind information

A minimal breakpad file could look like this:
MODULE Linux x86_64 0000000024B5D199F0F766FFFFFF5DC30 a.out
INFO CODE_ID 00000000B52499D1F0F766FFFFFF5DC3
FILE 0 /tmp/a.c
FUNC 1010 10 0 _start
1010 4 4 0
1014 5 5 0
1019 5 6 0
101e 2 7 0
PUBLIC 1010 0 _start
STACK CFI INIT 1010 10 .cfa: $rsp 8 + .ra: .cfa -8 + ^
STACK CFI 1011 $rbp: .cfa -16 + ^ .cfa: $rsp 16 +
STACK CFI 1014 .cfa: $rbp 16 +
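
As a rough illustration of the first line of the file above, the following sketch splits a MODULE record into its four whitespace-separated fields. It is standalone C++17 and does not use any lldb or LLVM API; `ModuleRecord` and `parseModuleRecord` are hypothetical names, and the assumption that the module name contains no spaces is a simplification made here.

```cpp
#include <cassert>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical holder for the fields of a MODULE record.
struct ModuleRecord {
  std::string os;
  std::string arch;
  std::string id;
  std::string name;
};

// Splits a single line such as "MODULE Linux x86_64 <id> a.out" on whitespace.
// Returns std::nullopt if the keyword is not MODULE or a field is missing.
std::optional<ModuleRecord> parseModuleRecord(const std::string &line) {
  assert(line.find('\n') == std::string::npos && "expects a single line");
  std::istringstream in(line);
  std::string keyword;
  ModuleRecord rec;
  if (!(in >> keyword >> rec.os >> rec.arch >> rec.id >> rec.name))
    return std::nullopt;
  if (keyword != "MODULE")
    return std::nullopt;
  return rec;
}
```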

Even though this data would normally be considered "symbol" information,
in the current lldb infrastructure it is assumed every SymbolFile object
is backed by an ObjectFile instance. So, in order to better interoperate
with the rest of the code (particularly symbol vendors), this patch
introduces an ObjectFileBreakpad class to represent these files.

In this patch I just parse the breakpad header, which is enough to
populate the UUID and architecture fields of the ObjectFile interface.
The rough plan for followup patches is to expose the individual parts of
the breakpad file as ObjectFile "sections", which can then be used by
other parts of the codebase (SymbolFileBreakpad ?) to vend the necessary
information.

Diff Detail

Build Status

Buildable 25667
Build 25666: arc lint + arc unit

Event Timeline

labath created this revision. Dec 3 2018, 5:27 AM

Herald added subscribers: fedor.sergeev, mgorny. Dec 3 2018, 5:27 AM

Harbormaster completed remote builds in B25603: Diff 176376. Dec 3 2018, 5:27 AM

labath added inline comments. Dec 3 2018, 5:56 AM

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp
67–85	@lemo: Does this part make sense? It seems that on linux the breakpad files have the `INFO CODE_ID` section, which contains the UUID without the funny trailing zero. So I could try fetching the UUID from there instead, but only on linux, as that section is not present on mac (and on windows it contains something completely different). Right now I compute the UUID on linux by chopping off the trailing zero (as I have to do that anyway for mac), but I could do something different if there's any advantage to that.

This looks like a good start.

This revision is now accepted and ready to land. Dec 3 2018, 7:29 AM

Very excited to see this work beginning!

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp
67–85	INFO CODE_ID, if present, is a better thing to use than what you find in MODULE, except on Windows, where it’s absolutely the wrong thing to use but MODULE is fine. So, suggested logic: `if has_code_id and not is_win: id = code_id else: id = module_id`. Aside from special-casing Windows against using INFO CODE_ID, I don’t think you should hard-code any OS checks here. There’s no reason Mac dump_syms couldn’t emit INFO CODE_ID, even though it doesn’t currently. (In fact, you don’t even need to special-case for Windows. You could just detect the presence of a filename token after the ID in INFO CODE_ID. As your test data shows, Windows dump_syms always puts the module filename here, as in “INFO CODE_ID 5C01672A4000 a.exe”, but other dump_syms will only have the uncorrupted debug ID.)
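
The selection rule suggested in the comment above could be sketched roughly as follows. This is standalone C++17, not the code from the patch; `nextToken` and `pickId` are hypothetical helpers, and the only signal used to reject INFO CODE_ID is a trailing token after the id, as the comment proposes.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns the next space-separated token and removes it from `rest`.
// Returns an empty view when nothing but spaces is left.
std::string_view nextToken(std::string_view &rest) {
  const std::size_t start = rest.find_first_not_of(' ');
  if (start == std::string_view::npos) {
    rest = std::string_view();
    return std::string_view();
  }
  rest.remove_prefix(start);
  const std::size_t end = rest.find(' ');
  const std::string_view token = rest.substr(0, end);
  assert(!token.empty() && "a token starts at a non-space character");
  rest.remove_prefix(token.size());
  return token;
}

// Picks the identifier to build the UUID from. `infoLine` is the line that
// follows the MODULE record (empty if there is none). INFO CODE_ID is used
// only when it is present and nothing (such as a file name) follows the id.
std::string_view pickId(std::string_view moduleId, std::string_view infoLine) {
  if (nextToken(infoLine) != "INFO" || nextToken(infoLine) != "CODE_ID")
    return moduleId;
  const std::string_view codeId = nextToken(infoLine);
  const bool hasTrailingToken = !nextToken(infoLine).empty();
  if (codeId.empty() || hasTrailingToken)
    return moduleId;
  return codeId;
}
```

With the Windows test input, the `a.exe` token after the id makes `pickId` fall back to the module id, without any explicit OS check.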

implement the module_id/code_id logic suggested by Mark Mentovai
fix module_id endianness handling to make sure the UUID matches the one we get from the minidump files

Harbormaster completed remote builds in B25667: Diff 176615. Dec 4 2018, 6:10 AM

labath added a reviewer: markmentovai. Dec 4 2018, 6:15 AM

labath marked 3 inline comments as done.

labath added inline comments.

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp
67–85	Thanks. I've implemented the logic you suggested and fixed byte-swapping issues when parsing the module id. Note I still have to special-case windows to strip the "age" field from the module_id in order for our UUID to match the ones we normally get on mac. (We do the same thing when opening minidump files: https://github.com/llvm-mirror/lldb/blob/master/source/Plugins/Process/minidump/MinidumpParser.cpp#L88).
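
To illustrate the byte-swapping mentioned above, here is a minimal standalone sketch that turns a module id into the UUID text expected by the tests in this revision. It assumes the layout used by `parseModuleId` in the diff (a 32-bit field, two 16-bit fields, 8 raw bytes, then a hex age); `swapHexBytes` and `moduleIdToUuid` are hypothetical names, and the code works on hex text rather than on lldb's `UUID` class.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Reverses the byte order of a run of hex digits: "D98C0E68" -> "680E8CD9".
std::string swapHexBytes(std::string_view hex) {
  assert(hex.size() % 2 == 0 && "expects whole bytes");
  std::string out;
  for (std::size_t i = hex.size(); i > 0; i -= 2)
    out.append(hex.substr(i - 2, 2));
  return out;
}

// Converts a module id (32 hex digits plus a 1 to 8 digit hex age) to UUID
// text. The first three fields are byte-swapped, the last 8 bytes are copied
// as-is. The age is appended as a byte-swapped 32-bit value only if keepAge
// is true (the patch keeps it on Windows only).
std::optional<std::string> moduleIdToUuid(std::string_view id, bool keepAge) {
  if (id.size() < 33 || id.size() > 40)
    return std::nullopt;
  for (char c : id)
    if (!std::isxdigit(static_cast<unsigned char>(c)))
      return std::nullopt;

  std::string uuid = swapHexBytes(id.substr(0, 8));
  uuid += '-' + swapHexBytes(id.substr(8, 4));
  uuid += '-' + swapHexBytes(id.substr(12, 4));
  uuid += '-' + std::string(id.substr(16, 4));
  uuid += '-' + std::string(id.substr(20, 12));
  if (keepAge) {
    // Left-pad the age to 8 hex digits before swapping its bytes.
    std::string age(8 - (id.size() - 32), '0');
    age.append(id.substr(32));
    uuid += '-' + swapHexBytes(age);
  }
  return uuid;
}
```

For example, the mac test input `D98C0E682089AA1BEACD6A8C1F16707B0` with `keepAge` set to false yields `680E8CD9-8920-1BAA-EACD-6A8C1F16707B`, which is the UUID in the MAC check lines of breakpad-identification.test.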

JDevlieghere added a subscriber: JDevlieghere. Dec 4 2018, 9:43 AM

Looks like a great start

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp
53	these magic integer literals make it hard to follow the intent - what's special about 33, 40, 8, 16, ... ? (symbolic constants might help)
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h
98	Nit: I personally prefer not to mix data, type and function members in the same "access" section - is there an LLVM/LLDB guideline which requires everything in the same place? If not, can you please add a private section for the destructor, followed by a section for the private data members?

zturner added inline comments. Dec 4 2018, 3:21 PM

lit/Modules/Breakpad/lit.local.cfg
1	This shouldn't be necessary, the top-level `lit.cfg.py` already recognizes `.test` extension. You only need a lit.local.cfg if you're changing something.
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp
20–43	LLVM already has these functions in `Triple.cpp`, but they are hidden as private implementations in the CPP file. Perhaps we should expose them from headers in Triple.h.
56–59	Consider using `StringRef::consumeInteger()` here.
60–77	Similarly for these lines, by using `consume` functions everywhere we can get rid of a lot of the math and I think make the code easier to follow.
92–101	Instead of having the custom parsing functions above, how about just: `std::tie(os, line) = getToken(line); std::tie(arch, line) = getToken(line); llvm::Triple triple(os, "unknown", arch); if (triple.getArch() == Unknown || triple.getOS() == Unknown) return llvm::None;` This way we don't even need to expose the parse functions I commented on earlier, and we can just delete them.
160–161	We have `GetData()` which returns an `ArrayRef`, and another function `toStringRef` which converts an `ArrayRef` to a `StringRef`. So this might be cleaner to write as `auto text = llvm::toStringRef(data_sp->GetData());`
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h
74	Is this always true for breakpad files?
98	Given that we don't actually store an instance of the header anywhere, we just use it as a constructor parameter, perhaps we could go one step further and move this entire type to an anonymous namespace in the cpp file, and update the constructor to take an `ArchSpec` and a `UUID`. I prefer to avoid nested classes wherever possible since it clutters up the interface, so hiding it to the cpp file is nice.
tools/lldb-test/SystemInitializerTest.cpp
120	Shouldn't we also initialize this in `SystemInitializerFull`?
tools/lldb-test/lldb-test.cpp
737	I would use an explicit type spelling here, but since the function is called `GetObjectFile`, I don't feel too strongly. It's pretty clear what the return type is.
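
A `consume`-style helper for the hex fields discussed in these comments might look like the sketch below. It is a standalone C++17 approximation built on `std::from_chars`, not `StringRef::consumeInteger`; the name `consumeHex` and its explicit width parameter are assumptions made here, on the basis that the hex fields of a module id are not separated by any delimiter.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parses exactly `width` hex digits from the front of `str` and, on success,
// removes them from `str`. On failure `str` is left untouched.
std::optional<std::uint32_t> consumeHex(std::string_view &str,
                                        std::size_t width) {
  assert(width > 0 && width <= 8 && "result must fit in 32 bits");
  if (str.size() < width)
    return std::nullopt;
  std::uint32_t value = 0;
  const char *first = str.data();
  const char *last = first + width;
  const auto result = std::from_chars(first, last, value, 16);
  // Reject the field unless every one of the `width` characters was a digit.
  if (result.ec != std::errc() || result.ptr != last)
    return std::nullopt;
  str.remove_prefix(width);
  return value;
}
```

Calling the helper repeatedly with widths 8, 4, 4 and then 2 (eight times) walks through a module id without any index arithmetic on the original string.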

Updated according to review comments.

Also added a couple of tests for invalid inputs.

Harbormaster completed remote builds in B25718: Diff 176781. Dec 5 2018, 3:05 AM

labath added inline comments. Dec 5 2018, 3:06 AM

lit/Modules/Breakpad/lit.local.cfg
1	Yes, but then `lit/Modules/lit.local.cfg` overrides it by specifying its own list of suffixes. I could fix that by adding `.test` to that file, or by making that file use `+=`, but it's not clear to me whether that is better than just being explicit here. If you have any preference, let me know.
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp
20–43	I've already checked out the available functions in `llvm::Triple`, and unfortunately each of them uses a slightly different form for some of the values. For example, `getArchTypeNameForLLVMName` uses `x86-64` instead of `x86_64`, `parseArch` uses `i386` instead of `x86`, `parseOS` uses `linux` instead of `Linux`, and so on... Since this particular encoding is specific to the breakpad format, it made sense to me to have the parsing functions live here (as opposed to adding new cases to the `Triple` functions for instance), and leave everything else working with the "canonical" forms.
53	I've rewritten this to gradually chop bytes off from the start of the string, instead of always indexing into the original one. That should reduce the number of magic numbers (and hopefully reduce confusion).
56–59	I don't think consumeInteger can help, as these "fields" are not delimited here in any way, so that function will happily try to parse the whole string. If you had a specific pattern in mind let me know (but hopefully the new implementation won't be so bad either).
160–161	Cool, I didn't know about that. Thanks.
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h
74	Well.. the whole point of these files is to provide symbol information, so it would be weird if they were stripped. The breakpad `dump_syms` allows you to omit generating unwind information, but I don't think that's enough to call this "stripped". It is certainly possible to create a file by hand which contains just a `MODULE` directive and nothing else, but I would say that is a (non-stripped) file which describes an empty module, and not a stripped file. In reality, this doesn't really matter, as this function is called from just one place https://github.com/llvm-mirror/lldb/blob/master/source/Core/Module.cpp#L506, and I don't think that will be relevant for breakpad files.
98	Sounds good.
tools/lldb-test/SystemInitializerTest.cpp
120	good point
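
The point about breakpad-specific spellings (the reply on lines 20–43 above) can be illustrated with a small standalone lookup. The tables below only cover the tokens exercised by breakpad-identification.test, and the canonical spellings are taken from the expected `Architecture:` lines there; `NameMapping`, `lookup` and `describeArch` are hypothetical names, not part of lldb or LLVM.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// One breakpad token and the spelling seen in the expected test output.
struct NameMapping {
  std::string_view breakpad;
  std::string_view canonical;
};

// Deliberately partial tables: only the values used by the test inputs.
constexpr NameMapping kOSNames[] = {
    {"Linux", "linux"}, {"mac", "macosx"}, {"windows", "windows"}};
constexpr NameMapping kArchNames[] = {{"x86", "i386"}, {"x86_64", "x86_64"}};

// Exact, case-sensitive lookup of a breakpad token.
template <std::size_t N>
std::optional<std::string_view> lookup(const NameMapping (&table)[N],
                                       std::string_view token) {
  for (const NameMapping &m : table)
    if (m.breakpad == token)
      return m.canonical;
  return std::nullopt;
}

// Builds "<arch>--<os>" in the style of the Architecture: check lines.
std::optional<std::string> describeArch(std::string_view os,
                                        std::string_view arch) {
  assert(os.find(' ') == std::string_view::npos && "expects a single token");
  const auto canonOS = lookup(kOSNames, os);
  const auto canonArch = lookup(kArchNames, arch);
  if (!canonOS || !canonArch)
    return std::nullopt;
  std::string out(*canonArch);
  out += "--";
  out += *canonOS;
  return out;
}
```

Because the lookup is exact, a lowercase `linux` token is rejected, which mirrors the case-sensitive `Linux` entry in `toOS` in the diff.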

markmentovai added inline comments. Dec 5 2018, 7:18 AM

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h
74	Correct, "stripped" isn't really useful for Breakpad dump_syms output. What does LLDB do with the result of IsStripped()? Stripped dump_syms output would be what you get from running dump_syms on a stripped module. I can't imagine why anyone would do this intentionally, but you'd also be hard-pressed to tell that's what had happened given only the dumped symbol file.

labath marked 3 inline comments as done. Dec 5 2018, 7:26 AM

labath added inline comments.

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h
74	Not much. The only relevant use is linked to above. I don't fully understand that code, but my rough idea is the following: we create a "synthetic" symbol in the main object file when we know some symbol must be at the given address, but we don't know its name. Then when we are looking up an address and it resolves to this synthetic symbol (and the object file is marked as stripped), we go to the symbol file (if we have one) to see if it can provide us with a name for it. So this isn't even relevant for breakpad files, as they will never be the "main" object file, but I had to put something here, and "false" seems the best option.

zturner accepted this revision. Dec 6 2018, 11:59 AM

Closed by commit rL348592: Introduce ObjectFileBreakpad (authored by labath). Dec 7 2018, 6:23 AM

This revision was automatically updated to reflect the committed changes.

labath marked an inline comment as done.

Herald added a subscriber: llvm-commits. Dec 7 2018, 6:23 AM

Revision Contents

Path	Size
include/lldb/Symbol/ObjectFile.h	12 lines
lit/Modules/Breakpad/Inputs/identification-linux.syms	6 lines
lit/Modules/Breakpad/Inputs/identification-macosx.syms	6 lines
lit/Modules/Breakpad/Inputs/identification-windows.syms	4 lines
lit/Modules/Breakpad/breakpad-identification.test	27 lines
lit/Modules/Breakpad/lit.local.cfg	1 line
source/Plugins/ObjectFile/Breakpad/CMakeLists.txt	11 lines
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h	116 lines
source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp	236 lines
source/Plugins/ObjectFile/CMakeLists.txt	3 lines
source/Symbol/ObjectFile.cpp	60 lines
tools/lldb-test/SystemInitializerTest.cpp	3 lines
tools/lldb-test/lldb-test.cpp	7 lines

Diff 176615

include/lldb/Symbol/ObjectFile.h

  static lldb::DataBufferSP MapFileData(const FileSpec &file, uint64_t Size,
                                        uint64_t Offset);

private:
  DISALLOW_COPY_AND_ASSIGN(ObjectFile);
};

} // namespace lldb_private

namespace llvm {
template <> struct format_provider<lldb_private::ObjectFile::Type> {
  static void format(const lldb_private::ObjectFile::Type &type,
                     raw_ostream &OS, StringRef Style);
};

template <> struct format_provider<lldb_private::ObjectFile::Strata> {
  static void format(const lldb_private::ObjectFile::Strata &strata,
                     raw_ostream &OS, StringRef Style);
};
} // namespace llvm

#endif // liblldb_ObjectFile_h_

lit/Modules/Breakpad/Inputs/identification-linux.syms

This file was added.

				MODULE Linux x86_64 E5894855C35DCCCCCCCCCCCCCCCCCCCC0 linux.out
				INFO CODE_ID 554889E55DC3CCCCCCCCCCCCCCCCCCCC
				PUBLIC 1000 0 _start
				STACK CFI INIT 1000 6 .cfa: $rsp 8 + .ra: .cfa -8 + ^
				STACK CFI 1001 $rbp: .cfa -16 + ^ .cfa: $rsp 16 +
				STACK CFI 1004 .cfa: $rbp 16 +

lit/Modules/Breakpad/Inputs/identification-macosx.syms

This file was added.

				MODULE mac x86_64 D98C0E682089AA1BEACD6A8C1F16707B0 mac.out
				PUBLIC 0 0 _mh_execute_header
				PUBLIC f30 0 start
				STACK CFI INIT f30 6 .cfa: $rsp 8 + .ra: .cfa -8 + ^
				STACK CFI f31 $rbp: .cfa -16 + ^ .cfa: $rsp 16 +
				STACK CFI f34 .cfa: $rbp 16 +

lit/Modules/Breakpad/Inputs/identification-windows.syms

This file was added.

				MODULE windows x86 A0C9165780B5490981A1925EA62165C01 a.pdb
				INFO CODE_ID 5C01672A4000 a.exe
				FILE 1 c:\tmp\a.cpp
				PUBLIC 1000 0 main

lit/Modules/Breakpad/breakpad-identification.test

This file was added.

				RUN: lldb-test object-file %p/Inputs/identification-linux.syms | FileCheck %s --check-prefix=LINUX
				RUN: lldb-test object-file %p/Inputs/identification-macosx.syms | FileCheck %s --check-prefix=MAC
				RUN: lldb-test object-file %p/Inputs/identification-windows.syms | FileCheck %s --check-prefix=WINDOWS

				LINUX: Plugin name: breakpad
				LINUX: Architecture: x86_64--linux
				LINUX: UUID: 554889E5-5DC3-CCCC-CCCC-CCCCCCCCCCCC
				LINUX: Executable: false
				LINUX: Stripped: false
				LINUX: Type: debug info
				LINUX: Strata: user

				MAC: Plugin name: breakpad
				MAC: Architecture: x86_64--macosx
				MAC: UUID: 680E8CD9-8920-1BAA-EACD-6A8C1F16707B
				MAC: Executable: false
				MAC: Stripped: false
				MAC: Type: debug info
				MAC: Strata: user

				WINDOWS: Plugin name: breakpad
				WINDOWS: Architecture: i386--windows
				WINDOWS: UUID: 5716C9A0-B580-0949-81A1-925EA62165C0-01000000
				WINDOWS: Executable: false
				WINDOWS: Stripped: false
				WINDOWS: Type: debug info
				WINDOWS: Strata: user

lit/Modules/Breakpad/lit.local.cfg

This file was added.

				config.suffixes = ['.test']

source/Plugins/ObjectFile/Breakpad/CMakeLists.txt

This file was added.

				add_lldb_library(lldbPluginObjectFileBreakpad PLUGIN
				ObjectFileBreakpad.cpp

				LINK_LIBS
				lldbCore
				lldbHost
				lldbSymbol
				lldbUtility
				LINK_COMPONENTS
				Support
				)

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h

This file was added.

				//===-- ObjectFileBreakpad.h ---------------------------------- -*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLDB_PLUGINS_OBJECTFILE_BREAKPAD_OBJECTFILEBREAKPAD_H
				#define LLDB_PLUGINS_OBJECTFILE_BREAKPAD_OBJECTFILEBREAKPAD_H

				#include "lldb/Symbol/ObjectFile.h"
				#include "lldb/Utility/ArchSpec.h"
				#include "llvm/ADT/Triple.h"

				namespace lldb_private {
				namespace breakpad {

				class ObjectFileBreakpad : public ObjectFile {
				public:
				//------------------------------------------------------------------
				// Static Functions
				//------------------------------------------------------------------
				static void Initialize();
				static void Terminate();

				static ConstString GetPluginNameStatic();
				static const char *GetPluginDescriptionStatic() {
				return "Breakpad object file reader.";
				}

				static ObjectFile *
				CreateInstance(const lldb::ModuleSP &module_sp, lldb::DataBufferSP &data_sp,
				lldb::offset_t data_offset, const FileSpec *file,
				lldb::offset_t file_offset, lldb::offset_t length);

				static size_t GetModuleSpecifications(const FileSpec &file,
				lldb::DataBufferSP &data_sp,
				lldb::offset_t data_offset,
				lldb::offset_t file_offset,
				lldb::offset_t length,
				ModuleSpecList &specs);

				//------------------------------------------------------------------
				// PluginInterface protocol
				//------------------------------------------------------------------
				ConstString GetPluginName() override { return GetPluginNameStatic(); }

				uint32_t GetPluginVersion() override { return 1; }

				//------------------------------------------------------------------
				// ObjectFile Protocol.
				//------------------------------------------------------------------

				bool ParseHeader() override;

				lldb::ByteOrder GetByteOrder() const override {
				return m_arch.GetByteOrder();
				}

				bool IsExecutable() const override { return false; }

				uint32_t GetAddressByteSize() const override {
				return m_arch.GetAddressByteSize();
				}

				AddressClass GetAddressClass(lldb::addr_t file_addr) override {
				return AddressClass::eInvalid;
				}

				Symtab *GetSymtab() override;

				bool IsStripped() override { return false; }

				void CreateSections(SectionList &unified_section_list) override;

				void Dump(Stream *s) override {}

				bool GetArchitecture(ArchSpec &arch) override;

				bool GetUUID(UUID *uuid) override;

				FileSpecList GetDebugSymbolFilePaths() override { return FileSpecList(); }

				uint32_t GetDependentModules(FileSpecList &files) override { return 0; }

				Type CalculateType() override { return eTypeDebugInfo; }

				Strata CalculateStrata() override { return eStrataUser; }

				size_t ReadSectionData(Section *section, lldb::offset_t section_offset,
				void *dst, size_t dst_len) override;

				size_t ReadSectionData(Section *section,
				DataExtractor &section_data) override;

				private:
				struct Header {
				ArchSpec arch;
				UUID uuid;
				static llvm::Optional<Header> parse(llvm::StringRef text);
				};

				ArchSpec m_arch;
				UUID m_uuid;

				ObjectFileBreakpad(const lldb::ModuleSP &module_sp,
				lldb::DataBufferSP &data_sp, lldb::offset_t data_offset,
				const FileSpec *file, lldb::offset_t offset,
				lldb::offset_t length, const Header &header);
				};

				} // namespace breakpad
				} // namespace lldb_private
				#endif // LLDB_PLUGINS_OBJECTFILE_BREAKPAD_OBJECTFILEBREAKPAD_H

source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp

This file was added.

				//===-- ObjectFileBreakpad.cpp -------------------------------- -*- C++ -*-===//
				//
				// The LLVM Compiler Infrastructure
				//
				// This file is distributed under the University of Illinois Open Source
				// License. See LICENSE.TXT for details.
				//
				//===----------------------------------------------------------------------===//

				#include "Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h"
				#include "lldb/Core/ModuleSpec.h"
				#include "lldb/Core/PluginManager.h"
				#include "lldb/Utility/DataBuffer.h"
				#include "llvm/ADT/StringExtras.h"

				using namespace lldb;
				using namespace lldb_private;
				using namespace lldb_private::breakpad;

				static llvm::Triple::OSType toOS(llvm::StringRef str) {
				using llvm::Triple;
				return llvm::StringSwitch<Triple::OSType>(str)
				.Case("Linux", Triple::Linux)
				.Case("mac", Triple::MacOSX)
				.Case("windows", Triple::Win32)
				.Default(Triple::UnknownOS);
				}

				static llvm::Triple::ArchType toArch(llvm::StringRef str) {
				using llvm::Triple;
				return llvm::StringSwitch<Triple::ArchType>(str)
				.Case("arm", Triple::arm)
				.Case("arm64", Triple::aarch64)
				.Case("mips", Triple::mips)
				.Case("ppc", Triple::ppc)
				.Case("ppc64", Triple::ppc64)
				.Case("s390", Triple::systemz)
				.Case("sparc", Triple::sparc)
				.Case("sparcv9", Triple::sparcv9)
				.Case("x86", Triple::x86)
				.Case("x86_64", Triple::x86_64)
				.Default(Triple::UnknownArch);
				}

				static UUID parseModuleId(llvm::Triple::OSType os, llvm::StringRef str) {
				struct uuid_data {
				llvm::support::ulittle32_t uuid1;
				llvm::support::ulittle16_t uuid2[2];
				uint8_t uuid3[8];
				llvm::support::ulittle32_t age;
				} data;
				static_assert(sizeof(data) == 20, "");
				if (str.size() < 33 || str.size() > 40)
				return UUID();
				uint32_t t;
				if (to_integer(str.substr(0, 8), t, 16))
				data.uuid1 = t;
				else
				return UUID();
				for (int i = 0; i < 2; ++i) {
				if (to_integer(str.substr(8 + i * 4, 4), t, 16))
				data.uuid2[i] = t;
				else
				return UUID();
				}
				for (int i = 0; i < 8; ++i) {
				if (!to_integer(str.substr(16 + i * 2, 2), data.uuid3[i], 16))
				return UUID();
				}
				if (to_integer(str.substr(32), t, 16))
				data.age = t;
				else
				return UUID();

				// On non-windows, the age field should always be zero, so we don't include it,
				// to match the native uuid format of these platforms.
				return UUID::fromData(&data, os == llvm::Triple::Win32 ? 20 : 16);
				}

				llvm::Optional<ObjectFileBreakpad::Header>
				ObjectFileBreakpad::Header::parse(llvm::StringRef text) {
				// A valid module should start with something like:
				// MODULE Linux x86_64 E5894855C35DCCCCCCCCCCCCCCCCCCCC0 a.out
				// optionally followed by
				// INFO CODE_ID 554889E55DC3CCCCCCCCCCCCCCCCCCCC [a.exe]
				llvm::StringRef token, line;
				std::tie(line, text) = text.split('\n');
				std::tie(token, line) = getToken(line);
				if (token != "MODULE")
				return llvm::None;

				std::tie(token, line) = getToken(line);
				llvm::Triple triple;
				triple.setOS(toOS(token));
				if (triple.getOS() == llvm::Triple::UnknownOS)
				return llvm::None;

				std::tie(token, line) = getToken(line);
				triple.setArch(toArch(token));
				if (triple.getArch() == llvm::Triple::UnknownArch)
				return llvm::None;
				zturner (Not Done): Instead of having the custom parsing functions above, how about just: `std::tie(os, line) = getToken(line); std::tie(arch, line) = getToken(line); llvm::Triple triple(os, "unknown", arch); if (triple.getArch() == Unknown || triple.getOS() == Unknown) return llvm::None;` This way we don't even need to expose the parse functions I commented on earlier, and we can just delete them.

				llvm::StringRef module_id;
				std::tie(module_id, line) = getToken(line);

				std::tie(line, text) = text.split('\n');
				std::tie(token, line) = getToken(line);
				if (token == "INFO") {
				std::tie(token, line) = getToken(line);
				if (token != "CODE_ID")
				return llvm::None;

				std::tie(token, line) = getToken(line);
				// If we don't have any text following the code id (e.g. on linux), we
				// should use the code id as UUID. Otherwise, we revert back to the module
				// id.
				if (line.trim().empty()) {
				UUID uuid;
				if (uuid.SetFromStringRef(token, token.size() / 2) != token.size())
				return llvm::None;

				return Header{ArchSpec(triple), uuid};
				}
				}

				// We reach here if we don't have an INFO CODE_ID section, or we chose not to
				// use it. In either case, we need to properly decode the module id, whose
				// fields are encoded in big-endian.
				UUID uuid = parseModuleId(triple.getOS(), module_id);
				if (!uuid)
				return llvm::None;

				return Header{ArchSpec(triple), uuid};
				}
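
Editorial sketch (not part of the patch): the ID selection discussed in the inline comments above can be illustrated with a small standalone program. It assumes a module id of 32 hex digits followed by an "age" suffix, with the first three fields (4, 2 and 2 bytes) byte-swapped relative to `INFO CODE_ID`, which is what the example in the summary suggests. The names `getToken`, `swapHexBytes`, `decodeModuleId` and `pickUuidHex` are hypothetical stand-ins and use only the standard library rather than the LLVM/lldb types. The sketch always drops the age suffix and does not reproduce the Windows-specific age handling labath mentions.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the patch's getToken(): returns the first
// space-delimited token and the remainder of the line.
static std::pair<std::string_view, std::string_view>
getToken(std::string_view line) {
  const std::size_t begin = line.find_first_not_of(' ');
  if (begin == std::string_view::npos)
    return {std::string_view(), std::string_view()};
  line.remove_prefix(begin);
  const std::size_t end = line.find(' ');
  if (end == std::string_view::npos)
    return {line, std::string_view()};
  return {line.substr(0, end), line.substr(end + 1)};
}

// Reverses the byte order of a hex field (two hex digits per byte).
static std::string swapHexBytes(std::string_view field) {
  std::string out;
  for (std::size_t i = field.size(); i >= 2; i -= 2)
    out.append(field.substr(i - 2, 2));
  return out;
}

// Assumption: the id is 32 hex digits (fields of 4, 2, 2 and 8 bytes)
// followed by an age suffix, which this sketch simply drops.
static std::optional<std::string> decodeModuleId(std::string_view id) {
  if (id.size() < 33)
    return std::nullopt;
  std::string out = swapHexBytes(id.substr(0, 8));
  out += swapHexBytes(id.substr(8, 4));
  out += swapHexBytes(id.substr(12, 4));
  out.append(id.substr(16, 16)); // Last 8 bytes are kept in file order.
  return out;
}

// Prefers INFO CODE_ID when nothing follows the id on that line;
// otherwise falls back to the decoded module id from the MODULE line.
static std::optional<std::string> pickUuidHex(std::string_view module_line,
                                              std::string_view next_line) {
  std::string_view token, module_id, rest;
  std::tie(token, rest) = getToken(module_line);
  if (token != "MODULE")
    return std::nullopt;
  std::tie(token, rest) = getToken(rest);     // OS, unused in this sketch.
  std::tie(token, rest) = getToken(rest);     // Architecture, unused here.
  std::tie(module_id, rest) = getToken(rest);

  std::tie(token, rest) = getToken(next_line);
  if (token == "INFO") {
    std::tie(token, rest) = getToken(rest);
    if (token != "CODE_ID")
      return std::nullopt;
    std::string_view code_id;
    std::tie(code_id, rest) = getToken(rest);
    // A trailing filename token (as Windows dump_syms emits) means the
    // code id is not a debug id, so it is ignored.
    if (!code_id.empty() &&
        rest.find_first_not_of(' ') == std::string_view::npos)
      return std::string(code_id);
  }
  return decodeModuleId(module_id);
}
```

Fed the `MODULE` and `INFO CODE_ID` lines from the summary, both paths should agree: decoding the module id by hand gives the same hex string as the code id.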

				void ObjectFileBreakpad::Initialize() {
				PluginManager::RegisterPlugin(GetPluginNameStatic(),
				GetPluginDescriptionStatic(), CreateInstance,
				nullptr, GetModuleSpecifications);
				}

				void ObjectFileBreakpad::Terminate() {
				PluginManager::UnregisterPlugin(CreateInstance);
				}

				ConstString ObjectFileBreakpad::GetPluginNameStatic() {
				static ConstString g_name("breakpad");
				return g_name;
				}

				ObjectFile *ObjectFileBreakpad::CreateInstance(
				const ModuleSP &module_sp, DataBufferSP &data_sp, offset_t data_offset,
				const FileSpec *file, offset_t file_offset, offset_t length) {
				if (!data_sp) {
				data_sp = MapFileData(*file, length, file_offset);
				if (!data_sp)
				return nullptr;
				data_offset = 0;
				}
				llvm::StringRef text(reinterpret_cast<const char *>(data_sp->GetBytes()),
				data_sp->GetByteSize());
				zturner (Done): We have `GetData()` which returns an `ArrayRef`, and another function `toStringRef` which converts an `ArrayRef` to a `StringRef`. So this might be cleaner to write as `auto text = llvm::toStringRef(data_sp->GetData());`
				labath (Author, Done): Cool, I didn't know about that. Thanks.

				llvm::Optional<Header> header = Header::parse(text);
				if (!header)
				return nullptr;

				// Update the data to contain the entire file if it doesn't already
				if (data_sp->GetByteSize() < length) {
				data_sp = MapFileData(*file, length, file_offset);
				if (!data_sp)
				return nullptr;
				data_offset = 0;
				}

				return new ObjectFileBreakpad(module_sp, data_sp, data_offset, file,
				file_offset, length, *header);
				}

				size_t ObjectFileBreakpad::GetModuleSpecifications(
				const FileSpec &file, DataBufferSP &data_sp, offset_t data_offset,
				offset_t file_offset, offset_t length, ModuleSpecList &specs) {
				llvm::StringRef text(reinterpret_cast<const char *>(data_sp->GetBytes()),
				data_sp->GetByteSize());
				llvm::Optional<Header> header = Header::parse(text);
				if (!header)
				return 0;
				ModuleSpec spec(file, header->arch);
				spec.GetUUID() = header->uuid;
				specs.Append(spec);
				return 1;
				}

				ObjectFileBreakpad::ObjectFileBreakpad(const ModuleSP &module_sp,
				DataBufferSP &data_sp,
				offset_t data_offset,
				const FileSpec *file, offset_t offset,
				offset_t length, const Header &header)
				: ObjectFile(module_sp, file, offset, length, data_sp, data_offset),
				m_arch(header.arch), m_uuid(header.uuid) {}

				bool ObjectFileBreakpad::ParseHeader() {
				// We already parsed the header during initialization.
				return true;
				}

				Symtab *ObjectFileBreakpad::GetSymtab() {
				// TODO
				return nullptr;
				}

				bool ObjectFileBreakpad::GetArchitecture(ArchSpec &arch) {
				arch = m_arch;
				return true;
				}

				bool ObjectFileBreakpad::GetUUID(UUID *uuid) {
				*uuid = m_uuid;
				return true;
				}

				void ObjectFileBreakpad::CreateSections(SectionList &unified_section_list) {
				// TODO
				}

				size_t ObjectFileBreakpad::ReadSectionData(Section *section,
				lldb::offset_t section_offset,
				void *dst, size_t dst_len) {
				// TODO
				return 0;
				}

				size_t ObjectFileBreakpad::ReadSectionData(Section *section,
				DataExtractor &section_data) {
				// TODO
				return 0;
				}

source/Plugins/ObjectFile/CMakeLists.txt

				add_subdirectory(Breakpad)
	add_subdirectory(ELF)			add_subdirectory(ELF)
	add_subdirectory(Mach-O)			add_subdirectory(Mach-O)
	add_subdirectory(PECOFF)			add_subdirectory(PECOFF)
	add_subdirectory(JIT)			add_subdirectory(JIT)
	No newline at end of file

source/Symbol/ObjectFile.cpp

	void ObjectFile::RelocateSection(lldb_private::Section *section)			void ObjectFile::RelocateSection(lldb_private::Section *section)
	{			{
	}			}

	DataBufferSP ObjectFile::MapFileData(const FileSpec &file, uint64_t Size,			DataBufferSP ObjectFile::MapFileData(const FileSpec &file, uint64_t Size,
	uint64_t Offset) {			uint64_t Offset) {
	return FileSystem::Instance().CreateDataBuffer(file.GetPath(), Size, Offset);			return FileSystem::Instance().CreateDataBuffer(file.GetPath(), Size, Offset);
	}			}

				void llvm::format_provider<ObjectFile::Type>::format(
				const ObjectFile::Type &type, raw_ostream &OS, StringRef Style) {
				switch (type) {
				case ObjectFile::eTypeInvalid:
				OS << "invalid";
				break;
				case ObjectFile::eTypeCoreFile:
				OS << "core file";
				break;
				case ObjectFile::eTypeExecutable:
				OS << "executable";
				break;
				case ObjectFile::eTypeDebugInfo:
				OS << "debug info";
				break;
				case ObjectFile::eTypeDynamicLinker:
				OS << "dynamic linker";
				break;
				case ObjectFile::eTypeObjectFile:
				OS << "object file";
				break;
				case ObjectFile::eTypeSharedLibrary:
				OS << "shared library";
				break;
				case ObjectFile::eTypeStubLibrary:
				OS << "stub library";
				break;
				case ObjectFile::eTypeJIT:
				OS << "jit";
				break;
				case ObjectFile::eTypeUnknown:
				OS << "unknown";
				break;
				}
				}

				void llvm::format_provider<ObjectFile::Strata>::format(
				const ObjectFile::Strata &strata, raw_ostream &OS, StringRef Style) {
				switch (strata) {
				case ObjectFile::eStrataInvalid:
				OS << "invalid";
				break;
				case ObjectFile::eStrataUnknown:
				OS << "unknown";
				break;
				case ObjectFile::eStrataUser:
				OS << "user";
				break;
				case ObjectFile::eStrataKernel:
				OS << "kernel";
				break;
				case ObjectFile::eStrataRawImage:
				OS << "raw image";
				break;
				case ObjectFile::eStrataJIT:
				OS << "jit";
				break;
				}
				}

tools/lldb-test/SystemInitializerTest.cpp

#include "Plugins/Language/CPlusPlus/CPlusPlusLanguage.h"		#include "Plugins/Language/CPlusPlus/CPlusPlusLanguage.h"
#include "Plugins/Language/ObjC/ObjCLanguage.h"		#include "Plugins/Language/ObjC/ObjCLanguage.h"
#include "Plugins/Language/ObjCPlusPlus/ObjCPlusPlusLanguage.h"		#include "Plugins/Language/ObjCPlusPlus/ObjCPlusPlusLanguage.h"
#include "Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.h"		#include "Plugins/LanguageRuntime/CPlusPlus/ItaniumABI/ItaniumABILanguageRuntime.h"
#include "Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCRuntimeV1.h"		#include "Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCRuntimeV1.h"
#include "Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCRuntimeV2.h"		#include "Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCRuntimeV2.h"
#include "Plugins/LanguageRuntime/RenderScript/RenderScriptRuntime/RenderScriptRuntime.h"		#include "Plugins/LanguageRuntime/RenderScript/RenderScriptRuntime/RenderScriptRuntime.h"
#include "Plugins/MemoryHistory/asan/MemoryHistoryASan.h"		#include "Plugins/MemoryHistory/asan/MemoryHistoryASan.h"
		#include "Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h"
#include "Plugins/ObjectFile/ELF/ObjectFileELF.h"		#include "Plugins/ObjectFile/ELF/ObjectFileELF.h"
#include "Plugins/ObjectFile/Mach-O/ObjectFileMachO.h"		#include "Plugins/ObjectFile/Mach-O/ObjectFileMachO.h"
#include "Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.h"		#include "Plugins/ObjectFile/PECOFF/ObjectFilePECOFF.h"
#include "Plugins/Platform/Android/PlatformAndroid.h"		#include "Plugins/Platform/Android/PlatformAndroid.h"
#include "Plugins/Platform/FreeBSD/PlatformFreeBSD.h"		#include "Plugins/Platform/FreeBSD/PlatformFreeBSD.h"
#include "Plugins/Platform/Kalimba/PlatformKalimba.h"		#include "Plugins/Platform/Kalimba/PlatformKalimba.h"
#include "Plugins/Platform/Linux/PlatformLinux.h"		#include "Plugins/Platform/Linux/PlatformLinux.h"
#include "Plugins/Platform/MacOSX/PlatformMacOSX.h"		#include "Plugins/Platform/MacOSX/PlatformMacOSX.h"

SystemInitializerTest::~SystemInitializerTest() {}		SystemInitializerTest::~SystemInitializerTest() {}

llvm::Error		llvm::Error
SystemInitializerTest::Initialize(const InitializerOptions &options) {		SystemInitializerTest::Initialize(const InitializerOptions &options) {
if (auto e = SystemInitializerCommon::Initialize(options))		if (auto e = SystemInitializerCommon::Initialize(options))
return e;		return e;

		breakpad::ObjectFileBreakpad::Initialize();
		zturner (Done): Shouldn't we also initialize this in `SystemInitializerFull`?
		labath (Author, Done): Good point.
ObjectFileELF::Initialize();		ObjectFileELF::Initialize();
ObjectFileMachO::Initialize();		ObjectFileMachO::Initialize();
ObjectFilePECOFF::Initialize();		ObjectFilePECOFF::Initialize();

ScriptInterpreterNone::Initialize();		ScriptInterpreterNone::Initialize();


platform_freebsd::PlatformFreeBSD::Initialize();		platform_freebsd::PlatformFreeBSD::Initialize();
#endif
platform_android::PlatformAndroid::Terminate();		platform_android::PlatformAndroid::Terminate();
PlatformMacOSX::Terminate();		PlatformMacOSX::Terminate();
PlatformRemoteiOS::Terminate();		PlatformRemoteiOS::Terminate();
#if defined(__APPLE__)		#if defined(__APPLE__)
PlatformiOSSimulator::Terminate();		PlatformiOSSimulator::Terminate();
PlatformDarwinKernel::Terminate();		PlatformDarwinKernel::Terminate();
#endif		#endif

		breakpad::ObjectFileBreakpad::Terminate();
ObjectFileELF::Terminate();		ObjectFileELF::Terminate();
ObjectFileMachO::Terminate();		ObjectFileMachO::Terminate();
ObjectFilePECOFF::Terminate();		ObjectFilePECOFF::Terminate();

// Now shutdown the common parts, in reverse order.		// Now shutdown the common parts, in reverse order.
SystemInitializerCommon::Terminate();		SystemInitializerCommon::Terminate();
}		}

tools/lldb-test/lldb-test.cpp

for (const auto &File : opts::object::InputFilenames) {
ModulePtr->GetSymbolVendor();		ModulePtr->GetSymbolVendor();
SectionList *Sections = ModulePtr->GetSectionList();		SectionList *Sections = ModulePtr->GetSectionList();
if (!Sections) {		if (!Sections) {
llvm::errs() << "Could not load sections for module " << File << "\n";		llvm::errs() << "Could not load sections for module " << File << "\n";
HadErrors = 1;		HadErrors = 1;
continue;		continue;
}		}

		auto *ObjectPtr = ModulePtr->GetObjectFile();
		zturner (Done): I would use an explicit type spelling here, but since the function is called `GetObjectFile`, I don't feel too strongly. It's pretty clear what the return type is.

		Printer.formatLine("Plugin name: {0}", ObjectPtr->GetPluginName());
Printer.formatLine("Architecture: {0}",		Printer.formatLine("Architecture: {0}",
ModulePtr->GetArchitecture().GetTriple().getTriple());		ModulePtr->GetArchitecture().GetTriple().getTriple());
Printer.formatLine("UUID: {0}", ModulePtr->GetUUID().GetAsString());		Printer.formatLine("UUID: {0}", ModulePtr->GetUUID().GetAsString());
		Printer.formatLine("Executable: {0}", ObjectPtr->IsExecutable());
		Printer.formatLine("Stripped: {0}", ObjectPtr->IsStripped());
		Printer.formatLine("Type: {0}", ObjectPtr->GetType());
		Printer.formatLine("Strata: {0}", ObjectPtr->GetStrata());

size_t Count = Sections->GetNumSections(0);		size_t Count = Sections->GetNumSections(0);
Printer.formatLine("Showing {0} sections", Count);		Printer.formatLine("Showing {0} sections", Count);
for (size_t I = 0; I < Count; ++I) {		for (size_t I = 0; I < Count; ++I) {
AutoIndent Indent(Printer, 2);		AutoIndent Indent(Printer, 2);
auto S = Sections->GetSectionAtIndex(I);		auto S = Sections->GetSectionAtIndex(I);
assert(S);		assert(S);
Printer.formatLine("Index: {0}", I);		Printer.formatLine("Index: {0}", I);