This is an archive of the discontinued LLVM Phabricator instance.

Differential D55434

ObjectFileBreakpad: Implement sections
ClosedPublic

Authored by labath on Dec 7 2018, 7:08 AM.

Details

Reviewers

clayborg
zturner
lemo
markmentovai
amccarth

Commits

rGed42ea4707c3: ObjectFileBreakpad: Implement sections
rLLDB350511: ObjectFileBreakpad: Implement sections
rL350511: ObjectFileBreakpad: Implement sections

Summary

This patch allows ObjectFileBreakpad to parse the contents of Breakpad
files into sections. This sounds slightly odd at first, but in essence
it's not too different from how other object files handle things. For
example, in elf files, the symtab section consists of a number of
"records", where each record represents a single symbol. The same is
true for breakpad's PUBLIC section, except in this case, the records will be
textual instead of binary.

To keep sections contiguous, I create a new section every time the
record type changes. Normally, the breakpad processor will group all records of
the same type in one block, but the format allows them to be intermixed,
so in general, the "object file" may contain multiple sections with the
same record type.
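The splitting rule described in the summary can be sketched in plain C++ without any LLDB types. This is only an illustration under stated assumptions: `SectionSketch`, `keywordOf` and `splitIntoSections` are hypothetical names, not part of the patch, and lines without a recognised keyword are treated as line records belonging to a FUNC section.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical stand-in for a section: a record keyword plus the byte range
// [offset, offset + size) that it covers in the file.
struct SectionSketch {
  std::string name;
  std::size_t offset;
  std::size_t size;
};

// Returns the leading keyword of a line. Lines without a recognised keyword
// are assumed to be line records and are reported as "FUNC".
inline std::string keywordOf(const std::string &line) {
  const std::string first = line.substr(0, line.find(' '));
  for (const char *known :
       {"MODULE", "INFO", "FILE", "FUNC", "PUBLIC", "STACK"})
    if (first == known)
      return first;
  return "FUNC";
}

// Walks the text line by line and starts a new section whenever the record
// type differs from that of the previous line.
inline std::vector<SectionSketch> splitIntoSections(const std::string &text) {
  std::vector<SectionSketch> sections;
  std::size_t pos = 0;
  while (pos < text.size()) {
    const std::size_t nl = text.find('\n', pos);
    const std::size_t line_end = (nl == std::string::npos) ? text.size() : nl;
    const std::size_t next = (nl == std::string::npos) ? text.size() : nl + 1;
    assert(next > pos && "the loop must always make progress");
    const std::string kw = keywordOf(text.substr(pos, line_end - pos));
    if (sections.empty() || sections.back().name != kw)
      sections.push_back(SectionSketch{kw, pos, 0});
    // Extend the current section to the end of this line (newline included).
    sections.back().size = next - sections.back().offset;
    pos = next;
  }
  return sections;
}
```

With this rule, a file whose FUNC records are interrupted by a FILE record yields two separate sections named FUNC, which is the "multiple sections with the same record type" situation mentioned above.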

Diff Detail

Repository: rL LLVM

Event Timeline

labath created this revision. Dec 7 2018, 7:08 AM

Harbormaster completed remote builds in B25827: Diff 177209. Dec 7 2018, 7:08 AM

So sections should correspond to different kinds of sections, like .text, .data, etc. If we have the following breakpad file:

MODULE Linux x86_64 0000000024B5D199F0F766FFFFFF5DC30 linux.out
INFO CODE_ID 00000000B52499D1F0F766FFFFFF5DC3
FILE 0 /tmp/a.c
FUNC 1010 10 0 _start
1010 4 4 0
1014 5 5 0
1019 5 6 0
101e 2 7 0
FUNC 2010 10 0 main
2010 4 4 0
2014 5 5 0
2019 5 6 0
201e 2 7 0
PUBLIC 1010 0 _start
PUBLIC 2010 0 main
PUBLIC 3010 0 nodebuginfo1
PUBLIC 3020 0 nodebuginfo2

I would expect to have just 1 section named ".text" with read and execute permissions. This section would have its m_file_addr set to 0x1010 (from FUNC or PUBLIC with lowest address ("FUNC 1010 10 0 _start" in this case)). The size of the section would be set to the max code address + size minus the lowest code address (0x3020 - 0x1010 in this case). The file offset and file size should be set to zero since section contents are typically the bytes for the disassembly.
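As a rough illustration of the address arithmetic proposed here, the following standalone sketch scans FUNC and PUBLIC records and tracks the lowest start address and the highest end address. The names `CodeRange` and `computeCodeRange` are hypothetical. The sketch assumes the plain `FUNC address size ...` and `PUBLIC address ...` layouts with hexadecimal fields, and it treats PUBLIC records as zero-sized because they carry no size, so the computed end is only a lower bound.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical result type: the half-open address range [start, end).
struct CodeRange {
  bool valid = false;
  std::uint64_t start = 0;
  std::uint64_t end = 0;
};

// Scans FUNC and PUBLIC records and returns the range they span.
inline CodeRange computeCodeRange(const std::string &text) {
  CodeRange range;
  std::istringstream lines(text);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string keyword;
    std::uint64_t address = 0, size = 0;
    fields >> keyword;
    if (keyword == "FUNC")
      fields >> std::hex >> address >> size;
    else if (keyword == "PUBLIC")
      fields >> std::hex >> address; // PUBLIC records have no size field.
    else
      continue; // Not a record that describes code addresses.
    if (!fields)
      continue; // Malformed record, ignored in this sketch.
    const std::uint64_t end = address + size;
    if (!range.valid) {
      range.valid = true;
      range.start = address;
      range.end = end;
    } else {
      range.start = std::min(range.start, address);
      range.end = std::max(range.end, end);
    }
  }
  assert(!range.valid || range.start <= range.end);
  return range;
}
```

For the example file above this gives a start of 0x1010 and an end of 0x3020, matching the "0x3020 - 0x1010" size in the comment.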

Another way to do this would be to create a section for each FUNC whose m_file_addr is set to the FUNC start address, and whose size is set to the last line entry - FUNC start address. The name of the section can be set to the function name in this case. I am not a huge fan of this since it just creates extra sections for no reason, and the debug info will have this info anyway, so it will be duplicated.

I see no reason to create sections for MODULE, INFO, FILE, or STACK records. Was there a reason you wanted to create sections for all these? If you need the contents later, it seems like the initial parse pass of this file can easily store these items (as StringRef?) as a member variable of the ObjectFile.

Once you start parsing the debug info, the lldb::user_id_t for any items, like FUNC, can just be the line number or character offset for the FUNC source line within the file.

This revision now requires changes to proceed. Dec 7 2018, 8:51 AM

Another point of clarification is that sections exist in order to look up addresses and resolve addresses to a section within a file. The section should be something that can easily be slid around when loaded by LLDB when we are debugging or symbolicating. So any sections we create should be able to have the section load address set in the target with code like:

if (target.GetSectionLoadList().SetSectionLoadAddress(section_sp, section_sp->GetFileAddress() + slide))

All of the sections you added, except the FUNC section, wouldn't end up ever being loaded. All items besides FUNC might be better represented as symbols.

I guess I should elaborate more on the direction where I am going with this. I am trying to model these breakpad files as a debug-info-only object file, like something you would get by running, say, strip --only-keep-debug. This object file will contain a bunch of sections, but none of them will be real loadable sections. They will basically just be containers for data (DWARF, most likely). Like the .debug_*** sections, my sections also have vm_size set to 0, so there is no notion of them being in memory or being slid around. The idea is that this file will never be used as the main object file for a module (*), but rather as an object file that a symbol vendor uses to add symbol information to the module.

I have a follow-up patch to this (not yet ready for upload), where I create a SymbolFileBreakpad, which takes the "PUBLIC" and "FUNC" sections and uses them to add symbols into the symtab of the main object file using the interface we added for SymbolFilePDB (while doing that, I look up these addresses in the main object file and resolve them to real sections (.text, etc.)). Then, another part of that plugin would take the line information from the "FUNC" section, and convert that into a lldb_private::LineTable structure. So, basically the real action will happen in the SymbolFile plugin, and this ObjectFile is there just as a fancy container for the data.

(*) I am deliberately not handling the scenario where we have the main ObjectFile missing. We need to be able to handle the case when we cannot find the main object file regardless of whether we have the breakpad file around or not (and ProcessMinidump kind of does that right now, but I believe that should be generalized), so I am planning to have breakpad just piggy-back on that. Then if we have a real object file, we can get accurate section information from there. If not, then all of our symbols will resolve to some fake section encompassing the whole module.
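The lookup described here, resolving a breakpad address against whatever sections the main object file provides, could look roughly like the sketch below. It does not use the LLDB API. `LoadableSectionSketch`, `findContainingSection` and `placeholderSections` are hypothetical names, and the single placeholder section only stands in for the "fake section encompassing the whole module" idea.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical model of a loadable section in the main object file.
struct LoadableSectionSketch {
  std::string name;
  std::uint64_t file_addr;
  std::uint64_t size;
};

// Returns the section whose range [file_addr, file_addr + size) contains
// addr, or nullptr if none does.
inline const LoadableSectionSketch *
findContainingSection(const std::vector<LoadableSectionSketch> &sections,
                      std::uint64_t addr) {
  for (const LoadableSectionSketch &s : sections)
    if (addr >= s.file_addr && addr - s.file_addr < s.size)
      return &s;
  return nullptr;
}

// When no real object file is available, a single made-up section covering
// the whole module can take the place of the real section list.
inline std::vector<LoadableSectionSketch>
placeholderSections(std::uint64_t base, std::uint64_t size) {
  assert(size > 0 && "a placeholder must cover at least one byte");
  return {LoadableSectionSketch{"<placeholder>", base, size}};
}
```

The calling code is the same in both situations: it receives a section list from elsewhere and never needs to know whether the list came from a real file or was made up.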

Does this approach make sense?

So we do something like you describe with the dSYM files. The object file is "a.out" and it has a dSYM file "a.out.dSYM/Contents/Resources/DWARF/a.out", and the dSYM file will share sections with the "a.out" object file. So if you plan on loading the breakpad file as a symbol file that just needs some sections that it can give to the debug info it will eventually create, these sections must be the same between the binary and the breakpad object file. So I would still recommend trying to make sections that make sense like a real object file. It is also nice to be able to live off of the breakpad file itself. In this case the ObjectFile for breakpad, when it is standalone, should do as good a job as possible.

If you plan on not making the breakpad file ever stand alone, then you will need to take any addresses and look them up in the module section list and use the other sections. I don't see why the breakpad file can't be stand alone though. It won't be as accurate, but it sure would be nice to be able to load a bunch of them into LLDB without needing to find the original executable and just symbolicate no?

I could try to make it stand-alone, but that seems to me like a duplication of effort. And since the sections I could conjure up from the breakpad info would never match the original elf file, I would have to support both cases anyway, one using the sections from the object file, and one with its own, made-up sections.

I do intend to support both cases, but in a slightly different way. The way I see it, we actually have three cases to consider:

1. we have a stripped elf file, and a breakpad symbol file (the case of an unstripped elf file is uninteresting, as it will have much better debug info than breakpad can possibly provide)
2. we don't have an elf file, but we have a breakpad file
3. we have neither an elf file nor a breakpad file

Because of case 3, we have to do some section conjuring independently of any breakpad file. We already do that to some extent, and @lemo is getting ready to extend that in D55142. Once we have that, and we find a breakpad file for this module (case 2), the breakpad file should be able to just latch onto the sections created for the placeholder object file. And I believe ProcessMinidump is in a better position to create the "placeholder" sections, as it has access to both the load addresses and the sizes of the modules. From breakpad info, it would be hard to determine the size of the module if the highest address is occupied by a "PUBLIC" symbol, as they don't have any sizes associated with them.

Going back to case 1, if we have a stripped elf file, the breakpad file should latch onto the sections of that one without ever knowing the difference. So in the end, I hope this will produce clearer code because the concerns will be separated. Breakpad code will always deal with externally-provided sections, regardless of whether they come from a "real" object file, or a made-up one. And the "making-up" code can work independently of there being a breakpad file.

Ok. Check out my changes that parse region info:
https://reviews.llvm.org/D55522

It parses the memory region info from the linux maps info if it is available. In breakpad generated minidumps, this will give us enough info to correctly create sections for all object files in case #3!

Yes, that sounds like a really useful source of information.

labath added a child revision: D56173: Introduce SymbolFileBreakpad and use it to fill symtab. Dec 31 2018, 6:07 AM

I've uploaded D56173 to demonstrate how I intend to use the sections created here. The latter patch still requires some changes I only have locally (needed to make base address available to it), but the part about handling the sections is not affected by that.

Greg, given the intended usage of these sections as demonstrated in D56173, do you agree with representing the sections of the breakpad object file in this way?

So is this done as one section per function? Or one section for contiguous functions? What about if there are only symbols? I tried to read the code but wasn't able to decipher everything clearly in my head.

Just read the original description again and now the code makes sense. Main questions for me: is there a benefit to creating multiple sections? Can we just create one section and name it ".breakpad"? Should we not try to find a section that contains the address from the FUNC, line entry or PUBLIC and then avoid creating a section? That way we only create breakpad sections if there are no backing sections from a real object or symbol file?

I was under the impression I've convinced you of this direction, but your questions make it sound like you're going back to the "standalone" breakpad file idea (which I am not fond of). I'll try to explain again what I'm doing here. This is going to be somewhat repetitive (for which I apologise), but I am trying to explain this from a slightly different angle this time.

The sections I'm creating here aren't the kind of sections that will be loaded in memory. They're non-loadable sections (like the sections without SHF_ALLOC flag in elf), whose only purpose is to carry data around. Similar to how .debug_info is a non-loadable section that carries DWARF data. In this code, I'm not trying to infer anything about the layout of the described executable file from the data in the breakpad file. I am only presenting a view of the data in the breakpad file, so that this connection can happen in SymbolFileBreakpad. So, ObjectFileBreakpad will create a section whose contents will be _literally_

PUBLIC 1010 0 function1
PUBLIC 1020 0 function2
...

Then, SymbolFileBreakpad will take this section, parse it (like SymbolFileDWARF parses .debug_info), cross-reference the information with the real object, and create appropriate symbols. This happens in D56173. I am currently working on other patches which take the line records from the breakpad file and create line tables. So here, ObjectFileBreakpad will provide a "FUNC" section (because in breakpad files line records are attached to the preceding FUNC record), similar to how ObjectFileELF provides .debug_line. Then SymbolFileBreakpad parses and presents it to LLDB (like SymbolFileDWARF parses .debug_line).
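To make the division of labour concrete, here is a minimal sketch of what consuming the raw text of a PUBLIC section might involve. It is not the code from D56173. `PublicRecordSketch` and `parsePublicSection` are hypothetical names, and the sketch assumes the `PUBLIC address parameter_size name` layout with hexadecimal numeric fields, ignoring any optional fields the format may allow.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical parsed form of one PUBLIC record.
struct PublicRecordSketch {
  std::uint64_t address;
  std::uint64_t param_size;
  std::string name;
};

// Parses the literal text of a PUBLIC section, one record per line.
// Lines that do not look like PUBLIC records are skipped.
inline std::vector<PublicRecordSketch>
parsePublicSection(const std::string &contents) {
  std::vector<PublicRecordSketch> records;
  std::istringstream lines(contents);
  std::string line;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string keyword;
    PublicRecordSketch rec{};
    if (!(fields >> keyword) || keyword != "PUBLIC")
      continue;
    if (!(fields >> std::hex >> rec.address >> rec.param_size))
      continue;
    // The name is the rest of the line and may itself contain spaces.
    if (!std::getline(fields >> std::ws, rec.name))
      continue;
    assert(!rec.name.empty());
    records.push_back(rec);
  }
  return records;
}
```

Each parsed address would then be cross-referenced against the sections of the main object file before a symbol is created for it.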

In this sense, a breakpad file should be similar to a symbol-only ELF file (the kind you produce with strip --only-keep-debug) -- this one also doesn't contain any loadable sections, and is merely a container for the symbol data.

When I speak about "discontinuity" in this patch, it means discontinuity in the descriptions themselves, not in the data being described. So a breakpad file like:

FUNC 1000 10 0 function1
FILE 0 /tmp/foo.c
FUNC 1010 10 0 function2

is discontinuous because the two FUNC records are not next to each other even though the functions themselves are positioned one after the other. (I don't know why anyone would produce files like these, but the breakpad format description https://chromium.googlesource.com/breakpad/breakpad/+/master/docs/symbol_files.md explicitly allows that.)

Also note that neither ObjectFileBreakpad nor SymbolFileBreakpad creates any loadable sections. I don't think that is necessary, as that can be done elsewhere (and better). I just use whatever sections are present in the main object file of the module. In practice this will either be a real loadable object file (elf/macho/coff), or a placeholder object file that is created when opening a minidump file. I think this makes sense for several reasons:

determining the limits of the loadable section from the breakpad info is hard. There will always be some loaded data (various file headers, etc.) before the first symbol described by the breakpad file. We also cannot be sure of the upper limit of the section if the last symbol is a PUBLIC symbol (as they don't have a size). On the other hand, creating this from the minidump info is easy, as it knows the exact ranges (coming from /proc/pid/maps and similar).
better composability: having ObjectFileBreakpad be standalone will not allow us to get rid of the placeholder object files, as those will still be needed in cases when we don't have even the breakpad info. So we will need two branches in ObjectFileBreakpad (for when we have an object file vs. when we don't), and then placeholder files on top of that. Making breakpad files not be standalone lets us get rid of one of the branches in ObjectFileBreakpad.
Overall, I think object files should be as standalone as possible. They should not infer anything based on the information in other object files or elsewhere. They should just present the data that's present in the file itself and nothing more. Combining data from multiple sources should be done at a different level.

If this still hasn't convinced you :), and you think the standalone breakpad file is the better way to go, then I'd like to understand what are the advantages you see there, because right now I don't see any. In previous comments you were worried about being able to use a breakpad file without the matching exe file. If that isn't clear from the above, then I can reiterate that I do intend to support that flow. In fact, it is my primary use case. The only difference is I intend to achieve it via a combination of placeholder object files plus symbol information from breakpad files, rather than breakpad (object) files alone.

You have convinced me! Sorry I had paged out the original intent you conveyed from before the break. Thanks for the details.

This revision is now accepted and ready to land. Jan 4 2019, 11:08 AM

Thank you for the review.

Removing the self-accept (oops).

Closed by commit rL350511: ObjectFileBreakpad: Implement sections (authored by labath). Jan 7 2019, 3:17 AM

This revision was automatically updated to reflect the committed changes.

Herald added a subscriber: llvm-commits. Jan 7 2019, 3:17 AM

@labath This broke lldb on Debian stable:

In file included from /build/llvm-toolchain-snapshot-8~svn350764/tools/lldb/source/Utility/DataExtractor.cpp:10:
/build/llvm-toolchain-snapshot-8~svn350764/tools/lldb/include/lldb/Utility/DataExtractor.h:1099:29: error: non-constant-expression cannot be narrowed from type 'uint64_t' (aka 'unsigned long long') to 'size_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
    return {GetDataStart(), GetByteSize()};
                            ^~~~~~~~~~~~~
/build/llvm-toolchain-snapshot-8~svn350764/tools/lldb/include/lldb/Utility/DataExtractor.h:1099:29: note: insert an explicit cast to silence this issue
    return {GetDataStart(), GetByteSize()};
                            ^~~~~~~~~~~~~
                            static_cast<size_t>( )

On i386

Thanks for the heads-up. This should be fixed in r350834.
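For readers unfamiliar with this class of error: inside braces, converting a non-constant `uint64_t` to a narrower `size_t` is a narrowing conversion and is rejected, which only shows up on targets where `size_t` is 32 bits. The sketch below reproduces the shape of the problem with hypothetical stand-in types (`ByteSpanSketch`, `ExtractorSketch`) and applies the explicit cast that the compiler note suggests. Whether r350834 used this exact approach is not stated in the thread, so treat it as one possible fix.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for llvm::ArrayRef<uint8_t>: a pointer plus a size_t.
struct ByteSpanSketch {
  const std::uint8_t *data;
  std::size_t size;
};

// Hypothetical stand-in for the relevant part of DataExtractor.
class ExtractorSketch {
public:
  ExtractorSketch(const std::uint8_t *start, std::uint64_t size)
      : m_start(start), m_size(size) {
    assert(start != nullptr || size == 0);
  }

  const std::uint8_t *GetDataStart() const { return m_start; }
  std::uint64_t GetByteSize() const { return m_size; }

  ByteSpanSketch GetData() const {
    // `return {GetDataStart(), GetByteSize()};` narrows uint64_t to size_t
    // inside braces, which is an error where size_t is 32 bits wide.
    // The explicit cast states the truncation intent and compiles everywhere.
    return {GetDataStart(), static_cast<std::size_t>(GetByteSize())};
  }

private:
  const std::uint8_t *m_start;
  std::uint64_t m_size;
};
```

On 64-bit hosts both spellings compile, which is why the problem only appeared on i386.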

Revision Contents

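Path: Size
lldb/trunk/include/lldb/Utility/DataExtractor.h: 5 lines
lldb/trunk/lit/Modules/Breakpad/Inputs/discontiguous-sections.syms: 5 lines
lldb/trunk/lit/Modules/Breakpad/Inputs/sections-trailing-func.syms: 8 lines
lldb/trunk/lit/Modules/Breakpad/Inputs/sections.syms: 12 lines
lldb/trunk/lit/Modules/Breakpad/discontiguous-sections.test: 27 lines
lldb/trunk/lit/Modules/Breakpad/sections-trailing-func.test: 15 lines
lldb/trunk/lit/Modules/Breakpad/sections.test: 89 lines
lldb/trunk/source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp: 79 lines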

Diff 180450

lldb/trunk/include/lldb/Utility/DataExtractor.h

 //===-- DataExtractor.h -----------------------------------------*- C++ -*-===//
 //
 //                     The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//

 #ifndef LLDB_UTILITY_DATAEXTRACTOR_H
 #define LLDB_UTILITY_DATAEXTRACTOR_H

 #include "lldb/lldb-defines.h"
 #include "lldb/lldb-enumerations.h"
 #include "lldb/lldb-forward.h"
 #include "lldb/lldb-types.h"
+#include "llvm/ADT/ArrayRef.h"

 #include <cassert>
 #include <stdint.h>
 #include <string.h>

 namespace lldb_private {
 class Log;
 }
 [1,064 unchanged lines not shown]
   lldb::offset_t BytesLeft(lldb::offset_t offset) const {
     const lldb::offset_t size = GetByteSize();
     if (size > offset)
       return size - offset;
     return 0;
   }

   void Checksum(llvm::SmallVectorImpl<uint8_t> &dest, uint64_t max_data = 0);

+  llvm::ArrayRef<uint8_t> GetData() const {
+    return {GetDataStart(), GetByteSize()};
+  }
+
 protected:
   //------------------------------------------------------------------
   // Member variables
   //------------------------------------------------------------------
   const uint8_t *m_start; ///< A pointer to the first byte of data.
   const uint8_t
       *m_end; ///< A pointer to the byte that is past the end of the data.
   lldb::ByteOrder
 [11 unchanged lines not shown]

lldb/trunk/lit/Modules/Breakpad/Inputs/discontiguous-sections.syms

				MODULE Linux x86_64 0000000024B5D199F0F766FFFFFF5DC30 linux.out
				INFO CODE_ID 00000000B52499D1F0F766FFFFFF5DC3
				FILE 0 /tmp/a.c
				PUBLIC 1010 0 _start
				FILE 1 /tmp/b.c

lldb/trunk/lit/Modules/Breakpad/Inputs/sections-trailing-func.syms

				MODULE Linux x86_64 0000000024B5D199F0F766FFFFFF5DC30 linux.out
				INFO CODE_ID 00000000B52499D1F0F766FFFFFF5DC3
				FILE 0 /tmp/a.c
				FUNC 1010 10 0 _start
				1010 4 4 0
				1014 5 5 0
				1019 5 6 0
				101e 2 7 0

lldb/trunk/lit/Modules/Breakpad/Inputs/sections.syms

				MODULE Linux x86_64 0000000024B5D199F0F766FFFFFF5DC30 linux.out
				INFO CODE_ID 00000000B52499D1F0F766FFFFFF5DC3
				FILE 0 /tmp/a.c
				FUNC 1010 10 0 _start
				1010 4 4 0
				1014 5 5 0
				1019 5 6 0
				101e 2 7 0
				PUBLIC 1010 0 _start
				STACK CFI INIT 1010 10 .cfa: $rsp 8 + .ra: .cfa -8 + ^
				STACK CFI 1011 $rbp: .cfa -16 + ^ .cfa: $rsp 16 +
				STACK CFI 1014 .cfa: $rbp 16 +

lldb/trunk/lit/Modules/Breakpad/discontiguous-sections.test

				# Test handling discontiguous sections.
				RUN: lldb-test object-file %p/Inputs/discontiguous-sections.syms -contents | FileCheck %s

				CHECK: Showing 5 sections

				CHECK: ID: 0x1
				CHECK-NEXT: Name: MODULE

				CHECK: ID: 0x2
				CHECK-NEXT: Name: INFO

				CHECK: ID: 0x3
				CHECK-NEXT: Name: FILE
				CHECK: File size: 16
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 46494C45 2030202F 746D702F 612E630A |FILE 0 /tmp/a.c.|
				CHECK-NEXT: )

				CHECK: ID: 0x4
				CHECK-NEXT: Name: PUBLIC

				CHECK: ID: 0x5
				CHECK-NEXT: Name: FILE
				CHECK: File size: 16
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 46494C45 2031202F 746D702F 622E630A |FILE 1 /tmp/b.c.|
				CHECK-NEXT: )

lldb/trunk/lit/Modules/Breakpad/sections-trailing-func.test

				# Test handling of a (valid) breakpad file, which ends with a line without a
				# recognised keyword.

				RUN: lldb-test object-file %p/Inputs/sections-trailing-func.syms -contents | FileCheck %s

				CHECK: Showing 4 sections

				CHECK: ID: 0x4
				CHECK-NEXT: Name: FUNC
				CHECK: File size: 66
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 46554E43 20313031 30203130 2030205F 73746172 740A3130 31302034 20342030 |FUNC 1010 10 0 _start.1010 4 4 0|
				CHECK-NEXT: 0020: 0A313031 34203520 3520300A 31303139 20352036 20300A31 30316520 32203720 |.1014 5 5 0.1019 5 6 0.101e 2 7 |
				CHECK-NEXT: 0040: 300A |0.|
				CHECK-NEXT: )

lldb/trunk/lit/Modules/Breakpad/sections.test

				RUN: lldb-test object-file %p/Inputs/sections.syms -contents | FileCheck %s

				CHECK: Showing 6 sections

				CHECK: Index: 0
				CHECK-NEXT: ID: 0x1
				CHECK-NEXT: Name: MODULE
				CHECK-NEXT: Type: regular
				CHECK-NEXT: Permissions: ---
				CHECK-NEXT: Thread specific: no
				CHECK-NEXT: VM address: 0
				CHECK-NEXT: VM size: 0
				CHECK-NEXT: File size: 64
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 4D4F4455 4C45204C 696E7578 20783836 5F363420 30303030 30303030 32344235 |MODULE Linux x86_64 0000000024B5|
				CHECK-NEXT: 0020: 44313939 46304637 36364646 46464646 35444333 30206C69 6E75782E 6F75740A |D199F0F766FFFFFF5DC30 linux.out.|
				CHECK-NEXT: )

				CHECK: Index: 1
				CHECK-NEXT: ID: 0x2
				CHECK-NEXT: Name: INFO
				CHECK-NEXT: Type: regular
				CHECK-NEXT: Permissions: ---
				CHECK-NEXT: Thread specific: no
				CHECK-NEXT: VM address: 0
				CHECK-NEXT: VM size: 0
				CHECK-NEXT: File size: 46
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 494E464F 20434F44 455F4944 20303030 30303030 30423532 34393944 31463046 |INFO CODE_ID 00000000B52499D1F0F|
				CHECK-NEXT: 0020: 37363646 46464646 46354443 330A |766FFFFFF5DC3.|
				CHECK-NEXT: )

				CHECK: Index: 2
				CHECK-NEXT: ID: 0x3
				CHECK-NEXT: Name: FILE
				CHECK-NEXT: Type: regular
				CHECK-NEXT: Permissions: ---
				CHECK-NEXT: Thread specific: no
				CHECK-NEXT: VM address: 0
				CHECK-NEXT: VM size: 0
				CHECK-NEXT: File size: 16
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 46494C45 2030202F 746D702F 612E630A |FILE 0 /tmp/a.c.|
				CHECK-NEXT: )

				CHECK: Index: 3
				CHECK-NEXT: ID: 0x4
				CHECK-NEXT: Name: FUNC
				CHECK-NEXT: Type: regular
				CHECK-NEXT: Permissions: ---
				CHECK-NEXT: Thread specific: no
				CHECK-NEXT: VM address: 0
				CHECK-NEXT: VM size: 0
				CHECK-NEXT: File size: 66
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 46554E43 20313031 30203130 2030205F 73746172 740A3130 31302034 20342030 |FUNC 1010 10 0 _start.1010 4 4 0|
				CHECK-NEXT: 0020: 0A313031 34203520 3520300A 31303139 20352036 20300A31 30316520 32203720 |.1014 5 5 0.1019 5 6 0.101e 2 7 |
				CHECK-NEXT: 0040: 300A |0.|
				CHECK-NEXT: )

				CHECK: Index: 4
				CHECK-NEXT: ID: 0x5
				CHECK-NEXT: Name: PUBLIC
				CHECK-NEXT: Type: regular
				CHECK-NEXT: Permissions: ---
				CHECK-NEXT: Thread specific: no
				CHECK-NEXT: VM address: 0
				CHECK-NEXT: VM size: 0
				CHECK-NEXT: File size: 21
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 5055424C 49432031 30313020 30205F73 74617274 0A |PUBLIC 1010 0 _start.|
				CHECK-NEXT: )

				CHECK: Index: 5
				CHECK-NEXT: ID: 0x6
				CHECK-NEXT: Name: STACK
				CHECK-NEXT: Type: regular
				CHECK-NEXT: Permissions: ---
				CHECK-NEXT: Thread specific: no
				CHECK-NEXT: VM address: 0
				CHECK-NEXT: VM size: 0
				CHECK-NEXT: File size: 136
				CHECK-NEXT: Data: (
				CHECK-NEXT: 0000: 53544143 4B204346 4920494E 49542031 30313020 3130202E 6366613A 20247273 |STACK CFI INIT 1010 10 .cfa: $rs|
				CHECK-NEXT: 0020: 70203820 2B202E72 613A202E 63666120 2D38202B 205E0A53 5441434B 20434649 |p 8 + .ra: .cfa -8 + ^.STACK CFI|
				CHECK-NEXT: 0040: 20313031 31202472 62703A20 2E636661 202D3136 202B205E 202E6366 613A2024 | 1011 $rbp: .cfa -16 + ^ .cfa: $|
				CHECK-NEXT: 0060: 72737020 3136202B 0A535441 434B2043 46492031 30313420 2E636661 3A202472 |rsp 16 +.STACK CFI 1014 .cfa: $r|
				CHECK-NEXT: 0080: 62702031 36202B0A |bp 16 +.|
				CHECK-NEXT: )

lldb/trunk/source/Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.cpp

 //===-- ObjectFileBreakpad.cpp -------------------------------- -*- C++ -*-===//
 //
 //                     The LLVM Compiler Infrastructure
 //
 // This file is distributed under the University of Illinois Open Source
 // License. See LICENSE.TXT for details.
 //
 //===----------------------------------------------------------------------===//

 #include "Plugins/ObjectFile/Breakpad/ObjectFileBreakpad.h"
 #include "lldb/Core/ModuleSpec.h"
 #include "lldb/Core/PluginManager.h"
+#include "lldb/Core/Section.h"
 #include "lldb/Utility/DataBuffer.h"
 #include "llvm/ADT/StringExtras.h"

 using namespace lldb;
 using namespace lldb_private;
 using namespace lldb_private::breakpad;

 namespace {
 struct Header {
   ArchSpec arch;
   UUID uuid;
   static llvm::Optional<Header> parse(llvm::StringRef text);
 };
+
+enum class Token { Unknown, Module, Info, File, Func, Public, Stack };
 } // namespace

+static Token toToken(llvm::StringRef str) {
+  return llvm::StringSwitch<Token>(str)
+      .Case("MODULE", Token::Module)
+      .Case("INFO", Token::Info)
+      .Case("FILE", Token::File)
+      .Case("FUNC", Token::Func)
+      .Case("PUBLIC", Token::Public)
+      .Case("STACK", Token::Stack)
+      .Default(Token::Unknown);
+}
+
+static llvm::StringRef toString(Token t) {
+  switch (t) {
+  case Token::Unknown:
+    return "";
+  case Token::Module:
+    return "MODULE";
+  case Token::Info:
+    return "INFO";
+  case Token::File:
+    return "FILE";
+  case Token::Func:
+    return "FUNC";
+  case Token::Public:
+    return "PUBLIC";
+  case Token::Stack:
+    return "STACK";
+  }
+  llvm_unreachable("Unknown token!");
+}
+
 static llvm::Triple::OSType toOS(llvm::StringRef str) {
   using llvm::Triple;
   return llvm::StringSwitch<Triple::OSType>(str)
       .Case("Linux", Triple::Linux)
       .Case("mac", Triple::MacOSX)
       .Case("windows", Triple::Win32)
       .Default(Triple::UnknownOS);
 }
 [62 unchanged lines not shown]
 llvm::Optional<Header> Header::parse(llvm::StringRef text) {
   // A valid module should start with something like:
   // MODULE Linux x86_64 E5894855C35DCCCCCCCCCCCCCCCCCCCC0 a.out
   // optionally followed by
   // INFO CODE_ID 554889E55DC3CCCCCCCCCCCCCCCCCCCC [a.exe]
   llvm::StringRef token, line;
   std::tie(line, text) = text.split('\n');
   std::tie(token, line) = getToken(line);
-  if (token != "MODULE")
+  if (toToken(token) != Token::Module)
     return llvm::None;

   std::tie(token, line) = getToken(line);
   llvm::Triple triple;
   triple.setOS(toOS(token));
   if (triple.getOS() == llvm::Triple::UnknownOS)
     return llvm::None;

 [116 unchanged lines not shown]
 }

 bool ObjectFileBreakpad::GetUUID(UUID *uuid) {
   *uuid = m_uuid;
   return true;
 }

 void ObjectFileBreakpad::CreateSections(SectionList &unified_section_list) {
-  // TODO
+  if (m_sections_ap)
+    return;
+  m_sections_ap = llvm::make_unique<SectionList>();
+
+  Token current_section = Token::Unknown;
+  offset_t section_start;
+  llvm::StringRef text = toStringRef(m_data.GetData());
+  uint32_t next_section_id = 1;
+  auto maybe_add_section = [&](const uint8_t *end_ptr) {
+    if (current_section == Token::Unknown)
+      return; // We have been called before parsing the first line.
+
+    offset_t end_offset = end_ptr - m_data.GetDataStart();
+    auto section_sp = std::make_shared<Section>(
+        GetModule(), this, next_section_id++,
+        ConstString(toString(current_section)), eSectionTypeOther,
+        /*file_vm_addr*/ 0, /*vm_size*/ 0, section_start,
+        end_offset - section_start, /*log2align*/ 0, /*flags*/ 0);
+    m_sections_ap->AddSection(section_sp);
+    unified_section_list.AddSection(section_sp);
+  };
+  while (!text.empty()) {
+    llvm::StringRef line;
+    std::tie(line, text) = text.split('\n');
+
+    Token token = toToken(getToken(line).first);
+    if (token == Token::Unknown) {
+      // We assume this is a line record, which logically belongs to the Func
+      // section. Errors will be handled when parsing the Func section.
+      token = Token::Func;
+    }
+    if (token == current_section)
+      continue;
+
+    // Changing sections, finish off the previous one, if there was any.
+    maybe_add_section(line.bytes_begin());
+    // And start a new one.
+    current_section = token;
+    section_start = line.bytes_begin() - m_data.GetDataStart();
+  }
+  // Finally, add the last section.
+  maybe_add_section(m_data.GetDataEnd());
 }