This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

lldb/
    include/lldb/Symbol/
        Symbol.h (11/11)
    source/
        Plugins/
            ObjectFile/
                CMakeLists.txt
                JSON/
                    CMakeLists.txt
                    ObjectFileJSON.h (2/2)
                    ObjectFileJSON.cpp (8/8)
            SymbolFile/
                CMakeLists.txt
                JSON/
                    CMakeLists.txt
                    SymbolFileJSON.h
                    SymbolFileJSON.cpp (6/6)
        Symbol/
            Symbol.cpp (8/8)
    test/API/macosx/symbols/
        Makefile
        TestSymbolFileJSON.py (3/3)
        main.c
    unittests/Symbol/
        CMakeLists.txt
        JSONSymbolTest.cpp (7/8)

Differential D145180

[lldb] Introduce new SymbolFileJSON and ObjectFileJSON
ClosedPublic

Authored by JDevlieghere on Mar 2 2023, 12:57 PM.

Details

Reviewers

jingham
clayborg
labath
DavidSpickett
mib
jdoerfert

Commits

rGcf3524a5746f: [lldb] Introduce new SymbolFileJSON and ObjectFileJSON

Summary

Introduce a new object and symbol file format with the goal of mapping addresses to symbol names. I'd like to think of it as an extremely simple textual symtab. The new file format contains a triple, a UUID and a list of address-to-symbol-name mappings. JSON is used for the encoding, but that's mostly an implementation detail: any other encoding could achieve the same thing. However, I did purposely pick a human-readable format.

The new format is motivated by two use cases:

Stripped binaries: when a binary is stripped, you lose the ability to do things like setting symbolic breakpoints. You can keep the unstripped binary around, but if all you need is the stripped symbols then that's a lot of overhead. Instead, we could save the stripped symbols to a file and load them in the debugger when needed. I want to extend llvm-strip to have a mode where it emits this new file format.
Interactive crashlogs: with interactive crashlogs, if we don't have the binary or the dSYM for a particular module, we currently show an unnamed symbol for those frames. This is a regression compared to the textual format, which has these frames pre-symbolicated. Given that this information is available in the JSON crashlog, we need a way to tell LLDB about it. With the new symbol file format, we can easily synthesize a symbol file for each of those modules and load them to symbolicate those frames.

Here's an example of the file format:

{
    "triple": "arm64-apple-macosx13.0.0",
    "uuid": "36D0CCE7-8ED2-3CA3-96B0-48C1764DA908",
    "symbols": [
        {
            "name": "main",
            "addr": 4294983568
        },
        {
            "name": "foo",
            "addr": 4294983560
        }
    ]
}

I've added a test case that illustrates the stripped binary workflow. For the interactive crashlogs, we'll need to extend the crashlog script.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

JDevlieghere created this revision. · Mar 2 2023, 12:57 PM

Herald added a project: Restricted Project. · Mar 2 2023, 12:57 PM

Herald added a subscriber: kristof.beyls.

JDevlieghere requested review of this revision. · Mar 2 2023, 12:57 PM

Herald added a reviewer: jdoerfert. · Mar 2 2023, 12:57 PM

Herald added a project: Restricted Project.

Herald added a subscriber: sstefan1.

Harbormaster completed remote builds in B217035: Diff 501944. · Mar 2 2023, 1:01 PM

Left some comments, but overall this looks good to me :) Thanks for taking this on!

lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
38
48–53	Why do we do this twice?
93–112	I don't see the point of re-parsing the file here, we could just save the parsed object as a class member.
146	Cool 😁
lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp
113	Should we increment the `symID` here?

Given that json numbers are read into double, should the addresses be stored as strings to avoid any issues with address integer values that can't be represented as double (>=2**53)?
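The concern about 2**53 can be illustrated with a small standalone sketch. It only shows the limitation of IEEE-754 doubles; it does not claim anything about how LLVM's JSON library stores integers, and `SurvivesDoubleRoundTrip` is a hypothetical helper name introduced for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Returns true if `addr` is unchanged after being converted to a 64-bit
// IEEE-754 double and back. A double has a 53-bit significand, so not every
// integer above 2^53 is representable.
bool SurvivesDoubleRoundTrip(uint64_t addr) {
  const double as_double = static_cast<double>(addr);
  // Values close to UINT64_MAX round up to exactly 2^64, which cannot be
  // converted back to uint64_t without undefined behavior.
  if (as_double >= 18446744073709551616.0) // 2^64
    return false;
  return static_cast<uint64_t>(as_double) == addr;
}

// Usage: the address from the summary is safe, 2^53 + 1 is not.
void SelfCheck() {
  assert(SurvivesDoubleRoundTrip(4294983568ULL));
  assert(!SurvivesDoubleRoundTrip((1ULL << 53) + 1));
}
```

Typical user-space addresses fit comfortably below 2^53, so the issue would mainly affect high kernel or tagged addresses if a parser did go through `double`.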

JDevlieghere marked 5 inline comments as done. · Mar 2 2023, 2:26 PM

JDevlieghere added inline comments.

lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
38	See my comment below for why this is split up.
48–53	When we create an object file instance, we usually read the first 512 bytes to read the magic. If we haven't read anything at all, the code above reads in the file. If we've only read the first 512 bytes, we read the rest of the file here.
93–112	That doesn't work because `GetModuleSpecifications` is a static method.
lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp
113	The IDs are arbitrary, and if we start at zero, we'll have conflicts with the ones already in the symbol table (e.g. the lldb_unnamed_symbols for stripped binaries). We could check the size of the symtab and continue counting from there. Or we can use 0 like we did here to indicate that these values are "special". I went with the latter approach mostly because that's what SymbolFileBreakpad does too.

I like the idea of this, but I would like to see it be a bit more complete. One idea is to remove the ObjectFileJSON::Symbol definition, just use lldb_private::Symbol objects, and allow these objects to be encoded to and decoded from JSON. Then we have the ability to re-create any symbols we need in full fidelity. That might not be the goal in your case, but I don't think it would hurt to use lldb_private::Symbol as the object we serialize to and deserialize from JSON, as ObjectFileJSON::Symbol just can't reproduce the full depth of the symbols we want.

From reading this it looks like your main use case is to supply additional symbols to an existing stripped binary. Is that correct? Are you aware that each dSYM file has a full copy of the unstripped symbol table already? It removes the debug map entries, but each dSYM copies all non debug map symbols inside of it already. So the same thing from this patch can already be accomplished by stripping a dSYM file of all debug info sections and leaving the symbol table alone, and then using this minimal dSYM file just to get the symbols.

Any idea on where the JSON file will live if it is just a companion file for another mach-o or ELF executable? Will it always be next to the mach-o executable? Will we enable a Spotlight importer to find it like we do for dSYM files?

lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp
134	Don't we want to create a Symtab from the ObjectFileJSON::Symbol vector here? Symbols are not considered debug info. Debug info for functions is supposed to be more rich than just address and name.
lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.h
93–96	It would be great to be able to also define sections somewhere. I can think of this being really handy for core file serialization where we generate a core file and then store extra metadata that describes the object file in this JSON format. It would be great to have sections for this, and we really need sections to make ObjectFileJSON be able to represent something that actually looks like a real object file. See Section.h for what we expect a section to have, but we can just store uint64_t values instead of lldb_private::Address values for the start address of the range for the section.
98–101	A few things to think about:
- Some symbols have a byte size (like in ELF, or mach-o symbols in the debug map that have pairs).
- Some symbols have absolute values that are not actually addresses; there is currently no way to represent this here.
- Some symbols have symbol types. We could allow symbols to specify a lldb::SymbolType by name?
- If we have symbol types other than just symbols with addresses, some might refer to a sibling symbol by ID or index. It might be good to allow symbols to have integer IDs.
One suggestion would be to define Symbol as:
    struct Symbol {
      uint64_t value;
      std::optional<uint64_t> size;
      std::optional<uint64_t> id;
      std::optional<SymbolType> type;
      std::string name;
      bool value_is_address; // Set to true if "value" is an address, false if "value" is just an integer with a different meaning
    };
lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp
88	If we have no debug info in SymbolFileJSON, then there is really no need to add the symbols here; we can avoid this whole SymbolFileJSON class and have ObjectFileJSON parse the JSON symbols and add them to the Symtab in: void ObjectFileJSON::ParseSymtab(Symtab &symtab); So if we are not going to add JSON debug info to the ObjectFileJSON schema, then I would vote to remove this class and just do everything from ObjectFileJSON.
113	Regardless of where the code goes that converts ObjectFileJSON::Symbol to lldb_private::Symbol, I would vote that we include enough information in ObjectFileJSON::Symbol so that we don't have to make up symbol IDs, or set all symbol IDs to zero. LLDB relies on having unique symbol IDs in each lldb_private::Symbol instance.

Thanks for the feedback Greg, they're all great suggestions.

In D145180#4166302, @clayborg wrote:

From reading this it looks like your main use case is to supply additional symbols to an existing stripped binary. Is that correct?

That's one use case, the other one being the interactive crashlogs. I went into a bit more detail in the summary.

Are you aware that each dSYM file has a full copy of the unstripped symbol table already? It removes the debug map entries, but each dSYM copies all non debug map symbols inside of it already. So the same thing from this patch can already be accomplished by stripping a dSYM file of all debug info sections and leaving the symbol table alone, and then using this minimal dSYM file just to get the symbols.

Yup and for the strip scenario I described above, we wouldn't even have to go through a dSYM, we could just have llvm-strip emit a Mach-O with only the unstripped symbol table and that should work out of the box in LLDB (similar to how you can add the unstripped binary with target symbols add). But for the crashlog use case where we only have an address and a symbol, it would be really tedious to have to build the whole symbol table. I really like the idea of a textual format for this and it's easy to read and modify. The barrier is super low and even if you had nothing but the textual output of nm you could create one of these JSON files and symbolicate your binary in LLDB.
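To illustrate how low the barrier is, here is a minimal standalone sketch that turns BSD-style `nm` output into the layout from the summary. `NmToJson` is a hypothetical helper, the `"addr"` key follows the summary's example (later revisions of the patch discuss `"address"` and `"value"` instead), and symbol names are assumed not to need JSON escaping.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>

// Converts `nm` output ("<hex address> <type> <name>" per line) into a JSON
// document with "triple", "uuid" and "symbols" keys. Lines without an address
// (for example undefined symbols) are skipped.
std::string NmToJson(const std::string &nm_output, const std::string &triple,
                     const std::string &uuid) {
  std::istringstream lines(nm_output);
  std::ostringstream out;
  out << "{\"triple\": \"" << triple << "\", \"uuid\": \"" << uuid
      << "\", \"symbols\": [";
  std::string line;
  bool first = true;
  while (std::getline(lines, line)) {
    std::istringstream fields(line);
    std::string addr_hex, type, name;
    if (!(fields >> addr_hex >> type >> name))
      continue; // Fewer than three fields: no address on this line.
    uint64_t addr = 0;
    try {
      size_t consumed = 0;
      addr = std::stoull(addr_hex, &consumed, 16);
      if (consumed != addr_hex.size())
        continue; // First field is not a pure hex number.
    } catch (const std::exception &) {
      continue;
    }
    out << (first ? "" : ", ") << "{\"name\": \"" << name
        << "\", \"addr\": " << addr << "}";
    first = false;
  }
  out << "]}";
  return out.str();
}
```

Feeding it the line `0000000100003f90 T main` yields an entry with address 4294983568, matching the `main` entry in the summary's example.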

Any idea on where the JSON file will live if it is just a companion file for another mach-o or ELF executable? Will it always be next to the mach-o executable? Will we enable a Spotlight importer to find it like we do for dSYM files?

For now I have no plans to have LLDB pick these files up automatically, but that's definitely something we could explore in the future.

Address Greg's feedback

Harbormaster completed remote builds in B217334: Diff 502330. · Mar 3 2023, 6:18 PM

If these files can be used as the only source of information (without a stripped executable), we really should include a serialized SectionList in the JSON that can be loaded into ObjectFileJSON. This would be very useful for easily creating unit tests.

lldb/include/lldb/Symbol/Symbol.h
23	Do we need something that says "value is an address"? Or are we inferring that from the lldb::SymbolType?
lldb/test/API/macosx/symbols/TestSymbolFileJSON.py
40	We probably should test that the "id" and "section_id" fields work correctly. We should also test that we are able to make symbols with only an address and name, then add tests for symbols that each add a new optional value that can be specified to ensure we can correctly make symbols.

clayborg added inline comments. · Mar 6 2023, 2:41 PM

lldb/include/lldb/Symbol/Symbol.h
352–357	Do we want to stick with JSONSymbol or just teach lldb_private::Symbol to serialize and deserialize from JSON? If we do the latter, we can create any type of symbol we need and it would be useful for easily making unit tests that could use ObjectFileJSON as the basis.

JDevlieghere marked 3 inline comments as done. · Mar 6 2023, 2:47 PM

JDevlieghere added inline comments.

lldb/include/lldb/Symbol/Symbol.h
23	In the Symbol constructor that takes a JSONSymbol, I assume the value is an address if there's no section_id and an offset otherwise. We can definitely tweak that or add a sanity check based on the symbol type.
352–357	I think we really want the `JSONSymbol` class. That also matches what we do for the tracing objects as well, where there's a JSON object and a "real" object. For the crashlog use case where I don't have sections in the JSON, I want to be able to reuse the existing section list from the module so I need a way to pass that in. To (de)serialize the Symbol object directly, we'd need a way to pass that in, which I don't think exists today.
lldb/test/API/macosx/symbols/TestSymbolFileJSON.py
40	Yep, I saw that working manually but I can extend the test case to include that.

clayborg added inline comments. · Mar 6 2023, 3:06 PM

lldb/include/lldb/Symbol/Symbol.h
23	Remember there are some symbols that are not addresses. Think of trying to create a N_OSO symbol where the value isn't an address, but it is a time stamp integer. Symbols in the symbol table for mach-o files can have a section ID that is valid and the value is not an offset, it is still an address. Not sure if that matters. One way to make this clear is if we have a "section_id" then "value" is an address, and if not, then "value" is just an integer? It does require that we always specify section IDs though. Also, section IDs are not human readable; we could change "std::optional<uint64_t> section_id;" to be "std::optional<std::string> section;" and this becomes the fully qualified section name like "__TEXT.__text" or "PT_LOAD[0]..text". Much more readable. And easy to dig up the section from the Module's section list.
23	Another way to do this would be to have JSONSymbol know if it has an address or just an integer value by defining this as: std::optional<uint64_t> address; std::optional<uint64_t> value; The deserializer would verify that only one of these is valid and fail if both or neither were specified
352–357	Any ObjectFile can always get the section list from the Module, so if there is no serialized section list, ObjectFileJSON doesn't need to provide any sections, and it can just call: SectionList *sections = GetModule()->GetSectionList(); Any object files within a module are asked to add their own sections if they have any extra sections. We do this with dSYM files where we add the DWARF segment to the Module's main section list. So the executable adds "__PAGEZERO, __TEXT, __DATA, etc", then the dSYM object file just adds any extra sections it needs with a call to: void ObjectFile::CreateSections(SectionList &unified_section_list); So if we have a section list already in ObjectFileJSON, we need to verify that they all match with any pre-existing sections that are already in the SectionList, and can add any extra sections if needed. If the ObjectFileJSON was the only object file, it could fully populate a Module's section list on its own.

JDevlieghere marked 3 inline comments as done. · Mar 6 2023, 3:13 PM

JDevlieghere added inline comments.

lldb/include/lldb/Symbol/Symbol.h
23	Sure, that sounds reasonable.
352–357	I'm aware the data is present but I don't see how to pass that into the JSON deserialization function if we don't go through an intermediary object. As far as I can tell, there's no way to pass this additional data into `fromJSON`. That's why I have a ctor (`Symbol(const JSONSymbol &symbol, SectionList *section_list);`) that takes both the deserialized symbol and the section list. If there's a way to do that I'm happy to scrap the `JSONSymbol`.

clayborg added inline comments. · Mar 6 2023, 3:35 PM

lldb/include/lldb/Symbol/Symbol.h
23	And we can add tests that verify we get an error back if neither or both are specified.
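The "exactly one of address or value" rule discussed in this thread can be sketched as follows. This is a standalone illustration, not the code from the patch: `JSONSymbolSketch` mirrors only a subset of the proposed fields, and `Validate` and its error messages are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Stand-in for the JSONSymbol struct under review (subset of its fields).
struct JSONSymbolSketch {
  std::optional<uint64_t> address;
  std::optional<uint64_t> value;
  std::string name;
};

// Returns an error message if the symbol is malformed, or std::nullopt if it
// specifies exactly one of "address" and "value".
std::optional<std::string> Validate(const JSONSymbolSketch &symbol) {
  if (symbol.address && symbol.value)
    return std::string("symbol cannot contain both a value and an address");
  if (!symbol.address && !symbol.value)
    return std::string("symbol must contain either a value or an address");
  return std::nullopt;
}

// Usage: an address-only symbol is accepted, an empty one is rejected.
void SelfCheck() {
  assert(!Validate({4294983568ULL, std::nullopt, "main"}));
  assert(Validate({std::nullopt, std::nullopt, "main"}).has_value());
}
```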

Implement Greg's features. I'll add a unit test for the deserialization later today.

Harbormaster completed remote builds in B217749: Diff 502854. · Mar 6 2023, 4:28 PM

clayborg added inline comments. · Mar 7 2023, 2:45 PM

lldb/include/lldb/Symbol/Symbol.h
27	Come to think of it, we might not need the section as a name as it adds no real value unless we want to add a "std::optional<uint64_t> sect_offset;" field that could be specified. We could switch to saying "if we have a valid address member, we must be able to look it up in the section lists by address".
lldb/source/Symbol/Symbol.cpp
44–49	We probably should make this function into a static factory function that returns an llvm::Expected<Symbol>, so we can return an error if the symbol has no value or address, or if we can't resolve the address in a section: llvm::Expected<Symbol> Symbol::CreateSymbol(const JSONSymbol &symbol, SectionList *section_list); Then we can return errors for the above cases and also for the ones below.
55–74	This code should probably check if we have a "symbol.address" first, then always fill out the address range correctly and expect a valid section_sp like the code above is doing. If we have a symbol.value, then we don't expect the symbol to have an address; in that case we fill in the AddressRange with no section and "symbol.value" as the offset. I am not sure we need a "JSONSymbol::section" member anymore: since we now know whether this is an address, we can expect the section lookup to succeed and return an error if it fails, so we can probably remove "JSONSymbol::section". We could leave it in if we want to specify an offset directly.
lldb/test/API/macosx/symbols/TestSymbolFileJSON.py
119–120	We should never have "section" and "value" as a combination. See the discussion above about possibly removing the "section" field, as we might not need it if we can assume:
- if "address" is valid, we must be able to look it up by address in the section list
- if "value" is valid, it just gets encoded as in the suggested code edit

JDevlieghere marked an inline comment as done. · Mar 7 2023, 2:51 PM

JDevlieghere added inline comments.

lldb/source/Symbol/Symbol.cpp
44–49	Do you prefer to handle the error here or during deserialization? Currently these things are enforced there. I don't think it makes sense to check the same invariant twice. I'm happy to move it here though if you think that's better.

JDevlieghere marked an inline comment as done. · Mar 7 2023, 3:04 PM

JDevlieghere added inline comments.

lldb/source/Symbol/Symbol.cpp
44–49	Oh I missed the case where the address doesn't belong to a section. Yeah we definitely need to diagnose that here.
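The factory-function shape discussed here, including the "address doesn't belong to a section" error, might look roughly like the standalone sketch below. It uses `std::variant` as a stand-in for `llvm::Expected`; `SectionSketch`, `ResolvedSymbol`, `Resolve` and the error strings are all hypothetical and do not reflect the actual `Symbol::FromJSON` implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <variant>
#include <vector>

struct SectionSketch {
  std::string name;
  uint64_t base; // File address where the section starts.
  uint64_t size; // Size in bytes.
};

struct ResolvedSymbol {
  std::string name;
  std::optional<std::string> section; // Empty for raw-value symbols.
  uint64_t offset_or_value; // Section offset, or the raw value.
};

// Holds either the resolved symbol or an error message.
using SymbolOrError = std::variant<ResolvedSymbol, std::string>;

SymbolOrError Resolve(const std::string &name, std::optional<uint64_t> address,
                      std::optional<uint64_t> value,
                      const std::vector<SectionSketch> &sections) {
  if (address.has_value() == value.has_value())
    return std::string("symbol must have exactly one of address or value");
  if (value) // Raw value: no section, the value is stored as-is.
    return ResolvedSymbol{name, std::nullopt, *value};
  for (const SectionSketch &section : sections)
    if (*address >= section.base && *address - section.base < section.size)
      return ResolvedSymbol{name, section.name, *address - section.base};
  return std::string("no section found for address");
}

// Usage: an address outside every section is reported as an error.
void SelfCheck() {
  const std::vector<SectionSketch> sections = {{"text", 0x100000000ULL, 0x4000}};
  assert(std::holds_alternative<std::string>(
      Resolve("stray", 0x10ULL, std::nullopt, sections)));
}
```

Returning the error from the factory, rather than only during deserialization, covers the case that deserialization cannot see: a well-formed address that no section contains.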

Remove section field
Support raw value symbols
Update test

Harbormaster completed remote builds in B217954: Diff 503166. · Mar 7 2023, 3:19 PM

Looks good. The only question is whether we can add a C++ unit test for this file that tests the new Symbol::FromJSON() and all of its error conditions.

lldb/source/Symbol/Symbol.cpp
99–100	Can we unittest this function with an ObjectFileJSONTest.cpp? It would be nice to check for the errors.

Add unit tests for JSON deserialization
Add unit tests for converting JSONSymbol to Symbol

Harbormaster completed remote builds in B218153: Diff 503432. · Mar 8 2023, 10:10 AM

clayborg added inline comments. · Mar 8 2023, 3:06 PM

lldb/include/lldb/Core/Debugger.h
594 ↗	(On Diff #503432)	I am guessing these changes in Debugger.h and Debugger.cpp are not related to this diff?
lldb/source/Core/Debugger.cpp
830–845 ↗	(On Diff #503432)	I am guessing these changes in Debugger.h and Debugger.cpp are not related to this diff?
lldb/source/Symbol/Symbol.cpp
780–781	Should this return an llvm::Expected<lldb_private::JSONSymbol> instead of a bool? Or is this fromJSON pattern used everywhere? Then we wouldn't need to fill in "path" and could return an error?
804–805	Should this return an llvm::Expected<SymbolType> instead of a bool? Or is this fromJSON pattern used everywhere? Then we wouldn't need to fill in "path" and could return an error?
lldb/unittests/Symbol/JSONSymbolTest.cpp
35	Change over to using EXPECT_THAT_EXPECTED. Repeat for all cases below.
57	Use EXPECT_THAT_EXPECTED for clarity
82
144–145	Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases.
156–157	Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases.
166–167	Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases.
190–192	Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases

Remove 9d311dd6a71b from patch
Use EXPECT_THAT_EXPECTED

lldb/include/lldb/Core/Debugger.h
594 ↗	(On Diff #503432)	Yup. This got unintentionally mixed with D135631.
lldb/source/Symbol/Symbol.cpp
780–781	No, these are specializations/overloads for the JSON library in LLVM. Same below.

Harbormaster completed remote builds in B218228: Diff 503543. · Mar 8 2023, 3:39 PM

JDevlieghere updated this revision to Diff 503575. · Mar 8 2023, 5:10 PM

Harbormaster completed remote builds in B218255: Diff 503575. · Mar 8 2023, 5:13 PM

Thanks for the changes! LGTM. Just one missed EXPECT_THAT_EXPECTED, but accepted.

lldb/unittests/Symbol/JSONSymbolTest.cpp
145–146	Missed one EXPECT_THAT_EXPECTED. Feel free to fix and submit without approval.

This revision is now accepted and ready to land. · Mar 8 2023, 5:28 PM

Closed by commit rGcf3524a5746f: [lldb] Introduce new SymbolFileJSON and ObjectFileJSON (authored by JDevlieghere). · Mar 8 2023, 8:56 PM

This revision was automatically updated to reflect the committed changes.

JDevlieghere added a commit: rGcf3524a5746f: [lldb] Introduce new SymbolFileJSON and ObjectFileJSON.

labath added inline comments. · Mar 14 2023, 9:53 AM

lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp
113	Speaking of breakpad, have you considered using SymbolFileBreakpad directly? I think it serves pretty much the same purpose, and it also supports other, more advanced functionality (e.g. line tables and unwind info). It sounds like you don't need that now, but if you ever did, you wouldn't need to reinvent that logic...

JDevlieghere marked an inline comment as done. · Mar 17 2023, 10:31 AM

JDevlieghere added inline comments.

lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp
113	I feel pretty dumb now: looking at breakpad in more detail, the `PUBLIC` records pretty much have everything I was looking for. For some reason I was under the impression that it was a binary format and I really wanted something human readable. I think I got it confused with minidumps. I wasn't planning on adding anything fancy to the JSON format so I agree that if that need arises for something more advanced we should use breakpad.
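For comparison, a Breakpad `PUBLIC` record is a single text line of the form `PUBLIC [m] address parameter_size name`, with both numbers in hexadecimal. The standalone sketch below parses one such line; `PublicRecord` and `ParsePublicRecord` are hypothetical names for this illustration and are unrelated to the code in SymbolFileBreakpad.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <sstream>
#include <stdexcept>
#include <string>

struct PublicRecord {
  uint64_t address; // Relative to the module's load address.
  uint64_t parameter_size;
  std::string name;
};

// Parses a token as a hexadecimal number; rejects trailing garbage.
static std::optional<uint64_t> ParseHex(const std::string &text) {
  try {
    size_t consumed = 0;
    const uint64_t parsed = std::stoull(text, &consumed, 16);
    if (consumed != text.size())
      return std::nullopt;
    return parsed;
  } catch (const std::exception &) {
    return std::nullopt;
  }
}

std::optional<PublicRecord> ParsePublicRecord(const std::string &line) {
  std::istringstream in(line);
  std::string keyword, token;
  if (!(in >> keyword) || keyword != "PUBLIC" || !(in >> token))
    return std::nullopt;
  if (token == "m" && !(in >> token)) // Skip the optional "m" marker.
    return std::nullopt;
  const std::optional<uint64_t> address = ParseHex(token);
  if (!address || !(in >> token))
    return std::nullopt;
  const std::optional<uint64_t> parameter_size = ParseHex(token);
  if (!parameter_size)
    return std::nullopt;
  std::string name;
  std::getline(in >> std::ws, name); // Rest of the line; names may contain spaces.
  if (name.empty())
    return std::nullopt;
  return PublicRecord{*address, *parameter_size, name};
}
```

Each record carries an address and a name, which is the same information as one entry in the JSON `symbols` array.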

Herald added a subscriber: jplehr. · Mar 17 2023, 10:31 AM

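Revision Contents

Path                                                            Size
lldb/include/lldb/Symbol/Symbol.h                               28 lines
lldb/source/Plugins/ObjectFile/CMakeLists.txt                   1 line
lldb/source/Plugins/ObjectFile/JSON/CMakeLists.txt              12 lines
lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.h            119 lines
lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp          176 lines
lldb/source/Plugins/SymbolFile/CMakeLists.txt                   1 line
lldb/source/Plugins/SymbolFile/JSON/CMakeLists.txt              7 lines
lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.h            110 lines
lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp          105 lines
lldb/source/Symbol/Symbol.cpp                                   130 lines
lldb/test/API/macosx/symbols/Makefile                           8 lines
lldb/test/API/macosx/symbols/TestSymbolFileJSON.py              103 lines
lldb/test/API/macosx/symbols/main.c                             2 lines
lldb/unittests/Symbol/CMakeLists.txt                            1 line
lldb/unittests/Symbol/JSONSymbolTest.cpp                        192 lines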

Diff 503611

lldb/include/lldb/Symbol/Symbol.h

//===-- Symbol.h ------------------------------------------------*- C++ -*-===//		//===-- Symbol.h ------------------------------------------------*- C++ -*-===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLDB_SYMBOL_SYMBOL_H		#ifndef LLDB_SYMBOL_SYMBOL_H
#define LLDB_SYMBOL_SYMBOL_H		#define LLDB_SYMBOL_SYMBOL_H

#include "lldb/Core/AddressRange.h"		#include "lldb/Core/AddressRange.h"
#include "lldb/Core/Mangled.h"		#include "lldb/Core/Mangled.h"
		#include "lldb/Core/Section.h"
#include "lldb/Symbol/SymbolContextScope.h"		#include "lldb/Symbol/SymbolContextScope.h"
#include "lldb/Utility/UserID.h"		#include "lldb/Utility/UserID.h"
#include "lldb/lldb-private.h"		#include "lldb/lldb-private.h"
		#include "llvm/Support/JSON.h"

namespace lldb_private {		namespace lldb_private {

		struct JSONSymbol {
		std::optional<uint64_t> address;
		std::optional<uint64_t> value;
		std::optional<uint64_t> size;
		std::optional<uint64_t> id;
		std::optional<lldb::SymbolType> type;
		std::string name;
		};

class Symbol : public SymbolContextScope {		class Symbol : public SymbolContextScope {
public:		public:
// ObjectFile readers can classify their symbol table entries and searches		// ObjectFile readers can classify their symbol table entries and searches
// can be made on specific types where the symbol values will have		// can be made on specific types where the symbol values will have
// drastically different meanings and sorting requirements.		// drastically different meanings and sorting requirements.
Symbol();		Symbol();

Symbol(uint32_t symID, llvm::StringRef name, lldb::SymbolType type,		Symbol(uint32_t symID, llvm::StringRef name, lldb::SymbolType type,
bool external, bool is_debug, bool is_trampoline, bool is_artificial,		bool external, bool is_debug, bool is_trampoline, bool is_artificial,
const lldb::SectionSP &section_sp, lldb::addr_t value,		const lldb::SectionSP &section_sp, lldb::addr_t value,
lldb::addr_t size, bool size_is_valid,		lldb::addr_t size, bool size_is_valid,
bool contains_linker_annotations, uint32_t flags);		bool contains_linker_annotations, uint32_t flags);

Symbol(uint32_t symID, const Mangled &mangled, lldb::SymbolType type,		Symbol(uint32_t symID, const Mangled &mangled, lldb::SymbolType type,
bool external, bool is_debug, bool is_trampoline, bool is_artificial,		bool external, bool is_debug, bool is_trampoline, bool is_artificial,
const AddressRange &range, bool size_is_valid,		const AddressRange &range, bool size_is_valid,
bool contains_linker_annotations, uint32_t flags);		bool contains_linker_annotations, uint32_t flags);

Symbol(const Symbol &rhs);		Symbol(const Symbol &rhs);

const Symbol &operator=(const Symbol &rhs);		const Symbol &operator=(const Symbol &rhs);

		static llvm::Expected<Symbol> FromJSON(const JSONSymbol &symbol,
		SectionList *section_list);

void Clear();		void Clear();

bool Compare(ConstString name, lldb::SymbolType type) const;		bool Compare(ConstString name, lldb::SymbolType type) const;

void Dump(Stream *s, Target *target, uint32_t index,		void Dump(Stream *s, Target *target, uint32_t index,
Mangled::NamePreference name_preference =		Mangled::NamePreference name_preference =
Mangled::ePreferDemangled) const;		Mangled::ePreferDemangled) const;

... (132 lines hidden) ...		public:
void SetExternal(bool b) { m_is_external = b; }		void SetExternal(bool b) { m_is_external = b; }

bool IsTrampoline() const;		bool IsTrampoline() const;

bool IsIndirect() const;		bool IsIndirect() const;

bool IsWeak() const { return m_is_weak; }		bool IsWeak() const { return m_is_weak; }

void SetIsWeak (bool b) { m_is_weak = b; }		void SetIsWeak(bool b) { m_is_weak = b; }

bool GetByteSizeIsValid() const { return m_size_is_valid; }		bool GetByteSizeIsValid() const { return m_size_is_valid; }

lldb::addr_t GetByteSize() const;		lldb::addr_t GetByteSize() const;

void SetByteSize(lldb::addr_t size) {		void SetByteSize(lldb::addr_t size) {
m_size_is_valid = size > 0;		m_size_is_valid = size > 0;
m_addr_range.SetByteSize(size);		m_addr_range.SetByteSize(size);
... (128 lines hidden) ...		AddressRange m_addr_range; // Contains the value, or the section offset
// address when the value is an address in a		// address when the value is an address in a
// section, and the size (if any)		// section, and the size (if any)
uint32_t m_flags = 0; // A copy of the flags from the original symbol table,		uint32_t m_flags = 0; // A copy of the flags from the original symbol table,
// the ObjectFile plug-in can interpret these		// the ObjectFile plug-in can interpret these
};		};

} // namespace lldb_private		} // namespace lldb_private

		namespace llvm {
		namespace json {

		bool fromJSON(const llvm::json::Value &value, lldb_private::JSONSymbol &symbol,
		llvm::json::Path path);

		bool fromJSON(const llvm::json::Value &value, lldb::SymbolType &type,
		llvm::json::Path path);

		} // namespace json
		} // namespace llvm

#endif // LLDB_SYMBOL_SYMBOL_H		#endif // LLDB_SYMBOL_SYMBOL_H

lldb/source/Plugins/ObjectFile/CMakeLists.txt

	add_subdirectory(Breakpad)			add_subdirectory(Breakpad)
	add_subdirectory(ELF)			add_subdirectory(ELF)
	add_subdirectory(JIT)			add_subdirectory(JIT)
				add_subdirectory(JSON)
	add_subdirectory(Mach-O)			add_subdirectory(Mach-O)
	add_subdirectory(Minidump)			add_subdirectory(Minidump)
	add_subdirectory(PDB)			add_subdirectory(PDB)
	add_subdirectory(PECOFF)			add_subdirectory(PECOFF)
	add_subdirectory(wasm)			add_subdirectory(wasm)

lldb/source/Plugins/ObjectFile/JSON/CMakeLists.txt

This file was added.

				add_lldb_library(lldbPluginObjectFileJSON PLUGIN
				ObjectFileJSON.cpp

				LINK_LIBS
				lldbCore
				lldbHost
				lldbSymbol
				lldbUtility
				LINK_COMPONENTS
				Support
				TargetParser
				)

lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.h

This file was added.

				//===-- ObjectFileJSON.h ---------------------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLDB_SOURCE_PLUGINS_OBJECTFILE_JSON_OBJECTFILEJSON_H
				#define LLDB_SOURCE_PLUGINS_OBJECTFILE_JSON_OBJECTFILEJSON_H

				#include "lldb/Symbol/ObjectFile.h"
				#include "lldb/Utility/ArchSpec.h"
				#include "llvm/Support/JSON.h"

				namespace lldb_private {

				class ObjectFileJSON : public ObjectFile {
				public:
				static void Initialize();
				static void Terminate();

				static llvm::StringRef GetPluginNameStatic() { return "JSON"; }

				static const char *GetPluginDescriptionStatic() {
				return "JSON object file reader.";
				}

				static ObjectFile *
				CreateInstance(const lldb::ModuleSP &module_sp, lldb::DataBufferSP data_sp,
				lldb::offset_t data_offset, const FileSpec *file,
				lldb::offset_t file_offset, lldb::offset_t length);

				static ObjectFile *CreateMemoryInstance(const lldb::ModuleSP &module_sp,
				lldb::WritableDataBufferSP data_sp,
				const lldb::ProcessSP &process_sp,
				lldb::addr_t header_addr);

				static size_t GetModuleSpecifications(const FileSpec &file,
				lldb::DataBufferSP &data_sp,
				lldb::offset_t data_offset,
				lldb::offset_t file_offset,
				lldb::offset_t length,
				ModuleSpecList &specs);

				llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }

				// LLVM RTTI support
				static char ID;
				bool isA(const void *ClassID) const override {
				return ClassID == &ID || ObjectFile::isA(ClassID);
				}
				static bool classof(const ObjectFile *obj) { return obj->isA(&ID); }

				bool ParseHeader() override;

				lldb::ByteOrder GetByteOrder() const override {
				return m_arch.GetByteOrder();
				}

				bool IsExecutable() const override { return false; }

				uint32_t GetAddressByteSize() const override {
				return m_arch.GetAddressByteSize();
				}

				AddressClass GetAddressClass(lldb::addr_t file_addr) override {
				return AddressClass::eInvalid;
				}

				void ParseSymtab(lldb_private::Symtab &symtab) override;

				bool IsStripped() override { return false; }

				void CreateSections(SectionList &unified_section_list) override;

				void Dump(Stream *s) override {}

				ArchSpec GetArchitecture() override { return m_arch; }

				UUID GetUUID() override { return m_uuid; }

				uint32_t GetDependentModules(FileSpecList &files) override { return 0; }

				Type CalculateType() override { return eTypeDebugInfo; }

				Strata CalculateStrata() override { return eStrataUser; }

				static bool MagicBytesMatch(lldb::DataBufferSP data_sp, lldb::addr_t offset,
				lldb::addr_t length);

				struct Header {
				std::string triple;
				std::string uuid;
				};

				clayborg: It would be great to be able to also define sections somewhere. I can think of this being really handy for core file serialization, where we generate a core file and then store extra metadata that describes the object file in this JSON format. It would be great to have sections for this, and we really need sections for ObjectFileJSON to be able to represent something that actually looks like a real object file. See Section.h for what we expect a section to have, but we can just store uint64_t values instead of lldb_private::Address values for the start address of the section's range.
				struct Body {
				std::vector<JSONSymbol> symbols;
				};

				private:
				clayborg: A few things to think about:
				- Some symbols have a byte size (like in ELF, or Mach-O symbols in the debug map that have pairs).
				- Some symbols have absolute values that are not actually addresses; there is currently no way to represent this here.
				- Some symbols have symbol types. We could allow symbols to specify a lldb::SymbolType by name?
				- If we have symbol types other than just symbols with addresses, some might refer to a sibling symbol by ID or index. It might be good to allow symbols to have integer IDs.
				One suggestion would be to define Symbol as:
				struct Symbol {
				  uint64_t value;
				  std::optional<uint64_t> size;
				  std::optional<uint64_t> id;
				  std::optional<SymbolType> type;
				  std::string name;
				  bool value_is_address; // Set to true if "value" is an address, false if "value" is just an integer with a different meaning
				};
				ArchSpec m_arch;
				UUID m_uuid;
				std::vector<JSONSymbol> m_symbols;

				ObjectFileJSON(const lldb::ModuleSP &module_sp, lldb::DataBufferSP &data_sp,
				lldb::offset_t data_offset, const FileSpec *file,
				lldb::offset_t offset, lldb::offset_t length, ArchSpec arch,
				UUID uuid, std::vector<JSONSymbol> symbols);
				};

				bool fromJSON(const llvm::json::Value &value, ObjectFileJSON::Header &header,
				llvm::json::Path path);

				bool fromJSON(const llvm::json::Value &value, ObjectFileJSON::Body &body,
				llvm::json::Path path);

				} // namespace lldb_private
				#endif // LLDB_SOURCE_PLUGINS_OBJECTFILE_JSON_OBJECTFILEJSON_H
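The review suggestion that symbols could specify their type by name can be illustrated with a small standalone sketch. This is not the patch's code: `SymbolKind`, `ParseSymbolKind`, and the spellings "code", "data", and "absolute" are hypothetical stand-ins for lldb::SymbolType and whatever names the format ends up accepting.

```cpp
#include <optional>
#include <string_view>

// Hypothetical, reduced stand-in for lldb::SymbolType.
enum class SymbolKind { Code, Data, Absolute };

// Map a type name from the JSON file to an enumerator. Unknown names yield
// std::nullopt so the caller can report an error or fall back to a default.
std::optional<SymbolKind> ParseSymbolKind(std::string_view name) {
  if (name == "code")
    return SymbolKind::Code;
  if (name == "data")
    return SymbolKind::Data;
  if (name == "absolute")
    return SymbolKind::Absolute;
  return std::nullopt;
}
```

Returning an optional keeps the "type is missing or unrecognized" case explicit, which fits a format where the type field is itself optional.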

lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp

This file was added.

//===-- ObjectFileJSON.cpp ------------------------------------------------===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

#include "Plugins/ObjectFile/JSON/ObjectFileJSON.h"

#include "lldb/Core/Module.h"

#include "lldb/Core/ModuleSpec.h"

#include "lldb/Core/PluginManager.h"

#include "lldb/Core/Section.h"

#include "lldb/Symbol/Symbol.h"

#include "lldb/Utility/LLDBLog.h"

#include "lldb/Utility/Log.h"

#include "llvm/ADT/DenseSet.h"

#include <optional>

using namespace llvm;

using namespace lldb;

using namespace lldb_private;

LLDB_PLUGIN_DEFINE(ObjectFileJSON)

char ObjectFileJSON::ID;

void ObjectFileJSON::Initialize() {

PluginManager::RegisterPlugin(GetPluginNameStatic(),

GetPluginDescriptionStatic(), CreateInstance,

CreateMemoryInstance, GetModuleSpecifications);

}

void ObjectFileJSON::Terminate() {

PluginManager::UnregisterPlugin(CreateInstance);

}

ObjectFile *

mib:

offset_t file_offset, offset_t length) {
- if (!data_sp) {
+ if (!data_sp || data_sp->GetByteSize() < length) {
data_sp = MapFileData(*file, length, file_offset);

JDevlieghere: See my comment below for why this is split up.

ObjectFileJSON::CreateInstance(const ModuleSP &module_sp, DataBufferSP data_sp,

offset_t data_offset, const FileSpec *file,

offset_t file_offset, offset_t length) {

if (!data_sp) {

data_sp = MapFileData(*file, length, file_offset);

if (!data_sp)

return nullptr;

data_offset = 0;

}

if (!MagicBytesMatch(data_sp, 0, data_sp->GetByteSize()))

return nullptr;

if (data_sp->GetByteSize() < length) {

data_sp = MapFileData(*file, length, file_offset);

mib: Why do we do this twice?

JDevlieghere: When we create an object file instance, we usually read the first 512 bytes to read the magic. If we haven't read anything at all, the code above reads in the file. If we've only read the first 512 bytes, we read the rest of the file here.

if (!data_sp)

return nullptr;

data_offset = 0;

}

auto text =

llvm::StringRef(reinterpret_cast<const char *>(data_sp->GetBytes()));

Expected<json::Value> json = json::parse(text);

if (!json) {

llvm::consumeError(json.takeError());

return nullptr;

}

json::Path::Root root;

Header header;

if (!fromJSON(*json, header, root))

return nullptr;

ArchSpec arch(header.triple);

UUID uuid;

uuid.SetFromStringRef(header.uuid);

Body body;

fromJSON(*json, body, root);

return new ObjectFileJSON(module_sp, data_sp, data_offset, file, file_offset,

length, std::move(arch), std::move(uuid),

std::move(body.symbols));

}
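The two-step read discussed in the review thread (read everything if nothing was read yet, check the magic, then read the rest if only a prefix was available) can be sketched without any LLDB types. Everything here is an assumption made for illustration: `FakeFile` and `LoadIfJSON` are hypothetical names, and a `std::string` stands in for the mapped data buffer.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical stand-in for a file on disk.
struct FakeFile {
  std::string contents;
  // Return up to `length` bytes from the start of the file.
  std::string Read(std::size_t length) const {
    return contents.substr(0, std::min(length, contents.size()));
  }
};

// The only "magic" for this format is a leading '{'.
bool MagicBytesMatch(const std::string &data) {
  return !data.empty() && data.front() == '{';
}

// Return the full contents if the file looks like JSON, std::nullopt
// otherwise. `prefix` is empty (nothing read yet) or a short header.
std::optional<std::string> LoadIfJSON(const FakeFile &file,
                                      std::string prefix) {
  const std::size_t length = file.contents.size();
  // Case 1: nothing has been read yet, so read the whole file.
  if (prefix.empty())
    prefix = file.Read(length);
  if (!MagicBytesMatch(prefix))
    return std::nullopt;
  // Case 2: only a header was read, so read the rest before parsing.
  if (prefix.size() < length)
    prefix = file.Read(length);
  return prefix;
}
```

Checking the magic between the two reads means a non-JSON file is rejected after looking at the header alone, without loading the rest of it.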

ObjectFile *ObjectFileJSON::CreateMemoryInstance(const ModuleSP &module_sp,

WritableDataBufferSP data_sp,

const ProcessSP &process_sp,

addr_t header_addr) {

return nullptr;

}

size_t ObjectFileJSON::GetModuleSpecifications(

const FileSpec &file, DataBufferSP &data_sp, offset_t data_offset,

offset_t file_offset, offset_t length, ModuleSpecList &specs) {

if (!MagicBytesMatch(data_sp, data_offset, data_sp->GetByteSize()))

return 0;

auto text =

llvm::StringRef(reinterpret_cast<const char *>(data_sp->GetBytes()));

Expected<json::Value> json = json::parse(text);

if (!json) {

llvm::consumeError(json.takeError());

return 0;

}

json::Path::Root root;

Header header;

if (!fromJSON(*json, header, root))

return 0;

mib: I don't see the point of re-parsing the file here; we could just save the parsed object as a class member.

JDevlieghere: That doesn't work because `GetModuleSpecifications` is a static method.

ArchSpec arch(header.triple);

UUID uuid;

uuid.SetFromStringRef(header.uuid);

ModuleSpec spec(file, std::move(arch));

spec.GetUUID() = std::move(uuid);

specs.Append(spec);

return 1;

}

ObjectFileJSON::ObjectFileJSON(const ModuleSP &module_sp, DataBufferSP &data_sp,

offset_t data_offset, const FileSpec *file,

offset_t offset, offset_t length, ArchSpec arch,

UUID uuid, std::vector<JSONSymbol> symbols)

: ObjectFile(module_sp, file, offset, length, data_sp, data_offset),

m_arch(std::move(arch)), m_uuid(std::move(uuid)),

m_symbols(std::move(symbols)) {}

bool ObjectFileJSON::ParseHeader() {

// We already parsed the header during initialization.

return true;

}

clayborg: Don't we want to create a Symtab from the ObjectFileJSON::Symbol vector here? Symbols are not considered debug info. Debug info for functions is supposed to be richer than just address and name.

void ObjectFileJSON::ParseSymtab(Symtab &symtab) {

Log *log = GetLog(LLDBLog::Symbols);

SectionList *section_list = GetModule()->GetSectionList();

for (JSONSymbol json_symbol : m_symbols) {

llvm::Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, section_list);

if (!symbol) {

LLDB_LOG_ERROR(log, symbol.takeError(), "invalid symbol");

continue;

}

symtab.AddSymbol(*symbol);

}

mib: Cool 😁

symtab.Finalize();

}

void ObjectFileJSON::CreateSections(SectionList &unified_section_list) {}

bool ObjectFileJSON::MagicBytesMatch(DataBufferSP data_sp,

lldb::addr_t data_offset,

lldb::addr_t data_length) {

DataExtractor data;

data.SetData(data_sp, data_offset, data_length);

lldb::offset_t offset = 0;

uint32_t magic = data.GetU8(&offset);

return magic == '{';

}

namespace lldb_private {

bool fromJSON(const json::Value &value, ObjectFileJSON::Header &header,

json::Path path) {

json::ObjectMapper o(value, path);

return o && o.map("triple", header.triple) && o.map("uuid", header.uuid);

}

bool fromJSON(const json::Value &value, ObjectFileJSON::Body &body,

json::Path path) {

json::ObjectMapper o(value, path);

return o && o.map("symbols", body.symbols);

}

} // namespace lldb_private

lldb/source/Plugins/SymbolFile/CMakeLists.txt

	add_subdirectory(Breakpad)			add_subdirectory(Breakpad)
	add_subdirectory(DWARF)			add_subdirectory(DWARF)
				add_subdirectory(JSON)
	add_subdirectory(NativePDB)			add_subdirectory(NativePDB)
	add_subdirectory(PDB)			add_subdirectory(PDB)
	add_subdirectory(Symtab)			add_subdirectory(Symtab)

lldb/source/Plugins/SymbolFile/JSON/CMakeLists.txt

This file was added.

				add_lldb_library(lldbPluginSymbolFileJSON PLUGIN
				SymbolFileJSON.cpp

				LINK_LIBS
				lldbCore
				lldbSymbol
				)

lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.h

This file was added.

				//===-- SymbolFileJSON.h ---------------------------------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLDB_SOURCE_PLUGINS_SYMBOLFILE_JSON_SYMBOLFILETEXT_H
				#define LLDB_SOURCE_PLUGINS_SYMBOLFILE_JSON_SYMBOLFILETEXT_H

				#include <map>
				#include <optional>
				#include <vector>

				#include "lldb/Symbol/CompileUnit.h"
				#include "lldb/Symbol/SymbolFile.h"

				namespace lldb_private {

				class SymbolFileJSON : public lldb_private::SymbolFileCommon {
				/// LLVM RTTI support.
				static char ID;

				public:
				/// LLVM RTTI support.
				/// \{
				bool isA(const void *ClassID) const override {
				return ClassID == &ID || SymbolFileCommon::isA(ClassID);
				}
				static bool classof(const SymbolFile *obj) { return obj->isA(&ID); }
				/// \}

				SymbolFileJSON(lldb::ObjectFileSP objfile_sp);

				static void Initialize();

				static void Terminate();

				static llvm::StringRef GetPluginNameStatic() { return "text"; }

				static llvm::StringRef GetPluginDescriptionStatic();

				static lldb_private::SymbolFile *
				CreateInstance(lldb::ObjectFileSP objfile_sp);

				llvm::StringRef GetPluginName() override { return GetPluginNameStatic(); }

				uint32_t CalculateAbilities() override;

				lldb::LanguageType ParseLanguage(CompileUnit &comp_unit) override {
				return lldb::eLanguageTypeUnknown;
				}

				size_t ParseFunctions(CompileUnit &comp_unit) override { return 0; }

				bool ParseLineTable(CompileUnit &comp_unit) override { return false; }

				bool ParseDebugMacros(CompileUnit &comp_unit) override { return false; }

				bool ParseSupportFiles(CompileUnit &comp_unit,
				FileSpecList &support_files) override {
				return false;
				}

				size_t ParseTypes(CompileUnit &cu) override { return 0; }

				bool ParseImportedModules(
				const SymbolContext &sc,
				std::vector<lldb_private::SourceModule> &imported_modules) override {
				return false;
				}

				size_t ParseBlocksRecursive(Function &func) override { return 0; }

				size_t ParseVariablesForContext(const SymbolContext &sc) override {
				return 0;
				}

				uint32_t CalculateNumCompileUnits() override { return 0; }

				lldb::CompUnitSP ParseCompileUnitAtIndex(uint32_t index) override;

				Type *ResolveTypeUID(lldb::user_id_t type_uid) override { return nullptr; }
				std::optional<ArrayInfo> GetDynamicArrayInfoForUID(
				lldb::user_id_t type_uid,
				const lldb_private::ExecutionContext *exe_ctx) override {
				return std::nullopt;
				}

				bool CompleteType(CompilerType &compiler_type) override { return false; }

				uint32_t ResolveSymbolContext(const lldb_private::Address &so_addr,
				lldb::SymbolContextItem resolve_scope,
				lldb_private::SymbolContext &sc) override;

				void GetTypes(lldb_private::SymbolContextScope *sc_scope,
				lldb::TypeClass type_mask,
				lldb_private::TypeList &type_list) override;

				void AddSymbols(Symtab &symtab) override;

				private:
				lldb::addr_t GetBaseFileAddress();

				std::vector<std::pair<uint64_t, std::string>> m_symbols;
				};
				} // namespace lldb_private

				#endif // LLDB_SOURCE_PLUGINS_SYMBOLFILE_JSON_SYMBOLFILETEXT_H

lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp

This file was added.

				//===-- SymbolFileJSON.cpp ----------------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "SymbolFileJSON.h"

				#include "Plugins/ObjectFile/JSON/ObjectFileJSON.h"
				#include "lldb/Core/Module.h"
				#include "lldb/Core/PluginManager.h"
				#include "lldb/Symbol/CompileUnit.h"
				#include "lldb/Symbol/Function.h"
				#include "lldb/Symbol/ObjectFile.h"
				#include "lldb/Symbol/Symbol.h"
				#include "lldb/Symbol/SymbolContext.h"
				#include "lldb/Symbol/Symtab.h"
				#include "lldb/Symbol/TypeList.h"
				#include "lldb/Utility/LLDBLog.h"
				#include "lldb/Utility/Log.h"
				#include "lldb/Utility/RegularExpression.h"
				#include "lldb/Utility/Timer.h"
				#include "llvm/Support/MemoryBuffer.h"

				#include <memory>
				#include <optional>

				using namespace llvm;
				using namespace lldb;
				using namespace lldb_private;

				LLDB_PLUGIN_DEFINE(SymbolFileJSON)

				char SymbolFileJSON::ID;

				SymbolFileJSON::SymbolFileJSON(lldb::ObjectFileSP objfile_sp)
				: SymbolFileCommon(std::move(objfile_sp)) {}

				void SymbolFileJSON::Initialize() {
				PluginManager::RegisterPlugin(GetPluginNameStatic(),
				GetPluginDescriptionStatic(), CreateInstance);
				}

				void SymbolFileJSON::Terminate() {
				PluginManager::UnregisterPlugin(CreateInstance);
				}

				llvm::StringRef SymbolFileJSON::GetPluginDescriptionStatic() {
				return "Reads debug symbols from a textual symbol table.";
				}

				SymbolFile *SymbolFileJSON::CreateInstance(ObjectFileSP objfile_sp) {
				return new SymbolFileJSON(std::move(objfile_sp));
				}

				uint32_t SymbolFileJSON::CalculateAbilities() {
				if (!m_objfile_sp || !llvm::isa<ObjectFileJSON>(*m_objfile_sp))
				return 0;

				return GlobalVariables | Functions;
				}

				uint32_t SymbolFileJSON::ResolveSymbolContext(const Address &so_addr,
				SymbolContextItem resolve_scope,
				SymbolContext &sc) {
				std::lock_guard<std::recursive_mutex> guard(GetModuleMutex());
				if (m_objfile_sp->GetSymtab() == nullptr)
				return 0;

				uint32_t resolved_flags = 0;
				if (resolve_scope & eSymbolContextSymbol) {
				sc.symbol = m_objfile_sp->GetSymtab()->FindSymbolContainingFileAddress(
				so_addr.GetFileAddress());
				if (sc.symbol)
				resolved_flags |= eSymbolContextSymbol;
				}
				return resolved_flags;
				}

				CompUnitSP SymbolFileJSON::ParseCompileUnitAtIndex(uint32_t idx) { return {}; }

				void SymbolFileJSON::GetTypes(SymbolContextScope *sc_scope, TypeClass type_mask,
				lldb_private::TypeList &type_list) {}

				void SymbolFileJSON::AddSymbols(Symtab &symtab) {
				if (!m_objfile_sp)
				clayborg: If we have no debug info in SymbolFileJSON, then there is really no need to add the symbols here. We can avoid this whole SymbolFileJSON class and have ObjectFileJSON parse the JSON symbols and add them to the Symtab in: `void ObjectFileJSON::ParseSymtab(Symtab &symtab);` So if we are not going to add JSON debug info to the ObjectFileJSON schema, then I would vote to remove this class and just do everything from ObjectFileJSON.
				return;

				Symtab *json_symtab = m_objfile_sp->GetSymtab();
				if (!json_symtab)
				return;

				if (&symtab == json_symtab)
				return;

				// Merge the two symbol tables.
				const size_t num_new_symbols = json_symtab->GetNumSymbols();
				for (size_t i = 0; i < num_new_symbols; ++i) {
				Symbol *s = json_symtab->SymbolAtIndex(i);
				symtab.AddSymbol(*s);
				}
				symtab.Finalize();
				}
				mib: Should we increment the `symID` here?
				JDevlieghere: The IDs are arbitrary, and if we start at zero, we'll have conflicts with the ones already in the symbol table (e.g. the lldb_unnamed_symbols for stripped binaries). We could check the size of the symtab and continue counting from there. Or we can use 0, like we did here, to indicate that these values are "special". I went with the latter approach mostly because that's what SymbolFileBreakpad does too.
				labath (not done): Speaking of breakpad, have you considered using SymbolFileBreakpad directly? I think it serves pretty much the same purpose, and it also supports other, more advanced functionality (e.g. line tables and unwind info). It sounds like you don't need that now, but if you ever did, you wouldn't need to reinvent that logic...
				JDevlieghere: I feel pretty dumb now: looking at breakpad in more detail, the `PUBLIC` records pretty much have everything I was looking for. For some reason I was under the impression that it was a binary format, and I really wanted something human readable. I think I got it confused with minidumps. I wasn't planning on adding anything fancy to the JSON format, so I agree that if the need arises for something more advanced, we should use breakpad.
				clayborg: Regardless of where the code that converts ObjectFileJSON::Symbol to lldb_private::Symbol goes, I would vote that we include enough information in ObjectFileJSON::Symbol so that we don't have to make up symbol IDs or set all symbol IDs to zero. LLDB relies on having unique symbol IDs in each lldb_private::Symbol instance.
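One of the options raised in this thread, continuing the ID count from what is already in the symbol table, might look roughly like the sketch below. It is a standalone illustration, not LLDB code: `SimpleSymbol` and `MergeWithUniqueIDs` are hypothetical names, and the sketch continues from the largest existing ID rather than from the table size so that it also works when existing IDs are not dense.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, reduced stand-in for a symbol table entry.
struct SimpleSymbol {
  std::uint32_t id;
  std::string name;
};

// Append `incoming` to `table`, giving every new symbol an ID that continues
// after the largest ID already present. The IDs carried by `incoming` are
// ignored. `incoming` must not alias `table`.
void MergeWithUniqueIDs(std::vector<SimpleSymbol> &table,
                        const std::vector<SimpleSymbol> &incoming) {
  std::uint32_t next_id = 0;
  for (const SimpleSymbol &s : table)
    if (s.id >= next_id)
      next_id = s.id + 1;
  for (SimpleSymbol s : incoming) {
    s.id = next_id++;
    table.push_back(std::move(s));
  }
}
```

The alternative mentioned in the thread, leaving every merged ID at 0 to mark the symbols as special, avoids this bookkeeping but gives up the uniqueness that the last comment asks for.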

lldb/source/Symbol/Symbol.cpp

#include "lldb/Symbol/Function.h"

#include "lldb/Symbol/ObjectFile.h"

#include "lldb/Symbol/SymbolVendor.h"

#include "lldb/Symbol/Symtab.h"

#include "lldb/Target/Process.h"

#include "lldb/Target/Target.h"

#include "lldb/Utility/DataEncoder.h"

#include "lldb/Utility/Stream.h"

#include "llvm/ADT/StringSwitch.h"

using namespace lldb;

using namespace lldb_private;

Symbol::Symbol()

: SymbolContextScope(), m_type_data_resolved(false), m_is_synthetic(false),

m_is_debug(false), m_is_external(false), m_size_is_sibling(false),

m_size_is_synthesized(false), m_size_is_valid(false),

m_demangled_is_synthesized(false), m_contains_linker_annotations(false),

m_is_weak(false), m_type(eSymbolTypeInvalid), m_mangled(),

m_addr_range() {}

Symbol::Symbol(uint32_t symID, llvm::StringRef name, SymbolType type,

bool external, bool is_debug, bool is_trampoline,

bool is_artificial, const lldb::SectionSP &section_sp,

addr_t offset, addr_t size, bool size_is_valid,

bool contains_linker_annotations, uint32_t flags)

: SymbolContextScope(), m_uid(symID), m_type_data_resolved(false),

m_is_synthetic(is_artificial), m_is_debug(is_debug),

m_is_external(external), m_size_is_sibling(false),

m_size_is_synthesized(false), m_size_is_valid(size_is_valid || size > 0),

m_demangled_is_synthesized(false),

m_contains_linker_annotations(contains_linker_annotations),

m_is_weak(false), m_type(type), m_mangled(name),

m_addr_range(section_sp, offset, size), m_flags(flags) {}

Symbol::Symbol(uint32_t symID, const Mangled &mangled, SymbolType type,

clayborg: We should probably make this function a static factory function that returns an llvm::Expected<Symbol>, so we can return an error if the symbol has no value or address, or if we can't resolve the address in a section:

llvm::Expected<Symbol> Symbol::CreateSymbol(const JSONSymbol &symbol, SectionList *section_list);

Then we can return errors for the above cases and also for below.

JDevlieghere: Do you prefer to handle the error here or during deserialization? Currently these things are enforced there. I don't think it makes sense to check the same invariant twice. I'm happy to move it here, though, if you think that's better.

JDevlieghere: Oh, I missed the case where the address doesn't belong to a section. Yeah, we definitely need to diagnose that here.

bool external, bool is_debug, bool is_trampoline,

bool is_artificial, const AddressRange &range,

bool size_is_valid, bool contains_linker_annotations,

uint32_t flags)

: SymbolContextScope(), m_uid(symID), m_type_data_resolved(false),

m_is_synthetic(is_artificial), m_is_debug(is_debug),

m_is_external(external), m_size_is_sibling(false),

m_size_is_synthesized(false),

m_size_is_valid(size_is_valid || range.GetByteSize() > 0),

m_demangled_is_synthesized(false),

m_contains_linker_annotations(contains_linker_annotations),

m_is_weak(false), m_type(type), m_mangled(mangled), m_addr_range(range),

m_flags(flags) {}

Symbol::Symbol(const Symbol &rhs)

: SymbolContextScope(rhs), m_uid(rhs.m_uid), m_type_data(rhs.m_type_data),

m_type_data_resolved(rhs.m_type_data_resolved),

m_is_synthetic(rhs.m_is_synthetic), m_is_debug(rhs.m_is_debug),

m_is_external(rhs.m_is_external),

m_size_is_sibling(rhs.m_size_is_sibling), m_size_is_synthesized(false),

m_size_is_valid(rhs.m_size_is_valid),

m_demangled_is_synthesized(rhs.m_demangled_is_synthesized),

m_contains_linker_annotations(rhs.m_contains_linker_annotations),

m_is_weak(rhs.m_is_weak), m_type(rhs.m_type), m_mangled(rhs.m_mangled),

m_addr_range(rhs.m_addr_range), m_flags(rhs.m_flags) {}

clayborg:

const uint64_t size = symbol.size.value_or(0);
- if (symbol.section) {
- if (SectionSP section_sp =
- section_list->FindSectionByName(ConstString(*symbol.section))) {
- const uint64_t offset =
- symbol.address ? *symbol.address - section_sp->GetFileAddress()
- : *symbol.value;
- m_addr_range = AddressRange(section_sp, offset, size);
- return;
- }
- if (symbol.address) {
- if (SectionSP section_sp =
- section_list->FindSectionContainingFileAddress(*symbol.address)) {
- const uint64_t offset = *symbol.address - section_sp->GetFileAddress();
- m_addr_range = AddressRange(section_sp, offset, size);
- return;
- }
+ if (symbol.address) {
+ if (SectionSP section_sp =
+ section_list->FindSectionContainingFileAddress(*symbol.address)) {
+ const uint64_t offset = *symbol.address - section_sp->GetFileAddress();
+ m_addr_range = AddressRange(section_sp, offset, size);
+ } else {
+ // return error
}
+ } else {
+ // Absolute symbols encode the integer value in the m_offset of the
+ // AddressRange object and the section is set to nothing.
+ m_addr_range = AddressRange(SectionSP(), *symbol.value, size);
}
Symbol::Symbol(uint32_t symID, llvm::StringRef name, SymbolType type,

This code should probably check whether we have a "symbol.address" first, then always fill out the address range correctly and expect a valid section_sp, like the code above is doing. If we have a "symbol.value", then we don't expect the symbol to have an address; in that case we fill in the AddressRange with no section and use "symbol.value" as the offset.

I am not sure we need a "JSONSymbol::section" member anymore. Since we now know whether this is an address, we expect the section lookup to succeed and can return an error if it fails, so we can probably remove "JSONSymbol::section". We could leave it in if we want to specify an offset directly.

const Symbol &Symbol::operator=(const Symbol &rhs) {

if (this != &rhs) {

SymbolContextScope::operator=(rhs);

m_uid = rhs.m_uid;

m_type_data = rhs.m_type_data;

m_type_data_resolved = rhs.m_type_data_resolved;

m_is_synthetic = rhs.m_is_synthetic;

m_is_debug = rhs.m_is_debug;

m_is_external = rhs.m_is_external;

m_size_is_sibling = rhs.m_size_is_sibling;

m_size_is_synthesized = rhs.m_size_is_sibling;

m_size_is_valid = rhs.m_size_is_valid;

m_demangled_is_synthesized = rhs.m_demangled_is_synthesized;

m_contains_linker_annotations = rhs.m_contains_linker_annotations;

m_is_weak = rhs.m_is_weak;

m_type = rhs.m_type;

m_mangled = rhs.m_mangled;

m_addr_range = rhs.m_addr_range;

m_flags = rhs.m_flags;

}

return *this;

}

llvm::Expected<Symbol> Symbol::FromJSON(const JSONSymbol &symbol,

SectionList *section_list) {

clayborg: Can we unit test this function with an ObjectFileJSONTest.cpp? It would be nice to check for the errors.

if (!section_list)

return llvm::make_error<llvm::StringError>("no section list provided",

llvm::inconvertibleErrorCode());

if (!symbol.value && !symbol.address)

return llvm::make_error<llvm::StringError>(

"symbol must contain either a value or an address",

llvm::inconvertibleErrorCode());

if (symbol.value && symbol.address)

return llvm::make_error<llvm::StringError>(

"symbol cannot contain both a value and an address",

llvm::inconvertibleErrorCode());

const uint64_t size = symbol.size.value_or(0);

const bool is_artificial = false;

const bool is_trampoline = false;

const bool is_debug = false;

const bool external = false;

const bool size_is_valid = symbol.size.has_value();

const bool contains_linker_annotations = false;

const uint32_t flags = 0;

if (symbol.address) {

if (SectionSP section_sp =

section_list->FindSectionContainingFileAddress(*symbol.address)) {

const uint64_t offset = *symbol.address - section_sp->GetFileAddress();

return Symbol(symbol.id.value_or(0), Mangled(symbol.name),

symbol.type.value_or(eSymbolTypeAny), external, is_debug,

is_trampoline, is_artificial,

AddressRange(section_sp, offset, size), size_is_valid,

contains_linker_annotations, flags);

}

return llvm::make_error<llvm::StringError>(

llvm::formatv("no section found for address: {0:x}", *symbol.address),

llvm::inconvertibleErrorCode());

}

// Absolute symbols encode the integer value in the m_offset of the

// AddressRange object and the section is set to nothing.

return Symbol(symbol.id.value_or(0), Mangled(symbol.name),

symbol.type.value_or(eSymbolTypeAny), external, is_debug,

is_trampoline, is_artificial,

AddressRange(SectionSP(), *symbol.value, size), size_is_valid,

contains_linker_annotations, flags);

}

void Symbol::Clear() {

m_uid = UINT32_MAX;

m_mangled.Clear();

m_type_data = 0;

m_type_data_resolved = false;

m_is_synthetic = false;

m_is_debug = false;

m_is_external = false;

[... 511 lines not shown ...]

bool Symbol::Decode(const DataExtractor &data, lldb::offset_t *offset_ptr,

m_type = bitfields & 0x003f;

if (!m_mangled.Decode(data, offset_ptr, strtab))

return false;

if (!data.ValidOffsetForDataOfSize(*offset_ptr, 20))

return false;

const bool is_addr = data.GetU8(offset_ptr) != 0;

const uint64_t value = data.GetU64(offset_ptr);

if (is_addr) {

- m_addr_range.GetBaseAddress().ResolveAddressUsingFileSections(

- value, section_list);

+ m_addr_range.GetBaseAddress().ResolveAddressUsingFileSections(value,

+ section_list);

} else {

m_addr_range.GetBaseAddress().Clear();

m_addr_range.GetBaseAddress().SetOffset(value);

}

m_addr_range.SetByteSize(data.GetU64(offset_ptr));

m_flags = data.GetU32(offset_ptr);

return true;

}
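The tail of the decode hunk above reads four fields in order: a one-byte "is address" flag, a 64-bit value, a 64-bit byte size, and 32-bit flags. As a rough illustration only, the same field order can be modelled in Python with the `struct` module. The function name `decode_symbol_tail` is hypothetical, and little-endian byte order is an assumption; the real `DataExtractor` uses whatever byte order it was configured with.

```python
import struct

# Assumption: little-endian, unpadded layout. Field order follows the hunk:
# u8 is_addr, u64 value, u64 byte size, u32 flags.
_TAIL = struct.Struct("<BQQI")


def decode_symbol_tail(buf, offset=0):
    """Decode the trailing symbol fields starting at `offset`.

    Returns (fields, new_offset), or None if the buffer is too short.
    This is a sketch of the field order only, not LLDB's cache format reader.
    """
    if len(buf) - offset < _TAIL.size:
        return None
    is_addr, value, size, flags = _TAIL.unpack_from(buf, offset)
    fields = {
        "is_addr": bool(is_addr),  # True: value is a file address to resolve
        "value": value,            # False: value is stored as a raw offset
        "size": size,
        "flags": flags,
    }
    return fields, offset + _TAIL.size
```

When `is_addr` is false, the hunk clears the base address and stores the value as a plain offset, which corresponds to the section-less case of an absolute symbol.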

/// The encoding format for the symbol is as follows:

///

/// uint32_t m_uid;

/// uint16_t m_type_data;

/// uint16_t bitfield_data;

[... 77 lines not shown ...]

bool Symbol::operator==(const Symbol &rhs) const {

if (m_addr_range.GetBaseAddress() != rhs.m_addr_range.GetBaseAddress())

return false;

if (m_addr_range.GetByteSize() != rhs.m_addr_range.GetByteSize())

return false;

if (m_flags != rhs.m_flags)

return false;

return true;

}

namespace llvm {

namespace json {

bool fromJSON(const llvm::json::Value &value, lldb_private::JSONSymbol &symbol,

llvm::json::Path path) {

clayborg (Done): Should this return an llvm::Expected<lldb_private::JSONSymbol> instead of a bool? Or is this fromJSON pattern used everywhere? Then we wouldn't need to fill in "path" and could return an error?

JDevlieghere (Author, Done): No, these are specializations/overloads for the JSON library in LLVM. Same below.

llvm::json::ObjectMapper o(value, path);

const bool mapped = o && o.map("value", symbol.value) &&

o.map("address", symbol.address) &&

o.map("size", symbol.size) && o.map("id", symbol.id) &&

o.map("type", symbol.type) && o.map("name", symbol.name);

if (!mapped)

return false;

if (!symbol.value && !symbol.address) {

path.report("symbol must have either a value or an address");

return false;

}

if (symbol.value && symbol.address) {

path.report("symbol cannot have both a value and an address");

return false;

}

return true;

}

bool fromJSON(const llvm::json::Value &value, lldb::SymbolType &type,

llvm::json::Path path) {

clayborg (Done): Should this return an llvm::Expected<SymbolType> instead of a bool? Or is this fromJSON pattern used everywhere? Then we wouldn't need to fill in "path" and could return an error?

if (auto str = value.getAsString()) {

type = llvm::StringSwitch<lldb::SymbolType>(*str)

.Case("absolute", eSymbolTypeAbsolute)

.Case("code", eSymbolTypeCode)

.Case("resolver", eSymbolTypeResolver)

.Case("data", eSymbolTypeData)

.Case("trampoline", eSymbolTypeTrampoline)

.Case("runtime", eSymbolTypeRuntime)

.Case("exception", eSymbolTypeException)

.Case("sourcefile", eSymbolTypeSourceFile)

.Case("headerfile", eSymbolTypeHeaderFile)

.Case("objectfile", eSymbolTypeObjectFile)

.Case("commonblock", eSymbolTypeCommonBlock)

.Case("block", eSymbolTypeBlock)

.Case("local", eSymbolTypeLocal)

.Case("param", eSymbolTypeParam)

.Case("variable", eSymbolTypeVariable)

.Case("variableType", eSymbolTypeVariableType)

.Case("lineentry", eSymbolTypeLineEntry)

.Case("lineheader", eSymbolTypeLineHeader)

.Case("scopebegin", eSymbolTypeScopeBegin)

.Case("scopeend", eSymbolTypeScopeEnd)

.Case("additional", eSymbolTypeAdditional)

.Case("compiler", eSymbolTypeCompiler)

.Case("instrumentation", eSymbolTypeInstrumentation)

.Case("undefined", eSymbolTypeUndefined)

.Case("objcclass", eSymbolTypeObjCClass)

.Case("objcmetaClass", eSymbolTypeObjCMetaClass)

.Case("objcivar", eSymbolTypeObjCIVar)

.Case("reexported", eSymbolTypeReExported)

.Default(eSymbolTypeInvalid);

if (type == eSymbolTypeInvalid) {

path.report("invalid symbol type");

return false;

}

return true;

}

path.report("expected string");

return false;

}

} // namespace json

} // namespace llvm
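The rules enforced by Symbol::FromJSON and the JSONSymbol fromJSON overload above can be summarised in a small, LLDB-free model: a symbol carries exactly one of "value" or "address"; an address must fall inside a known section and is stored as a section-relative offset; a value is stored as-is with no section. The sketch below is only an illustration of that logic in Python. The `Section` class and `resolve_symbol` function are hypothetical names, not part of LLDB's API; the error strings are copied from the diff.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Section:
    """Hypothetical stand-in for an LLDB section: a name and a file-address range."""
    name: str
    file_addr: int
    size: int

    def contains(self, addr: int) -> bool:
        # Half-open range [file_addr, file_addr + size).
        return self.file_addr <= addr < self.file_addr + self.size


def resolve_symbol(symbol: dict,
                   sections: Optional[List[Section]]) -> Tuple[Optional[str], int]:
    """Return (section name or None, offset); raise ValueError on invalid input."""
    if sections is None:
        raise ValueError("no section list provided")
    has_value = symbol.get("value") is not None
    has_address = symbol.get("address") is not None
    if not has_value and not has_address:
        raise ValueError("symbol must contain either a value or an address")
    if has_value and has_address:
        raise ValueError("symbol cannot contain both a value and an address")
    if has_address:
        addr = symbol["address"]
        for section in sections:
            if section.contains(addr):
                # Addresses are stored relative to the containing section.
                return section.name, addr - section.file_addr
        raise ValueError(f"no section found for address: {addr:#x}")
    # Absolute symbol: no section, the raw value is used as the offset.
    return None, symbol["value"]
```

With a single ".text" section covering 0x1000 to 0x2000, an address of 0x1000 resolves to offset 0 in ".text", while 0x0fff lies just below the section and is rejected, which is the boundary the SymbolInvalidAddressNotInSection unit test exercises.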

lldb/test/API/macosx/symbols/Makefile

This file was added.

C_SOURCES := main.c

all: stripped.out

stripped.out : a.out
	strip a.out -o stripped.out

include Makefile.rules

lldb/test/API/macosx/symbols/TestSymbolFileJSON.py

This file was added.

""" Testing symbol loading via JSON file. """
import json

import lldb
from lldbsuite.test.decorators import *
from lldbsuite.test.lldbtest import *
from lldbsuite.test import lldbutil


class TargetSymbolsFileJSON(TestBase):

    def setUp(self):
        TestBase.setUp(self)
        self.source = 'main.c'

    @no_debug_info_test
    def test_symbol_file_json_address(self):
        """Test that 'target symbols add' can load the symbols from a JSON file using file addresses."""

        self.build()
        stripped = self.getBuildArtifact("stripped.out")
        unstripped = self.getBuildArtifact("a.out")

        # Create a JSON symbol file from the unstripped target.
        unstripped_target = self.dbg.CreateTarget(unstripped)
        self.assertTrue(unstripped_target, VALID_TARGET)

        unstripped_module = unstripped_target.GetModuleAtIndex(0)
        main_symbol = unstripped_module.FindSymbol("main")
        foo_symbol = unstripped_module.FindSymbol("foo")

        data = {
            "triple": unstripped_module.GetTriple(),
            "uuid": unstripped_module.GetUUIDString(),
            "symbols": list()
        }
        data['symbols'].append({
            "name": "main",
            "type": "code",
            "size": main_symbol.GetSize(),
            "address": main_symbol.addr.GetFileAddress(),
        })

clayborg (Done): We probably should test that the "id" and "section_id" fields work correctly. We should also test that we are able to make symbols with only an address and name, then add tests for symbols that each add a new optional value that can be specified, to ensure we can correctly make symbols.

JDevlieghere (Author, Done): Yep, I saw that working manually, but I can extend the test case to include that.

        data['symbols'].append({
            "name": "foo",
            "type": "code",
            "size": foo_symbol.GetSize(),
            "address": foo_symbol.addr.GetFileAddress(),
        })
        data['symbols'].append({
            "name": "bar",
            "type": "code",
            "size": 0,
            "value": 0xFF,
        })

        json_object = json.dumps(data, indent=4)
        json_symbol_file = self.getBuildArtifact("a.json")
        with open(json_symbol_file, "w") as outfile:
            outfile.write(json_object)

        # Create a stripped target.
        stripped_target = self.dbg.CreateTarget(stripped)
        self.assertTrue(stripped_target, VALID_TARGET)

        # Ensure there's no symbol for main, foo, or bar.
        stripped_module = stripped_target.GetModuleAtIndex(0)
        self.assertFalse(stripped_module.FindSymbol("main").IsValid())
        self.assertFalse(stripped_module.FindSymbol("foo").IsValid())
        self.assertFalse(stripped_module.FindSymbol("bar").IsValid())

        main_bp = stripped_target.BreakpointCreateByName(
            "main", "stripped.out")
        self.assertTrue(main_bp, VALID_BREAKPOINT)
        self.assertEqual(main_bp.num_locations, 0)

        # Load the JSON symbol file.
        self.runCmd("target symbols add -s %s %s" %
                    (stripped, self.getBuildArtifact("a.json")))

        stripped_main_symbol = stripped_module.FindSymbol("main")
        stripped_foo_symbol = stripped_module.FindSymbol("foo")
        stripped_bar_symbol = stripped_module.FindSymbol("bar")

        # Ensure main, foo, and bar are available now.
        self.assertTrue(stripped_main_symbol.IsValid())
        self.assertTrue(stripped_foo_symbol.IsValid())
        self.assertTrue(stripped_bar_symbol.IsValid())
        self.assertEqual(main_bp.num_locations, 1)

        # Ensure the file address matches between the stripped and unstripped target.
        self.assertEqual(stripped_main_symbol.addr.GetFileAddress(),
                         main_symbol.addr.GetFileAddress())
        self.assertEqual(stripped_foo_symbol.addr.GetFileAddress(),
                         foo_symbol.addr.GetFileAddress())

        # Ensure the size matches.
        self.assertEqual(stripped_main_symbol.GetSize(), main_symbol.GetSize())
        self.assertEqual(stripped_foo_symbol.GetSize(), foo_symbol.GetSize())

        # Ensure the type matches.
        self.assertEqual(stripped_main_symbol.GetType(), main_symbol.GetType())
        self.assertEqual(stripped_foo_symbol.GetType(), foo_symbol.GetType())

        # Ensure the bar symbol has a fixed value of 0xFF.
        self.assertEqual(stripped_bar_symbol.GetValue(), 0xFF)

clayborg (Done): We should never have "section" and "value" as a combination. See the discussion above about possibly removing the "section" field, as we might not need it if we can assume: if "address" is valid, we must be able to look it up by address in the section list; if "value" is valid, it just gets encoded as in the suggested code edit.
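For readers who want to see the file format without an LLDB build, the JSON document assembled by the test above (a triple, a UUID, and a list of symbols that each carry either an "address" or a "value") can be produced with the standard library alone. This is a minimal sketch: the helper name `make_symbol_file` is hypothetical, and the triple, UUID, addresses, and sizes in the usage note are made-up placeholders rather than output from a real binary.

```python
import json


def make_symbol_file(triple, uuid, symbols):
    """Serialize a JSON symbol file body; each symbol needs exactly one of
    'address' (a file address) or 'value' (an absolute value)."""
    for sym in symbols:
        if ("address" in sym) == ("value" in sym):
            raise ValueError(
                "symbol %r must have exactly one of 'address' or 'value'"
                % sym.get("name"))
    return json.dumps(
        {"triple": triple, "uuid": uuid, "symbols": list(symbols)}, indent=4)


# Placeholder inputs for illustration only.
example = make_symbol_file(
    "arm64-apple-macosx13.0.0",
    "00000000-0000-0000-0000-000000000000",
    [
        {"name": "main", "type": "code", "size": 32, "address": 0x1000},
        {"name": "bar", "type": "code", "size": 0, "value": 0xFF},
    ],
)
```

The resulting text could then be written to a file and passed to `target symbols add`, as the test does with a.json.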

lldb/test/API/macosx/symbols/main.c

This file was added.

				int foo() { return 1; }
				int main() { return foo(); }

lldb/unittests/Symbol/CMakeLists.txt

  add_lldb_unittest(SymbolTests
+   JSONSymbolTest.cpp
    LocateSymbolFileTest.cpp
    MangledTest.cpp
    PostfixExpressionTest.cpp
    SymbolTest.cpp
    SymtabTest.cpp
    TestTypeSystem.cpp
    TestTypeSystemClang.cpp
    TestClangASTImporter.cpp
[... 21 lines not shown ...]

lldb/unittests/Symbol/JSONSymbolTest.cpp

This file was added.

//===-- JSONSymbolTest.cpp ------------------------------------------------===//

// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.

// See https://llvm.org/LICENSE.txt for license information.

// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception

//===----------------------------------------------------------------------===//

#include "lldb/Core/Section.h"

#include "lldb/Symbol/Symbol.h"

#include "llvm/Testing/Support/Error.h"

#include "gtest/gtest.h"

using namespace lldb;

using namespace llvm;

using namespace lldb_private;

static std::string g_error_no_section_list = "no section list provided";

static std::string g_error_both_value_and_address =

"symbol cannot contain both a value and an address";

static std::string g_error_neither_value_or_address =

"symbol must contain either a value or an address";

TEST(JSONSymbolTest, DeserializeCodeAddress) {

std::string text = R"(

{

"name": "foo",

"type": "code",

"size": 32,

"address": 4096

})";

Expected<json::Value> json = json::parse(text);

ASSERT_TRUE(static_cast<bool>(json));

clayborg (Done): Change over to using EXPECT_THAT_EXPECTED. Repeat for all cases below. Suggested edit:

Expected<json::Value> json = json::parse(text);

- ASSERT_TRUE(static_cast<bool>(json));

+ EXPECT_THAT_EXPECTED(json, llvm::Succeeded());

json::Path::Root root;

JSONSymbol json_symbol;

ASSERT_TRUE(fromJSON(*json, json_symbol, root));

SectionSP sect_sp(new Section(

/*module_sp=*/ModuleSP(),

/*obj_file=*/nullptr,

/*sect_id=*/1,

/*name=*/ConstString(".text"),

/*sect_type=*/eSectionTypeCode,

/*file_vm_addr=*/0x1000,

/*vm_size=*/0x1000,

/*file_offset=*/0,

/*file_size=*/0,

/*log2align=*/5,

/*flags=*/0x10203040));

SectionList sect_list;

sect_list.AddSection(sect_sp);

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

EXPECT_THAT_EXPECTED(symbol, llvm::Succeeded());

clayborg (Done): Use EXPECT_THAT_EXPECTED for clarity. Suggested edit:

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

- ASSERT_TRUE(static_cast<bool>(symbol));

+ EXPECT_THAT_EXPECTED(symbol, llvm::Succeeded());

EXPECT_EQ(symbol->GetName(), ConstString("foo"));

EXPECT_EQ(symbol->GetFileAddress(), static_cast<lldb::addr_t>(0x1000));

EXPECT_EQ(symbol->GetType(), eSymbolTypeCode);

}

TEST(JSONSymbolTest, DeserializeCodeValue) {

std::string text = R"(

{

"name": "foo",

"type": "code",

"size": 32,

"value": 4096

})";

Expected<json::Value> json = json::parse(text);

EXPECT_THAT_EXPECTED(json, llvm::Succeeded());

json::Path::Root root;

JSONSymbol json_symbol;

ASSERT_TRUE(fromJSON(*json, json_symbol, root));

SectionList sect_list;

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

EXPECT_THAT_EXPECTED(symbol, llvm::Succeeded());

clayborg (Done), suggested edit:

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

- ASSERT_TRUE(static_cast<bool>(symbol));

+ EXPECT_THAT_EXPECTED(symbol, llvm::Succeeded());

EXPECT_EQ(symbol->GetName(), ConstString("foo"));

EXPECT_EQ(symbol->GetRawValue(), static_cast<lldb::addr_t>(0x1000));

EXPECT_EQ(symbol->GetType(), eSymbolTypeCode);

}

TEST(JSONSymbolTest, JSONInvalidValueAndAddress) {

std::string text = R"(

{

"name": "foo",

"type": "code",

"size": 32,

"value": 4096,

"address": 4096

})";

Expected<json::Value> json = json::parse(text);

EXPECT_THAT_EXPECTED(json, llvm::Succeeded());

json::Path::Root root;

JSONSymbol json_symbol;

ASSERT_FALSE(fromJSON(*json, json_symbol, root));

}

TEST(JSONSymbolTest, JSONInvalidNoValueOrAddress) {

std::string text = R"(

{

"name": "foo",

"type": "code",

"size": 32

})";

Expected<json::Value> json = json::parse(text);

EXPECT_THAT_EXPECTED(json, llvm::Succeeded());

json::Path::Root root;

JSONSymbol json_symbol;

ASSERT_FALSE(fromJSON(*json, json_symbol, root));

}

TEST(JSONSymbolTest, JSONInvalidType) {

std::string text = R"(

{

"name": "foo",

"type": "bogus",

"value": 4096,

"size": 32

})";

Expected<json::Value> json = json::parse(text);

EXPECT_THAT_EXPECTED(json, llvm::Succeeded());

json::Path::Root root;

JSONSymbol json_symbol;

ASSERT_FALSE(fromJSON(*json, json_symbol, root));

}

TEST(JSONSymbolTest, SymbolInvalidNoSectionList) {

JSONSymbol json_symbol;

json_symbol.value = 0x1;

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, nullptr);

EXPECT_THAT_EXPECTED(symbol,

llvm::FailedWithMessage(g_error_no_section_list));

clayborg (Done): Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases. Suggested edit:

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, nullptr);

- ASSERT_FALSE(static_cast<bool>(symbol));

- EXPECT_EQ(toString(symbol.takeError()), g_error_no_section_list);

+ EXPECT_THAT_EXPECTED(symbol, llvm::FailedWithMessage(g_error_no_section_list));

}

clayborg (Not Done): Missed one EXPECT_THAT_EXPECTED. Feel free to fix and submit without approval.

TEST(JSONSymbolTest, SymbolInvalidValueAndAddress) {

JSONSymbol json_symbol;

json_symbol.value = 0x1;

json_symbol.address = 0x2;

SectionList sect_list;

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

EXPECT_THAT_EXPECTED(symbol,

llvm::FailedWithMessage(g_error_both_value_and_address));

clayborg (Done): Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases. Suggested edit:

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

- ASSERT_FALSE(static_cast<bool>(symbol));

- EXPECT_EQ(toString(symbol.takeError()), g_error_both_value_and_address);

+ EXPECT_THAT_EXPECTED(symbol, llvm::FailedWithMessage(g_error_both_value_and_address));

}

TEST(JSONSymbolTest, SymbolInvalidNoValueOrAddress) {

JSONSymbol json_symbol;

SectionList sect_list;

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

EXPECT_THAT_EXPECTED(

symbol, llvm::FailedWithMessage(g_error_neither_value_or_address));

clayborg (Done): Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases. Suggested edit:

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

- ASSERT_FALSE(static_cast<bool>(symbol));

- EXPECT_EQ(toString(symbol.takeError()), g_error_neither_value_or_address);

+ EXPECT_THAT_EXPECTED(symbol, llvm::FailedWithMessage(g_error_neither_value_or_address));

}

TEST(JSONSymbolTest, SymbolInvalidAddressNotInSection) {

JSONSymbol json_symbol;

json_symbol.address = 0x0fff;

SectionSP sect_sp(new Section(

/*module_sp=*/ModuleSP(),

/*obj_file=*/nullptr,

/*sect_id=*/1,

/*name=*/ConstString(".text"),

/*sect_type=*/eSectionTypeCode,

/*file_vm_addr=*/0x1000,

/*vm_size=*/0x1000,

/*file_offset=*/0,

/*file_size=*/0,

/*log2align=*/5,

/*flags=*/0x10203040));

SectionList sect_list;

sect_list.AddSection(sect_sp);

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

EXPECT_THAT_EXPECTED(

symbol, llvm::FailedWithMessage("no section found for address: 0xfff"));

}

clayborg (Done): Use EXPECT_THAT_EXPECTED for all "Expected<Symbol>" error cases. Suggested edit:

Expected<Symbol> symbol = Symbol::FromJSON(json_symbol, &sect_list);

- ASSERT_FALSE(static_cast<bool>(symbol));

- EXPECT_EQ(toString(symbol.takeError()),

- "no section found for address: 0xfff");

+ EXPECT_THAT_EXPECTED(symbol, llvm::FailedWithMessage("no section found for address: 0xfff"));


Revision Contents

Diff 503611

lldb/include/lldb/Symbol/Symbol.h

lldb/source/Plugins/ObjectFile/CMakeLists.txt

lldb/source/Plugins/ObjectFile/JSON/CMakeLists.txt

lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.h

lldb/source/Plugins/ObjectFile/JSON/ObjectFileJSON.cpp

lldb/source/Plugins/SymbolFile/CMakeLists.txt

lldb/source/Plugins/SymbolFile/JSON/CMakeLists.txt

lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.h

lldb/source/Plugins/SymbolFile/JSON/SymbolFileJSON.cpp

lldb/source/Symbol/Symbol.cpp

lldb/test/API/macosx/symbols/Makefile

lldb/test/API/macosx/symbols/TestSymbolFileJSON.py

lldb/test/API/macosx/symbols/main.c

lldb/unittests/Symbol/CMakeLists.txt

lldb/unittests/Symbol/JSONSymbolTest.cpp
