This is an archive of the discontinued LLVM Phabricator instance.

Differential D102224

Add option to llvm-gsymutil to read addresses from stdin.
ClosedPublic

Authored by simon.giesecke on May 11 2021, 1:51 AM.

Details

Reviewers

clayborg
echristo

Commits

rG0ddc75fd0834: Add option to llvm-gsymutil to read addresses from stdin.

Summary

llvm-symbolizer and llvm-addr2line can read addresses
from stdin, which makes them convenient to use when
a large number of addresses has to be resolved
(possibly too many to pass as command-line arguments)
or when not all addresses are known at the same time.

This patch adds a --addresses-from-stdin option to
llvm-gsymutil to allow the same.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

simon.giesecke requested review of this revision. May 11 2021, 1:51 AM

simon.giesecke created this revision.

Herald added a project: Restricted Project. May 11 2021, 1:51 AM

Herald added a subscriber: llvm-commits.

I know this isn't covered by tests right now. I am not sure what the policy is for the command line utilities. Currently, llvm-gsymutil doesn't have any tests IIUC. If tests should be added, please direct me to a good example, and I will be happy to add them.

Harbormaster completed remote builds in B103687: Diff 344325. May 11 2021, 2:33 AM

Hm, clang-tidy complains about two issues in code I copied from llvm-symbolizer. Should these be addressed?

In D102224#2750014, @simon.giesecke wrote:

I know this isn't covered by tests right now. I am not sure what the policy is for the command line utilities. Currently, llvm-gsymutil doesn't have any tests IIUC. If tests should be added, please direct me to a good example, and I will be happy to add them.

yes tests should be added and there are indeed llvm-gsymutil tests. See this directory:

llvm/test/tools/llvm-gsymutil

In D102224#2750529, @simon.giesecke wrote:

Hm, clang-tidy complains about two issues in code I copied from llvm-symbolizer. Should these be addressed?

if it is new code, then yes.

One thing to note is that there is library code that tools can directly link to, just like llvm-gsymutil links against the "DebugInfoGSYM" library. Then your lookups are very simple and you would call a function just like doLookup() that you added. Is there a reason you are wanting to run a command line tool instead of linking against the code?

Also, I am not a fan of text scraping from the output of a tool unless it is purely for humans to read. If you are going to use this output from another tool, I would vote to add a new option: "--json" so that the output would be formatted using structured data like JSON. Something like:

$ llvm-gsymutil --json --addresses-from-stdin
0x1000 /tmp/a.gsym
0x2000 /tmp/b.gsym

And this would return output in a specific JSON format, something like:

[
  { "lookupAddress": 4096, "gsym": "/tmp/a.gsym", ... },
  { "lookupAddress": 8192, "gsym": "/tmp/b.gsym", ... }
]

The "..." above would be a JSON version of the llvm::gsym::LookupResult object, or an error message.
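
To make the proposed shape concrete, here is a minimal sketch of how one entry could be rendered. It is only an illustration: LookupRecord, escapeJson and toJson are hypothetical names, the record is a simplified stand-in rather than llvm::gsym::LookupResult, and only the two keys named above are emitted.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical, simplified stand-in for one lookup request. This is NOT
// llvm::gsym::LookupResult; it only carries the two keys shown above.
struct LookupRecord {
  uint64_t LookupAddress;
  std::string GsymPath;
};

// Escape a few common characters that must not appear raw inside a JSON
// string. Other control characters are passed through unchanged in this
// sketch.
std::string escapeJson(const std::string &In) {
  std::string Out;
  for (char C : In) {
    switch (C) {
    case '"':
      Out += "\\\"";
      break;
    case '\\':
      Out += "\\\\";
      break;
    case '\n':
      Out += "\\n";
      break;
    case '\t':
      Out += "\\t";
      break;
    default:
      Out += C;
      break;
    }
  }
  return Out;
}

// Render one record as a JSON object; the address is written in decimal.
std::string toJson(const LookupRecord &R) {
  std::ostringstream OS;
  OS << "{ \"lookupAddress\": " << R.LookupAddress << ", \"gsym\": \""
     << escapeJson(R.GsymPath) << "\" }";
  return OS.str();
}
```

A real implementation inside the tool would more likely go through LLVM's own JSON support than hand-rolled escaping; the sketch only shows the intended output shape.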

llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
117	You will need to specify what the stdin format needs to be. From reading the code below, it seems to be: <addr> <gsym-path> [<addr> <gsym-path> ...]
490	There should be a test for this error.
501	Can we use the C++ STL here? Or there might be other LLVM tools that read stdin through some LLVM input wrapper. Something like: std::string line; std::getline(std::cin, line);
505	What happens if you send the following two lines of input into this tool: "0x1000 /tmp/a.gsym" and "0x2000 /tmp/b.gsym"? If I read the code above correctly, it will end up with the string "0x1000 /tmp/a.gsym0x2000 /tmp/b.gsym", because you are erasing all newline characters, so the newline between "0x1000 /tmp/a.gsym" and "0x2000 /tmp/b.gsym" will be removed.
507–509	Does any other tool use this type of stdin format where you specify an address and a file? Unless an existing tool does (like atos?), I would think that we would run the tool with a GSYM file and then do multiple lookups on that one GSYM file: $ llvm-gsymutil /tmp/a.gsym --addresses-from-stdin, followed by 0x1000 0x2000 0x3000 on stdin.
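
For reference, the line-oriented "<addr> <gsym-path>" format discussed in these inline comments can be read with plain STL calls roughly as follows. This is a sketch under assumptions, not the code from the patch: LookupRequest, parseLine and readRequests are hypothetical names, malformed lines are silently skipped, and the path is taken to be everything after the first run of whitespace.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one parsed input line.
struct LookupRequest {
  uint64_t Addr;
  std::string GsymPath;
};

// Parse a single "<addr> <gsym-path>" line. Returns false (and leaves Req
// untouched) when the address or the path is missing or malformed.
bool parseLine(const std::string &Line, LookupRequest &Req) {
  std::istringstream In(Line);
  std::string AddrText, Path;
  if (!(In >> AddrText))
    return false;
  // Everything after the separating whitespace is treated as the path.
  std::getline(In >> std::ws, Path);
  if (Path.empty())
    return false;
  uint64_t Addr = 0;
  try {
    std::size_t Pos = 0;
    // Base 0 lets std::stoull accept both "0x1000" and "4096".
    Addr = std::stoull(AddrText, &Pos, 0);
    if (Pos != AddrText.size())
      return false;
  } catch (const std::exception &) {
    return false;
  }
  Req.Addr = Addr;
  Req.GsymPath = Path;
  return true;
}

// Read requests one line at a time, so newlines separate the entries and
// are never concatenated into the next line.
std::vector<LookupRequest> readRequests(std::istream &In) {
  std::vector<LookupRequest> Requests;
  std::string Line;
  while (std::getline(In, Line)) {
    LookupRequest Req;
    if (parseLine(Line, Req))
      Requests.push_back(Req);
  }
  return Requests;
}
```

Calling readRequests(std::cin) would process stdin; because std::getline consumes one line per call, the concatenation concern raised for line 505 does not arise in this shape.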

This revision now requires changes to proceed. May 11 2021, 2:37 PM

Sounds like, after I reread the description, that we have other tools in the llvm ecosystem that use this <addr> <file> format... Sorry for the noise. I will add inline comments to clarify any needed changes.

So looks like we just need tests:

check error when user specifies --addresses-from-stdin and also an address
check error when user specifies --addresses-from-stdin and also an input file
check for successful lookups on multiple address + file tuples

And it would be good to specify the input format for the --addresses-from-stdin option in the option description.

llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
505	Never mind, we are grabbing one line at a time, so this wouldn't happen... Ignore above comment.
509	Ignore this, after I reread the description, it seems clear there are other tools using this same format.

One optimization idea: if exactly one input file is specified, we could accept only addresses on stdin. Something like:

$ llvm-gsymutil --addresses-from-stdin /tmp/a.gsym
0x1000 0x2000 0x3000

If this is desirable, we would need to implement and document this in the --addresses-from-stdin help text.

In D102224#2752153, @clayborg wrote:

In D102224#2750014, @simon.giesecke wrote:

I know this isn't covered by tests right now. I am not sure what the policy is for the command line utilities. Currently, llvm-gsymutil doesn't have any tests IIUC. If tests should be added, please direct me to a good example, and I will be happy to add them.

yes tests should be added and there are indeed llvm-gsymutil tests. See this directory:

llvm/test/tools/llvm-gsymutil

Ah, sorry for missing that. It seems so obvious now that I don't know how I missed it.

In D102224#2750529, @simon.giesecke wrote:

Hm, clang-tidy complains about two issues in code I copied from llvm-symbolizer. Should these be addressed?

if it is new code, then yes.

One thing to note is that there is library code that tools can directly link to, just like llvm-gsymutil links against the "DebugInfoGSYM" library. Then your lookups are very simple and you would call a function just like doLookup() that you added. Is there a reason you are wanting to run a command line tool instead of linking against the code?

In a first step, we just wanted to switch from one command line tool (addr2line) to another. For some use cases, linking against the library would be an option, and probably indeed one we will take in a future step. In other use cases, this might be less feasible, notably pprof, which is written in Go, which also calls other command line utilities such as nm, addr2line or llvm-symbolizer.

Also, I am not a fan of text scraping from the output of a tool unless it is purely for humans to read. If you are going to use this output from another tool, I would vote to add a new option: "--json" so that the output would be formatted using structured data like JSON. Something like:
$ llvm-gsymutil --json --addresses-from-stdin
0x1000 /tmp/a.gsym
0x2000 /tmp/b.gsym
And this would return output in a specific JSON format, something like:
[
  { "lookupAddress": 4096, "gsym": "/tmp/a.gsym", ... },
  { "lookupAddress": 8192, "gsym": "/tmp/b.gsym", ... }
]
The "..." above would be a JSON version of the llvm::gsym::LookupResult object, or an error message.

I agree that having a JSON output would be better for further processing. I just wanted to keep changes minimal for now, and stick with the output format the tool has. Just after putting up this review, I noticed that https://reviews.llvm.org/D96883 added a --json option to llvm-symbolizer.

We should probably use the same format here.

That makes me wonder whether the direction this takes is actually the right one. Another option might be to add support for GSYM to llvm-symbolizer. Do you think this would be a better direction?

In D102224#2753356, @simon.giesecke wrote:

That makes me wonder whether the direction this takes is actually the right one. Another option might be to add support for GSYM to llvm-symbolizer. Do you think this would be a better direction?

It would be great to add support for GSYM into llvm-symbolizer! It should also be very easy to add if users supply the symbol file to llvm-symbolizer, as it would be easy to sniff the first few bytes of the file for magic bytes and route to the correct file format's parser.
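
As a rough illustration of the magic-byte sniffing idea, the following sketch classifies a buffer by its first four bytes. It assumes the GSYM magic value 0x4753594d (GSYM_MAGIC in LLVM's GSYM header definitions) stored in the file's own byte order; GsymMagic and sniffGsymMagic are hypothetical names, and nothing here is existing llvm-symbolizer code.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical result type for the sniffing step.
enum class GsymMagic { None, LittleEndian, BigEndian };

// Inspect the first four bytes of a file image. The magic is assumed to be
// the 32-bit value 0x4753594d written in the file's byte order, so a
// big-endian file starts with 'G','S','Y','M' and a little-endian file
// starts with the same bytes reversed.
GsymMagic sniffGsymMagic(const unsigned char *Data, std::size_t Size) {
  if (Data == nullptr || Size < 4)
    return GsymMagic::None;
  // Assemble the bytes as a big-endian integer, independent of the host.
  const uint32_t AsBigEndian = (static_cast<uint32_t>(Data[0]) << 24) |
                               (static_cast<uint32_t>(Data[1]) << 16) |
                               (static_cast<uint32_t>(Data[2]) << 8) |
                               static_cast<uint32_t>(Data[3]);
  if (AsBigEndian == 0x4753594dU)
    return GsymMagic::BigEndian;
  if (AsBigEndian == 0x4d595347U)
    return GsymMagic::LittleEndian;
  return GsymMagic::None;
}
```

A caller could run this check before trying other object-file parsers and hand the file to a GSYM reader only when the result is not GsymMagic::None.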

That being said, since we have other tools that are already using this format, I am totally fine with this being in llvm-gsymutil.

Update using arc

Harbormaster completed remote builds in B104986: Diff 346112. May 18 2021, 5:38 AM

Addressed review comments

llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp
501	I copied this from `llvm-symbolizer` but I can surely change this to use STL. I can also use some other LLVM tools, but would need some guidance.

@clayborg I addressed your comments now. Adding support for --json is something I will look into separately, if that's fine for you.

Harbormaster completed remote builds in B105195: Diff 346400. May 19 2021, 5:16 AM

Fix path handling in test case on Windows

Harbormaster completed remote builds in B105225: Diff 346440. May 19 2021, 7:25 AM

Looks good to me. Thanks for the changes.

This revision is now accepted and ready to land. May 19 2021, 3:05 PM

This revision was landed with ongoing or failed builds. May 19 2021, 11:11 PM

Closed by commit rG0ddc75fd0834: Add option to llvm-gsymutil to read addresses from stdin. (authored by simon.giesecke).

This revision was automatically updated to reflect the committed changes.

simon.giesecke added a commit: rG0ddc75fd0834: Add option to llvm-gsymutil to read addresses from stdin.

That makes me wonder whether the direction this takes is actually the right one. Another option might be to add support for GSYM to llvm-symbolizer. Do you think this would be a better direction?

It would be great to add support for GSYM into llvm-symbolizer! It should also be very easy to add if users supply the symbol file to llvm-symbolizer, as it would be easy to sniff the first few bytes of the file for magic bytes and route to the correct file format's parser.

I am working on this now :) I am trying to find my way around llvm-symbolizer, and I came across DILineInfo, which is documented as "A format-neutral container for source line information." Is there already a way to get a DILineInfo from GSYM files? I haven't found any direct way to do this, in GsymReader or so.

Not yet! Feel free to add an accessor to convert any GSYM data structures into DI data structures. There are two main formats that come out of GSYM: the full llvm::gsym::FunctionInfo or the llvm::gsym::LookupResult. llvm::gsym::FunctionInfo is the complete information for the function with the name, address range, line table and inline info. This info covers all addresses in the function. llvm::gsym::LookupResult is just the information you need for a single address and in a much friendlier format where all strings have been retrieved from the string table. It should be very easy to make any needed conversions. Let me know if you have any questions.
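
A conversion along these lines might look like the sketch below. All three structs are simplified, hypothetical stand-ins: they are not the real llvm::gsym::LookupResult, llvm::gsym::SourceLocation or llvm::DILineInfo definitions, and their field names are assumptions chosen for illustration. The point is only the mapping: one per-address result with already-resolved strings becomes one line-info entry per source location.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for one resolved source location of a lookup.
struct SourceLocationSketch {
  std::string Name; // function name for this frame
  std::string Dir;  // directory part of the source path
  std::string Base; // file name part of the source path
  uint32_t Line;    // source line number
};

// Hypothetical stand-in for a per-address lookup result whose strings have
// already been pulled out of the string table.
struct LookupResultSketch {
  uint64_t LookupAddr;
  std::string FuncName;
  std::vector<SourceLocationSketch> Locations;
};

// Hypothetical stand-in for a format-neutral line-info record.
struct LineInfoSketch {
  std::string FileName;
  std::string FunctionName;
  uint32_t Line;
};

// Convert one lookup result into line-info records, one per location. If
// no locations are present, fall back to a single record that carries only
// the function name.
std::vector<LineInfoSketch> toLineInfos(const LookupResultSketch &R) {
  std::vector<LineInfoSketch> Out;
  for (const SourceLocationSketch &Loc : R.Locations) {
    LineInfoSketch LI;
    LI.FileName = Loc.Dir.empty() ? Loc.Base : Loc.Dir + "/" + Loc.Base;
    LI.FunctionName = Loc.Name;
    LI.Line = Loc.Line;
    Out.push_back(LI);
  }
  if (Out.empty()) {
    LineInfoSketch LI;
    LI.FunctionName = R.FuncName;
    LI.Line = 0;
    Out.push_back(LI);
  }
  return Out;
}
```

The real accessor would have to follow the actual field layout of both libraries; this sketch should not be read as their API.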

Ok, I just wanted to make sure I don't reinvent the wheel! Thanks for the directions, I'll get back to you if necessary :)

Revision Contents

Path	Size
llvm/test/tools/llvm-gsymutil/X86/elf-dwarf.yaml	7 lines
llvm/test/tools/llvm-gsymutil/cmdline.test	5 lines
llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp	109 lines

Diff 346637

llvm/test/tools/llvm-gsymutil/X86/elf-dwarf.yaml

 ## Test loading an ELF file with DWARF. First we make the ELF file from yaml,
 ## then we convert the ELF file to GSYM, then we do lookups on the newly
 ## created GSYM, and finally we dump the entire GSYM.

 # RUN: yaml2obj %s -o %t
 # RUN: llvm-gsymutil --convert %t -o %t.gsym 2>&1 | FileCheck %s --check-prefix=CONVERT
 # RUN: llvm-gsymutil --address=0x400391 --address=0x4004cd %t.gsym 2>&1 | FileCheck %s --check-prefix=ADDR
+# RUN: echo -e "0x400391 %/t.gsym\n0x4004cd %/t.gsym" | llvm-gsymutil --addresses-from-stdin 2>&1 | FileCheck %s --check-prefix=ADDRI --dump-input=always
+# RUN: llvm-gsymutil --address=0x400391 --address=0x4004cd --verbose %t.gsym 2>&1 | FileCheck %s --check-prefix=ADDRV --dump-input=always
 # RUN: llvm-gsymutil --address=0x400391 --address=0x4004cd --verbose %t.gsym 2>&1 | FileCheck %s --check-prefix=ADDRV --dump-input=always
 # RUN: llvm-gsymutil %t.gsym 2>&1 | FileCheck %s --check-prefix=DUMP

 # ADDR: Looking up addresses in "{{.*\.yaml\.tmp\.gsym}}":
 # ADDR: 0x0000000000400391: _init
 # ADDR: 0x00000000004004cd: main @ /tmp/main.cpp:1

+# ADDRI: 0x0000000000400391: _init
+# ADDRI-EMPTY:
+# ADDRI: 0x00000000004004cd: main @ /tmp/main.cpp:1
+# ADDRI-EMPTY:
+
 # ADDRV: Looking up addresses in "{{.*\.yaml\.tmp\.gsym}}":
 # ADDRV: FunctionInfo for 0x0000000000400391:
 # ADDRV: [0x0000000000400390 - 0x0000000000400390) "_init"
 # ADDRV: LookupResult for 0x0000000000400391:
 # ADDRV: 0x0000000000400391: _init
 # ADDRV: FunctionInfo for 0x00000000004004cd:
 # ADDRV: [0x00000000004004cd - 0x00000000004004df) "main"
 # ADDRV: LineTable:
[... 646 lines not shown ...]

llvm/test/tools/llvm-gsymutil/cmdline.test

 RUN: llvm-gsymutil -h 2>&1 | FileCheck --check-prefix=HELP %s
 RUN: llvm-gsymutil --help 2>&1 | FileCheck --check-prefix=HELP %s
 HELP: OVERVIEW: A tool for dumping, searching and creating GSYM files.
 HELP: USAGE: llvm-gsymutil{{[^ ]*}} [options] <input GSYM files>
 HELP: OPTIONS:
 HELP: Conversion Options:
 HELP: --arch=<arch>
 HELP: --convert=<path>
 HELP: --num-threads=<n>
 HELP: --out-file=<path>
 HELP: --verify
 HELP: Generic Options:
 HELP: --help
 HELP: --version
 HELP: Lookup Options:
 HELP: --address=<addr>
+HELP: --addresses-from-stdin
 HELP: Options:
 HELP: --verbose

 RUN: llvm-gsymutil --version 2>&1 | FileCheck --check-prefix=VERSION %s
 VERSION: {{ version }}
+
+RUN: not llvm-gsymutil --addresses-from-stdin --address 0x12345678 | FileCheck --check-prefix=INCOMPATIBLE %s
+RUN: not llvm-gsymutil --addresses-from-stdin llvm-gsymutil | FileCheck --check-prefix=INCOMPATIBLE %s
+INCOMPATIBLE: error: no input files or addresses can be specified when using the --addresses-from-stdin option.

llvm/tools/llvm-gsymutil/llvm-gsymutil.cpp

[... 24 lines not shown ...]
 #include "llvm/Support/PrettyStackTrace.h"
 #include "llvm/Support/Regex.h"
 #include "llvm/Support/Signals.h"
 #include "llvm/Support/TargetSelect.h"
 #include "llvm/Support/raw_ostream.h"
 #include <algorithm>
 #include <cstring>
 #include <inttypes.h>
+#include <iostream>
 #include <map>
 #include <string>
 #include <system_error>
 #include <vector>

 #include "llvm/DebugInfo/GSYM/DwarfTransformer.h"
 #include "llvm/DebugInfo/GSYM/FunctionInfo.h"
 #include "llvm/DebugInfo/GSYM/GsymCreator.h"
[... 66 lines not shown; enclosing context: NumThreads("num-threads", ...]
                     "number of cores on the current machine."),
                cl::value_desc("n"), cat(ConversionOptions));

 static list<uint64_t> LookupAddresses("address",
                                       desc("Lookup an address in a GSYM file"),
                                       cl::value_desc("addr"),
                                       cat(LookupOptions));

+static opt<bool> LookupAddressesFromStdin(
+    "addresses-from-stdin",
+    desc("Lookup addresses in a GSYM file that are read from stdin\nEach input "
+         "line is expected to be of the following format: <addr> <gsym-path>"),
+    cat(LookupOptions));
+
 } // namespace
 /// @}
 //===----------------------------------------------------------------------===//

 static void error(StringRef Prefix, llvm::Error Err) {
   if (!Err)
     return;
   errs() << Prefix << ": " << Err << "\n";
   consumeError(std::move(Err));
   exit(1);
 }

 static void error(StringRef Prefix, std::error_code EC) {
   if (!EC)
     return;
   errs() << Prefix << ": " << EC.message() << "\n";
   exit(1);
 }


 /// If the input path is a .dSYM bundle (as created by the dsymutil tool),
 /// replace it with individual entries for each of the object files inside the
 /// bundle otherwise return the input path.
 static std::vector<std::string> expandBundle(const std::string &InputPath) {
   std::vector<std::string> BundlePaths;
   SmallString<256> BundlePath(InputPath);
   // Manually open up the bundle to avoid introducing additional dependencies.
   if (sys::fs::is_directory(BundlePath) &&
[... 121 lines not shown; enclosing context: else if (const auto *ELFObj = dyn_cast<object::ELF32BEObjectFile>(&Obj)) ...]
     return getImageBaseAddress(ELFObj->getELFFile());
   else if (const auto *ELFObj = dyn_cast<object::ELF64LEObjectFile>(&Obj))
     return getImageBaseAddress(ELFObj->getELFFile());
   else if (const auto *ELFObj = dyn_cast<object::ELF64BEObjectFile>(&Obj))
     return getImageBaseAddress(ELFObj->getELFFile());
   return llvm::None;
 }


 static llvm::Error handleObjectFile(ObjectFile &Obj,
                                     const std::string &OutFile) {
   auto ThreadCount =
       NumThreads > 0 ? NumThreads : std::thread::hardware_concurrency();
   auto &OS = outs();

   GsymCreator Gsym;

[... 18 lines not shown; enclosing context: for (const object::SectionRef &Sect : Obj.sections()) { ...]
     TextRanges.insert(AddressRange(StartAddr, StartAddr + Size));
   }

   // Make sure there is DWARF to convert first.
   std::unique_ptr<DWARFContext> DICtx = DWARFContext::create(Obj);
   if (!DICtx)
     return createStringError(std::errc::invalid_argument,
                              "unable to create DWARF context");
-  logAllUnhandledErrors(DICtx->loadRegisterInfo(Obj), OS,
-                        "DwarfTransformer: ");
+  logAllUnhandledErrors(DICtx->loadRegisterInfo(Obj), OS, "DwarfTransformer: ");

   // Make a DWARF transformer object and populate the ranges of the code
   // so we don't end up adding invalid functions to GSYM data.
   DwarfTransformer DT(*DICtx, OS, Gsym);
   if (!TextRanges.empty())
     Gsym.SetValidTextRanges(TextRanges);

   // Convert all DWARF to GSYM.
   if (auto Err = DT.convert(ThreadCount))
     return Err;

   // Get the UUID and convert symbol table to GSYM.
   if (auto Err = ObjectFileTransformer::convert(Obj, OS, Gsym))
     return Err;

   // Finalize the GSYM to make it ready to save to disk. This will remove
   // duplicate FunctionInfo entries where we might have found an entry from
   // debug info and also a symbol table entry from the object file.
   if (auto Err = Gsym.finalize(OS))
     return Err;

   // Save the GSYM file to disk.
-  support::endianness Endian = Obj.makeTriple().isLittleEndian() ?
-      support::little : support::big;
+  support::endianness Endian =
+      Obj.makeTriple().isLittleEndian() ? support::little : support::big;
   if (auto Err = Gsym.save(OutFile.c_str(), Endian))
     return Err;

   // Verify the DWARF if requested. This will ensure all the info in the DWARF
   // can be looked up in the GSYM and that all lookups get matching data.
   if (Verify) {
     if (auto Err = DT.verify(OutFile))
       return Err;
Show All 12 Lines	if (auto *Obj = dyn_cast<ObjectFile>(BinOrErr->get())) {
auto ArchName = ObjTriple.getArchName();		auto ArchName = ObjTriple.getArchName();
outs() << "Output file (" << ArchName << "): " << OutFile << "\n";		outs() << "Output file (" << ArchName << "): " << OutFile << "\n";
if (auto Err = handleObjectFile(*Obj, OutFile.c_str()))		if (auto Err = handleObjectFile(*Obj, OutFile.c_str()))
return Err;		return Err;
} else if (auto *Fat = dyn_cast<MachOUniversalBinary>(BinOrErr->get())) {		} else if (auto *Fat = dyn_cast<MachOUniversalBinary>(BinOrErr->get())) {
// Iterate over all contained architectures and filter out any that were		// Iterate over all contained architectures and filter out any that were
// not specified with the "--arch <arch>" option. If the --arch option was		// not specified with the "--arch <arch>" option. If the --arch option was
// not specified on the command line, we will process all architectures.		// not specified on the command line, we will process all architectures.
std::vector< std::unique_ptr<MachOObjectFile> > FilterObjs;		std::vector<std::unique_ptr<MachOObjectFile>> FilterObjs;
for (auto &ObjForArch : Fat->objects()) {		for (auto &ObjForArch : Fat->objects()) {
if (auto MachOOrErr = ObjForArch.getAsObjectFile()) {		if (auto MachOOrErr = ObjForArch.getAsObjectFile()) {
auto &Obj = **MachOOrErr;		auto &Obj = **MachOOrErr;
if (filterArch(Obj))		if (filterArch(Obj))
FilterObjs.emplace_back(MachOOrErr->release());		FilterObjs.emplace_back(MachOOrErr->release());
} else {		} else {
error(Filename, MachOOrErr.takeError());		error(Filename, MachOOrErr.takeError());
}		}
}		}
if (FilterObjs.empty())		if (FilterObjs.empty())
error(Filename, createStringError(std::errc::invalid_argument,		error(Filename, createStringError(std::errc::invalid_argument,
"no matching architectures found"));		"no matching architectures found"));

// Now handle each architecture we need to convert.		// Now handle each architecture we need to convert.
for (auto &Obj: FilterObjs) {		for (auto &Obj : FilterObjs) {
Triple ObjTriple(Obj->getArchTriple());		Triple ObjTriple(Obj->getArchTriple());
auto ArchName = ObjTriple.getArchName();		auto ArchName = ObjTriple.getArchName();
std::string ArchOutFile(OutFile);		std::string ArchOutFile(OutFile);
// If we are only handling a single architecture, then we will use the		// If we are only handling a single architecture, then we will use the
// normal output file. If we are handling multiple architectures append		// normal output file. If we are handling multiple architectures append
// the architecture name to the end of the out file path so that we		// the architecture name to the end of the out file path so that we
// don't overwrite the previous architecture's gsym file.		// don't overwrite the previous architecture's gsym file.
if (FilterObjs.size() > 1) {		if (FilterObjs.size() > 1) {
[33 unchanged lines not shown]	static llvm::Error convertFileToGSYM(raw_ostream &OS) {

for (auto Object : Objects) {		for (auto Object : Objects) {
if (auto Err = handleFileConversionToGSYM(Object, OutFile))		if (auto Err = handleFileConversionToGSYM(Object, OutFile))
return Err;		return Err;
}		}
return Error::success();		return Error::success();
}		}

		static void doLookup(GsymReader &Gsym, uint64_t Addr, raw_ostream &OS) {
		if (auto Result = Gsym.lookup(Addr)) {
		// If verbose is enabled dump the full function info for the address.
		if (Verbose) {
		if (auto FI = Gsym.getFunctionInfo(Addr)) {
		OS << "FunctionInfo for " << HEX64(Addr) << ":\n";
		Gsym.dump(OS, *FI);
		OS << "\nLookupResult for " << HEX64(Addr) << ":\n";
		}
		}
		OS << Result.get();
		} else {
		if (Verbose)
		OS << "\nLookupResult for " << HEX64(Addr) << ":\n";
		OS << HEX64(Addr) << ": ";
		logAllUnhandledErrors(Result.takeError(), OS, "error: ");
		}
		if (Verbose)
		OS << "\n";
		}

int main(int argc, char const *argv[]) {		int main(int argc, char const *argv[]) {
// Print a stack trace if we signal out.		// Print a stack trace if we signal out.
sys::PrintStackTraceOnErrorSignal(argv[0]);		sys::PrintStackTraceOnErrorSignal(argv[0]);
PrettyStackTraceProgram X(argc, argv);		PrettyStackTraceProgram X(argc, argv);
llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.		llvm_shutdown_obj Y; // Call llvm_shutdown() on exit.

llvm::InitializeAllTargets();		llvm::InitializeAllTargets();

const char *Overview =		const char *Overview =
"A tool for dumping, searching and creating GSYM files.\n\n"		"A tool for dumping, searching and creating GSYM files.\n\n"
"Specify one or more GSYM paths as arguments to dump all of the "		"Specify one or more GSYM paths as arguments to dump all of the "
"information in each GSYM file.\n"		"information in each GSYM file.\n"
"Specify a single GSYM file along with one or more --lookup options to "		"Specify a single GSYM file along with one or more --lookup options to "
"lookup addresses within that GSYM file.\n"		"lookup addresses within that GSYM file.\n"
"Use the --convert option to specify a file with option --out-file "		"Use the --convert option to specify a file with option --out-file "
"option to convert to GSYM format.\n";		"option to convert to GSYM format.\n";
HideUnrelatedOptions(		HideUnrelatedOptions({&GeneralOptions, &ConversionOptions, &LookupOptions});
{&GeneralOptions, &ConversionOptions, &LookupOptions});
cl::ParseCommandLineOptions(argc, argv, Overview);		cl::ParseCommandLineOptions(argc, argv, Overview);

if (Help) {		if (Help) {
PrintHelpMessage(/*Hidden =*/false, /*Categorized =*/true);		PrintHelpMessage(/*Hidden =*/false, /*Categorized =*/true);
return 0;		return 0;
}		}

raw_ostream &OS = outs();		raw_ostream &OS = outs();

if (!ConvertFilename.empty()) {		if (!ConvertFilename.empty()) {
// Convert DWARF to GSYM		// Convert DWARF to GSYM
if (!InputFilenames.empty()) {		if (!InputFilenames.empty()) {
OS << "error: no input files can be specified when using the --convert "		OS << "error: no input files can be specified when using the --convert "
"option.\n";		"option.\n";
return 1;		return 1;
}		}
// Call error() if we have an error and it will exit with a status of 1		// Call error() if we have an error and it will exit with a status of 1
if (auto Err = convertFileToGSYM(OS))		if (auto Err = convertFileToGSYM(OS))
error("DWARF conversion failed: ", std::move(Err));		error("DWARF conversion failed: ", std::move(Err));
return 0;		return 0;
}		}

		if (LookupAddressesFromStdin) {
		clayborg: There should be a test for this error.
		if (!LookupAddresses.empty() || !InputFilenames.empty()) {
		OS << "error: no input files or addresses can be specified when using "
		"the --addresses-from-stdin "
		"option.\n";
		return 1;
		}

		std::string InputLine;
		std::string CurrentGSYMPath;
		llvm::Optional<Expected<GsymReader>> CurrentGsym;

		clayborg: Can we use the C++ STL here? Or there might be other LLVM tools that read stdin through some LLVM input wrapper? `std::string line; std::getline(std::cin, line);`
		simon.giesecke (author): I copied this from `llvm-symbolizer`, but I can certainly change it to use the STL. I could also follow another LLVM tool, but would need some guidance.
		while (std::getline(std::cin, InputLine)) {
		// Strip newline characters.
		std::string StrippedInputLine(InputLine);
		llvm::erase_if(StrippedInputLine,
		clayborg: What happens if you send the following input into this tool: "0x1000 /tmp/a.gsym\n0x2000 /tmp/b.gsym\n"? If I read the code above correctly, it will end up with the string "0x1000 /tmp/a.gsym0x2000 /tmp/b.gsym", because you are erasing all newline characters, so the newline between "0x1000 /tmp/a.gsym" and "0x2000 /tmp/b.gsym" will be removed.
		clayborg: Never mind, we are grabbing one line at a time, so this wouldn't happen. Ignore the comment above.
		[](char c) { return c == '\r' || c == '\n'; });

		StringRef AddrStr, GSYMPath;
		std::tie(AddrStr, GSYMPath) =
		clayborg: Does any other tool use this type of stdin format, where you specify an address and a file? Unless an existing tool does (like atos?), I would think that we would run the tool with a GSYM file and then do multiple lookups on that one GSYM file: `$ llvm-gsymutil /tmp/a.gsym --addresses-from-stdin`, followed by the addresses 0x1000, 0x2000 and 0x3000 on stdin.
		clayborg: Ignore this. After I reread the description, it seems clear that other tools use this same format.
		llvm::StringRef{StrippedInputLine}.split(' ');

		if (GSYMPath != CurrentGSYMPath) {
		CurrentGsym = GsymReader::openFile(GSYMPath);
		if (!*CurrentGsym)
		error(GSYMPath, CurrentGsym->takeError());
		}

		uint64_t Addr;
		if (AddrStr.getAsInteger(0, Addr)) {
		OS << "error: invalid address " << AddrStr
		<< ", expected: Address GsymFile.\n";
		return 1;
		}

		doLookup(**CurrentGsym, Addr, OS);

		OS << "\n";
		OS.flush();
		}

		return EXIT_SUCCESS;
		}

// Dump or access data inside GSYM files		// Dump or access data inside GSYM files
for (const auto &GSYMPath : InputFilenames) {		for (const auto &GSYMPath : InputFilenames) {
auto Gsym = GsymReader::openFile(GSYMPath);		auto Gsym = GsymReader::openFile(GSYMPath);
if (!Gsym)		if (!Gsym)
error(GSYMPath, Gsym.takeError());		error(GSYMPath, Gsym.takeError());

if (LookupAddresses.empty()) {		if (LookupAddresses.empty()) {
Gsym->dump(outs());		Gsym->dump(outs());
continue;		continue;
}		}

// Lookup an address in a GSYM file and print any matches.		// Lookup an address in a GSYM file and print any matches.
OS << "Looking up addresses in \"" << GSYMPath << "\":\n";		OS << "Looking up addresses in \"" << GSYMPath << "\":\n";
for (auto Addr: LookupAddresses) {		for (auto Addr : LookupAddresses) {
if (auto Result = Gsym->lookup(Addr)) {		doLookup(*Gsym, Addr, OS);
// If verbose is enabled dump the full function info for the address.
if (Verbose) {
if (auto FI = Gsym->getFunctionInfo(Addr)) {
OS << "FunctionInfo for " << HEX64(Addr) << ":\n";
Gsym->dump(OS, *FI);
OS << "\nLookupResult for " << HEX64(Addr) << ":\n";
}
}
OS << Result.get();
} else {
if (Verbose)
OS << "\nLookupResult for " << HEX64(Addr) << ":\n";
OS << HEX64(Addr) << ": ";
logAllUnhandledErrors(Result.takeError(), OS, "error: ");
}
if (Verbose)
OS << "\n";
}		}
}		}
return EXIT_SUCCESS;		return EXIT_SUCCESS;
}		}
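
The per-line handling in the `--addresses-from-stdin` loop above (strip `\r` and `\n`, split at the first space into an address and a GSYM path, then parse the address) can be sketched without any LLVM dependency. This is only an illustration under stated assumptions: `LookupRequest` and `parseLookupLine` are hypothetical names that do not exist in llvm-gsymutil, and `std::stoull` with base 0 is a standard-library stand-in for `StringRef::getAsInteger(0, ...)`, so the set of accepted prefixes may differ from what the patch accepts.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <string>

// Hypothetical holder for one parsed request; not part of llvm-gsymutil.
struct LookupRequest {
  uint64_t Addr = 0;
  std::string GSYMPath;
};

// Parse one stdin line of the form "<address> <gsym-path>".
// Returns false if the address field is missing or is not a valid integer.
inline bool parseLookupLine(std::string Line, LookupRequest &Out) {
  // Strip carriage returns and newlines, as the patch does with erase_if.
  Line.erase(std::remove_if(Line.begin(), Line.end(),
                            [](char C) { return C == '\r' || C == '\n'; }),
             Line.end());

  // Split at the first space, similar in spirit to StringRef::split(' ').
  const std::size_t Space = Line.find(' ');
  const std::string AddrStr = Line.substr(0, Space);
  const std::string Path =
      Space == std::string::npos ? std::string() : Line.substr(Space + 1);

  // Require a leading digit so that signs and whitespace are rejected.
  if (AddrStr.empty() || !std::isdigit(static_cast<unsigned char>(AddrStr[0])))
    return false;

  try {
    std::size_t Pos = 0;
    // Base 0 auto-detects a "0x" (hex) or leading "0" (octal) prefix.
    const unsigned long long Value = std::stoull(AddrStr, &Pos, 0);
    if (Pos != AddrStr.size())
      return false; // Trailing characters after the number.
    Out.Addr = static_cast<uint64_t>(Value);
    Out.GSYMPath = Path;
    return true;
  } catch (const std::exception &) {
    return false; // Not a number, or out of range.
  }
}
```

A caller would feed each line obtained from `std::getline(std::cin, InputLine)` to `parseLookupLine` and, on success, look up `Addr` in the reader opened for `GSYMPath`. Because `std::getline` already delivers one line at a time, the newline stripping only has to deal with a trailing `\r` from CRLF input, which matches the conclusion reached in the review thread.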