Details

Reviewers

jhenderson
mcgrathr
phosek
MaskRay

Summary

This flag loads a JSON process context, e.g. emitted by llvm-symbolizer,
which records the runtime addresses that each virtual address in a
module corresponds to. This allows adjusting VMAs for display
automatically.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mysterymath created this revision. Apr 11 2023, 2:05 PM

Herald added a reviewer: MaskRay. Apr 11 2023, 2:05 PM

Herald added a project: Restricted Project.

mysterymath requested review of this revision. Apr 11 2023, 2:05 PM

Herald added a project: Restricted Project. Apr 11 2023, 2:05 PM

Herald added a subscriber: llvm-commits.

mysterymath added a parent revision: D148047: [Symbolizer] Parse ProcessContext from JSON. Apr 11 2023, 2:05 PM

Harbormaster completed remote builds in B224865: Diff 512576. Apr 11 2023, 2:06 PM

Remove spurious #include; git clang-format.

Harbormaster completed remote builds in B224897: Diff 512617. Apr 11 2023, 4:41 PM

jhenderson added inline comments. Apr 12 2023, 1:25 AM

llvm/docs/CommandGuide/llvm-objdump.rst
185	This reference to the llvm-symbolizer option should be a link to the relevant documentation.
llvm/test/tools/llvm-objdump/X86/markup-context.test
10 ↗	(On Diff #512617)	In this and other dumps below, since it's really only the VMA that you care about, you should omit the other columns (replace with wildcards where necessary), to make it clear what you're actually trying to test.
48 ↗	(On Diff #512617)	This will fail on various OSes. You should use the %errc... substitutions to get the platform-specific message. You'll need to pass them in as FileCheck variables.
59 ↗	(On Diff #512617)	I think it would be useful to have comments in the YAML to explain why each section/symbol etc is interesting to the test.
llvm/tools/llvm-objdump/llvm-objdump.cpp
3054	You should probably have a test case that shows that only the last --markup-context is used.
3058	There are three checks here, but only two test cases that I can map them to. Is one of them missing a test case?

Address comments.

llvm/test/tools/llvm-objdump/X86/markup-context.test
48 ↗	(On Diff #512617)	Ah thanks, TIL!

Harbormaster completed remote builds in B225198: Diff 512992. Apr 12 2023, 4:24 PM

LGTM.

llvm/test/tools/llvm-objdump/X86/markup-context.test
63 ↗	(On Diff #512992)	Nit: most new tests in llvm-objdump use `##` for comments. Comments should also end with a "."

This revision is now accepted and ready to land. Apr 13 2023, 12:05 AM

Comment changes.

MaskRay accepted this revision. Apr 13 2023, 1:27 PM

mysterymath marked an inline comment as done. Apr 13 2023, 1:41 PM

Harbormaster completed remote builds in B225428: Diff 513341. Apr 13 2023, 2:53 PM

Rebase.

mysterymath retitled this revision from "[llvm-objdump] Add --markup-context to adjust VMAs." to "[llvm-objdump] Add --process-context to adjust VMAs." Apr 24 2023, 4:21 PM

mysterymath edited the summary of this revision.

Harbormaster completed remote builds in B227844: Diff 516569. Apr 24 2023, 5:53 PM

I hadn't really expected to have an option like this. I'd always presumed one would just use separate scripting to process the JSON into --build-id and --adjust-vma switches. But this can handle nonuniform adjustments for a module that doesn't have a constant runtime - link-time load bias across all its segments (which is possible in various general cases, though not usually used in common ELF layouts).
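To make the nonuniform case concrete, here is a minimal sketch of a per-mapping lookup, using the two mappings from the context.json in this revision's test. The struct and function names (MMapSketch, toRuntimeAddr) are hypothetical stand-ins, not the ProcessContext API from the parent revision. The first mapping has a load bias of 0x123000 and the second of 0xabbffc, so no single --adjust-vma value could describe both.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for one "mmaps" entry of the JSON process context.
struct MMapSketch {
  uint64_t Address;               // Runtime address where the mapping starts.
  uint64_t Size;                  // Size of the mapping in bytes.
  uint64_t ModuleRelativeAddress; // Link-time address where the mapping starts.
};

// Translates a module-relative address to a runtime address. Returns false
// and leaves Out untouched if no mapping contains the address.
bool toRuntimeAddr(const std::vector<MMapSketch> &MMaps, uint64_t MRAddr,
                   uint64_t &Out) {
  for (const MMapSketch &MM : MMaps) {
    if (MRAddr >= MM.ModuleRelativeAddress &&
        MRAddr - MM.ModuleRelativeAddress < MM.Size) {
      Out = MM.Address + (MRAddr - MM.ModuleRelativeAddress);
      return true;
    }
  }
  return false;
}

// The two mappings from context.json in the test, written in hex.
const std::vector<MMapSketch> TestMMaps = {
    {0x123000, 2, 0}, // "address": 1191936
    {0xabc000, 4, 4}, // "address": 11255808
};
```

With these mappings, module-relative address 1 translates to 0x123001 and address 4 to 0xabc000, matching the addresses the test expects for func and .data. Addresses 2 and 3 fall in no mapping; the patch's adjustVMA leaves such addresses unadjusted.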

AFAICT, this basically uses --process-context to replace --adjust-vma and nothing else. That seems a bit odd to me, since the JSON format contains a) multiple modules and b) build IDs for each module. But here you are just looking at whatever one file the user pointed objdump at via file name or -build-id switch, with no cross-checking that the module looked up from the JSON mmap data has either the same build ID as the chosen file or a set of memory regions congruent with the phdrs (if ELF, or equivalent elsewhere) in the file. It's fine for objdump not to be doing all this stuff, but I think I'd want something doing that and so just having objdump parse the same JSON I'd want to preprocess in other ways to save me passing the --adjust-vma I could compute while doing that preprocessing seems like it might not be enough of a feature to bother supporting.

I can imagine a couple of modes for a more thoroughgoing feature that seems like it could more nearly eliminate the need to do separate JSON-based scripting to drive using this in practice.

One approach is to have JSON provide simply an address adjustment reference in addition to separate file selection, as you have here, but with some robustness cross-checking. That is, if the input file specified (and already opened by now) has a build ID, then only use address adjustments that apply to a module specified to have that build ID. If the input file has no build ID, then a good fallback cross-check that the static address has a valid runtime correspondence is to normalize the mmap region list for each module (or just each module with no build ID) in the input and compare that against the normalized (i.e. address-sorted and page-rounded) list of segments in the file's headers (PT_LOAD phdrs for ELF). Perhaps have a mode (or default) to error/warn if any static address being presented doesn't map to any module.
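The fallback cross-check could look roughly like the sketch below. It assumes both the module's mmap regions and the file's loadable segments have already been reduced to half-open module-relative ranges; the names (Range, normalizeRanges, layoutsCongruent) are hypothetical, and merging ranges that touch after rounding is an added assumption beyond the "address-sorted and page-rounded" description.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// A half-open address range [first, second).
using Range = std::pair<uint64_t, uint64_t>;

// Rounds each range outward to page boundaries, sorts by start address, and
// merges ranges that overlap or touch. PageSize must be a power of two.
std::vector<Range> normalizeRanges(std::vector<Range> Ranges,
                                   uint64_t PageSize) {
  for (Range &R : Ranges) {
    R.first &= ~(PageSize - 1);
    R.second = (R.second + PageSize - 1) & ~(PageSize - 1);
  }
  std::sort(Ranges.begin(), Ranges.end());
  std::vector<Range> Out;
  for (const Range &R : Ranges) {
    if (!Out.empty() && R.first <= Out.back().second)
      Out.back().second = std::max(Out.back().second, R.second);
    else
      Out.push_back(R);
  }
  return Out;
}

// True if the mmap ranges and the segment ranges normalize to the same list.
bool layoutsCongruent(const std::vector<Range> &MMapRanges,
                      const std::vector<Range> &SegmentRanges,
                      uint64_t PageSize) {
  return normalizeRanges(MMapRanges, PageSize) ==
         normalizeRanges(SegmentRanges, PageSize);
}
```

A caller would run this only for modules without a build ID, and warn or error when no module in the context is congruent with the input file.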

A fancier approach is to make any specified input files (i.e. file names or -build-id switches) identify a subset of the JSON-based list of modules. That is, the JSON-based list of modules acts like a list of -build-id switches to go and find a bunch of modules. The command-line list acts as a filter to identify a subset of those to actually find. There could be a new --all-modules switch or just no file/build-id arguments could mean all when there is context input (instead of a command-line error as it always is now). Then you act like objdump always does with multiple file arguments: dump the requested parts from one file, then the next, with "file blah in format soforth:" lines before each one (perhaps something novel in lieu of file name when it came from -build-id or implicit debuginfod fetching of all in the module list, like the build ID or the resolved debuginfod URL instead of or in addition to the possibly empty and/or synthetic file name from the module element name field).

An especially script-friendly variant of the fancy approach could accept an enriched version of the JSON schema. This would be easy for a script to inject into the context-capture JSON without ingesting it in any full way. Each module object in the JSON lists can have additional keys giving an output file name and perhaps a set of objdump switches to apply for that particular module's output (which can then combine with and/or supersede command-line switches selecting output details). This would do nearly all the work of the original scripting use case I had in mind, which would just use the symbolizer to parse markup and then lightly filter its JSON to inject {"output":"dir/{moduleid}.{name}.{buildid}.lst", "args":["-drl", "--demangle"]} and then feed that to objdump. Making objdump that fancy makes it tempting to offer some canned version without JSON massaging for objdump -dl --output=%{id}.%{name}.%{buildid}.lst or some such ad hoc syntax like various things use for generating dump file names with interpolated user strings, since that would replace the whole of my intended script. But making it a purely scriptable piece requiring some jq pipelines or the like remains extremely easy to cobble together from the smaller general pieces and not bother to fret about the precise UI details of any all-in-one switch features.
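As an illustration of the interpolated output names, the sketch below expands {key} placeholders in a pattern such as the "output" value above. The function name expandPattern and the pass-through handling of unknown keys are assumptions made for the example; nothing like this exists in the patch.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Expands "{key}" placeholders in Pattern using Values. Unknown keys and
// unterminated braces are copied through unchanged.
std::string expandPattern(const std::string &Pattern,
                          const std::map<std::string, std::string> &Values) {
  std::string Out;
  std::size_t I = 0;
  while (I < Pattern.size()) {
    if (Pattern[I] == '{') {
      std::size_t Close = Pattern.find('}', I + 1);
      if (Close != std::string::npos) {
        auto It = Values.find(Pattern.substr(I + 1, Close - I - 1));
        if (It != Values.end()) {
          Out += It->second;
          I = Close + 1;
          continue;
        }
      }
    }
    Out += Pattern[I++];
  }
  return Out;
}
```

For the single module in this revision's test context (id 0, name a.o, build ID ab), the pattern dir/{moduleid}.{name}.{buildid}.lst would expand to dir/0.a.o.ab.lst.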

In a different all-in-one vein as I raised for contemplation on an earlier change, we again might consider these features in the more general light of different kinds of ProcessContext input via direct switches using common library code rather than only via the JSON bottleneck. (Though maybe instead we do want a JSON bottleneck, I don't know.) For objdump, making use of that kind of requires either the pure subset-specified-by-other-means or some kind of all-in-one output selector if you don't just want everything to go sequentially to stdout. But for parity with the symbolizer, it's kind of compelling to get even better analogs to:

$ eu-unstrip -n -p $$
0x55a546ea9000+0x134000 - - - /usr/bin/bash (deleted)
0x7fbe42000000+0x2e9000 - - - /usr/lib/locale/locale-archive (deleted)
0x7fbe42347000+0x1d4000 e144007f35d794adf218479af5ddcb2a11a2c583@0x7fbe42347380 /usr/lib/x86_64-linux-gnu/libc.so.6 /usr/lib/debug/.build-id/e1/44007f35d794adf218479af5ddcb2a11a2c583.debug /usr/lib/x86_64-linux-gnu/libc.so.6
0x7fbe42528000+0x32000 - - - /usr/lib/x86_64-linux-gnu/libtinfo.so.6.3 (deleted)
0x7fbe42563000+0x7000 697a51fa9b8ee632114d4e54f6f8ba7304f19700@0x7fbe42563248 /usr/lib/x86_64-linux-gnu/libnss_cache.so.2.0 - /usr/lib/x86_64-linux-gnu/libnss_cache.so.2.0
0x7fbe4256a000+0x7000 - /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache - /usr/lib/x86_64-linux-gnu/gconv/gconv-modules.cache
0x7fbe42573000+0x34000 74d101cb610a46ca719adfc8365bd257139ae610@0x7fbe42573248 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2 /usr/lib/debug/.build-id/74/d101cb610a46ca719adfc8365bd257139ae610.debug /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
0x7fffbb59b000+0x2000 - - - [vdso: 78194]
$ eu-addr2line -p $$ 0x7fbe42579000
./elf/./elf/dl-load.c:622:6

(I happen to have debuginfo for libc installed. Some of the tools can also read the vDSO out of memory to reconstruct it or acquire its build ID, and if you have the right kernel debuginfo package you can get source line information from your vDSO code addresses too. Linux doesn't quite let this tool read process memory for hairy reasons, so it only knows the build IDs for the modules it could open by gleaned file name. Otherwise it would be able to e.g. look up the bash binary via debuginfod even though it's been deleted locally by a package upgrade since that shell was launched, and the same for the vDSO.)

There isn't another objdump tool that integrates all that stuff the way a few elfutils tools and things like systemtap do (but eu-objdump doesn't), but it doesn't seem like an unworthy goal nonetheless. :-)

eu-unstrip is IIRC the only tool that deals with multiple module files as files (as opposed to e.g. eu-addr2line, which just uses them all as sources of symbols/debuginfo without caring how many different modules there are). It uses the approach of command-line args that are a match list of module names (or resolved file names, with another switch), with omitting those args meaning all modules (where other options to say where to place all the output are required, but just use a fixed name scheme in a specified output directory rather than some fancy user-specified interpolation thing).

Diff 516569

llvm/docs/CommandGuide/llvm-objdump.rst

Show First 20 Lines • Show All 173 Lines • ▼ Show 20 Lines	.. option:: -M, --disassembler-options=<opt1[,opt2,...]>

* ``reg-names-std``: ARM only (default). Print in ARM 's instruction set documentation, with r13/r14/r15 replaced by sp/lr/pc.		* ``reg-names-std``: ARM only (default). Print in ARM 's instruction set documentation, with r13/r14/r15 replaced by sp/lr/pc.
* ``reg-names-raw``: ARM only. Use r followed by the register number.		* ``reg-names-raw``: ARM only. Use r followed by the register number.
* ``no-aliases``: AArch64 and RISC-V only. Print raw instruction mnemonic instead of pseudo instruction mnemonic.		* ``no-aliases``: AArch64 and RISC-V only. Print raw instruction mnemonic instead of pseudo instruction mnemonic.
* ``numeric``: RISC-V only. Print raw register names instead of ABI mnemonic. (e.g. print x1 instead of ra)		* ``numeric``: RISC-V only. Print raw register names instead of ABI mnemonic. (e.g. print x1 instead of ra)
* ``att``: x86 only (default). Print in the AT&T syntax.		* ``att``: x86 only (default). Print in the AT&T syntax.
* ``intel``: x86 only. Print in the intel syntax.		* ``intel``: x86 only. Print in the intel syntax.

		.. option:: --process-context=<context.json>

		Adjust addresses using a JSON process context file, e.g. one produced from
		:option:`llvm-symbolizer --dump-process-context`. Note that llvm-symbolizer
		emits one or more contexts as an array of JSON objects, but this flag accepts
		either an object or an array with exactly one object.

.. option:: --mcpu=<cpu-name>		.. option:: --mcpu=<cpu-name>

Target a specific CPU type for disassembly. Specify ``--mcpu=help`` to display		Target a specific CPU type for disassembly. Specify ``--mcpu=help`` to display
available CPUs.		available CPUs.

.. option:: --mattr=<a1,+a2,-a3,...>		.. option:: --mattr=<a1,+a2,-a3,...>

Enable/disable target-specific attributes. Specify ``--mattr=help`` to display		Enable/disable target-specific attributes. Specify ``--mattr=help`` to display

llvm/test/tools/llvm-objdump/X86/process-context.test

This file was added.

				# RUN: split-file %s %t
				# RUN: yaml2obj %t/obj.yaml -o %t/obj.o
				# RUN: llvm-objdump --all-headers -D -z --process-context=%t/context.json %t/obj.o | FileCheck %s --check-prefixes=OUTPUT
				# RUN: llvm-objdump --all-headers -D -z --process-context=%t/missing.json --process-context=%t/context.json %t/obj.o | FileCheck %s --check-prefixes=OUTPUT
				# RUN: not llvm-objdump --all-headers -D -z --process-context=%t/missing.json %t/obj.o 2>&1 | FileCheck %s --check-prefixes=MISSING-FILE -DMSG=%errc_ENOENT
				# RUN: not llvm-objdump --all-headers -D -z --process-context=%t/invalid.json %t/obj.o 2>&1 | FileCheck %s --check-prefixes=INVALID-JSON
				# RUN: not llvm-objdump --all-headers -D -z --process-context=%t/invalid-context.json %t/obj.o 2>&1 | FileCheck %s --check-prefixes=INVALID-CONTEXT
				# RUN: not llvm-objdump --all-headers -D -z --process-context=%t/context.json %t/obj.o --adjust-vma=1 2>&1 | FileCheck %s --check-prefixes=FLAG-CONFLICT

				# OUTPUT: Sections:
				# OUTPUT-NEXT: Idx Name Size VMA
				# OUTPUT-NEXT: 0 {{.*}} 0000000000000000
				# OUTPUT-NEXT: 1 .text {{.*}} 0000000000123000
				# OUTPUT-NEXT: 2 .debug_str {{.*}} 0000000000000000
				# OUTPUT-NEXT: 3 .rela.debug_str {{.*}} 0000000000000000
				# OUTPUT-NEXT: 4 .data {{.*}} 0000000000abc000
				# OUTPUT-NEXT: 5 .rela.data {{.*}} 0000000000000000
				# OUTPUT-NEXT: 6 .symtab {{.*}} 0000000000000000
				# OUTPUT-NEXT: 7 .strtab {{.*}} 0000000000000000
				# OUTPUT-NEXT: 8 .shstrtab {{.*}} 0000000000000000

				# OUTPUT: SYMBOL TABLE:
				# OUTPUT-NEXT: 0000000000000001 {{.*}} func
				# OUTPUT-NEXT: 0000000000000000 {{.*}} sym
				# OUTPUT-NEXT: 0000000000000000 {{.*}} .text

				# OUTPUT: 0000000000123000 <sym>:
				# OUTPUT-NEXT: 123000: {{.*}} nop
				# OUTPUT: 0000000000123001 <func>:
				# OUTPUT-NEXT: 123001: {{.*}} retq

				# OUTPUT: 0000000000000000 <.debug_str>:
				# OUTPUT-NEXT: 0: {{.*}} %al, (%rax)
				# OUTPUT-NEXT: 0000000000123001: R_X86_64_32 .text
				# OUTPUT-NEXT: 2: {{.*}} addb %al, (%rax)

				# OUTPUT: 0000000000000000 <.rela.debug_str>:
				# OUTPUT-NEXT: 0: {{.*}} addl %eax, (%rax)
				## ... There are more lines here. We do not care.

				# OUTPUT: 0000000000abc000 <.data>:
				# OUTPUT-NEXT: abc000: {{.*}} addb %al, (%rax)
				# OUTPUT-NEXT: 0000000000abc000: R_X86_64_32 .text
				# OUTPUT-NEXT: abc002: {{.*}} addb %al, (%rax)

				# OUTPUT: 0000000000000000 <.rela.data>:
				# OUTPUT-NEXT: 0: {{.*}} addb %al, (%rax)
				## ... There are more lines here. We do not care.

				# MISSING-FILE: error: '{{.*}}missing.json': [[MSG]]
				# INVALID-JSON: error: '{{.*}}invalid.json': [2:0, byte=2]: Expected object key
				# INVALID-CONTEXT: error: '{{.*}}invalid-context.json': missing value at (root).modules
				# FLAG-CONFLICT: error: --adjust-vma and --process-context are incompatible

				#--- obj.yaml
				--- !ELF
				FileHeader:
				Class: ELFCLASS64
				Data: ELFDATA2LSB
				Type: ET_REL
				Machine: EM_X86_64
				Sections:
				## Demonstrates basic disassembly address adjustment.
				- Name: .text
				Type: SHT_PROGBITS
				Flags: [ SHF_ALLOC, SHF_EXECINSTR ]
				Address: 0
				AddressAlign: 0x0000000000000004
				Content: 90C3
				## Demonstrates that addresses are not adjusted for non-alloc sections.
				- Name: .debug_str
				Type: SHT_PROGBITS
				Flags: [ SHF_MERGE, SHF_STRINGS ]
				AddressAlign: 0x0000000000000001
				Content: '00000000'
				## Demonstrates that addresses are adjusted for relocations to alloc sections.
				- Name: .rela.debug_str
				Type: SHT_RELA
				Link: .symtab
				AddressAlign: 0x0000000000000008
				Info: .debug_str
				Relocations:
				- Offset: 0x0000000000000001
				Symbol: .text
				Type: R_X86_64_32
				## Demonstrates that addresses are adjusted for disassemblies of any alloc
				## section.
				- Name: .data
				Type: SHT_PROGBITS
				Flags: [ SHF_WRITE, SHF_ALLOC ]
				AddressAlign: 0x0000000000000001
				Address: 0x0000000000000004
				Content: '00000000'
				## Demonstrates that addresses are adjusted for relocs to any alloc section.
				- Name: .rela.data
				Type: SHT_RELA
				Link: .symtab
				AddressAlign: 0x0000000000000008
				Info: .data
				Relocations:
				- Offset: 0x0000000000000000
				Symbol: .text
				Type: R_X86_64_32
				Symbols:
				- Name: func
				Type: STT_FUNC
				Section: .text
				Value: 0x0000000000000001
				- Name: sym
				Section: .text
				- Name: .text
				Type: STT_SECTION
				Section: .text

				#--- context.json
				{
				"modules" : [{
				"id": 0,
				"name": "a.o",
				"type": "elf",
				"buildID": "ab"
				}],
				"mmaps": [
				{
				"address": 1191936,
				"size": 2,
				"type": "load",
				"moduleID": 0,
				"mode": "rx",
				"moduleRelativeAddress": 0
				},
				{
				"address": 11255808,
				"size": 4,
				"type": "load",
				"moduleID": 0,
				"mode": "rx",
				"moduleRelativeAddress": 4
				}
				]
				}
				#--- invalid.json
				{
				#--- invalid-context.json
				{}

llvm/tools/llvm-objdump/ObjdumpOpts.td

	def mcpu_EQ : Joined<["--"], "mcpu=">,			def mcpu_EQ : Joined<["--"], "mcpu=">,
	MetaVarName<"cpu-name">,			MetaVarName<"cpu-name">,
	HelpText<"Target a specific cpu type (--mcpu=help for details)">;			HelpText<"Target a specific cpu type (--mcpu=help for details)">;

	def mattr_EQ : Joined<["--"], "mattr=">,			def mattr_EQ : Joined<["--"], "mattr=">,
	MetaVarName<"a1,+a2,-a3,...">,			MetaVarName<"a1,+a2,-a3,...">,
	HelpText<"Target specific attributes (--mattr=help for details)">;			HelpText<"Target specific attributes (--mattr=help for details)">;

				def process_context_EQ : Joined<["--"], "process-context=">,
				MetaVarName<"<context.json>">,
				HelpText<"Adjust addresses using a JSON process context, e.g. from llvm-symbolizer markup">;

	def no_show_raw_insn : Flag<["--"], "no-show-raw-insn">,			def no_show_raw_insn : Flag<["--"], "no-show-raw-insn">,
	HelpText<"When disassembling instructions, "			HelpText<"When disassembling instructions, "
	"do not print the instruction bytes.">;			"do not print the instruction bytes.">;

	def no_leading_addr : Flag<["--"], "no-leading-addr">,			def no_leading_addr : Flag<["--"], "no-leading-addr">,
	HelpText<"When disassembling, do not print leading addresses for instructions or inline relocations">;			HelpText<"When disassembling, do not print leading addresses for instructions or inline relocations">;
	def : Flag<["--"], "no-addresses">, Alias<no_leading_addr>,			def : Flag<["--"], "no-addresses">, Alias<no_leading_addr>,
	HelpText<"Alias for --no-leading-addr">;			HelpText<"Alias for --no-leading-addr">;

llvm/tools/llvm-objdump/llvm-objdump.cpp

#include "llvm/ADT/IndexedMap.h"		#include "llvm/ADT/IndexedMap.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SetOperations.h"		#include "llvm/ADT/SetOperations.h"
#include "llvm/ADT/SmallSet.h"		#include "llvm/ADT/SmallSet.h"
#include "llvm/ADT/StringExtras.h"		#include "llvm/ADT/StringExtras.h"
#include "llvm/ADT/StringSet.h"		#include "llvm/ADT/StringSet.h"
#include "llvm/ADT/Twine.h"		#include "llvm/ADT/Twine.h"
#include "llvm/DebugInfo/DWARF/DWARFContext.h"		#include "llvm/DebugInfo/DWARF/DWARFContext.h"
		#include "llvm/DebugInfo/Symbolize/ProcessContext.h"
#include "llvm/DebugInfo/Symbolize/SymbolizableModule.h"		#include "llvm/DebugInfo/Symbolize/SymbolizableModule.h"
#include "llvm/DebugInfo/Symbolize/Symbolize.h"		#include "llvm/DebugInfo/Symbolize/Symbolize.h"
#include "llvm/Debuginfod/BuildIDFetcher.h"		#include "llvm/Debuginfod/BuildIDFetcher.h"
#include "llvm/Debuginfod/Debuginfod.h"		#include "llvm/Debuginfod/Debuginfod.h"
#include "llvm/Debuginfod/HTTPClient.h"		#include "llvm/Debuginfod/HTTPClient.h"
#include "llvm/Demangle/Demangle.h"		#include "llvm/Demangle/Demangle.h"
#include "llvm/MC/MCAsmInfo.h"		#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCContext.h"		#include "llvm/MC/MCContext.h"
▲ Show 20 Lines • Show All 159 Lines • ▼ Show 20 Lines
DIDumpType objdump::DwarfDumpType;		DIDumpType objdump::DwarfDumpType;
static bool DynamicRelocations;		static bool DynamicRelocations;
static bool FaultMapSection;		static bool FaultMapSection;
static bool FileHeaders;		static bool FileHeaders;
bool objdump::SectionContents;		bool objdump::SectionContents;
static std::vector<std::string> InputFilenames;		static std::vector<std::string> InputFilenames;
bool objdump::PrintLines;		bool objdump::PrintLines;
static bool MachOOpt;		static bool MachOOpt;
		std::optional<symbolize::ProcessContext> ProcessContext;
std::string objdump::MCPU;		std::string objdump::MCPU;
std::vector<std::string> objdump::MAttrs;		std::vector<std::string> objdump::MAttrs;
bool objdump::ShowRawInsn;		bool objdump::ShowRawInsn;
bool objdump::LeadingAddr;		bool objdump::LeadingAddr;
static bool Offloading;		static bool Offloading;
static bool RawClangAST;		static bool RawClangAST;
bool objdump::Relocations;		bool objdump::Relocations;
bool objdump::PrintImmHex;		bool objdump::PrintImmHex;
▲ Show 20 Lines • Show All 770 Lines • ▼ Show 20 Lines	for (SectionRef Sec : Obj.sections()) {
std::vector<RelocationRef> &V = Ret[*Relocated];		std::vector<RelocationRef> &V = Ret[*Relocated];
append_range(V, Sec.relocations());		append_range(V, Sec.relocations());
// Sort relocations by address.		// Sort relocations by address.
llvm::stable_sort(V, isRelocAddressLess);		llvm::stable_sort(V, isRelocAddressLess);
}		}
return Ret;		return Ret;
}		}

// Used for --adjust-vma to check if address should be adjusted by the		// Adjust a virtual address for the --adjust-vma and --process-context flags. For
// specified value for a given section.		// ELF we do not adjust non-allocatable sections like debug ones, because they
// For ELF we do not adjust non-allocatable sections like debug ones,		// are not loadable.
// because they are not loadable.
// TODO: implement for other file formats.		// TODO: implement for other file formats.
static bool shouldAdjustVA(const SectionRef &Section) {		static uint64_t adjustVMA(uint64_t Addr, const SectionRef &Section) {
const ObjectFile *Obj = Section.getObject();		const ObjectFile *Obj = Section.getObject();
if (Obj->isELF())		if (!Obj->isELF())
return ELFSectionRef(Section).getFlags() & ELF::SHF_ALLOC;		return Addr;
return false;		if (!(ELFSectionRef(Section).getFlags() & ELF::SHF_ALLOC))
		return Addr;
		if (ProcessContext) {
		const symbolize::ProcessContext::MMap *MM =
		ProcessContext->getContainingMMapForMRAddr(Addr);
		if (!MM)
		return Addr;
		return MM->fromModuleRelativeAddr(Addr);
		}
		return Addr + AdjustVMA;
}		}


typedef std::pair<uint64_t, char> MappingSymbolPair;		typedef std::pair<uint64_t, char> MappingSymbolPair;
static char getMappingSymbolKind(ArrayRef<MappingSymbolPair> MappingSymbols,		static char getMappingSymbolKind(ArrayRef<MappingSymbolPair> MappingSymbols,
uint64_t Address) {		uint64_t Address) {
auto It =		auto It =
partition_point(MappingSymbols, [Address](const MappingSymbolPair &Val) {		partition_point(MappingSymbols, [Address](const MappingSymbolPair &Val) {
return Val.first <= Address;		return Val.first <= Address;
});		});
▲ Show 20 Lines • Show All 491 Lines • ▼ Show 20 Lines	if (Symbols.empty() || Symbols[0].Addr != 0) {
createDummySymbolInfo(Obj, SectionAddr, SectionName,		createDummySymbolInfo(Obj, SectionAddr, SectionName,
Section.isText() ? ELF::STT_FUNC		Section.isText() ? ELF::STT_FUNC
: ELF::STT_OBJECT));		: ELF::STT_OBJECT));
}		}

SmallString<40> Comments;		SmallString<40> Comments;
raw_svector_ostream CommentStream(Comments);		raw_svector_ostream CommentStream(Comments);

uint64_t VMAAdjustment = 0;
if (shouldAdjustVA(Section))
VMAAdjustment = AdjustVMA;

// In executable and shared objects, r_offset holds a virtual address.		// In executable and shared objects, r_offset holds a virtual address.
// Subtract SectionAddr from the r_offset field of a relocation to get		// Subtract SectionAddr from the r_offset field of a relocation to get
// the section offset.		// the section offset.
uint64_t RelAdjustment = Obj.isRelocatableObject() ? 0 : SectionAddr;		uint64_t RelAdjustment = Obj.isRelocatableObject() ? 0 : SectionAddr;
uint64_t Size;		uint64_t Size;
uint64_t Index;		uint64_t Index;
bool PrintedSection = false;		bool PrintedSection = false;
std::vector<RelocationRef> Rels = RelocMap[Section];		std::vector<RelocationRef> Rels = RelocMap[Section];
▲ Show 20 Lines • Show All 130 Lines • ▼ Show 20 Lines	for (size_t SI = 0, SE = Symbols.size(); SI != SE;) {
if (!SymsToPrint[i])		if (!SymsToPrint[i])
continue;		continue;

const SymbolInfoTy &Symbol = SymbolsHere[i];		const SymbolInfoTy &Symbol = SymbolsHere[i];
const StringRef SymbolName = SymNamesHere[i];		const StringRef SymbolName = SymNamesHere[i];

if (LeadingAddr)		if (LeadingAddr)
outs() << format(Is64Bits ? "%016" PRIx64 " " : "%08" PRIx64 " ",		outs() << format(Is64Bits ? "%016" PRIx64 " " : "%08" PRIx64 " ",
SectionAddr + Start + VMAAdjustment);		adjustVMA(SectionAddr + Start, Section));
if (Obj.isXCOFF() && SymbolDescription) {		if (Obj.isXCOFF() && SymbolDescription) {
outs() << getXCOFFSymbolDescription(Symbol, SymbolName) << ":\n";		outs() << getXCOFFSymbolDescription(Symbol, SymbolName) << ":\n";
} else		} else
outs() << '<' << SymbolName << ">:\n";		outs() << '<' << SymbolName << ">:\n";
}		}

// Don't print raw contents of a virtual section. A virtual section		// Don't print raw contents of a virtual section. A virtual section
// doesn't have any contents in the file.		// doesn't have any contents in the file.
▲ Show 20 Lines • Show All 141 Lines • ▼ Show 20 Lines	for (size_t SI = 0, SE = Symbols.size(); SI != SE;) {

LVP.update({Index, Section.getIndex()},		LVP.update({Index, Section.getIndex()},
{Index + Size, Section.getIndex()}, Index + Size != End);		{Index + Size, Section.getIndex()}, Index + Size != End);

IP->setCommentStream(CommentStream);		IP->setCommentStream(CommentStream);

PIP.printInst(		PIP.printInst(
*IP, Disassembled ? &Inst : nullptr, Bytes.slice(Index, Size),		*IP, Disassembled ? &Inst : nullptr, Bytes.slice(Index, Size),
{SectionAddr + Index + VMAAdjustment, Section.getIndex()}, FOS,		{adjustVMA(SectionAddr + Index, Section), Section.getIndex()},
"", *STI, &SP, Obj.getFileName(), &Rels, LVP);		FOS, "", *STI, &SP, Obj.getFileName(), &Rels, LVP);

IP->setCommentStream(llvm::nulls());		IP->setCommentStream(llvm::nulls());

// If disassembly has failed, avoid analysing invalid/incomplete		// If disassembly has failed, avoid analysing invalid/incomplete
// instruction information. Otherwise, try to resolve the target		// instruction information. Otherwise, try to resolve the target
// address (jump target or memory operand address) and print it on the		// address (jump target or memory operand address) and print it on the
// right of the instruction.		// right of the instruction.
if (Disassembled && MIA) {		if (Disassembled && MIA) {
▲ Show 20 Lines • Show All 127 Lines • ▼ Show 20 Lines	for (size_t SI = 0, SE = Symbols.size(); SI != SE;) {
// Stop when RelCur's offset is past the disassembled		// Stop when RelCur's offset is past the disassembled
// instruction/data. Note that it's possible the disassembled data		// instruction/data. Note that it's possible the disassembled data
// is not the complete data: we might see the relocation printed in		// is not the complete data: we might see the relocation printed in
// the middle of the data, but this matches the binutils objdump		// the middle of the data, but this matches the binutils objdump
// output.		// output.
if (Offset >= Index + Size)		if (Offset >= Index + Size)
break;		break;

		uint64_t Addr = SectionAddr + Offset;

// When --adjust-vma is used, update the address printed.		// When --adjust-vma is used, update the address printed.
if (RelCur->getSymbol() != Obj.symbol_end()) {		if (RelCur->getSymbol() != Obj.symbol_end()) {
Expected<section_iterator> SymSI =		Expected<section_iterator> SymSI =
RelCur->getSymbol()->getSection();		RelCur->getSymbol()->getSection();
if (SymSI && *SymSI != Obj.section_end() &&		if (SymSI && *SymSI != Obj.section_end())
shouldAdjustVA(**SymSI))		Addr = adjustVMA(Addr, **SymSI);
Offset += AdjustVMA;
}		}

printRelocation(FOS, Obj.getFileName(), *RelCur,		printRelocation(FOS, Obj.getFileName(), *RelCur, Addr, Is64Bits);
SectionAddr + Offset, Is64Bits);
LVP.printAfterOtherLine(FOS, true);		LVP.printAfterOtherLine(FOS, true);
++RelCur;		++RelCur;
}		}
}		}

Index += Size;		Index += Size;
}		}
}		}
[288 lines hidden]	outs() << "Idx " << left_justify("Name", NameWidth) << " Size "
<< left_justify("LMA", AddressWidth) << " Type\n";		<< left_justify("LMA", AddressWidth) << " Type\n";
else		else
outs() << "Idx " << left_justify("Name", NameWidth) << " Size "		outs() << "Idx " << left_justify("Name", NameWidth) << " Size "
<< left_justify("VMA", AddressWidth) << " Type\n";		<< left_justify("VMA", AddressWidth) << " Type\n";

uint64_t Idx;		uint64_t Idx;
for (const SectionRef &Section : ToolSectionFilter(Obj, &Idx)) {		for (const SectionRef &Section : ToolSectionFilter(Obj, &Idx)) {
StringRef Name = unwrapOrError(Section.getName(), Obj.getFileName());		StringRef Name = unwrapOrError(Section.getName(), Obj.getFileName());
uint64_t VMA = Section.getAddress();		uint64_t VMA = adjustVMA(Section.getAddress(), Section);
if (shouldAdjustVA(Section))
VMA += AdjustVMA;

uint64_t Size = Section.getSize();		uint64_t Size = Section.getSize();

std::string Type = Section.isText() ? "TEXT" : "";		std::string Type = Section.isText() ? "TEXT" : "";
if (Section.isData())		if (Section.isData())
Type += Type.empty() ? "DATA" : ", DATA";		Type += Type.empty() ? "DATA" : ", DATA";
if (Section.isBSS())		if (Section.isBSS())
Type += Type.empty() ? "BSS" : ", BSS";		Type += Type.empty() ? "BSS" : ", BSS";
[721 lines hidden]	if (O.getGroup().isValid() && O.getGroup().getID() == OTOOL_grp_obsolete) {
reportCmdLineWarning(O.getPrefixedName() +		reportCmdLineWarning(O.getPrefixedName() +
" is obsolete and not implemented");		" is obsolete and not implemented");
}		}
}		}
}		}

static void parseObjdumpOptions(const llvm::opt::InputArgList &InputArgs) {		static void parseObjdumpOptions(const llvm::opt::InputArgList &InputArgs) {
parseIntArg(InputArgs, OBJDUMP_adjust_vma_EQ, AdjustVMA);		parseIntArg(InputArgs, OBJDUMP_adjust_vma_EQ, AdjustVMA);
		if (InputArgs.hasArg(OBJDUMP_adjust_vma_EQ) &&
		InputArgs.hasArg(OBJDUMP_process_context_EQ))
		reportCmdLineError("--adjust-vma and --process-context are incompatible");
AllHeaders = InputArgs.hasArg(OBJDUMP_all_headers);		AllHeaders = InputArgs.hasArg(OBJDUMP_all_headers);
ArchName = InputArgs.getLastArgValue(OBJDUMP_arch_name_EQ).str();		ArchName = InputArgs.getLastArgValue(OBJDUMP_arch_name_EQ).str();
ArchiveHeaders = InputArgs.hasArg(OBJDUMP_archive_headers);		ArchiveHeaders = InputArgs.hasArg(OBJDUMP_archive_headers);
Demangle = InputArgs.hasArg(OBJDUMP_demangle);		Demangle = InputArgs.hasArg(OBJDUMP_demangle);
Disassemble = InputArgs.hasArg(OBJDUMP_disassemble);		Disassemble = InputArgs.hasArg(OBJDUMP_disassemble);
DisassembleAll = InputArgs.hasArg(OBJDUMP_disassemble_all);		DisassembleAll = InputArgs.hasArg(OBJDUMP_disassemble_all);
SymbolDescription = InputArgs.hasArg(OBJDUMP_symbol_description);		SymbolDescription = InputArgs.hasArg(OBJDUMP_symbol_description);
DisassembleSymbols =		DisassembleSymbols =
[9 lines hidden]	static void parseObjdumpOptions(const llvm::opt::InputArgList &InputArgs) {
DynamicRelocations = InputArgs.hasArg(OBJDUMP_dynamic_reloc);		DynamicRelocations = InputArgs.hasArg(OBJDUMP_dynamic_reloc);
FaultMapSection = InputArgs.hasArg(OBJDUMP_fault_map_section);		FaultMapSection = InputArgs.hasArg(OBJDUMP_fault_map_section);
Offloading = InputArgs.hasArg(OBJDUMP_offloading);		Offloading = InputArgs.hasArg(OBJDUMP_offloading);
FileHeaders = InputArgs.hasArg(OBJDUMP_file_headers);		FileHeaders = InputArgs.hasArg(OBJDUMP_file_headers);
SectionContents = InputArgs.hasArg(OBJDUMP_full_contents);		SectionContents = InputArgs.hasArg(OBJDUMP_full_contents);
PrintLines = InputArgs.hasArg(OBJDUMP_line_numbers);		PrintLines = InputArgs.hasArg(OBJDUMP_line_numbers);
InputFilenames = InputArgs.getAllArgValues(OBJDUMP_INPUT);		InputFilenames = InputArgs.getAllArgValues(OBJDUMP_INPUT);
MachOOpt = InputArgs.hasArg(OBJDUMP_macho);		MachOOpt = InputArgs.hasArg(OBJDUMP_macho);
		if (const opt::Arg *A = InputArgs.getLastArg(OBJDUMP_process_context_EQ)) {
		[Inline comment] jhenderson: You should probably have a test case that shows that only the last --markup-context is used.
		StringRef Filename = A->getValue();
		ErrorOr<std::unique_ptr<MemoryBuffer>> Buf =
		MemoryBuffer::getFileOrSTDIN(Filename);
		if (!Buf)
		[Inline comment] jhenderson: There are three checks here, but only two test cases that I can map them to. Is one of them missing a test case?
		reportError(Filename, Buf.getError().message());
		Expected<json::Value> V = json::parse((*Buf)->getBuffer());
		if (!V)
		reportError(V.takeError(), Filename);
		json::Path::Root R;
		if (!fromJSON(*V, ProcessContext.emplace(), R))
		reportError(R.getError(), Filename);
		}
MCPU = InputArgs.getLastArgValue(OBJDUMP_mcpu_EQ).str();		MCPU = InputArgs.getLastArgValue(OBJDUMP_mcpu_EQ).str();
MAttrs = commaSeparatedValues(InputArgs, OBJDUMP_mattr_EQ);		MAttrs = commaSeparatedValues(InputArgs, OBJDUMP_mattr_EQ);
ShowRawInsn = !InputArgs.hasArg(OBJDUMP_no_show_raw_insn);		ShowRawInsn = !InputArgs.hasArg(OBJDUMP_no_show_raw_insn);
LeadingAddr = !InputArgs.hasArg(OBJDUMP_no_leading_addr);		LeadingAddr = !InputArgs.hasArg(OBJDUMP_no_leading_addr);
RawClangAST = InputArgs.hasArg(OBJDUMP_raw_clang_ast);		RawClangAST = InputArgs.hasArg(OBJDUMP_raw_clang_ast);
Relocations = InputArgs.hasArg(OBJDUMP_reloc);		Relocations = InputArgs.hasArg(OBJDUMP_reloc);
PrintImmHex =		PrintImmHex =
InputArgs.hasFlag(OBJDUMP_print_imm_hex, OBJDUMP_no_print_imm_hex, true);		InputArgs.hasFlag(OBJDUMP_print_imm_hex, OBJDUMP_no_print_imm_hex, true);
[200 lines hidden]
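
The two option-handling behaviours discussed in this diff (the last `--process-context` value wins, and `--adjust-vma` cannot be combined with `--process-context`) can be illustrated in isolation. The following is only a minimal standalone sketch, not the llvm-objdump implementation: `Options`, `matchOption`, and `parseOptions` are hypothetical names invented for illustration, and the sketch uses plain standard-library types rather than `llvm::opt::InputArgList`.

```cpp
#include <optional>
#include <stdexcept>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical container for the two options of interest. This is NOT the
// data structure used by llvm-objdump; it exists only for this sketch.
struct Options {
  std::optional<std::string> AdjustVMA;
  std::optional<std::string> ProcessContext;
};

// Returns the value part of an argument of the form "--<Name>=<value>",
// or std::nullopt if Arg does not have that form.
static std::optional<std::string> matchOption(std::string_view Arg,
                                              std::string_view Name) {
  std::string Prefix = "--" + std::string(Name) + "=";
  if (Arg.substr(0, Prefix.size()) == Prefix)
    return std::string(Arg.substr(Prefix.size()));
  return std::nullopt;
}

// Scans the arguments left to right. A later occurrence of an option
// overwrites an earlier one, so the last value wins. Supplying both
// options is rejected, mirroring the incompatibility check in the diff.
Options parseOptions(const std::vector<std::string> &Args) {
  Options Opts;
  for (const std::string &Arg : Args) {
    if (auto VMA = matchOption(Arg, "adjust-vma"))
      Opts.AdjustVMA = *VMA;
    else if (auto Ctx = matchOption(Arg, "process-context"))
      Opts.ProcessContext = *Ctx;
  }
  if (Opts.AdjustVMA && Opts.ProcessContext)
    throw std::invalid_argument(
        "--adjust-vma and --process-context are incompatible");
  return Opts;
}
```

In this sketch, calling `parseOptions` with two `--process-context=` arguments keeps only the second value, which is the behaviour the reviewer asks to be covered by a test. Throwing an exception is a simplification chosen here; the diff itself reports the conflict through `reportCmdLineError`.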

This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Add --process-context to adjust VMAs
Accepted, Public

Diff 516569

llvm/docs/CommandGuide/llvm-objdump.rst

llvm/test/tools/llvm-objdump/X86/process-context.test

llvm/tools/llvm-objdump/ObjdumpOpts.td

llvm/tools/llvm-objdump/llvm-objdump.cpp
