Details

Reviewers

• tstellarAMD
Bigcheese

Commits

rGde04805e9f05: [AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support.
rGbd90c60afb86: [AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support.
rL265645: [AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support.
rL265550: [AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support.

Summary

Minimal ISA dumper for amdgcn object files.

The .hsatext ELF kernel symbol references the amd_kernel_code_t runtime control structure (256 bytes) followed by instructions, so the structure is skipped.
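
A minimal sketch of the skip described above, in plain C++ rather than LLVM types. ByteRange and skipKernelCodeT are hypothetical names used only for illustration; the 256-byte size is the one stated in the summary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Size of the amd_kernel_code_t runtime control structure, as stated in the
// summary above.
constexpr std::size_t KernelCodeTSize = 256;

// Hypothetical stand-in for ArrayRef<uint8_t>: a pointer plus a length.
struct ByteRange {
  const std::uint8_t *Data;
  std::size_t Size;
};

// Returns the bytes that follow the control structure at the start of a
// kernel symbol. A symbol too short to hold the structure yields an empty
// range instead of reading out of bounds.
ByteRange skipKernelCodeT(ByteRange Symbol) {
  assert(Symbol.Data != nullptr || Symbol.Size == 0);
  if (Symbol.Size < KernelCodeTSize)
    return {Symbol.Data + Symbol.Size, 0};
  return {Symbol.Data + KernelCodeTSize, Symbol.Size - KernelCodeTSize};
}
```

With a 264-byte symbol this leaves 8 bytes of instructions, which is the only part handed to the instruction decoder.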

Diff Detail

Repository: rL LLVM

Event Timeline

vpykhtin updated this revision to Diff 47224.Feb 8 2016, 10:47 AM

vpykhtin retitled this revision from to [AMDGPU] llvm-objdump, MCDisassembler: disassembling .hsatext section of HSA Code Object (draft).

vpykhtin updated this object.

vpykhtin added a reviewer: • tstellarAMD.

vpykhtin set the repository for this revision to rL LLVM.

vpykhtin added subscribers: nhaustov, SamWot.

Herald added a subscriber: aemerson. · Feb 8 2016, 10:47 AM

Is there some reason why you can't parse the code object from within the AMDGPU implementation of getInstruction()?

In D16998#346610, @tstellarAMD wrote:

Is there some reason why you can't parse the code object from within the AMDGPU implementation of getInstruction()?

MCDisassembler::getInstruction is declared as follows:

/// Returns the disassembly of a single instruction.
///
/// \param Instr    - An MCInst to populate with the contents of the
///                   instruction.
/// \param Size     - A value to populate with the size of the instruction, or
///                   the number of bytes consumed while attempting to decode
///                   an invalid instruction.
/// ...
virtual DecodeStatus getInstruction(MCInst &Instr, uint64_t &Size,
                                    ArrayRef<uint8_t> Bytes, uint64_t Address,
                                    raw_ostream &VStream,
                                    raw_ostream &CStream) const = 0;

It is supposed to return a filled-in MCInst and the number of bytes consumed.
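
To make the contract concrete, here is a rough sketch of a driver loop that relies only on the status and Size results. decodeOne is a toy stand-in for getInstruction with an invented encoding (the first byte holds the instruction length, 4 or 8); it is an assumption for illustration, not the AMDGPU decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class DecodeStatus { Fail, Success };

// Toy decoder (hypothetical encoding): the first byte is the instruction
// length, which must be 4 or 8 and must fit in the remaining bytes.
// On failure Size is left at 0, meaning "nothing consumed".
DecodeStatus decodeOne(const std::uint8_t *Bytes, std::size_t Avail,
                       std::uint64_t &Size) {
  Size = 0;
  if (Avail == 0)
    return DecodeStatus::Fail;
  std::uint8_t Len = Bytes[0];
  if ((Len != 4 && Len != 8) || Len > Avail)
    return DecodeStatus::Fail;
  Size = Len;
  return DecodeStatus::Success;
}

// Walks the code bytes and returns the offsets of all instructions that
// decoded successfully. Undecodable bytes are skipped one dword at a time.
std::vector<std::uint64_t> decodeAll(const std::vector<std::uint8_t> &Code) {
  std::vector<std::uint64_t> Starts;
  std::uint64_t Size = 0;
  for (std::uint64_t Index = 0; Index < Code.size(); Index += Size) {
    DecodeStatus S = decodeOne(Code.data() + Index, Code.size() - Index, Size);
    if (S == DecodeStatus::Success)
      Starts.push_back(Index);
    else if (Size == 0)
      Size = 4; // the decoder consumed nothing, so skip one dword
    assert(Size > 0 && "the loop must always make progress");
  }
  return Starts;
}
```

The point of the sketch is that the caller sees nothing except an instruction and a byte count, which is why a non-instruction structure such as amd_kernel_code_t does not fit this interface.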

The current DisassembleObject just passes what it considers 'code' to the disassembler, so inside getInstruction we face the following problems:

We need to know whether we're actually inside an amd_kernel_code_t. Yes, we can analyze Address, but llvm-mc calls the disassembler on a raw stream of bytes and we need to handle that case too.
amd_kernel_code_t cannot be stored in an MCInst, but it is desirable to dump it too as part of the disassembly.
It would be a hack to try to reposition the reader by specifying a negative size in the case where the code is actually located before amd_kernel_code_t.
We need to process all code symbols in the .hsatext section to find where every kernel ends, as the kernel size is not specified by design.
DisassembleObject knows nothing about ELF::STT_AMDGPU_HSA_KERNEL.
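
The "find all kernel ends" step can be sketched as follows, mirroring the start-marker approach in Diff 47224 but using only the standard library. KernelSymbol, buildStartMarkers, and codeEnd are illustrative names, and the field types are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One kernel symbol: where its amd_kernel_code_t starts in the section and
// the kernel_code_entry_byte_offset stored in that structure (assumed signed).
struct KernelSymbol {
  std::uint64_t Offset;
  std::int64_t CodeEntryByteOffset;
};

// Collects the start of every entity in the section (each control structure
// and each code block) plus the section end, sorted in ascending order.
std::vector<std::uint64_t>
buildStartMarkers(const std::vector<KernelSymbol> &Kernels,
                  std::uint64_t SectionSize) {
  std::vector<std::uint64_t> Markers;
  for (const KernelSymbol &K : Kernels) {
    Markers.push_back(K.Offset);
    // Unsigned wrap-around makes a negative entry offset work as expected.
    Markers.push_back(K.Offset +
                      static_cast<std::uint64_t>(K.CodeEntryByteOffset));
  }
  Markers.push_back(SectionSize);
  std::sort(Markers.begin(), Markers.end());
  return Markers;
}

// The code that starts at CodeStart ends where the next entity begins.
// If no later marker exists, the code is treated as empty.
std::uint64_t codeEnd(const std::vector<std::uint64_t> &Markers,
                      std::uint64_t CodeStart) {
  assert(std::is_sorted(Markers.begin(), Markers.end()));
  auto It = std::upper_bound(Markers.begin(), Markers.end(), CodeStart);
  return It == Markers.end() ? CodeStart : *It;
}
```

Unlike the draft in the diff, this version checks the iterator returned by std::upper_bound before dereferencing it.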

Hi,

In the loop that iterates over symbols and emits them, there is already some target specific code for AARCH64.
I would recommend adding a callback to MCDisassembler called something like disassembleSymbolStart() and call that function for the AMDGPU target, and then move the disassembly code for the amd_kernel_code_t into the AMDGPU backend.
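
A rough sketch of what such a callback could look like, written against toy classes rather than the real MCDisassembler. Only the name disassembleSymbolStart comes from the comment above; the signature, the return convention, and the class names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Toy stand-in for a target-independent disassembler interface.
class Disassembler {
public:
  virtual ~Disassembler() = default;

  // Hypothetical hook: called once at the start of each symbol. Returns the
  // number of leading bytes the target has handled itself (for example,
  // metadata it printed). The default handles nothing.
  virtual std::uint64_t disassembleSymbolStart(const std::string &,
                                               std::uint64_t,
                                               std::ostream &) const {
    return 0;
  }
};

// Toy target that treats the first 256 bytes of a symbol as a control
// structure and reports it instead of decoding it as instructions.
class ToyAMDGPUDisassembler : public Disassembler {
public:
  std::uint64_t disassembleSymbolStart(const std::string &Name,
                                       std::uint64_t SymbolSize,
                                       std::ostream &OS) const override {
    constexpr std::uint64_t HeaderSize = 256;
    if (SymbolSize < HeaderSize)
      return 0;
    OS << "; " << Name << ": amd_kernel_code_t (256 bytes)\n";
    return HeaderSize;
  }
};

// Generic driver code: it names no target, it only asks the hook how many
// bytes to skip before decoding instructions.
std::uint64_t firstInstructionOffset(const Disassembler &D,
                                     const std::string &Name,
                                     std::uint64_t SymbolSize,
                                     std::ostream &OS) {
  std::uint64_t Skipped = D.disassembleSymbolStart(Name, SymbolSize, OS);
  assert(Skipped <= SymbolSize);
  return Skipped;
}
```

The design choice here is that the common tool keeps a single loop over symbols, while anything target-specific sits behind a virtual call with a harmless default.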

In D16998#347911, @tstellarAMD wrote:

Hi,

In the loop that iterates over symbols and emits them, there is already some target specific code for AARCH64.
I would recommend adding a callback to MCDisassembler called something like disassembleSymbolStart() and call that function for the AMDGPU target, and then move the disassembly code for the amd_kernel_code_t into the AMDGPU backend.

For now I have just moved all my code to the AMDGPUDisassembler library (which llvm-objdump already links with) and call DisassembleHSACodeObject at the beginning of llvm-objdump's DisassembleObject if Obj->getArch() == Triple::amdgcn. This still leaves llvm-objdump slightly AMDGPU-dependent, but I now think it's not worth touching the MCDisassembler interface until we know all the requirements. We may well want to disassemble an HSA code object entirely in AMDGPU code, without depending on llvm-objdump.

AMDGPU-dependent code moved to the AMDGPUDisassembler library.
Disassembly output slightly enhanced so that it can be assembled again. A lot of TODOs remain, though.
Undecodable code bytes are skipped gracefully.

Herald added a subscriber: arsenm. · Feb 11 2016, 9:49 AM

vpykhtin added parent revisions: D17150: [AMDGPU] table-driven parser/printer for amd_kernel_code_t structure fields, D17144: [AMDGPU] add AMDGPU target support to ELFObjectFile.h header.Feb 11 2016, 9:50 AM

vpykhtin added a parent revision: D16723: [AMDGPU] Disassembler: Added basic disassembler for AMDGPU target..Feb 11 2016, 10:00 AM

arsenm added inline comments.Feb 11 2016, 11:47 AM

lib/Target/AMDGPU/Disassembler/HSAObjCodeDisassembler.cpp
113 ↗	(On Diff #47672)	Would this be useful in the MCContext header?
150–152 ↗	(On Diff #47672)	This is dead code
158 ↗	(On Diff #47672)	The triple should be amdgcn-unknown-amdhsa. Why do you need to hardcode this here?
166–167 ↗	(On Diff #47672)	I don't think this should be able to happen. These checks should maybe all be asserts
208 ↗	(On Diff #47672)	Put declarations on separate lines
225 ↗	(On Diff #47672)	Single quotes
228 ↗	(On Diff #47672)	Single quotes
tools/llvm-objdump/llvm-objdump.cpp
834	Function should start with lowercase letter

vpykhtin added inline comments.Feb 11 2016, 12:23 PM

lib/Target/AMDGPU/Disassembler/HSAObjCodeDisassembler.cpp
113 ↗	(On Diff #47672)	Yep, I would move it there.
150–152 ↗	(On Diff #47672)	Well, when NDEBUG is defined there is no DebugFlag, but it is tested below: if (!DisAsm->getInstruction(Inst, EatenBytesNum, Code.slice(Index), Index, DebugFlag ? dbgs() : nulls(), nulls())) { ... I just don't like interleaving code with macros much.
158 ↗	(On Diff #47672)	Probably no need to hardcode it, but I haven't yet found how to construct it properly.
166–167 ↗	(On Diff #47672)	ok.

vpykhtin added a project: Restricted Project.Feb 13 2016, 3:41 AM

• tstellarAMD added inline comments.Feb 24 2016, 6:48 PM

tools/llvm-objdump/llvm-objdump.cpp
840	You aren't going to be able to call target specific functions from a common tool like this. The only way this is going to work is with some kind of generic callback.

nhaustov added subscribers: ruiu, • rafael.Mar 2 2016, 1:54 AM

nhaustov added inline comments.Mar 9 2016, 5:36 AM

tools/llvm-objdump/llvm-objdump.cpp
840	llvm-objdump actually has custom code for AArch64 that tries to determine whether the bytes being disassembled are code or data: // AArch64 ELF binaries can interleave data and text in the // same section. We rely on the markers introduced to // understand what we need to dump. ... To me, the proposed code in a separate file looks cleaner and does not clutter llvm-objdump.cpp. Anyway, another question is whether it is really only AMDGPU that needs custom handling of parts of the .text segment. How do other platforms handle metadata needed for code startup? Any pointers? One reason to keep amd_kernel_code_t (metadata needed for kernel startup) close to the actual ISA is efficiency and cache concerns. We can probably still change it in a new format.

HSACodeObjectDisassembler class introduced
ELFNote support added
Target ISA determined based on amdgpu_hsa_note_isa_t ELF note (spec)
printing for assembler directives

Herald added a subscriber: rampitec. · Mar 9 2016, 11:14 AM

Testcase?

Added binary test

How was the test created? What is a ".co"?

In D16998#371942, @rafael wrote:

How was the test created? What is a ".co"?

".co" means CodeObject. It was created using a utility outside the LLVM tree. It is in the HSA Code Object v1.0 format, which differs slightly from what the AMDGPU assembler currently emits. The main difference is that the code symbol has zero size, and amd_kernel_code_t (the runtime control structure) and the ISA can be non-adjacent. The disassembler is supposed to accept both versions.

lib/Target/AMDGPU/Disassembler/HSAObjCodeDisassembler.cpp
113 ↗	(On Diff #47672)	Let's do it in separate change later.

Adding quote from summary so it can be seen in notification emails:

Problem: this change contains a direct call from llvm-objdump to disassembleHSACodeObject, which is actually located in the AMDGPU library. This is incorrect, as it requires llvm-objdump to always link with the AMDGPU target. There should be a target callback somewhere that allows custom object file disassembly. One thought was to add such a callback (something like disassembleObjectFile(ObjFile, OutStream)) to the MCDisassembler interface, but that is a bit too late: MCDisassembler is created once the target ISA specification is already known (it depends on MCSubtargetInfo), whereas here the target ISA spec has yet to be read from the object file.

In other words, this is an Elf_Rel file? Why do you need a new extension?

Right, Elf_Rel. Disassembling this format requires features different from those llvm-objdump currently offers, and these are really target-specific. In particular, we would like to print all custom directives and structures so that the output can be fed directly to the assembler, for example the amd_kernel_code_t runtime control structure definitions. Dumping this structure without an llvm-objdump -> AMDGPU link dependency would require copy-pasting target code. I'm not sure it's possible to create a sufficient number of target callbacks to accomplish this.

Another issue is with the HSA Code Object 1.0 format itself: the zero-sized kernel code symbol requires preprocessing the whole .hsatext section to find the start markers of every item in it in order to calculate the kernel code size. This particular issue is fixed in what the AMDGPU assembler currently emits, but we would like the disassembler to accept old files too.

llvm-commits mailing list
llvm-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits

Made minimal possible change.

Limitations:

accepts only object files produced by the current AMDGPU assembler (no support for older formats)
only ISA dumping (no directives (notes), amd_kernel_code_t, etc.)
mcpu must be specified explicitly when running llvm-objdump (no autodetection)

Added test using llvm-mc | llvm-objdump check.

Bigcheese requested changes to this revision.Mar 17 2016, 1:36 PM

Bigcheese added a reviewer: Bigcheese.

Bigcheese added a subscriber: Bigcheese.

Bigcheese added inline comments.

tools/llvm-objdump/llvm-objdump.cpp
397	This should be using an endian-specific type. Either llvm::support::ulittle32_t or ubig32_t.
1182	Same with the endianness.

This revision now requires changes to proceed.Mar 17 2016, 1:36 PM

Thanks for the review!

Updated the diff with issues fixed.

ping.

lgtm

This revision is now accepted and ready to land.Apr 5 2016, 4:19 PM

Closed by commit rL265550: [AMDGPU] llvm-objdump: Minimal HSA Code Object disassembler support. (authored by vpykhtin). · Apr 6 2016, 9:00 AM

This revision was automatically updated to reflect the committed changes.

I have reverted it for now: instruction bytes are printed in big-endian order on ppc despite the use of support::ulittle32_t. This needs to be fixed.
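
For reference, a host-independent way to read a little-endian dword is to assemble it from individual bytes, as in the sketch below. This is a plain C++ illustration of the endianness concern, not the fix that was eventually committed (which is not shown in this archive); readLE32 is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Builds a 32-bit value from four bytes stored least significant byte first.
// Because it never reinterprets memory as a uint32_t, the result is the same
// on little-endian and big-endian hosts.
std::uint32_t readLE32(const std::uint8_t *Bytes) {
  assert(Bytes != nullptr);
  return static_cast<std::uint32_t>(Bytes[0]) |
         (static_cast<std::uint32_t>(Bytes[1]) << 8) |
         (static_cast<std::uint32_t>(Bytes[2]) << 16) |
         (static_cast<std::uint32_t>(Bytes[3]) << 24);
}
```

Printing the resulting value with a fixed-width hex format then gives the same dword text on every host, whereas dumping the raw bytes in memory order depends on how the printer walks them.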

Diff 47224

tools/llvm-objdump/llvm-objdump.cpp

[54 lines not shown]
#include "llvm/Support/TargetRegistry.h"		#include "llvm/Support/TargetRegistry.h"
#include "llvm/Support/TargetSelect.h"		#include "llvm/Support/TargetSelect.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <algorithm>		#include <algorithm>
#include <cctype>		#include <cctype>
#include <cstring>		#include <cstring>
#include <system_error>		#include <system_error>

		#include "../../lib/Target/AMDGPU/AMDKernelCodeT.h"

using namespace llvm;		using namespace llvm;
using namespace object;		using namespace object;

static cl::list<std::string>		static cl::list<std::string>
InputFilenames(cl::Positional, cl::desc("<input object files>"),cl::ZeroOrMore);		InputFilenames(cl::Positional, cl::desc("<input object files>"),cl::ZeroOrMore);

cl::opt<bool>		cl::opt<bool>
llvm::Disassemble("disassemble",		llvm::Disassemble("disassemble",
[316 lines not shown]
}		}

template <class ELFT>		template <class ELFT>
static std::error_code getRelocationValueString(const ELFObjectFile<ELFT> *Obj,		static std::error_code getRelocationValueString(const ELFObjectFile<ELFT> *Obj,
const RelocationRef &RelRef,		const RelocationRef &RelRef,
SmallVectorImpl<char> &Result) {		SmallVectorImpl<char> &Result) {
DataRefImpl Rel = RelRef.getRawDataRefImpl();		DataRefImpl Rel = RelRef.getRawDataRefImpl();

typedef typename ELFObjectFile<ELFT>::Elf_Sym Elf_Sym;		typedef typename ELFObjectFile<ELFT>::Elf_Sym Elf_Sym;
		Bigcheese: This should be using an endian-specific type. Either llvm::support::ulittle32_t or ubig32_t.
typedef typename ELFObjectFile<ELFT>::Elf_Shdr Elf_Shdr;		typedef typename ELFObjectFile<ELFT>::Elf_Shdr Elf_Shdr;
typedef typename ELFObjectFile<ELFT>::Elf_Rela Elf_Rela;		typedef typename ELFObjectFile<ELFT>::Elf_Rela Elf_Rela;

const ELFFile<ELFT> &EF = *Obj->getELFFile();		const ELFFile<ELFT> &EF = *Obj->getELFFile();

ErrorOr<const Elf_Shdr *> SecOrErr = EF.getSection(Rel.d.a);		ErrorOr<const Elf_Shdr *> SecOrErr = EF.getSection(Rel.d.a);
if (std::error_code EC = SecOrErr.getError())		if (std::error_code EC = SecOrErr.getError())
return EC;		return EC;
[420 lines not shown]	if (Type == MachO::X86_64_RELOC_UNSIGNED && Rel.d.a > 0) {
uint64_t PrevType = MachO->getRelocationType(RelPrev);		uint64_t PrevType = MachO->getRelocationType(RelPrev);
if (PrevType == MachO::X86_64_RELOC_SUBTRACTOR)		if (PrevType == MachO::X86_64_RELOC_SUBTRACTOR)
return true;		return true;
}		}
}		}

return false;		return false;
}		}

		arsenm: Function should start with lowercase letter
		class HSACodeObject {
		public:
		typedef ELF64LEObjectFile MyELF;
		const MyELF& Obj;

		HSACodeObject(const ObjectFile *Obj_)
		tstellarAMD: You aren't going to be able to call target specific functions from a common tool like this. The only way this is going to work is with some kind of generic callback.
		nhaustov: llvm-objdump actually has custom code for AArch64 that tries to determine whether the bytes being disassembled are code or data: // AArch64 ELF binaries can interleave data and text in the // same section. We rely on the markers introduced to // understand what we need to dump. ... To me, the proposed code in a separate file looks cleaner and does not clutter llvm-objdump.cpp. Anyway, another question is whether it is really only AMDGPU that needs custom handling of parts of the .text segment. How do other platforms handle metadata needed for code startup? Any pointers? One reason to keep amd_kernel_code_t (metadata needed for kernel startup) close to the actual ISA is efficiency and cache concerns. We can probably still change it in a new format.
		: Obj(*static_cast<const ELF64LEObjectFile*>(Obj_)) {}

		std::pair<const amd_kernel_code_t*, ArrayRef<uint8_t>>
		getKernel(SymbolRef Symbol) const {
		auto ElfSym = Obj.getSymbol(Symbol.getRawDataRefImpl());
		if (ElfSym->getType() != ELF::STT_AMDGPU_HSA_KERNEL) {
		return decltype(this->getKernel(Symbol))();
		}

		auto ELF = Obj.getELFFile();
		auto SecBytes = *ELF->getSectionContentsAsArray<uint8_t>(
		*ELF->getSection(ElfSym->st_shndx));

		uint64_t const Ofs = ElfSym->getValue();
		auto const C =
		reinterpret_cast<const amd_kernel_code_t*>(SecBytes.slice(Ofs).begin());

		auto& M = getStartMarkers();
		uint64_t const CodeStart = Ofs + C->kernel_code_entry_byte_offset;
		uint64_t const CodeEnd = *std::upper_bound(M.begin(), M.end(), CodeStart);

		return std::make_pair(C, SecBytes.slice(CodeStart, CodeEnd - CodeStart));
		}

		private:
		mutable SmallVector<uint64_t, 8> StartMarkers;

		const decltype(StartMarkers)& getStartMarkers() const;
		};

		// StartMarkers is a sorted array of every entity's start offset in the code section.
		// We determine an entity's end using upper bound on this array because there is
		// no kernel code size specified by design.
		const decltype(HSACodeObject::StartMarkers)&
		HSACodeObject::getStartMarkers() const {
		if (!StartMarkers.empty()) return StartMarkers;

		auto ELF = Obj.getELFFile();
		int HsaTextSecIdx = -1;
		ArrayRef<uint8_t> SecBytes;

		for (auto& Symbol : Obj.symbols()) {
		auto ElfSym = Obj.getSymbol(Symbol.getRawDataRefImpl());
		if (HsaTextSecIdx != -1 && ElfSym->st_shndx != HsaTextSecIdx) continue;
		switch (ElfSym->getType()) {
		case ELF::STT_AMDGPU_HSA_KERNEL: {
		if (HsaTextSecIdx == -1) {
		HsaTextSecIdx = ElfSym->st_shndx;
		SecBytes = *ELF->getSectionContentsAsArray<uint8_t>(
		*ELF->getSection(HsaTextSecIdx));
		}
		uint64_t Ofs = ElfSym->getValue();
		StartMarkers.push_back(Ofs);
		auto C = reinterpret_cast<const amd_kernel_code_t*>(
		SecBytes.slice(Ofs).begin());
		StartMarkers.push_back(Ofs + C->kernel_code_entry_byte_offset);
		break;
		}
		default: break;
		};
		}

		StartMarkers.push_back(SecBytes.size());
		array_pod_sort(StartMarkers.begin(), StartMarkers.end());

		return StartMarkers;
		}

		namespace llvm {
		void dump(const amd_kernel_code_t*, raw_ostream& os, const char*);
		}

		static void DisassembleHSACodeObject(const ObjectFile *Obj,
		MCInstPrinter& IP,
		MCDisassembler& DisAsm,
		const MCSubtargetInfo& STI) {

		#ifdef NDEBUG
		bool const DebugFlag = false;
		#endif

		HSACodeObject CObj(Obj);
		SmallString<40> Comments;
		raw_svector_ostream CommentStream(Comments);
		raw_ostream &DebugOut = DebugFlag ? dbgs() : nulls();
		PrettyPrinter &PIP = selectPrettyPrinter(Triple(TripleName));

		// Disassemble symbol by symbol.
		for (auto Symbol : Obj->symbols()) {

		auto Kernel = CObj.getKernel(Symbol);
		if (!Kernel.first) continue;

		outs() << *Symbol.getName() << '\n';
		dump(Kernel.first, outs(), " ");

		const auto& Code = Kernel.second;
		if (Code.empty()) continue;

		uint64_t Size = 0;
		for (uint64_t Index = 0; Index < Code.size(); Index += Size) {
		MCInst Inst;
		if (!DisAsm.getInstruction(Inst, Size,
		Code.slice(Index), Index,
		DebugOut, CommentStream)) {
		errs() << ToolName << ": warning: invalid instruction encoding\n";
		if (Size == 0) Size = 4; // skip illegible dwords
		continue;
		}

		PIP.printInst(IP, &Inst, Code.slice(Index, Size),
		Index, outs(), "", STI);
		outs() << CommentStream.str();
		Comments.clear();
		outs() << "\n";
		}
		outs() << "\n";
		}
		}


static void DisassembleObject(const ObjectFile *Obj, bool InlineRelocs) {		static void DisassembleObject(const ObjectFile *Obj, bool InlineRelocs) {
const Target *TheTarget = getTarget(Obj);		const Target *TheTarget = getTarget(Obj);

// Package up features to be passed to target/subtarget		// Package up features to be passed to target/subtarget
std::string FeaturesStr;		std::string FeaturesStr;
if (MAttrs.size()) {		if (MAttrs.size()) {
SubtargetFeatures Features;		SubtargetFeatures Features;
for (unsigned i = 0; i != MAttrs.size(); ++i)		for (unsigned i = 0; i != MAttrs.size(); ++i)
[33 lines not shown]	static void DisassembleObject(const ObjectFile *Obj, bool InlineRelocs) {
std::unique_ptr<MCInstPrinter> IP(TheTarget->createMCInstPrinter(		std::unique_ptr<MCInstPrinter> IP(TheTarget->createMCInstPrinter(
Triple(TripleName), AsmPrinterVariant, AsmInfo, MII, *MRI));		Triple(TripleName), AsmPrinterVariant, AsmInfo, MII, *MRI));
if (!IP)		if (!IP)
report_fatal_error("error: no instruction printer for target " +		report_fatal_error("error: no instruction printer for target " +
TripleName);		TripleName);
IP->setPrintImmHex(PrintImmHex);		IP->setPrintImmHex(PrintImmHex);
PrettyPrinter &PIP = selectPrettyPrinter(Triple(TripleName));		PrettyPrinter &PIP = selectPrettyPrinter(Triple(TripleName));

		if (Obj->getArch() == Triple::amdgcn) {
		DisassembleHSACodeObject(Obj, *IP, *DisAsm, *STI);
		return;
		}

StringRef Fmt = Obj->getBytesInAddress() > 4 ? "\t\t%016" PRIx64 ": " :		StringRef Fmt = Obj->getBytesInAddress() > 4 ? "\t\t%016" PRIx64 ": " :
"\t\t\t%08" PRIx64 ": ";		"\t\t\t%08" PRIx64 ": ";

// Create a mapping, RelocSecs = SectionRelocMap[S], where sections		// Create a mapping, RelocSecs = SectionRelocMap[S], where sections
// in RelocSecs contain the relocations for section S.		// in RelocSecs contain the relocations for section S.
std::error_code EC;		std::error_code EC;
std::map<SectionRef, SmallVector<SectionRef, 1>> SectionRelocMap;		std::map<SectionRef, SmallVector<SectionRef, 1>> SectionRelocMap;
for (const SectionRef &Section : ToolSectionFilter(*Obj)) {		for (const SectionRef &Section : ToolSectionFilter(*Obj)) {
[150 lines not shown]	for (unsigned si = 0, se = Symbols.size(); si != se; ++si) {
outs() << '\n' << Symbols[si].second << ":\n";		outs() << '\n' << Symbols[si].second << ":\n";

#ifndef NDEBUG		#ifndef NDEBUG
raw_ostream &DebugOut = DebugFlag ? dbgs() : nulls();		raw_ostream &DebugOut = DebugFlag ? dbgs() : nulls();
#else		#else
raw_ostream &DebugOut = nulls();		raw_ostream &DebugOut = nulls();
#endif		#endif

for (Index = Start; Index < End; Index += Size) {		for (Index = Start; Index < End; Index += Size) {
		Bigcheese: Same with the endianness.
MCInst Inst;		MCInst Inst;

// AArch64 ELF binaries can interleave data and text in the		// AArch64 ELF binaries can interleave data and text in the
// same section. We rely on the markers introduced to		// same section. We rely on the markers introduced to
// understand what we need to dump.		// understand what we need to dump.
if (Obj->isELF() && Obj->getArch() == Triple::aarch64) {		if (Obj->isELF() && Obj->getArch() == Triple::aarch64) {
uint64_t Stride = 0;		uint64_t Stride = 0;

[622 lines not shown]

This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] llvm-objdump: disassembling amdgcn object file
ClosedPublic