
Details

Reviewers

arsenm
rampitec
alexander-shaposhnikov

Commits

rGefec1396accb: [AMDGPU] Implement AMDGPUMCInstrAnalysis
rL355373: [AMDGPU] Implement AMDGPUMCInstrAnalysis

Summary

Implement MCInstrAnalysis for AMDGPU. Implement evaluateBranch to get <symbol+offset> notation in llvm-objdump when the symbolizer fails. I believe the remaining default implementations are OK, if pessimistic, but we can implement them as needed.
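As a rough illustration of what evaluateBranch computes, the following standalone sketch reproduces the arithmetic without any LLVM types. It assumes the encoding discussed in this review: a 16-bit signed immediate counted in 4-byte words, relative to the address of the next instruction. The name `branchTarget` is hypothetical and is not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the patch: computes the byte address that
// an s_branch-style instruction jumps to, assuming a simm16 word offset that
// is relative to the end of the branch instruction.
uint64_t branchTarget(uint16_t rawImm, uint64_t addr, uint64_t size) {
  assert(size > 0 && "instruction size must be known");
  // Reinterpret the 16 encoded bits as a signed value.
  int64_t words = rawImm;
  if (words >= 0x8000)
    words -= 0x10000;
  // Scale from 4-byte words to bytes.
  const int64_t byteOffset = words * 4;
  // Unsigned addition wraps modulo 2^64, so a target below address zero
  // shows up as a very large unsigned value.
  return addr + size + static_cast<uint64_t>(byteOffset);
}
```

With the values from test/MC/AMDGPU/branch-comment.s, a branch at 0x24 with immediate 32767 lands at 0x20024, while a branch at 0x20 with immediate 32768 wraps to 0xfffffffffffe0024.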

Diff Detail

Event Timeline

scott.linder created this revision. Feb 19 2019, 11:36 AM

Herald added a reviewer: alexander-shaposhnikov. Feb 19 2019, 11:36 AM

Herald added a project: Restricted Project.

Herald added subscribers: llvm-commits, t-tye, tpr and 6 others.

arsenm added inline comments. Feb 19 2019, 11:50 AM

lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.cpp
122	Needs a comment for why this is 18
test/MC/AMDGPU/branch-comment.s
10	This should be printed as signed?
16	Can you include some cases where it can't figure out an associated symbol?
16	Also could use some stressing the maximum branch distances. There's some macro trick one of the other tests uses to produce a lot of nops for this
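On the question of why the width is 18: the immediate is a simm16 counted in 4-byte words, so the byte offset needs 16 + 2 bits. The sketch below uses plain integers rather than APInt, and all helper names are hypothetical. It shows that sign-extending the scaled value from 18 bits agrees with sign-extending the raw immediate from 16 bits and then scaling, whereas sign-extending the scaled value from only 16 bits gives a wrong answer.

```cpp
#include <cstdint>

// Hypothetical helper: sign-extend the low `bits` bits of `value` to 64 bits.
constexpr int64_t signExtend(uint64_t value, unsigned bits) {
  const uint64_t mask = (uint64_t{1} << bits) - 1;
  const uint64_t sign = uint64_t{1} << (bits - 1);
  const uint64_t low = value & mask;
  // (low ^ sign) - sign is a common branch-free sign extension.
  return static_cast<int64_t>((low ^ sign) - sign);
}

// Byte offset in the style of the patch: scale first, then sign-extend 18 bits.
constexpr int64_t byteOffset18(uint64_t imm) { return signExtend(imm * 4, 18); }

// Equivalent formulation: sign-extend the simm16 first, then scale.
constexpr int64_t byteOffset16(uint64_t imm) { return signExtend(imm, 16) * 4; }

static_assert(byteOffset18(0xFFFF) == -4, "-1 word is -4 bytes");
static_assert(byteOffset18(0x8000) == -131072, "smallest simm16");
static_assert(byteOffset18(0x7FFF) == 131068, "largest simm16");
static_assert(byteOffset16(0x8000) == byteOffset18(0x8000), "both orders agree");
// Keeping only 16 bits of the scaled value loses the top two bits:
static_assert(signExtend(0x7FFF * 4, 16) == -4, "16 bits is too narrow");
```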

scott.linder marked an inline comment as done. Feb 20 2019, 8:01 AM

scott.linder added inline comments.

test/MC/AMDGPU/branch-comment.s
10	I went back to make this change, but I'm not really certain why we choose to represent the branch immediate the way we currently do. It seems like we reinterpret the bytes of a signed int16 as a signed int64 without a sign-extension when creating the MCOperand immediate. It seems confusing to me, and leads to some awkward casts in places. Would sign-extending and adding checks for `isInt<16>(Imm)` in places be a better approach, or is there something I'm missing? Alternatively should we just make the asm parser/printer aware of this so we see "-1"? Should we also support the old way? It seems like there must be assembly kernels out there with the old "notation" for negative immediates.
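A minimal sketch of the "sign-extend and range-check" alternative raised here, written without LLVM headers so that it stands alone. LLVM's MathExtras.h provides `isInt<N>` and `isUInt<N>` for the range checks; the helpers below are hypothetical stand-ins, and `normalizeBranchImm` is an invented name. The sketch accepts both a true negative spelling and the old unsigned 16-bit spelling, and, unlike the behaviour noted in the test's FIXME, it rejects out-of-range values instead of truncating them.

```cpp
#include <cstdint>
#include <optional>

// Stand-in for llvm::isInt<16>: true if v fits in a signed 16-bit field.
constexpr bool fitsSimm16(int64_t v) { return v >= -32768 && v <= 32767; }

// Stand-in for llvm::isUInt<16>: true if v fits in an unsigned 16-bit field.
constexpr bool fitsUimm16(int64_t v) { return v >= 0 && v <= 65535; }

// Hypothetical: normalise a parsed immediate to a sign-extended simm16.
// Accepts -32768..32767 directly, and also the legacy unsigned spelling
// 32768..65535 (for example 65535 meaning -1). Returns std::nullopt when
// the value cannot be encoded in 16 bits at all.
std::optional<int64_t> normalizeBranchImm(int64_t parsed) {
  if (fitsSimm16(parsed))
    return parsed;
  if (fitsUimm16(parsed))
    return parsed - 65536;
  return std::nullopt;
}
```

Under this scheme `s_branch 65535` and `s_branch -1` would both produce an operand of -1, which is one possible way to keep existing assembly kernels working.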

arsenm added inline comments. Feb 20 2019, 8:15 AM

test/MC/AMDGPU/branch-comment.s
10	These should probably be consistently extended

Address feedback. Sign-extend all s_branch immediates and change the assembler syntax to represent these as true negative numbers.

One remaining question I have is about "overflow" in the calculation of the target; for example, the test for the smallest simm16 on a branch at a low PC results in the offset <keep_symbol+0xfffffffffffe0018>, but I don't know if the hardware is defined to behave this way. I will try to experiment, but I'm unsure how to know definitively.

arsenm added inline comments. Feb 20 2019, 2:03 PM

test/MC/AMDGPU/branch-comment-fail.s
4 ↗	(On Diff #187669)	This error message is bad. Why doesn't it say something about out of range?

scott.linder added inline comments. Feb 20 2019, 2:24 PM

test/MC/AMDGPU/branch-comment.s
16	I added tests for the boundaries when assembling an immediate, and tests for failure due to the immediate being out of range, but the test you mention (`test/MC/AMDGPU/branch-comment-fail.s`) already handles the symbolic case, so I don't think there is anything to add.

Improved error message.

Does anyone have an opinion on returning negative branch targets (e.g. <keep_symbol+0xfffffffffffe0018>)? I don't know how this would ever come up in hardware anyway, or what the hardware would do, but it doesn't seem very helpful in the disassembly.

In D58400#1405973, @scott.linder wrote:

Does anyone have an opinion on returning negative branch targets (e.g. <keep_symbol+0xfffffffffffe0018>)? I don't know how this would ever come up in hardware anyway, or what the hardware would do, but it doesn't seem very helpful in the disassembly.

The hardware doesn't know about the symbol? Do you mean for overflow?

In D58400#1407115, @arsenm wrote:

In D58400#1405973, @scott.linder wrote:

Does anyone have an opinion on returning negative branch targets (e.g. <keep_symbol+0xfffffffffffe0018>)? I don't know how this would ever come up in hardware anyway, or what the hardware would do, but it doesn't seem very helpful in the disassembly.

The hardware doesn't know about the symbol? Do you mean for overflow?

Right, I just mean that the notation of an offset from a symbol kind of breaks down with the overflow, and that the "overflow" doesn't represent an overflow in the hardware anyway. At best <keep_symbol+0xfffffffffffe0018> is just an odd way to represent a negative offset. I think it makes more sense to just return nothing from evaluateBranch if the result would be negative?
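The suggestion above could look roughly like the following standalone sketch. It only illustrates the idea and is not what the committed patch does: the evaluateBranch in the diff returns true for any PC-relative immediate. `evaluateBranchChecked` is a hypothetical name, and the word offset is assumed to already be a sign-extended simm16.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical variant of the target computation: returns std::nullopt when
// addr + size + offset would fall below address zero, rather than reporting
// a wrapped 64-bit value.
std::optional<uint64_t> evaluateBranchChecked(int64_t wordOffset, uint64_t addr,
                                              uint64_t size) {
  const int64_t byteOffset = wordOffset * 4; // simm16 words -> bytes
  const uint64_t next = addr + size;         // address of the next instruction
  if (byteOffset < 0 && static_cast<uint64_t>(-byteOffset) > next)
    return std::nullopt; // the target would be negative
  // For a negative offset this unsigned addition subtracts its magnitude.
  return next + static_cast<uint64_t>(byteOffset);
}
```

For the smallest simm16 on a branch at 0x20 this returns no target, so a caller following this approach would print no <symbol+offset> annotation for that instruction.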

After digging a bit more into how we parse/print operands for sopp_br I think there are some more fundamental decisions to make beyond just "do we support signed integers", so I want to avoid changing anything in this patch. I will revisit how we treat them in the future and make any breaking changes all at once, rather than spreading them out.

I'm not sure how the question is whether signed integers are supported. The instruction does treat the offset as signed

In D58400#1412667, @arsenm wrote:

I'm not sure how the question is whether signed integers are supported. The instruction does treat the offset as signed

I mean during parsing/printing. We don't support parsing s_branch -1, nor do we support printing it.

LGTM

This revision is now accepted and ready to land. Mar 1 2019, 3:29 PM

Closed by commit rL355373: [AMDGPU] Implement AMDGPUMCInstrAnalysis (authored by scott.linder). Mar 4 2019, 7:01 PM

This revision was automatically updated to reflect the committed changes.

Diff 188604

lib/Target/AMDGPU/Disassembler/AMDGPUDisassembler.cpp

[70 lines not shown] static int insertNamedMCOperand(MCInst &MI, const MCOperand &Op,
   }
   return OpIdx;
 }

 static DecodeStatus decodeSoppBrTarget(MCInst &Inst, unsigned Imm,
                                        uint64_t Addr, const void *Decoder) {
   auto DAsm = static_cast<const AMDGPUDisassembler*>(Decoder);

+  // Our branches take a simm16, but we need two extra bits to account for the
+  // factor of 4.
   APInt SignedOffset(18, Imm * 4, true);
   int64_t Offset = (SignedOffset.sext(64) + 4 + Addr).getSExtValue();

   if (DAsm->tryAddingSymbolicOperand(Inst, Offset, Addr, true, 2, 2))
     return MCDisassembler::Success;
   return addOperand(Inst, MCOperand::createImm(Imm));
 }

[859 lines not shown]

lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.cpp

[14 lines not shown]
 #include "AMDGPUELFStreamer.h"
 #include "AMDGPUMCAsmInfo.h"
 #include "AMDGPUTargetStreamer.h"
 #include "InstPrinter/AMDGPUInstPrinter.h"
 #include "SIDefines.h"
 #include "llvm/MC/MCAsmBackend.h"
 #include "llvm/MC/MCCodeEmitter.h"
 #include "llvm/MC/MCContext.h"
+#include "llvm/MC/MCInstrAnalysis.h"
 #include "llvm/MC/MCInstrInfo.h"
 #include "llvm/MC/MCObjectWriter.h"
 #include "llvm/MC/MCRegisterInfo.h"
 #include "llvm/MC/MCStreamer.h"
 #include "llvm/MC/MCSubtargetInfo.h"
 #include "llvm/MC/MachineLocation.h"
 #include "llvm/Support/ErrorHandling.h"
 #include "llvm/Support/TargetRegistry.h"
[67 lines not shown] static MCStreamer *createMCStreamer(const Triple &T, MCContext &Context,
                                     std::unique_ptr<MCAsmBackend> &&MAB,
                                     std::unique_ptr<MCObjectWriter> &&OW,
                                     std::unique_ptr<MCCodeEmitter> &&Emitter,
                                     bool RelaxAll) {
   return createAMDGPUELFStreamer(T, Context, std::move(MAB), std::move(OW),
                                  std::move(Emitter), RelaxAll);
 }

+namespace {
+
+class AMDGPUMCInstrAnalysis : public MCInstrAnalysis {
+public:
+  explicit AMDGPUMCInstrAnalysis(const MCInstrInfo *Info)
+      : MCInstrAnalysis(Info) {}
+
+  bool evaluateBranch(const MCInst &Inst, uint64_t Addr, uint64_t Size,
+                      uint64_t &Target) const override {
+    if (Inst.getNumOperands() == 0 || !Inst.getOperand(0).isImm() ||
+        Info->get(Inst.getOpcode()).OpInfo[0].OperandType !=
+            MCOI::OPERAND_PCREL)
+      return false;
+
+    int64_t Imm = Inst.getOperand(0).getImm();
+    // Our branches take a simm16, but we need two extra bits to account for
+    // the factor of 4.
+    APInt SignedOffset(18, Imm * 4, true);
+    Target = (SignedOffset.sext(64) + Addr + Size).getZExtValue();
+    return true;
+  }
+};
+
+} // end anonymous namespace
+
+static MCInstrAnalysis *createAMDGPUMCInstrAnalysis(const MCInstrInfo *Info) {
+  return new AMDGPUMCInstrAnalysis(Info);
+}
+
 extern "C" void LLVMInitializeAMDGPUTargetMC() {

   TargetRegistry::RegisterMCInstrInfo(getTheGCNTarget(), createAMDGPUMCInstrInfo);
   TargetRegistry::RegisterMCInstrInfo(getTheAMDGPUTarget(), createR600MCInstrInfo);
   for (Target *T : {&getTheAMDGPUTarget(), &getTheGCNTarget()}) {
     RegisterMCAsmInfo<AMDGPUMCAsmInfo> X(*T);

     TargetRegistry::RegisterMCRegInfo(*T, createAMDGPUMCRegisterInfo);
     TargetRegistry::RegisterMCSubtargetInfo(*T, createAMDGPUMCSubtargetInfo);
     TargetRegistry::RegisterMCInstPrinter(*T, createAMDGPUMCInstPrinter);
+    TargetRegistry::RegisterMCInstrAnalysis(*T, createAMDGPUMCInstrAnalysis);
     TargetRegistry::RegisterMCAsmBackend(*T, createAMDGPUAsmBackend);
     TargetRegistry::RegisterELFStreamer(*T, createMCStreamer);
   }

   // R600 specific registration
   TargetRegistry::RegisterMCCodeEmitter(getTheAMDGPUTarget(),
                                         createR600MCCodeEmitter);
   TargetRegistry::RegisterObjectTargetStreamer(
[11 lines not shown]

test/MC/AMDGPU/branch-comment.s

This file was added.

				// RUN: llvm-mc -arch=amdgcn -mcpu=fiji -filetype=obj %s | llvm-objcopy -S -K keep_symbol - | llvm-objdump -disassemble -mcpu=fiji - | FileCheck %s --check-prefix=BIN

				// FIXME: Immediate operands to sopp_br instructions are currently scaled by a
				// factor of 4, are unsigned, are always PC relative, don't accept most
				// expressions, and are not range checked.

				loop_start_nosym:
				s_branch loop_start_nosym
				// BIN-NOT: loop_start_nosym:
				// BIN: s_branch 65535 // 000000000000: BF82FFFF <.text>

				s_branch loop_end_nosym
				// BIN: s_branch 0 // 000000000004: BF820000 <.text+0x8>
				// BIN-NOT: loop_end_nosym:
				loop_end_nosym:
				s_nop 0

				keep_symbol:
				s_nop 0

				loop_start_sym:
				s_branch loop_start_sym
				// BIN-NOT: loop_start_sym:
				// BIN: s_branch 65535 // 000000000010: BF82FFFF <keep_symbol+0x4>

				s_branch loop_end_sym
				// BIN: s_branch 0 // 000000000014: BF820000 <keep_symbol+0xc>
				// BIN-NOT: loop_end_sym:
				loop_end_sym:
				s_nop 0

				s_branch 65535
				// BIN: s_branch 65535 // 00000000001C: BF82FFFF <keep_symbol+0x10>

				s_branch 32768
				// BIN: s_branch 32768 // 000000000020: BF828000 <keep_symbol+0xfffffffffffe0018>

				s_branch 32767
				// BIN: s_branch 32767 // 000000000024: BF827FFF <keep_symbol+0x20018>

				s_branch 0x80000000ffff
				// BIN: s_branch 65535 // 000000000028: BF82FFFF <keep_symbol+0x1c>

This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Implement AMDGPUMCInstrAnalysis
ClosedPublic
