This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GFX10] Disabled v_movrel*[sdwa|dpp] opcodes in codegen
ClosedPublic

Authored by dp on Nov 18 2019, 8:07 AM.

Details

Reviewers

arsenm
rampitec
vpykhtin

Commits

rG6778a62eb0d2: [AMDGPU][GFX10] Disabled v_movrel*[sdwa|dpp] opcodes in codegen

Summary

These opcodes use indirect register addressing so they need special handling by codegen (currently missing).
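
As a rough illustration of the approach described in the summary, the following self-contained C++ sketch models an opcode filter that hides assembler-only opcodes from codegen. All `Toy*` and `toy*` names and the opcode numbering are hypothetical and exist only for this example; the real opcode values are generated by TableGen, and the assumption that a combiner treats a result of -1 as "do not use this opcode" is a simplification.

```cpp
#include <cassert>

// Hypothetical opcode numbering for illustration only.
enum ToyOpcode : int {
  TOY_V_MOV_B32_dpp = 0,
  TOY_V_MOVRELS_B32_dpp,
  TOY_V_MOVRELS_B32_sdwa,
  TOY_NUM_OPCODES
};

// Returns true for opcodes that only the assembler may use.
bool toyIsAsmOnlyOpcode(int MCOp) {
  switch (MCOp) {
  case TOY_V_MOVRELS_B32_dpp:
  case TOY_V_MOVRELS_B32_sdwa:
    return true;
  default:
    return false;
  }
}

// Maps an opcode to itself, or to -1 if codegen must not select it.
int toyPseudoToMCOpcode(int MCOp) {
  assert(MCOp >= 0 && MCOp < TOY_NUM_OPCODES && "unknown toy opcode");
  return toyIsAsmOnlyOpcode(MCOp) ? -1 : MCOp;
}

// A combiner-style caller (assumed behavior): give up when the opcode
// is reported as unavailable.
bool toyCanCombineTo(int MCOp) {
  return toyPseudoToMCOpcode(MCOp) != -1;
}
```

In this sketch, `toyCanCombineTo(TOY_V_MOV_B32_dpp)` is true while `toyCanCombineTo(TOY_V_MOVRELS_B32_dpp)` is false, so a pass that consults the filter never produces the indirect-addressing variants.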

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dp created this revision. Nov 18 2019, 8:07 AM

Looks mostly good, but can you split this change into one that relates to DPP and another that disables the asm-only instructions?

dp added a parent revision: D70402: [AMDGPU][DPP] Corrected DPP combiner. Nov 18 2019, 8:35 AM

Separated the DPP combiner changes into D70402.

vpykhtin added inline comments. Nov 18 2019, 10:01 AM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
Line 6332: Is there any way to mark these instructions in td files?

dp marked 2 inline comments as done. Nov 18 2019, 10:56 AM

dp added inline comments.

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
Line 6332: I thought about it. Yes, it is possible, but it will not make the code more readable overall. Labelling these opcodes in td would make the code in this file cleaner, but would require more changes elsewhere. Overall, I think this case is very special and requires a special solution. If we face similar issues in the future (ones that need more cases in the switch below), we may create a flag for this purpose. I'm not sure it is necessary for MOVREL*.

rampitec added inline comments. Nov 18 2019, 12:42 PM

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
Line 6332: Is it the same as isAsmParserOnly in td? If so, shouldn't it be easy to mark it there? On the other hand, if we need an extra TSFlags bit, that is not worth it, as these bits are a limited resource.

Actually, what makes them risky is the implicit use of M0, so the instruction can be folded around an M0 definition. Isn't it cleaner to check for the implicit use in the SDWA and DPP combiners and disable the combining on these grounds, rather than excluding these opcodes from codegen completely?

In D70400#1750450, @rampitec wrote:

Actually, what makes them risky is the implicit use of M0, so the instruction can be folded around an M0 definition. Isn't it cleaner to check for the implicit use in the SDWA and DPP combiners and disable the combining on these grounds, rather than excluding these opcodes from codegen completely?

Maybe. But I do not understand how codegen can handle these instructions without knowing the actual dst and src registers. To support the _dpp and _sdwa variants, codegen needs the same (or similar) hacks as those implemented for v_movreld_b32.

In D70400#1750597, @dp wrote:

Maybe. But I do not understand how codegen can handle these instructions without knowing actual dst and src registers. To support _dpp and _sdwa variants codegen needs the same (or similar) hacks as those implemented for v_movreld_b32.

Hmm. I think you are right:

v1 = v_and_b32 v2, 0xf
v3 = v_movrels_b32 v1

Means: v3 = v1[m0], same as v3 = (v1 & 0xf)[m0]
After sdwa conversion it would be: v3 = v2[m0] & 0xf

Not exactly the same thing.
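
The difference described in the example above can be checked with a small simulation. This is a minimal sketch, assuming a simplified model in which `v_movrels_b32` reads the VGPR at index (source register number + M0); `ToyGpu`, `originalSequence` and `foldedSequence` are hypothetical names introduced only for illustration and are not part of LLVM.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Toy register file: 256 VGPRs plus the M0 index register.
struct ToyGpu {
  std::array<uint32_t, 256> V{};
  uint32_t M0 = 0;

  // Simplified v_movrels_b32: read the VGPR at index SrcIdx + M0.
  uint32_t movrels(unsigned SrcIdx) const {
    assert(SrcIdx + M0 < V.size() && "relative index out of range");
    return V[SrcIdx + M0];
  }
};

// Original sequence: v1 = v2 & 0xf; v3 = v_movrels_b32 v1.
uint32_t originalSequence(ToyGpu G) {
  G.V[1] = G.V[2] & 0xf;
  G.V[3] = G.movrels(1);
  return G.V[3];
}

// After a naive SDWA-style fold: v3 = v2[m0] & 0xf.
uint32_t foldedSequence(ToyGpu G) {
  G.V[3] = G.movrels(2) & 0xf;
  return G.V[3];
}
```

In this model the two sequences agree when M0 is 0, because both then read the masked value of v2. With any non-zero M0 they read different registers: the original reads v(1 + M0) unmasked, while the folded form reads v(2 + M0) and masks it.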

LGTM

This revision is now accepted and ready to land. Nov 18 2019, 2:30 PM

LGTM.

Closed by commit rG6778a62eb0d2: [AMDGPU][GFX10] Disabled v_movrel*[sdwa|dpp] opcodes in codegen (authored by dp). Nov 20 2019, 7:08 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/lib/Target/AMDGPU/SIInstrInfo.h	4 lines
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp	23 lines

Diff 230256

llvm/lib/Target/AMDGPU/SIInstrInfo.h

... 1,011 unchanged lines not shown (context: public:) ...
   bool isLegalFLATOffset(int64_t Offset, unsigned AddrSpace,
                          bool Signed) const;

   /// \brief Return a target-specific opcode if Opcode is a pseudo instruction.
   /// Return -1 if the target-specific opcode for the pseudo instruction does
   /// not exist. If Opcode is not a pseudo instruction, this is identity.
   int pseudoToMCOpcode(int Opcode) const;

+  /// \brief Check if this instruction should only be used by assembler.
+  /// Return true if this opcode should not be used by codegen.
+  bool isAsmOnlyOpcode(int MCOp) const;
+
   const TargetRegisterClass *getRegClass(const MCInstrDesc &TID, unsigned OpNum,
                                          const TargetRegisterInfo *TRI,
                                          const MachineFunction &MF)
     const override {
     if (OpNum >= TID.getNumOperands())
       return nullptr;
     return RI.getRegClass(TID.OpInfo[OpNum].RegClass);
   }
... 133 unchanged lines not shown ...

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp

... 6,323 unchanged lines not shown (context: static SIEncodingFamily subtargetEncodingFamily(const GCNSubtarget &ST) {) ...
   case AMDGPUSubtarget::GFX9:
     return SIEncodingFamily::VI;
   case AMDGPUSubtarget::GFX10:
     return SIEncodingFamily::GFX10;
   }
   llvm_unreachable("Unknown subtarget generation!");
 }

+bool SIInstrInfo::isAsmOnlyOpcode(int MCOp) const {
+  switch(MCOp) {
+  // These opcodes use indirect register addressing so
+  // they need special handling by codegen (currently missing).
+  // Therefore it is too risky to allow these opcodes
+  // to be selected by dpp combiner or sdwa peepholer.
+  case AMDGPU::V_MOVRELS_B32_dpp_gfx10:
+  case AMDGPU::V_MOVRELS_B32_sdwa_gfx10:
+  case AMDGPU::V_MOVRELD_B32_dpp_gfx10:
+  case AMDGPU::V_MOVRELD_B32_sdwa_gfx10:
+  case AMDGPU::V_MOVRELSD_B32_dpp_gfx10:
+  case AMDGPU::V_MOVRELSD_B32_sdwa_gfx10:
+  case AMDGPU::V_MOVRELSD_2_B32_dpp_gfx10:
+  case AMDGPU::V_MOVRELSD_2_B32_sdwa_gfx10:
+    return true;
+  default:
+    return false;
+  }
+}
+
 int SIInstrInfo::pseudoToMCOpcode(int Opcode) const {
   SIEncodingFamily Gen = subtargetEncodingFamily(ST);

   if ((get(Opcode).TSFlags & SIInstrFlags::renamedInGFX9) != 0 &&
       ST.getGeneration() == AMDGPUSubtarget::GFX9)
     Gen = SIEncodingFamily::GFX9;

   // Adjust the encoding family to GFX80 for D16 buffer instructions when the
... 22 unchanged lines not shown (context: int SIInstrInfo::pseudoToMCOpcode(int Opcode) const {) ...
   if (MCOp == -1)
     return Opcode;

   // (uint16_t)-1 means that Opcode is a pseudo instruction that has
   // no encoding in the given subtarget generation.
   if (MCOp == (uint16_t)-1)
     return -1;

+  if (isAsmOnlyOpcode(MCOp))
+    return -1;
+
   return MCOp;
 }

 static
 TargetInstrInfo::RegSubRegPair getRegOrUndef(const MachineOperand &RegOpnd) {
   assert(RegOpnd.isReg());
   return RegOpnd.isUndef() ? TargetInstrInfo::RegSubRegPair() :
                              getRegSubRegPair(RegOpnd);
... 218 unchanged lines not shown ...