This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Add G_SBFX + G_UBFX (bitfield extraction opcodes)
Closed, Public

Authored by paquette on Mar 11 2021, 3:58 PM.

Details

Summary

There is a bunch of similar bitfield extraction code throughout *ISelDAGToDAG.

E.g., ARMISelDAGToDAG, AArch64ISelDAGToDAG, and AMDGPUISelDAGToDAG all contain code that matches a bitfield extract from an and + right shift.

Rather than duplicating that code in the same way for GlobalISel, this patch adds two opcodes:

  • G_UBFX (unsigned bitfield extract)
  • G_SBFX (signed bitfield extract)

They work like this:

%x = G_UBFX %y, lsb, width

Where lsb and width denote:

  • The least-significant bit of the extraction
  • The width of the extraction

This will extract width bits from %y, starting at lsb. G_UBFX zero-extends the result, while G_SBFX sign-extends the result.
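
As a rough illustration of the semantics (a hedged sketch, not code from this patch; it assumes a 64-bit scalar, 0 < width, and lsb + width <= 64):

#include <cstdint>

// Illustrative only: the value computed by G_UBFX/G_SBFX for in-range
// constant operands on a 64-bit scalar.
uint64_t ubfx(uint64_t y, unsigned lsb, unsigned width) {
  uint64_t mask = width == 64 ? ~0ULL : (1ULL << width) - 1;
  return (y >> lsb) & mask;                // zero-extended result
}

int64_t sbfx(uint64_t y, unsigned lsb, unsigned width) {
  uint64_t field = ubfx(y, lsb, width);
  uint64_t sign = 1ULL << (width - 1);     // sign bit of the extracted field
  return (int64_t)((field ^ sign) - sign); // sign-extend bit (width - 1)
}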

This should allow us to use the combiner to match the bitfield extraction patterns rather than duplicating pattern-matching code in each target.

Diff Detail

Event Timeline

paquette created this revision. Mar 11 2021, 3:58 PM
paquette requested review of this revision. Mar 11 2021, 3:58 PM
Herald added a project: Restricted Project. Mar 11 2021, 3:58 PM
Herald added a subscriber: wdng.
arsenm added inline comments. Mar 11 2021, 5:40 PM
llvm/include/llvm/Target/GenericOpcodes.td:1364

These can be registers like a normal op. AMDGPU has registers/variable inputs for the offset and width.

If we have to allow variable width operands for these, it doesn't really help AArch64 that much. We have to match the constant pattern to a target specific opcode, and then lower the rest of these to shifts by default. I can see the benefit of AMDGPU, but with variable extracts this just puts us back where we started for AArch64.
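
The shift-based fallback mentioned here could look roughly like this for a 64-bit scalar with constant operands (a sketch under those assumptions, not code from this patch):

#include <cstdint>

// Hypothetical lowering of a constant-operand extract to a left shift
// followed by a right shift: logical for the unsigned form, arithmetic for
// the signed form. Assumes 0 < width and lsb + width <= 64.
uint64_t lowerUBFXToShifts(uint64_t x, unsigned lsb, unsigned width) {
  return (x << (64 - lsb - width)) >> (64 - width);
}

int64_t lowerSBFXToShifts(uint64_t x, unsigned lsb, unsigned width) {
  return (int64_t)(x << (64 - lsb - width)) >> (64 - width);
}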

paquette updated this revision to Diff 330291. Mar 12 2021, 10:21 AM
paquette edited the summary of this revision.
  • Make the LSB + width registers
  • Update verifier
  • Add a MIRBuilder test

Would an allowsVariableWidthExtracts hook of some sort help? Then we could match the constants-only pattern in AArch64, but allow other targets to have variable-width operands.

I think something like this would work post-legalization:

// We'd need these functions...
if (isBeforeLegalizer() || !isLegal(...))
  return false;

Register LshrLHS, LshrRHS;
int64_t Mask;
if (!mi_match(Reg, MRI,
              m_GAnd(m_GLShr(m_Reg(LshrLHS), m_Reg(LshrRHS)), m_ICst(Mask))))
  return false;
if (/*Mask isn't a mask...*/)
  return false;

if (TLI.allowsRegisterExtractOps(/*...*/)) {
  // ... Do stuff ...
  return true;
}

// Need immediates for LSB + width.
int64_t LshrImm;
if (!mi_match(LshrRHS, MRI, m_ICst(LshrImm)))
  return false;

// ... Do stuff ...
return true;
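
One way the elided mask check and operand computation might be fleshed out (purely illustrative; it assumes the and-mask must be a plain trailing-ones mask and uses the MathExtras helpers):

// Hypothetical completion of the elided checks above, not code from the
// patch; relies on llvm/Support/MathExtras.h.
if (!isMask_64(static_cast<uint64_t>(Mask)))    // only accept 0...011...1 masks
  return false;
uint64_t Width = countTrailingOnes(static_cast<uint64_t>(Mask));

// With a constant shift amount, the LSB is just the shift amount, and the
// extract would be built as something like G_UBFX LshrLHS, LSB, Width.
uint64_t LSB = static_cast<uint64_t>(LshrImm);
if (LSB + Width > MRI.getType(Reg).getScalarSizeInBits())
  return false;                                  // extraction must stay in range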

So we form these post-legalize only if we can see that the sources are constant for arm64. Sounds reasonable to me. @arsenm, is that OK for AMDGPU?

Yes, forming them only if the target supports the variable case makes sense (although also having the legalize-back-to-shifts path would be nice for consistency).

aemerson accepted this revision. Mar 15 2021, 9:39 PM

LGTM.

This revision is now accepted and ready to land. Mar 15 2021, 9:39 PM