This is an archive of the discontinued LLVM Phabricator instance.

Differential D69513

[GlobalISel] Widen one type at the time for insert/extract vector elt
Needs Review · Public

Authored by Petar.Avramovic on Oct 28 2019, 8:20 AM.

Details

Reviewers

arsenm
atanasyan
petarj

Summary

In widenScalar for G_INSERT_VECTOR_ELT and G_EXTRACT_VECTOR_ELT, widening
the TypeIdx of the scalar element being inserted/extracted used to also
affect the vector type at the other TypeIdx: it widened the vector's
scalar type while keeping the same number of elements.
Change widenScalar for these opcodes to affect only the operands whose
LLT corresponds to the given TypeIdx.
This makes it possible to widen only the scalar type and leave the
vector type unchanged.
The old behavior of element widenScalar can still be achieved with two
widenScalar calls, one for the element TypeIdx (minScalar) and one for
the vector TypeIdx (minScalarOrElt), using the same scalar type.
Update AMDGPULegalizerInfo accordingly.
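
The change in behavior can be illustrated with a small standalone model. This is only a sketch and not LLVM code: `Ty`, `InsertElt`, `widenOld` and `widenNew` are hypothetical stand-ins for LLT, a G_INSERT_VECTOR_ELT instruction, and widenScalar before and after this patch. Only element counts and scalar bit widths are tracked, and the index operand (TypeIdx 2) is not modeled.

```cpp
#include <cassert>

// Hypothetical stand-in for LLT: NumElts == 0 means a scalar type.
struct Ty {
  unsigned NumElts;
  unsigned ScalarBits;
  bool operator==(const Ty &O) const {
    return NumElts == O.NumElts && ScalarBits == O.ScalarBits;
  }
};

// Hypothetical model of G_INSERT_VECTOR_ELT: TypeIdx 0 is the vector
// (destination and source), TypeIdx 1 is the inserted element.
struct InsertElt {
  Ty Vec;
  Ty Elt;
};

// Old behavior: widening TypeIdx 1 also widened the vector's scalar type
// while keeping the number of elements. Other indices are not modeled.
inline InsertElt widenOld(InsertElt MI, unsigned TypeIdx, unsigned WideBits) {
  if (TypeIdx == 1) {
    MI.Elt.ScalarBits = WideBits;
    MI.Vec.ScalarBits = WideBits;
  }
  return MI;
}

// New behavior: each TypeIdx only affects the operands of that TypeIdx.
inline InsertElt widenNew(InsertElt MI, unsigned TypeIdx, unsigned WideBits) {
  if (TypeIdx == 0)
    MI.Vec.ScalarBits = WideBits;
  else if (TypeIdx == 1)
    MI.Elt.ScalarBits = WideBits;
  return MI;
}
```

In this model, `widenNew` on TypeIdx 1 followed by `widenNew` on TypeIdx 0 with the same width yields the same types as a single `widenOld` call on TypeIdx 1, which is the equivalence the summary describes.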

Event Timeline

Petar.Avramovic created this revision. Oct 28 2019, 8:20 AM

Herald added a project: Restricted Project. Oct 28 2019, 8:20 AM

Herald added subscribers: llvm-commits, volkan, hiraditya and 5 others.

Petar.Avramovic marked an inline comment as done. Oct 28 2019, 8:25 AM

Petar.Avramovic added inline comments.

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1784	G_INSERT_VECTOR_ELT uses G_ANYEXT for its vector TypeIdx (0). Should this one also use G_ANYEXT?

Petar.Avramovic added a child revision: D69711: [MIPS GlobalISel] Select MSA insert_vector_elt with immediate index. Nov 1 2019, 6:32 AM

I don't understand the motivation. The vector element and insert element type need to match, but it appears there's a missing verifier check.

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1784	This should probably be changed in a separate patch

Petar.Avramovic mentioned this in D69711: [MIPS GlobalISel] Select MSA insert_vector_elt with immediate index. Nov 4 2019, 1:27 AM

I don't understand the motivation.
The vector element and insert element type need to match, but it appears there's a missing verifier check

This is definitely true for the LLVM IR insertelement and extractelement instructions.
But the SDAG nodes ISD::INSERT_VECTOR_ELT and ISD::EXTRACT_VECTOR_ELT don't follow this rule.
Mips uses DAGTypeLegalizer::PromoteIntegerResult for i8 and i16 and promotes them to i32, leaving the vector scalar type unchanged.
In the .td file, the element being inserted/extracted has an i32 operand type (for i8, i16 and i32), and the instruction is selected based on the vector type (v16i8, v8i16, v4i32).

I have just seen these two asserts, which forbid the vector scalar type and the element scalar type from differing:
https://github.com/llvm/llvm-project/blob/a0324e911374441151903ed0d828e0fc1994c167/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp#L1090
https://github.com/llvm/llvm-project/blob/a0324e911374441151903ed0d828e0fc1994c167/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp#L1100
The widenScalar change does not trigger the asserts since it only changes an operand instead of building a new instruction. How do we approach this issue then?
Btw, by analogy with G_ZEXTLOAD and G_SEXTLOAD, are G_SEXT_EXTRACT_VECTOR_ELT and G_ZEXT_EXTRACT_VECTOR_ELT a consideration?
They would have a different element scalar type than the vector scalar type.
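
The constraint under discussion can be written as a standalone predicate. This is a sketch only: `Ty` and `eltMatchesVector` are hypothetical names, and it assumes the linked asserts compare the scalar operand's type against the vector's element type, as described above.

```cpp
#include <cassert>

// Hypothetical stand-in for LLT: NumElts == 0 means a scalar type.
struct Ty {
  unsigned NumElts;
  unsigned ScalarBits;
};

// Returns true when Elt is a scalar whose width equals the width of Vec's
// elements, i.e. the element/vector pairing the asserts are said to require.
inline bool eltMatchesVector(const Ty &Vec, const Ty &Elt) {
  return Vec.NumElts > 0 && Elt.NumElts == 0 &&
         Vec.ScalarBits == Elt.ScalarBits;
}
```

Under this check, a <16 x s8> vector paired with an s8 element passes, while the Mips-style pairing of <16 x s8> with an s32 element does not, which is the mismatch that widening only the element TypeIdx produces.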

Ping.

In D69513#1732467, @Petar.Avramovic wrote:

I have just seen these two asserts, which forbid the vector scalar type and the element scalar type from differing:
https://github.com/llvm/llvm-project/blob/a0324e911374441151903ed0d828e0fc1994c167/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp#L1090
https://github.com/llvm/llvm-project/blob/a0324e911374441151903ed0d828e0fc1994c167/llvm/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp#L1100
The widenScalar change does not trigger the asserts since it only changes an operand instead of building a new instruction. How do we approach this issue then?
Btw, by analogy with G_ZEXTLOAD and G_SEXTLOAD, are G_SEXT_EXTRACT_VECTOR_ELT and G_ZEXT_EXTRACT_VECTOR_ELT a consideration?
They would have a different element scalar type than the vector scalar type.

Based on the fact that we have separate G_BUILD_VECTOR and G_BUILD_VECTOR_TRUNC, I think there should be a separate opcode for vector operations which implicitly convert.

I don't fully understand G_BUILD_VECTOR_TRUNC. No such opcode exists in LLVM IR, so it can only be created in the legalizer or later (at the moment, at least). The problem here is context-sensitive legality (which we still avoid) alongside type legality. For example, the types might be fine but we still can't select an instruction because the vector is built from different virtual registers (so a splat is not an option), and we should lower instead. In the current state of the legalizer I don't think it is possible to make it legal based on type alone, so I planned to handle it with custom legalization, by either context-sensitive select or lower.
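
As a rough illustration of what "context-sensitive" means here, the decision depends on the operands' registers rather than on their types. The following is a hypothetical sketch, not an LLVM API: `Action` and `classifyBuildVector` are made-up names, and virtual registers are modeled as plain integers.

```cpp
#include <cassert>
#include <vector>

// Hypothetical outcome of a custom legalization step.
enum class Action { SelectSplat, Lower };

// Srcs holds the virtual register number feeding each vector element.
// A splat is only possible when every element comes from the same register;
// otherwise the instruction has to be lowered, even though the types alone
// would look legal.
inline Action classifyBuildVector(const std::vector<unsigned> &Srcs) {
  if (Srcs.empty())
    return Action::Lower;
  for (unsigned Reg : Srcs)
    if (Reg != Srcs.front())
      return Action::Lower; // Elements differ, so a splat cannot be used.
  return Action::SelectSplat;
}
```

Two instructions with identical types can therefore get different answers, which is why a purely type-based legality rule cannot express this case.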

Reasoning similarly to G_BUILD_VECTOR_TRUNC, it is probably a good thing to have verbose corner-case opcodes for vector operations which implicitly convert.
If I understood correctly, we need 4 new generic opcodes:
G_INSERT_VECTOR_ELT_TRUNC
G_SEXT_EXTRACT_VECTOR_ELT
G_ZEXT_EXTRACT_VECTOR_ELT
G_ANYEXT_EXTRACT_VECTOR_ELT
For MIPS, I would create them in the preLegalizerCombiner.

The only difference I see between G_INSERT_VECTOR_ELT_TRUNC, G_ANYEXT_EXTRACT_VECTOR_ELT and G_INSERT_VECTOR_ELT, G_EXTRACT_VECTOR_ELT is widenScalar for the scalar type.
Do the scalar type and the vector scalar type always have to be different for them?
I.e., is %Dest:_(<2 x s8>) = G_INSERT_VECTOR_ELT_TRUNC %Src:_(<2 x s8>), %Elt:_(s8), %Index:_(s32) allowed, or must it be G_INSERT_VECTOR_ELT instead?
If the latter, how do we widen the scalar for G_INSERT_VECTOR_ELT (it should change the opcode to G_INSERT_VECTOR_ELT_TRUNC)? Something like WidenEltScalar that would be used only for G_INSERT_VECTOR_ELT and G_EXTRACT_VECTOR_ELT?
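
One possible reading of the "latter" option can be sketched as follows. This is purely hypothetical: `pickInsertOpcode` is a made-up helper, G_INSERT_VECTOR_ELT_TRUNC is only a proposed opcode in this thread, and the sketch assumes the stricter rule that the _TRUNC form is used only when the element is wider than the vector's scalar type, which is exactly the open question above.

```cpp
#include <cassert>
#include <string>

// Hypothetical opcode choice based on the element width and the vector's
// scalar width. Under the assumed rule, equal widths keep the plain opcode
// and a wider element requires the implicitly truncating form.
inline std::string pickInsertOpcode(unsigned VecScalarBits, unsigned EltBits) {
  if (EltBits == VecScalarBits)
    return "G_INSERT_VECTOR_ELT";
  if (EltBits > VecScalarBits)
    return "G_INSERT_VECTOR_ELT_TRUNC";
  return "unsupported"; // A narrower element is not covered by the proposal.
}
```

Under this rule, a WidenEltScalar-style action that widens the element of `<2 x s8>` from s8 to s32 would have to rewrite the opcode as well as the operand type.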

Revision Contents

Path    Size
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp    29 lines
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp    5 lines
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-insert-vector-elt.mir    6 lines

Diff 226666

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp

@@ ... @@ case TargetOpcode::G_PHI: {
     MachineBasicBlock &MBB = *MI.getParent();
     MIRBuilder.setInsertPt(MBB, --MBB.getFirstNonPHI());
     widenScalarDst(MI, WideTy);
     Observer.changedInstr(MI);
     return Legalized;
   }
   case TargetOpcode::G_EXTRACT_VECTOR_ELT: {
     if (TypeIdx == 0) {
-      Register VecReg = MI.getOperand(1).getReg();
-      LLT VecTy = MRI.getType(VecReg);
       Observer.changingInstr(MI);
-
-      widenScalarSrc(MI, LLT::vector(VecTy.getNumElements(),
-                                     WideTy.getSizeInBits()),
-                     1, TargetOpcode::G_SEXT);
-
       widenScalarDst(MI, WideTy, 0);
       Observer.changedInstr(MI);
       return Legalized;
     }

+    if (TypeIdx == 1) {
+      Observer.changingInstr(MI);
+      widenScalarSrc(MI, WideTy, 1, TargetOpcode::G_SEXT);
        Inline comment (Petar.Avramovic, Done): G_INSERT_VECTOR_ELT uses G_ANYEXT for its vector TypeIdx (0). Should this one also use G_ANYEXT?
        Inline comment (arsenm, Not Done): This should probably be changed in a separate patch
+      Observer.changedInstr(MI);
+      return Legalized;
+    }
+
     if (TypeIdx != 2)
       return UnableToLegalize;
     Observer.changingInstr(MI);
     // TODO: Probably should be zext
     widenScalarSrc(MI, WideTy, 2, TargetOpcode::G_SEXT);
     Observer.changedInstr(MI);
     return Legalized;
   }
   case TargetOpcode::G_INSERT_VECTOR_ELT: {
-    if (TypeIdx == 1) {
+    if (TypeIdx == 0) {
       Observer.changingInstr(MI);
+      widenScalarSrc(MI, WideTy, 1, TargetOpcode::G_ANYEXT);
+      widenScalarDst(MI, WideTy, 0);
+      Observer.changedInstr(MI);
+      return Legalized;
+    }

-      Register VecReg = MI.getOperand(1).getReg();
-      LLT VecTy = MRI.getType(VecReg);
-      LLT WideVecTy = LLT::vector(VecTy.getNumElements(), WideTy);
-
-      widenScalarSrc(MI, WideVecTy, 1, TargetOpcode::G_ANYEXT);
+    if (TypeIdx == 1) {
+      Observer.changingInstr(MI);
       widenScalarSrc(MI, WideTy, 2, TargetOpcode::G_ANYEXT);
-      widenScalarDst(MI, WideVecTy, 0);
       Observer.changedInstr(MI);
       return Legalized;
     }

     if (TypeIdx == 2) {
       Observer.changingInstr(MI);
       // TODO: Probably should be zext
       widenScalarSrc(MI, WideTy, 3, TargetOpcode::G_SEXT);

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp

@@ ... @@ for (unsigned Op : {G_EXTRACT_VECTOR_ELT, G_INSERT_VECTOR_ELT}) {
     unsigned EltTypeIdx = Op == G_EXTRACT_VECTOR_ELT ? 0 : 1;
     unsigned IdxTypeIdx = 2;

     getActionDefinitionsBuilder(Op)
       .customIf([=](const LegalityQuery &Query) {
           const LLT EltTy = Query.Types[EltTypeIdx];
           const LLT VecTy = Query.Types[VecTypeIdx];
           const LLT IdxTy = Query.Types[IdxTypeIdx];
-          return (EltTy.getSizeInBits() == 16 ||
+          return EltTy.getSizeInBits() == VecTy.getScalarSizeInBits() &&
+                 (EltTy.getSizeInBits() == 16 ||
                   EltTy.getSizeInBits() % 32 == 0) &&
                  VecTy.getSizeInBits() % 32 == 0 &&
                  VecTy.getSizeInBits() <= 1024 &&
                  IdxTy.getSizeInBits() == 32;
         })
       .clampScalar(EltTypeIdx, S32, S64)
-      .clampScalar(VecTypeIdx, S32, S64)
+      .clampScalarOrElt(VecTypeIdx, S32, S64)
       .clampScalar(IdxTypeIdx, S32, S32);
   }

   getActionDefinitionsBuilder(G_EXTRACT_VECTOR_ELT)
     .unsupportedIf([=](const LegalityQuery &Query) {
         const LLT &EltTy = Query.Types[1].getElementType();
         return Query.Types[0] != EltTy;
       });

llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-insert-vector-elt.mir

 body: |
   bb.0:
     liveins: $vgpr0

     ; CHECK-LABEL: name: insert_vector_elt_0_v2i8_i32
     ; CHECK: [[COPY:%[0-9]+]]:_(s32) = COPY $vgpr0
     ; CHECK: [[DEF:%[0-9]+]]:_(<2 x s32>) = G_IMPLICIT_DEF
-    ; CHECK: [[COPY1:%[0-9]+]]:_(<2 x s32>) = COPY [[DEF]](<2 x s32>)
-    ; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY [[COPY]](s32)
-    ; CHECK: [[INSERT:%[0-9]+]]:_(<2 x s32>) = G_INSERT [[COPY1]], [[COPY2]](s32), 0
+    ; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY [[COPY]](s32)
+    ; CHECK: [[COPY2:%[0-9]+]]:_(<2 x s32>) = COPY [[DEF]](<2 x s32>)
+    ; CHECK: [[INSERT:%[0-9]+]]:_(<2 x s32>) = G_INSERT [[COPY2]], [[COPY1]](s32), 0
     ; CHECK: [[COPY3:%[0-9]+]]:_(<2 x s32>) = COPY [[INSERT]](<2 x s32>)
     ; CHECK: $vgpr0_vgpr1 = COPY [[COPY3]](<2 x s32>)
     %0:_(s32) = COPY $vgpr0
     %1:_(s8) = G_TRUNC %0
     %2:_(<2 x s8>) = G_IMPLICIT_DEF
     %3:_(s32) = G_CONSTANT i32 0
     %4:_(<2 x s8>) = G_INSERT_VECTOR_ELT %2, %1, %3
     %5:_(<2 x s32>) = G_ANYEXT %4
     $vgpr0_vgpr1 = COPY %5
 ...