This is an archive of the discontinued LLVM Phabricator instance.


Differential D79171

[InstCombine] canonicalize bitcast after insertelement into undef
ClosedPublic

Authored by spatel on Apr 30 2020, 6:25 AM.


Details

Reviewers

craig.topper
RKSimon
lebedev.ri

Commits

rG856cc60bc1ad: [InstCombine] canonicalize bitcast after insertelement into undef

Summary

We have a partial transform in the opposite direction, so that needs to be removed while adding a more general transform that moves bitcast after insertelement.

The motivating case from PR45748:
https://bugs.llvm.org/show_bug.cgi?id=45748
...is the last test diff. In that example, we are triggering an existing bitcast transform, so we reduce the number of casts, and that should give us the ideal x86 codegen.

I'm not sure what to do about the mmx diffs. If the x86 backend is expecting something in particular, we need to specify that here (do we need to exclude/add the mmx type to either of these code diffs?).

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

spatel created this revision. Apr 30 2020, 6:25 AM

Herald added a project: Restricted Project. Apr 30 2020, 6:25 AM

Herald added subscribers: hiraditya, mcrosier.

I messed with mmx in this area once and regretted it. See rG5ebbabc1af360756f402203ba7704bb480f279a7

Is canonicalizing towards a vector type that wasn't mentioned in the IR the right way to go? Is that cast free for legal types on all targets? I think it's potentially scalarized or becomes a load/store to a stack temporary for illegal types in the backend.

In D79171#2013138, @craig.topper wrote:

I messed with mmx in this area once and regretted it. See rG5ebbabc1af360756f402203ba7704bb480f279a7

Is canonicalizing towards a vector type that wasn't mentioned in the IR the right way to go? Is that cast free for legal types on all targets? I think it's potentially scalarized or becomes a load/store to a stack temporary for illegal types in the backend.

It kinda sounds to me like such cases should use some custom intrinsics instead of generic bitcasts.

In D79171#2013138, @craig.topper wrote:

I messed with mmx in this area once and regretted it. See rG5ebbabc1af360756f402203ba7704bb480f279a7

Sufficiently scared. Will remove mmx diffs.

Is canonicalizing towards a vector type that wasn't mentioned in the IR the right way to go? Is that cast free for legal types on all targets? I think it's potentially scalarized or becomes a load/store to a stack temporary for illegal types in the backend.

I think we're fine here. The vector bitcast that we are creating is only changing the element type (FP <-> int), and we assume that both of those are ok since they are already in the code independent of this transform.
The vector cast is likely safer than the scalar cast on most targets because it's probably not causing any underlying register move (assuming vector registers hold both int and FP). So even if the vector type is illegal, the obvious lowering is a no-op cast to the legal vector type.
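As an illustration of why the reordering preserves the bits (this is not part of the patch), here is a minimal standalone C++ sketch of the PR45748 pattern. It assumes `double` and `std::uint64_t` are both 64 bits wide, and it models the IR bitcasts with `std::memcpy`; the function names `castThenInsert` and `insertThenCast` are hypothetical and exist only for this example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// Old form: bitcast each scalar to i64, then insert into a <2 x i64>.
std::array<std::uint64_t, 2> castThenInsert(double X, double Y) {
  std::array<std::uint64_t, 2> V{};
  std::memcpy(&V[0], &X, sizeof X); // models: bitcast double %x to i64
  std::memcpy(&V[1], &Y, sizeof Y); // models: bitcast double %y to i64
  return V;
}

// New form: insert into a <2 x double>, then do one vector bitcast.
std::array<std::uint64_t, 2> insertThenCast(double X, double Y) {
  std::array<double, 2> D{X, Y};
  std::array<std::uint64_t, 2> V{};
  static_assert(sizeof D == sizeof V, "bitcast requires equal total size");
  std::memcpy(V.data(), D.data(), sizeof V); // models the vector bitcast
  return V;
}
```

Both functions return the same lanes for any inputs, for example `assert(castThenInsert(1.5, -2.0) == insertThenCast(1.5, -2.0));`. The sketch only covers the case where every lane is written; it does not model undef lanes.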

Patch updated:
Kept existing fold for mmx and added warning comment.

LG in general, but MMX stuff puzzles me, so would be good for @craig.topper to comment.

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
2487–2488	(On Diff #261290)	Am I reading this correctly that this is NFC in the context of rG5ebbabc1af360756f402203ba7704bb480f279a7? Can this be split off?
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1064	I guess this won't work for scalable vectors? Can we somehow just replace the elt type in `VecOp->getType()` instead?

This revision is now accepted and ready to land. May 6 2020, 12:07 AM

spatel marked 4 inline comments as done. May 6 2020, 7:50 AM

spatel added a subscriber: ctetreau.

spatel added inline comments.

llvm/lib/Transforms/InstCombine/InstCombineCasts.cpp
2487–2488	(On Diff #261290)	It's not quite NFC - this code diff results in the test diff in llvm/test/Transforms/InstCombine/bitcast-vec-canon.ll test "@c". That does seem like an independent improvement though, so I can break that into a preliminary commit.
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1064	I think this is safe for scalable vectors, so I better add a test. See similar construct annotated below - around line 1279. cc @ctetreau
1315	Safe for scalable vectors?
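To make the scalable-vector question concrete, here is a toy C++ model (not LLVM code) of building the new vector type from the destination type's element count. The assumption being illustrated is that the element count carries the scalable flag along with the minimum lane count, so swapping only the element type keeps `vscale`. The types `ElementCount` and `VecType` and the helpers `getVecType` and `toString` below are hypothetical stand-ins written for this sketch.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in: minimum lane count plus a scalable flag.
struct ElementCount {
  unsigned Min;  // minimum number of elements
  bool Scalable; // true for <vscale x N x Ty>
};

// Hypothetical stand-in for a vector type: element type name + count.
struct VecType {
  std::string Elt;
  ElementCount EC;
};

// Models building a vector type from an element type and an element count.
VecType getVecType(const std::string &Elt, ElementCount EC) {
  return VecType{Elt, EC};
}

// Prints the type in IR-like syntax.
std::string toString(const VecType &T) {
  std::string S = "<";
  if (T.EC.Scalable)
    S += "vscale x ";
  S += std::to_string(T.EC.Min) + " x " + T.Elt + ">";
  return S;
}
```

With a destination of `<vscale x 3 x float>` and a scalar source of type `i32`, `getVecType("i32", Dest.EC)` prints as `<vscale x 3 x i32>` in this model, which has the same shape as the insertelement type in the scalable vector test added to the patch.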

spatel mentioned this in rG2058c98715f6: [InstCombine] limit bitcast+insertelement transform to x86 MMX type. May 6 2020, 10:13 AM

Patch updated:

Pulled the MMX diff into a preliminary commit - rG2058c98715f6
Added a scalable vector test.

spatel mentioned this in rGbcc5ed7b24e9: [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC. May 10 2020, 8:30 AM

spatel mentioned this in rGd02b3aba37d9: [CodeGen] fix test to be (mostly) independent of LLVM optimizer; NFC.

Closed by commit rG856cc60bc1ad: [InstCombine] canonicalize bitcast after insertelement into undef (authored by spatel). May 10 2020, 9:02 AM

This revision was automatically updated to reflect the committed changes.

cdevadas mentioned this in D106545: SROA: Process bitcast (select ptr1, ptr2). Jul 22 2021, 6:35 AM

Revision Contents

Path	Size
llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp	19 lines
llvm/test/Transforms/InstCombine/bitcast-vec-canon.ll	38 lines

Diff 263072

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp

[1,044 lines not shown] Instruction *InstCombiner::visitInsertElementInst(InsertElementInst &IE) {
   Value *VecOp = IE.getOperand(0);
   Value *ScalarOp = IE.getOperand(1);
   Value *IdxOp = IE.getOperand(2);

   if (auto *V = SimplifyInsertElementInst(
           VecOp, ScalarOp, IdxOp, SQ.getWithInstruction(&IE)))
     return replaceInstUsesWith(IE, V);

+  // If the scalar is bitcast and inserted into undef, do the insert in the
+  // source type followed by bitcast.
+  // TODO: Generalize for insert into any constant, not just undef?
+  Value *ScalarSrc;
+  if (match(VecOp, m_Undef()) &&
+      match(ScalarOp, m_OneUse(m_BitCast(m_Value(ScalarSrc)))) &&
+      (ScalarSrc->getType()->isIntegerTy() ||
+       ScalarSrc->getType()->isFloatingPointTy())) {
+    // inselt undef, (bitcast ScalarSrc), IdxOp -->
+    //   bitcast (inselt undef, ScalarSrc, IdxOp)
+    Type *ScalarTy = ScalarSrc->getType();
+    Type *VecTy = VectorType::get(ScalarTy, IE.getType()->getElementCount());
Inline comment (lebedev.ri): I guess this won't work for scalable vectors? Can we somehow just replace the elt type in `VecOp->getType()` instead?
Inline comment (spatel): I think this is safe for scalable vectors, so I better add a test. See similar construct annotated below - around line 1279. cc @ctetreau
+    UndefValue *NewUndef = UndefValue::get(VecTy);
+    Value *NewInsElt = Builder.CreateInsertElement(NewUndef, ScalarSrc, IdxOp);
+    return new BitCastInst(NewInsElt, IE.getType());
+  }
+
   // If the vector and scalar are both bitcast from the same element type, do
   // the insert in that source type followed by bitcast.
-  Value *VecSrc, *ScalarSrc;
+  Value *VecSrc;
   if (match(VecOp, m_BitCast(m_Value(VecSrc))) &&
       match(ScalarOp, m_BitCast(m_Value(ScalarSrc))) &&
       (VecOp->hasOneUse() || ScalarOp->hasOneUse()) &&
       VecSrc->getType()->isVectorTy() && !ScalarSrc->getType()->isVectorTy() &&
       cast<VectorType>(VecSrc->getType())->getElementType() ==
           ScalarSrc->getType()) {
     // inselt (bitcast VecSrc), (bitcast ScalarSrc), IdxOp -->
     //   bitcast (inselt VecSrc, ScalarSrc, IdxOp)
[226 lines not shown] switch (I->getOpcode()) {
   case Instruction::UIToFP:
   case Instruction::SIToFP:
   case Instruction::FPTrunc:
   case Instruction::FPExt: {
     // It's possible that the mask has a different number of elements from
     // the original cast. We recompute the destination type to match the mask.
     Type *DestTy = VectorType::get(
         I->getType()->getScalarType(),
         cast<VectorType>(NewOps[0]->getType())->getElementCount());
Inline comment (spatel): Safe for scalable vectors?
     assert(NewOps.size() == 1 && "cast with #ops != 1");
     return CastInst::Create(cast<CastInst>(I)->getOpcode(), NewOps[0], DestTy,
                             "", I);
   }
   case Instruction::GetElementPtr: {
     Value *Ptr = NewOps[0];
     ArrayRef<Value*> Idx = NewOps.slice(1);
     GetElementPtrInst *GEP = GetElementPtrInst::Create(
[1,025 lines not shown]

llvm/test/Transforms/InstCombine/bitcast-vec-canon.ll

[64 lines not shown]
 ; CHECK-NEXT: ret double [[TMP0]]
 ;
 entry:
   %0 = bitcast x86_mmx %x to <1 x i64>
   %1 = bitcast <1 x i64> %0 to double
   ret double %1
 }

+; FP source is ok.
+
 define <3 x i64> @bitcast_inselt_undef(double %x, i32 %idx) {
 ; CHECK-LABEL: @bitcast_inselt_undef(
-; CHECK-NEXT: [[XB:%.*]] = bitcast double [[X:%.*]] to i64
-; CHECK-NEXT: [[I:%.*]] = insertelement <3 x i64> undef, i64 [[XB]], i32 [[IDX:%.*]]
+; CHECK-NEXT: [[TMP1:%.*]] = insertelement <3 x double> undef, double [[X:%.*]], i32 [[IDX:%.*]]
+; CHECK-NEXT: [[I:%.*]] = bitcast <3 x double> [[TMP1]] to <3 x i64>
 ; CHECK-NEXT: ret <3 x i64> [[I]]
 ;
   %xb = bitcast double %x to i64
   %i = insertelement <3 x i64> undef, i64 %xb, i32 %idx
   ret <3 x i64> %i
 }

+; Integer source is ok; index is anything.
+
 define <3 x float> @bitcast_inselt_undef_fp(i32 %x, i567 %idx) {
 ; CHECK-LABEL: @bitcast_inselt_undef_fp(
-; CHECK-NEXT: [[XB:%.*]] = bitcast i32 [[X:%.*]] to float
-; CHECK-NEXT: [[I:%.*]] = insertelement <3 x float> undef, float [[XB]], i567 [[IDX:%.*]]
+; CHECK-NEXT: [[TMP1:%.*]] = insertelement <3 x i32> undef, i32 [[X:%.*]], i567 [[IDX:%.*]]
+; CHECK-NEXT: [[I:%.*]] = bitcast <3 x i32> [[TMP1]] to <3 x float>
 ; CHECK-NEXT: ret <3 x float> [[I]]
 ;
   %xb = bitcast i32 %x to float
   %i = insertelement <3 x float> undef, float %xb, i567 %idx
   ret <3 x float> %i
 }

+define <vscale x 3 x float> @bitcast_inselt_undef_vscale(i32 %x, i567 %idx) {
+; CHECK-LABEL: @bitcast_inselt_undef_vscale(
+; CHECK-NEXT: [[TMP1:%.*]] = insertelement <vscale x 3 x i32> undef, i32 [[X:%.*]], i567 [[IDX:%.*]]
+; CHECK-NEXT: [[I:%.*]] = bitcast <vscale x 3 x i32> [[TMP1]] to <vscale x 3 x float>
+; CHECK-NEXT: ret <vscale x 3 x float> [[I]]
+;
+  %xb = bitcast i32 %x to float
+  %i = insertelement <vscale x 3 x float> undef, float %xb, i567 %idx
+  ret <vscale x 3 x float> %i
+}
+
 declare void @use(i64)

+; Negative test - extra use prevents canonicalization
+
 define <3 x i64> @bitcast_inselt_undef_extra_use(double %x, i32 %idx) {
 ; CHECK-LABEL: @bitcast_inselt_undef_extra_use(
 ; CHECK-NEXT: [[XB:%.*]] = bitcast double [[X:%.*]] to i64
 ; CHECK-NEXT: call void @use(i64 [[XB]])
 ; CHECK-NEXT: [[I:%.*]] = insertelement <3 x i64> undef, i64 [[XB]], i32 [[IDX:%.*]]
 ; CHECK-NEXT: ret <3 x i64> [[I]]
 ;
   %xb = bitcast double %x to i64
   call void @use(i64 %xb)
   %i = insertelement <3 x i64> undef, i64 %xb, i32 %idx
   ret <3 x i64> %i
 }

+; Negative test - source type must be scalar
+
 define <3 x i64> @bitcast_inselt_undef_vec_src(<2 x i32> %x, i32 %idx) {
 ; CHECK-LABEL: @bitcast_inselt_undef_vec_src(
 ; CHECK-NEXT: [[XB:%.*]] = bitcast <2 x i32> [[X:%.*]] to i64
 ; CHECK-NEXT: [[I:%.*]] = insertelement <3 x i64> undef, i64 [[XB]], i32 [[IDX:%.*]]
 ; CHECK-NEXT: ret <3 x i64> [[I]]
 ;
   %xb = bitcast <2 x i32> %x to i64
   %i = insertelement <3 x i64> undef, i64 %xb, i32 %idx
   ret <3 x i64> %i
 }

+; Negative test - source type must be scalar
+
 define <3 x i64> @bitcast_inselt_undef_from_mmx(x86_mmx %x, i32 %idx) {
 ; CHECK-LABEL: @bitcast_inselt_undef_from_mmx(
 ; CHECK-NEXT: [[XB:%.*]] = bitcast x86_mmx [[X:%.*]] to i64
 ; CHECK-NEXT: [[I:%.*]] = insertelement <3 x i64> undef, i64 [[XB]], i32 [[IDX:%.*]]
 ; CHECK-NEXT: ret <3 x i64> [[I]]
 ;
   %xb = bitcast x86_mmx %x to i64
   %i = insertelement <3 x i64> undef, i64 %xb, i32 %idx
   ret <3 x i64> %i
 }

+; Reduce number of casts
+
 define <2 x i64> @PR45748(double %x, double %y) {
 ; CHECK-LABEL: @PR45748(
-; CHECK-NEXT: [[XB:%.*]] = bitcast double [[X:%.*]] to i64
-; CHECK-NEXT: [[I0:%.*]] = insertelement <2 x i64> undef, i64 [[XB]], i32 0
-; CHECK-NEXT: [[YB:%.*]] = bitcast double [[Y:%.*]] to i64
-; CHECK-NEXT: [[I1:%.*]] = insertelement <2 x i64> [[I0]], i64 [[YB]], i32 1
+; CHECK-NEXT: [[TMP1:%.*]] = insertelement <2 x double> undef, double [[X:%.*]], i32 0
+; CHECK-NEXT: [[TMP2:%.*]] = insertelement <2 x double> [[TMP1]], double [[Y:%.*]], i32 1
+; CHECK-NEXT: [[I1:%.*]] = bitcast <2 x double> [[TMP2]] to <2 x i64>
 ; CHECK-NEXT: ret <2 x i64> [[I1]]
 ;
   %xb = bitcast double %x to i64
   %i0 = insertelement <2 x i64> undef, i64 %xb, i32 0
   %yb = bitcast double %y to i64
   %i1 = insertelement <2 x i64> %i0, i64 %yb, i32 1
   ret <2 x i64> %i1
 }