This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contents

- llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
- llvm/test/Transforms/InstCombine/logical-select.ll

Differential D124997

[InstCombine] Fix scalable-vector bitwise select matching
ClosedPublic

Authored by frasercrmck on May 5 2022, 3:23 AM.

Details

Reviewers

spatel
lebedev.ri
dmgreen
craig.topper

Commits

rGbafab9c09f68: [InstCombine] Fix scalable-vector bitwise select matching

Summary

D113035 enhanced the matching of bitwise selects from vector types. This
change unfortunately introduced crashes as it tries to cast scalable
vector types to integers.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

frasercrmck created this revision. May 5 2022, 3:23 AM

Herald added a project: Restricted Project. May 5 2022, 3:23 AM

Herald added subscribers: StephenFan, hiraditya.

frasercrmck requested review of this revision. May 5 2022, 3:23 AM

Herald added subscribers: llvm-commits, alextsao1999.

Harbormaster completed remote builds in B162866: Diff 427253. May 5 2022, 4:07 AM

Do we have a test for scalable vectors where the remaining part of the transform is valid?

In D124997#3493929, @spatel wrote:
> Do we have a test for scalable vectors where the remaining part of the transform is valid?

Ah, the 1st test covers that. But then couldn't we just make a one-line change to avoid the problematic part?

if (auto *VecTy = dyn_cast<FixedVectorType>(Cond->getType())) {

In D124997#3493954, @spatel wrote:
> In D124997#3493929, @spatel wrote:
>> Do we have a test for scalable vectors where the remaining part of the transform is valid?
>
> Ah, the 1st test covers that. But then couldn't we just make a one-line change to avoid the problematic part?
> if (auto *VecTy = dyn_cast<FixedVectorType>(Cond->getType())) {

Aye, that's what the 1st test is for.

I was under the impression that if Cond->getType() is any vector type we must either be able to do the bitcast or bail. So wouldn't we just need the other checks I've added in an else?

In D124997#3493988, @frasercrmck wrote:
> I was under the impression that if Cond->getType() is any vector type we must either be able to do the bitcast or bail. So wouldn't we just need the other checks I've added in an else?

We can still have a vector bitcast where the cast is to a type that matches the number of elements in the vector condition. That's the 2nd test IIUC, so if we just do the dyn_cast, it becomes:

define <vscale x 1 x i64> @vec_of_casted_bools_scalable(<vscale x 1 x i64> %a, <vscale x 1 x i64> %b, <vscale x 8 x i1> %cond) {
  %1 = bitcast <vscale x 1 x i64> %a to <vscale x 8 x i8>
  %2 = bitcast <vscale x 1 x i64> %b to <vscale x 8 x i8>
  %3 = select <vscale x 8 x i1> %cond, <vscale x 8 x i8> %1, <vscale x 8 x i8> %2
  %4 = bitcast <vscale x 8 x i8> %3 to <vscale x 1 x i64>
  ret <vscale x 1 x i64> %4
}

address comments

In D124997#3494017, @spatel wrote:
> We can still have a vector bitcast where the cast is to a type that matches the number of elements in the vector condition. That's the 2nd test IIUC, so if we just do the dyn_cast, it becomes:
> define <vscale x 1 x i64> @vec_of_casted_bools_scalable(<vscale x 1 x i64> %a, <vscale x 1 x i64> %b, <vscale x 8 x i1> %cond) {
>   %1 = bitcast <vscale x 1 x i64> %a to <vscale x 8 x i8>
>   %2 = bitcast <vscale x 1 x i64> %b to <vscale x 8 x i8>
>   %3 = select <vscale x 8 x i1> %cond, <vscale x 8 x i8> %1, <vscale x 8 x i8> %2
>   %4 = bitcast <vscale x 8 x i8> %3 to <vscale x 1 x i64>
>   ret <vscale x 1 x i64> %4
> }

Oh, right you are! Sorry, I had the wrong impression of how this worked. Thank you.

spatel mentioned this in rG7bad1d281c79: [InstCombine] add scalable vector test for logical select; NFC. May 5 2022, 9:46 AM

Harbormaster completed remote builds in B162931: Diff 427338. May 5 2022, 9:50 AM

In D124997#3494120, @frasercrmck wrote:
> Oh, right you are! Sorry, I had the wrong impression of how this worked. Thank you.

The code comments could be improved. There are many different potential patterns within this block, and scalable vectors just make it harder to keep it all straight. :)
But after mucking around in here for a long time, I don't think the fix is sufficient.
I added a test with 7bad1d281c798929a to try to uncover another potential bug.
But I can't find a case currently where we would go wrong because code before here (ComputeNumSignBits) prevents matching a scalable vector with the right combination of bitcasts to trigger that bug.

A better fix should do something like this:

// If this is a vector, we may need to cast to match the condition's length.
Type *SelTy = A->getType();
if (auto *VecTy = dyn_cast<VectorType>(Cond->getType())) {
  // For a fixed or scalable vector get N from <{vscale x} N x iM>
  unsigned Elts = VecTy->getElementCount().getKnownMinValue();
  // For a fixed or scalable vector, get the value N x iM; for a scalar this is just M.
  unsigned SelEltSize = SelTy->getPrimitiveSizeInBits().getKnownMinSize();
  Type *EltTy = Builder.getIntNTy(SelEltSize / Elts);
  SelTy = VectorType::get(EltTy, VecTy->getElementCount());
}

I had a hard time making sense of the type and size APIs, so if anyone knows that better, please correct/improve if possible.
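
The element-size arithmetic in the snippet above can be checked in isolation. The following is a minimal sketch in plain C++ that models only the arithmetic; `ToyVectorType` and `selectTypeFor` are hypothetical names invented for this illustration and are not LLVM classes or APIs. It assumes, as the snippet does, that the known-minimum bit size of the select operands divides evenly by the known-minimum element count of the condition.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for illustration only; this is NOT llvm::VectorType.
// A scalar iM is modelled as a fixed "vector" with MinElts == 1.
struct ToyVectorType {
  unsigned MinElts; // N in <{vscale x} N x iM>
  unsigned EltBits; // M in <{vscale x} N x iM>
  bool Scalable;    // true for <vscale x N x iM>

  // Known-minimum total size in bits, i.e. N * M. For a scalable type the
  // real size is this value multiplied by vscale at run time.
  unsigned knownMinBits() const { return MinElts * EltBits; }

  // Render the type in LLVM IR-like syntax, e.g. "<vscale x 8 x i8>".
  std::string str() const {
    return std::string("<") + (Scalable ? "vscale x " : "") +
           std::to_string(MinElts) + " x i" + std::to_string(EltBits) + ">";
  }
};

// Pick the type the select operands are bitcast to so that the select has
// one lane per condition element: same element count as the condition, with
// an element width of (known-min operand bits) / (known-min condition lanes).
ToyVectorType selectTypeFor(const ToyVectorType &OperandTy,
                            const ToyVectorType &CondTy) {
  assert(CondTy.MinElts != 0 && "condition must have at least one lane");
  assert(OperandTy.knownMinBits() % CondTy.MinElts == 0 &&
         "operand size must divide evenly across the condition lanes");
  unsigned EltBits = OperandTy.knownMinBits() / CondTy.MinElts;
  return {CondTy.MinElts, EltBits, CondTy.Scalable};
}
```

For the scalable test discussed in this review, operands of type <vscale x 1 x i64> with a <vscale x 8 x i1> condition give 64 / 8 = 8 bits per lane, so the model returns <vscale x 8 x i8>. For the fixed-width test, an i4 operand with a <4 x i1> condition gives <4 x i1>. Both divisions use known-minimum values only, so vscale never has to be known.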

fix and enable bitcasts for scalable vectors

In D124997#3494745, @spatel wrote:
> The code comments could be improved. There are many different potential patterns within this block, and scalable vectors just make it harder to keep it all straight. :)
> But after mucking around in here for a long time, I don't think the fix is sufficient.
> I added a test with 7bad1d281c798929a to try to uncover another potential bug.
> But I can't find a case currently where we would go wrong because code before here (ComputeNumSignBits) prevents matching a scalable vector with the right combination of bitcasts to trigger that bug.
>
> A better fix should do something like this:
> // If this is a vector, we may need to cast to match the condition's length.
> Type *SelTy = A->getType();
> if (auto *VecTy = dyn_cast<VectorType>(Cond->getType())) {
>   // For a fixed or scalable vector get N from <{vscale x} N x iM>
>   unsigned Elts = VecTy->getElementCount().getKnownMinValue();
>   // For a fixed or scalable vector, get the value N x iM; for a scalar this is just M.
>   unsigned SelEltSize = SelTy->getPrimitiveSizeInBits().getKnownMinSize();
>   Type *EltTy = Builder.getIntNTy(SelEltSize / Elts);
>   SelTy = VectorType::get(EltTy, VecTy->getElementCount());
> }
> I had a hard time making sense of the type and size APIs, so if anyone knows that better, please correct/improve if possible.

Thanks for digging in. I also found it very difficult to get scalable-vector tests which would actually trigger all possible cases we're trying to handle.

I think what you've got looks okay. I adjusted the comments a little.

Harbormaster completed remote builds in B163092: Diff 427564. May 6 2022, 3:20 AM

LGTM

This revision is now accepted and ready to land. May 6 2022, 4:23 AM

frasercrmck edited the summary of this revision. May 6 2022, 4:55 AM

This revision was landed with ongoing or failed builds. May 6 2022, 5:12 AM

Closed by commit rGbafab9c09f68: [InstCombine] Fix scalable-vector bitwise select matching (authored by frasercrmck).

This revision was automatically updated to reflect the committed changes.

frasercrmck added a commit: rGbafab9c09f68: [InstCombine] Fix scalable-vector bitwise select matching.

Revision Contents

Path                                                       Size
llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp    6 lines
llvm/test/Transforms/InstCombine/logical-select.ll         31 lines

Diff 427602

llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp

 Value *InstCombinerImpl::matchSelectFromAndOr(Value *A, Value *C, Value *B,
   B = peekThroughBitcast(B, true);
   if (Value *Cond = getSelectCondition(A, B)) {
     // ((bc Cond) & C) | ((bc ~Cond) & D) --> bc (select Cond, (bc C), (bc D))
     // If this is a vector, we may need to cast to match the condition's length.
     // The bitcasts will either all exist or all not exist. The builder will
     // not create unnecessary casts if the types already match.
     Type *SelTy = A->getType();
     if (auto *VecTy = dyn_cast<VectorType>(Cond->getType())) {
+      // For a fixed or scalable vector get N from <{vscale x} N x iM>
       unsigned Elts = VecTy->getElementCount().getKnownMinValue();
-      Type *EltTy = Builder.getIntNTy(SelTy->getPrimitiveSizeInBits() / Elts);
+      // For a fixed or scalable vector, get the size in bits of N x iM; for a
+      // scalar this is just M.
+      unsigned SelEltSize = SelTy->getPrimitiveSizeInBits().getKnownMinSize();
+      Type *EltTy = Builder.getIntNTy(SelEltSize / Elts);
       SelTy = VectorType::get(EltTy, VecTy->getElementCount());
     }
     Value *BitcastC = Builder.CreateBitCast(C, SelTy);
     Value *BitcastD = Builder.CreateBitCast(D, SelTy);
     Value *Select = Builder.CreateSelect(Cond, BitcastC, BitcastD);
     return Builder.CreateBitCast(Select, OrigType);
   }

llvm/test/Transforms/InstCombine/logical-select.ll

 ;
   %not = xor <4 x i1> %c, <i1 true, i1 true, i1 true, i1 true>
   %and1 = and <4 x i1> %not, %a
   %and2 = and <4 x i1> %b, %c
   %or = or <4 x i1> %and2, %and1
   ret <4 x i1> %or
 }

+define <vscale x 1 x i1> @vec_of_bools_scalable(<vscale x 1 x i1> %a, <vscale x 1 x i1> %c, <vscale x 1 x i1> %d) {
+; CHECK-LABEL: @vec_of_bools_scalable(
+; CHECK-NEXT: [[TMP1:%.*]] = select <vscale x 1 x i1> [[A:%.*]], <vscale x 1 x i1> [[C:%.*]], <vscale x 1 x i1> [[D:%.*]]
+; CHECK-NEXT: ret <vscale x 1 x i1> [[TMP1]]
+;
+  %b = xor <vscale x 1 x i1> %a, shufflevector (<vscale x 1 x i1> insertelement (<vscale x 1 x i1> poison, i1 true, i32 0), <vscale x 1 x i1> poison, <vscale x 1 x i32> zeroinitializer)
+  %t11 = and <vscale x 1 x i1> %a, %c
+  %t12 = and <vscale x 1 x i1> %b, %d
+  %r = or <vscale x 1 x i1> %t11, %t12
+  ret <vscale x 1 x i1> %r
+}
+
 define i4 @vec_of_casted_bools(i4 %a, i4 %b, <4 x i1> %c) {
 ; CHECK-LABEL: @vec_of_casted_bools(
 ; CHECK-NEXT: [[TMP1:%.*]] = bitcast i4 [[B:%.*]] to <4 x i1>
 ; CHECK-NEXT: [[TMP2:%.*]] = bitcast i4 [[A:%.*]] to <4 x i1>
 ; CHECK-NEXT: [[TMP3:%.*]] = select <4 x i1> [[C:%.*]], <4 x i1> [[TMP1]], <4 x i1> [[TMP2]]
 ; CHECK-NEXT: [[TMP4:%.*]] = bitcast <4 x i1> [[TMP3]] to i4
 ; CHECK-NEXT: ret i4 [[TMP4]]
 ;
   %not = xor <4 x i1> %c, <i1 true, i1 true, i1 true, i1 true>
   %bc1 = bitcast <4 x i1> %not to i4
   %bc2 = bitcast <4 x i1> %c to i4
   %and1 = and i4 %a, %bc1
   %and2 = and i4 %bc2, %b
   %or = or i4 %and1, %and2
   ret i4 %or
 }

+define <vscale x 1 x i64> @vec_of_casted_bools_scalable(<vscale x 1 x i64> %a, <vscale x 1 x i64> %b, <vscale x 8 x i1> %cond) {
+; CHECK-LABEL: @vec_of_casted_bools_scalable(
+; CHECK-NEXT: [[TMP1:%.*]] = bitcast <vscale x 1 x i64> [[A:%.*]] to <vscale x 8 x i8>
+; CHECK-NEXT: [[TMP2:%.*]] = bitcast <vscale x 1 x i64> [[B:%.*]] to <vscale x 8 x i8>
+; CHECK-NEXT: [[TMP3:%.*]] = select <vscale x 8 x i1> [[COND:%.*]], <vscale x 8 x i8> [[TMP1]], <vscale x 8 x i8> [[TMP2]]
+; CHECK-NEXT: [[TMP4:%.*]] = bitcast <vscale x 8 x i8> [[TMP3]] to <vscale x 1 x i64>
+; CHECK-NEXT: ret <vscale x 1 x i64> [[TMP4]]
+;
+  %scond = sext <vscale x 8 x i1> %cond to <vscale x 8 x i8>
+  %notcond = xor <vscale x 8 x i1> %cond, shufflevector (<vscale x 8 x i1> insertelement (<vscale x 8 x i1> poison, i1 true, i32 0), <vscale x 8 x i1> poison, <vscale x 8 x i32> zeroinitializer)
+  %snotcond = sext <vscale x 8 x i1> %notcond to <vscale x 8 x i8>
+  %bc1 = bitcast <vscale x 8 x i8> %scond to <vscale x 1 x i64>
+  %bc2 = bitcast <vscale x 8 x i8> %snotcond to <vscale x 1 x i64>
+  %and1 = and <vscale x 1 x i64> %a, %bc1
+  %and2 = and <vscale x 1 x i64> %bc2, %b
+  %or = or <vscale x 1 x i64> %and1, %and2
+  ret <vscale x 1 x i64> %or
+}
+
 ; Inverted 'and' constants mean this is a select which is canonicalized to a shuffle.

 define <4 x i32> @vec_sel_consts(<4 x i32> %a, <4 x i32> %b) {
 ; CHECK-LABEL: @vec_sel_consts(
 ; CHECK-NEXT: [[TMP1:%.*]] = shufflevector <4 x i32> [[A:%.*]], <4 x i32> [[B:%.*]], <4 x i32> <i32 0, i32 5, i32 6, i32 3>
 ; CHECK-NEXT: ret <4 x i32> [[TMP1]]
 ;
   %and1 = and <4 x i32> %a, <i32 -1, i32 0, i32 0, i32 -1>