This is an archive of the discontinued LLVM Phabricator instance.


Differential D104472

[ValueTracking] look through bitcast of vector in computeKnownBits
Closed · Public

Authored by spatel on Jun 17 2021, 10:03 AM.


Details

Reviewers

RKSimon
lebedev.ri
efriedma

Commits

rG656001e7b2b9: [ValueTracking] look through bitcast of vector in computeKnownBits

Summary

This borrows as much as possible from the SDAG version of the code (originally added with D27129 and since updated with big endian support).

In IR, we can test more easily for correctness than we did in the original patch. I'm using the simplest cases that I could find for InstSimplify - we computeKnownBits on variable shift amounts to see if they are zero or in range, so I used a shuffle to push constant elements into a vector.

The motivating x86 example from https://llvm.org/PR50123 is also here. We computeKnownBits in the caller code, but we only check if the shift amount is in range. That could be enhanced to catch the 2nd x86 test - if the shift amount is known too big, the result is 0.

Alive2 understands the datalayout and agrees that the tests here are correct - example:
https://alive2.llvm.org/ce/z/KZJFMZ
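
As a rough illustration of the range check mentioned in the summary, here is a minimal sketch of how a shift amount might be classified from its known bits. `ToyKnownBits`, `ShiftAmountKind`, and `classifyShiftAmount` are hypothetical names invented for this example; the struct is a 64-bit stand-in and not LLVM's `KnownBits` class, and the code is not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Toy 64-bit stand-in for known-bits information (an assumption made for
// this sketch; LLVM's real class is APInt-based and of arbitrary width).
struct ToyKnownBits {
  uint64_t Zero = 0; // bits known to be 0
  uint64_t One = 0;  // bits known to be 1

  // Smallest possible value: every unknown bit is assumed to be 0.
  uint64_t minValue() const { return One; }
  // Largest possible value: every unknown bit is assumed to be 1.
  uint64_t maxValue() const { return ~Zero; }
};

enum class ShiftAmountKind { InRange, OverShift, Unknown };

// Classify a 64-bit shift amount for a shift on BitWidth-bit lanes.
ShiftAmountKind classifyShiftAmount(const ToyKnownBits &Amt,
                                    unsigned BitWidth) {
  assert((Amt.Zero & Amt.One) == 0 && "conflicting known bits");
  if (Amt.maxValue() < BitWidth)
    return ShiftAmountKind::InRange; // every possible amount is valid
  if (Amt.minValue() >= BitWidth)
    return ShiftAmountKind::OverShift; // every possible amount is too big
  return ShiftAmountKind::Unknown;
}
```

With the low 32 bits masked by 31 and the high 32 bits known zero (the first x86 test), the largest possible amount is 31, so the sketch reports `InRange` for 64-bit lanes. If bit 32 is known to be one (the second x86 test), the smallest possible amount is 2^32, so it reports `OverShift`.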

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

spatel created this revision. Jun 17 2021, 10:03 AM

Herald added subscribers: pengfei, hiraditya, mcrosier. Jun 17 2021, 10:03 AM

spatel requested review of this revision. Jun 17 2021, 10:03 AM

Herald added a project: Restricted Project. Jun 17 2021, 10:03 AM

Would it please be possible to either enhance comments, or port more comments from the SelectionDAG::computeKnownBits()?
It appears to match that original implementation, and seems correct, but it is hard to read/get through.

In D104472#2825125, @lebedev.ri wrote:

Would it please be possible to either enhance comments, or port more comments from the SelectionDAG::computeKnownBits()?
It appears to match that original implementation, and seems correct, but it is hard to read/get through.

I thought I had improved the original code comments, but it might have made it worse.

The tricky part is the calling of computeKnownBits with the shifting demanded elements vector.
The original says this:

// Collect known bits for the (larger) output by collecting the known
// bits from each set of sub elements and shift these into place.
// We need to separately call computeKnownBits for each set of
// sub elements as the knownbits for each is likely to be different.

Is this better or worse?

// Known bits are automatically intersected across demanded
// elements of a vector. So for example, if a bit is computed as 
// known zero, it must be zero across all demanded elements 
// of the vector. 
//
// For this bitcast, each demanded element of the output is 
// sub-divided across a set of smaller vector elements in the 
// source vector. To get the known bits for an entire element 
// of the output, compute the known bits for each sub-element 
// sequentially. This is done by shifting the one-set-bit demanded 
// elements parameter across the sub-elements for consecutive
// calls to computeKnownBits.
//
// The known bits of each sub-element are then extended
// and shifted into place (dependent on endian) to form the
// full result of known bits.
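
The last step of the proposed comment (placing each sub-element's known bits according to endianness) can be sketched for a single wide element as follows. This is a simplified model under stated assumptions: `ToyKnownBits` and `assembleWideElement` are hypothetical names, the widths are limited to 64 bits, and the per-element known bits are passed in directly instead of being obtained from repeated `computeKnownBits` calls.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy 64-bit stand-in for known-bits information (not LLVM's KnownBits).
struct ToyKnownBits {
  uint64_t Zero = 0; // bits known to be 0
  uint64_t One = 0;  // bits known to be 1
};

// Assemble the known bits of one wide element from the known bits of the
// narrow source elements it is bitcast from. SubElts[0] is the
// lowest-numbered source element; each entry must fit in SubBitWidth bits.
ToyKnownBits assembleWideElement(const std::vector<ToyKnownBits> &SubElts,
                                 unsigned SubBitWidth, bool IsLittleEndian) {
  unsigned SubScale = static_cast<unsigned>(SubElts.size());
  assert(SubScale != 0 && SubBitWidth != 0 &&
         SubScale * SubBitWidth <= 64 && "toy model is limited to 64 bits");
  ToyKnownBits Known;
  for (unsigned I = 0; I != SubScale; ++I) {
    // Element 0 supplies the low bits on little-endian targets and the
    // high bits on big-endian targets.
    unsigned ShiftElt = IsLittleEndian ? I : SubScale - 1 - I;
    unsigned Shift = ShiftElt * SubBitWidth;
    Known.Zero |= SubElts[I].Zero << Shift;
    Known.One |= SubElts[I].One << Shift;
  }
  return Known;
}
```

For a `<3 x i8>` source whose element 0 is known zero and whose other elements are unknown, the sketch yields a known-zero mask of 0x0000FF on little-endian and 0xFF0000 on big-endian, which is the kind of endian-dependent difference the i24 tests in shift-knownbits.ll exercise.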

Harbormaster completed remote builds in B109751: Diff 352767. Jun 17 2021, 10:45 PM

Thanks again for looking at this!

llvm/lib/Analysis/ValueTracking.cpp
Line 1217: Can you use APInt::insertBits here?

spatel added inline comments. Jun 18 2021, 4:39 AM

llvm/lib/Analysis/ValueTracking.cpp
Line 1217: Yes - at least that's how I drafted it originally. But I made it match the existing DAG code, so we'd get more testing for both versions once this is in (there seems to be much more fuzz/bot coverage for IR than codegen). How about a TODO that we could update both versions simultaneously?

spatel edited the summary of this revision. Jun 18 2021, 6:08 AM

RKSimon mentioned this in rG7353beda4aa1: [DAG] SelectionDAG::computeKnownBits - use APInt::insertBits to merge subvector…. Jun 18 2021, 7:22 AM

RKSimon added inline comments. Jun 18 2021, 7:39 AM

llvm/lib/Analysis/ValueTracking.cpp
Line 1217: Updated in rG7353beda4aa1 - there's several additional cases where we could use insertBits/extractBits that I'll deal with at another time.

Patch updated - no logic changes from earlier rev, but:

Make code comment more extensive (I'm speculating that this version is better than what I had before. If so, I can update the SDAG comment too; if not, I can paste in the existing comment from SDAG.)
Use KnownBits.insertBits() to reduce code.
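
To illustrate what an insert-bits style helper buys over manual shifting, here is a small sketch on plain 64-bit integers. `toyInsertBits` and `shiftAndOr` are hypothetical helpers written for this example, not LLVM functions, and the exact code that the patch update replaced is not shown in this archive; the shift-and-or form is an assumed stand-in for it.

```cpp
#include <cassert>
#include <cstdint>

// Overwrite SubWidth bits of Value, starting at BitPosition, with the low
// SubWidth bits of SubBits. All other bits of Value are preserved.
uint64_t toyInsertBits(uint64_t Value, uint64_t SubBits, unsigned SubWidth,
                       unsigned BitPosition) {
  assert(SubWidth != 0 && SubWidth <= 64 && BitPosition <= 64 - SubWidth &&
         "field must fit in 64 bits");
  uint64_t FieldMask =
      SubWidth == 64 ? ~uint64_t(0) : ((uint64_t(1) << SubWidth) - 1);
  return (Value & ~(FieldMask << BitPosition)) |
         ((SubBits & FieldMask) << BitPosition);
}

// Manual alternative: only equivalent when the destination field is still
// all zeros and SubBits has no stray high bits.
uint64_t shiftAndOr(uint64_t Value, uint64_t SubBits, unsigned BitPosition) {
  return Value | (SubBits << BitPosition);
}
```

When the destination starts out as zero, both forms agree: inserting the byte 0x17 at bit 16 gives 0x170000 either way. The insert form also clears whatever was in the field beforehand, which the plain OR does not.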

Harbormaster completed remote builds in B109944: Diff 353020. Jun 19 2021, 4:13 AM

No objections from me - anyone else?

LG, thank you.

This revision is now accepted and ready to land. Jun 23 2021, 6:55 AM

This revision was landed with ongoing or failed builds. Jun 23 2021, 8:47 AM

Closed by commit rG656001e7b2b9: [ValueTracking] look through bitcast of vector in computeKnownBits (authored by spatel).

This revision was automatically updated to reflect the committed changes.

spatel added a commit: rG656001e7b2b9: [ValueTracking] look through bitcast of vector in computeKnownBits.

Revision Contents

Path                                                         Size
llvm/lib/Analysis/ValueTracking.cpp                          41 lines
llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll    10 lines
llvm/test/Transforms/InstSimplify/shift-knownbits.ll         84 lines

Diff 353995

llvm/lib/Analysis/ValueTracking.cpp


   case Instruction::BitCast: {
     Type *SrcTy = I->getOperand(0)->getType();
     if (SrcTy->isIntOrPtrTy() &&
         // TODO: For now, not handling conversions like:
         // (bitcast i64 %x to <2 x i32>)
         !I->getType()->isVectorTy()) {
       computeKnownBits(I->getOperand(0), Known, Depth + 1, Q);
       break;
     }
+
+    // Handle cast from vector integer type to scalar or vector integer.
+    auto *SrcVecTy = dyn_cast<FixedVectorType>(SrcTy);
+    if (!SrcVecTy || !SrcVecTy->getElementType()->isIntegerTy() ||
+        !I->getType()->isIntOrIntVectorTy())
+      break;
+
+    // Look through a cast from narrow vector elements to wider type.
+    // Examples: v4i32 -> v2i64, v3i8 -> v24
+    unsigned SubBitWidth = SrcVecTy->getScalarSizeInBits();
+    if (BitWidth % SubBitWidth == 0) {
+      // Known bits are automatically intersected across demanded elements of a
+      // vector. So for example, if a bit is computed as known zero, it must be
+      // zero across all demanded elements of the vector.
+      //
+      // For this bitcast, each demanded element of the output is sub-divided
+      // across a set of smaller vector elements in the source vector. To get
+      // the known bits for an entire element of the output, compute the known
+      // bits for each sub-element sequentially. This is done by shifting the
+      // one-set-bit demanded elements parameter across the sub-elements for
+      // consecutive calls to computeKnownBits. We are using the demanded
+      // elements parameter as a mask operator.
+      //
+      // The known bits of each sub-element are then inserted into place
+      // (dependent on endian) to form the full result of known bits.
+      unsigned NumElts = DemandedElts.getBitWidth();
+      unsigned SubScale = BitWidth / SubBitWidth;
+      APInt SubDemandedElts = APInt::getNullValue(NumElts * SubScale);
+      for (unsigned i = 0; i != NumElts; ++i) {
+        if (DemandedElts[i])
+          SubDemandedElts.setBit(i * SubScale);
+      }
+
+      KnownBits KnownSrc(SubBitWidth);
+      for (unsigned i = 0; i != SubScale; ++i) {
+        computeKnownBits(I->getOperand(0), SubDemandedElts.shl(i), KnownSrc,
+                         Depth + 1, Q);
+        unsigned ShiftElt = Q.DL.isLittleEndian() ? i : SubScale - 1 - i;
+        Known.insertBits(KnownSrc, ShiftElt * SubBitWidth);
+      }
+    }
     break;
   }
   case Instruction::SExt: {
     // Compute the bits in the result that are not present in the input.
     unsigned SrcBitWidth = I->getOperand(0)->getType()->getScalarSizeInBits();
 
     Known = Known.trunc(SrcBitWidth);
     computeKnownBits(I->getOperand(0), Known, Depth + 1, Q);
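
The demanded-elements bookkeeping in the loop that builds `SubDemandedElts` can be easier to follow with concrete masks. The sketch below models only that part, using a plain 64-bit integer as the mask. `subElementQueries` is a hypothetical helper written for this illustration and is not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// For each of the SubScale queries, return the demanded-elements mask over
// the narrow source vector (bit K set means source element K is demanded).
// DemandedElts is the mask over the NumElts wide output elements.
std::vector<uint64_t> subElementQueries(uint64_t DemandedElts,
                                        unsigned NumElts, unsigned SubScale) {
  assert(SubScale != 0 && NumElts * SubScale <= 64 &&
         "toy model uses a 64-bit mask");
  // Mark the first sub-element of every demanded wide element.
  uint64_t SubDemandedElts = 0;
  for (unsigned I = 0; I != NumElts; ++I)
    if ((DemandedElts >> I) & 1)
      SubDemandedElts |= uint64_t(1) << (I * SubScale);

  // Shifting the mask left by I selects the I-th sub-element of every
  // demanded wide element at once.
  std::vector<uint64_t> Queries;
  for (unsigned I = 0; I != SubScale; ++I)
    Queries.push_back(SubDemandedElts << I);
  return Queries;
}
```

For a `<9 x i8>` to `<3 x i24>` bitcast with all three output elements demanded, the three queries select source elements {0, 3, 6}, then {1, 4, 7}, then {2, 5, 8}. Because known bits are intersected across the demanded elements of each query, a bit is reported as known only if it is known in the same position of every demanded output element.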

llvm/test/Transforms/InstCombine/X86/x86-vector-shifts.ll

 ; CHECK-NEXT: [[TMP3:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP2]]
 ; CHECK-NEXT: ret <2 x i64> [[TMP3]]
 ;
   %1 = and <2 x i64> %a, <i64 63, i64 undef>
   %2 = tail call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> %v, <2 x i64> %1)
   ret <2 x i64> %2
 }
 
+; The shift amount is in range (masked with 31 and high 32-bits are zero),
+; so convert to standard IR - https://llvm.org/PR50123
+
 define <2 x i64> @sse2_psll_q_128_masked_bitcast(<2 x i64> %v, <2 x i64> %a) {
 ; CHECK-LABEL: @sse2_psll_q_128_masked_bitcast(
 ; CHECK-NEXT: [[B:%.*]] = bitcast <2 x i64> [[A:%.*]] to <4 x i32>
 ; CHECK-NEXT: [[M:%.*]] = and <4 x i32> [[B]], <i32 31, i32 poison, i32 poison, i32 poison>
 ; CHECK-NEXT: [[I:%.*]] = insertelement <4 x i32> [[M]], i32 0, i32 1
 ; CHECK-NEXT: [[SHAMT:%.*]] = bitcast <4 x i32> [[I]] to <2 x i64>
-; CHECK-NEXT: [[R:%.*]] = tail call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> [[V:%.*]], <2 x i64> [[SHAMT]])
-; CHECK-NEXT: ret <2 x i64> [[R]]
+; CHECK-NEXT: [[TMP1:%.*]] = shufflevector <2 x i64> [[SHAMT]], <2 x i64> poison, <2 x i32> zeroinitializer
+; CHECK-NEXT: [[TMP2:%.*]] = shl <2 x i64> [[V:%.*]], [[TMP1]]
+; CHECK-NEXT: ret <2 x i64> [[TMP2]]
 ;
   %b = bitcast <2 x i64> %a to <4 x i32>
   %m = and <4 x i32> %b, <i32 31, i32 poison, i32 poison, i32 poison>
   %i = insertelement <4 x i32> %m, i32 0, i32 1
   %shamt = bitcast <4 x i32> %i to <2 x i64>
   %r = tail call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> %v, <2 x i64> %shamt) #2
   ret <2 x i64> %r
 }
 
+; TODO: This could be recognized as an over-shift.
+
 define <2 x i64> @sse2_psll_q_128_masked_bitcast_overshift(<2 x i64> %v, <2 x i64> %a) {
 ; CHECK-LABEL: @sse2_psll_q_128_masked_bitcast_overshift(
 ; CHECK-NEXT: [[B:%.*]] = bitcast <2 x i64> [[A:%.*]] to <4 x i32>
 ; CHECK-NEXT: [[M:%.*]] = and <4 x i32> [[B]], <i32 31, i32 poison, i32 poison, i32 poison>
 ; CHECK-NEXT: [[I:%.*]] = insertelement <4 x i32> [[M]], i32 1, i32 1
 ; CHECK-NEXT: [[SHAMT:%.*]] = bitcast <4 x i32> [[I]] to <2 x i64>
 ; CHECK-NEXT: [[R:%.*]] = tail call <2 x i64> @llvm.x86.sse2.psll.q(<2 x i64> [[V:%.*]], <2 x i64> [[SHAMT]])
 ; CHECK-NEXT: ret <2 x i64> [[R]]

llvm/test/Transforms/InstSimplify/shift-knownbits.ll

 ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
-; RUN: opt < %s -instsimplify -S -data-layout="E" | FileCheck %s
+; RUN: opt < %s -instsimplify -S -data-layout="E" | FileCheck %s --check-prefixes=CHECK,BIGENDIAN
+; RUN: opt < %s -instsimplify -S -data-layout="e" | FileCheck %s --check-prefixes=CHECK,LITTLEENDIAN
 
 ; If any bits of the shift amount are known to make it exceed or equal
 ; the number of bits in the type, the shift causes undefined behavior.
 
 define i32 @shl_amount_is_known_bogus(i32 %a, i32 %b) {
 ; CHECK-LABEL: @shl_amount_is_known_bogus(
 ; CHECK-NEXT: ret i32 poison
 ;
[...]
 ; CHECK-NEXT: ret i8 [[EX]]
 ;
   %ct = call <2 x i8> @llvm.cttz.v2i8(<2 x i8> %x, i1 true)
   %sh = lshr <2 x i8> %ct, <i8 3, i8 0>
   %ex = extractelement <2 x i8> %sh, i32 0
   ret i8 %ex
 }
 
+; The shift amount is 0 on either of high/low bytes. The middle byte doesn't matter.
+
 define i24 @bitcast_noshift_scalar(<3 x i8> %v1, i24 %v2) {
 ; CHECK-LABEL: @bitcast_noshift_scalar(
-; CHECK-NEXT: [[S:%.*]] = shufflevector <3 x i8> [[V1:%.*]], <3 x i8> <i8 0, i8 poison, i8 poison>, <3 x i32> <i32 3, i32 1, i32 3>
-; CHECK-NEXT: [[B:%.*]] = bitcast <3 x i8> [[S]] to i24
-; CHECK-NEXT: [[R:%.*]] = shl i24 [[V2:%.*]], [[B]]
-; CHECK-NEXT: ret i24 [[R]]
+; CHECK-NEXT: ret i24 [[V2:%.*]]
 ;
   %c = insertelement <3 x i8> poison, i8 0, i64 0
   %s = shufflevector <3 x i8> %v1, <3 x i8> %c, <3 x i32> <i32 3, i32 1, i32 3>
   %b = bitcast <3 x i8> %s to i24
   %r = shl i24 %v2, %b
   ret i24 %r
 }
 
+; The shift amount is 0 on low byte of big-endian and unknown on little-endian.
+
 define i24 @bitcast_noshift_scalar_bigend(<3 x i8> %v1, i24 %v2) {
-; CHECK-LABEL: @bitcast_noshift_scalar_bigend(
-; CHECK-NEXT: [[S:%.*]] = shufflevector <3 x i8> [[V1:%.*]], <3 x i8> <i8 0, i8 poison, i8 poison>, <3 x i32> <i32 0, i32 1, i32 3>
-; CHECK-NEXT: [[B:%.*]] = bitcast <3 x i8> [[S]] to i24
-; CHECK-NEXT: [[R:%.*]] = shl i24 [[V2:%.*]], [[B]]
-; CHECK-NEXT: ret i24 [[R]]
+; BIGENDIAN-LABEL: @bitcast_noshift_scalar_bigend(
+; BIGENDIAN-NEXT: ret i24 [[V2:%.*]]
+;
+; LITTLEENDIAN-LABEL: @bitcast_noshift_scalar_bigend(
+; LITTLEENDIAN-NEXT: [[S:%.*]] = shufflevector <3 x i8> [[V1:%.*]], <3 x i8> <i8 0, i8 poison, i8 poison>, <3 x i32> <i32 0, i32 1, i32 3>
+; LITTLEENDIAN-NEXT: [[B:%.*]] = bitcast <3 x i8> [[S]] to i24
+; LITTLEENDIAN-NEXT: [[R:%.*]] = shl i24 [[V2:%.*]], [[B]]
+; LITTLEENDIAN-NEXT: ret i24 [[R]]
 ;
   %c = insertelement <3 x i8> poison, i8 0, i64 0
   %s = shufflevector <3 x i8> %v1, <3 x i8> %c, <3 x i32> <i32 0, i32 1, i32 3>
   %b = bitcast <3 x i8> %s to i24
   %r = shl i24 %v2, %b
   ret i24 %r
 }
 
+; The shift amount is 0 on low byte of little-endian and unknown on big-endian.
+
 define i24 @bitcast_noshift_scalar_littleend(<3 x i8> %v1, i24 %v2) {
-; CHECK-LABEL: @bitcast_noshift_scalar_littleend(
-; CHECK-NEXT: [[S:%.*]] = shufflevector <3 x i8> [[V1:%.*]], <3 x i8> <i8 0, i8 poison, i8 poison>, <3 x i32> <i32 3, i32 1, i32 2>
-; CHECK-NEXT: [[B:%.*]] = bitcast <3 x i8> [[S]] to i24
-; CHECK-NEXT: [[R:%.*]] = shl i24 [[V2:%.*]], [[B]]
-; CHECK-NEXT: ret i24 [[R]]
+; BIGENDIAN-LABEL: @bitcast_noshift_scalar_littleend(
+; BIGENDIAN-NEXT: [[S:%.*]] = shufflevector <3 x i8> [[V1:%.*]], <3 x i8> <i8 0, i8 poison, i8 poison>, <3 x i32> <i32 3, i32 1, i32 2>
+; BIGENDIAN-NEXT: [[B:%.*]] = bitcast <3 x i8> [[S]] to i24
+; BIGENDIAN-NEXT: [[R:%.*]] = shl i24 [[V2:%.*]], [[B]]
+; BIGENDIAN-NEXT: ret i24 [[R]]
+;
+; LITTLEENDIAN-LABEL: @bitcast_noshift_scalar_littleend(
+; LITTLEENDIAN-NEXT: ret i24 [[V2:%.*]]
 ;
   %c = insertelement <3 x i8> poison, i8 0, i64 0
   %s = shufflevector <3 x i8> %v1, <3 x i8> %c, <3 x i32> <i32 3, i32 1, i32 2>
   %b = bitcast <3 x i8> %s to i24
   %r = shl i24 %v2, %b
   ret i24 %r
 }
 
+; The shift amount is known 24 on little-endian and known 24<<16 on big-endian
+; across all vector elements, so it's an overshift either way.
+
 define <3 x i24> @bitcast_overshift_vector(<9 x i8> %v1, <3 x i24> %v2) {
 ; CHECK-LABEL: @bitcast_overshift_vector(
-; CHECK-NEXT: [[S:%.*]] = shufflevector <9 x i8> [[V1:%.*]], <9 x i8> <i8 24, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <9 x i32> <i32 9, i32 1, i32 2, i32 9, i32 4, i32 5, i32 9, i32 7, i32 8>
-; CHECK-NEXT: [[B:%.*]] = bitcast <9 x i8> [[S]] to <3 x i24>
-; CHECK-NEXT: [[R:%.*]] = shl <3 x i24> [[V2:%.*]], [[B]]
-; CHECK-NEXT: ret <3 x i24> [[R]]
+; CHECK-NEXT: ret <3 x i24> poison
 ;
   %c = insertelement <9 x i8> poison, i8 24, i64 0
   %s = shufflevector <9 x i8> %v1, <9 x i8> %c, <9 x i32> <i32 9, i32 1, i32 2, i32 9, i32 4, i32 5, i32 9, i32 7, i32 8>
   %b = bitcast <9 x i8> %s to <3 x i24>
   %r = shl <3 x i24> %v2, %b
   ret <3 x i24> %r
 }
 
+; The shift amount is known 23 on little-endian and known 23<<16 on big-endian
+; across all vector elements, so it's an overshift for big-endian.
+
 define <3 x i24> @bitcast_overshift_vector_bigend(<9 x i8> %v1, <3 x i24> %v2) {
-; CHECK-LABEL: @bitcast_overshift_vector_bigend(
-; CHECK-NEXT: [[S:%.*]] = shufflevector <9 x i8> [[V1:%.*]], <9 x i8> <i8 23, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <9 x i32> <i32 9, i32 1, i32 2, i32 9, i32 4, i32 5, i32 9, i32 7, i32 8>
-; CHECK-NEXT: [[B:%.*]] = bitcast <9 x i8> [[S]] to <3 x i24>
-; CHECK-NEXT: [[R:%.*]] = shl <3 x i24> [[V2:%.*]], [[B]]
-; CHECK-NEXT: ret <3 x i24> [[R]]
+; BIGENDIAN-LABEL: @bitcast_overshift_vector_bigend(
+; BIGENDIAN-NEXT: ret <3 x i24> poison
+;
+; LITTLEENDIAN-LABEL: @bitcast_overshift_vector_bigend(
+; LITTLEENDIAN-NEXT: [[S:%.*]] = shufflevector <9 x i8> [[V1:%.*]], <9 x i8> <i8 23, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <9 x i32> <i32 9, i32 1, i32 2, i32 9, i32 4, i32 5, i32 9, i32 7, i32 8>
+; LITTLEENDIAN-NEXT: [[B:%.*]] = bitcast <9 x i8> [[S]] to <3 x i24>
+; LITTLEENDIAN-NEXT: [[R:%.*]] = shl <3 x i24> [[V2:%.*]], [[B]]
+; LITTLEENDIAN-NEXT: ret <3 x i24> [[R]]
 ;
   %c = insertelement <9 x i8> poison, i8 23, i64 0
   %s = shufflevector <9 x i8> %v1, <9 x i8> %c, <9 x i32> <i32 9, i32 1, i32 2, i32 9, i32 4, i32 5, i32 9, i32 7, i32 8>
   %b = bitcast <9 x i8> %s to <3 x i24>
   %r = shl <3 x i24> %v2, %b
   ret <3 x i24> %r
 }
 
+; The shift amount is known 23 on big-endian and known 23<<16 on little-endian
+; across all vector elements, so it's an overshift for little-endian.
+
 define <3 x i24> @bitcast_overshift_vector_littleend(<9 x i8> %v1, <3 x i24> %v2) {
-; CHECK-LABEL: @bitcast_overshift_vector_littleend(
-; CHECK-NEXT: [[S:%.*]] = shufflevector <9 x i8> [[V1:%.*]], <9 x i8> <i8 23, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <9 x i32> <i32 0, i32 1, i32 9, i32 3, i32 4, i32 9, i32 6, i32 7, i32 9>
-; CHECK-NEXT: [[B:%.*]] = bitcast <9 x i8> [[S]] to <3 x i24>
-; CHECK-NEXT: [[R:%.*]] = shl <3 x i24> [[V2:%.*]], [[B]]
-; CHECK-NEXT: ret <3 x i24> [[R]]
+; BIGENDIAN-LABEL: @bitcast_overshift_vector_littleend(
+; BIGENDIAN-NEXT: [[S:%.*]] = shufflevector <9 x i8> [[V1:%.*]], <9 x i8> <i8 23, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <9 x i32> <i32 0, i32 1, i32 9, i32 3, i32 4, i32 9, i32 6, i32 7, i32 9>
+; BIGENDIAN-NEXT: [[B:%.*]] = bitcast <9 x i8> [[S]] to <3 x i24>
+; BIGENDIAN-NEXT: [[R:%.*]] = shl <3 x i24> [[V2:%.*]], [[B]]
+; BIGENDIAN-NEXT: ret <3 x i24> [[R]]
+;
+; LITTLEENDIAN-LABEL: @bitcast_overshift_vector_littleend(
+; LITTLEENDIAN-NEXT: ret <3 x i24> poison
 ;
   %c = insertelement <9 x i8> poison, i8 23, i64 0
   %s = shufflevector <9 x i8> %v1, <9 x i8> %c, <9 x i32> <i32 0, i32 1, i32 9, i32 3, i32 4, i32 9, i32 6, i32 7, i32 9>
   %b = bitcast <9 x i8> %s to <3 x i24>
   %r = shl <3 x i24> %v2, %b
   ret <3 x i24> %r
 }
 
+; Negative test - the shift amount is known 24 or 24<<16 on only 2 out of 3 elements.
+
 define <3 x i24> @bitcast_partial_overshift_vector(<9 x i8> %v1, <3 x i24> %v2) {
 ; CHECK-LABEL: @bitcast_partial_overshift_vector(
 ; CHECK-NEXT: [[S:%.*]] = shufflevector <9 x i8> [[V1:%.*]], <9 x i8> <i8 24, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison, i8 poison>, <9 x i32> <i32 9, i32 1, i32 2, i32 9, i32 4, i32 5, i32 6, i32 7, i32 8>
 ; CHECK-NEXT: [[B:%.*]] = bitcast <9 x i8> [[S]] to <3 x i24>
 ; CHECK-NEXT: [[R:%.*]] = shl <3 x i24> [[V2:%.*]], [[B]]
 ; CHECK-NEXT: ret <3 x i24> [[R]]
 ;
   %c = insertelement <9 x i8> poison, i8 24, i64 0
   %s = shufflevector <9 x i8> %v1, <9 x i8> %c, <9 x i32> <i32 9, i32 1, i32 2, i32 9, i32 4, i32 5, i32 6, i32 7, i32 8>
   %b = bitcast <9 x i8> %s to <3 x i24>
   %r = shl <3 x i24> %v2, %b
   ret <3 x i24> %r
 }
 
+; Negative test - don't know how to look through a cast with non-integer type (but we could handle this...).
+
 define <1 x i64> @bitcast_noshift_vector_wrong_type(<2 x float> %v1, <1 x i64> %v2) {
 ; CHECK-LABEL: @bitcast_noshift_vector_wrong_type(
 ; CHECK-NEXT: [[S:%.*]] = shufflevector <2 x float> [[V1:%.*]], <2 x float> <float 0.000000e+00, float poison>, <2 x i32> <i32 2, i32 1>
 ; CHECK-NEXT: [[B:%.*]] = bitcast <2 x float> [[S]] to <1 x i64>
 ; CHECK-NEXT: [[R:%.*]] = shl <1 x i64> [[V2:%.*]], [[B]]
 ; CHECK-NEXT: ret <1 x i64> [[R]]
 ;
   %c = insertelement <2 x float> poison, float 0.0, i64 0
   %s = shufflevector <2 x float> %v1, <2 x float> %c, <2 x i32> <i32 2, i32 1>
   %b = bitcast <2 x float> %s to <1 x i64>
   %r = shl <1 x i64> %v2, %b
   ret <1 x i64> %r
 }