This is an archive of the discontinued LLVM Phabricator instance.

Allow scalable vectors in computeKnownBits
ClosedPublic

Authored by reames on Oct 21 2022, 10:48 AM.


Details

Reviewers

craig.topper
asb
frasercrmck
RKSimon
efriedma
paulwalker-arm
david-arm

Commits

rG087bb0f1fe1c: Allow scalable vectors in computeKnownBits

Summary

This extends the computeKnownBits analysis to support scalable vectors. The critical detail is in deciding how to represent the demanded elements of a vector whose length is unknown at compile time.

For this patch, I adopt the convention that we track one bit which corresponds to all lanes. That is, that bit is implicitly broadcast to all lanes of the scalable vector resulting in all lanes being demanded. This is the same convention we use in getSplatValue in SelectionDAG.
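The convention can be illustrated with a minimal standalone sketch. It does not use LLVM's real types: ToyVectorType, DemandedMask and initialDemandedElts are hypothetical names chosen for illustration, and a std::vector<bool> stands in for the APInt mask. The logic mirrors the shape of the wrapper in this patch, where fixed vectors get one bit per lane and everything else, including scalable vectors, gets a single set bit.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an LLVM vector type: a lane count plus a flag
// saying whether that count is multiplied by an unknown runtime vscale.
struct ToyVectorType {
  unsigned NumElts;
  bool Scalable;
};

// Demanded-elements mask, one bool per tracked lane (stand-in for APInt).
using DemandedMask = std::vector<bool>;

// Pass nullptr for a scalar. Scalars and scalable vectors both get a single
// set bit; for a scalable vector that bit is read as "all lanes demanded".
// Fixed vectors get one set bit per lane.
DemandedMask initialDemandedElts(const ToyVectorType *Ty) {
  if (Ty && !Ty->Scalable)
    return DemandedMask(Ty->NumElts, true);
  return DemandedMask(1, true);
}
```

Under this sketch a <4 x i32> maps to a four-bit mask, while both i32 and <vscale x 4 x i32> map to a one-bit mask, which is why the later assertion in the patch accepts APInt(1, 1) for scalars and scalable vectors alike.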

Note that this convention doesn't actually impact much. Most of the code is agnostic to the interpretation of the demanded elements, and the few cases which actually care need case by case handling anyways. In this patch, I just bail out of those cases.

A prior patch (D128159) proposed using a different convention in SDAG. I don't see any strong reason to prefer one scheme over the other, so I propose we go with this one as it's conceptually the simplest. Getting known and demanded bit optimizations unblocked at all is a significant win.

I've locally implemented this scheme in reasonably large parts of ValueTracking.cpp and SelectionDAG equivalents, and have not hit any blockers. If this is approved, I plan to post a series of patches plumbing this through all the relevant parts.

In the discussion on that patch, a preference was expressed for introducing some form of abstraction around the demanded elements. I'll note that I've played with several variations on that idea locally, and have yet to find anything which results in more readable code. If anyone has concrete ideas in this area, I'm happy to explore in follow up patches. I'd strongly prefer to be making API changes in an NFC manner with tests in place.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

reames created this revision. Oct 21 2022, 10:48 AM

Herald added a project: Restricted Project. Oct 21 2022, 10:48 AM

Herald added subscribers: foad, StephenFan, bollu and 2 others.

reames requested review of this revision. Oct 21 2022, 10:48 AM

Herald added a project: Restricted Project. Oct 21 2022, 10:48 AM

Herald added a subscriber: alextsao1999.

reames added a child revision: D136475: [InstCombine] Allow simplify demanded transformations on scalable vectors. Oct 21 2022, 11:17 AM

Harbormaster completed remote builds in B193579: Diff 469680. Oct 21 2022, 11:55 AM

tschuett added a subscriber: tschuett. Oct 24 2022, 1:59 AM

efriedma added inline comments. Oct 24 2022, 11:47 AM

llvm/lib/Analysis/ValueTracking.cpp
1840	I guess this is a more general thing, but does it make sense to analyze insertelement even if we can't figure out which element is being inserted?
1870	Duplicate ScalableVectorType check? Does it make sense to analyze the vector operand without trying to figure out which elt is demanded?

reames added inline comments. Oct 24 2022, 1:24 PM

llvm/lib/Analysis/ValueTracking.cpp
1840	I'd had this same thought, and in fact was working on a patch for that locally. This turns out to be a bit trickier than it seems as the unknown index can be out of bounds. As a result, the resulting vector could be entirely poison. In demanded bits, we seem to try not inferring bits of potentially undef values, and I think poison needs to be treated the same.
1870	Yep, looks like it's redundant, will remove. Same point as above about poison/undef.

efriedma added inline comments. Oct 24 2022, 1:31 PM

llvm/lib/Analysis/ValueTracking.cpp
1840	Inferring bits in a poison value shouldn't be an issue. Since it's poison, the entire value is meaningless. (If it didn't work this way, we wouldn't be able to compute known bits at all without proving a value isn't poison.) The issue with undef is just that we can't infer 1 or 0 for a bit we know is undef, because it could be both. (This sort of difficult reasoning is one of the reasons we're trying to move away from undef...)

Address review comment

LGTM, but give it a couple days to see if there are other comments on the general approach.

This revision is now accepted and ready to land. Oct 24 2022, 1:36 PM

Harbormaster completed remote builds in B194021: Diff 470277. Oct 24 2022, 3:55 PM

@reames Can I be rude and ask you to wait until Friday? I'm a bit swamped at the minute and would love a little more time to properly digest the idea before it lands.

In D136470#3882978, @paulwalker-arm wrote:

@reames Can I be rude and ask you to wait until Friday? I'm a bit swamped at the minute and would love a little more time to properly digest the idea before it lands.

Not a problem.

spatel mentioned this in D135876: [InstCombine] Remove redundant splats in InstCombineVectorOps. Oct 26 2022, 7:00 AM

Thanks for waiting @reames, I cannot say I'm truly in love with the approach but it is the best proposal so far and least likely to bite us later. One of my unfounded reservations about previous proposals was the transition between fixed length and scalable vectors. With this approach ensuring there's only ever one representation for all scalable vectors, it should at least be easier to spot invalid code paths.

One day I see value in having a more powerful representation. My current thinking is to have an extra bit whereby:

0 -> duplicates the state of the final known lane across all unknown lanes [ABCDDDDDDDDDDDDD]
1 -> duplicates all known lanes across all unknown lanes in "known lanes" sized blocks. [ABCDABCDABCDABCD]

Having such a representation we can model scalar and subvector inserts/extracts as well as help with cases where the even and odd lanes are constructed via predicated logic.
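To make the two settings of the extra bit concrete, here is a small standalone sketch, not code from any patch. It expands a handful of known lanes, written as one character each, to a runtime lane count. TailMode and expandLanes are hypothetical names, and using a string for per-lane state is purely an illustration device.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical tail modes for the "extra bit" idea.
enum class TailMode {
  RepeatLastLane, // extra bit 0: the final known lane fills all unknown lanes
  RepeatBlock     // extra bit 1: the known lanes repeat as a block
};

// Expand per-lane state (one character per known lane) to a concrete lane
// count, which in practice is only available once vscale is known.
std::string expandLanes(const std::string &KnownLanes, TailMode Mode,
                        std::size_t RuntimeLanes) {
  assert(!KnownLanes.empty() && "need at least one known lane");
  std::string Result;
  Result.reserve(RuntimeLanes);
  for (std::size_t Lane = 0; Lane < RuntimeLanes; ++Lane) {
    if (Lane < KnownLanes.size())
      Result += KnownLanes[Lane];            // lane is tracked explicitly
    else if (Mode == TailMode::RepeatLastLane)
      Result += KnownLanes.back();           // smear the last known lane
    else
      Result += KnownLanes[Lane % KnownLanes.size()]; // repeat the block
  }
  return Result;
}
```

With four known lanes A, B, C, D and sixteen runtime lanes, the first mode yields A, B, C followed by D in every remaining lane, and the second yields the ABCD block four times. With a single known lane both modes collapse to the splat convention adopted in this patch.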

I've not dug down into this so much though and, as you say, the important thing is getting some level of support, and splat handling is likely to give us some nice early wins. It'll also make @dmgreen's day, which is a bonus.

Again, thanks for allowing me pondering time :)

This revision was landed with ongoing or failed builds. Oct 30 2022, 9:00 AM

Closed by commit rG087bb0f1fe1c: Allow scalable vectors in computeKnownBits (authored by reames).

This revision was automatically updated to reflect the committed changes.

reames added a commit: rG087bb0f1fe1c: Allow scalable vectors in computeKnownBits.

reames mentioned this in D137046: Allows scalable vectors in ComputeNumSignBits and isKnownNonNull. Oct 30 2022, 1:06 PM

reames added a child revision: D137046: Allows scalable vectors in ComputeNumSignBits and isKnownNonNull.

For the record, let me sketch out where I think this might be going long term.

For scalable vectors, we have a couple of idiomatic patterns for representing demanded elements.

The first is a splat - which this patch nicely handles by letting us do lane independent reasoning on scalable vectors. This covers a majority of the cases I've noticed so far, and is thus highly useful to have in tree as we figure out next steps.
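A rough standalone sketch of what lane independent reasoning buys, using a toy 32-bit known-bits struct rather than llvm::KnownBits. ToyKnownBits and the three helpers are hypothetical names. The example follows the add.ll test in this revision, where a shift left by a splat of 1 followed by an add of a splat of 1 becomes an or.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 32-bit stand-in for llvm::KnownBits: a bit set in Zero is known
// to be 0, a bit set in One is known to be 1, in every demanded lane.
struct ToyKnownBits {
  uint32_t Zero;
  uint32_t One;
};

// A splat of constant C: every lane holds C, so every bit is known.
ToyKnownBits knownBitsOfSplat(uint32_t C) { return {~C, C}; }

// Left shift by a constant amount: known bits move up and the vacated low
// bits become known zero.
ToyKnownBits knownBitsOfShl(ToyKnownBits X, unsigned Amt) {
  assert(Amt < 32 && "shift amount out of range in this sketch");
  uint32_t LowZeros = (Amt == 0) ? 0u : (~0u >> (32 - Amt));
  return {(X.Zero << Amt) | LowZeros, X.One << Amt};
}

// An add can be rewritten as an or when no bit position can be 1 in both
// operands, because then no carry is ever produced.
bool haveNoCommonBitsSet(ToyKnownBits L, ToyKnownBits R) {
  return (L.Zero | R.Zero) == ~0u;
}
```

Starting from a completely unknown input, knownBitsOfShl(In, 1) knows that bit 0 is zero, and the splat of 1 has every other bit zero, so haveNoCommonBitsSet holds. Nothing in the reasoning depends on how many lanes exist, which is the point of the single-bit convention.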

The second is sub_vector insert/extract. This comes up naturally in SDAG due to the way we lower fixed length vectors on RISCV (and, I think, ARM SVE.) This requires tracking a prefix of the demanded bits corresponding to the fixed vector size, and then a single bit smeared across remaining (unknown number of) lanes.

We could pick the prefix length in one of two ways:

From the fixed vector being inserted or extracted.
From the minimum known vector register size. This is more natural in DAG; at the IR layer, this requires combining the minimum vector length of a type with the minimum vscale_range value.
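As a sketch of the prefix-plus-smeared-bit idea, assuming a representation that does not exist in tree: PrefixDemandedElts, guaranteedLanes and demandLowSubvector are hypothetical names. The prefix length here is computed the second way, as the minimum element count of the type times the minimum vscale.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical representation: explicit demanded bits for a fixed-size prefix,
// plus one bit that stands for every lane beyond the prefix.
struct PrefixDemandedElts {
  std::vector<bool> Prefix;
  bool Tail;

  bool isDemanded(std::size_t Lane) const {
    return Lane < Prefix.size() ? Prefix[Lane] : Tail;
  }
};

// Lanes guaranteed to exist for <vscale x MinElts x T> when vscale is known
// to be at least MinVScale.
std::size_t guaranteedLanes(std::size_t MinElts, std::size_t MinVScale) {
  return MinElts * MinVScale;
}

// Demand only the fixed-length subvector occupying lanes [0, SubVecLanes).
PrefixDemandedElts demandLowSubvector(std::size_t PrefixLen,
                                      std::size_t SubVecLanes) {
  assert(SubVecLanes <= PrefixLen && "subvector must fit in the prefix");
  PrefixDemandedElts D{std::vector<bool>(PrefixLen, false), false};
  for (std::size_t I = 0; I < SubVecLanes; ++I)
    D.Prefix[I] = true;
  return D;
}
```

For example, <vscale x 4 x i32> with a minimum vscale of 2 gives a prefix of 8 lanes; extracting a <4 x i32> from the low lanes would then demand lanes 0 through 3 and nothing in the tail.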

The third is scalar insert/extract. For indices under the minimum vector size, this reduces to the former case. I don't yet know how common the various runtime indices we can't prove in bounds are. One example we might see is the "end of vector - 1" pattern which comes e.g. from loop vectorization exit values. There may also be others. I don't yet really have a good sense here.

The fourth is generalized shuffle indices. (i.e. figuring out what lanes are demanded from a runtime shuffle mask) We're several steps from being able to talk about this concretely, and I'm not yet convinced we'll need anything here at all. If we do need to go here, this adds a huge amount of complexity. I'm hoping we don't get here.

I'm pretty sure we'll need to generalize at least as far as subvector insert/extract. I'm not sure about going beyond that yet.

@paulwalker-arm What's the motivation for your ABCDABCDABCDABCD pattern? That looks like a splat of a larger than element type value. What idioms does this come from?

reames mentioned this in D137140: [SDAG] Allow scalable vectors in ComputeKnownBits. Oct 31 2022, 5:53 PM

reames added a child revision: D137140: [SDAG] Allow scalable vectors in ComputeKnownBits.

reames mentioned this in rG2e999b7dd193: Allow scalable vectors in ComputeNumSignBits and isKnownNonNull. Nov 1 2022, 9:30 AM

reames mentioned this in rGbc0fea0d551b: [SDAG] Allow scalable vectors in ComputeKnownBits. Nov 18 2022, 7:41 AM

reames mentioned this in rG7969ab85e0a4: [SDAG] Allow scalable vectors in ComputeKnownBits (try 2). Dec 5 2022, 8:53 AM

Revision Contents

Path	Size
llvm/lib/Analysis/ValueTracking.cpp	34 lines
llvm/test/Transforms/InstCombine/add.ll	2 lines
llvm/test/Transforms/InstCombine/intrinsics.ll	4 lines

Diff 471839

llvm/lib/Analysis/ValueTracking.cpp


[160 lines collapsed]	if (CxtI && CxtI->getParent())
return CxtI;		return CxtI;

return nullptr;		return nullptr;
}		}

static bool getShuffleDemandedElts(const ShuffleVectorInst *Shuf,		static bool getShuffleDemandedElts(const ShuffleVectorInst *Shuf,
const APInt &DemandedElts,		const APInt &DemandedElts,
APInt &DemandedLHS, APInt &DemandedRHS) {		APInt &DemandedLHS, APInt &DemandedRHS) {
// The length of scalable vectors is unknown at compile time, thus we		if (isa<ScalableVectorType>(Shuf->getType())) {
// cannot check their values		assert(DemandedElts == APInt(1,1));
if (isa<ScalableVectorType>(Shuf->getType()))		DemandedLHS = DemandedRHS = DemandedElts;
return false;		return true;
		}

int NumElts =		int NumElts =
cast<FixedVectorType>(Shuf->getOperand(0)->getType())->getNumElements();		cast<FixedVectorType>(Shuf->getOperand(0)->getType())->getNumElements();
int NumMaskElts = cast<FixedVectorType>(Shuf->getType())->getNumElements();		int NumMaskElts = cast<FixedVectorType>(Shuf->getType())->getNumElements();
DemandedLHS = DemandedRHS = APInt::getZero(NumElts);		DemandedLHS = DemandedRHS = APInt::getZero(NumElts);
if (DemandedElts.isZero())		if (DemandedElts.isZero())
return true;		return true;
// Simple case of a shuffle with zeroinitializer.		// Simple case of a shuffle with zeroinitializer.
[20 lines collapsed]	static bool getShuffleDemandedElts(const ShuffleVectorInst *Shuf,
return true;		return true;
}		}

static void computeKnownBits(const Value *V, const APInt &DemandedElts,		static void computeKnownBits(const Value *V, const APInt &DemandedElts,
KnownBits &Known, unsigned Depth, const Query &Q);		KnownBits &Known, unsigned Depth, const Query &Q);

static void computeKnownBits(const Value *V, KnownBits &Known, unsigned Depth,		static void computeKnownBits(const Value *V, KnownBits &Known, unsigned Depth,
const Query &Q) {		const Query &Q) {
// FIXME: We currently have no way to represent the DemandedElts of a scalable		// Since the number of lanes in a scalable vector is unknown at compile time,
// vector		// we track one bit which is implicitly broadcast to all lanes. This means
if (isa<ScalableVectorType>(V->getType())) {		// that all lanes in a scalable vector are considered demanded.
Known.resetAll();
return;
}

auto *FVTy = dyn_cast<FixedVectorType>(V->getType());		auto *FVTy = dyn_cast<FixedVectorType>(V->getType());
APInt DemandedElts =		APInt DemandedElts =
FVTy ? APInt::getAllOnes(FVTy->getNumElements()) : APInt(1, 1);		FVTy ? APInt::getAllOnes(FVTy->getNumElements()) : APInt(1, 1);
computeKnownBits(V, DemandedElts, Known, Depth, Q);		computeKnownBits(V, DemandedElts, Known, Depth, Q);
}		}

void llvm::computeKnownBits(const Value *V, KnownBits &Known,		void llvm::computeKnownBits(const Value *V, KnownBits &Known,
const DataLayout &DL, unsigned Depth,		const DataLayout &DL, unsigned Depth,
[1,009 lines collapsed]	case Instruction::BitCast: {
auto *SrcVecTy = dyn_cast<FixedVectorType>(SrcTy);		auto *SrcVecTy = dyn_cast<FixedVectorType>(SrcTy);
if (!SrcVecTy || !SrcVecTy->getElementType()->isIntegerTy() ||		if (!SrcVecTy || !SrcVecTy->getElementType()->isIntegerTy() ||
!I->getType()->isIntOrIntVectorTy())		!I->getType()->isIntOrIntVectorTy())
break;		break;

// Look through a cast from narrow vector elements to wider type.		// Look through a cast from narrow vector elements to wider type.
// Examples: v4i32 -> v2i64, v3i8 -> v24		// Examples: v4i32 -> v2i64, v3i8 -> v24
unsigned SubBitWidth = SrcVecTy->getScalarSizeInBits();		unsigned SubBitWidth = SrcVecTy->getScalarSizeInBits();
if (BitWidth % SubBitWidth == 0) {		if (BitWidth % SubBitWidth == 0 && !isa<ScalableVectorType>(I->getType())) {
// Known bits are automatically intersected across demanded elements of a		// Known bits are automatically intersected across demanded elements of a
// vector. So for example, if a bit is computed as known zero, it must be		// vector. So for example, if a bit is computed as known zero, it must be
// zero across all demanded elements of the vector.		// zero across all demanded elements of the vector.
//		//
// For this bitcast, each demanded element of the output is sub-divided		// For this bitcast, each demanded element of the output is sub-divided
// across a set of smaller vector elements in the source vector. To get		// across a set of smaller vector elements in the source vector. To get
// the known bits for an entire element of the output, compute the known		// the known bits for an entire element of the output, compute the known
// bits for each sub-element sequentially. This is done by shifting the		// bits for each sub-element sequentially. This is done by shifting the
[577 lines collapsed]	case Instruction::ShuffleVector: {
if (!!DemandedRHS) {		if (!!DemandedRHS) {
const Value *RHS = Shuf->getOperand(1);		const Value *RHS = Shuf->getOperand(1);
computeKnownBits(RHS, DemandedRHS, Known2, Depth + 1, Q);		computeKnownBits(RHS, DemandedRHS, Known2, Depth + 1, Q);
Known = KnownBits::commonBits(Known, Known2);		Known = KnownBits::commonBits(Known, Known2);
}		}
break;		break;
}		}
case Instruction::InsertElement: {		case Instruction::InsertElement: {
		if (isa<ScalableVectorType>(I->getType())) {
		Known.resetAll();
		return;
		}
const Value *Vec = I->getOperand(0);		const Value *Vec = I->getOperand(0);
const Value *Elt = I->getOperand(1);		const Value *Elt = I->getOperand(1);
auto *CIdx = dyn_cast<ConstantInt>(I->getOperand(2));		auto *CIdx = dyn_cast<ConstantInt>(I->getOperand(2));
// Early out if the index is non-constant or out-of-range.		// Early out if the index is non-constant or out-of-range.
unsigned NumElts = DemandedElts.getBitWidth();		unsigned NumElts = DemandedElts.getBitWidth();
if (!CIdx || CIdx->getValue().uge(NumElts)) {		if (!CIdx || CIdx->getValue().uge(NumElts)) {
Known.resetAll();		Known.resetAll();
return;		return;
}		}
Known.One.setAllBits();		Known.One.setAllBits();
Known.Zero.setAllBits();		Known.Zero.setAllBits();
unsigned EltIdx = CIdx->getZExtValue();		unsigned EltIdx = CIdx->getZExtValue();
// Do we demand the inserted element?		// Do we demand the inserted element?
[13 lines collapsed]	case Instruction::InsertElement: {
break;		break;
}		}
case Instruction::ExtractElement: {		case Instruction::ExtractElement: {
// Look through extract element. If the index is non-constant or		// Look through extract element. If the index is non-constant or
// out-of-range demand all elements, otherwise just the extracted element.		// out-of-range demand all elements, otherwise just the extracted element.
const Value *Vec = I->getOperand(0);		const Value *Vec = I->getOperand(0);
const Value *Idx = I->getOperand(1);		const Value *Idx = I->getOperand(1);
auto *CIdx = dyn_cast<ConstantInt>(Idx);		auto *CIdx = dyn_cast<ConstantInt>(Idx);
if (isa<ScalableVectorType>(Vec->getType())) {		if (isa<ScalableVectorType>(Vec->getType())) {
// FIXME: there's probably something we can do with scalable vectors		// FIXME: there's probably something we can do with scalable vectors
Known.resetAll();		Known.resetAll();
break;		break;
}		}
unsigned NumElts = cast<FixedVectorType>(Vec->getType())->getNumElements();		unsigned NumElts = cast<FixedVectorType>(Vec->getType())->getNumElements();
APInt DemandedVecElts = APInt::getAllOnes(NumElts);		APInt DemandedVecElts = APInt::getAllOnes(NumElts);
if (CIdx && CIdx->getValue().ult(NumElts))		if (CIdx && CIdx->getValue().ult(NumElts))
DemandedVecElts = APInt::getOneBitSet(NumElts, CIdx->getZExtValue());		DemandedVecElts = APInt::getOneBitSet(NumElts, CIdx->getZExtValue());
[65 lines collapsed]
///		///
/// This function is defined on values with integer type, values with pointer		/// This function is defined on values with integer type, values with pointer
/// type, and vectors of integers. In the case		/// type, and vectors of integers. In the case
/// where V is a vector, known zero, and known one values are the		/// where V is a vector, known zero, and known one values are the
/// same width as the vector element, and the bit is set only if it is true		/// same width as the vector element, and the bit is set only if it is true
/// for all of the demanded elements in the vector specified by DemandedElts.		/// for all of the demanded elements in the vector specified by DemandedElts.
void computeKnownBits(const Value *V, const APInt &DemandedElts,		void computeKnownBits(const Value *V, const APInt &DemandedElts,
KnownBits &Known, unsigned Depth, const Query &Q) {		KnownBits &Known, unsigned Depth, const Query &Q) {
if (!DemandedElts || isa<ScalableVectorType>(V->getType())) {		if (!DemandedElts) {
// No demanded elts or V is a scalable vector, better to assume we don't		// No demanded elts, better to assume we don't know anything.
// know anything.
Known.resetAll();		Known.resetAll();
return;		return;
}		}

assert(V && "No Value?");		assert(V && "No Value?");
assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");		assert(Depth <= MaxAnalysisRecursionDepth && "Limit Search Depth");

#ifndef NDEBUG		#ifndef NDEBUG
Type *Ty = V->getType();		Type *Ty = V->getType();
unsigned BitWidth = Known.getBitWidth();		unsigned BitWidth = Known.getBitWidth();

assert((Ty->isIntOrIntVectorTy(BitWidth) || Ty->isPtrOrPtrVectorTy()) &&		assert((Ty->isIntOrIntVectorTy(BitWidth) || Ty->isPtrOrPtrVectorTy()) &&
"Not integer or pointer type!");		"Not integer or pointer type!");

if (auto *FVTy = dyn_cast<FixedVectorType>(Ty)) {		if (auto *FVTy = dyn_cast<FixedVectorType>(Ty)) {
assert(		assert(
FVTy->getNumElements() == DemandedElts.getBitWidth() &&		FVTy->getNumElements() == DemandedElts.getBitWidth() &&
"DemandedElt width should equal the fixed vector number of elements");		"DemandedElt width should equal the fixed vector number of elements");
} else {		} else {
assert(DemandedElts == APInt(1, 1) &&		assert(DemandedElts == APInt(1, 1) &&
"DemandedElt width should be 1 for scalars");		"DemandedElt width should be 1 for scalars or scalable vectors");
}		}

Type *ScalarTy = Ty->getScalarType();		Type *ScalarTy = Ty->getScalarType();
if (ScalarTy->isPointerTy()) {		if (ScalarTy->isPointerTy()) {
assert(BitWidth == Q.DL.getPointerTypeSizeInBits(ScalarTy) &&		assert(BitWidth == Q.DL.getPointerTypeSizeInBits(ScalarTy) &&
"V and Known should have same BitWidth");		"V and Known should have same BitWidth");
} else {		} else {
assert(BitWidth == Q.DL.getTypeSizeInBits(ScalarTy) &&		assert(BitWidth == Q.DL.getTypeSizeInBits(ScalarTy) &&
[10 lines collapsed]	#endif
// Null and aggregate-zero are all-zeros.		// Null and aggregate-zero are all-zeros.
if (isa<ConstantPointerNull>(V) || isa<ConstantAggregateZero>(V)) {		if (isa<ConstantPointerNull>(V) || isa<ConstantAggregateZero>(V)) {
Known.setAllZero();		Known.setAllZero();
return;		return;
}		}
// Handle a constant vector by taking the intersection of the known bits of		// Handle a constant vector by taking the intersection of the known bits of
// each element.		// each element.
if (const ConstantDataVector *CDV = dyn_cast<ConstantDataVector>(V)) {		if (const ConstantDataVector *CDV = dyn_cast<ConstantDataVector>(V)) {
		assert(!isa<ScalableVectorType>(V->getType()));
// We know that CDV must be a vector of integers. Take the intersection of		// We know that CDV must be a vector of integers. Take the intersection of
// each element.		// each element.
Known.Zero.setAllBits(); Known.One.setAllBits();		Known.Zero.setAllBits(); Known.One.setAllBits();
for (unsigned i = 0, e = CDV->getNumElements(); i != e; ++i) {		for (unsigned i = 0, e = CDV->getNumElements(); i != e; ++i) {
if (!DemandedElts[i])		if (!DemandedElts[i])
continue;		continue;
APInt Elt = CDV->getElementAsAPInt(i);		APInt Elt = CDV->getElementAsAPInt(i);
Known.Zero &= ~Elt;		Known.Zero &= ~Elt;
Known.One &= Elt;		Known.One &= Elt;
}		}
return;		return;
}		}

if (const auto *CV = dyn_cast<ConstantVector>(V)) {		if (const auto *CV = dyn_cast<ConstantVector>(V)) {
		assert(!isa<ScalableVectorType>(V->getType()));
// We know that CV must be a vector of integers. Take the intersection of		// We know that CV must be a vector of integers. Take the intersection of
// each element.		// each element.
Known.Zero.setAllBits(); Known.One.setAllBits();		Known.Zero.setAllBits(); Known.One.setAllBits();
for (unsigned i = 0, e = CV->getNumOperands(); i != e; ++i) {		for (unsigned i = 0, e = CV->getNumOperands(); i != e; ++i) {
if (!DemandedElts[i])		if (!DemandedElts[i])
continue;		continue;
Constant *Element = CV->getAggregateElement(i);		Constant *Element = CV->getAggregateElement(i);
auto *ElementCI = dyn_cast_or_null<ConstantInt>(Element);		auto *ElementCI = dyn_cast_or_null<ConstantInt>(Element);
[5,384 lines collapsed]

llvm/test/Transforms/InstCombine/add.ll

[2,332 lines collapsed]	;
%m = mul i8 %x, -3		%m = mul i8 %x, -3
%a = add i8 %m, 42		%a = add i8 %m, 42
ret i8 %a		ret i8 %a
}		}

define <vscale x 1 x i32> @add_to_or_scalable(<vscale x 1 x i32> %in) {		define <vscale x 1 x i32> @add_to_or_scalable(<vscale x 1 x i32> %in) {
; CHECK-LABEL: @add_to_or_scalable(		; CHECK-LABEL: @add_to_or_scalable(
; CHECK-NEXT: [[SHL:%.*]] = shl <vscale x 1 x i32> [[IN:%.*]], shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)		; CHECK-NEXT: [[SHL:%.*]] = shl <vscale x 1 x i32> [[IN:%.*]], shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)
; CHECK-NEXT: [[ADD:%.*]] = add <vscale x 1 x i32> [[SHL]], shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)		; CHECK-NEXT: [[ADD:%.*]] = or <vscale x 1 x i32> [[SHL]], shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)
; CHECK-NEXT: ret <vscale x 1 x i32> [[ADD]]		; CHECK-NEXT: ret <vscale x 1 x i32> [[ADD]]
;		;
%shl = shl <vscale x 1 x i32> %in, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)		%shl = shl <vscale x 1 x i32> %in, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)
%add = add <vscale x 1 x i32> %shl, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)		%add = add <vscale x 1 x i32> %shl, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 1, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)
ret <vscale x 1 x i32> %add		ret <vscale x 1 x i32> %add
}		}

llvm/test/Transforms/InstCombine/intrinsics.ll

[121 lines collapsed]	;
%or = or <2 x i32> %arg, <i32 4, i32 4>		%or = or <2 x i32> %arg, <i32 4, i32 4>
%cnt = call <2 x i32> @llvm.cttz.v2i32(<2 x i32> %or, i1 true) nounwind readnone		%cnt = call <2 x i32> @llvm.cttz.v2i32(<2 x i32> %or, i1 true) nounwind readnone
%res = icmp eq <2 x i32> %cnt, <i32 4, i32 4>		%res = icmp eq <2 x i32> %cnt, <i32 4, i32 4>
ret <2 x i1> %res		ret <2 x i1> %res
}		}

define <vscale x 1 x i1> @cttz_knownbits_scalable_vec(<vscale x 1 x i32> %arg) {		define <vscale x 1 x i1> @cttz_knownbits_scalable_vec(<vscale x 1 x i32> %arg) {
; CHECK-LABEL: @cttz_knownbits_scalable_vec(		; CHECK-LABEL: @cttz_knownbits_scalable_vec(
; CHECK-NEXT: [[OR:%.*]] = and <vscale x 1 x i32> [[ARG:%.*]], shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 27, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)		; CHECK-NEXT: ret <vscale x 1 x i1> zeroinitializer
; CHECK-NEXT: [[RES:%.*]] = icmp eq <vscale x 1 x i32> [[OR]], shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 20, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)
; CHECK-NEXT: ret <vscale x 1 x i1> [[RES]]
;		;
%or = or <vscale x 1 x i32> %arg, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 4, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)		%or = or <vscale x 1 x i32> %arg, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 4, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)
%cnt = call <vscale x 1 x i32> @llvm.cttz.nxv1i32(<vscale x 1 x i32> %or, i1 true) nounwind readnone		%cnt = call <vscale x 1 x i32> @llvm.cttz.nxv1i32(<vscale x 1 x i32> %or, i1 true) nounwind readnone
%res = icmp eq <vscale x 1 x i32> %cnt, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 4, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)		%res = icmp eq <vscale x 1 x i32> %cnt, shufflevector (<vscale x 1 x i32> insertelement (<vscale x 1 x i32> poison, i32 4, i32 0), <vscale x 1 x i32> poison, <vscale x 1 x i32> zeroinitializer)
ret <vscale x 1 x i1> %res		ret <vscale x 1 x i1> %res
}		}
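
As a sanity check on why @cttz_knownbits_scalable_vec can fold to zeroinitializer: or-ing with a splat of 4 makes bit 2 known one in every lane, so the trailing-zero count is at most 2 and can never equal 4. The standalone sketch below is not part of the patch; it brute-forces a 16-bit slice of the input space using hypothetical helper names.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-trailing-zeros; returns 32 for an input of 0.
unsigned countTrailingZeros(uint32_t V) {
  unsigned N = 0;
  while (N < 32 && ((V >> N) & 1u) == 0)
    ++N;
  return N;
}

// If the bits in KnownOne are known to be set, cttz of the value is at most
// the index of the lowest such bit.
unsigned maxTrailingZeros(uint32_t KnownOne) {
  return countTrailingZeros(KnownOne);
}

// Brute-force check over inputs whose upper 16 bits are zero: can
// cttz(X | 4) ever equal 4?
bool cttzOfOrFourCanEqualFour() {
  for (uint32_t X = 0; X <= 0xFFFFu; ++X)
    if (countTrailingZeros(X | 4u) == 4)
      return true;
  return false;
}
```

The brute-force loop is only a spot check over part of the input space; the known-bits argument is what covers all 32-bit values, and it applies per lane regardless of the runtime lane count.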


[366 lines collapsed]