This is an archive of the discontinued LLVM Phabricator instance.

Differential D95452

[LoopVectorize] use IR fast-math-flags exclusively (not function attributes)
ClosedPublic

Authored by spatel on Jan 26 2021, 9:13 AM.

Details

Reviewers

fhahn
dmgreen
SjoerdMeijer

Commits

rGab93c18c125f: [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes)

Summary

I am trying to untangle the fast-math-flags propagation logic in the vectorizers (see a6f022127 for SLP).

The loop vectorizer has a mix of checking FP function attributes, IR-level FMF, and just wrong assumptions.

I am trying to avoid regressions while fixing this, and I think the IR-level logic is good enough for that, but it's hard to say for sure. This would be the first step in the clean-up.

The existing test that I changed to include 'fast' actually shows a miscompile: the function only had the equivalent of nnan, but we created new instructions that had fast (all FMF set). This is similar to the example in https://llvm.org/PR35538
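The miscompile described above can be pictured with a small bitmask model. This is a hedged sketch only: `FMFBits`, `isSafeToEmit`, and `intersect` are hypothetical names introduced for illustration and are not the `llvm::FastMathFlags` API. It assumes only that `fast` means all seven IR fast-math flags are set and that the function attribute justified nothing beyond `nnan`.

```cpp
#include <cassert>
#include <cstdint>

// Simplified model of the seven IR fast-math flags as a bitmask.
// Illustrative stand-in only, NOT the real llvm::FastMathFlags class.
enum FMFBits : std::uint8_t {
  Reassoc  = 1 << 0,
  NNan     = 1 << 1,
  NInf     = 1 << 2,
  NSZ      = 1 << 3,
  ARcp     = 1 << 4,
  Contract = 1 << 5,
  AFn      = 1 << 6,
  Fast     = 0x7F // every flag set, as written by the 'fast' keyword
};

// Hypothetical helper: a newly created instruction may only carry flags
// that the original instruction already had.
constexpr bool isSafeToEmit(unsigned NewFlags, unsigned OrigFlags) {
  return (NewFlags & ~OrigFlags) == 0;
}

// Hypothetical helper: keep only the flags present on both inputs.
constexpr unsigned intersect(unsigned A, unsigned B) { return A & B; }

inline void demoMiscompile() {
  unsigned Orig = NNan; // all that "no-nans-fp-math"="true" justifies
  // Emitting 'fast' would claim six flags nobody proved.
  assert(!isSafeToEmit(Fast, Orig));
  // Intersecting with the original flags stays within what was proven.
  assert(isSafeToEmit(intersect(Fast, Orig), Orig));
}
```

In this model, creating `fast` instructions from an `nnan`-only input fails the subset check, which is the shape of the bug the updated test exposes.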

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

spatel created this revision. Jan 26 2021, 9:13 AM

Herald added subscribers: rogfer01, hiraditya, mcrosier. Jan 26 2021, 9:13 AM

spatel requested review of this revision. Jan 26 2021, 9:13 AM

Herald added a project: Restricted Project. Jan 26 2021, 9:13 AM

Herald added a subscriber: vkmr.

Should InductionDescriptor::hasUnsafeAlgebra be using isFast, or can it be something more precise? Just reassoc or does it need NoNan for some reason? Any others like NoInf?

I don't have a strong reason to think that is necessary though. This sounds fine to me as most of it is NFC, with the only difference being the FP induction change. Removing NoNaN from VPReductionRecipe is nice now that they can come from the RecurrenceDesc.

In D95452#2523757, @dmgreen wrote:

Should InductionDescriptor::hasUnsafeAlgebra be using isFast, or can it be something more precise? Just reassoc or does it need NoNan for some reason? Any others like NoInf?

Good question - if we want to be more precise (and we do!), it will depend on the type of induction (RecurKind). I'd make that a follow-up patch to reduce change/risk.
For fmul/fadd, we should require reassoc and, to be safer, nsz (I don't think nnan is actually necessary because reassoc covers that).
For fmin/fmax, we should require nnan and nsz. We may want to update the variable naming to better indicate exactly what we mean by "UnsafeAlgebra" - that's a leftover from before the creation of FMF I think.
There's an additional complication for fmin/fmax in that we don't have a complete story for FMF propagation on fcmp/select yet. We'll probably need to mildly hack around that to preserve existing optimizations.
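The per-kind requirements proposed in this comment could be sketched as follows. This is a minimal illustration of the follow-up idea, not code from this patch: `Flag`, `Kind`, `requiredFlags`, and `canVectorize` are hypothetical names, and the flag sets are taken directly from the comment (reassoc and nsz for fadd/fmul, nnan and nsz for fmin/fmax).

```cpp
#include <cassert>

// Simplified bitmask model of a few IR fast-math flags.
// Illustrative only, NOT the real llvm::FastMathFlags class.
enum Flag : unsigned { Reassoc = 1, NNan = 2, NInf = 4, NSZ = 8 };

// Hypothetical kinds, named after the operations mentioned in the comment.
enum class Kind { FAdd, FMul, FMin, FMax };

// Minimum flags the comment proposes for each kind.
constexpr unsigned requiredFlags(Kind K) {
  switch (K) {
  case Kind::FAdd:
  case Kind::FMul:
    return Reassoc | NSZ;
  case Kind::FMin:
  case Kind::FMax:
    return NNan | NSZ;
  }
  // Not reached for valid kinds; demand every flag to stay conservative.
  return ~0u;
}

// True if the instruction carries every flag its kind requires.
constexpr bool canVectorize(Kind K, unsigned InstFlags) {
  return (InstFlags & requiredFlags(K)) == requiredFlags(K);
}
```

A check of this shape is more precise than requiring every flag: an fadd carrying only `reassoc nsz` would qualify, while one carrying only `nnan` would not.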

Sounds good. This LGTM then.

This revision is now accepted and ready to land. Jan 27 2021, 8:09 AM

Closed by commit rGab93c18c125f: [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes) (authored by spatel). Jan 27 2021, 11:17 AM

This revision was automatically updated to reflect the committed changes.

spatel added a commit: rGab93c18c125f: [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes).

spatel mentioned this in D95690: [LoopVectorize] improve IR fast-math-flags propagation in reductions. Jan 29 2021, 11:16 AM

spatel mentioned this in rGbbed5f2f8a04: [LoopVectorize] improve IR fast-math-flags propagation in reductions. Feb 1 2021, 1:21 PM

Revision Contents

Path                                                                  Size
llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h    6 lines
llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp           7 lines
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp                       2 lines
llvm/lib/Transforms/Vectorize/VPlan.h                                 6 lines
llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll         13 lines
llvm/unittests/Transforms/Vectorize/VPlanTest.cpp                     2 lines
Diff 319632

llvm/include/llvm/Transforms/Vectorize/LoopVectorizationLegality.h

@@ 338 lines hidden @@ public:

  /// Returns true if vector representation of the instruction \p I
  /// requires mask.
  bool isMaskRequired(const Instruction *I) { return MaskedOp.contains(I); }

  unsigned getNumStores() const { return LAI->getNumStores(); }
  unsigned getNumLoads() const { return LAI->getNumLoads(); }

- // Returns true if the NoNaN attribute is set on the function.
- bool hasFunNoNaNAttr() const { return HasFunNoNaNAttr; }

  /// Returns all assume calls in predicated blocks. They need to be dropped
  /// when flattening the CFG.
  const SmallPtrSetImpl<Instruction *> &getConditionalAssumes() const {
  return ConditionalAssumes;
  }

  private:
  /// Return true if the pre-header, exiting and latch blocks of \p Lp and all
@@ 132 lines hidden @@ private:

  /// Holds the widest induction type encountered.
  Type *WidestIndTy = nullptr;

  /// Allowed outside users. This holds the variables that can be accessed from
  /// outside the loop.
  SmallPtrSet<Value *, 4> AllowedExit;

- /// Can we assume the absence of NaNs.
- bool HasFunNoNaNAttr = false;

  /// Vectorization requirements that will go through late-evaluation.
  LoopVectorizationRequirements *Requirements;

  /// Used to emit an analysis of any legality issues.
  LoopVectorizeHints *Hints;

  /// The demanded bits analysis is used to compute the minimum type size in
  /// which a reduction can be computed.
@@ 22 lines hidden @@

llvm/lib/Transforms/Vectorize/LoopVectorizationLegality.cpp

@@ 599 lines hidden @@ for (unsigned VF = 2, WidestVF = TLI.getWidestVF(ScalarName);
  Scalarize &= !TLI.isFunctionVectorizable(ScalarName, VF);
  }
  return Scalarize;
  }

  bool LoopVectorizationLegality::canVectorizeInstrs() {
  BasicBlock *Header = TheLoop->getHeader();

- // Look for the attribute signaling the absence of NaNs.
- Function &F = *Header->getParent();
- HasFunNoNaNAttr =
- F.getFnAttribute("no-nans-fp-math").getValueAsString() == "true";

  // For each block in the loop.
  for (BasicBlock *BB : TheLoop->blocks()) {
  // Scan the instructions in the block and look for hazards.
  for (Instruction &I : *BB) {
  if (auto *Phi = dyn_cast<PHINode>(&I)) {
  Type *PhiTy = Phi->getType();
  // Check that this PHI type is allowed.
  if (!PhiTy->isIntegerTy() && !PhiTy->isFloatingPointTy() &&
@@ 47 lines hidden @@ for (Instruction &I : *BB) {
  // handling below
  // 4. FirstOrderRecurrence phis that can possibly be handled by
  // extraction.
  // By recording these, we can then reason about ways to vectorize each
  // of these NotAllowedExit.
  InductionDescriptor ID;
  if (InductionDescriptor::isInductionPHI(Phi, TheLoop, PSE, ID)) {
  addInductionPhi(Phi, ID, AllowedExit);
- if (ID.hasUnsafeAlgebra() && !HasFunNoNaNAttr)
+ if (ID.hasUnsafeAlgebra())
  Requirements->addUnsafeAlgebraInst(ID.getUnsafeAlgebraInst());
  continue;
  }

  if (RecurrenceDescriptor::isFirstOrderRecurrence(Phi, TheLoop,
  SinkAfter, DT)) {
  AllowedExit.insert(Phi);
  FirstOrderRecurrences.insert(Phi);
@@ 611 lines hidden @@

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp

@@ 8,902 lines hidden @@ for (Instruction *R : ReductionOperations) {
  unsigned VecOpId =
  R->getOperand(FirstOpId) == Chain ? FirstOpId + 1 : FirstOpId;
  VPValue *VecOp = Plan->getVPValue(R->getOperand(VecOpId));

  auto *CondOp = CM.foldTailByMasking()
  ? RecipeBuilder.createBlockInMask(R->getParent(), Plan)
  : nullptr;
  VPReductionRecipe *RedRecipe = new VPReductionRecipe(
- &RdxDesc, R, ChainOp, VecOp, CondOp, Legal->hasFunNoNaNAttr(), TTI);
+ &RdxDesc, R, ChainOp, VecOp, CondOp, TTI);
  WidenRecipe->getVPValue()->replaceAllUsesWith(RedRecipe);
  Plan->removeVPValueFor(R);
  Plan->addVPValue(R, RedRecipe);
  WidenRecipe->getParent()->insert(RedRecipe, WidenRecipe->getIterator());
  WidenRecipe->getVPValue()->replaceAllUsesWith(RedRecipe);
  WidenRecipe->eraseFromParent();

  if (RecurrenceDescriptor::isMinMaxRecurrenceKind(Kind)) {
@@ 809 lines hidden @@

llvm/lib/Transforms/Vectorize/VPlan.h

@@ 1,124 lines hidden @@
  };

  /// A recipe to represent inloop reduction operations, performing a reduction on
  /// a vector operand into a scalar value, and adding the result to a chain.
  /// The Operands are {ChainOp, VecOp, [Condition]}.
  class VPReductionRecipe : public VPRecipeBase, public VPUser, public VPValue {
  /// The recurrence decriptor for the reduction in question.
  RecurrenceDescriptor *RdxDesc;
- /// Fast math flags to use for the resulting reduction operation.
- bool NoNaN;
  /// Pointer to the TTI, needed to create the target reduction
  const TargetTransformInfo *TTI;

  public:
  VPReductionRecipe(RecurrenceDescriptor *R, Instruction *I, VPValue *ChainOp,
- VPValue *VecOp, VPValue *CondOp, bool NoNaN,
+ VPValue *VecOp, VPValue *CondOp,
  const TargetTransformInfo *TTI)
  : VPRecipeBase(VPRecipeBase::VPReductionSC), VPUser({ChainOp, VecOp}),
- VPValue(VPValue::VPVReductionSC, I, this), RdxDesc(R), NoNaN(NoNaN),
+ VPValue(VPValue::VPVReductionSC, I, this), RdxDesc(R),
  TTI(TTI) {
  if (CondOp)
  addOperand(CondOp);
  }

  ~VPReductionRecipe() override = default;

  /// Method to support type inquiry through isa, cast, and dyn_cast.
@@ 998 lines hidden @@

llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll

  ; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
  ; RUN: opt < %s -O3 -simplifycfg -simplifycfg-require-and-preserve-domtree=1 -keep-loops=false -mcpu=core-avx2 -mtriple=x86_64-unknown-linux-gnu -S | FileCheck --check-prefix AUTO_VEC %s

  target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"

  ; This test checks auto-vectorization with FP induction variable.
- ; The FP operation is not "fast" and requires "fast-math" function attribute.
+ ; FMF is required on the IR instructions.

  ;void fp_iv_loop1(float * __restrict__ A, int N) {
  ; float x = 1.0;
  ; for (int i=0; i < N; ++i) {
  ; A[i] = x;
  ; x += 0.5;
  ; }
  ;}
@@ 128 lines hidden @@
  ; AUTO_VEC: middle.block:
  ; AUTO_VEC-NEXT: [[CMP_N:%.*]] = icmp eq i64 [[N_VEC]], [[ZEXT]]
  ; AUTO_VEC-NEXT: br i1 [[CMP_N]], label [[FOR_END]], label [[FOR_BODY]]
  ; AUTO_VEC: for.body:
  ; AUTO_VEC-NEXT: [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[FOR_BODY]] ], [ 0, [[FOR_BODY_PREHEADER]] ], [ [[N_VEC]], [[MIDDLE_BLOCK]] ]
  ; AUTO_VEC-NEXT: [[X_06:%.*]] = phi float [ [[CONV1:%.*]], [[FOR_BODY]] ], [ 1.000000e+00, [[FOR_BODY_PREHEADER]] ], [ [[IND_END]], [[MIDDLE_BLOCK]] ]
  ; AUTO_VEC-NEXT: [[ARRAYIDX:%.*]] = getelementptr inbounds float, float* [[A]], i64 [[INDVARS_IV]]
  ; AUTO_VEC-NEXT: store float [[X_06]], float* [[ARRAYIDX]], align 4
- ; AUTO_VEC-NEXT: [[CONV1]] = fadd float [[X_06]], 5.000000e-01
+ ; AUTO_VEC-NEXT: [[CONV1]] = fadd fast float [[X_06]], 5.000000e-01
  ; AUTO_VEC-NEXT: [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
  ; AUTO_VEC-NEXT: [[TMP45:%.*]] = icmp eq i64 [[INDVARS_IV_NEXT]], [[ZEXT]]
  ; AUTO_VEC-NEXT: br i1 [[TMP45]], label [[FOR_END]], label [[FOR_BODY]], [[LOOP4:!llvm.loop !.*]]
  ; AUTO_VEC: for.end:
  ; AUTO_VEC-NEXT: ret void
  ;
  entry:
  %cmp4 = icmp sgt i32 %N, 0
  br i1 %cmp4, label %for.body.preheader, label %for.end

  for.body.preheader: ; preds = %entry
  br label %for.body

  for.body: ; preds = %for.body.preheader, %for.body
  %indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
  %x.06 = phi float [ %conv1, %for.body ], [ 1.000000e+00, %for.body.preheader ]
  %arrayidx = getelementptr inbounds float, float* %A, i64 %indvars.iv
  store float %x.06, float* %arrayidx, align 4
- %conv1 = fadd float %x.06, 5.000000e-01
+ %conv1 = fadd fast float %x.06, 5.000000e-01
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %lftr.wideiv = trunc i64 %indvars.iv.next to i32
  %exitcond = icmp eq i32 %lftr.wideiv, %N
  br i1 %exitcond, label %for.end.loopexit, label %for.body

  for.end.loopexit: ; preds = %for.body
  br label %for.end

  for.end: ; preds = %for.end.loopexit, %entry
  ret void
  }

- ; The same as the previous, FP operation is not fast, different function attribute
+ ; The same as the previous, but FP operation has no FMF.
  ; Vectorization should be rejected.
  ;void fp_iv_loop2(float * __restrict__ A, int N) {
  ; float x = 1.0;
  ; for (int i=0; i < N; ++i) {
  ; A[i] = x;
  ; x += 0.5;
  ; }
  ;}

- define void @fp_iv_loop2(float* noalias nocapture %A, i32 %N) #1 {
+ define void @fp_iv_loop2(float* noalias nocapture %A, i32 %N) {
  ; AUTO_VEC-LABEL: @fp_iv_loop2(
  ; AUTO_VEC-NEXT: entry:
  ; AUTO_VEC-NEXT: [[CMP4:%.*]] = icmp sgt i32 [[N:%.*]], 0
  ; AUTO_VEC-NEXT: br i1 [[CMP4]], label [[FOR_BODY_PREHEADER:%.*]], label [[FOR_END:%.*]]
  ; AUTO_VEC: for.body.preheader:
  ; AUTO_VEC-NEXT: [[ZEXT:%.*]] = zext i32 [[N]] to i64
  ; AUTO_VEC-NEXT: [[TMP0:%.*]] = add nsw i64 [[ZEXT]], -1
  ; AUTO_VEC-NEXT: [[XTRAITER:%.*]] = and i64 [[ZEXT]], 7
@@ 330 lines hidden @@ for.body:
  %j.next = fadd double %j, 3.0
  %cond = icmp slt i64 %i.next, %n
  br i1 %cond, label %for.body, label %for.end

  for.end:
  %t1 = phi double [ %j, %for.body ]
  ret double %t1
  }

- attributes #0 = { "no-nans-fp-math"="true" }
- attributes #1 = { "no-nans-fp-math"="false" }

llvm/unittests/Transforms/Vectorize/VPlanTest.cpp

@@ 637 lines hidden @@
  }

  TEST(VPRecipeTest, CastVPReductionRecipeToVPUser) {
  LLVMContext C;

  VPValue ChainOp;
  VPValue VecOp;
  VPValue CondOp;
- VPReductionRecipe Recipe(nullptr, nullptr, &ChainOp, &CondOp, &VecOp, false,
+ VPReductionRecipe Recipe(nullptr, nullptr, &ChainOp, &CondOp, &VecOp,
  nullptr);
  EXPECT_TRUE(isa<VPUser>(&Recipe));
  VPRecipeBase *BaseR = &Recipe;
  EXPECT_TRUE(isa<VPUser>(BaseR));
  }

  struct VPDoubleValueDef : public VPRecipeBase, public VPUser {
  VPDoubleValueDef(ArrayRef<VPValue *> Operands)
@@ 56 lines hidden @@
