This is an archive of the discontinued LLVM Phabricator instance.

Differential D97465

[LoopVectorize] Refine hasIrregularType predicate
Closed · Public

Authored by LemonBoy on Feb 25 2021, 5:56 AM.

Details

Reviewers

mkuper
fhahn
craig.topper
david-arm
lebedev.ri

Commits

rG4f024938e4c9: [LoopVectorize] Refine hasIrregularType predicate

Summary

The hasIrregularType predicate checks whether an array of N values of type Ty is "bitcast-compatible" with a <N x Ty> vector.
The previous check returned invalid results in some cases where there is padding between the array elements: e.g. a 4-element array of u7 values was considered compatible with <4 x u7>, even though the vector loads/stores only 28 bits instead of 32.

The problem causes LLVM to generate incorrect code for some targets: on AArch64 the vector loads/stores are lowered in terms of ubfx/bfi, effectively losing the top (N * padding) bits.
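
The size arithmetic behind the summary can be sketched in standalone C++. This is a simplified model, not the LLVM code: IntTy, hasIrregularTypeModel, arrayBits and packedVectorBits are hypothetical names, and the ABI alignment is passed in by hand, whereas LLVM reads it from the DataLayout. The model assumes the allocation size is the store size rounded up to the ABI alignment.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an LLVM integer type iN. AbiAlignBytes is an
// assumption supplied by the caller; in LLVM it comes from the DataLayout.
struct IntTy {
  uint64_t Bits;
  uint64_t AbiAlignBytes;
};

// Number of value bits in the type (7 for i7).
constexpr uint64_t typeSizeInBits(IntTy T) { return T.Bits; }

// Bytes touched when storing one value: bits rounded up to whole bytes.
constexpr uint64_t storeSizeInBytes(IntTy T) { return (T.Bits + 7) / 8; }

// Distance between consecutive array elements, in bits: the store size
// rounded up to a multiple of the (assumed) ABI alignment.
constexpr uint64_t allocSizeInBits(IntTy T) {
  return (storeSizeInBytes(T) + T.AbiAlignBytes - 1) / T.AbiAlignBytes *
         T.AbiAlignBytes * 8;
}

// Model of the refined predicate: irregular when array elements carry padding.
constexpr bool hasIrregularTypeModel(IntTy T) {
  return allocSizeInBits(T) != typeSizeInBits(T);
}

// Bits spanned by an N-element array versus the value bits of <N x Ty>.
constexpr uint64_t arrayBits(IntTy T, uint64_t N) {
  return N * allocSizeInBits(T);
}
constexpr uint64_t packedVectorBits(IntTy T, uint64_t N) {
  return N * typeSizeInBits(T);
}

// The case from the summary: four 7-bit values span 32 bits as an array,
// but only 28 bits as a vector.
static_assert(arrayBits(IntTy{7, 1}, 4) == 32, "array of 4 x i7");
static_assert(packedVectorBits(IntTy{7, 1}, 4) == 28, "<4 x i7>");
static_assert(hasIrregularTypeModel(IntTy{7, 1}), "i7 is padded");
static_assert(!hasIrregularTypeModel(IntTy{32, 4}), "i32 is not padded");
```

Under these assumptions the predicate depends only on the element type, which is consistent with the review comment below that the VF argument had become unused.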

Event Timeline

LemonBoy created this revision. Feb 25 2021, 5:56 AM

Herald added subscribers: hiraditya, kristof.beyls. Feb 25 2021, 5:56 AM

LemonBoy requested review of this revision. Feb 25 2021, 5:56 AM

Herald added a project: Restricted Project. Feb 25 2021, 5:56 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B90805: Diff 326369. Feb 25 2021, 6:38 AM

craig.topper added inline comments. Feb 25 2021, 11:50 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
376	Should we remove the unused VF argument?

Remove unused parameter.

LemonBoy marked an inline comment as done. Feb 25 2021, 1:50 PM

LemonBoy added inline comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
376	Good catch, fixed.

Harbormaster completed remote builds in B90890: Diff 326488. Feb 25 2021, 4:21 PM

fhahn added inline comments. Mar 2 2021, 3:56 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
375	comment needs updating, there's no given vectorization factor any longer, right?
377	The comment also needs to be updated to not refer to VF I think, as it is gone now. Something like `Determine if an array of type Ty is "bit cast compatible" with a vector with the same number of elements`.
llvm/test/Transforms/LoopVectorize/irregular_type.ll
6	can you add a comment explaining what the test checks?

LemonBoy added a subscriber: david-arm. Mar 2 2021, 4:14 AM

LemonBoy added inline comments.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
381	CC @david-arm This slightly different formulation of the check was introduced in c5ba0d33cc060cc06a28a5d9101060afd1c0ee9a.

Update some documentation comments.

Harbormaster completed remote builds in B91608: Diff 327503. Mar 2 2021, 12:09 PM

LemonBoy added a reviewer: david-arm. Mar 6 2021, 12:11 AM

Ping ?

Seems good to me.

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
380–381	I wonder why can't we still vectorize such cases, by instead loading `<N x DL.getTypeAllocSizeInBits(Ty)>` vector and then truncating it? (beware of endianness)

This revision is now accepted and ready to land. Mar 16 2021, 1:08 PM

LemonBoy added inline comments. Mar 17 2021, 4:31 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
380–381	This was meant to be a hotfix targeting LLVM12. I've experimented with the widen+truncate strategy and the results are promising (at least on x86), I'll submit a patch once I clean up the code.

lebedev.ri added inline comments. Mar 17 2021, 4:32 AM

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
380–381	> This was meant to be a hotfix targeting LLVM12.
Sure, i wasn't even suggesting doing that here.
> I've experimented with the widen+truncate strategy and the results are promising (at least on x86), I'll submit a patch once I clean up the code.
Nice!
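
The difference between the miscompiled access and the widen+truncate idea discussed above can be illustrated with a small C++ model. It is only a sketch under stated assumptions: loadPacked and loadWidenedThenTruncate are hypothetical helpers, the packed load assumes a little-endian layout with element 0 in the lowest bits, and the widened load assumes each i7 occupies its own byte. It is not the follow-up patch mentioned in the thread.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Four i7 values stored as an array: each element sits in its own byte,
// with the top bit of every byte being padding.
using Lanes = std::array<uint8_t, 4>;

// Models a <4 x i7> load, assuming the vector is bit-packed into 28
// contiguous bits (little-endian, element 0 in the lowest bits).
Lanes loadPacked(const Lanes &Mem) {
  uint32_t Word = 0;
  for (int I = 0; I < 4; ++I)
    Word |= static_cast<uint32_t>(Mem[I]) << (8 * I);
  Lanes Out{};
  for (int I = 0; I < 4; ++I)
    Out[I] = static_cast<uint8_t>((Word >> (7 * I)) & 0x7F);
  return Out;
}

// Models the widen+truncate idea: load <4 x i8>, matching the array
// layout, then truncate every lane to its low 7 bits.
Lanes loadWidenedThenTruncate(const Lanes &Mem) {
  Lanes Out{};
  for (int I = 0; I < 4; ++I)
    Out[I] = static_cast<uint8_t>(Mem[I] & 0x7F);
  return Out;
}
```

With the array {1, 2, 3, 4}, the packed model yields {1, 4, 12, 32} because every lane after the first straddles a byte boundary, while the widened model returns the original values. On a big-endian target the lane order of the packed view would differ, which is the endianness caveat raised in the comment.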

This revision was landed with ongoing or failed builds. Mar 17 2021, 9:05 AM

Closed by commit rG4f024938e4c9: [LoopVectorize] Refine hasIrregularType predicate (authored by LemonBoy).

This revision was automatically updated to reflect the committed changes.

LemonBoy added a commit: rG4f024938e4c9: [LoopVectorize] Refine hasIrregularType predicate.

Revision Contents

Path	Size
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp	10 lines
llvm/test/Transforms/LoopVectorize/irregular_type.ll	24 lines

Diff 326369

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp


   assert((isa<LoadInst>(I) || isa<StoreInst>(I)) &&
          "Expected Load or Store instruction");
   if (auto *LI = dyn_cast<LoadInst>(I))
     return LI->getType();
   return cast<StoreInst>(I)->getValueOperand()->getType();
 }

 /// A helper function that returns true if the given type is irregular. The
 /// type is irregular if its allocated size doesn't equal the store size of an
 /// element of the corresponding vector type at the given vectorization factor.
fhahn: comment needs updating, there's no given vectorization factor any longer, right?
 static bool hasIrregularType(Type *Ty, const DataLayout &DL, ElementCount VF) {
craig.topper: Should we remove the unused VF argument?
LemonBoy: Good catch, fixed.
   // Determine if an array of VF elements of type Ty is "bitcast compatible"
fhahn: The comment also needs to be updated to not refer to VF I think, as it is gone now. Something like `Determine if an array of type Ty is "bit cast compatible" with a vector with the same number of elements`.
   // with a <VF x Ty> vector.
-  if (VF.isVector()) {
-    auto *VectorTy = VectorType::get(Ty, VF);
-    return TypeSize::get(VF.getKnownMinValue() *
-                             DL.getTypeAllocSize(Ty).getFixedValue(),
-                         VF.isScalable()) != DL.getTypeStoreSize(VectorTy);
LemonBoy: CC @david-arm This slightly different formulation of the check was introduced in c5ba0d33cc060cc06a28a5d9101060afd1c0ee9a.
-  }
-
-  // If the vectorization factor is one, we just check if an array of type Ty
-  // requires padding between elements.
+  // This is only true if there is no padding between the array elements.
   return DL.getTypeAllocSizeInBits(Ty) != DL.getTypeSizeInBits(Ty);
 }
lebedev.ri: I wonder why can't we still vectorize such cases, by instead loading `<N x DL.getTypeAllocSizeInBits(Ty)>` vector and then truncating it? (beware of endianness)
LemonBoy: This was meant to be a hotfix targeting LLVM12. I've experimented with the widen+truncate strategy and the results are promising (at least on x86), I'll submit a patch once I clean up the code.
lebedev.ri:
> This was meant to be a hotfix targeting LLVM12.
Sure, i wasn't even suggesting doing that here.
> I've experimented with the widen+truncate strategy and the results are promising (at least on x86), I'll submit a patch once I clean up the code.
Nice!

 /// A helper function that returns the reciprocal of the block probability of
 /// predicated blocks. If we return X, we are assuming the predicated block
 /// will execute once for every X iterations of the loop header.
 ///
 /// TODO: We should use actual block probability here, if available. Currently,
 /// we always assume predicated blocks have a 50% chance of executing.
 static unsigned getReciprocalPredBlockProb() { return 2; }

llvm/test/Transforms/LoopVectorize/irregular_type.ll

This file was added.

; RUN: opt %s -loop-vectorize -force-vector-width=4 -S | FileCheck %s

; CHECK: foo
; CHECK: vector.body
; CHECK-NOT: load <4 x i7>
; CHECK-NOT: store <4 x i7>
fhahn: can you add a comment explaining what the test checks?
; CHECK: for.body
define void @foo(i7* %a, i64 %n) {
entry:
  br label %for.body

for.body:
  %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
  %arrayidx = getelementptr inbounds i7, i7* %a, i64 %indvars.iv
  %0 = load i7, i7* %arrayidx, align 1
  %sub = add nuw nsw i7 %0, 0
  store i7 %sub, i7* %arrayidx, align 1
  %indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
  %cmp = icmp eq i64 %indvars.iv.next, %n
  br i1 %cmp, label %for.exit, label %for.body

for.exit:
  ret void
}