This is an archive of the discontinued LLVM Phabricator instance.

Differential D122457

[LSR] Fix canonicalization formula and its checker.
Closed, Public

Authored by skatkov on Mar 24 2022, 9:41 PM.

Details

Reviewers

qcolombet
danilaml
mkazantsev
efriedma
reames

Commits

rG6444a65514b5: [LSR] Fixup canonicalization formula and its checker.

Summary

By the definition of canonical form, if the scaled reg does not
contain an addrec for loop L, then none of the base regs should
contain an addrec for this loop.

The critical word here is "contains".

The current checker of canonical form tests whether a reg "is" an
addrec, not whether it "contains" one.

Fix the checker and the canonicalizing utility to follow the definition.

Without this fix, in the attached test the base formula
reg((-1 * {0,+,8}<nuw><nsw><%bb2>)<nsw>) + 1*reg((8 * (%arg /u 8))<nuw>)
is considered canonical even though its base reg contains an addrec,
and the modified formula we want to insert,
reg({0,+,8}<nuw><nsw><%bb2>) + 1*reg((-8 * (%arg /u 8)))
is considered non-canonical.
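
The difference between "is" and "contains" can be illustrated with a minimal, self-contained sketch. It does not use LLVM: `Expr`, `makeLeaf`, `makeAddRec`, `makeMul`, `isAddRecForLoop`, `containsAddRecForLoop` and `makeNegatedAddRec` are hypothetical stand-ins for SCEV nodes and for the old and new tests, and loops are modelled as plain integer ids (an assumption made only for this example).

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Toy stand-in for a SCEV node; this is NOT the LLVM class hierarchy.
struct Expr {
  enum Kind { Constant, Unknown, Mul, AddRec };
  Kind kind;
  int loopId;                // only meaningful when kind == AddRec
  std::vector<ExprPtr> ops;  // operands; empty for leaves
};

ExprPtr makeLeaf(Expr::Kind kind) {
  return std::make_shared<Expr>(Expr{kind, -1, {}});
}
ExprPtr makeAddRec(int loopId) {
  return std::make_shared<Expr>(Expr{Expr::AddRec, loopId, {}});
}
ExprPtr makeMul(ExprPtr lhs, ExprPtr rhs) {
  return std::make_shared<Expr>(Expr{Expr::Mul, -1, {lhs, rhs}});
}

// "is" test: only the top-level node is inspected.
bool isAddRecForLoop(const Expr &e, int loopId) {
  return e.kind == Expr::AddRec && e.loopId == loopId;
}

// "contains" test: the node or any of its sub-expressions may match.
bool containsAddRecForLoop(const Expr &e, int loopId) {
  if (isAddRecForLoop(e, loopId))
    return true;
  for (const ExprPtr &op : e.ops) {
    assert(op && "operands must be non-null");
    if (containsAddRecForLoop(*op, loopId))
      return true;
  }
  return false;
}

// Models (-1 * {0,+,8}<%bb2>), with %bb2 represented as loop id 0.
ExprPtr makeNegatedAddRec() {
  return makeMul(makeLeaf(Expr::Constant), makeAddRec(0));
}
```

For the expression built by `makeNegatedAddRec()`, the "is" test answers no for loop 0 because the top-level node is a multiplication, while the "contains" test answers yes because the addrec is one of its operands.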

Event Timeline

skatkov created this revision. Mar 24 2022, 9:41 PM

Herald added a project: Restricted Project. Mar 24 2022, 9:41 PM

Herald added a subscriber: hiraditya.

skatkov requested review of this revision. Mar 24 2022, 9:41 PM

Harbormaster completed remote builds in B156213: Diff 418128. Mar 24 2022, 10:32 PM

Could you please elaborate on what kind of crash you see? Is it simply an assertion that something must be canonical and it's not, or something more sophisticated?

mkazantsev added inline comments. Mar 27 2022, 8:36 PM

llvm/test/Transforms/LoopStrengthReduce/canonical-form.ll
2	nit: I suggest adding `| FileCheck` and also adding `CHECK-LABEL: <test name>` to the code itself.

In D122457#3410502, @mkazantsev wrote:

Could you please elaborate on what kind of crash you see? Is it simply an assertion that something must be canonical and it's not, or something more sophisticated?

You'll hit the assertion in
bool LSRUse::InsertFormula(const Formula &F, const Loop &L) {

assert(F.isCanonical(L) && "Invalid canonical representation");

on the attached test.

llvm/test/Transforms/LoopStrengthReduce/canonical-form.ll
2	will do before landing or with next update of the patch.

Interesting note. It seems that the definition of canonical representation of the formula looks as follows:

/// The list of "base" registers for this use. When this is non-empty. The
/// canonical representation of a formula is
/// 1. BaseRegs.size > 1 implies ScaledReg != NULL and
/// 2. ScaledReg != NULL implies Scale != 1 || !BaseRegs.empty().
/// 3. The reg containing recurrent expr related with currect loop in the
/// formula should be put in the ScaledReg.
/// #1 enforces that the scaled register is always used when at least two
/// registers are needed by the formula: e.g., reg1 + reg2 is reg1 + 1 * reg2.
/// #2 enforces that 1 * reg is reg.
/// #3 ensures invariant regs with respect to current loop can be combined
/// together in LSR codegen.
/// This invariant can be temporarily broken while building a formula.
/// However, every formula inserted into the LSRInstance must be in canonical
/// form.

The important phrase here is "reg containing recurrent expr", while the isCanonical utility checks that the reg is a recurrent expr.
The utility that performs the canonicalization likewise looks for a reg that is a recurrent expr, not one that contains a recurrent expr.

The bug might be here.
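
As a rough illustration of rule 3, the following self-contained sketch runs the same check skeleton with both readings on the two formulas from the summary. Everything here is a toy model under stated assumptions: `Reg`, `ToyFormula`, `isCanonicalWith`, `isCanonicalOld`, `isCanonicalNew`, `baseFormula`, `modifiedFormula` and `checkExamples` are hypothetical names, and each register is reduced to two precomputed flags instead of a real SCEV.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Toy register summary (hypothetical): two facts about a reg w.r.t. loop L.
struct Reg {
  bool isAddRecOfL;        // the expression itself is an addrec of L
  bool containsAddRecOfL;  // it, or some sub-expression, is an addrec of L
};

struct ToyFormula {
  std::vector<Reg> baseRegs;
  std::optional<Reg> scaledReg;
  long scale;
};

// Skeleton of the check; dependsOnL picks the "is" or "contains" reading.
template <typename Pred>
bool isCanonicalWith(const ToyFormula &f, Pred dependsOnL) {
  if (!f.scaledReg)
    return f.baseRegs.size() <= 1;  // rule 1
  if (f.scale != 1)
    return true;
  if (f.baseRegs.empty())
    return false;                   // rule 2: 1*reg must be written as reg
  if (dependsOnL(*f.scaledReg))
    return true;                    // rule 3 already holds
  // Rule 3: no base reg may depend on L when the scaled reg does not.
  return std::none_of(f.baseRegs.begin(), f.baseRegs.end(), dependsOnL);
}

bool isCanonicalOld(const ToyFormula &f) {
  return isCanonicalWith(f, [](const Reg &r) { return r.isAddRecOfL; });
}
bool isCanonicalNew(const ToyFormula &f) {
  return isCanonicalWith(f, [](const Reg &r) { return r.containsAddRecOfL; });
}

// reg(-1 * {0,+,8}) + 1*reg(8 * (%arg /u 8))
ToyFormula baseFormula() {
  return ToyFormula{{Reg{false, true}}, Reg{false, false}, 1};
}
// reg({0,+,8}) + 1*reg(-8 * (%arg /u 8))
ToyFormula modifiedFormula() {
  return ToyFormula{{Reg{true, true}}, Reg{false, false}, 1};
}

void checkExamples() {
  assert(isCanonicalOld(baseFormula()));    // "is" reading misses the addrec
  assert(!isCanonicalNew(baseFormula()));   // "contains" reading catches it
  assert(!isCanonicalOld(modifiedFormula()));
  assert(!isCanonicalNew(modifiedFormula()));
}
```

In this model the two readings disagree only on the base formula: the "is" reading accepts it, while the "contains" reading rejects it just as both readings reject the modified formula.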

Yup, I agree with that. To me the initial formula

reg((-1 * {0,+,8}<nuw><nsw><%bb2>)<nsw>) + 1*reg((8 * (%arg /u 8))<nuw>)

isn't canonical either, but the wrong check (which simply tests whether the base reg isa SCEVAddRec) is fooled by the multiplication by minus one.

One more counter-argument to this fix was: could we just replace {0,+,8} by 1 * {0,+,8} and treat this as canonical formula?

Let's see how deep this bug actually lies.

skatkov updated this revision to Diff 418503. Mar 28 2022, 12:19 AM

skatkov retitled this revision from "[LSR] Canonicalize formula before inserting it." to "[LSR] Fix canonicalization formula and its checker."

skatkov edited the summary of this revision.

LGTM with nits.

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp
482	nit: naming convention for functions broken here, I suggest `containsAddRecDependentOnLoop`.
507	nit: I think we can capture `[&L]` only.

This revision is now accepted and ready to land. Mar 28 2022, 10:08 PM

Harbormaster completed remote builds in B156496: Diff 418503. Mar 28 2022, 10:24 PM

Closed by commit rG6444a65514b5: [LSR] Fixup canonicalization formula and its checker. (authored by skatkov). Mar 29 2022, 12:05 AM

This revision was automatically updated to reflect the committed changes.

skatkov added a commit: rG6444a65514b5: [LSR] Fixup canonicalization formula and its checker.

Revision Contents

Path                                                        Size
llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp           21 lines
llvm/test/Transforms/LoopStrengthReduce/canonical-form.ll   84 lines

Diff 418503

llvm/lib/Transforms/Scalar/LoopStrengthReduce.cpp

[473 preceding lines not shown]
   if (!Bad.empty()) {
     const SCEV *Sum = SE.getAddExpr(Bad);
     if (!Sum->isZero())
       BaseRegs.push_back(Sum);
     HasBaseReg = true;
   }
   canonicalize(*L);
 }

+static bool SCEVConatinsAddRecWithLoop(const SCEV *S, const Loop &L) {
  [inline comment, mkazantsev] nit: naming convention for functions broken here, I suggest `containsAddRecDependentOnLoop`.
+  return SCEVExprContains(S, [&L](const SCEV *S) {
+    return isa<SCEVAddRecExpr>(S) && (cast<SCEVAddRecExpr>(S)->getLoop() == &L);
+  });
+}
+
 /// Check whether or not this formula satisfies the canonical
 /// representation.
 /// \see Formula::BaseRegs.
 bool Formula::isCanonical(const Loop &L) const {
   if (!ScaledReg)
     return BaseRegs.size() <= 1;

   if (Scale != 1)
     return true;

   if (Scale == 1 && BaseRegs.empty())
     return false;

-  const SCEVAddRecExpr *SAR = dyn_cast<const SCEVAddRecExpr>(ScaledReg);
-  if (SAR && SAR->getLoop() == &L)
+  if (SCEVConatinsAddRecWithLoop(ScaledReg, L))
     return true;

   // If ScaledReg is not a recurrent expr, or it is but its loop is not current
   // loop, meanwhile BaseRegs contains a recurrent expr reg related with current
   // loop, we want to swap the reg in BaseRegs with ScaledReg.
-  auto I = find_if(BaseRegs, [&](const SCEV *S) {
+  return none_of(BaseRegs, [&](const SCEV *S) {
  [inline comment, mkazantsev] nit: I think we can capture `[&L]` only.
-    return isa<const SCEVAddRecExpr>(S) &&
-           (cast<SCEVAddRecExpr>(S)->getLoop() == &L);
+    return SCEVConatinsAddRecWithLoop(S, L);
   });
-  return I == BaseRegs.end();
 }

 /// Helper method to morph a formula into its canonical representation.
 /// \see Formula::BaseRegs.
 /// Every formula having more than one base register, must use the ScaledReg
 /// field. Otherwise, we would have to do special cases everywhere in LSR
 /// to treat reg1 + reg2 + ... the same way as reg1 + 1*reg2 + ...
 /// On the other hand, 1*reg should be canonicalized into reg.
[15 lines not shown; enclosing function: void Formula::canonicalize(const Loop &L) {]
   if (!ScaledReg) {
     ScaledReg = BaseRegs.pop_back_val();
     Scale = 1;
   }

   // If ScaledReg is an invariant with respect to L, find the reg from
   // BaseRegs containing the recurrent expr related with Loop L. Swap the
   // reg with ScaledReg.
-  const SCEVAddRecExpr *SAR = dyn_cast<const SCEVAddRecExpr>(ScaledReg);
-  if (!SAR || SAR->getLoop() != &L) {
+  if (!SCEVConatinsAddRecWithLoop(ScaledReg, L)) {
     auto I = find_if(BaseRegs, [&](const SCEV *S) {
-      return isa<const SCEVAddRecExpr>(S) &&
-             (cast<SCEVAddRecExpr>(S)->getLoop() == &L);
+      return SCEVConatinsAddRecWithLoop(S, L);
     });
     if (I != BaseRegs.end())
       std::swap(ScaledReg, *I);
   }
   assert(isCanonical(L) && "Failed to canonicalize?");
 }

 /// Get rid of the scale in the formula.
[5,893 following lines not shown]
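
The swap step of `canonicalize` in the diff above can be sketched in isolation. This is a simplified model, not the LLVM code: `Reg`, `ToyFormula`, `swapLoopVariantIntoScaledReg` and `canonicalizedExample` are hypothetical names, the scaled reg is assumed to be present already with a scale of 1, and whether a reg contains an addrec of the loop is a precomputed flag rather than a SCEV traversal.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Toy register: printable text plus one precomputed fact about loop L.
struct Reg {
  std::string text;
  bool containsAddRecOfL;
};

struct ToyFormula {
  std::vector<Reg> baseRegs;
  Reg scaledReg;  // assumed already present, with scale == 1
};

// If the scaled reg is invariant in L, exchange it with the first base reg
// that contains an addrec of L (if there is one).
void swapLoopVariantIntoScaledReg(ToyFormula &f) {
  auto dependsOnL = [](const Reg &r) { return r.containsAddRecOfL; };
  if (!dependsOnL(f.scaledReg)) {
    auto it = std::find_if(f.baseRegs.begin(), f.baseRegs.end(), dependsOnL);
    if (it != f.baseRegs.end())
      std::swap(f.scaledReg, *it);
  }
  // Post-condition for rule 3: either the scaled reg depends on L, or
  // no base reg does.
  assert(dependsOnL(f.scaledReg) ||
         std::none_of(f.baseRegs.begin(), f.baseRegs.end(), dependsOnL));
}

// Base formula from the summary: reg(-1 * {0,+,8}) + 1*reg(8 * (%arg /u 8)).
ToyFormula canonicalizedExample() {
  ToyFormula f{{Reg{"(-1 * {0,+,8})", true}}, Reg{"(8 * (%arg /u 8))", false}};
  swapLoopVariantIntoScaledReg(f);
  return f;
}
```

With the "contains" flag driving the decision, the register holding `(-1 * {0,+,8})` ends up as the scaled reg and the loop-invariant register moves into the base regs.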

llvm/test/Transforms/LoopStrengthReduce/canonical-form.ll

This file was added.

				; RUN: opt -S -loop-reduce < %s | FileCheck %s

				[inline comment, mkazantsev] nit: I suggest adding `| FileCheck` and also adding `CHECK-LABEL: <test name>` to the code itself.
				[inline comment, skatkov (author), done] will do before landing or with next update of the patch.
				; Check that no crash here.
				; When GenerateICmpZeroScales transforms the base formula
				; it can get non-canonical form.

				target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
				target triple = "x86_64-unknown-linux-gnu"

				define void @hoge(i32 %arg) {
				; CHECK: @hoge
				bb:
				%tmp = and i32 %arg, -8
				br label %bb2

				bb1: ; preds = %bb2
				ret void

				bb2: ; preds = %bb2, %bb
				%tmp3 = phi i64 [ 0, %bb ], [ %tmp62, %bb2 ]
				%tmp4 = phi i32 [ 1, %bb ], [ %tmp63, %bb2 ]
				%tmp5 = phi i32 [ 0, %bb ], [ %tmp64, %bb2 ]
				%tmp6 = add i64 %tmp3, 1
				%tmp7 = trunc i64 %tmp6 to i32
				%tmp8 = sub i32 %tmp4, %tmp7
				%tmp9 = mul i32 %tmp8, %tmp8
				%tmp10 = sub i32 %tmp9, %tmp8
				%tmp11 = sext i32 %tmp10 to i64
				%tmp12 = sub i64 %tmp6, %tmp11
				%tmp13 = add nuw nsw i32 %tmp4, 1
				%tmp14 = add i64 %tmp12, 1
				%tmp15 = trunc i64 %tmp14 to i32
				%tmp16 = sub i32 %tmp13, %tmp15
				%tmp17 = mul i32 %tmp16, %tmp16
				%tmp18 = sub i32 %tmp17, %tmp16
				%tmp19 = sext i32 %tmp18 to i64
				%tmp20 = sub i64 %tmp14, %tmp19
				%tmp21 = add i64 %tmp20, 1
				%tmp22 = sub i64 %tmp21, 0
				%tmp23 = add nuw nsw i32 %tmp4, 3
				%tmp24 = add i64 %tmp22, 1
				%tmp25 = trunc i64 %tmp24 to i32
				%tmp26 = sub i32 %tmp23, %tmp25
				%tmp27 = mul i32 %tmp26, %tmp26
				%tmp28 = sub i32 %tmp27, %tmp26
				%tmp29 = sext i32 %tmp28 to i64
				%tmp30 = sub i64 %tmp24, %tmp29
				%tmp31 = add nuw nsw i32 %tmp4, 4
				%tmp32 = add i64 %tmp30, 1
				%tmp33 = trunc i64 %tmp32 to i32
				%tmp34 = sub i32 %tmp31, %tmp33
				%tmp35 = mul i32 %tmp34, %tmp34
				%tmp36 = sub i32 %tmp35, %tmp34
				%tmp37 = sext i32 %tmp36 to i64
				%tmp38 = sub i64 %tmp32, %tmp37
				%tmp39 = add nuw nsw i32 %tmp4, 5
				%tmp40 = add i64 %tmp38, 1
				%tmp41 = trunc i64 %tmp40 to i32
				%tmp42 = sub i32 %tmp39, %tmp41
				%tmp43 = mul i32 %tmp42, %tmp42
				%tmp44 = sub i32 %tmp43, %tmp42
				%tmp45 = sext i32 %tmp44 to i64
				%tmp46 = sub i64 %tmp40, %tmp45
				%tmp47 = add nuw nsw i32 %tmp4, 6
				%tmp48 = add i64 %tmp46, 1
				%tmp49 = trunc i64 %tmp48 to i32
				%tmp50 = sub i32 %tmp47, %tmp49
				%tmp51 = mul i32 %tmp50, %tmp50
				%tmp52 = sub i32 %tmp51, %tmp50
				%tmp53 = sext i32 %tmp52 to i64
				%tmp54 = sub i64 %tmp48, %tmp53
				%tmp55 = add nuw nsw i32 %tmp4, 7
				%tmp56 = add i64 %tmp54, 1
				%tmp57 = trunc i64 %tmp56 to i32
				%tmp58 = sub i32 %tmp55, %tmp57
				%tmp59 = mul i32 %tmp58, %tmp58
				%tmp60 = sub i32 %tmp59, %tmp58
				%tmp61 = sext i32 %tmp60 to i64
				%tmp62 = sub i64 %tmp56, %tmp61
				%tmp63 = add nuw nsw i32 %tmp4, 8
				%tmp64 = add i32 %tmp5, 8
				%tmp65 = icmp eq i32 %tmp64, %tmp
				br i1 %tmp65, label %bb1, label %bb2
				}