This is an archive of the discontinued LLVM Phabricator instance.

LSR: Fix PR33514
ClosedPublic

Authored by evstupac on Aug 1 2017, 12:54 PM.

Details

Reviewers

qcolombet
hans

Commits

rG38197c66a1f1: Fix PR33514
rL310092: Fix PR33514

Summary

The patch restricts multiplication of an ICmpZero formula by any constant if there is a pointer-type register (p) in the formula.
If not restricted, the multiplication can cause expansion of the SCEV C * p, which is undefined (if C != 1 or 0).
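The restriction described above can be sketched outside of LLVM as follows. This is only an illustrative model: `Reg`, `Formula`, and `canScaleByConstant` are hypothetical names introduced here, not LLVM types or functions, and the only property modeled is whether a register has pointer type.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for a SCEV register; only pointer-ness matters here.
struct Reg {
  bool IsPointer;
};

// Hypothetical stand-in for an LSR formula: an optional scaled register
// plus any number of base registers.
struct Formula {
  std::optional<Reg> ScaledReg;
  std::vector<Reg> BaseRegs;
};

// Returns true if the formula may be multiplied by a constant factor,
// i.e. if no register in it has pointer type.
bool canScaleByConstant(const Formula &F) {
  if (F.ScaledReg && F.ScaledReg->IsPointer)
    return false;
  for (const Reg &R : F.BaseRegs)
    if (R.IsPointer)
      return false;
  return true;
}
```

In this model, a formula with only integer registers can be scaled, while a single pointer-typed register anywhere in the formula blocks the transformation.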

Diff Detail

Repository: rL LLVM

Event Timeline

evstupac created this revision. Aug 1 2017, 12:54 PM

Herald added a subscriber: mzolotukhin. Aug 1 2017, 12:55 PM

Hi Evgeny,

Multiplying pointers is indeed illegal in the IR, but instead of just dropping them, would it make sense to keep them with the proper ptrtoint casts?

Also, if that is not something we are supposed to do, I think it would make sense to have SCEV complain when we are trying to do that.

Cheers,
-Quentin

Hi Quentin,

In D36170#828004, @qcolombet wrote:

Multiplying pointers is indeed illegal in the IR, but instead of just dropping them, would it make sense to keep them with the proper ptrtoint casts?

We could do this, however I have some concerns:

If we convert the pointer to an int, we can miss some optimizations. For example, the comparison pointer == 1 is always false, while the comparison int == 1 is not.
If we insert more conversions, we should raise the Formula cost somehow (most likely leading to the formula being dropped).

Also, if that is not something we are supposed to do, I think it would make sense to have SCEV complain when we are trying to do that.

SCEV expands to an undefined value, and that seems good enough. The expansion could be in dead code, for example, so an assert or an error is too strict.

It is hard to imagine a case where C*p would be part of the best solution (so that C*p is reused somewhere else). However, that can happen if other formulas were deleted because of a "too complex solution".

I've tested performance on x86.
spec2000/spec2006 build to the same binaries.
Other tests whose binaries differ show the same performance.

Thanks,
Evgeny

If we insert more conversions, we should raise the Formula cost somehow (most likely leading to the formula being dropped).

Good point. Maybe mention that in the comment, and we can revisit if we later see a test where it is needed.

SCEV expands to an undefined value, and that seems good enough. The expansion could be in dead code, for example, so an assert or an error is too strict.

Makes sense.

Other tests whose binaries differ show the same performance.

Interesting, given we were generating invalid code, what changed there?

Other tests whose binaries differ show the same performance.

Interesting, given we were generating invalid code, what changed there?

Something similar to what happened to the LIT test here.
We are not generating wrong code.

We generate formulas that can potentially produce buggy code. The formulas could be unused, but the context for the best solution differs (especially in complex cases) with and without these formulas.
Even if a buggy formula is used, the expansion can still be correct: "p1 + -1*p2" is valid when we have a mapped expansion (an existing instruction) for "p1 + -1*p2". However, if we generate a SCEV with no existing mapping to an instruction, we need to create a new instruction to expand the SCEV, and then -1*p2 + p1 could be an issue.
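The distinction between a mapped expansion and a fresh one can be illustrated with a toy model. This is a sketch under loud assumptions: `ToyExpander` is a hypothetical name, expressions are plain strings, and the `NeedsPointerMul` flag is supplied by the caller; none of this reflects the real SCEV expander interface.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Hypothetical toy expander: maps an expression (as text) to the name of an
// instruction that already computes it.
struct ToyExpander {
  std::map<std::string, std::string> ExistingValues;

  // Returns the value to use for Expr, or std::nullopt when the expansion
  // would have to materialize a constant-times-pointer product.
  std::optional<std::string> expand(const std::string &Expr,
                                    bool NeedsPointerMul) const {
    // An existing instruction for the exact expression can be reused as is.
    auto It = ExistingValues.find(Expr);
    if (It != ExistingValues.end())
      return It->second;
    // No mapping: new instructions are needed, and C * p cannot be emitted.
    if (NeedsPointerMul)
      return std::nullopt;
    return "new(" + Expr + ")";
  }
};
```

With "p1 + -1*p2" registered as an existing value, expanding that exact expression succeeds, while an unmapped expression that needs a pointer multiplication has no valid expansion in this model.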

Thanks for the explanation.

LGTM

This revision is now accepted and ready to land. Aug 3 2017, 12:58 PM

Closed by commit rL310092: Fix PR33514 (authored by evstupac). Aug 4 2017, 11:46 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                                        Size
llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp     6 lines
llvm/trunk/test/Transforms/LoopStrengthReduce/pr27056.ll    3 lines

Diff 109792

llvm/trunk/lib/Transforms/Scalar/LoopStrengthReduce.cpp

@@ void LSRInstance::GenerateICmpZeroScales(LSRUse &LU, unsigned LUIdx, @@
   // Determine the integer type for the base formula.
   Type *IntTy = Base.getType();
   if (!IntTy) return;
   if (SE.getTypeSizeInBits(IntTy) > 64) return;

   // Don't do this if there is more than one offset.
   if (LU.MinOffset != LU.MaxOffset) return;

+  // Check if transformation is valid. It is illegal to multiply pointer.
+  if (Base.ScaledReg && Base.ScaledReg->getType()->isPointerTy())
+    return;
+  for (const SCEV *BaseReg : Base.BaseRegs)
+    if (BaseReg->getType()->isPointerTy())
+      return;
   assert(!Base.BaseGV && "ICmpZero use is not legal!");

   // Check each interesting stride.
   for (int64_t Factor : Factors) {
     // Check that the multiplication doesn't overflow.
     if (Base.BaseOffset == INT64_MIN && Factor == -1)
       continue;
     int64_t NewBaseOffset = (uint64_t)Base.BaseOffset * Factor;
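The context lines at the end of the hunk above guard the offset multiplication against overflow. A minimal standalone sketch of that idea follows. `scaleOffset` is a hypothetical helper, and the divide-back comparison is an assumption about how such a check can be completed, since the diff excerpt stops before that point. The cast back to `int64_t` wraps modulo 2^64, which is guaranteed since C++20 and is the usual behavior of compilers before that.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: multiplies Offset by Factor and returns std::nullopt
// if the product does not fit in int64_t.
std::optional<int64_t> scaleOffset(int64_t Offset, int64_t Factor) {
  // Multiplying by zero never overflows; handling it here also avoids
  // dividing by zero below.
  if (Factor == 0)
    return int64_t{0};
  // INT64_MIN * -1 is not representable, and INT64_MIN / -1 would itself
  // overflow, so reject this pair up front.
  if (Offset == INT64_MIN && Factor == -1)
    return std::nullopt;
  // Multiply in unsigned arithmetic, where wraparound is well defined.
  int64_t Scaled =
      static_cast<int64_t>(static_cast<uint64_t>(Offset) *
                           static_cast<uint64_t>(Factor));
  // Assumed round-trip check: if dividing back does not recover the
  // original offset, the multiplication wrapped.
  if (Scaled / Factor != Offset)
    return std::nullopt;
  return Scaled;
}
```

Doing the multiplication in `uint64_t` keeps the wraparound well defined, so the overflow can be detected after the fact rather than invoking signed-overflow undefined behavior.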

llvm/trunk/test/Transforms/LoopStrengthReduce/pr27056.ll

 try.cont: ; preds = %for.end
   ret void
 }

 ; CHECK-LABEL: define void @b_copy_ctor(
 ; CHECK: catchpad
 ; CHECK-NEXT: icmp eq %struct.L
-; CHECK-NEXT: getelementptr {{.*}} i64 sub (i64 0, i64 ptrtoint (%struct.L* @GV2 to i64))
+; CHECK-NEXT: %4 = sub i64 0, %1
+; CHECK-NEXT: getelementptr {{.*}} getelementptr inbounds (%struct.L, %struct.L* @GV2, i32 0, i32 0), i64 %4

 declare void @a_copy_ctor()