This is an archive of the discontinued LLVM Phabricator instance.

Differential D37419

Teach scalar evolution to handle inttoptr/ptrtoint
Needs Review · Public

Authored by davidxl on Sep 2 2017, 9:06 PM.

Details

Reviewers

wmi
anemet
sanjoy

Summary

This enhancement will enable loop vectorizations that are currently blocked due to noop inttoptr/ptrtoint operations.

The change itself is straightforward, but it exposes a few bugs in existing code. For instance, a const_expr created by the expander can grow exponentially, so a size limit needs to be applied. The expander also invalidates the SCEVs for some values (when a noop cast is inserted), which can lead to SCEVs that are inconsistent with other cached values when recomputed (e.g., when the scalar loop's header phi's init value is replaced by the loop vectorizer). This patch also fixes those problems.

Diff Detail

Event Timeline

davidxl created this revision. Sep 2 2017, 9:06 PM

Herald added subscribers: javed.absar, nemanjai. Sep 2 2017, 9:06 PM

Before we go down this route, can you describe the class of inputs you're trying to handle? I ask because, at least historically, we've thought of inttoptr/ptrtoint as anti-canonical. We've held the line against teaching AA, SCEV, etc. about them. Taking a quick look at your LoopVectorizer tests, it looks like these are mostly cases where integer variables that are really pointers in disguise are being used as induction variables. Could we aggressively canonicalize inttoptr/ptrtoint toward i8* and GEPs instead (in InstCombine or similar)? If that's possible, I'd much rather we do that. This will also have the advantage that AA might understand what's going on (as will SCEV, with no changes there), and will enable additional optimizations.

lib/Transforms/Scalar/AlignmentFromAssumptions.cpp
276 ↗	(On Diff #113674)	I'm not sure that this is correct: How do we know that AAPtr is a pointer value? The idea here is that there's some expression that looks like ptrtoint(ptr) + e1 + e2 + ..., and we want to separate it into (ptr, e1 + e2 + ...). If we update SCEV to look through ptrtoint, we can't do this in the same way. I imagine we need to write a visitor to pull out the underlying pointer, or we need to not use SCEV for this task.
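One way to picture the visitor mentioned here is a pass over the operands of the sum that accepts the expression only when exactly one operand is a pointer. The following is a minimal sketch in plain C++, not code from the patch or from LLVM: `Term`, `Split`, and `splitPointerSum` are hypothetical names, and a real implementation would walk SCEV expressions rather than a flat list of named terms.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one operand of an add expression.
struct Term {
  std::string Name;
  bool IsPointer; // true if the operand is (or wraps) a pointer value
};

// Result of separating ptr + e1 + e2 + ... into (ptr, {e1, e2, ...}).
struct Split {
  std::string Base;                // the unique pointer operand
  std::vector<std::string> Offset; // the remaining integer operands
};

// Returns the split only when exactly one operand is a pointer. With zero
// or several pointer operands there is no single underlying pointer.
std::optional<Split> splitPointerSum(const std::vector<Term> &Terms) {
  const Term *Base = nullptr;
  std::vector<std::string> Offset;
  for (const Term &T : Terms) {
    if (!T.IsPointer) {
      Offset.push_back(T.Name);
      continue;
    }
    if (Base)
      return std::nullopt; // second pointer operand: ambiguous
    Base = &T;
  }
  if (!Base)
    return std::nullopt; // no pointer operand at all
  return Split{Base->Name, Offset};
}
```

In this model, a pointer type check on the extracted base corresponds to the `IsPointer` test; the sketch says nothing about whether such a check is sufficient in the real pass.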

The class of inputs are usually generated by other optimizations such as SROA from C++ code -- the int/ptr conversions are usually not directly created by the user code. Logically speaking, handling of inttoptr/ptrtoint is no different from BitCast operation (ideally also implemented using BitCast op, but not doable due to IR spec), so there is no reason we should treat them as opaque while not doing so for BitCast.

Unless we can fully get rid of these operations, analysis passes will still have to deal with them. Failing to handle those operations is simply a bug to be fixed, independent of whether there is a pass to reduce/eliminate them.

davidxl added inline comments. Sep 3 2017, 12:14 AM

lib/Transforms/Scalar/AlignmentFromAssumptions.cpp
276 ↗	(On Diff #113674)	Won't adding a pointer type check be enough?

In D37419#859829, @davidxl wrote:

The class of inputs are usually generated by other optimizations such as SROA from C++ code -- the int/ptr conversions are usually not directly created by the user code.

SROA shouldn't be doing that. This is the last major problematic pass in the canonicalization part of the pipeline. It also hurts our ability to do AA on that code. Please just fix SROA. It should cast to i8* and use GEPs.

Logically speaking, handling of inttoptr/ptrtoint is no different from BitCast operation (ideally also implemented using BitCast op, but not doable due to IR spec), so there is no reason we should treat them as opaque while not doing so for BitCast.

We can't ever get rid of them completely, because the conversions are necessary for C/C++ lowering. That does not mean that we need to treat them as part of our canonical form. GEPs are much easier to deal with for AA and other kinds of simplification, and so we should canonicalize toward using pointer casts and GEPs. Especially in recent years, we've definitely been pushing in that direction. Having one canonical form in this regard keeps us from having to code many things twice (once for pointers and GEPs and once for ptrtoint/inttoptr + arithmetic).
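As a rough source-level analogy for the two forms being contrasted (plain C++, not LLVM code; the function names are made up for this example), the loops below visit the same addresses. The first carries the address as an integer and converts it back on every iteration, which corresponds to the ptrtoint/inttoptr-plus-arithmetic shape. The second keeps a byte pointer and advances it with pointer arithmetic, which corresponds to the i8*-plus-GEP shape. The sketch assumes a flat address space in which the integer round trip preserves the address.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Integer-typed induction variable: the address travels as an integer and
// is cast back to a pointer at each access.
float sumViaInteger(const float *Begin, std::size_t N) {
  std::uintptr_t Addr = reinterpret_cast<std::uintptr_t>(Begin);
  float Sum = 0.0f;
  for (std::size_t I = 0; I != N; ++I) {
    Sum += *reinterpret_cast<const float *>(Addr);
    Addr += sizeof(float);
  }
  return Sum;
}

// Pointer-typed induction variable: a byte pointer advanced by a byte
// offset, so the value stays a pointer throughout the loop.
float sumViaBytePointer(const float *Begin, std::size_t N) {
  const unsigned char *P = reinterpret_cast<const unsigned char *>(Begin);
  float Sum = 0.0f;
  for (std::size_t I = 0; I != N; ++I) {
    Sum += *reinterpret_cast<const float *>(P);
    P += sizeof(float);
  }
  return Sum;
}
```

Both functions return the same result for the same input; the difference is only in which form an analysis has to reason about.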

Unless we can fully get rid of these operations, analysis passes will still have to deal with them. Failing to handle those operations is simply a bug to be fixed, independent of whether there is a pass to reduce/eliminate them.

I disagree. If we canonicalize away from them, then the only things left are things we really can't analyze anyway.

We might want some of the bug fixes from this patch regardless.

Hi David,

Generally it is not safe to look through inttoptr and ptrtoint in SCEV. https://github.com/llvm-mirror/llvm/blob/master/lib/Analysis/ScalarEvolution.cpp#L6053 has a little bit of the rationale, but basically the problem is that in:

%i0 = ptrtoint i8* %p0 to i64
%i1 = ptrtoint i8* %p1 to i64
%i2 = add i64 %i0, %i1
%ptr = inttoptr i64 %i2 to i8*

%ptr can alias both %p0 and %p1. However, if you round-trip %ptr through SCEV and SCEVExpander, then nothing stops SCEVExpander from expanding the SCEV for %ptr to

%i0 = ptrtoint i8* %p0 to i64
%i1 = ptrtoint i8* %p1 to i64
%p0_ = inttoptr i64 %i0 to i8*
%ptr = getelementptr i8, i8* %p0_, i64 %i1

and %ptr no longer aliases %p1 (assume that we can prove %p0 no-alias %p1 in both the snippets).

In other words, ptrtoint and inttoptr are not the same as bitcast since the former two have aliasing implications while the latter does not.

We could fix this in SCEV with some representational changes, but before that I'd prefer exploring Hal's direction of avoiding generating ptrtoint and inttoptr altogether.
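The aliasing argument can be illustrated with a toy "based on" computation. This is a sketch in plain C++ under simplifying assumptions, not LLVM's alias analysis or SCEV: `Expr`, `basedOn`, and the helper constructors are hypothetical, and the model assumes that an integer add inherits the provenance of both operands while a GEP inherits only that of its base.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>

struct Expr;
using ExprPtr = std::shared_ptr<Expr>;

// Toy expression node; RHS is unused for leaves and casts.
struct Expr {
  enum Kind { Pointer, PtrToInt, IntToPtr, Add, GEP };
  Kind K;
  std::string Name; // used only by Pointer leaves
  ExprPtr LHS;      // cast operand, first add operand, or GEP base
  ExprPtr RHS;      // second add operand or GEP index
};

ExprPtr make(Expr::Kind K, std::string Name, ExprPtr L, ExprPtr R) {
  return std::make_shared<Expr>(
      Expr{K, std::move(Name), std::move(L), std::move(R)});
}
ExprPtr pointer(std::string N) {
  return make(Expr::Pointer, std::move(N), nullptr, nullptr);
}
ExprPtr ptrToInt(ExprPtr P) {
  return make(Expr::PtrToInt, "", std::move(P), nullptr);
}
ExprPtr intToPtr(ExprPtr I) {
  return make(Expr::IntToPtr, "", std::move(I), nullptr);
}
ExprPtr add(ExprPtr L, ExprPtr R) {
  return make(Expr::Add, "", std::move(L), std::move(R));
}
ExprPtr gep(ExprPtr Base, ExprPtr Index) {
  return make(Expr::GEP, "", std::move(Base), std::move(Index));
}

// Set of pointers the value may be derived from in this toy model.
std::set<std::string> basedOn(const ExprPtr &X) {
  switch (X->K) {
  case Expr::Pointer:
    return {X->Name};
  case Expr::PtrToInt:
  case Expr::IntToPtr:
  case Expr::GEP:
    return basedOn(X->LHS); // a GEP index contributes no provenance
  case Expr::Add: {
    std::set<std::string> S = basedOn(X->LHS);
    std::set<std::string> R = basedOn(X->RHS);
    S.insert(R.begin(), R.end());
    return S;
  }
  }
  return {};
}
```

Building the first snippet as `intToPtr(add(ptrToInt(p0), ptrToInt(p1)))` yields the set {p0, p1}, while the expanded form `gep(intToPtr(ptrToInt(p0)), ptrToInt(p1))` yields only {p0}, which mirrors the loss of the %p1 relationship described above.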

This revision now requires changes to proceed. Sep 3 2017, 12:09 PM

Addressed review feedback.

In this version, a stricter check is added: only integer PHIs that feed an inttoptr whose result is actually used by load/store/getelementptr operations are considered for this instruction combining. This is not strictly necessary for this particular transformation, but there is a concern that some latent bug may be exposed otherwise.

Also added a negative test case.
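The stricter check can be pictured as a filter over the users of the inttoptr result. The sketch below is a simplified model in plain C++, not the patch's code: `UseKind` and `hasPointerUse` are names chosen for this illustration, and each user is reduced to a tag describing how it consumes the value.

```cpp
#include <cassert>
#include <vector>

// How a single user consumes the result of the inttoptr cast.
enum class UseKind {
  LoadAddress,  // address operand of a load
  StoreAddress, // address operand of a store
  StoredValue,  // the pointer itself is the value being stored
  GEPBase,      // base operand of a getelementptr
  Compare,      // operand of an icmp
};

// True if at least one user treats the value as an address or a GEP base.
bool hasPointerUse(const std::vector<UseKind> &Uses) {
  for (UseKind U : Uses) {
    if (U == UseKind::LoadAddress || U == UseKind::StoreAddress ||
        U == UseKind::GEPBase)
      return true;
  }
  return false;
}
```

Under this model, a cast whose only user is a comparison, as in the negative test, is rejected, while a cast that also feeds a load or a GEP is accepted.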

Um -- updated the wrong patch. Please ignore.

sanjoy resigned from this revision. Jan 29 2022, 2:49 PM

Herald added a subscriber: arichardson. Jan 29 2022, 2:49 PM

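Revision Contents

Path	Size
lib/Transforms/InstCombine/InstCombineInternal.h	4 lines
lib/Transforms/InstCombine/InstCombinePHI.cpp	219 lines
test/Transforms/InstCombine/intptr1.ll	181 lines
test/Transforms/InstCombine/intptr2.ll	37 lines
test/Transforms/InstCombine/intptr3.ll	37 lines
test/Transforms/InstCombine/intptr4.ll	47 lines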

Diff 116068

lib/Transforms/InstCombine/InstCombineInternal.h

	[91 lines not shown]

	/// \brief Try to rotate an operation below a PHI node, using PHI nodes for
	/// its operands.
	Instruction *FoldPHIArgOpIntoPHI(PHINode &PN);
	Instruction *FoldPHIArgBinOpIntoPHI(PHINode &PN);
	Instruction *FoldPHIArgGEPIntoPHI(PHINode &PN);
	Instruction *FoldPHIArgLoadIntoPHI(PHINode &PN);
	Instruction *FoldPHIArgZextsIntoPHI(PHINode &PN);
				/// If an integer typed PHI has only one use which is an IntToPtr operation,
				/// replace the PHI with an existing pointer typed PHI if it exists. Otherwise
				/// insert a new pointer typed PHI and replace the original one.
				Instruction *FoldIntegerTypedPHI(PHINode &PN);

	/// Helper function for FoldPHIArgXIntoPHI() to get debug location for the
	/// folded operation.
	DebugLoc PHIArgMergedDebugLoc(PHINode &PN);

	Instruction *foldGEPICmp(GEPOperator *GEPLHS, Value *RHS,
	ICmpInst::Predicate Cond, Instruction &I);
	Instruction *foldAllocaCmp(ICmpInst &ICI, const AllocaInst *Alloca,
	[91 lines not shown]

lib/Transforms/InstCombine/InstCombinePHI.cpp

	[33 lines not shown]
	for (unsigned i = 1; i != PN.getNumIncomingValues(); ++i) {
	auto *I = cast<Instruction>(PN.getIncomingValue(i));
	Loc = DILocation::getMergedLocation(Loc, I->getDebugLoc());
	}

	return Loc;
	}

				// Replace Integer typed PHI PN if the PHI's value is used as a pointer value.
				// If there is an existing pointer typed PHI that produces the same value as PN,
				// replace PN and the IntToPtr operation with it. Otherwise, synthesize a new
				// PHI node:
				//
				// Case-1:
				// bb1:
				// int_init = PtrToInt(ptr_init)
				// br label %bb2
				// bb2:
				// int_val = PHI([int_init, %bb1], [int_val_inc, %bb2])
				// ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2])
				// ptr_val2 = IntToPtr(int_val)
				// ...
				// use(ptr_val2)
				// ptr_val_inc = ...
				// int_val_inc = PtrToInt(ptr_val_inc)
				//
				// ==>
				// bb1:
				// br label %bb2
				// bb2:
				// ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2])
				// ...
				// use(ptr_val)
				// ptr_val_inc = ...
				//
				// Case-2:
				// bb1:
				// int_ptr = BitCast(ptr_ptr)
				// int_init = Load(int_ptr)
				// br label %bb2
				// bb2:
				// int_val = PHI([int_init, %bb1], [int_val_inc, %bb2])
				// ptr_val2 = IntToPtr(int_val)
				// ...
				// use(ptr_val2)
				// ptr_val_inc = ...
				// int_val_inc = PtrToInt(ptr_val_inc)
				// ==>
				// bb1:
				// ptr_init = Load(ptr_ptr)
				// br label %bb2
				// bb2:
				// ptr_val = PHI([ptr_init, %bb1], [ptr_val_inc, %bb2])
				// ...
				// use(ptr_val)
				// ptr_val_inc = ...
				// ...
				//
				Instruction *InstCombiner::FoldIntegerTypedPHI(PHINode &PN) {
				if (!PN.getType()->isIntegerTy())
				return nullptr;
				if (!PN.hasOneUse())
				return nullptr;

				auto *IntToPtr = dyn_cast<IntToPtrInst>(PN.user_back());
				if (!IntToPtr)
				return nullptr;

				// Check if the pointer is actually used as pointer:
				auto HasPointerUse = [](Instruction *IIP) {
				for (User *U : IIP->users()) {
				Value *Ptr = nullptr;
				if (LoadInst *LoadI = dyn_cast<LoadInst>(U)) {
				Ptr = LoadI->getPointerOperand();
				} else if (StoreInst *SI = dyn_cast<StoreInst>(U)) {
				Ptr = SI->getPointerOperand();
				} else if (GetElementPtrInst *GI = dyn_cast<GetElementPtrInst>(U)) {
				Ptr = GI->getPointerOperand();
				}

				if (Ptr && Ptr == IIP)
				return true;
				}
				return false;
				};

				if (!HasPointerUse(IntToPtr))
				return nullptr;

				if (DL.getPointerSizeInBits(IntToPtr->getAddressSpace()) !=
				DL.getTypeSizeInBits(IntToPtr->getOperand(0)->getType()))
				return nullptr;

				SmallVector<Value *, 4> AvailablePtrVals;
				for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
				Value *Arg = PN.getIncomingValue(i);

				// First look backward:
				if (auto *PI = dyn_cast<PtrToIntInst>(Arg)) {
				AvailablePtrVals.emplace_back(PI->getOperand(0));
				continue;
				}

				// Next look forward:
				Value *ArgIntToPtr = nullptr;
				for (User *U : Arg->users()) {
				if (isa<IntToPtrInst>(U) && U->getType() == IntToPtr->getType() &&
				(DT.dominates(cast<Instruction>(U), PN.getIncomingBlock(i)) ||
				cast<Instruction>(U)->getParent() == PN.getIncomingBlock(i))) {
				ArgIntToPtr = U;
				break;
				}
				}

				if (ArgIntToPtr) {
				AvailablePtrVals.emplace_back(ArgIntToPtr);
				continue;
				}

				// If Arg is defined by a PHI, allow it. This will also create
				// more opportunities iteratively.
				if (isa<PHINode>(Arg)) {
				AvailablePtrVals.emplace_back(Arg);
				continue;
				}

				// For a single use integer load:
				auto *LoadI = dyn_cast<LoadInst>(Arg);
				if (!LoadI)
				return nullptr;

				if (!LoadI->hasOneUse())
				return nullptr;

				// Push the integer typed Load instruction into the available
				// value set, and fix it up later when the pointer typed PHI
				// is synthesized.
				AvailablePtrVals.emplace_back(LoadI);
				}

				// Now search for a matching PHI
				auto *BB = PN.getParent();
				assert(AvailablePtrVals.size() == PN.getNumIncomingValues() &&
				"Not enough available ptr typed incoming values");
				PHINode *MatchingPtrPHI = nullptr;
				for (auto II = BB->begin(), EI = BasicBlock::iterator(BB->getFirstNonPHI());
				II != EI; II++) {
				PHINode *PtrPHI = dyn_cast<PHINode>(II);
				if (!PtrPHI || PtrPHI == &PN)
				continue;
				MatchingPtrPHI = PtrPHI;
				for (unsigned i = 0; i != PtrPHI->getNumIncomingValues(); ++i) {
				if (AvailablePtrVals[i] != PtrPHI->getIncomingValue(i)) {
				MatchingPtrPHI = nullptr;
				break;
				}
				}

				if (MatchingPtrPHI)
				break;
				}

				if (MatchingPtrPHI) {
				assert(MatchingPtrPHI->getType() == IntToPtr->getType() &&
				"Phi's Type does not match with IntToPtr");
				// The PtrToInt + IntToPtr will be simplified later
				return CastInst::CreateBitOrPointerCast(MatchingPtrPHI,
				IntToPtr->getOperand(0)->getType());
				}

				// If it requires a conversion for every PHI operand, do not do it.
				if (std::all_of(AvailablePtrVals.begin(), AvailablePtrVals.end(),
				[&](Value *V) {
				return (V->getType() != IntToPtr->getType()) ||
				isa<IntToPtrInst>(V);
				}))
				return nullptr;

				PHINode *NewPtrPHI = PHINode::Create(
				IntToPtr->getType(), PN.getNumIncomingValues(), PN.getName() + ".ptr");

				InsertNewInstBefore(NewPtrPHI, PN);
				for (unsigned i = 0; i != PN.getNumIncomingValues(); ++i) {
				auto *IncomingBB = PN.getIncomingBlock(i);
				auto *IncomingVal = AvailablePtrVals[i];

				if (IncomingVal->getType() == IntToPtr->getType()) {
				NewPtrPHI->addIncoming(IncomingVal, IncomingBB);
				continue;
				}

				#ifndef NDEBUG
				LoadInst *LoadI = dyn_cast<LoadInst>(IncomingVal);
				assert((isa<PHINode>(IncomingVal) || (LoadI && LoadI->hasOneUse())) &&
				"Can not replace LoadInst with multiple uses");
				#endif
				// Need to insert a BitCast.
				// For an integer Load instruction with a single use, the load + IntToPtr
				// cast will be simplified into a pointer load:
				// %v = load i64, i64* %a.ip, align 8
				// %v.cast = inttoptr i64 %v to float **
				// ==>
				// %v.ptrp = bitcast i64 * %a.ip to float **
				// %v.cast = load float *, float ** %v.ptrp, align 8
				auto *CI = CastInst::CreateBitOrPointerCast(
				IncomingVal, IntToPtr->getType(), IncomingVal->getName() + ".ptr");
				if (auto *IncomingI = dyn_cast<Instruction>(IncomingVal)) {
				BasicBlock::iterator InsertPos(IncomingI);
				InsertPos++;
				if (isa<PHINode>(IncomingI))
				InsertPos = IncomingI->getParent()->getFirstInsertionPt();
				InsertNewInstBefore(CI, *InsertPos);
				} else {
				auto *InsertBB = &IncomingBB->getParent()->getEntryBlock();
				InsertNewInstBefore(CI, *InsertBB->getFirstInsertionPt());
				}
				NewPtrPHI->addIncoming(CI, IncomingBB);
				}

				// The PtrToInt + IntToPtr will be simplified later
				return CastInst::CreateBitOrPointerCast(NewPtrPHI,
				IntToPtr->getOperand(0)->getType());
				}

	/// If we have something like phi [add (a,b), add(a,c)] and if a/b/c and the
	/// adds all have a single use, turn this into a phi and a single binop.
	Instruction *InstCombiner::FoldPHIArgBinOpIntoPHI(PHINode &PN) {
	Instruction *FirstInst = cast<Instruction>(PN.getIncomingValue(0));
	assert(isa<BinaryOperator>(FirstInst) || isa<CmpInst>(FirstInst));
	unsigned Opc = FirstInst->getOpcode();
	Value *LHSVal = FirstInst->getOperand(0);
	Value *RHSVal = FirstInst->getOperand(1);
	[182 lines not shown]
	PN.getIncomingValue(0)->hasOneUse())
	if (Instruction *Result = FoldPHIArgOpIntoPHI(PN))
	return Result;

	// If this is a trivial cycle in the PHI node graph, remove it. Basically, if
	// this PHI only has a single use (a PHI), and if that PHI only has one use (a
	// PHI)... break the cycle.
	if (PN.hasOneUse()) {
				if (Instruction *Result = FoldIntegerTypedPHI(PN))
				return Result;

	Instruction *PHIUser = cast<Instruction>(PN.user_back());
	if (PHINode *PU = dyn_cast<PHINode>(PHIUser)) {
	SmallPtrSet<PHINode*, 16> PotentiallyDeadPHIs;
	PotentiallyDeadPHIs.insert(&PN);
	if (DeadPHICycle(PU, PotentiallyDeadPHIs))
	return replaceInstUsesWith(PN, UndefValue::get(PN.getType()));
	}

	[91 lines not shown]

test/Transforms/InstCombine/intptr1.ll

				; RUN: opt < %s -instcombine -S | FileCheck %s


				define void @test1(float* %a, float* readnone %a_end, i64* %b.i64) {
				; CHECK-LABEL: @test1
				entry:
				%cmp1 = icmp ult float* %a, %a_end
				br i1 %cmp1, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				%b = load i64, i64* %b.i64, align 8
				; CHECK: load float*, float**
				br label %for.body

				for.body: ; preds = %for.body, %for.body.preheader
				%a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				%b.addr.02 = phi i64 [ %add.int, %for.body ], [ %b, %for.body.preheader ]

				; CHECK: %a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				; CHECK: %b.addr.02.ptr = phi float* [ %add, %for.body ],
				; CHECK-NOT: %b.addr.02 = phi i64

				%tmp = inttoptr i64 %b.addr.02 to float*
				; CHECK-NOT: inttoptr i64
				%tmp1 = load float, float* %tmp, align 4
				%mul.i = fmul float %tmp1, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%add = getelementptr inbounds float, float* %tmp, i64 1
				%add.int = ptrtoint float* %add to i64
				; CHECK-NOT: ptrtoint float*
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}

				define void @test1_neg(float* %a, float* readnone %a_end, i64* %b.i64) {
				; CHECK-LABEL: @test1_neg
				entry:
				%cmp1 = icmp ult float* %a, %a_end
				br i1 %cmp1, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				%b = load i64, i64* %b.i64, align 8
				br label %for.body

				for.body: ; preds = %for.body, %for.body.preheader
				%a.addr.03 = phi float* [ %incdec.ptr, %bb ], [ %a, %for.body.preheader ]
				%b.addr.02 = phi i64 [ %add.int, %bb ], [ %b, %for.body.preheader ]

				; CHECK: %a.addr.03 = phi float* [ %incdec.ptr, %bb ], [ %a, %for.body.preheader ]
				; CHECK: %b.addr.02 = phi i64

				%tmp = inttoptr i64 %b.addr.02 to float*
				; CHECK: inttoptr i64
				%ptrcmp = icmp ult float* %tmp, %a_end
				br i1 %ptrcmp, label %for.end, label %bb

				bb:
				%tmp1 = load float, float* %a, align 4
				%mul.i = fmul float %tmp1, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%add = getelementptr inbounds float, float* %a, i64 1
				%add.int = ptrtoint float* %add to i64
				; CHECK: ptrtoint float*
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}


				define void @test2(float* %a, float* readnone %a_end, float** %b.float) {
				; CHECK-LABEL: @test2
				entry:
				%cmp1 = icmp ult float* %a, %a_end
				br i1 %cmp1, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				%b.i64 = bitcast float** %b.float to i64*
				%b = load i64, i64* %b.i64, align 8
				; CHECK: load float*, float**
				br label %for.body

				for.body: ; preds = %for.body, %for.body.preheader
				%a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				%b.addr.02 = phi i64 [ %add.int, %for.body ], [ %b, %for.body.preheader ]

				; CHECK: %a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				; CHECK: %b.addr.02.ptr = phi float* [ %add, %for.body ],
				; CHECK-NOT: %b.addr.02 = phi i64

				%tmp = inttoptr i64 %b.addr.02 to float*
				; CHECK-NOT: inttoptr i64
				%tmp1 = load float, float* %tmp, align 4
				%mul.i = fmul float %tmp1, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%add = getelementptr inbounds float, float* %tmp, i64 1
				%add.int = ptrtoint float* %add to i64
				; CHECK-NOT: ptrtoint float*
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}


				define void @test3(float* %a, float* readnone %a_end, i8** %b.i8p) {
				; CHECK-LABEL: @test3
				entry:
				%cmp1 = icmp ult float* %a, %a_end
				br i1 %cmp1, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				%b.i64 = bitcast i8** %b.i8p to i64*
				%b = load i64, i64* %b.i64, align 8
				; CHECK: load float*, float**
				br label %for.body

				for.body: ; preds = %for.body, %for.body.preheader
				%a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				%b.addr.02 = phi i64 [ %add.int, %for.body ], [ %b, %for.body.preheader ]

				; CHECK: %a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				; CHECK: %b.addr.02.ptr = phi float* [ %add, %for.body ],
				; CHECK-NOT: %b.addr.02 = phi i64

				%tmp = inttoptr i64 %b.addr.02 to float*
				; CHECK-NOT: inttoptr i64
				%tmp1 = load float, float* %tmp, align 4
				%mul.i = fmul float %tmp1, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%add = getelementptr inbounds float, float* %tmp, i64 1
				%add.int = ptrtoint float* %add to i64
				; CHECK-NOT: ptrtoint float*
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}


				define void @test4(float* %a, float* readnone %a_end, float** %b.float) {
				entry:
				; CHECK-LABEL: @test4
				%cmp1 = icmp ult float* %a, %a_end
				br i1 %cmp1, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				%b.f = load float*, float** %b.float, align 8
				%b = ptrtoint float* %b.f to i64
				; CHECK: load float*, float**
				; CHECK-NOT: ptrtoint float*
				br label %for.body

				for.body: ; preds = %for.body, %for.body.preheader
				%a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				%b.addr.02 = phi i64 [ %add.int, %for.body ], [ %b, %for.body.preheader ]
				%tmp = inttoptr i64 %b.addr.02 to float*
				; CHECK-NOT: inttoptr i64
				%tmp1 = load float, float* %tmp, align 4
				%mul.i = fmul float %tmp1, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%add = getelementptr inbounds float, float* %tmp, i64 1
				%add.int = ptrtoint float* %add to i64
				; CHECK-NOT: ptrtoint float*
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}

test/Transforms/InstCombine/intptr2.ll

				; RUN: opt < %s -instcombine -S | FileCheck %s

				define void @test1(float* %a, float* readnone %a_end, i32* %b.i) {
				; CHECK-LABEL: @test1
				entry:
				%cmp1 = icmp ult float* %a, %a_end
				br i1 %cmp1, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				%b = ptrtoint i32 * %b.i to i64
				; CHECK: bitcast
				; CHECK-NOT: ptrtoint
				br label %for.body

				for.body: ; preds = %for.body, %for.body.preheader
				%a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				%b.addr.02 = phi i64 [ %add.int, %for.body ], [ %b, %for.body.preheader ]

				; CHECK: %a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				; CHECK-NOT: phi i64

				%tmp = inttoptr i64 %b.addr.02 to float*
				; CHECK-NOT: inttoptr
				%tmp1 = load float, float* %tmp, align 4
				%mul.i = fmul float %tmp1, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%add = getelementptr inbounds float, float* %tmp, i64 1
				%add.int = ptrtoint float* %add to i64
				; CHECK-NOT: ptrtoint
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}

test/Transforms/InstCombine/intptr3.ll

				; RUN: opt < %s -instcombine -S | FileCheck %s


				define void @test(float* %a, float* readnone %a_end, i64 %b) unnamed_addr {
				entry:
				%cmp1 = icmp ult float* %a, %a_end
				br i1 %cmp1, label %for.body.preheader, label %for.end

				for.body.preheader: ; preds = %entry
				%b.float = inttoptr i64 %b to float*
				br label %for.body

				for.body: ; preds = %for.body.preheader, %for.body
				%a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				%b.addr.float = phi float* [ %b.addr.float.inc, %for.body ], [ %b.float, %for.body.preheader ]
				%b.addr.i64 = phi i64 [ %b.addr.i64.inc, %for.body ], [ %b, %for.body.preheader ]
				; CHECK: %a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				; CHECK-NEXT: %b.addr.float = phi float* [ %b.addr.float.inc, %for.body ], [ %b.float, %for.body.preheader ]
				; CHECK-NEXT: = load float
				%l = load float, float* %b.addr.float, align 4
				%mul.i = fmul float %l, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%b.addr.float.2 = inttoptr i64 %b.addr.i64 to float*
				; CHECK-NOT: inttoptr
				%b.addr.float.inc = getelementptr inbounds float, float* %b.addr.float.2, i64 1
				%b.addr.i64.inc = ptrtoint float* %b.addr.float.inc to i64
				; CHECK-NOT: ptrtoint
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}

test/Transforms/InstCombine/intptr4.ll

				; RUN: opt < %s -instcombine -S | FileCheck %s


				define void @test(float* %a, float* readnone %a_end, i64 %b, float* %bf) unnamed_addr {
				entry:
				%cmp1 = icmp ult float* %a, %a_end
				%b.float = inttoptr i64 %b to float*
				br i1 %cmp1, label %bb1, label %bb2

				bb1:
				br label %for.body.preheader
				bb2:
				%bfi = ptrtoint float* %bf to i64
				br label %for.body.preheader

				for.body.preheader: ; preds = %entry
				%b.phi = phi i64 [%b, %bb1], [%bfi, %bb2]
				; CHECK-LABEL: for.body.preheader
				; CHECK-NOT: %b.phi = phi i64
				br label %for.body

				for.body: ; preds = %for.body.preheader, %for.body
				; CHECK-LABEL: for.body
				%a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				%b.addr.float = phi float* [ %b.addr.float.inc, %for.body ], [ %b.float, %for.body.preheader ]
				%b.addr.i64 = phi i64 [ %b.addr.i64.inc, %for.body ], [ %b.phi, %for.body.preheader ]
				; CHECK: %a.addr.03 = phi float* [ %incdec.ptr, %for.body ], [ %a, %for.body.preheader ]
				; CHECK-NEXT: %b.addr.float = phi float* [ %b.addr.float.inc, %for.body ], [ %b.float, %for.body.preheader ]
				; CHECK-NOT: = %b.addr.i64
				%l = load float, float* %b.addr.float, align 4
				%mul.i = fmul float %l, 4.200000e+01
				store float %mul.i, float* %a.addr.03, align 4
				%b.addr.float.2 = inttoptr i64 %b.addr.i64 to float*
				; CHECK-NOT: inttoptr
				%b.addr.float.inc = getelementptr inbounds float, float* %b.addr.float.2, i64 1
				%b.addr.i64.inc = ptrtoint float* %b.addr.float.inc to i64
				; CHECK-NOT: ptrtoint
				%incdec.ptr = getelementptr inbounds float, float* %a.addr.03, i64 1
				%cmp = icmp ult float* %incdec.ptr, %a_end
				br i1 %cmp, label %for.body, label %for.end

				for.end: ; preds = %for.body, %entry
				ret void
				}