This is an archive of the discontinued LLVM Phabricator instance.

Assume GetElementPtr offsets to be inbounds
Closed, Public

Authored by grosser on Nov 22 2014, 12:43 AM.

Details

Reviewers

sebpop
grosser
simbuerg
dpeixott
jdoerfert

Summary

If a GEP instruction references a fixed-size array, e.g., an access
A[i][j] into an array A[100x100], LLVM-IR does not guarantee that the subscripts
always compute values that are within array bounds. We now derive the set of
parameter values for which all accesses are within bounds and add the assumption
that the scop is only ever executed with this set of parameter values.

Example:

void foo(float A[10][20], long n, long m) {

for (long i = 0; i < n; i++)
  for (long j = 0; j < m; j++)
    A[i][j] = ...

This loop yields out-of-bound accesses if m is greater than 20 and at the same
time at least one iteration of the outer loop is executed. Hence, we assume:

n <= 0 or m <= 20.

Doing so simplifies the dependence analysis problem, allows us to perform
more optimizations and generate better code.
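
As a sanity check of the assumption above, the following minimal sketch walks the example loop nest for a small grid of parameter values and compares the outcome with `n <= 0 or m <= 20`. It is plain C++ and does not use Polly or isl. The helper names (`innerSubscriptInBounds`, `assumedContext`, `assumptionMatchesBruteForce`, `selfCheck`) are made up for illustration, and only the second subscript is checked against its extent of 20.

```cpp
#include <cassert>

// Extent of the second array dimension in the example (A[10][20]).
constexpr long kInnerExtent = 20;

// Walks the loop nest from the example and reports whether every executed
// access A[i][j] keeps j below the extent of the second dimension.
// The first dimension is deliberately not checked.
bool innerSubscriptInBounds(long n, long m) {
  for (long i = 0; i < n; ++i)
    for (long j = 0; j < m; ++j)
      if (j >= kInnerExtent)
        return false;
  return true;
}

// The assumption stated in the summary: n <= 0 or m <= 20.
bool assumedContext(long n, long m) { return n <= 0 || m <= kInnerExtent; }

// Compares both predicates on a small grid of parameter values.
bool assumptionMatchesBruteForce() {
  for (long n = -2; n <= 12; ++n)
    for (long m = -2; m <= 25; ++m)
      if (innerSubscriptInBounds(n, m) != assumedContext(n, m))
        return false;
  return true;
}

// Illustrative self-check; call it from any test driver.
void selfCheck() { assert(assumptionMatchesBruteForce()); }
```

The grid is intentionally small. The point is only that the closed-form condition and the enumerated loop nest agree, including the case where the outer loop never runs.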

TODO: The location where the GEP instruction is executed is not necessarily the
location where the memory is actually accessed. As a result scanning for GEP[s]
is imprecise. Even though this is not a correctness problem, this imprecision
may result in missed optimizations or non-optimal run-time checks.

In polybench, where this mismatch between parametric loop bounds and fixed-size
arrays is common, this patch yields significant reductions in compile time
(up to 50%) and execution time (up to 70%). We see two significant compile-time
regressions (fdtd-2d, jacobi-2d-imper) and one execution-time regression (trmm).
These regressions arise due to additional optimizations that have been enabled by
this patch. They can be addressed in subsequent commits.

Diff Detail

Event Timeline

grosser updated this revision to Diff 16523. Nov 22 2014, 12:43 AM

grosser set the title of this revision to Assume GetElementPtr offsets to be inbounds.

grosser updated this object.

grosser added reviewers: jdoerfert, sebpop, dpeixott, simbuerg.

grosser added a subscriber: Unknown Object (MLST).

Here are the performance results for this patch:

performance-results.html (528 KB)

Some minor comments, otherwise LGTM.

include/polly/ScopInfo.h
476	What about n > 10 and m > 1? Shouldn't that be a violation too?
lib/Analysis/ScopInfo.cpp
850	I would rename Inst to GEP or GEPInst or sth. to make the type clear.
856	Theoretically, we could make this a while loop and increment the dimension; however, I don't think it will ever matter. Stripping structs might be worthwhile, though?
867	1 + Dimension is Operand!
869	Please move Parent.getSE() before the loop

Thanks for the review. I will commit it as soon as our LNT builders are green again.

include/polly/ScopInfo.h
476	The first dimension does not carry any size information. I declared the array as A[][20] to make this clear.
lib/Analysis/ScopInfo.cpp
850	Done.
856	Possibly. I do not have a test case yet. I think we can add it in a subsequent commit if found useful.
867	Fixed.
869	Fixed.
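
The reply about line 476 rests on a language rule: the outermost extent of an array parameter is dropped when the parameter is adjusted to a pointer. This is consistent with the test IR in this revision, where `float A[10][20][30]` appears as `[20 x [30 x float]]* %A`. A minimal sketch of that rule, with hypothetical declaration names `withExtent` and `withoutExtent`:

```cpp
#include <type_traits>

// Two declarations that differ only in the outermost extent of the array
// parameter. Neither function is defined or called; only the types matter.
void withExtent(float A[10][20], long n, long m);
void withoutExtent(float A[][20], long n, long m);

// Both declarations have the same function type.
constexpr bool kSameType =
    std::is_same<decltype(withExtent), decltype(withoutExtent)>::value;

// The parameter is a pointer to an array of 20 floats; the 10 is gone.
constexpr bool kDecaysToPointer =
    std::is_same<decltype(withExtent),
                 void(float (*)[20], long, long)>::value;

static_assert(kSameType, "A[10][20] and A[][20] declare the same parameter");
static_assert(kDecaysToPointer, "the outermost extent is not part of the type");
```

Since the extent 10 is not part of the parameter type, it is not available to an analysis that only looks at types, so `n > 10` cannot be flagged from the type alone.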

Accept revision to be able to close it.

This revision is now accepted and ready to land. Nov 25 2014, 2:53 AM

Committed in 222754.

Revision Contents

Path                                                          Size
include/polly/ScopInfo.h                                      32 lines
lib/Analysis/Dependences.cpp                                  3 lines
lib/Analysis/ScopInfo.cpp                                     56 lines
test/Dependences/sequential_loops.ll                          2 lines
test/Isl/Ast/OpenMP/nested_loop_both_parallel_parametric.ll   21 lines
test/Isl/Ast/alias_simple_1.ll                                10 lines
test/Isl/Ast/alias_simple_2.ll                                10 lines
test/Isl/Ast/alias_simple_3.ll                                10 lines
test/ScopInfo/assume_gep_bounds.ll                            76 lines
test/ScopInfo/assume_gep_bounds_2.ll                          94 lines

Diff 16523

include/polly/ScopInfo.h

 class ScopStmt {
 [...]
   void checkForReductions();

   /// @brief Collect loads which might form a reduction chain with @p StoreMA
   void
   collectCandiateReductionLoads(MemoryAccess *StoreMA,
                                 llvm::SmallVectorImpl<MemoryAccess *> &Loads);
   //@}

+  /// @brief Derive assumptions about parameter values from GetElementPtrInst
+  ///
+  /// In case a GEP instruction references into a fixed size array e.g., an
+  /// access A[i][j] into an array A[100x100], LLVM-IR does not guarantee that
+  /// the subscripts always compute values that are within array bounds. In this
+  /// function we derive the set of parameter values for which all accesses are
+  /// within bounds and add the assumption that the scop is only ever executed
+  /// with this set of parameter values.
+  ///
+  /// Example:
+  ///
+  ///   void foo(float A[10][20], long n, long m) {
+  ///     for (long i = 0; i < n; i++)
+  ///       for (long j = 0; j < m; j++)
+  ///         A[i][j] = ...
+  ///
+  /// This loop yields out-of-bound accesses if m is greater than 20 and at the
+  /// same time at least one iteration of the outer loop is executed. Hence, we
+  /// assume:
[Inline comment] jdoerfert: What about n > 10 and m > 1? Shouldn't that be a violation too?
[Inline comment] grosser: The first dimension does not carry any size information. I declared the array as A[][20] to make this clear.
+  ///
+  ///   n <= 0 or m <= 20.
+  ///
+  /// TODO: The location where the GEP instruction is executed is not
+  /// necessarily the location where the memory is actually accessed. As a
+  /// result scanning for GEP[s] is imprecise. Even though this is not a
+  /// correctness problem, this imprecision may result in missed optimizations
+  /// or non-optimal run-time checks.
+  void deriveAssumptionsFromGEP(GetElementPtrInst *Inst);
+
+  /// @brief Scan the scop and derive assumptions about parameter values.
+  void deriveAssumptions();

   /// Create the ScopStmt from a BasicBlock.
   ScopStmt(Scop &parent, TempScop &tempScop, const Region &CurRegion,
            BasicBlock &bb, SmallVectorImpl<Loop *> &NestLoops,
            SmallVectorImpl<unsigned> &Scatter);

   friend class Scop;

 public:
 [...]

lib/Analysis/Dependences.cpp

     for (MemoryAccess *MA : *Stmt) {
 [...]
       if (MA->isRead())
         *Read = isl_union_map_add_map(*Read, accdom);
       else
         *Write = isl_union_map_add_map(*Write, accdom);
     }
     *StmtSchedule = isl_union_map_add_map(*StmtSchedule, Stmt->getScattering());
   }

+  *StmtSchedule =
+      isl_union_map_intersect_params(*StmtSchedule, S.getAssumedContext());
 }

 /// @brief Fix all dimension of @p Zero to 0 and add it to @p user
 static int fixSetToZero(__isl_take isl_set *Zero, void *user) {
   isl_union_set **User = (isl_union_set **)user;
   for (unsigned i = 0; i < isl_set_dim(Zero, isl_dim_set); i++)
     Zero = isl_set_fix_si(Zero, isl_dim_set, i, 0);
   *User = isl_union_set_add_set(*User, Zero);
 [...]

lib/Analysis/ScopInfo.cpp

 __isl_give isl_set *ScopStmt::buildDomain(TempScop &tempScop,
 [...]
   Domain = isl_set_universe(Space);
   Domain = addLoopBoundsToDomain(Domain, tempScop);
   Domain = addConditionsToDomain(Domain, tempScop, CurRegion);
   Domain = isl_set_set_tuple_id(Domain, Id);

   return Domain;
 }

+void ScopStmt::deriveAssumptionsFromGEP(GetElementPtrInst *Inst) {
[Inline comment] jdoerfert: I would rename Inst to GEP or GEPInst or sth. to make the type clear.
[Inline comment] grosser: Done.
+  int Dimension = 0;
+  isl_ctx *Ctx = Parent.getIslCtx();
+  isl_local_space *LSpace = isl_local_space_from_space(getDomainSpace());
+  Type *Ty = Inst->getPointerOperandType();
+
+  if (auto *PtrTy = dyn_cast<PointerType>(Ty)) {
[Inline comment] jdoerfert: Theoretically, we could make this a while loop and increment the dimension; however, I don't think it will ever matter. Stripping structs might be worthwhile, though?
[Inline comment] grosser: Possibly. I do not have a test case yet. I think we can add it in a subsequent commit if found useful.
+    Dimension = 1;
+    Ty = PtrTy->getElementType();
+  }
+
+  while (auto ArrayTy = dyn_cast<ArrayType>(Ty)) {
+    unsigned int Operand = 1 + Dimension;
+
+    if (Inst->getNumOperands() <= Operand)
+      break;
+
+    const SCEV *Expr = Parent.getSE()->getSCEV(Inst->getOperand(1 + Dimension));
[Inline comment] jdoerfert: 1 + Dimension is Operand!
[Inline comment] grosser: Fixed.
+
+    if (isAffineExpr(&Parent.getRegion(), Expr, *Parent.getSE())) {
[Inline comment] jdoerfert: Please move Parent.getSE() before the loop
[Inline comment] grosser: Fixed.
+      isl_pw_aff *AccessOffset = SCEVAffinator::getPwAff(this, Expr);
+      AccessOffset =
+          isl_pw_aff_set_tuple_id(AccessOffset, isl_dim_in, getDomainId());
+
+      isl_pw_aff *DimSize = isl_pw_aff_from_aff(isl_aff_val_on_domain(
+          isl_local_space_copy(LSpace),
+          isl_val_int_from_si(Ctx, ArrayTy->getNumElements())));
+
+      isl_set *OutOfBound = isl_pw_aff_ge_set(AccessOffset, DimSize);
+      OutOfBound = isl_set_intersect(getDomain(), OutOfBound);
+      OutOfBound = isl_set_params(OutOfBound);
+      isl_set *InBound = isl_set_complement(OutOfBound);
+      isl_set *Executed = isl_set_params(getDomain());
+
+      // A => B == !A or B
+      isl_set *InBoundIfExecuted =
+          isl_set_union(isl_set_complement(Executed), InBound);
+
+      Parent.addAssumption(InBoundIfExecuted);
+    }
+
+    Dimension += 1;
+    Ty = ArrayTy->getElementType();
+  }
+
+  isl_local_space_free(LSpace);
+}
+
+void ScopStmt::deriveAssumptions() {
+  for (Instruction &Inst : *BB)
+    if (auto *GEP = dyn_cast<GetElementPtrInst>(&Inst))
+      deriveAssumptionsFromGEP(GEP);
+}

 ScopStmt::ScopStmt(Scop &parent, TempScop &tempScop, const Region &CurRegion,
                    BasicBlock &bb, SmallVectorImpl<Loop *> &Nest,
                    SmallVectorImpl<unsigned> &Scatter)
     : Parent(parent), BB(&bb), IVS(Nest.size()), NestLoops(Nest.size()) {
   // Setup the induction variables.
   for (unsigned i = 0, e = Nest.size(); i < e; ++i) {
     if (!SCEVCodegen) {
       PHINode *PN = Nest[i]->getCanonicalInductionVariable();
       assert(PN && "Non canonical IV in Scop!");
       IVS[i] = PN;
     }
     NestLoops[i] = Nest[i];
   }

   BaseName = getIslCompatibleName("Stmt_", &bb, "");

   Domain = buildDomain(tempScop, CurRegion);
   buildScattering(Scatter);
   buildAccesses(tempScop);
   checkForReductions();
+  deriveAssumptions();
 }

 /// @brief Collect loads which might form a reduction chain with @p StoreMA
 ///
 /// Check if the stored value for @p StoreMA is a binary operator with one or
 /// two loads as operands. If the binary operand is commutative & associative,
 /// used only once (by @p StoreMA) and its load operands are also used only
 /// once, we have found a possible reduction chain. It starts at an operand
 [...]
 }

 __isl_give isl_set *Scop::getAssumedContext() const {
   return isl_set_copy(AssumedContext);
 }

 void Scop::addAssumption(__isl_take isl_set *Set) {
   AssumedContext = isl_set_intersect(AssumedContext, Set);
+  AssumedContext = isl_set_coalesce(AssumedContext);
 }

 void Scop::printContext(raw_ostream &OS) const {
   OS << "Context:\n";

   if (!Context) {
     OS.indent(4) << "n/a\n\n";
     return;
 [...]

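The operand walk in `deriveAssumptionsFromGEP` above can be modeled without LLVM to see which GEP index is compared against which array extent. The sketch below is a plain C++ stand-in, not LLVM API: `boundedOperands` and `demo` are hypothetical names, the array extents are passed as a vector (outermost first, after the pointer type has been stripped), and `numOperands` counts the pointer operand plus all indices, as in the diff as posted.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Returns pairs of (GEP operand number, array extent it is compared against).
// Operand 0 is the pointer and operand 1 is the index through the pointer,
// which has no known extent, so the walk starts at Dimension = 1.
std::vector<std::pair<unsigned, long>>
boundedOperands(const std::vector<long> &arrayExtents, unsigned numOperands) {
  std::vector<std::pair<unsigned, long>> result;
  unsigned dimension = 1;
  for (long extent : arrayExtents) {
    unsigned operand = 1 + dimension;
    if (numOperands <= operand)
      break; // the GEP does not index this deep into the array type
    result.emplace_back(operand, extent);
    ++dimension;
  }
  return result;
}

// Illustrative use for a GEP shaped like
//   getelementptr [20 x [30 x float]]* %A, i64 %i, i64 %j, i64 %k
void demo() {
  auto pairs = boundedOperands({20, 30}, 4);
  assert(pairs.size() == 2); // %j against 20 and %k against 30; %i is skipped
}
```

In this model the first index never gets a bound, which matches the reply given to the inline comment on line 476.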
test/Dependences/sequential_loops.ll

 [...]
 exit.2:
   ret void
 }

 ; VALUE: region: 'S1 => exit.2' in function 'parametric_offset':
 ; VALUE: RAW dependences:
 ; VALUE: [p] -> {
 ; VALUE: Stmt_S1[i0] -> Stmt_S2[-p + i0] :
-; VALUE: i0 >= p and i0 <= 9 + p and i0 >= 0 and i0 <= 99
+; VALUE: i0 >= p and i0 <= 9 + p and p <= 190 and i0 <= 99 and i0 >= 0
 ; VALUE: }
 ; VALUE: WAR dependences:
 ; VALUE: [p] -> {
 ; VALUE: }
 ; VALUE: WAW dependences:
 ; VALUE: [p] -> {
 ; VALUE: }
 [...]

test/Isl/Ast/OpenMP/nested_loop_both_parallel_parametric.ll

 [...]
 loop.i.backedge:
   %i.next = add nsw i64 %i, 1
   br label %loop.i

 ret:
   fence seq_cst
   ret void
 }

-; At the first look both loops seem parallel, however due to the linearization
-; of memory access functions, we get the following dependences:
-; [n] -> { loop_body[i0, i1] -> loop_body[1024 + i0, -1 + i1]:
-; 0 <= i0 < n - 1024 and 1 <= i1 < n}
-; They cause the outer loop to be non-parallel. We can only prove their
-; absence, if we know that n < 1024. This information is currently not available
-; to polly. However, we should be able to obtain it due to the out of bounds
-; memory accesses, that would happen if n >= 1024.
-
-; Note that we do not delinearize this access function because it is considered
-; to already be affine: {{0,+,4}<%loop.i>,+,4096}<%loop.j>.
-
+; CHECK: if (n <= 1024 ? 1 : 0)
+; CHECK: #pragma omp parallel for
 ; CHECK: for (int c1 = 0; c1 < n; c1 += 1)
 ; CHECK: #pragma simd
-; CHECK: #pragma omp parallel for
 ; CHECK: for (int c3 = 0; c3 < n; c3 += 1)
 ; CHECK: Stmt_loop_body(c1, c3);

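The dependence quoted in the removed comment of this test follows from simple arithmetic on the linearized access function `{{0,+,4}<%loop.i>,+,4096}<%loop.j>`. The sketch below is an illustration only; `byteOffset`, `hasCrossRowAlias` and `demo` are hypothetical names, and `i0`/`i1` follow the naming in the dependence relation `loop_body[i0, i1] -> loop_body[1024 + i0, -1 + i1]`.

```cpp
#include <cassert>

// Byte offset of the access: 4 bytes per step of i0, 4096 bytes per step of i1.
constexpr long byteOffset(long i0, long i1) { return 4 * i0 + 4096 * i1; }

// Iterations (i0, i1) and (i0 + 1024, i1 - 1) map to the same byte offset.
static_assert(byteOffset(0, 1) == byteOffset(1024, 0),
              "moving 1024 steps in i0 equals one step in i1");

// Reports whether some pair of that form exists with both iterations inside
// the iteration space 0 <= i0 < n and 0 <= i1 < n.
bool hasCrossRowAlias(long n) {
  for (long i0 = 0; i0 < n; ++i0)
    for (long i1 = 1; i1 < n; ++i1)
      if (i0 + 1024 < n) // partner (i0 + 1024, i1 - 1) is also executed
        return true;
  return false;
}

// Illustrative use: no such pair for n = 1024, one appears for n = 1025.
void demo() {
  assert(!hasCrossRowAlias(1024));
  assert(hasCrossRowAlias(1025));
}
```

Under these assumptions the colliding pair disappears exactly when `n <= 1024`, which is the condition that shows up in the new CHECK line.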
test/Isl/Ast/alias_simple_1.ll

 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze < %s | FileCheck %s --check-prefix=NOAA
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -basicaa < %s | FileCheck %s --check-prefix=BASI
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -tbaa < %s | FileCheck %s --check-prefix=TBAA
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -scev-aa < %s | FileCheck %s --check-prefix=SCEV
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -globalsmodref-aa < %s | FileCheck %s --check-prefix=GLOB
 ;
 ; int A[1024];
 ;
 ;
 ; void jd(float *B, int N) {
 ; for (int i = 0; i < N; i++)
 ; A[i] = B[i];
 ; }
 ;
-; NOAA: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; BASI: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; TBAA: if (1)
-; SCEV: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; GLOB: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; NOAA: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; BASI: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; TBAA: if (N <= 1024 ? 1 : 0)
+; SCEV: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; GLOB: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
 ;
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

 @A = common global [1024 x i32] zeroinitializer, align 16

 define void @jd(float* nocapture readonly %B, i32 %N) {
 entry:
   %cmp6 = icmp sgt i32 %N, 0
 [...]

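The updated check lines combine the assumed context (`N <= 1024`) with the existing alias check. A rough sketch of how such a guard could select between code versions is given below. It is not the code Polly generates: `Version`, `selectVersion` and `demo` are hypothetical names, addresses are compared as integers to keep the C++ well defined, and a non-positive `N` is treated as an empty access range.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

int A[1024];

enum class Version { Optimized, Original };

// Decides which code version a guard shaped like
//   (N <= 1024) && (&A[N] <= &B[0] || &B[N] <= &A[0])
// would pick for the loop `A[i] = B[i]` with 0 <= i < N.
Version selectVersion(const float *B, int N) {
  // Assumed context: only N <= 1024 keeps every A[i] inside A.
  if (!(N <= 1024))
    return Version::Original;

  // Alias check on the accessed byte ranges [begin, end) of A and B.
  const std::size_t count = N > 0 ? static_cast<std::size_t>(N) : 0;
  const auto aBegin = reinterpret_cast<std::uintptr_t>(&A[0]);
  const auto bBegin = reinterpret_cast<std::uintptr_t>(B);
  const auto aEnd = aBegin + count * sizeof(int);
  const auto bEnd = bBegin + count * sizeof(float);
  return (aEnd <= bBegin || bEnd <= aBegin) ? Version::Optimized
                                            : Version::Original;
}

// Illustrative use with a separate buffer that cannot overlap A.
void demo() {
  static float Buffer[1024];
  assert(selectVersion(Buffer, 100) == Version::Optimized);
  assert(selectVersion(Buffer, 2000) == Version::Original);
}
```

The bound check is evaluated first so that no address beyond the end of `A` is ever formed when `N` is too large.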
test/Isl/Ast/alias_simple_2.ll

 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze < %s | FileCheck %s --check-prefix=NOAA
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -basicaa < %s | FileCheck %s --check-prefix=BASI
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -tbaa < %s | FileCheck %s --check-prefix=TBAA
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -scev-aa < %s | FileCheck %s --check-prefix=SCEV
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -globalsmodref-aa < %s | FileCheck %s --check-prefix=GLOB
 ;
 ; int A[1024], B[1024];
 ;
 ;
 ; void jd(int N) {
 ; for (int i = 0; i < N; i++)
 ; A[i] = B[i];
 ; }
 ;
-; NOAA: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; BASI: if (1)
-; TBAA: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; SCEV: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; GLOB: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; NOAA: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; BASI: if (N <= 1024 ? 1 : 0)
+; TBAA: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; SCEV: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; GLOB: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
 ;
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

 @A = common global [1024 x i32] zeroinitializer, align 16
 @B = common global [1024 x i32] zeroinitializer, align 16

 define void @jd(i32 %N) {
 entry:
 [...]

test/Isl/Ast/alias_simple_3.ll

 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze < %s | FileCheck %s --check-prefix=NOAA
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -basicaa < %s | FileCheck %s --check-prefix=BASI
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -tbaa < %s | FileCheck %s --check-prefix=TBAA
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -scev-aa < %s | FileCheck %s --check-prefix=SCEV
 ; RUN: opt %loadPolly -polly-code-generator=isl -polly-ast -analyze -globalsmodref-aa < %s | FileCheck %s --check-prefix=GLOB
 ;
 ; int A[1024];
 ; float B[1024];
 ;
 ; void jd(int N) {
 ; for (int i = 0; i < N; i++)
 ; A[i] = B[i];
 ; }
 ;
-; NOAA: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; BASI: if (1)
-; TBAA: if (1)
-; SCEV: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
-; GLOB: if (1 && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; NOAA: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; BASI: if (N <= 1024 ? 1 : 0)
+; TBAA: if (N <= 1024 ? 1 : 0)
+; SCEV: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
+; GLOB: if ((N <= 1024 ? 1 : 0) && (&MemRef_A[N] <= &MemRef_B[0] || &MemRef_B[N] <= &MemRef_A[0]))
 ;
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

 @A = common global [1024 x i32] zeroinitializer, align 16
 @B = common global [1024 x float] zeroinitializer, align 16

 define void @jd(i32 %N) {
 entry:
 [...]

test/ScopInfo/assume_gep_bounds.ll

This file was added.

				; RUN: opt %loadPolly -polly-scops -analyze < %s | FileCheck %s

				; void foo(float A[10][20][30], long n, long m, long p) {
				; for (long i = 0; i < n; i++)
				; for (long j = 0; j < m; j++)
				; for (long k = 0; k < p; k++)
				; A[i][j][k] = i + j + k;
				; }

				; For the above code we want to assume that all memory accesses are within the
				; bounds of the array A. In C (and LLVM-IR) this is not required, such that out
				; of bounds accesses are valid. However, as such accesses are uncommon, cause
				; complicated dependence pattern and as a result make dependence analysis more
				; costly and may prevent or hinder useful program transformations, we assume
				; absence of out-of-bound accesses. To do so we derive the set of parameter
				; values for which our assumption holds.

				; CHECK: Assumed Context
				; CHECK-NEXT: [n, m, p] -> { : p <= 30 and m <= 20 }

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @foo([20 x [30 x float]]* %A, i64 %n, i64 %m, i64 %p) {
				entry:
				br label %for.cond

				for.cond: ; preds = %for.inc13, %entry
				%i.0 = phi i64 [ 0, %entry ], [ %inc14, %for.inc13 ]
				%cmp = icmp slt i64 %i.0, %n
				br i1 %cmp, label %for.body, label %for.end15

				for.body: ; preds = %for.cond
				br label %for.cond1

				for.cond1: ; preds = %for.inc10, %for.body
				%j.0 = phi i64 [ 0, %for.body ], [ %inc11, %for.inc10 ]
				%cmp2 = icmp slt i64 %j.0, %m
				br i1 %cmp2, label %for.body3, label %for.end12

				for.body3: ; preds = %for.cond1
				br label %for.cond4

				for.cond4: ; preds = %for.inc, %for.body3
				%k.0 = phi i64 [ 0, %for.body3 ], [ %inc, %for.inc ]
				%cmp5 = icmp slt i64 %k.0, %p
				br i1 %cmp5, label %for.body6, label %for.end

				for.body6: ; preds = %for.cond4
				%add = add nsw i64 %i.0, %j.0
				%add7 = add nsw i64 %add, %k.0
				%conv = sitofp i64 %add7 to float
				%arrayidx9 = getelementptr inbounds [20 x [30 x float]]* %A, i64 %i.0, i64 %j.0, i64 %k.0
				store float %conv, float* %arrayidx9, align 4
				br label %for.inc

				for.inc: ; preds = %for.body6
				%inc = add nsw i64 %k.0, 1
				br label %for.cond4

				for.end: ; preds = %for.cond4
				br label %for.inc10

				for.inc10: ; preds = %for.end
				%inc11 = add nsw i64 %j.0, 1
				br label %for.cond1

				for.end12: ; preds = %for.cond1
				br label %for.inc13

				for.inc13: ; preds = %for.end12
				%inc14 = add nsw i64 %i.0, 1
				br label %for.cond

				for.end15: ; preds = %for.cond
				ret void
				}

test/ScopInfo/assume_gep_bounds_2.ll

This file was added.

				; RUN: opt %loadPolly -basicaa -polly-scops -analyze < %s | FileCheck %s
				;
				; void foo(float A[restrict 10][20], float B[restrict 10][20], long n, long m,
				; long p) {
				; for (long i = 0; i < n; i++)
				; for (long j = 0; j < m; j++)
				; A[i][j] = i + j;
				; for (long i = 0; i < m; i++)
				; for (long j = 0; j < p; j++)
				; B[i][j] = i + j;
				; }

				; This code is within bounds either if m and p are smaller than the array sizes,
				; but also if only p is smaller than the size of the second B dimension and n
				; is such that the first loop is never executed and consequently A is never
				; accessed. In this case the value of m does not matter.

				; CHECK: Assumed Context:
				; CHECK-NEXT: [n, m, p] -> { : (n <= 0 and p <= 20) or (m <= 20 and p <= 20) }

				target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"

				define void @foo([20 x float]* noalias %A, [20 x float]* noalias %B, i64 %n, i64 %m, i64 %p) {
				entry:
				br label %for.cond

				for.cond: ; preds = %for.inc5, %entry
				%i.0 = phi i64 [ 0, %entry ], [ %inc6, %for.inc5 ]
				%cmp = icmp slt i64 %i.0, %n
				br i1 %cmp, label %for.body, label %for.end7

				for.body: ; preds = %for.cond
				br label %for.cond1

				for.cond1: ; preds = %for.inc, %for.body
				%j.0 = phi i64 [ 0, %for.body ], [ %inc, %for.inc ]
				%cmp2 = icmp slt i64 %j.0, %m
				br i1 %cmp2, label %for.body3, label %for.end

				for.body3: ; preds = %for.cond1
				%add = add nsw i64 %i.0, %j.0
				%conv = sitofp i64 %add to float
				%arrayidx4 = getelementptr inbounds [20 x float]* %A, i64 %i.0, i64 %j.0
				store float %conv, float* %arrayidx4, align 4
				br label %for.inc

				for.inc: ; preds = %for.body3
				%inc = add nsw i64 %j.0, 1
				br label %for.cond1

				for.end: ; preds = %for.cond1
				br label %for.inc5

				for.inc5: ; preds = %for.end
				%inc6 = add nsw i64 %i.0, 1
				br label %for.cond

				for.end7: ; preds = %for.cond
				br label %for.cond9

				for.cond9: ; preds = %for.inc25, %for.end7
				%i8.0 = phi i64 [ 0, %for.end7 ], [ %inc26, %for.inc25 ]
				%cmp10 = icmp slt i64 %i8.0, %m
				br i1 %cmp10, label %for.body12, label %for.end27

				for.body12: ; preds = %for.cond9
				br label %for.cond14

				for.cond14: ; preds = %for.inc22, %for.body12
				%j13.0 = phi i64 [ 0, %for.body12 ], [ %inc23, %for.inc22 ]
				%cmp15 = icmp slt i64 %j13.0, %p
				br i1 %cmp15, label %for.body17, label %for.end24

				for.body17: ; preds = %for.cond14
				%add18 = add nsw i64 %i8.0, %j13.0
				%conv19 = sitofp i64 %add18 to float
				%arrayidx21 = getelementptr inbounds [20 x float]* %B, i64 %i8.0, i64 %j13.0
				store float %conv19, float* %arrayidx21, align 4
				br label %for.inc22

				for.inc22: ; preds = %for.body17
				%inc23 = add nsw i64 %j13.0, 1
				br label %for.cond14

				for.end24: ; preds = %for.cond14
				br label %for.inc25

				for.inc25: ; preds = %for.end24
				%inc26 = add nsw i64 %i8.0, 1
				br label %for.cond9

				for.end27: ; preds = %for.cond9
				ret void
				}
