This is an archive of the discontinued LLVM Phabricator instance.

Improve the LoopAccessAnalysis to handle the different types in the same size
Needs Review
Public

Authored by jinlin on Aug 18 2016, 11:46 AM.

Details

Summary

The following loop fails to vectorize because the load of c[i] is cast to i64 while the store to c[i] uses double. The loop access analysis gives up because the two accesses have different types.

Since the two memory operations have the same size, I believe the loop access analysis should report a forward dependence, which would allow the loop to be vectorized.
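To make the proposed rule concrete, here is a minimal sketch in C of the decision for a zero-distance pair. It is not the code in this diff: the names `dep_kind`, `mem_access`, `type_id`, and `classify_zero_distance` are hypothetical stand-ins for the analysis' internal state, and the real check works on LLVM types and the data layout.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified dependence kinds for this sketch only. */
enum dep_kind { DEP_FORWARD, DEP_UNKNOWN };

/* Hypothetical description of one memory access. */
struct mem_access {
    int type_id;        /* illustrative tag, e.g. 0 = i64, 1 = double */
    size_t store_size;  /* number of bytes read or written */
};

/* Classify a source/sink pair whose dependence distance is zero.
 * Current behaviour (per the log): give up unless the types match.
 * Proposed behaviour: also accept accesses of equal size. */
enum dep_kind classify_zero_distance(struct mem_access src,
                                     struct mem_access sink)
{
    bool same_type = src.type_id == sink.type_id;
    bool same_size = src.store_size == sink.store_size;

    if (same_type || same_size)
        return DEP_FORWARD;  /* both accesses cover the same bytes */
    return DEP_UNKNOWN;      /* partial overlap, stay conservative */
}
```

With this rule, an 8-byte i64 load paired with an 8-byte double store at distance 0 would be treated as a forward dependence, while an 8-byte access paired with a 4-byte one would still be rejected.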

#define N 1000
double a[N], b[N], c[N];
void foo() {
  for (int i = 0; i < N; i++) {
    b[i] = c[i];
    c[i] = 0.0;
  }
}

for.body: ; preds = %for.body, %entry

%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
%arrayidx = getelementptr inbounds [1000 x double], [1000 x double]* @c, i64 0, i64 %indvars.iv
%0 = bitcast double* %arrayidx to i64*
%1 = load i64, i64* %0, align 8, !tbaa !1
%arrayidx2 = getelementptr inbounds [1000 x double], [1000 x double]* @b, i64 0, i64 %indvars.iv
%2 = bitcast double* %arrayidx2 to i64*
store i64 %1, i64* %2, align 8, !tbaa !1
store double 0.000000e+00, double* %arrayidx, align 8, !tbaa !1
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 1000
br i1 %exitcond, label %for.cond.cleanup, label %for.body
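For readers less familiar with the IR, the loop body above can be restated in C roughly as follows. This is only an illustrative sketch: the function name `copy_bits_and_clear` is made up, and `memcpy` stands in for the bitcast plus i64 load/store. The point is that the copy moves the same 8 bytes whether it is typed as i64 or as double.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* The argument relies on both types having the same size. */
_Static_assert(sizeof(double) == sizeof(uint64_t),
               "double and 64-bit integer must have the same size");

/* Hypothetical helper mirroring the IR: copy c[i] to b[i] as raw
 * 64-bit integers, then store 0.0 to c[i] as a double. */
void copy_bits_and_clear(double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        uint64_t bits;
        memcpy(&bits, &c[i], sizeof bits);  /* load i64 from c[i]   */
        memcpy(&b[i], &bits, sizeof bits);  /* store i64 to b[i]    */
        c[i] = 0.0;                         /* store double to c[i] */
    }
}
```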

LAA: Found a loop in foo: loop.17
LAA: Processing memory accesses...

AST: Alias Set Tracker: 2 alias sets for 3 pointer values.
AliasSet[0x9508b80, 1] must alias, No access Pointers: (<4 x i64>* %1, 18446744073709551615)
AliasSet[0x95f8a70, 2] must alias, No access Pointers: (<4 x double>* %2, 18446744073709551615), (<4 x i64>* %0, 18446744073709551615)

LAA: Accesses(3):

%1 = bitcast double* %arrayIdx11 to <4 x i64>* (write)
%2 = bitcast double* %arrayIdx to <4 x double>* (write)
%0 = bitcast double* %arrayIdx to <4 x i64>* (read-only)

Underlying objects for pointer %1 = bitcast double* %arrayIdx11 to <4 x i64>*

@b = common local_unnamed_addr global [1000 x double] zeroinitializer, align 16

Underlying objects for pointer %2 = bitcast double* %arrayIdx to <4 x double>*

@c = common local_unnamed_addr global [1000 x double] zeroinitializer, align 16

Underlying objects for pointer %0 = bitcast double* %arrayIdx to <4 x i64>*

@c = common local_unnamed_addr global [1000 x double] zeroinitializer, align 16

LAA: Found a runtime check ptr: %1 = bitcast double* %arrayIdx11 to <4 x i64>*
LAA: Found a runtime check ptr: %2 = bitcast double* %arrayIdx to <4 x double>*
LAA: Found a runtime check ptr: %0 = bitcast double* %arrayIdx to <4 x i64>*
LAA: We need to do 0 pointer comparisons.
LAA: We can perform a memory runtime check if needed.
LAA: Checking memory dependencies
LAA: Src Scev: {@c,+,32}<nsw><%loop.17>Sink Scev: {@c,+,32}<nsw><%loop.17>(Induction step: 1)
LAA: Distance for %gepload = load <4 x i64>, <4 x i64>* %0, align 16, !tbaa !1 to store <4 x double> zeroinitializer, <4 x double>* %2, align 16, !tbaa !1: 0
LAA: Zero dependence difference but different types
Total Dependences: 1
LAA: unsafe dependent memory operations in loop
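As a sanity check on why the zero-distance case should be safe, the sketch below compares the scalar loop with a version that processes 4 elements at a time, similar in shape to the <4 x i64> load and <4 x double> store in the log. The names `scalar_loop`, `blocked_loop`, and `VF` are hypothetical, and the blocked version assumes the trip count is a multiple of `VF`. Because each block is read in full before it is overwritten, both versions produce the same results.

```c
#include <stddef.h>
#include <string.h>

#define VF 4  /* assumed vector width, matching the <4 x ...> types */

/* Reference scalar loop from the summary. */
void scalar_loop(double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        b[i] = c[i];
        c[i] = 0.0;
    }
}

/* Blocked version: load VF elements of c, store them to b, then
 * zero the same VF elements of c. Assumes n is a multiple of VF. */
void blocked_loop(double *b, double *c, size_t n)
{
    for (size_t i = 0; i < n; i += VF) {
        double tmp[VF];
        memcpy(tmp, &c[i], sizeof tmp);   /* wide load from c  */
        memcpy(&b[i], tmp, sizeof tmp);   /* wide store to b   */
        for (size_t j = 0; j < VF; j++)   /* wide store of 0.0 */
            c[i + j] = 0.0;
    }
}
```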

Diff Detail

Repository
rL LLVM

Event Timeline

jinlin updated this revision to Diff 68582. Aug 18 2016, 11:46 AM
jinlin retitled this revision to Improve the LoopAccessAnalysis to handle the different types in the same size.
jinlin updated this object.
jinlin added reviewers: DavidKreitzer, hfinkel.
jinlin set the repository for this revision to rL LLVM.
jinlin added a subscriber: llvm-commits.
jinlin updated this revision to Diff 68587. Aug 18 2016, 11:59 AM