This is an archive of the discontinued LLVM Phabricator instance.

Differential D124984

[NFC][LoopCacheAnalysis] Update test cases to make sure the outputs follow the right order
ClosedPublic

Authored by congzhe on May 4 2022, 11:19 PM.

Details

Reviewers

bmahjour
Whitney
Meinersbur

Group Reviewers

Restricted Project

Commits

rG80ab16d0ed75: [NFC][LoopCacheAnalysis] Update test cases to make sure the outputs follow the…

Summary

This patch changes the test cases from "CHECK" to "CHECK-NEXT" to ensure that the loops output by loop cache analysis appear in the correct order. D124725 fixed the non-deterministic output order, so "CHECK-DAG" is no longer needed; we should now use "CHECK-NEXT" to make sure the loops in the output loop vector follow the right order.
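The difference between the two directives can be illustrated with a deliberately simplified model of the matching. This is only a sketch, not FileCheck's implementation: patterns are treated as plain substrings, each directive is assumed to consume a whole line, and the helper name `passes` and the extra `for.x` line are hypothetical. The cost lines are taken from single-store.ll.

```python
def passes(directives, output_lines):
    """Simplified model of two FileCheck directives.

    CHECK      : the pattern may appear on any line after the previous match.
    CHECK-NEXT : the pattern must appear on the line right after the previous match.
    """
    last = -1  # index of the line matched by the previous directive
    for kind, pattern in directives:
        if kind == "CHECK":
            candidates = range(last + 1, len(output_lines))
        elif kind == "CHECK-NEXT":
            candidates = range(last + 1, min(last + 2, len(output_lines)))
        else:
            raise ValueError(f"unsupported directive: {kind}")
        hit = next((i for i in candidates if pattern in output_lines[i]), None)
        if hit is None:
            return False
        last = hit
    return True


expected = [
    "Loop 'for.i' has cost = 100000000",
    "Loop 'for.j' has cost = 1000000",
    "Loop 'for.k' has cost = 60000",
]
# Hypothetical output with an unexpected extra loop between for.i and for.j.
with_gap = [expected[0], "Loop 'for.x' has cost = 5000000"] + expected[1:]

loose = [("CHECK", line) for line in expected]
strict = [("CHECK", expected[0])] + [("CHECK-NEXT", line) for line in expected[1:]]

print(passes(loose, with_gap))   # the extra line goes unnoticed
print(passes(strict, with_gap))  # the extra line is rejected
```

In this model the plain "CHECK" sequence tolerates an unexpected line between two expected ones, while "CHECK-NEXT" pins each expected line to the position directly after the previous match.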

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

congzhe created this revision. May 4 2022, 11:19 PM

Herald added a project: Restricted Project. May 4 2022, 11:19 PM

Herald added subscribers: hiraditya, nemanjai.

congzhe requested review of this revision. May 4 2022, 11:19 PM

Herald added a subscriber: llvm-commits. May 4 2022, 11:19 PM

congzhe edited the summary of this revision. May 4 2022, 11:24 PM

congzhe edited the summary of this revision.

congzhe edited the summary of this revision. May 4 2022, 11:27 PM

congzhe edited the summary of this revision.

Harbormaster completed remote builds in B162837: Diff 427203. May 5 2022, 12:12 AM

Although the problem reported in PR55233 is worsened by D123400, the underlying issue existed even before that (as illustrated by the example in PR55233), so this patch won't completely solve the overflow problem, although it would make it a bit less likely to occur. I think a proper solution to PR55233 would need to address the overflow problem itself (e.g. by increasing the range of values that the cost model can represent).

I had thought about this approach when working on D123400, and while I agree it will make the cost values smaller, I have some reservations about its accuracy.

One observation is that with this patch, the cost values for non-consecutive accesses in deep nests become smaller, but their relative differences shrink as well. For example, in single-store.ll, the cost difference between for.k and for.j is orders of magnitude, while the cost difference between for.j and for.i is much smaller. I worry that this dilution of cost differences might make it harder to find the right order in loop nests that contain multiple reference groups. I think this approach slightly diverges from the concepts presented in the paper, inasmuch as the unit of measurement for the cost value is meant to be the number of cache lines, but with the multiplication by the "depth" factor the value will no longer estimate a number of cache lines.

llvm/lib/Analysis/LoopCacheAnalysis.cpp
323 ↗	(On Diff #427203)	i-loop -> j-loop?
324 ↗	(On Diff #427203)	i-loop -> j-loop?
332 ↗	(On Diff #427203)	This will always give a factor of 0 for the inner-most subscript. Is that intentional?
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/loads-store.ll
14–15	why does this change here, but didn't change in D123400?

In D124984#3516334, @bmahjour wrote:

Although the problem reported in PR55233 is worsened by D123400, the underlying issue existed even before that (as illustrated by the example in PR55233), so this patch won't completely solve the overflow problem, although it would make it a bit less likely to occur. I think a proper solution to PR55233 would need to address the overflow problem itself (e.g. by increasing the range of values that the cost model can represent).

I had thought about this approach when working on D123400, and while I agree it will make the cost values smaller, I have some reservations about its accuracy.

One observation is that with this patch, the cost values for non-consecutive accesses in deep nests become smaller, but their relative differences shrink as well. For example, in single-store.ll, the cost difference between for.k and for.j is orders of magnitude, while the cost difference between for.j and for.i is much smaller. I worry that this dilution of cost differences might make it harder to find the right order in loop nests that contain multiple reference groups. I think this approach slightly diverges from the concepts presented in the paper, inasmuch as the unit of measurement for the cost value is meant to be the number of cache lines, but with the multiplication by the "depth" factor the value will no longer estimate a number of cache lines.

Thanks Bardia for the comment! Regarding the test case in loads-store.ll: in D123400 it should actually have been changed to "; CHECK: Loop 'for.i' has cost = 300000000", since the cost became 300000000 instead of 3000000. Currently the test still passes only because the output "Loop 'for.i' has cost = 300000000" also matches the pattern "Loop 'for.i' has cost = 3000000". After this patch the cost drops from 300000000 to 6000000, which is consistent with the behavior in other tests.
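The accidental pass described above comes down to substring matching: a check pattern only has to occur somewhere in an output line, and "3000000" is a prefix of "300000000". The following is a minimal sketch of that effect, not FileCheck itself; the helper `check_matches` and its `anchor_end` parameter are hypothetical names introduced for illustration.

```python
import re


def check_matches(pattern, line, anchor_end=False):
    """Simplified stand-in for one CHECK directive applied to one output line.

    The pattern is treated as literal text that may match anywhere in the
    line; anchor_end=True additionally requires the match to end the line.
    """
    regex = re.escape(pattern) + ("$" if anchor_end else "")
    return re.search(regex, line) is not None


stale_pattern = "Loop 'for.i' has cost = 3000000"    # value left in the test
actual_output = "Loop 'for.i' has cost = 300000000"  # value produced after D123400

print(check_matches(stale_pattern, actual_output))                   # True
print(check_matches(stale_pattern, actual_output, anchor_end=True))  # False
```

In real test files, a comparable end-of-line anchor can be written as a `{{$}}` regex block at the end of the check pattern; whether that is worth adding to these tests is a separate question from this patch.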

I agree that our long-term goal is to address the overflow problem itself, possibly by increasing the range of values that the cost model can represent, and this patch might be a short-term solution. IMHO, however we solve the problem, the numbers calculated by loop cache analysis will always be an estimate of some sort. We try to preserve accuracy but might need to make some adjustments for various reasons. For example, in loads-store.ll, the cost of "300000000" that we currently get for loop "for.i" is already not an accurate estimate of the number of cache lines accessed: "3000000" might be a closer estimate, but since we want to distinguish between outer loops, we adjusted "3000000" to "300000000".

I was hoping that what this patch does aligns with the reasoning above. Nevertheless, I'm open to other solutions as well. Do you feel we should continue with this patch, or should we seek other solutions?

llvm/lib/Analysis/LoopCacheAnalysis.cpp
332 ↗	(On Diff #427203)	Thanks, will update it in the next version.
332 ↗	(On Diff #427203)	For the innermost subscript the access is usually consecutive, hence this "else" block won't be reached. If the access is not consecutive, the factor being 0 will still make the cost of the innermost subscript the smallest, which kind of works although it is not perfect. An easy improvement/fix is to use something like `(getNumSubscripts() - Index - 1)==0?1:(getNumSubscripts() - Index - 1)`. If you think this is better I can update it in the next version.
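The effect of the proposed clamp can be sketched outside the analysis. This is a standalone illustration under stated assumptions, not the code in LoopCacheAnalysis.cpp: `depth_factor`, `num_subscripts`, `index`, and `clamp` are hypothetical names standing in for the expression built from `getNumSubscripts()` and `Index` quoted above.

```python
def depth_factor(num_subscripts, index, clamp=True):
    """Depth factor for the subscript at position `index` (0 = outermost).

    Without the clamp, the innermost subscript gets a factor of 0; with the
    clamp, a raw value of 0 is replaced by 1, as in the quoted expression.
    """
    raw = num_subscripts - index - 1
    if clamp and raw == 0:
        return 1
    return raw


# Factors for a three-dimensional access such as A[i][j][k].
print([depth_factor(3, i, clamp=False) for i in range(3)])  # [2, 1, 0]
print([depth_factor(3, i) for i in range(3)])               # [2, 1, 1]
```

With the clamp the innermost factor is no longer 0, but the two innermost subscripts then share the same factor of 1.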

Hi Bardia @bmahjour, after some thought, I think we should pursue a longer-term solution for the problem this patch addresses. For now I could rework this patch (both title and content) into an NFC patch that only updates the test cases, i.e., from using CHECK to CHECK-NEXT, since we really want to make sure that the output is ordered and the order is correct. What do you think?

In D124984#3534886, @congzhe wrote:

Hi Bardia @bmahjour, after some thought, I think we should pursue a longer-term solution for the problem this patch addresses. For now I could rework this patch (both title and content) into an NFC patch that only updates the test cases, i.e., from using CHECK to CHECK-NEXT, since we really want to make sure that the output is ordered and the order is correct. What do you think?

Yes, I agree a long-term solution (e.g. using a double type) would be better. Updating the checks to expect the exact order seems like a good idea to me. Thanks!
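To give a rough sense of the headroom involved, the sketch below compares the largest cost in LoopnestFixedSize.ll with the range of a signed 64-bit integer. It assumes, which this thread does not state, that the cost is held in a signed 64-bit integer; the extra factor of 128 is a hypothetical additional loop level with the same trip-count scale as `N` in that test, and `fits_int64` is a made-up helper.

```python
import math

INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)


def fits_int64(value):
    """True if `value` is representable as a signed 64-bit integer."""
    return INT64_MIN <= value <= INT64_MAX


outer_cost = 2112128815104000000  # cost of 'for.body' in the five-deep nest
scaled = outer_cost * 128         # hypothetical: one more loop level of size 128

print(fits_int64(outer_cost))  # the existing value still fits
print(fits_int64(scaled))      # one more level would not fit

# A double covers a far larger range, at the price of rounding large values.
as_double = float(outer_cost) * 128.0
print(math.isfinite(as_double))
```

A floating-point cost trades exactness for range, which may be acceptable given that the values are estimates to begin with.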

@bmahjour I updated the patch to an NFC patch with test case updates only. I'd appreciate it if you could take a look :)

Harbormaster completed remote builds in B166287: Diff 432011. May 25 2022, 9:44 AM

LGTM

This revision is now accepted and ready to land. May 25 2022, 10:09 AM

Closed by commit rG80ab16d0ed75: [NFC][LoopCacheAnalysis] Update test cases to make sure the outputs follow the… (authored by congzhe). May 25 2022, 8:34 PM

This revision was automatically updated to reflect the committed changes.

congzhe added a commit: rG80ab16d0ed75: [NFC][LoopCacheAnalysis] Update test cases to make sure the outputs follow the….

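Revision Contents

Path	Size
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/LoopnestFixedSize.ll	12 lines
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/loads-store.ll	4 lines
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/matmul.ll	4 lines
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/matvecmul.ll	8 lines
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/multi-store.ll	6 lines
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/single-store.ll	4 lines
llvm/test/Analysis/LoopCacheAnalysis/PowerPC/stencil.ll	2 lines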

Diff 432184

llvm/test/Analysis/LoopCacheAnalysis/PowerPC/LoopnestFixedSize.ll

; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s		; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s

target datalayout = "e-m:e-i64:64-n32:64"		target datalayout = "e-m:e-i64:64-n32:64"
target triple = "powerpc64le-unknown-linux-gnu"		target triple = "powerpc64le-unknown-linux-gnu"

; Check delinearization in loop cache analysis can handle fixed-size arrays.		; Check delinearization in loop cache analysis can handle fixed-size arrays.
; The IR is copied from llvm/test/Analysis/DependenceAnalysis/SimpleSIVNoValidityCheckFixedSize.ll		; The IR is copied from llvm/test/Analysis/DependenceAnalysis/SimpleSIVNoValidityCheckFixedSize.ll

; CHECK: Loop 'for.body' has cost = 4186116		; CHECK: Loop 'for.body' has cost = 4186116
; CHECK: Loop 'for.body4' has cost = 128898		; CHECK-NEXT: Loop 'for.body4' has cost = 128898

;; #define N 1024		;; #define N 1024
;; #define M 2048		;; #define M 2048
;; void t1(int a[N][M]) {		;; void t1(int a[N][M]) {
;; for (int i = 0; i < N-1; ++i)		;; for (int i = 0; i < N-1; ++i)
;; for (int j = 2; j < M; ++j)		;; for (int j = 2; j < M; ++j)
;; a[i][j] = a[i+1][j-2];		;; a[i][j] = a[i+1][j-2];
;; }		;; }
for.inc11: ; preds = %for.body4
br i1 %exitcond7, label %for.body, label %for.end13		br i1 %exitcond7, label %for.body, label %for.end13

for.end13: ; preds = %for.inc11		for.end13: ; preds = %for.inc11
ret void		ret void
}		}


; CHECK: Loop 'for.body' has cost = 4186116		; CHECK: Loop 'for.body' has cost = 4186116
; CHECK: Loop 'for.body4' has cost = 128898		; CHECK-NEXT: Loop 'for.body4' has cost = 128898

define void @t2([2048 x i32]* %a) {		define void @t2([2048 x i32]* %a) {
entry:		entry:
br label %for.body		br label %for.body

for.body: ; preds = %entry, %for.inc11		for.body: ; preds = %entry, %for.inc11
%indvars.iv4 = phi i64 [ 0, %entry ], [ %indvars.iv.next5, %for.inc11 ]		%indvars.iv4 = phi i64 [ 0, %entry ], [ %indvars.iv.next5, %for.inc11 ]
br label %for.body4		br label %for.body4

for.end13: ; preds = %for.inc11		for.end13: ; preds = %for.inc11
ret void		ret void
}		}

declare [2048 x i32]* @func_with_returned_arg([2048 x i32]* returned %arg)		declare [2048 x i32]* @func_with_returned_arg([2048 x i32]* returned %arg)

; CHECK: Loop 'for.body' has cost = 2112128815104000000		; CHECK: Loop 'for.body' has cost = 2112128815104000000
; CHECK: Loop 'for.body4' has cost = 16762927104000000		; CHECK-NEXT: Loop 'for.body4' has cost = 16762927104000000
; CHECK: Loop 'for.body8' has cost = 130960368000000		; CHECK-NEXT: Loop 'for.body8' has cost = 130960368000000
; CHECK: Loop 'for.body12' has cost = 1047682944000		; CHECK-NEXT: Loop 'for.body12' has cost = 1047682944000
; CHECK: Loop 'for.body16' has cost = 32260032000		; CHECK-NEXT: Loop 'for.body16' has cost = 32260032000

;; #define N 128		;; #define N 128
;; #define M 2048		;; #define M 2048
;; void t3(int a[][N][N][N][M]) {		;; void t3(int a[][N][N][N][M]) {
;; for (int i1 = 0; i1 < N-1; ++i1)		;; for (int i1 = 0; i1 < N-1; ++i1)
;; for (int i2 = 2; i2 < N; ++i2)		;; for (int i2 = 2; i2 < N; ++i2)
;; for (int i3 = 0; i3 < N; ++i3)		;; for (int i3 = 0; i3 < N; ++i3)
;; for (int i4 = 3; i4 < N; ++i4)		;; for (int i4 = 3; i4 < N; ++i4)

llvm/test/Analysis/LoopCacheAnalysis/PowerPC/loads-store.ll

	; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s			; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s

	target datalayout = "e-m:e-i64:64-n32:64"			target datalayout = "e-m:e-i64:64-n32:64"
	target triple = "powerpc64le-unknown-linux-gnu"			target triple = "powerpc64le-unknown-linux-gnu"

	; void foo(long n, long m, long o, int A[n][m][o], int B[n][m][o], int C[n][m][o]) {			; void foo(long n, long m, long o, int A[n][m][o], int B[n][m][o], int C[n][m][o]) {
	; for (long i = 0; i < n; i++)			; for (long i = 0; i < n; i++)
	; for (long j = 0; j < m; j++)			; for (long j = 0; j < m; j++)
	; for (long k = 0; k < o; k++)			; for (long k = 0; k < o; k++)
	; A[i][k][j] += B[i][k][j] + C[i][j][k];			; A[i][k][j] += B[i][k][j] + C[i][j][k];
	; }			; }

	; CHECK: Loop 'for.i' has cost = 3000000			; CHECK: Loop 'for.i' has cost = 3000000
	; CHECK: Loop 'for.k' has cost = 2030000			; CHECK-NEXT: Loop 'for.k' has cost = 2030000
	; CHECK: Loop 'for.j' has cost = 1060000			; CHECK-NEXT: Loop 'for.j' has cost = 1060000
				bmahjour: why does this change here, but didn't change in D123400?

	define void @foo(i64 %n, i64 %m, i64 %o, i32* %A, i32* %B, i32* %C) {			define void @foo(i64 %n, i64 %m, i64 %o, i32* %A, i32* %B, i32* %C) {
	entry:			entry:
	%cmp32 = icmp sgt i64 %n, 0			%cmp32 = icmp sgt i64 %n, 0
	%cmp230 = icmp sgt i64 %m, 0			%cmp230 = icmp sgt i64 %m, 0
	%cmp528 = icmp sgt i64 %o, 0			%cmp528 = icmp sgt i64 %o, 0
	br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end			br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end


llvm/test/Analysis/LoopCacheAnalysis/PowerPC/matmul.ll

	; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s			; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s

	target datalayout = "e-m:e-i64:64-n32:64"			target datalayout = "e-m:e-i64:64-n32:64"
	target triple = "powerpc64le-unknown-linux-gnu"			target triple = "powerpc64le-unknown-linux-gnu"

	; void matmul(long n, long m, long o, int A[n][m], int B[n][m], int C[n]) {			; void matmul(long n, long m, long o, int A[n][m], int B[n][m], int C[n]) {
	; for (long i = 0; i < n; i++)			; for (long i = 0; i < n; i++)
	; for (long j = 0; j < m; j++)			; for (long j = 0; j < m; j++)
	; for (long k = 0; k < o; k++)			; for (long k = 0; k < o; k++)
	; C[i][j] = C[i][j] + A[i][k] * B[k][j];			; C[i][j] = C[i][j] + A[i][k] * B[k][j];
	; }			; }

	; CHECK:Loop 'for.i' has cost = 2010000			; CHECK:Loop 'for.i' has cost = 2010000
	; CHECK:Loop 'for.k' has cost = 1040000			; CHECK-NEXT:Loop 'for.k' has cost = 1040000
	; CHECK:Loop 'for.j' has cost = 70000			; CHECK-NEXT:Loop 'for.j' has cost = 70000

	define void @matmul(i64 %n, i64 %m, i64 %o, i32* %A, i32* %B, i32* %C) {			define void @matmul(i64 %n, i64 %m, i64 %o, i32* %A, i32* %B, i32* %C) {
	entry:			entry:
	br label %for.i			br label %for.i

	for.i: ; preds = %entry, %for.inc.i			for.i: ; preds = %entry, %for.inc.i
	%i = phi i64 [ 0, %entry ], [ %i.next, %for.inc.i ]			%i = phi i64 [ 0, %entry ], [ %i.next, %for.inc.i ]
	%muli = mul i64 %i, %m			%muli = mul i64 %i, %m

llvm/test/Analysis/LoopCacheAnalysis/PowerPC/matvecmul.ll

	; for (int j=1;j<ny,++j)			; for (int j=1;j<ny,++j)
	; for (int i=1;i<nx,++i)			; for (int i=1;i<nx,++i)
	; for (int l=1;l<nb,++l)			; for (int l=1;l<nb,++l)
	; for (int m=1;m<nb,++m)			; for (int m=1;m<nb,++m)
	; y[k+1][j][i][l] = y[k+1][j][i][l] + b[k][j][i][m][l]*x[k][j][i][m]			; y[k+1][j][i][l] = y[k+1][j][i][l] + b[k][j][i][m][l]*x[k][j][i][m]
	; }			; }

	; CHECK: Loop 'k_loop' has cost = 10200000000000000			; CHECK: Loop 'k_loop' has cost = 10200000000000000
	; CHECK: Loop 'j_loop' has cost = 102000000000000			; CHECK-NEXT: Loop 'j_loop' has cost = 102000000000000
	; CHECK: Loop 'i_loop' has cost = 1020000000000			; CHECK-NEXT: Loop 'i_loop' has cost = 1020000000000
	; CHECK: Loop 'm_loop' has cost = 10700000000			; CHECK-NEXT: Loop 'm_loop' has cost = 10700000000
	; CHECK: Loop 'l_loop' has cost = 1300000000			; CHECK-NEXT: Loop 'l_loop' has cost = 1300000000

	%_elem_type_of_double = type <{ double }>			%_elem_type_of_double = type <{ double }>

	; Function Attrs: norecurse nounwind			; Function Attrs: norecurse nounwind
	define void @mat_vec_mpy([0 x %_elem_type_of_double]* noalias %y, [0 x %_elem_type_of_double]* noalias readonly %x,			define void @mat_vec_mpy([0 x %_elem_type_of_double]* noalias %y, [0 x %_elem_type_of_double]* noalias readonly %x,
	[0 x %_elem_type_of_double]* noalias readonly %b, i32* noalias readonly %nb, i32* noalias readonly %nx,			[0 x %_elem_type_of_double]* noalias readonly %b, i32* noalias readonly %nb, i32* noalias readonly %nx,
	i32* noalias readonly %ny, i32* noalias readonly %nz) {			i32* noalias readonly %ny, i32* noalias readonly %nz) {
	mat_times_vec_entry:			mat_times_vec_entry:

llvm/test/Analysis/LoopCacheAnalysis/PowerPC/multi-store.ll

	; RUN: opt < %s -opaque-pointers -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s			; RUN: opt < %s -opaque-pointers -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s

	target datalayout = "e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512"			target datalayout = "e-m:e-i64:64-n32:64-S128-v256:256:256-v512:512:512"
	target triple = "powerpc64le-unknown-linux-gnu"			target triple = "powerpc64le-unknown-linux-gnu"

	; CHECK-DAG: Loop 'for.j' has cost = 201000000			; CHECK: Loop 'for.j' has cost = 201000000
	; CHECK-DAG: Loop 'for.i' has cost = 102000000			; CHECK-NEXT: Loop 'for.i' has cost = 102000000
	; CHECK-DAG: Loop 'for.k' has cost = 90000			; CHECK-NEXT: Loop 'for.k' has cost = 90000

	;; Test to make sure when we have multiple conflicting access patterns, the			;; Test to make sure when we have multiple conflicting access patterns, the
	;; chosen loop configuration favours the majority of those accesses.			;; chosen loop configuration favours the majority of those accesses.
	;; For example this nest should be ordered as j-i-k.			;; For example this nest should be ordered as j-i-k.
	;; for (int i = 0; i < n; i++)			;; for (int i = 0; i < n; i++)
	;; for (int j = 0; j < n; j++)			;; for (int j = 0; j < n; j++)
	;; for (int k = 0; k < n; k++) {			;; for (int k = 0; k < n; k++) {
	;; A[i][j][k] = 1;			;; A[i][j][k] = 1;

llvm/test/Analysis/LoopCacheAnalysis/PowerPC/single-store.ll

	; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s			; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s

	target datalayout = "e-m:e-i64:64-n32:64"			target datalayout = "e-m:e-i64:64-n32:64"
	target triple = "powerpc64le-unknown-linux-gnu"			target triple = "powerpc64le-unknown-linux-gnu"

	; void foo(long n, long m, long o, int A[n][m][o]) {			; void foo(long n, long m, long o, int A[n][m][o]) {
	; for (long i = 0; i < n; i++)			; for (long i = 0; i < n; i++)
	; for (long j = 0; j < m; j++)			; for (long j = 0; j < m; j++)
	; for (long k = 0; k < o; k++)			; for (long k = 0; k < o; k++)
	; A[2i+3][3j-4][2*k+7] = 1;			; A[2i+3][3j-4][2*k+7] = 1;
	; }			; }

	; CHECK: Loop 'for.i' has cost = 100000000			; CHECK: Loop 'for.i' has cost = 100000000
	; CHECK: Loop 'for.j' has cost = 1000000			; CHECK-NEXT: Loop 'for.j' has cost = 1000000
	; CHECK: Loop 'for.k' has cost = 60000			; CHECK-NEXT: Loop 'for.k' has cost = 60000

	define void @foo(i64 %n, i64 %m, i64 %o, i32* %A) {			define void @foo(i64 %n, i64 %m, i64 %o, i32* %A) {
	entry:			entry:
	%cmp32 = icmp sgt i64 %n, 0			%cmp32 = icmp sgt i64 %n, 0
	%cmp230 = icmp sgt i64 %m, 0			%cmp230 = icmp sgt i64 %m, 0
	%cmp528 = icmp sgt i64 %o, 0			%cmp528 = icmp sgt i64 %o, 0
	br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end			br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end


llvm/test/Analysis/LoopCacheAnalysis/PowerPC/stencil.ll

	; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s			; RUN: opt < %s -passes='print<loop-cache-cost>' -disable-output 2>&1 | FileCheck %s

	target datalayout = "e-m:e-i64:64-n32:64"			target datalayout = "e-m:e-i64:64-n32:64"
	target triple = "powerpc64le-unknown-linux-gnu"			target triple = "powerpc64le-unknown-linux-gnu"

	; void foo(long n, long m, long o, int A[n][m], int B[n][m], int C[n]) {			; void foo(long n, long m, long o, int A[n][m], int B[n][m], int C[n]) {
	; for (long i = 0; i < n; i++)			; for (long i = 0; i < n; i++)
	; for (long j = 0; j < m; j++) {			; for (long j = 0; j < m; j++) {
	; A[i][j] = A[i][j+1] + B[i-1][j] + B[i+1][j+1] + C[i];			; A[i][j] = A[i][j+1] + B[i-1][j] + B[i+1][j+1] + C[i];
	; A[i][j] += B[i][i];			; A[i][j] += B[i][i];
	; }			; }
	; }			; }

	; CHECK: Loop 'for.i' has cost = 20600			; CHECK: Loop 'for.i' has cost = 20600
	; CHECK: Loop 'for.j' has cost = 800			; CHECK-NEXT: Loop 'for.j' has cost = 800

	define void @foo(i64 %n, i64 %m, i32* %A, i32* %B, i32* %C) {			define void @foo(i64 %n, i64 %m, i32* %A, i32* %B, i32* %C) {
	entry:			entry:
	%cmp32 = icmp sgt i64 %n, 0			%cmp32 = icmp sgt i64 %n, 0
	%cmp230 = icmp sgt i64 %m, 0			%cmp230 = icmp sgt i64 %m, 0
	br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end			br i1 %cmp32, label %for.cond1.preheader.lr.ph, label %for.end

	for.cond1.preheader.lr.ph: ; preds = %entry			for.cond1.preheader.lr.ph: ; preds = %entry
