
[LNT] Reduce I/O execution time for Polybench
ClosedPublic

Authored by rengolin on Jul 7 2015, 5:58 AM.

Details

Summary

Polybench had a large execution time due to successive calls to
fprintf, as many as 4000*4000 times. For most programs, this accounted
for more than half of the execution time.

The current solution is to transform the values into a stream of
nibbles as a char string and print it once per row, i.e.
only as many as 4000 times, by using fputs instead of fprintf.

The overall new execution time is 47% of the previous one, with some
benchmarks as low as 5%. The reduction was 53% on x86_64, 51% on ARM and
55% on AArch64, which means most of the time was spent on I/O, not on the
actual benchmark.

I ran this on all three architectures with small and full workloads. Checksums updated.

Diff Detail

Repository
rL LLVM

Event Timeline

rengolin updated this revision to Diff 29167.Jul 7 2015, 5:58 AM
rengolin retitled this revision from to [LNT] Reduce I/O execution time for Polybench.
rengolin updated this object.
rengolin set the repository for this revision to rL LLVM.
rengolin added subscribers: llvm-commits, cmatthews.
grosser edited edge metadata.Jul 7 2015, 6:09 AM

Hi Renato,

thank you for working on this. This definitely goes in the right direction. We could possibly even compute a single output hash in Polybench and just print that hash. In the end, we only want to know whether the output changed.

> thank you for working on this. This definitely goes in the right direction. We could possibly even compute a single output hash in Polybench and just print that hash. In the end, we only want to know whether the output changed.

That was the original plan, but I'd need to add a hash function to Polybench.

So I went with an intermediate solution: a simple dump function that is quick enough to avoid being the hottest function in the benchmark, but concise enough not to rely on I/O speed or to invoke printf-like functions for floating-point data.

In the end, we have one malloc+free, ~4000 inlined calls to a bunch of op+write, and one memcpy, all of them easy to optimise on almost every target.
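The approach described above could look roughly like this. The names print_element and dump_matrix mirror the discussion, but the bodies below are an illustrative reconstruction, not the committed polybench.h code:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative sketch: expand each 4-byte float into 8 hex nibbles in a
 * per-row buffer, then emit the whole row with a single fputs. */
static void print_element(float v, char *out) {
  static const char hex[] = "0123456789abcdef";
  unsigned char bytes[sizeof(float)];
  memcpy(bytes, &v, sizeof(float));
  for (size_t i = 0; i < sizeof(float); i++) {
    out[2 * i]     = hex[bytes[i] >> 4];
    out[2 * i + 1] = hex[bytes[i] & 0xF];
  }
}

static void dump_matrix(int n, float A[n][n]) {
  /* One buffer for a single row: 8 nibbles per float, plus "\n\0". */
  char *row = malloc((size_t)n * 8 + 2);
  for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++)
      print_element(A[i][j], &row[j * 8]);
    row[n * 8] = '\n';
    row[n * 8 + 1] = '\0';
    fputs(row, stdout);   /* one write per row, not per element */
  }
  free(row);
}
```

Note the single malloc/free pair and one fputs per row, matching the cost profile described above.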

The speed up is already impressive, I think over-engineering this would be counter-productive. :)

cheers,
--renato

> The speed up is already impressive, I think over-engineering this would be counter-productive. :)

Not to mention that now we'll be tracking actual performance in Polybench, not I/O randomness.

cheers,
--renato

Hi Renato,

I also don't think this needs to be over-engineered. Maybe I missed something, but could we not just copy https://en.wikipedia.org/wiki/MurmurHash, add a call to

int32_t hash = murmur3_32(A, N * N * sizeof(float), 0);
printf("%d", hash);

and we are done as well. The solutions are almost identical, but here we really fully eliminate the I/O cost.

In the end, the difference is probably minor. So if you prefer to go with your hand-written approach, that's fine with me as well.
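For reference, the murmur3_32 call sketched above assumes an implementation along the lines of the public-domain algorithm on the linked Wikipedia page. The transcription below is illustrative only and not part of the patch:

```c
#include <stdint.h>
#include <string.h>

/* MurmurHash3, 32-bit variant, transcribed from the public-domain
 * reference linked in the thread. Hashes `len` bytes with `seed`. */
static uint32_t murmur3_32(const void *data, size_t len, uint32_t seed) {
  const uint8_t *key = data;
  uint32_t h = seed, k;
  for (size_t i = len / 4; i; i--) {      /* body: 4-byte blocks */
    memcpy(&k, key, 4);
    key += 4;
    k *= 0xcc9e2d51u; k = (k << 15) | (k >> 17); k *= 0x1b873593u;
    h ^= k; h = (h << 13) | (h >> 19); h = h * 5 + 0xe6546be4u;
  }
  k = 0;                                   /* tail: remaining 0-3 bytes */
  for (size_t i = len & 3; i; i--) {
    k <<= 8;
    k |= key[i - 1];
  }
  k *= 0xcc9e2d51u; k = (k << 15) | (k >> 17); k *= 0x1b873593u;
  h ^= k;
  h ^= (uint32_t)len;                      /* finalization mix */
  h ^= h >> 16; h *= 0x85ebca6bu;
  h ^= h >> 13; h *= 0xc2b2ae35u;
  h ^= h >> 16;
  return h;
}
```

With this, the whole matrix collapses to a single 32-bit value and a single printf, removing I/O cost entirely at the price of losing the element-level diff that the nibble dump preserves.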

> I also don't think this needs to be over-engineered. Maybe I missed something, but could
> we not just copy https://en.wikipedia.org/wiki/MurmurHash, add a call to

Choosing an arbitrary hash function may invite disagreement, etc. Also, the I/O cost is already negligible.

I have compiled with and without -DPOLYBENCH_DUMP_ARRAYS, and in most cases the difference is lower than the noise.

kristof.beyls added inline comments.Jul 7 2015, 9:50 AM
SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d.c
46

Do I understand correctly that this code basically only prints out the values of the last row of the entire matrix (the offset is j*8)? I think we'd want whatever hash function implementation we end up with to still take all elements as input, to improve the chance of detecting a mis-compilation.
I think the hash function can be really simple - no need for anything complex or secure; but we should probably feed all matrix elements into the hash function. Maybe the straightforward solution here is to just print out the sum of all elements in a row, rather than each element in the row?

Tobias may know these tests better: are we expecting bit-reproducible results for these tests? I'm guessing so, unless DATA_PRINTF_MODIFIER in the original code was chosen so that it prints with less precision?

SingleSource/Benchmarks/Polybench/utilities/polybench.h
609–630

I'm not sure, but it looks like this may give different answers on big versus little endian machines (which is something the previous implementation didn't have)?
Maybe just printing out the sum of each row of a matrix (i.e. 4000 floats being printed) instead of the entire matrix (4000*4000 floats being printed) already reduces the IO overhead to be in the noise? If not, a couple of rows could be summed up together to reduce the amount of IO further?
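The row-sum idea suggested here could be sketched as below. Names are illustrative and this is not part of the patch; as noted later in the thread, compensating changes within a row would go unnoticed by a plain sum:

```c
#include <stdio.h>

/* Illustrative sketch: print one accumulated value per row instead of
 * every element, cutting output from n*n to n printed numbers. */
static double row_sum(int n, const float *row) {
  double sum = 0.0;   /* accumulate in double to limit rounding error */
  for (int j = 0; j < n; j++)
    sum += row[j];
  return sum;
}

static void dump_row_sums(int n, float A[n][n]) {
  for (int i = 0; i < n; i++)
    printf("%0.6f\n", row_sum(n, A[i]));
}
```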

rengolin added inline comments.Jul 7 2015, 10:00 AM
SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d.c
46

No.

print_element receives a float value (4 bytes) and expands it into 8 nibbles (8 chars). So every iteration of the print of A[i][j] starts at printmat[j*8]. There, j*8 is only the initial position of a run of 8, not the *only* position written.

SingleSource/Benchmarks/Polybench/utilities/polybench.h
609–630

Yes, it does, and that's ok. We already have many tests with different output for big and little endian, and we deal with it by having a file called *.reference_outputs.big-endian. I can't run it, as I don't have any big-endian box. If you do, and want to do it now, feel free to send me some reference outputs. If not, we can wait until someone who runs it sends them. That's how we've done it in the past.

A sum of all the elements would *also* not be perfect, as compensating changes could go unnoticed.

I/O is not an issue any more, and the differences are indistinguishable from noise.

kristof.beyls added inline comments.Jul 7 2015, 10:09 AM
SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d.c
46

Yes, I got that, but given that this is a 2-dimensional matrix, with i indicating the row and j the column, only using j to index the printed result means that every iteration of the i-loop overwrites the results written to printmat on the previous iteration, right? The malloc(n*8) also indicates there is only room to print a single row, not the entire matrix. Or maybe I'm still missing something?

rengolin added inline comments.Jul 7 2015, 10:22 AM
SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d.c
46

That's why the fputs below is inside the i loop: I'm printing one row at a time. This also saves a lot of memory, avoids thrashing the allocator, helps caching, etc.

Since the runtime now is indistinguishable from when *not* printing anything, I think it's a good trade-off.

kristof.beyls added inline comments.Jul 7 2015, 10:32 AM
SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d.c
46

D'oh - I missed that.
Cool, so we're still producing roughly the same amount of output - but way more efficiently.

Provided these tests were already checking for bit-accurate results (I'm not sure - it probably depends on the DATA_PRINTF_MODIFIER), this looks good to me.

rengolin added inline comments.Jul 7 2015, 12:14 PM
SingleSource/Benchmarks/Polybench/stencils/seidel-2d/seidel-2d.c
46

> Cool, so we're still producing roughly the same amount of output - but way more efficiently.

Precisely. :)

> Provided these tests were already checking for bit-accurate results (I'm not sure - it probably depends on the DATA_PRINTF_MODIFIER), this looks good to me.

They weren't; the modifier was mostly "%0.2f", and that's why one of the tests (fdtd-apml) didn't work with the new technique. But the gain there was too small to justify any work towards that goal.

However, most of the others worked out of the box on ARM, AArch64 and x86_64. I think having stricter checking is ok, as long as we understand that it was not a requirement before. Still, I think it's a good thing.

I'll add a comment before print_element() to that effect, so if people see it failing, we can revert the ones that weren't bit-exact.

I can't see how an alternative would work without either using printf or accumulation of values.

The results of Polybench are supposed to be bit-identical. However, some of the tests yield NaNs/Infs (which they should not, but in some configurations may), and for those I am unsure about the bitwise identity.

> The results of Polybench are supposed to be bit-identical. However, some of the tests yield NaNs/Infs (which they should not, but in some configurations may), and for those I am unsure about the bitwise identity.

Ok, so IIUC, my changes make the tests more accurate and will increase not only benchmarking quality, but also testing quality.

I'll push these changes as they are, since they have clear value, but I'll add a note to fdtd-apml why I haven't moved it. This is a job for another commit.

cheers,
--renato

rengolin accepted this revision.Jul 8 2015, 3:23 AM
rengolin added a reviewer: rengolin.
This revision is now accepted and ready to land.Jul 8 2015, 3:23 AM
rengolin closed this revision.Jul 8 2015, 3:23 AM

r241675

lhames added a subscriber: lhames.Jul 8 2015, 11:37 AM

Hi Renato,

I'm seeing intermittent failures on some of our internal arm64 builders
that I think are related to this. For example, I'm seeing:

  • TEST (simple) 'reg_detect' FAILED! ****

Execution Context Diff:
/.../build/tools/fpcmp: files differ without tolerance allowance

  • TEST (simple) 'reg_detect' ********

I ran "reg_detect" manually 10 times and grabbed the raw output, and I'm
seeing variation from run to run:

000000220000003300000033000000330000003300000033000000000000003300000044000000440000004400008844000000000000000000000044000000440000004400000055000000000000000000000000000000440000005500008855000000000000000000000000000000??0000<<550000<<55000000000000000000000000000000??0000000000006655exit
0

000000220000003300000033000000330000003300000033@000000000000003300000044000000440000004400008844
@000000000000000000000044000000440000004400000055@000000000000000000000000000000440000005500008855
@000000000000000000000000000000??0000<<550000<<55@000000000000000000000000000000
??0000000000006655@exit 0

000000220000003300000033000000330000003300000033@000000000000003300000044000000440000004400008844
@000000000000000000000044000000440000004400000055@000000000000000000000000000000440000005500008855
@000000000000000000000000000000??0000<<550000<<55@000000000000000000000000000000
??0000000000006655@exit 0

000000220000003300000033000000330000003300000033@000000000000003300000044000000440000004400008844
@000000000000000000000044000000440000004400000055@000000000000000000000000000000440000005500008855
@000000000000000000000000000000??0000<<550000<<55@000000000000000000000000000000
??0000000000006655@exit 0

000000220000003300000033000000330000003300000033000000000000003300000044000000440000004400008844000000000000000000000044000000440000004400000055000000000000000000000000000000440000005500008855000000000000000000000000000000??0000<<550000<<55000000000000000000000000000000??0000000000006655exit
0

000000220000003300000033000000330000003300000033@000000000000003300000044000000440000004400008844
@000000000000000000000044000000440000004400000055@000000000000000000000000000000440000005500008855
@000000000000000000000000000000??0000<<550000<<55@000000000000000000000000000000
??0000000000006655@exit 0

000000220000003300000033000000330000003300000033@000000000000003300000044000000440000004400008844
@000000000000000000000044000000440000004400000055@000000000000000000000000000000440000005500008855
@000000000000000000000000000000??0000<<550000<<55@000000000000000000000000000000
??0000000000006655@exit 0

000000220000003300000033000000330000003300000033000000000000003300000044000000440000004400008844000000000000000000000044000000440000004400000055000000000000000000000000000000440000005500008855000000000000000000000000000000??0000<<550000<<55000000000000000000000000000000??0000000000006655exit
0

000000220000003300000033000000330000003300000033@000000000000003300000044000000440000004400008844
@000000000000000000000044000000440000004400000055@000000000000000000000000000000440000005500008855
@000000000000000000000000000000??0000<<550000<<55@000000000000000000000000000000
??0000000000006655@exit 0

000000220000003300000033000000330000003300000033000000000000003300000044000000440000004400008844000000000000000000000044000000440000004400000055000000000000000000000000000000440000005500008855000000000000000000000000000000??0000<<550000<<55000000000000000000000000000000??0000000000006655exit
0

Any thoughts on what could be causing this?

Cheers,
Lang.

Hi Lang,

Yes, the tests should be bit-exact, but not all of them are. I'll revert that one test for now and add a FIXME for someone to look into it.

cheers,
--renato

Should be fixed in r241709.

> Hi Lang,
>
> Yes, the tests should be bit-exact, but not all of them are. I'll revert that one test for now and add a FIXME for someone to look into it.

NaN, -NaN and such values are not required to have the same bit pattern. We can probably just map them to a fixed bit pattern.

Tobias
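The NaN-mapping idea Tobias suggests could look like this minimal sketch; the function name and chosen bit pattern are illustrative only, not part of the patch:

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Collapse every NaN (whatever its sign or payload bits) to one
 * canonical quiet-NaN pattern before dumping, so outputs that differ
 * only in NaN bit patterns still compare equal. */
static float canonicalize_nan(float v) {
  if (isnan(v)) {
    const uint32_t qnan = 0x7fc00000u;  /* canonical quiet NaN */
    memcpy(&v, &qnan, sizeof v);
  }
  return v;
}
```

Calling this on each element just before print_element would make the reg_detect-style run-to-run diffs disappear when the only difference is in NaN bits.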

Not even on the same architecture across multiple runs? This one seems odd...

I think it's a bug that was masked because, in the past, people lowered the
precision way too much to avoid variations...

Mh, interesting. But yes, you are probably right.

Polybench was known to have some issues, but this was not one I was aware of. There is also a newer version of Polybench (4.x) which we may want to upgrade to at some point; it has most of the performance issues resolved.

Hi Renato,

I'm also seeing a similar failure on gramschmidt. Could you fix or revert
this there too? Sorry!

- Lang.