This is an archive of the discontinued LLVM Phabricator instance.

[test-suite] Add Image Processing Kernels Using Benchmark Library: Dither Algorithms
ClosedPublic

Authored by proton on Jul 18 2018, 12:24 PM.

Details

Diff Detail

Repository
rL LLVM

Event Timeline

proton created this revision.Jul 18 2018, 12:24 PM
homerdin added inline comments.Jul 26 2018, 11:42 AM
MicroBenchmarks/ImageProcessing/Dither/floydDitherKernel.cpp
9 ↗(On Diff #156126)

WhiteSpace

10–11 ↗(On Diff #156126)

Commented Code

MicroBenchmarks/ImageProcessing/Dither/orderedDitherKernel.cpp
17 ↗(On Diff #156126)

Commented Code

52 ↗(On Diff #156126)

You could remove the M==2, M==3 and M==8 conditions since M is defined as 4? That or you could pass another argument to set M and have a test for each size image with each value of M.

proton updated this revision to Diff 157593.Jul 26 2018, 3:30 PM
Meinersbur added inline comments.Jul 26 2018, 7:51 PM
MicroBenchmarks/ImageProcessing/Dither/floydDitherKernel.cpp
9 ↗(On Diff #157593)

This file does not use C++ features, so you can use C99 variable length array parameters.

43 ↗(On Diff #157593)

temp1 is only used in the else case. No need to read the outputImage in all cases.

proton updated this revision to Diff 157760.Jul 27 2018, 2:01 PM

Changes: Used C99 VLAs in kernels.

Meinersbur added inline comments.Jul 27 2018, 2:22 PM
MicroBenchmarks/ImageProcessing/Dither/main.cpp
36 ↗(On Diff #157760)

Does this work? It dereferences the pointer returned by malloc (which should be the uninitialized first bytes of the memory allocated be malloc) and then casts it to a pointer. Id' expect this to result in a segfault.

MicroBenchmarks/ImageProcessing/Dither/orderedDitherKernel.c
13 ↗(On Diff #157760)

Do you need m to by a variable? It is only passed the constant 4 in this benchmark. The code would be more optimizable if the compiler knew the constant.

Alternatively, you could run the kernel several times with different values of m.

proton added inline comments.Jul 27 2018, 2:48 PM
MicroBenchmarks/ImageProcessing/Dither/main.cpp
36 ↗(On Diff #157760)

Yes, It works properly because the pointer returned by malloc is casted as int (*) [][]. So to access the array we have to first dereference the pointer (use it as (*inputImage)[i][j]).

proton updated this revision to Diff 157773.Jul 27 2018, 2:50 PM

Changes: update m along with image size when called with the benchmark library. For verification, only m=4 is used.

proton updated this revision to Diff 157792.Jul 27 2018, 3:40 PM
proton marked an inline comment as done.

changed array type from int (*) [][] to int * in main.cpp

Meinersbur retitled this revision from [test-suite] Added Image Processing Kernels Using Benchmark Library: Dither Algorithms to [test-suite] Add Image Processing Kernels Using Benchmark Library: Dither Algorithms.Jul 27 2018, 4:48 PM
Meinersbur accepted this revision.Jul 27 2018, 4:52 PM

What is the execution time of this?

MicroBenchmarks/ImageProcessing/Dither/main.cpp
36 ↗(On Diff #157760)

Looks much better now

99 ↗(On Diff #157792)

Could you add a comment that this exists to avoid the computation being optimized away?

148 ↗(On Diff #157792)

Here as well

This revision is now accepted and ready to land.Jul 27 2018, 4:52 PM
proton updated this revision to Diff 157855.Jul 28 2018, 7:07 AM

added some comments

Did you check the execution time?

This revision was automatically updated to reflect the committed changes.
MatzeB added inline comments.Sep 13 2018, 5:40 PM
test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/main.cpp
144–154

I just noticed an odd effect here. A benchmark run now gives us 12 results for the same function running.

test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/3
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/2
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/3
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/2
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/2
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/3
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/4
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/4
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/4
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/8
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/8
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/8

I just had a change that was mostly improving things but happened to slightly regress this benchmark. Unfortunately this effect is now multiplied by 12 so this benchmark has a far bigger weight than others...

Meinersbur added inline comments.Sep 14 2018, 12:51 PM
test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/main.cpp
144–154

Do you suggest to reduce the number of variants?

proton added inline comments.Sep 14 2018, 1:31 PM
test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/main.cpp
144–154

We can reduce the number of runs here but I think instead of hardcoding it here (like it is now), we can pass it as an argument from cmake file.
Earlier I dropped this idea because I remember someone mentioning that multiple test size is not used in the test suite now so it seemed unnecessary then.
This can help someone who may want to see the effect of some optimization (say tiling) on different input Matrix sizes and different zoom ratio here.
For hardcoding it, we can remove the multiple zoom ratio here
from

b->Args({i, 2});
b->Args({i, 3});
b->Args({i, 4});
b->Args({i, 8});

to

b->Args({i, 2});
b->Args({i, 8});

See Line: 107