Details

Reviewers

Meinersbur
homerdin
dberris
MatzeB
hfinkel

Commits

rOLDT339006: Add Image dithering kernels using Benchmark Library
rL339006: Add Image dithering kernels using Benchmark Library

Summary

Apply D49339 before this patch, as it contains utility functions required by this one.

Diff Detail

Repository: rL LLVM

Event Timeline

proton created this revision.Jul 18 2018, 12:24 PM

Herald added subscribers: llvm-commits, mgorny.Jul 18 2018, 12:24 PM

Meinersbur added a parent revision: D49339: [test-suite] Added Image Processing Kernels Using Benchmark Library: utilities functions.Jul 20 2018, 9:30 AM

homerdin added inline comments.Jul 26 2018, 11:42 AM

MicroBenchmarks/ImageProcessing/Dither/floydDitherKernel.cpp
9 ↗	(On Diff #156126)	WhiteSpace
10–11 ↗	(On Diff #156126)	Commented Code
MicroBenchmarks/ImageProcessing/Dither/orderedDitherKernel.cpp
17 ↗	(On Diff #156126)	Commented Code
52 ↗	(On Diff #156126)	You could remove the M==2, M==3 and M==8 conditions since M is defined as 4? That or you could pass another argument to set M and have a test for each size image with each value of M.

proton updated this revision to Diff 157593.Jul 26 2018, 3:30 PM

Meinersbur added inline comments.Jul 26 2018, 7:51 PM

MicroBenchmarks/ImageProcessing/Dither/floydDitherKernel.cpp
9 ↗	(On Diff #157593)	This file does not use C++ features, so you can use C99 variable length array parameters.
43 ↗	(On Diff #157593)	`temp1` is only used in the else case. No need to read the outputImage in all cases.
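
The C99 suggestion above can be illustrated with a small sketch. It is not the patch's kernel: `copy_image` and its parameter names are hypothetical and only show how array parameters can take their bounds from earlier arguments, which is the form the final `.c` kernels use for `outputImage[height][width]`.

```c
#include <assert.h>

/* Hypothetical helper, not part of the patch: copies a height x width image.
 * The dimensions are passed first so they can appear in the array
 * parameter types (C99 variably modified parameters). */
void copy_image(int height, int width, int src[height][width],
                int dst[height][width]) {
  assert(height > 0 && width > 0);
  for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
      dst[i][j] = src[i][j]; /* 2D indexing, no manual i * width + j */
    }
  }
}
```

Standard C++ has no equivalent parameter form, which is presumably why the kernels were moved from `.cpp` to `.c` files and declared `extern "C"` in main.cpp.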

Changes: Used C99 VLAs in kernels.

Meinersbur added inline comments.Jul 27 2018, 2:22 PM

MicroBenchmarks/ImageProcessing/Dither/main.cpp
36 ↗	(On Diff #157760)	Does this work? It dereferences the pointer returned by malloc (which should be the uninitialized first bytes of the memory allocated by malloc) and then casts it to a pointer. I'd expect this to result in a segfault.
MicroBenchmarks/ImageProcessing/Dither/orderedDitherKernel.c
13 ↗	(On Diff #157760)	Do you need `m` to be a variable? It is only passed the constant `4` in this benchmark. The code would be more optimizable if the compiler knew the constant. Alternatively, you could run the kernel several times with different values of `m`.

proton added inline comments.Jul 27 2018, 2:48 PM

MicroBenchmarks/ImageProcessing/Dither/main.cpp
36 ↗	(On Diff #157760)	Yes, it works properly because the pointer returned by malloc is cast to int (*)[][]. So to access the array we first have to dereference the pointer (use it as (*inputImage)[i][j]).

Changes: update m along with image size when called with the benchmark library. For verification, only m=4 is used.

changed array type from int (*) [][] to int * in main.cpp
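
The two ways of viewing the same malloc'd block discussed in this thread can be sketched as follows. The helper names `alloc_image` and `flat_and_2d_agree` are hypothetical and not from the patch; the point is only that a cast changes the pointer's type without reading the uninitialized memory, and that flat indexing and 2D indexing address the same element.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate a height x width image as one flat block. */
int *alloc_image(int height, int width) {
  assert(height > 0 && width > 0);
  return malloc(sizeof(int) * (size_t)height * (size_t)width);
}

/* Writes through the flat view, then reads back through a 2D view of the
 * same memory. Returns the value read so a caller can check both agree. */
int flat_and_2d_agree(int *image, int height, int width, int i, int j,
                      int value) {
  assert(i >= 0 && i < height && j >= 0 && j < width);
  image[i * width + j] = value; /* flat indexing, as in the final main.cpp */
  /* Pointer to rows of `width` ints: same address, different type. */
  int (*rows)[width] = (int (*)[width])image;
  return rows[i][j];
}
```

Keeping plain `int *` in main.cpp, as the final revision does, leaves the 2D view to the C kernels and avoids the pointer-to-array syntax on the C++ side.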

Meinersbur retitled this revision from [test-suite] Added Image Processing Kernels Using Benchmark Library: Dither Algorithms to [test-suite] Add Image Processing Kernels Using Benchmark Library: Dither Algorithms.Jul 27 2018, 4:48 PM

What is the execution time of this?

MicroBenchmarks/ImageProcessing/Dither/main.cpp
36 ↗	(On Diff #157760)	Looks much better now
99 ↗	(On Diff #157792)	Could you add a comment that this exists to avoid the computation being optimized away?
148 ↗	(On Diff #157792)	Here as well
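
A rough sketch of the idea behind the requested comment, under stated assumptions: `square_kernel` is a stand-in kernel and `run_guarded` a hypothetical driver, neither taken from the patch. The `guard` parameter plays the role of `state.range(0)`. As long as the optimizer cannot prove that it differs from 20, the results written by the kernel may still be observed, so the kernel calls cannot be dropped as dead code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a kernel (hypothetical): out[i] = i * i. */
void square_kernel(int n, int *out) {
  for (int i = 0; i < n; i++) {
    out[i] = i * i;
  }
}

/* Runs the kernel `reps` times, then consumes the output only when
 * guard == 20. Returns 1 if the output was printed, 0 if not, and -1 on
 * allocation failure. */
int run_guarded(int n, int reps, int guard) {
  assert(n > 0 && reps > 0);
  int *out = malloc(sizeof(int) * (size_t)n);
  if (out == NULL) {
    return -1;
  }
  for (int r = 0; r < reps; r++) {
    square_kernel(n, out);
  }
  int printed = 0;
  if (guard == 20) { /* never true for the sizes actually benchmarked */
    printf("%d\n", out[n - 1]);
    printed = 1;
  }
  free(out);
  return printed;
}
```

This only works while the guard value stays opaque to the compiler. Google Benchmark also provides `benchmark::DoNotOptimize` for the same purpose; the patch does not use it.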

This revision is now accepted and ready to land.Jul 27 2018, 4:52 PM

added some comments

Did you check the execution time?

Closed by commit rL339006: Add Image dithering kernels using Benchmark Library (authored by proton).Aug 6 2018, 4:19 AM

This revision was automatically updated to reflect the committed changes.

MatzeB added inline comments.Sep 13 2018, 5:40 PM

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/main.cpp
144–154	I just noticed an odd effect here. A benchmark run now gives us 12 results for the same function running:
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/3
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/2
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/3
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/2
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/2
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/3
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/4
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/4
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/4
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/512/8
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/128/8
test-suite :: MicroBenchmarks/ImageProcessing/Dither/Dither.test:BENCHMARK_ORDERED_DITHER/256/8
I just had a change that was mostly improving things but happened to slightly regress this benchmark. Unfortunately this effect is now multiplied by 12 so this benchmark has a far bigger weight than others...

Meinersbur added inline comments.Sep 14 2018, 12:51 PM

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/main.cpp
144–154	Do you suggest to reduce the number of variants?

proton added inline comments.Sep 14 2018, 1:31 PM

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/main.cpp
144–154	We can reduce the number of runs here, but instead of hardcoding the arguments (as is done now), I think we could pass them in from the CMake file. I dropped this idea earlier because I remember someone mentioning that multiple test sizes are not currently used in the test suite, so it seemed unnecessary at the time. It could still help someone who wants to see the effect of an optimization (say, tiling) on different input matrix sizes and different zoom ratios. If we keep the hardcoded version, we can cut the zoom ratios from
b->Args({i, 2}); b->Args({i, 3}); b->Args({i, 4}); b->Args({i, 8});
to
b->Args({i, 2}); b->Args({i, 8});
See Line: 107
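
To make the arithmetic in this thread concrete, here is a small sketch that counts the (size, m) pairs a loop shaped like `CustomArguments` would register. `count_variants` is a hypothetical helper written for illustration; it mirrors the loop bounds in main.cpp (sizes doubling from 128 up to the smaller image dimension) but does not call the benchmark library.

```c
#include <assert.h>

/* Counts the (size, m) pairs registered by a CustomArguments-style loop:
 * sizes double from `start` up to `limit`, and each size is paired with
 * every entry of `ms`. */
int count_variants(int limit, const int *ms, int num_ms) {
  assert(limit > 0 && num_ms >= 0);
  int start = (limit > 128) ? 128 : 1;
  int count = 0;
  for (int i = start; i <= limit; i <<= 1) {
    for (int k = 0; k < num_ms; k++) {
      (void)ms[k]; /* in main.cpp this would be b->Args({i, ms[k]}) */
      count++;
    }
  }
  return count;
}
```

With a limit of 512 the sizes are 128, 256 and 512, so the four values of `m` give the 12 results reported above, and the proposed reduction to `{2, 8}` would give 6.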

Meinersbur mentioned this in D101844: [MicroBenchmarks] Add initial loop vectorization benchmarks..May 11 2021, 9:02 AM

Diff 159273

test-suite/trunk/MicroBenchmarks/ImageProcessing/CMakeLists.txt

add_subdirectory(Dither)

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/CMakeLists.txt

				set(IMAGEPROC_UTILS MicroBenchmarks/ImageProcessing/utils)
				list(APPEND CPPFLAGS -I ${CMAKE_SOURCE_DIR}/${IMAGEPROC_UTILS} -std=c++11)

				llvm_test_verify("${CMAKE_SOURCE_DIR}/HashProgramOutput.sh ${CMAKE_CURRENT_BINARY_DIR}/orderedOutput.txt")
				llvm_test_verify("${FPCMP} ${CMAKE_CURRENT_BINARY_DIR}/orderedOutput.txt ${CMAKE_CURRENT_SOURCE_DIR}/orderedDither.reference_output")

				llvm_test_verify("${CMAKE_SOURCE_DIR}/HashProgramOutput.sh ${CMAKE_CURRENT_BINARY_DIR}/floydOutput.txt")
				llvm_test_verify("${FPCMP} ${CMAKE_CURRENT_BINARY_DIR}/floydOutput.txt ${CMAKE_CURRENT_SOURCE_DIR}/floydDither.reference_output")

				llvm_test_run(WORKDIR ${CMAKE_CURRENT_BINARY_DIR})
				llvm_test_executable(Dither main.cpp orderedDitherKernel.c floydDitherKernel.c ../utils/ImageHelper.cpp ../utils/glibc_compat_rand.c)

				target_link_libraries(Dither benchmark)

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/dither.h

				/**
				Pankaj Kukreja
				github.com/proton0001
				Indian Institute of Technology Hyderabad
				*/
				#ifndef _DITHER_H_
				#define _DITHER_H_

				#define MaxGray 255
				#define MXGRAY 256

				#define HEIGHT 512
				#define WIDTH 512

				#endif /* _DITHER_H_ */

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/floydDither.reference_output

23473c2d34c91e33eaa3d5008ce3640e

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/floydDitherKernel.c

				/**
				Source: https://imagej.net/Dithering
				Modified by Pankaj Kukreja (github.com/proton0001)
				Indian Institute of Technology Hyderabad
				*/
				#include "dither.h"
				void floydDitherKernel(int height, int width, int inputImage[HEIGHT][WIDTH],
				int outputImage[height][width]) {
				for (int i = 0; i < height; i++) {
				for (int j = 0; j < width; j++) {
				outputImage[i][j] = inputImage[i][j];
				}
				}

				int err;
				int a, b, c, d;

				for (int i = 1; i < height - 1; i++) {
				for (int j = 1; j < width - 1; j++) {
				if (outputImage[i][j] > 127) {
				err = outputImage[i][j] - 255;
				outputImage[i][j] = 255;
				} else {
				err = outputImage[i][j] - 0;
				outputImage[i][j] = 0;
				}
				a = (err * 7) / 16;
				b = (err * 1) / 16;
				c = (err * 5) / 16;
				d = (err * 3) / 16;

				int temp1 = (outputImage[i][j + 1] + a);
				if (temp1 > 255) {
				outputImage[i][j + 1] = 255;
				} else if (temp1 < 0) {
				outputImage[i][j + 1] = 0;
				} else {
				outputImage[i][j + 1] = temp1;
				}

				int temp2 = (outputImage[i + 1][j + 1] + b);
				if (temp2 > 255) {
				outputImage[i + 1][j + 1] = 255;
				} else if (temp2 < 0) {
				outputImage[i + 1][j + 1] = 0;
				} else {
				outputImage[i + 1][j + 1] = temp2;
				}

				int temp3 = outputImage[i + 1][j + 0] + c;
				if (temp3 > 255) {
				outputImage[i + 1][j + 0] = 255;
				} else if (temp3 < 0) {
				outputImage[i + 1][j + 0] = 0;
				} else {
				outputImage[i + 1][j + 0] = temp3;
				}

				int temp4 = outputImage[i + 1][j - 1] + d;
				if (temp4 > 255) {
				outputImage[i + 1][j - 1] = 255;
				} else if (temp4 < 0) {
				outputImage[i + 1][j - 1] = 0;
				} else {
				outputImage[i + 1][j - 1] = temp4;
				}
				}
				}
				}
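
The per-pixel arithmetic of the kernel above can be isolated in a short sketch. `clamp_gray` and `diffuse_error` are hypothetical helpers, not part of the patch; they reproduce the threshold at 127, the 7/16, 1/16, 5/16 and 3/16 error shares, and the clamping that each of the four if/else chains performs.

```c
#include <assert.h>

/* Clamp to the 8-bit gray range, as each of the four if/else chains does. */
int clamp_gray(int v) {
  if (v > 255) return 255;
  if (v < 0) return 0;
  return v;
}

/* Thresholds one pixel at 127 and splits the quantization error.
 * shares[] receives a, b, c, d in the kernel's order.
 * Returns the new pixel value (0 or 255). */
int diffuse_error(int pixel, int shares[4]) {
  assert(pixel >= 0 && pixel <= 255);
  int out = (pixel > 127) ? 255 : 0;
  int err = pixel - out;
  shares[0] = (err * 7) / 16; /* a: right neighbour, [i][j + 1]  */
  shares[1] = (err * 1) / 16; /* b: below right, [i + 1][j + 1]  */
  shares[2] = (err * 5) / 16; /* c: directly below, [i + 1][j]   */
  shares[3] = (err * 3) / 16; /* d: below left, [i + 1][j - 1]   */
  return out;
}
```

Because integer division truncates toward zero, the four shares do not always add up to the full error: a pixel of 200 has an error of -55, but the shares are -24, -3, -17 and -10, which sum to -54.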

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/main.cpp

				/**
				Pankaj Kukreja
				github.com/proton0001
				Indian Institute of Technology Hyderabad
				*/
				#include "ImageHelper.h"
				#include "dither.h"
				#include <cmath>
				#include <iostream> // std::cerr

				#define BENCHMARK_LIB
				#ifdef BENCHMARK_LIB
				#include "benchmark/benchmark.h"
				#endif

				int *inputImage;
				extern "C" {
				void orderedDitherKernel(int height, int width, int *inpImage, int *outImage,
				int *temp, int n, int m);
				void floydDitherKernel(int height, int width, int *inpImage, int *outImage);
				}
				int main(int argc, char **argv) {
				#ifdef BENCHMARK_LIB
				::benchmark::Initialize(&argc, argv);
				#endif

				const char *orderedOutputFilename = (const char *)"./orderedOutput.txt";
				const char *floydOutputFilename = (const char *)"./floydOutput.txt";
				inputImage = (int *)malloc(sizeof(int) * HEIGHT * WIDTH);
				if (inputImage == NULL) {
				std::cerr << "Insufficient memory\n";
				exit(1);
				}
				initializeRandomImage(inputImage, HEIGHT, WIDTH);

				#ifdef BENCHMARK_LIB
				::benchmark::RunSpecifiedBenchmarks();
				#endif
				int *outputImage = (int *)malloc(sizeof(int) * HEIGHT * WIDTH);
				int *temp = (int *)malloc(sizeof(int) * HEIGHT * WIDTH);
				if (outputImage == NULL || temp == NULL) {
				std::cerr << "Insufficient memory\n";
				exit(1);
				}
				orderedDitherKernel(HEIGHT, WIDTH, inputImage, outputImage, temp, 16, 4);
				saveImage(outputImage, orderedOutputFilename, HEIGHT, WIDTH);
				floydDitherKernel(HEIGHT, WIDTH, inputImage, outputImage);

				for (int i = 0; i < HEIGHT; i++) {
				outputImage[(i)*WIDTH + 0] = 0;
				outputImage[(i)*WIDTH + WIDTH - 1] = 0;
				}

				for (int j = 0; j < WIDTH; j++) {
				outputImage[(0) * WIDTH + j] = 0;
				outputImage[(HEIGHT - 1) * WIDTH + j] = 0;
				}

				saveImage(outputImage, floydOutputFilename, HEIGHT, WIDTH);
				free(temp);
				free(outputImage);
				free(inputImage);
				return EXIT_SUCCESS;
				}

				#ifdef BENCHMARK_LIB
				void BENCHMARK_ORDERED_DITHER(benchmark::State &state) {
				int height = state.range(0);
				int width = state.range(0);
				int m = state.range(1);
				int n = pow(m, 2);
				int *outputImage = (int *)malloc(sizeof(int) * height * width);
				int *temp = (int *)malloc(sizeof(int) * height * width);

				if (outputImage == NULL) {
				std::cerr << "Insufficient memory\n";
				exit(1);
				}
				/* This call is to warm up the cache */
				orderedDitherKernel(height, width, inputImage, outputImage, temp, n, m);

				for (auto _ : state) {
				orderedDitherKernel(height, width, inputImage, outputImage, temp, n, m);
				}
				/* Since we are not passing state.range as 20 this if case will always be
				* false. This call is to make compiler think that outputImage may be used
				* later so that above kernel calls will not optimize out */
				if (state.range(0) == 20) {
				saveImage(outputImage, (const char *)"failedCase.txt", height, width);
				}
				free(temp);
				free(outputImage);
				}

				#if (HEIGHT < WIDTH)
				#define MINIMUM_DIM HEIGHT
				#else
				#define MINIMUM_DIM WIDTH
				#endif

				static void CustomArguments(benchmark::internal::Benchmark *b) {
				int limit = MINIMUM_DIM;
				int start = 1;
				if (limit > 128) {
				start = 128;
				}
				for (int i = start; i <= limit; i <<= 1) {
				b->Args({i, 2});
				b->Args({i, 3});
				b->Args({i, 4});
				b->Args({i, 8});
				}
				}
				BENCHMARK(BENCHMARK_ORDERED_DITHER)
				->Apply(CustomArguments)
				->Unit(benchmark::kMicrosecond);

				void BENCHMARK_FLOYD_DITHER(benchmark::State &state) {

				int height = state.range(0);
				int width = state.range(0);

				int *outputImage = (int *)malloc(sizeof(int) * height * width);

				if (outputImage == NULL) {
				std::cerr << "Insufficient memory\n";
				exit(1);
				}
				/* This call is to warm up the cache */
				floydDitherKernel(height, width, inputImage, outputImage);
				for (auto _ : state) {
				floydDitherKernel(height, width, inputImage, outputImage);
				}
				/* Since we are not passing state.range as 20 this if case will always be
				* false. This call is to make compiler think that outputImage may be used
				* later so that above kernel calls will not optimize out */
				if (state.range(0) == 20) {
				saveImage(outputImage, (const char *)"failedCase.txt", height, width);
				}

				free(outputImage);
				}

				#if MINIMUM_DIM > 128
				BENCHMARK(BENCHMARK_FLOYD_DITHER)
				->RangeMultiplier(2)
				->Range(128, MINIMUM_DIM)
				->Unit(benchmark::kMicrosecond);
				#else
				BENCHMARK(BENCHMARK_FLOYD_DITHER)
				->RangeMultiplier(2)
				->Range(1, MINIMUM_DIM)
				->Unit(benchmark::kMicrosecond);
				#endif

				#endif

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/orderedDither.reference_output

7b339ccc04bbaebf30f44f4f3129756f

test-suite/trunk/MicroBenchmarks/ImageProcessing/Dither/orderedDitherKernel.c

				/**
				Source: github -> https://github.com/brianwu02/ImageProcessing.git
				Modified by Pankaj Kukreja (github.com/proton0001)
				Indian Institute of Technology Hyderabad
				*/
				#include "dither.h"
				#include <math.h> // pow

				#define GAMMA 0.5

				void orderedDitherKernel(int height, int width, int inputImage[HEIGHT][WIDTH],
				int outputImage[height][width],
				int temp[height][width], int n, int m) {
				int scale;

				for (int i = 0; i < height; i++) {
				for (int j = 0; j < width; j++) {
				temp[i][j] =
				(int)(pow((double)inputImage[i][j] / 255.0, (1.0 / GAMMA)) * 255.0);
				}
				}

				scale = 256 / n;
				for (int i = 0; i < height; i++) {
				for (int j = 0; j < width; j++) {
				outputImage[i][j] = (int)(scale * (temp[i][j] / scale)) / scale;
				}
				}

				if (m == 2) {
				int dither[2][2] = {{0, 2}, {3, 1}};
				for (int y = 0; y < height; y++) {
				for (int x = 0; x < width; x++) {
				int i = x % m;
				int j = y % m;
				outputImage[y][x] = ((outputImage[y][x] > dither[i][j]) ? 255 : 0);
				}
				}
				} else if (m == 3) {
				int dither[3][3] = {{6, 8, 4}, {1, 0, 3}, {5, 2, 7}};
				for (int y = 0; y < height; y++) {
				for (int x = 0; x < width; x++) {
				int i = x % m;
				int j = y % m;
				outputImage[y][x] = ((outputImage[y][x] > dither[i][j]) ? 255 : 0);
				}
				}
				} else if (m == 4) {
				int dither[4][4] = {
				{0, 8, 2, 10}, {12, 4, 14, 6}, {3, 11, 1, 9}, {15, 7, 13, 5}};
				for (int y = 0; y < height; y++) {
				for (int x = 0; x < width; x++) {
				int i = x % m;
				int j = y % m;
				outputImage[y][x] = ((outputImage[y][x] > dither[i][j]) ? 255 : 0);
				}
				}
				} else if (m == 8) {
				int dither[8][8] = {
				{0, 48, 12, 60, 3, 51, 15, 63}, {32, 16, 44, 28, 35, 19, 47, 31},
				{8, 56, 4, 52, 11, 59, 7, 55}, {40, 24, 36, 20, 43, 27, 39, 23},
				{2, 50, 14, 62, 1, 49, 13, 61}, {34, 18, 46, 30, 33, 17, 45, 29},
				{10, 58, 6, 54, 9, 57, 5, 53}, {42, 26, 38, 22, 41, 25, 37, 21}};
				for (int y = 0; y < height; y++) {
				for (int x = 0; x < width; x++) {
				int i = x % m;
				int j = y % m;
				outputImage[y][x] = ((outputImage[y][x] > dither[i][j]) ? 255 : 0);
				}
				}
				}
				}
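
The quantize-then-threshold steps of the kernel above can be sketched on a single pixel. `quantize_level` and `threshold_2x2` are hypothetical helpers, not part of the patch; they use integer arithmetic only and leave out the gamma step (`pow`). The 2x2 matrix is the one from the `m == 2` branch.

```c
#include <assert.h>

/* Maps a gray value in [0, 255] to a level, as the kernel's second loop
 * does with scale = 256 / n. */
int quantize_level(int value, int n) {
  assert(n > 0 && n <= 256);
  int scale = 256 / n;
  return (scale * (value / scale)) / scale; /* equals value / scale */
}

/* Applies the 2x2 matrix from the m == 2 branch to one pixel at (x, y),
 * using the kernel's index order (i = x % m, j = y % m). */
int threshold_2x2(int level, int x, int y) {
  static const int dither[2][2] = {{0, 2}, {3, 1}};
  assert(x >= 0 && y >= 0);
  return (level > dither[x % 2][y % 2]) ? 255 : 0;
}
```

For `n = 16` the levels run from 0 to 15, matching the range of the 4x4 matrix. For `n = 9` (the `m = 3` case in the benchmark) `scale` is 28 and a value of 255 maps to level 9, one above the largest entry of the 3x3 matrix.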

This is an archive of the discontinued LLVM Phabricator instance.

[test-suite] Add Image Processing Kernels Using Benchmark Library: Dither Algorithms
ClosedPublic
