Add benchmarks to check runtime of truncate or zero-extend vector operations in AArch64.
This patch adds an initial set of benchmarks that measure the runtime of vectorized truncate and zero-extend operations in a loop, for different element types and different vector widths.
The goal of this initial set is to check the impact of D133495, D135229 and D120571.
Repository: rT test-suite
Build Status: Buildable 195605, Build 296494: arc lint + arc unit
Event Timeline
Thanks for the patch! I think it would be good to move the benchmark to a different file, as it is unrelated to measuring runtime check performance.
Inline comments on MicroBenchmarks/LoopVectorization/RuntimeChecks.cpp:

| Line | Diff | Comment |
|---|---|---|
| 7 | On Diff #468978 | It should be sufficient to use a much larger iteration count like 10000; the main benchmark loop will make sure the function is run long enough to collect stable data. |
| 135 | On Diff #468978 | I don't think this is doing what you want at the moment. Instead of truncating to i8 it is extending from i8. B and A should probably be flipped? |
| 144 | On Diff #468978 | It looks like this is missing the main benchmark loop that google benchmark requires: `for (auto _ : state) { ... }` |
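To illustrate the direction issue raised at line 135, here is a minimal sketch of the two kernels side by side. The function names (`truncU32ToU8`, `zextU8ToU32`, `checkRoundTrip`) and the element types are assumptions chosen for illustration, not taken from the patch. A truncation reads the wide array and writes the narrow one; swapping the source and destination turns it into a zero-extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical kernels for illustration; the names are not from the patch.

// Truncation: the wide array is the source, the narrow array the destination.
void truncU32ToU8(const uint32_t *Src, uint8_t *Dst, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = static_cast<uint8_t>(Src[I]); // keeps only the low 8 bits
}

// Zero-extension: the narrow array is the source, the wide array the
// destination. This is what results if the two arrays are swapped.
void zextU8ToU32(const uint8_t *Src, uint32_t *Dst, std::size_t N) {
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = Src[I]; // the upper 24 bits are filled with zeros
}

// Sanity check: zero-extending and then truncating reproduces the input.
// In a Google Benchmark function, only the kernel call would sit inside
// the `for (auto _ : state) { ... }` loop mentioned in the review.
void checkRoundTrip(const std::vector<uint8_t> &In) {
  std::vector<uint32_t> Wide(In.size());
  std::vector<uint8_t> Narrow(In.size());
  zextU8ToU32(In.data(), Wide.data(), In.size());
  truncU32ToU8(Wide.data(), Narrow.data(), In.size());
  assert(Narrow == In);
}
```

Checking the output once outside the timed region, as `checkRoundTrip` does, is one way to catch a flipped source and destination before any timings are collected.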
Moved the truncate and zero-extend vector tests into a separate file. Added truncate tests for different data types, with different vectorization width settings.
Removed the addition operation so that only the truncate or zero-extend operation remains, for a more focused performance comparison.
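As a rough sketch of what such a focused kernel could look like, the loop below performs a single conversion per element and nothing else. The template name `convertLoop`, the width of 8, and the restriction to unsigned types are assumptions for illustration and not taken from the patch. The `#pragma clang loop vectorize_width(...)` hint is Clang-specific, so it is guarded for other compilers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>
#include <vector>

// Hypothetical helper, not from the patch. With unsigned element types, a
// wider SrcTy makes the conversion a truncate and a narrower SrcTy makes it
// a zero-extend; there is no addition or other arithmetic in the loop body.
template <typename SrcTy, typename DstTy>
void convertLoop(const SrcTy *Src, DstTy *Dst, std::size_t N) {
  static_assert(std::is_unsigned<SrcTy>::value &&
                    std::is_unsigned<DstTy>::value,
                "sketch assumes unsigned element types");
  assert(N == 0 || (Src != nullptr && Dst != nullptr));
#if defined(__clang__)
// Ask Clang for a fixed vectorization width; 8 is an arbitrary example value.
#pragma clang loop vectorize_width(8)
#endif
  for (std::size_t I = 0; I < N; ++I)
    Dst[I] = static_cast<DstTy>(Src[I]);
}
```

Instantiating the template with different type pairs, for example `uint64_t` to `uint16_t` for a truncate or `uint16_t` to `uint32_t` for a zero-extend, would cover several data types with one kernel body.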