This is an archive of the discontinued LLVM Phabricator instance.

Microbenchmark to test runtime of truncate or zero-extend vector operations in AArch64

Authored by nilanjana_basu on Oct 19 2022, 10:59 AM.



Add benchmarks to check runtime of truncate or zero-extend vector operations in AArch64.
This patch adds an initial set of benchmarks to check runtime of vectorized truncate or zero-extend operations in a loop for different vector types over different vector widths.
The goal of this initial benchmark is to check the impact of D133495, D135229 and D120571.

Event Timeline

Herald added a project: Restricted Project. Oct 19 2022, 10:59 AM
nilanjana_basu requested review of this revision. Oct 19 2022, 10:59 AM

Removed redundant code

fhahn requested changes to this revision. Oct 20 2022, 12:00 PM
fhahn added a subscriber: fhahn.

Thanks for the patch! I think it would be good to move the benchmark to a different file, as it is unrelated to measuring runtime check performance.

7 ↗(On Diff #468978)

It should be sufficient to use a much larger iteration count like 10000; the main benchmark loop will make sure the function is run long enough to collect stable data.

135 ↗(On Diff #468978)

I don't think this is doing what you want at the moment. Instead of truncating to i8, it is extending from i8. B and A should probably be flipped?
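To illustrate the direction issue raised in this comment, here is a minimal sketch of the two loop shapes. The function names and the choice of 32-bit and 8-bit element types are assumptions made for illustration; they are not taken from the patch. In a truncate the wide array is the source and the narrow array is the destination, and a zero-extend is the reverse, so swapping the two arrays silently changes which operation is measured.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Truncate: wide source (32-bit), narrow destination (8-bit).
// Only the low 8 bits of each element survive.
// (Hypothetical helper, not the code under review.)
void truncate_u32_to_u8(const std::uint32_t *src, std::uint8_t *dst,
                        std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = static_cast<std::uint8_t>(src[i]);
}

// Zero-extend: narrow source (8-bit), wide destination (32-bit).
// The upper 24 bits of each result are zero.
// (Hypothetical helper, not the code under review.)
void zext_u8_to_u32(const std::uint8_t *src, std::uint32_t *dst,
                    std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i];
}
```

Both loops are simple enough that a vectorizing compiler may turn them into vector truncate or extend instructions, but whether it does depends on the target and optimization flags.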

144 ↗(On Diff #468978)

It looks like this is missing the main benchmark loop that google benchmark requires:

  for (auto _ : state) {
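For readers unfamiliar with the loop shape mentioned here, the sketch below shows where setup and measured work sit relative to the range-for loop. To keep it compilable without the library, it uses a toy stand-in class, `ToyState`, which is hypothetical and is not Google Benchmark's actual `benchmark::State`; the real class decides the iteration count itself and handles timing. The function name and element types are also assumptions.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for benchmark::State (hypothetical, NOT the real class).
// It only supports range-for iteration for a fixed number of iterations.
class ToyState {
public:
  explicit ToyState(std::size_t iterations) : iterations_(iterations) {}

  struct Iterator {
    std::size_t remaining;
    bool operator!=(const Iterator &other) const {
      return remaining != other.remaining;
    }
    void operator++() { --remaining; }
    int operator*() const { return 0; }
  };

  Iterator begin() const { return Iterator{iterations_}; }
  Iterator end() const { return Iterator{0}; }

private:
  std::size_t iterations_;
};

// Shaped like a benchmark function: setup runs once, and only the code
// inside the range-for loop is the repeated (measured) region.
std::size_t run_truncate_benchmark(ToyState &state, std::size_t n) {
  std::vector<std::uint32_t> src(n, 0x12345678u);
  std::vector<std::uint8_t> dst(n);
  std::size_t loop_runs = 0;
  for (auto _ : state) {
    (void)_; // the loop variable is intentionally unused
    for (std::size_t i = 0; i < n; ++i)
      dst[i] = static_cast<std::uint8_t>(src[i]);
    ++loop_runs;
  }
  return loop_runs;
}
```

Without the outer loop, the kernel would not be repeated under the framework's control, which is the problem this inline comment points out.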
This revision now requires changes to proceed. Oct 20 2022, 12:00 PM
nilanjana_basu marked 2 inline comments as done.

Made a separate file for testing truncate or zero-extend vector operations. Added truncate tests for different data types, with different vectorization width settings.

nilanjana_basu marked an inline comment as done. Oct 20 2022, 7:25 PM

Fixed a mistake where the same test was being run twice

nilanjana_basu retitled this revision from Microbenchmark to test runtime of truncate vector operations in AArch64 to Microbenchmark to test runtime of truncate or zero-extend vector operations in AArch64.

Extended it to be generic enough for both truncate & zero-extend vector operations

nilanjana_basu set the repository for this revision to rT test-suite. Oct 25 2022, 5:12 PM

Removed two test cases whose related patches are not yet available.

nilanjana_basu edited the summary of this revision. Oct 31 2022, 1:15 PM
nilanjana_basu edited the summary of this revision.

All the comments have been addressed in the latest patch.

Removed the addition operation to keep only the truncate or zero-extend operation for a more focused performance comparison

t.p.northover accepted this revision. Nov 2 2022, 8:10 AM

I think this looks reasonable now.

Minor fix to comments

This revision was not accepted when it landed; it landed in state Needs Review. Nov 2 2022, 2:05 PM
This revision was automatically updated to reflect the committed changes.