This is an archive of the discontinued LLVM Phabricator instance.

Microbenchmark to test runtime of truncate or zero-extend vector operations in AArch64
Closed · Public

Authored by nilanjana_basu on Oct 19 2022, 10:59 AM.

Details

Summary

Add benchmarks to check runtime of truncate or zero-extend vector operations in AArch64.
This patch adds an initial set of benchmarks to check runtime of vectorized truncate or zero-extend operations in a loop for different vector types over different vector widths.
The goal of this initial benchmark is to check the impact of D133495, D135229 and D120571.
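For illustration only, a kernel of the kind described above might look like the following sketch. It is not taken from the patch: the function name, the element types (8-bit to 16-bit), and the width of 8 are all assumptions, and the pragma is Clang's loop hint for requesting a vectorization width.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical kernel (not from the patch): zero-extend each 8-bit element
// to 16 bits, asking Clang to vectorize the loop with a width of 8.
void zextU8ToU16Width8(const std::uint8_t *src, std::uint16_t *dst,
                       std::size_t n) {
  assert(n == 0 || (src != nullptr && dst != nullptr));
#if defined(__clang__)
  // Loop hint understood by Clang only; other compilers skip it.
#pragma clang loop vectorize(enable) vectorize_width(8)
#endif
  for (std::size_t i = 0; i < n; ++i)
    dst[i] = src[i]; // implicit unsigned widening, i.e. zero-extension
}
```

Varying the element types and the requested width would give one kernel per type and width combination, which is presumably what the benchmark set covers.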

Event Timeline

Herald added a project: Restricted Project. Oct 19 2022, 10:59 AM
nilanjana_basu requested review of this revision. Oct 19 2022, 10:59 AM

Removed redundant code

fhahn requested changes to this revision. Oct 20 2022, 12:00 PM
fhahn added a subscriber: fhahn.

Thanks for the patch! I think it would be good to move the benchmark to a different file, as it is unrelated to measuring runtime check performance.

MicroBenchmarks/LoopVectorization/RuntimeChecks.cpp
7 ↗(On Diff #468978)

It should be sufficient to use a much larger iteration count like 10000; the main benchmark loop will make sure the function is run long enough to collect stable data.

135 ↗(On Diff #468978)

I don't think this is doing what you want at the moment. Instead of truncating to i8, it is extending from i8. B and A should probably be flipped?
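To make the direction issue concrete, here is a minimal sketch of a truncating loop. The function and parameter names are hypothetical and do not correspond to the A and B arrays in the diff: truncation reads from the wide array and writes to the narrow one, so swapping the two roles turns the same loop into an extension.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical kernel (names are not from the diff): truncate 32-bit
// elements to 8 bits. The source is the wide type, the destination the
// narrow one; reversing the roles would extend from i8 instead.
void truncU32ToU8(const std::uint32_t *wideSrc, std::uint8_t *narrowDst,
                  std::size_t n) {
  assert(n == 0 || (wideSrc != nullptr && narrowDst != nullptr));
  for (std::size_t i = 0; i < n; ++i)
    narrowDst[i] = static_cast<std::uint8_t>(wideSrc[i]); // keeps low 8 bits
}
```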

144 ↗(On Diff #468978)

It looks like this is missing the main benchmark loop that google benchmark requires:

  for (auto _ : state) {
...
  }
This revision now requires changes to proceed. Oct 20 2022, 12:00 PM
nilanjana_basu marked 2 inline comments as done.

Made a separate file for testing truncate or zero-extend vector operations. Added truncation tests for different data types, with different vectorization width settings.

nilanjana_basu marked an inline comment as done. Oct 20 2022, 7:25 PM

Fixed a mistake where the same test was being run twice

nilanjana_basu retitled this revision from Microbenchmark to test runtime of truncate vector operations in AArch64 to Microbenchmark to test runtime of truncate or zero-extend vector operations in AArch64.

Extended it to be generic enough for both truncate & zero-extend vector operations

nilanjana_basu set the repository for this revision to rT test-suite. Oct 25 2022, 5:12 PM

Removed two test cases whose related patches are not yet available.

nilanjana_basu edited the summary of this revision. Oct 31 2022, 1:15 PM
nilanjana_basu edited the summary of this revision.

All the comments have been addressed in the latest patch.

Removed the addition operation to keep only the truncate or zero-extend operation for a more focused performance comparison

t.p.northover accepted this revision. Nov 2 2022, 8:10 AM

I think this looks reasonable now.

Minor fix to comments

This revision was not accepted when it landed; it landed in state Needs Review. Nov 2 2022, 2:05 PM
This revision was automatically updated to reflect the committed changes.