
[AArch64] Swap 'lsl(val1,small-shmt)' to right hand side for ADD(lsl(val1,small-shmt), lsl(val2,large-shmt))
Closed · Public

Authored by mingmingl on Oct 4 2022, 2:58 PM.

Details

Summary

On many AArch64 processors (Cortex-A78, Neoverse N1/N2/V1, etc.), ADD with a small LSL shift (shift amount <= 4) has lower latency and higher
throughput than ADD with a larger shift (shift amount > 4). On the remaining processors, the swap is at worst a no-op.
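
As a rough illustration of the swap in the title (a sketch only, not the code from this revision), the C++ below models the two shifted operands of an ADD and moves the small shift to the right-hand side. The AArch64 ADD (shifted register) form can shift only its second source operand, which is presumably why the right-hand side is where the cheap shift should end up. The names ShiftedOperand, kCheapShiftLimit, isCheapShift, canonicalizeAddOperands and evaluateAdd are hypothetical; the threshold of 4 comes from the summary above.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for one ADD operand of the form (value << shift).
struct ShiftedOperand {
  uint64_t value;
  unsigned shift;
};

// Assumption taken from the summary: a folded shift of at most 4 is cheap.
constexpr unsigned kCheapShiftLimit = 4;

inline bool isCheapShift(unsigned shift) { return shift <= kCheapShiftLimit; }

// If the left operand has a small shift and the right operand a large one,
// swap them so the small shift ends up on the right-hand side.
// Returns true when a swap happened.
inline bool canonicalizeAddOperands(ShiftedOperand &lhs, ShiftedOperand &rhs) {
  if (isCheapShift(lhs.shift) && !isCheapShift(rhs.shift)) {
    std::swap(lhs, rhs);
    return true;
  }
  return false;
}

// ADD is commutative, so the swap must not change the computed value.
inline uint64_t evaluateAdd(const ShiftedOperand &lhs,
                            const ShiftedOperand &rhs) {
  assert(lhs.shift < 64 && rhs.shift < 64 &&
         "shift amount must be below the register width");
  return (lhs.value << lhs.shift) + (rhs.value << rhs.shift);
}
```

For example, with operands (3 << 2) and (5 << 10), canonicalizeAddOperands swaps them so that the shift by 2 sits on the right, and evaluateAdd gives the same sum before and after. When both shifts are small, or both are large, the operands are left alone.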

Event Timeline

mingmingl created this revision. Oct 4 2022, 2:58 PM
Herald added a project: Restricted Project. Oct 4 2022, 2:58 PM
mingmingl requested review of this revision. Oct 4 2022, 2:58 PM
mingmingl updated this revision to Diff 465269. Oct 4 2022, 9:05 PM
mingmingl retitled this revision from [AArch64] Swap 'lsl(val1,small-shmt)' to right hand side for AND(lsl(val1,small-shmt), lsl(val2,large-shmt)) to [AArch64] Swap 'lsl(val1,small-shmt)' to right hand side for ADD(lsl(val1,small-shmt), lsl(val2,large-shmt)).
mingmingl edited the summary of this revision.

Run 'clang-format' and fix typo (AND->ADD)

dmgreen accepted this revision. Oct 9 2022, 1:43 AM

Sorry for the delay. This sounds OK to me. Like you say, it could be based on LSLFast, but it shouldn't be worse for other CPUs and would be worthwhile for generic at least.

Other than some nitpicks, LGTM

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
Line 16750: AND -> ADD
Line 16751: > 4
Line 16752: correctness ;)

This revision is now accepted and ready to land. Oct 9 2022, 1:43 AM
mingmingl updated this revision to Diff 466401. Oct 9 2022, 5:23 PM
mingmingl marked 3 inline comments as done.

Address the comments.

Thanks for the reviews!

mingmingl updated this revision to Diff 466402. Oct 9 2022, 5:25 PM

Fix the typo (ADDD -> ADD)

This revision was landed with ongoing or failed builds. Oct 9 2022, 5:38 PM
This revision was automatically updated to reflect the committed changes.