Hi,
This patch adds an AArch64 specific PostRA MachineScheduler to try to schedule STP Q's to the same base-address in ascending order of offsets. We have found this to improve performance on Neoverse N1 and should not hurt other AArch64 cores.
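To make the intended ordering concrete, here is a minimal standalone sketch. It is not the code from this patch: `StorePair`, `prefersBefore` and `orderAscending` are hypothetical names invented for illustration, and the model ignores data dependencies and every other scheduling heuristic that a real MachineScheduler strategy has to respect. It only shows the preference described above: among stores that share a base register, lower offsets go first.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an STP Q instruction; this is not an LLVM type.
struct StorePair {
  unsigned BaseReg; // register holding the base address
  int64_t Offset;   // immediate offset from the base
  int Id;           // position in the original sequence, for inspection
};

// True if A should be placed before B under the ascending-offset preference.
// The preference only applies when both stores use the same base register.
bool prefersBefore(const StorePair &A, const StorePair &B) {
  return A.BaseReg == B.BaseReg && A.Offset < B.Offset;
}

// Reorders each run of consecutive stores that share a base register so that
// offsets ascend. Stores with different bases are never moved past each other.
std::vector<StorePair> orderAscending(std::vector<StorePair> Stores) {
  auto RunBegin = Stores.begin();
  while (RunBegin != Stores.end()) {
    // Find the end of the run that uses the same base register.
    auto RunEnd = std::find_if(RunBegin, Stores.end(),
                               [&](const StorePair &S) {
                                 return S.BaseReg != RunBegin->BaseReg;
                               });
    // Stable sort keeps the original order for equal offsets.
    std::stable_sort(RunBegin, RunEnd,
                     [](const StorePair &A, const StorePair &B) {
                       return A.Offset < B.Offset;
                     });
    RunBegin = RunEnd;
  }
  return Stores;
}
```

In a real scheduler this preference would more plausibly act as a tie-break between ready candidates than as a sort over the whole sequence; the sort here is just the simplest way to show the resulting order.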
Differential D125377
[AArch64] Order STP Q's by ascending address
Closed, Public. Authored by avieira on May 11 2022, 5:12 AM.
Herald added subscribers: javed.absar, hiraditya, kristof.beyls and 2 others.

In order sounds sensible to me. We may need a subtarget feature for this, depending on what the Apple folks think, but we can always add one later if needed.

I *think* it should be fine, but I can double-check.
avieira added inline comments.
LGTM, I think that should be fine. Ordering by ascending address is likely also slightly more readable for our users.
This revision is now accepted and ready to land. May 19 2022, 1:11 PM

Hi @avieira, isn't Neoverse N1 an out-of-order core? Why does the order in which instructions are issued significantly affect performance?

Hi @Allen, I believe that 'scheduling doesn't matter for out-of-order cores' is a long-standing myth that we have seen to be incorrect. Scheduling is certainly different for out-of-order cores: the problem shifts towards thinking about a sliding window of instructions that go into specific pipelines, dispatch queues and the like. You find yourself trying to optimize pipeline utilisation and avoid bubbles, rather than looking for the 'perfect sequence'. Out-of-order execution has its own limits, and in some cases it helps if the compiler can lend the core a hand. In this case the Neoverse N1 prefers ascending STP Q's, and an updated Neoverse N1 Software Optimization Guide will reflect this.

This revision was landed with ongoing or failed builds. May 23 2022, 1:51 AM

Closed by commit rG572fc7d2fd14: [AArch64] Order STP Q's by ascending address (authored by avieira).

This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 429991
llvm/lib/Target/AArch64/AArch64MachineScheduler.h
llvm/lib/Target/AArch64/AArch64MachineScheduler.cpp
llvm/lib/Target/AArch64/AArch64TargetMachine.cpp
llvm/lib/Target/AArch64/CMakeLists.txt
llvm/test/CodeGen/AArch64/GlobalISel/call-translator-variadic-musttail.ll
llvm/test/CodeGen/AArch64/argument-blocks-array-of-struct.ll
llvm/test/CodeGen/AArch64/arm64-memset-inline.ll
llvm/test/CodeGen/AArch64/ragreedy-local-interval-cost.ll
llvm/test/CodeGen/AArch64/sve-fixed-length-fp-select.ll
llvm/utils/gn/secondary/llvm/lib/Target/AArch64/BUILD.gn
nit: Stray newline