This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] FP load balancing pass for Cortex-A57
ClosedPublic

Authored by jmolloy on Aug 5 2014, 8:37 AM.

Details

Summary

For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations.

This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU.

Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness).

This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected.

This patch has been reviewed off-list by Silviu Baranga and Jiangning Liu, and I have their LGTM.

Diff Detail

Event Timeline

jmolloy updated this revision to Diff 12197.Aug 5 2014, 8:37 AM
jmolloy retitled this revision from to [AArch64] FP load balancing pass for Cortex-A57.
jmolloy updated this object.
jmolloy edited the test plan for this revision. (Show Details)
jmolloy added a reviewer: rengolin.
jmolloy set the repository for this revision to rL LLVM.
jmolloy added a project: deleted.
jmolloy added a subscriber: t.p.northover.
rengolin removed a subscriber: t.p.northover.
rengolin accepted this revision.Aug 5 2014, 8:46 AM
rengolin edited edge metadata.

Hi James,

LGTM, but please wait for Tim's review.

cheers,
--renato

This revision is now accepted and ready to land.Aug 5 2014, 8:46 AM
t.p.northover edited edge metadata.Aug 8 2014, 2:19 AM

Hi James,

Just a couple of comments (one real, the other a nit), otherwise it looks fine to me.

Cheers.

Tim.

lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
318

Aren't these chains leaked? It *looks* like you could make this (or the set) a unique_ptr<Chain>.

520–522

It's usually neater to avoid the extra nesting:

if (!G->contains(I) && (&*I != G->getKill() || G->isKillImmutable())
  continue;
t.p.northover edited edge metadata.Aug 8 2014, 2:41 AM
t.p.northover added a subscriber: Unknown Object (MLST).

Just added llvm-commits.

lib/Target/AArch64/AArch64A57FPLoadBalancing.cpp
318

Also, this should definitely be the set. I hadn't noticed things get removed from ActiveChains when I wrote this. Should have been obvious, in retrospect.

rengolin closed this revision.Sep 17 2014, 3:19 PM

r215199