This feature explicitly enables the fusion of instructions for literal
generation and PC-relative address calculations on Cortex-A72, as
recommended in its Software Optimisation Guide, sections 4.11 and
4.12. This is NFC (no functional change), as the Cortex-A72 currently uses
the Cortex-A57 processor model, which already schedules those instructions
back-to-back when the PostRAScheduler is not used.
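
For readers unfamiliar with the two patterns, here is a minimal standalone sketch of the instruction pairs this kind of fusion is concerned with. It is a toy model, not the code in this patch or LLVM's actual fusion logic: the `Op` enum, the `Inst` struct and `isFusablePair` are hypothetical names invented for illustration, and the operand checks are simplified assumptions about the pairs (ADRP followed by a dependent ADD, and MOVZ followed by a MOVK of the upper half into the same register).

```cpp
#include <cstdint>

// Hypothetical opcode tags for this sketch; these are not LLVM's opcode enums.
enum class Op : std::uint8_t { ADRP, ADD_imm, MOVZ, MOVK, Other };

// Simplified instruction record (assumption: one destination, one source).
struct Inst {
  Op op;
  unsigned dst;   // destination register number
  unsigned src;   // source register number (unused for ADRP, MOVZ and MOVK)
  unsigned shift; // MOVZ/MOVK only: left shift applied to the 16-bit immediate
};

// Returns true if `second` depends on `first` in one of the two patterns,
// so a scheduler would want to keep them back-to-back.
inline bool isFusablePair(const Inst &first, const Inst &second) {
  // PC-relative address calculation:
  //   adrp xN, sym
  //   add  xM, xN, #:lo12:sym
  if (first.op == Op::ADRP && second.op == Op::ADD_imm)
    return second.src == first.dst;

  // 32-bit literal generation:
  //   movz wN, #lo16
  //   movk wN, #hi16, lsl #16
  if (first.op == Op::MOVZ && second.op == Op::MOVK)
    return second.dst == first.dst && first.shift == 0 && second.shift == 16;

  return false;
}
```

For example, `isFusablePair({Op::ADRP, 0, 0, 0}, {Op::ADD_imm, 1, 0, 0})` holds because the ADD reads the register the ADRP wrote, whereas an ADD reading an unrelated register would not qualify.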
lib/Target/AArch64/AArch64.td | |
---|---|
221 | Should this be listed right after FeatureFPARMv8 and not after FeatureNEON to respect consistent ordering? e.g. see ProcA57 above. |
Hi Florian,
Is the plan to add the missing features (which would also include FeatureFuseAES, I guess) and then turn PostRA scheduling on?
I think we should benchmark this with PostRA scheduling on, to check its impact.
Cheers,
Silviu
Hi Silviu,
Yes, the plan is to include FeatureFuseAES for A72 as well. I'm going to commit a patch soon that adds FeatureFuseAES for Cortex-A72, and I'll also put up a patch for review that makes instruction fusion slightly more aggressive on AArch64.
I didn't plan to do anything with respect to PostRA, but I could benchmark the difference with and without PostRA.
Cheers
> I didn't plan to do anything with respect to PostRA, but I could benchmark the difference with and without PostRA.
Hi Florian:
It looks to me like you don't have to do anything special for PostRA (see AArch64TargetMachine.cpp::282-292 - if(hasFuseLiterals()) ), other than benchmarking it, as Silviu mentions.
Best Regards, Javed.