This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Fix N2 SchedModel INS instruction latencies
Closed, Public

Authored by SjoerdMeijer on Feb 21 2023, 11:06 AM.

Details

Summary

The instruction regexp "^INSv", intended for the general-register-to-element insert, was also matching the element-to-element insert. According to the Software Optimization Guide [1], the element-to-element form has a latency of 2, not 5, so the N2 model was reporting the wrong latency for it.

I haven't done any performance runs with this change, because I don't have access to N2 hardware and because the fix is hopefully obvious enough. My use case is llvm-mca, which reports wrong results because of this.

[1] https://developer.arm.com/documentation/PJDOC-466751330-18256/latest/
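
The overmatching described in the summary can be illustrated outside of TableGen. The sketch below uses Python's `re.match` as a stand-in for how the instruction regexp is applied. The instruction names are assumptions based on the AArch64 backend's usual naming, and the tightened pattern is a hypothetical rendering of the "INSv..gpr" idea from the review, not necessarily the pattern that was committed.

```python
import re

# Assumed instruction names (not quoted from the patch).
GPR_TO_ELEMENT = ["INSvi8gpr", "INSvi16gpr", "INSvi32gpr", "INSvi64gpr"]
ELEMENT_TO_ELEMENT = ["INSvi8lane", "INSvi16lane", "INSvi32lane", "INSvi64lane"]

# The broad pattern quoted in the summary.
BROAD = re.compile(r"^INSv")
# Hypothetical tightened pattern that only accepts the gpr forms.
TIGHT = re.compile(r"^INSvi(8|16|32|64)gpr$")


def matched(pattern, names):
    """Return the names that the pattern matches from the start."""
    return [name for name in names if pattern.match(name)]


all_names = GPR_TO_ELEMENT + ELEMENT_TO_ELEMENT
broad_hits = matched(BROAD, all_names)
tight_hits = matched(TIGHT, all_names)

print("broad:", broad_hits)
print("tight:", tight_hits)
```

With the broad pattern, the element-to-element names are picked up as well, which is how they would end up with the 5-cycle latency meant for the gpr forms.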

Event Timeline

SjoerdMeijer created this revision. Feb 21 2023, 11:06 AM
SjoerdMeijer requested review of this revision. Feb 21 2023, 11:06 AM
dmgreen accepted this revision. Feb 21 2023, 11:47 PM

Sounds good to me. This LGTM but you may be able to remove the extra pattern, and it may be worth quickly adding tests for each of the 4 type sizes.

llvm/lib/Target/AArch64/AArch64SchedNeoverseN2.td, line 1216

I don't think this line needs to be added, as it will be handled by WriteV being N2Write_2cyc_1V already. The tighter regex on INSv..gpr should be enough. I believe that is what it means above in:

// ASIMD insert, element to element
// ...
// Handled by SchedAlias<WriteV[dq], ...>
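
The fallback behaviour described in this comment can be modelled roughly as follows. This is a simplified sketch, not how TableGen resolves scheduling classes internally: it assumes a single explicit override for the gpr forms (5 cycles, per the summary) and a default of 2 cycles standing in for WriteV being N2Write_2cyc_1V. The instruction names and the override pattern are assumptions.

```python
import re

# Default latency, standing in for WriteV mapped to N2Write_2cyc_1V.
DEFAULT_LATENCY = 2

# Explicit overrides, checked first. The pattern is a hypothetical
# tightened regexp for the general-register-to-element insert.
OVERRIDES = [
    (re.compile(r"^INSvi(8|16|32|64)gpr$"), 5),
]


def latency(name):
    """Return the override latency if one matches, else the default."""
    for pattern, cycles in OVERRIDES:
        if pattern.match(name):
            return cycles
    return DEFAULT_LATENCY


for name in ("INSvi32gpr", "INSvi32lane"):
    print(name, latency(name))
```

Under these assumptions, no separate entry is needed for the element-to-element form, since it falls through to the default.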
This revision is now accepted and ready to land. Feb 21 2023, 11:47 PM

Thanks, and I will apply those changes before committing.

This revision was landed with ongoing or failed builds. Feb 22 2023, 3:21 AM
This revision was automatically updated to reflect the committed changes.