This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SchedModels] Handle virtual registers in FP/NEON predicates
ClosedPublic

Authored by kpdev42 on Nov 26 2021, 7:26 AM.

Details

Summary

Current implementation of Check[HSDQ]Form predicates doesn’t handle virtual registers and therefore isn’t useful for pre-RA scheduling. Patch fixes this implementing two function predicates: CheckQForm for checking that instruction writes 128-bit NEON register and CheckFpOrNEON which checks that instruction writes FP register (any width). The latter supersedes Check[HSD]Form predicates which are not used individually.

OS Laboratory. Huawei Russian Research Institute. Saint-Petersburg

Diff Detail

Event Timeline

kpdev42 created this revision.Nov 26 2021, 7:26 AM
kpdev42 requested review of this revision.Nov 26 2021, 7:26 AM
kpdev42 updated this revision to Diff 398883.Jan 11 2022, 2:44 AM

Strip unneeded code and fix compilation issues

Ping

Sorry for the late reply.

The changes to the predicates look good to me.

However, I am not an expert of AArch64, so somebody else should review functions isFpOrNEON and isQForm.

This gets used by the exynos cpu schedule models, which are something I know little about. Ideally someone from that team would be able to check this and verify it was OK for performance. I'm not sure if that team exists in the same form it did in the past though.

llvm/lib/Target/AArch64/AArch64InstrInfo.h
106

This isn't a very general function if it only checks operand 0. There are AArch64 instructions that I would count as FP that return in X/W registers (the fcvt's for example). Perhaps make it clean in the description, that it is designed for use in scheduling models and only checks operand 0.

I would move this to bellow the getLoadStoreImmIdx and isPairableLdStInst too, so that the "load/store" functions can be kept together.

llvm/test/CodeGen/AArch64/misched-predicate-virtreg.mir
41

Is it possible to a better example? COPY instructions (especially no-op copys like these) are often expected to be removed by reg-alloc and any scheduling info you give them is likely misleading.

kpdev42 updated this revision to Diff 407861.Feb 11 2022, 6:24 AM

@dmgreen

Is it possible to a better example? COPY instructions (especially no-op copys like these) are often expected to be removed by reg-alloc and any scheduling info you give them is likely misleading.”

Well, after some further studying of Exynos M5 model the only instruction group where Q-Form predicate us used is FMINV/FMAXV/FMAXNMV/FMINNMV:

def : InstRW<[M5WriteNEONZ],  (instregex "^F(MAX|MIN)(NM)?Vv")>;

Q-Form of these instructions can be detected by operand 1, not operand 0. For instance default write resource for “fmaxv h0, v1.4h” is WriteVd and WriteVq for “fmaxv s0, v1.4s”. That’s why we’re now scanning all operands instead of just operand 0 (and both isFpOrNeon and isQForm can probably be considered general). See updated test case. Thank you!

dmgreen accepted this revision.Feb 14 2022, 1:14 AM

@dmgreen

Is it possible to a better example? COPY instructions (especially no-op copys like these) are often expected to be removed by reg-alloc and any scheduling info you give them is likely misleading.”

Well, after some further studying of Exynos M5 model the only instruction group where Q-Form predicate us used is FMINV/FMAXV/FMAXNMV/FMINNMV:

def : InstRW<[M5WriteNEONZ],  (instregex "^F(MAX|MIN)(NM)?Vv")>;

Q-Form of these instructions can be detected by operand 1, not operand 0. For instance default write resource for “fmaxv h0, v1.4h” is WriteVd and WriteVq for “fmaxv s0, v1.4s”. That’s why we’re now scanning all operands instead of just operand 0 (and both isFpOrNeon and isQForm can probably be considered general). See updated test case. Thank you!

I see. Perhaps it is sometimes clearer to be explicit in the schedule with opcode matching.

This looks like an improvement in either case, so LGTM.

This revision is now accepted and ready to land.Feb 14 2022, 1:14 AM
This revision was landed with ongoing or failed builds.Feb 17 2022, 2:43 AM
This revision was automatically updated to reflect the committed changes.