This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI.
ClosedPublic

Authored by javed.absar on May 12 2017, 2:31 AM.

Details

Summary

This patch adds missing scheds for Neon VLDx/VSTx instructions.
This will help one write schedulers easier/faster in the future for ARM sub-targets.
Existing models will not affected by this patch.

Diff Detail

Repository
rL LLVM

Event Timeline

javed.absar created this revision.May 12 2017, 2:31 AM
rengolin edited edge metadata.May 17 2017, 8:37 AM

Hi Javed,

This is an interesting addition, but without an idea how you're doing to use it, it looks like a lot of changes for not many improvements. Do you have a patch which can show this working on a target-specific basis?

Also, maybe the title should mention VLDx/VSTx, not "generic".

cheers,
--renato

javed.absar retitled this revision from [ARM] Generic Improvements to ARM Schedulers. NFCI. to [ARM] Add VLDx/VSTx sched defs for machine-schedulers. NFCI..May 19 2017, 2:49 AM

Hi Renato:
Thanks for reviewing. I will soon come back with the use-case, as you suggested.
Best Regards
Javed

javed.absar updated this revision to Diff 99805.EditedMay 22 2017, 2:41 PM

Hi Renato.

Here is one use for the sched-defs. I have simplified ARMScheduleR52.td where I found the cpu-specific vldx sched-defs to be equivalent to the generic ones.

The sched definitions provided here for vldx/vstx will help write schedulers quickly and efficiently in general. They wont cover all quirks for specific pipeleines of course.

Hi Javed,

Nice reduction! Do you have a rough idea of how much more we can save in complexity by this patch? For now, I think that one example is reason enough to do it anyway.

Just a small inline comment.

--renato

lib/Target/ARM/ARMScheduleR52.td
838 ↗(On Diff #99805)

Why not the same for VSTx?

Hi Renato:
It depends a bit on how much 'special features' the sub-target pipeline has. In some cases, just defining WriteRes to associate resource and latency etc with each SchedWrite type could suffice (e.g. for the Cortex-R52 the default WriteVLDx is sufficient).

For VSTx, unfortunately you will notice that things are bit different - e.g. R52Read_F2 which has a ReadAdvance of +2 and the default Sched in ARMInstrNEON.td does not capture it.
One could annotate further SchedRead types in ARMInstrNEON.td to capture this, but I think that would make it unnecessarily too detailed there.
Best Regards
--Javed

rengolin accepted this revision.May 23 2017, 9:08 AM

Ok, I think it's good as it is. Looking forward to more patches in this area. :)

--renato

This revision is now accepted and ready to land.May 23 2017, 9:08 AM
This revision was automatically updated to reflect the committed changes.