[ARM] Add more validForTailPredication

Authored by samparker on Sep 16 2020, 3:26 AM.



Modify the unit test to inspect all MVE instructions and mark the load/store/move of vpr/p0 as valid, as well as the remaining scalar shifts.

Event Timeline

samparker created this revision. Sep 16 2020, 3:26 AM
SjoerdMeijer accepted this revision. Sep 16 2020, 3:48 AM

Looks reasonable

This revision is now accepted and ready to land. Sep 16 2020, 3:48 AM
This revision was landed with ongoing or failed builds. Sep 16 2020, 3:54 AM
This revision was automatically updated to reflect the committed changes.
dmgreen added inline comments. Sep 16 2020, 10:09 AM

Can you explain what ramifications marking these as validForTailPredication has? And why does it help to make these part of the MVEDomain? (They are really just VFP instructions. They are unpredictable for non-MVE, but given the options I think I would move these and the scalar shifts out of the MVE domain!)

They sound like they would need to be handled specially in the backend pass anyway, but that pass is a bit complex for me to follow at this point.

samparker added inline comments. Sep 16 2020, 11:47 PM

It makes checking for instructions easier, as we ignore any non-MVE instructions since they shouldn't affect anything with tail predication. These load/store/move instructions can have an effect because they access VPR/P0. All of these require MVE, so I'm not sure I understand why you'd prefer them out of the MVEDomain?

dmgreen added inline comments. Sep 17 2020, 10:47 AM

It is useful to have a list of all the beatwise MVE instructions. We use the domain for that (downstream) in scheduling, which will need an adjustment for this. We essentially get to choose what goes into the domain (as far as I understand), so reserving it for just the set of instructions that work like MVE instructions sounds more useful to me.

The ARMLowOverheadLoops pass presumably has to look at VFP instructions anyway (and anything that could touch vfp regs).

samparker added inline comments. Sep 17 2020, 11:35 PM

This sounds fair enough; maybe MVEBeatDomain or something similar should be used for clarity. I can have a helper which looks for this domain and/or VPR operands. We don't explicitly track VFP regs, but they should be considered as uses/defs of their q-reg. Are there any instructions that you are particularly concerned with?
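
As a rough illustration of the helper proposed above, here is a minimal, self-contained sketch. It does not use the LLVM MachineInstr API: the `Domain`, `Reg`, and `Instr` types and the function names are all hypothetical stand-ins invented for this example, and it assumes the domain were narrowed to beat-wise instructions only.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for illustration only;
// these are NOT the real LLVM types or enums.
enum class Domain { General, VFP, MVEBeat };
enum class Reg { R0, S0, S7, Q0, Q1, VPR, P0 };

struct Instr {
  std::string Name;
  Domain Dom;
  std::vector<Reg> Operands;
};

// Returns true if any operand of the instruction is VPR or P0.
bool touchesVPR(const Instr &I) {
  for (Reg R : I.Operands)
    if (R == Reg::VPR || R == Reg::P0)
      return true;
  return false;
}

// Sketch of the proposed helper: an instruction is of interest if it is
// in the (assumed) beat-wise domain and/or has a VPR/P0 operand.
bool isRelevantToTailPredication(const Instr &I) {
  return I.Dom == Domain::MVEBeat || touchesVPR(I);
}
```

Under these assumptions, a VPR load outside the beat-wise domain is still picked up through its operand, while a plain VFP move such as `vmov s7, s0` is not flagged by this check at all.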

dmgreen added inline comments. Sep 21 2020, 12:11 AM

I was thinking of anything that crossed lanes, like vmov s7, s0. I think a reverse load at the moment (if it got past cost modelling) would be something like, but you could imagine something similar in a lot of other cases too, if they were written from intrinsics.