Page MenuHomePhabricator

[ARM][MVE] validForTailPredication insts

Authored by samparker on Sep 23 2019, 3:49 AM.



'Blacklist' a few more instruction classes, including:

  • instructions that 'narrow' their result.
  • lane moves.
  • byte swapping instructions.
  • interleaving loads and stores.
  • cross-beat carries.
  • top/bottom instructions.
  • complex operations.
  • long shifts and muls.

Please shout if you know of any other instructions that could be troublesome to predict.

Diff Detail

Event Timeline

samparker created this revision.Sep 23 2019, 3:49 AM

Exactly which instructions can and cannot go into tail predicated loops is still a bit of a mystery to me. As we talked about, a whitelist may be simpler than a blacklist to make sure we account for everything. (Thats maybe an approach if this ends up needing more in the future).

One thing I thought of, some lane moves may go through standard vmovs, or the vmovx and vins fp16 instructions. Also will interleaved fp instructions be fine in the same loop?

1945 ↗(On Diff #221281)

This has a cross beat carry, although I'm not sure that makes it invalid for tail predication.

1976 ↗(On Diff #221281)

VMOVL is used to perform widening.

2089 ↗(On Diff #221281)


3365 ↗(On Diff #221281)

Some of the complex instructions needed earlyclobber on them as they reach cross beat.

3387 ↗(On Diff #221281)


3489 ↗(On Diff #221281)

Is anything that is top/bottom a problem?

3536 ↗(On Diff #221281)

Cross beat carry again.

3931 ↗(On Diff #221281)

VDUP puts the value into all lanes, but I presume that would be OK.

Cheers Dave, top-bottom and early clobber sound like complications. Having a whitelist sounds like a good idea, with the list of our unknowns growing, it sounds like the switch statement could be smaller too!

samparker updated this revision to Diff 221926.Sep 26 2019, 5:26 AM
samparker retitled this revision from [ARM][MVE] More invalidForTailPredication insts to [ARM][MVE] validForTailPredication insts.
samparker edited the summary of this revision. (Show Details)

Inverted the predicate and created a whitelist instead.

danilaml added inline comments.Sep 26 2019, 5:30 AM
2 ↗(On Diff #221926)

Thanks! I realize that two other tests in this (Target) folder also have the same pattern in their CMakeLists and would benefit from the same adjustment. Not sure if it makes sense to include them in the same patch though.

samparker updated this revision to Diff 222844.Oct 2 2019, 8:36 AM

Added support for VFMA and VMLA

dmgreen accepted this revision.Oct 15 2019, 2:44 AM

Sounds good to me. Looks like a improvement over what was already here.

This revision is now accepted and ready to land.Oct 15 2019, 2:44 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptOct 15 2019, 6:15 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
This comment was removed by Yi-Hong.Lyu.