Page MenuHomePhabricator

[ARM] MVE saturating truncates

Authored by dmgreen on Apr 6 2020, 2:49 PM.



This adds some custom lowering for VQMOV, an instruction that can be used to perform saturating truncates from a pair of min(max(X, -0x8000), 0x7fff), providing those constants are correct. This leaves a VQMOVNBs which saturates the value and inserts that into the bottom lanes of an existing vector. We then need to do something with the other lanes, extending the value using a vmovlb.

Ideally, as will often be the case, only the bottom lane of what remains will be demanded, allowing the vmovlb to be removed. Which should mean the instruction is either a equal or a win most of the time, and allows some extra follow-up folding to happen.

Diff Detail

Event Timeline

dmgreen created this revision.Apr 6 2020, 2:49 PM
yroux added a subscriber: yroux.Apr 7 2020, 9:33 AM

I'm not that familiar with that part to approve it without the review from someone else, but it looks good to me.

About the tests: if I haven't missed anything, I don't see any VQMOVNT variants being generated?


Can we do an early exit here?

if (VT != MVT::v4i32 || VT != MVT::v8i16)
  return SDValue();

I think that then also saves a few else-clauses in the different if-elseif-else statements below.



Also: top half -> bottom half?


now starting to doubt if I read the ARMARM correct, but same typo here?


nit: perhaps a comment explaining more what the opcode is:

// Vector (V) Saturating (Q) Move and Narrow (N),  signed/unsigned (s/u)
dmgreen marked 4 inline comments as done.Apr 8 2020, 4:12 AM

Thanks for taking a look.

About the tests: if I haven't missed anything, I don't see any VQMOVNT variants being generated?

Not with this one. I'll put up a review that transforms VMOVNT(VQMOVNB) -> VQMOVNT, once I have the tests.


Yep. Will fix.

The "Signed extended in to the top half" means we signed extend the result of the VQMOVNB into the top half of the registers with the SIGN_EXTEND_INREG. i.e. we create a "bottom", but need to do something with the top half too.


Like it.

dmgreen updated this revision to Diff 255954.Apr 8 2020, 4:14 AM
dmgreen marked an inline comment as done.


SjoerdMeijer accepted this revision.Apr 8 2020, 4:33 AM

Cheers, nice optimisation.
One nit/question that doesn't need another review.


ah yes, &&, boolean logic, it always confuses me ;-)


sorry, just double checking again, should this be:

Signed extended in to the bottom half.
This revision is now accepted and ready to land.Apr 8 2020, 4:33 AM
dmgreen updated this revision to Diff 255973.Apr 8 2020, 5:10 AM
dmgreen marked an inline comment as done.
dmgreen added inline comments.

We are producing a:
v8i16 %X = VQMOVNB v8i16 undef, v4i32 %in
v4i32 %Y = bitcast %X to v4i32
v4i32 %res = SIGN_EXTEND_INREG %Y, v4i16

So the VQMOVNB will set the "bottom" lanes (of the v8i16), but leave the top lanes undef. The SIGN_EXTEND_INREG will extend the bottom half into the top half of the v4i32, maing sure it's the same value as before with the min/max pair. Hopefully the SIGN_EXTEND_INREG will be demand-bitted away in a lot of cases, leaving the VQMOVNB.

I've tried to write that into the comment.

SjoerdMeijer added inline comments.Apr 8 2020, 5:55 AM

Ah, yep, sorry, got it.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptSat, May 16, 7:22 AM
Herald added a subscriber: danielkiss. · View Herald Transcript