This is an archive of the discontinued LLVM Phabricator instance.

[ARM] MVE saturating truncates
ClosedPublic

Authored by dmgreen on Apr 6 2020, 2:49 PM.

Details

Summary

This adds some custom lowering for VQMOV, an instruction that can be used to perform saturating truncates from a pair of min(max(X, -0x8000), 0x7fff), providing those constants are correct. This leaves a VQMOVNBs which saturates the value and inserts that into the bottom lanes of an existing vector. We then need to do something with the other lanes, extending the value using a vmovlb.

Ideally, as will often be the case, only the bottom lane of what remains will be demanded, allowing the vmovlb to be removed. Which should mean the instruction is either a equal or a win most of the time, and allows some extra follow-up folding to happen.

Diff Detail

Event Timeline

dmgreen created this revision.Apr 6 2020, 2:49 PM
yroux added a subscriber: yroux.Apr 7 2020, 9:33 AM

I'm not that familiar with that part to approve it without the review from someone else, but it looks good to me.

About the tests: if I haven't missed anything, I don't see any VQMOVNT variants being generated?

llvm/lib/Target/ARM/ARMISelLowering.cpp
14883

Can we do an early exit here?

if (VT != MVT::v4i32 || VT != MVT::v8i16)
  return SDValue();

I think that then also saves a few else-clauses in the different if-elseif-else statements below.

14921

Typo? VQMONVB -> VQMOVB?

Also: top half -> bottom half?

14965

now starting to doubt if I read the ARMARM correct, but same typo here?

llvm/lib/Target/ARM/ARMISelLowering.h
207

nit: perhaps a comment explaining more what the opcode is:

// Vector (V) Saturating (Q) Move and Narrow (N),  signed/unsigned (s/u)
dmgreen marked 4 inline comments as done.Apr 8 2020, 4:12 AM

Thanks for taking a look.

About the tests: if I haven't missed anything, I don't see any VQMOVNT variants being generated?

Not with this one. I'll put up a review that transforms VMOVNT(VQMOVNB) -> VQMOVNT, once I have the tests.

llvm/lib/Target/ARM/ARMISelLowering.cpp
14921

Yep. Will fix.

The "Signed extended in to the top half" means we signed extend the result of the VQMOVNB into the top half of the registers with the SIGN_EXTEND_INREG. i.e. we create a "bottom", but need to do something with the top half too.

llvm/lib/Target/ARM/ARMISelLowering.h
207

Like it.

dmgreen updated this revision to Diff 255954.Apr 8 2020, 4:14 AM
dmgreen marked an inline comment as done.

Updates.

SjoerdMeijer accepted this revision.Apr 8 2020, 4:33 AM

Cheers, nice optimisation.
One nit/question that doesn't need another review.

llvm/lib/Target/ARM/ARMISelLowering.cpp
14884

ah yes, &&, boolean logic, it always confuses me ;-)

14921

sorry, just double checking again, should this be:

Signed extended in to the bottom half.
This revision is now accepted and ready to land.Apr 8 2020, 4:33 AM
dmgreen updated this revision to Diff 255973.Apr 8 2020, 5:10 AM
dmgreen marked an inline comment as done.
dmgreen added inline comments.
llvm/lib/Target/ARM/ARMISelLowering.cpp
14921

We are producing a:
v8i16 %X = VQMOVNB v8i16 undef, v4i32 %in
v4i32 %Y = bitcast %X to v4i32
v4i32 %res = SIGN_EXTEND_INREG %Y, v4i16

So the VQMOVNB will set the "bottom" lanes (of the v8i16), but leave the top lanes undef. The SIGN_EXTEND_INREG will extend the bottom half into the top half of the v4i32, maing sure it's the same value as before with the min/max pair. Hopefully the SIGN_EXTEND_INREG will be demand-bitted away in a lot of cases, leaving the VQMOVNB.

I've tried to write that into the comment.

SjoerdMeijer added inline comments.Apr 8 2020, 5:55 AM
llvm/lib/Target/ARM/ARMISelLowering.cpp
14921

Ah, yep, sorry, got it.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMay 16 2020, 7:22 AM
Herald added a subscriber: danielkiss. · View Herald Transcript