[ARM] Convert VDUPLANE to VDUP under MVE
Closed, Public

Authored by dmgreen on Thu, May 7, 3:31 PM.

Details

Summary

Unlike Neon, MVE does not have a way of duplicating from a vector lane, so a VDUPLANE currently selects to a VDUP(move_from_lane(..)). This patch performs that conversion earlier, as a DAG combine, so that other folds can happen.

It converts to a VDUP(EXTRACT). On FP16 this is then folded to a VGETLANEu to prevent it from creating a vmovx;vmovhr pair, using a single move_from_reg instead.
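The rewrite described in the summary can be sketched on a toy expression tree. This is a minimal, self-contained model and not the LLVM SelectionDAG API: `Node`, `make`, `toString`, and `combineVDUPLANE` are hypothetical names invented for illustration, and the `HasMVE` flag stands in for whatever subtarget check the real combine uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a DAG node (hypothetical; NOT llvm::SDNode).
struct Node;
using NodeRef = std::shared_ptr<Node>;

struct Node {
  std::string Op;            // opcode name, or a leaf name such as "v" or "2"
  std::vector<NodeRef> Ops;  // operand nodes; empty for leaves
};

// Helper to build a node from an opcode name and its operands.
NodeRef make(std::string Op, std::vector<NodeRef> Ops = {}) {
  return std::make_shared<Node>(Node{std::move(Op), std::move(Ops)});
}

// Render a tree as text, e.g. "VDUP(EXTRACT(v, 2))".
std::string toString(const NodeRef &N) {
  if (N->Ops.empty())
    return N->Op;
  std::string S = N->Op + "(";
  for (std::size_t I = 0; I < N->Ops.size(); ++I) {
    if (I)
      S += ", ";
    S += toString(N->Ops[I]);
  }
  return S + ")";
}

// Under MVE, rewrite VDUPLANE(vec, lane) into VDUP(EXTRACT(vec, lane)).
// Any other node, or any node when MVE is not in use, is returned unchanged.
NodeRef combineVDUPLANE(const NodeRef &N, bool HasMVE) {
  if (!HasMVE || N->Op != "VDUPLANE" || N->Ops.size() != 2)
    return N;
  NodeRef Extract = make("EXTRACT", {N->Ops[0], N->Ops[1]});
  return make("VDUP", {Extract});
}
```

In this model, `combineVDUPLANE(make("VDUPLANE", {make("v"), make("2")}), true)` yields a tree that prints as `VDUP(EXTRACT(v, 2))`, while passing `false` leaves the VDUPLANE node as it was, mirroring the Neon case where the node is still wanted.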

Event Timeline

dmgreen created this revision. Thu, May 7, 3:31 PM

Some of the code differences here make me suspect we're missing combines for VDUPLANE. But that's not really something you need to concern yourself with here, I guess.

If you never want VDUPLANE, it doesn't seem like there's much point to generating it in the first place; I guess you want to continue supporting it just to make it easier to share code between NEON and MVE?

llvm/lib/Target/ARM/ARMISelLowering.cpp:13858

I guess if you didn't have a special case for f16 here, you could still eventually get to the same place, but it would take some extra steps?

dmgreen marked an inline comment as done. Fri, May 8, 12:45 AM

If you never want VDUPLANE, it doesn't seem like there's much point to generating it in the first place; I guess you want to continue supporting it just to make it easier to share code between NEON and MVE?

Yep. They can be generated in a few different places, and although it would be possible to stop them being created, it complicates the logic. I agree it's strange on its own to create a node only to convert it into something else, but if it keeps the buildvector/vectorshuffle code simpler and helps them be shared between Neon and MVE, I think this is probably simpler overall.

llvm/lib/Target/ARM/ARMISelLowering.cpp:13858

I was originally thinking this would need to look at the demanded bits of the VMOVrh, which complicates things, but yeah, it's simpler than that. With VGETLANEu we can add a fold easily enough and still get the top lanes correct. I can change things around to do it that way.

dmgreen updated this revision to Diff 262828. Fri, May 8, 12:56 AM

Now with an extra VMOVrh(extract(..)) -> VGETLANEu fold.
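As a rough illustration of why such a fold can preserve the result, the following value-level sketch models both forms on raw 16-bit lane patterns. It is not LLVM code. The function names are hypothetical, and it assumes that VMOVrh leaves the top 16 bits of the 32-bit destination register zero and that VGETLANEu zero-extends the selected lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Eight 16-bit lanes of a 128-bit vector, holding raw f16 bit patterns.
using V8x16 = std::array<std::uint16_t, 8>;

// Model of extract(vec, lane): the raw 16 bits of one lane.
std::uint16_t extractLane(const V8x16 &V, std::size_t Lane) {
  return V.at(Lane);
}

// Model of VMOVrh: move a half into a 32-bit register.
// Assumption: the top 16 bits of the result are zero.
std::uint32_t vmovrh(std::uint16_t Half) {
  return static_cast<std::uint32_t>(Half);
}

// Model of VGETLANEu: zero-extending read of one lane into a 32-bit register.
std::uint32_t vgetlaneu(const V8x16 &V, std::size_t Lane) {
  return static_cast<std::uint32_t>(V.at(Lane));
}

// Check that VMOVrh(extract(V, L)) and VGETLANEu(V, L) agree for every lane.
bool foldIsSound(const V8x16 &V) {
  for (std::size_t L = 0; L < V.size(); ++L)
    if (vmovrh(extractLane(V, L)) != vgetlaneu(V, L))
      return false;
  return true;
}
```

Under these assumptions both forms produce the same 32-bit value, including the zeroed upper half, which is what lets the single lane move replace the two-step sequence in the model.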

dmgreen edited the summary of this revision. Fri, May 8, 12:57 AM
This revision is now accepted and ready to land. Fri, May 8, 10:29 AM
This revision was automatically updated to reflect the committed changes.