This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Split FPExt loads
ClosedPublic

Authored by dmgreen on Jun 7 2020, 4:25 AM.

Details

Summary

This extends PerformSplittingToWideningLoad to handle FP_Ext in addition to sign and zero extends. It uses an integer extending load followed by a VCVTL on the bottom lanes to efficiently perform an fpext on a smaller-than-legal type.
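To make the load-then-convert idea concrete, here is a minimal standalone C++ model. It is not the code from this patch. It assumes, for concreteness, a zero-extending load of four f16 values into 32-bit lanes, followed by a per-lane conversion of the bottom 16 bits to f32. The function names (halfBitsToFloat, zextLoad4xI16, convertBottomHalves) are hypothetical and exist only for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>

// Decode IEEE 754 binary16 bits into a float (a software stand-in for the
// hardware f16 -> f32 conversion).
float halfBitsToFloat(std::uint16_t h) {
  const bool negative = (h >> 15) & 1;
  const int exponent = (h >> 10) & 0x1f;
  const int mantissa = h & 0x3ff;
  float magnitude;
  if (exponent == 0) {
    // Zero or subnormal: mantissa * 2^-24.
    magnitude = std::ldexp(static_cast<float>(mantissa), -24);
  } else if (exponent == 31) {
    // Infinity or NaN.
    magnitude = mantissa ? std::numeric_limits<float>::quiet_NaN()
                         : std::numeric_limits<float>::infinity();
  } else {
    // Normal: (1024 + mantissa) * 2^(exponent - 25).
    magnitude = std::ldexp(static_cast<float>(mantissa | 0x400), exponent - 25);
  }
  return negative ? -magnitude : magnitude;
}

// Step 1 (assumed zero-extending): each 16-bit value from memory lands in the
// bottom half of its own 32-bit lane.
std::array<std::uint32_t, 4> zextLoad4xI16(const std::uint16_t *mem) {
  assert(mem != nullptr);
  std::array<std::uint32_t, 4> lanes{};
  for (std::size_t i = 0; i < lanes.size(); ++i)
    lanes[i] = mem[i];
  return lanes;
}

// Step 2: convert the bottom 16 bits of every 32-bit lane from f16 to f32.
// The upper 16 bits of each lane are ignored.
std::array<float, 4>
convertBottomHalves(const std::array<std::uint32_t, 4> &lanes) {
  std::array<float, 4> out{};
  for (std::size_t i = 0; i < lanes.size(); ++i)
    out[i] = halfBitsToFloat(static_cast<std::uint16_t>(lanes[i] & 0xffff));
  return out;
}
```

Because every f16 already sits in the bottom half of the lane it will occupy as an f32, no value has to move between lanes in this model.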

The existing code had to be rewritten a little so that, rather than just splitting the node in two and letting legalization handle it from there, it actually splits the load into legal chunks.
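As a rough illustration of splitting into legal chunks, the sketch below computes which pieces a wide extending load would be divided into. It is a standalone model, not the patch's implementation. It assumes 128-bit vector registers, and the Chunk struct and splitIntoLegalChunks function are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One legal piece of the original wide load (hypothetical helper type).
struct Chunk {
  std::size_t firstElt;   // index of the first element covered
  std::size_t numElts;    // number of elements in this chunk
  std::size_t byteOffset; // offset of the chunk's narrow load from the base
};

// Split an extending load of numElts elements (srcEltBits -> dstEltBits) into
// chunks whose extended result exactly fills one regBits-wide register.
std::vector<Chunk> splitIntoLegalChunks(std::size_t numElts,
                                        unsigned srcEltBits,
                                        unsigned dstEltBits,
                                        unsigned regBits = 128) {
  assert(srcEltBits % 8 == 0 && dstEltBits > srcEltBits);
  assert(regBits % dstEltBits == 0);
  const std::size_t eltsPerChunk = regBits / dstEltBits;
  assert(numElts % eltsPerChunk == 0);

  std::vector<Chunk> chunks;
  for (std::size_t first = 0; first < numElts; first += eltsPerChunk)
    chunks.push_back({first, eltsPerChunk, first * (srcEltBits / 8)});
  return chunks;
}
```

Under these assumptions, a 16-element f16 to f32 extend becomes four chunks of four elements at byte offsets 0, 8, 16 and 24, instead of two halves that would each need splitting again.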

Event Timeline

dmgreen created this revision. Jun 7 2020, 4:25 AM

You prefer to do more loads, as opposed to doing a single load and shuffling the result? Are loads really that cheap, or is the alternative just too terrible?

Yeah, both actually. Loads are expected to be cheap (you can usually do a load with no stalls into a following MVE instruction), and the alternative is having to shuffle every lane into and out of registers.

MVE was designed with this "beats" system in mind, where 32-bit chunks of the vector can architecturally overlap. Any instructions that cross beats are deemed to be expensive, and many that you would expect just don't exist. So there is nothing that takes the bottom 4 lanes of a v8i16 and extends them into a v4i32. All the extends are done as values are loaded, or are done with t/b instructions like vmovlt/b.
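For readers unfamiliar with the top/bottom pattern, here is a small standalone C++ model of what a signed vmovlb/vmovlt-style extend selects. It only models lane selection, with "bottom" meaning the low 16 bits of each 32-bit chunk (even 16-bit lanes) and "top" the high 16 bits (odd lanes). The Half enum and movl function are hypothetical names for this sketch.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class Half { Bottom, Top };

// Model of a signed top/bottom extend from eight 16-bit lanes to four 32-bit
// lanes. Each result lane reads only from its own 32-bit chunk of the source,
// so no data crosses a chunk boundary.
std::array<std::int32_t, 4> movl(const std::array<std::int16_t, 8> &src,
                                 Half half) {
  std::array<std::int32_t, 4> out{};
  const std::size_t offset = (half == Half::Top) ? 1 : 0;
  for (std::size_t i = 0; i < out.size(); ++i)
    out[i] = src[2 * i + offset]; // implicit sign extension to 32 bits
  return out;
}
```

With source lanes 0 through 7, the bottom variant yields lanes {0, 2, 4, 6} and the top variant {1, 3, 5, 7}. Neither produces the contiguous {0, 1, 2, 3}, which is why a lane shuffle would be needed if the extend were not done as part of the load.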

Just adding up instructions will give a rough indication of cost. In some places there will be more to it depending on the CPU, but the M in MVE stands for M-Profile (err, I think), so in many ways it's fairly simple.

efriedma accepted this revision. Jun 9 2020, 10:21 AM

LGTM with one minor comment.

llvm/lib/Target/ARM/ARMISelLowering.cpp
15159

Instead of looping like this, you can probably just return DAG.getNode(ISD::CONCAT_VECTORS, DL, ConcatVT, Loads);.

This revision is now accepted and ready to land. Jun 9 2020, 10:21 AM
This revision was automatically updated to reflect the committed changes.