This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Select VINS from vector inserts
ClosedPublic

Authored by dmgreen on Jan 25 2021, 11:43 AM.

Details

Summary

This patch adds tablegen patterns for pairs of i16/f16 insert/extracts. If we are inserting into two adjacent vector lanes (0 and 1 for example), we can use either a vmov; vins or vmovx; vins` to insert the pair together, avoiding a round-trip from GRP registers. This is quite a large patterns with a number of EXTRACT_SUBREG/INSERT_SUBREG/COPY_TO_REGCLASS nodes, but hopefully as most of those become copies all that will be cleaned up by further optimizations.

The VINS pattern was also adjusted to allow it to represent that it is inserting into the top half of an existing register.

Diff Detail

Event Timeline

dmgreen created this revision.Jan 25 2021, 11:43 AM
dmgreen requested review of this revision.Jan 25 2021, 11:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 25 2021, 11:43 AM
SjoerdMeijer added inline comments.Jan 26 2021, 5:28 AM
llvm/lib/Target/ARM/ARMInstrVFP.td
1131

It's unclear to me why we need 2 inputs to model that it is inserting into the top half of an existing register. I forgot if this is how that's done, is there precedent for this?

dmgreen added inline comments.Jan 26 2021, 9:37 AM
llvm/lib/Target/ARM/ARMInstrVFP.td
1131

It's similar to how most things work, yeah. Things like VMOVHcc work like that, or any of the MVE vp.inactive where you conditionally move lanes of the input into the output. The VMOVN also have an input for Qd_src and an output for Qd, combining the two into a single value.

We are taking some of the value from Sda (the bottom half) and some from Sm (the top half), combining them into a single value.

SjoerdMeijer added inline comments.Jan 26 2021, 9:45 AM
llvm/lib/Target/ARM/ARMInstrVFP.td
1131

We are taking some of the value from Sda (the bottom half) and some from Sm (the top half), combining them into a single value.

Okay, but should the constraint then be:

"$Sd = $Sm"

Sorry, I might be using you to relearn isel patterns....

SjoerdMeijer added inline comments.Jan 26 2021, 9:47 AM
llvm/lib/Target/ARM/ARMInstrVFP.td
1131

correction:

"$Sda = $Sm"
dmgreen updated this revision to Diff 319371.EditedJan 26 2021, 11:55 AM

I managed to simplify the f16 side of this whilst working on another pattern. It now only handles the two inserts - not requiring them to be extracts too. This allows it to handle more cases with less code, which is always good to see.

It does require adding an added complexity to VCVTT to make it take precedence still.

llvm/lib/Target/ARM/ARMInstrVFP.td
1131

I think it's good as-is - we want to make sure the Sd and Sda are the same register. The final instruction is then VINS Sd, Sm, and with Sda==Sd we are now inserting Sm into the top half of Sda, to produce Sd.

SjoerdMeijer accepted this revision.Jan 27 2021, 7:58 AM

LGTM

llvm/lib/Target/ARM/ARMInstrVFP.td
1131

Ah, yep, got it again, thanks.

This revision is now accepted and ready to land.Jan 27 2021, 7:58 AM
This revision was automatically updated to reflect the committed changes.