This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE] Asm: Add MOVPRFX instructions.
ClosedPublic

Authored by sdesmalen on Jul 20 2018, 2:36 AM.

Details

Summary

This patch adds predicated and unpredicated MOVPRFX instructions, which
can be prepended to SVE instructions that are destructive on their first
source operand, to make them a constructive operation, e.g.

add z1.s, p0/m, z1.s, z2.s        <=> z1 = z1 + z2

can be made constructive:

movprfx z0, z1
add z0.s, p0/m, z0.s, z2.s        <=> z0 = z1 + z2

The predicated MOVPRFX instruction can additionally be used to zero
inactive elements, e.g.

movprfx z0.s, p0/z, z1.s
add z0.s, p0/m, z0.s, z2.s
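
The lane-level effect of the examples above can be sketched in C++. This is only an illustrative model, not code from the patch: the four-lane vector width and the names Vec, Pred, addMerging, movprfx and movprfxZeroing are assumptions made for the sketch.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Illustrative model only: a "vector" of four 32-bit lanes and one
// predicate bit per lane. Real SVE vectors have an implementation-defined
// length.
constexpr std::size_t kLanes = 4;
using Vec = std::array<std::uint32_t, kLanes>;
using Pred = std::array<bool, kLanes>;

// add zdn.s, pg/m, zdn.s, zm.s
// Active lanes receive zdn + zm; inactive lanes keep the old zdn value.
Vec addMerging(Vec zdn, const Pred &pg, const Vec &zm) {
  for (std::size_t i = 0; i < kLanes; ++i)
    if (pg[i])
      zdn[i] += zm[i];
  return zdn;
}

// movprfx zd, zn
// Unpredicated form: a plain copy of the source register.
Vec movprfx(const Vec &zn) { return zn; }

// movprfx zd.s, pg/z, zn.s
// Zeroing form: active lanes copy zn, inactive lanes become zero.
Vec movprfxZeroing(const Pred &pg, const Vec &zn) {
  Vec zd{};  // value-initialised, so every lane starts at zero
  for (std::size_t i = 0; i < kLanes; ++i)
    if (pg[i])
      zd[i] = zn[i];
  return zd;
}
```

In this model, addMerging(movprfx(z1), p0, z2) leaves z1 untouched and keeps z1's values in the inactive lanes of the result, whereas addMerging(movprfxZeroing(p0, z1), p0, z2) yields zero in the inactive lanes.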

Not all instructions can be prefixed with MOVPRFX, which is why this
patch also adds a mechanism to validate prefixed instructions. The exact
rules for when a MOVPRFX applies are detailed in the SVE supplement of
the Architecture Reference Manual.
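
As a rough idea of what such validation involves, the sketch below checks a few conditions one would expect for the unpredicated form. It is a simplified assumption, not the patch's implementation: the Inst and MovPrfx structs, the checkPrefixed function and the diagnostic strings are all hypothetical, and the authoritative rule set (including the extra conditions for the predicated form) is the one in the SVE supplement.

```cpp
#include <optional>
#include <string>

// Hypothetical, heavily simplified view of a decoded SVE instruction.
struct Inst {
  bool isDestructive;    // first source operand is also the destination
  unsigned destReg;      // Z register number that is written
  unsigned otherSrcReg;  // Z register number of a second source operand
};

// Hypothetical view of a preceding unpredicated "movprfx zd, zn".
struct MovPrfx {
  unsigned destReg;  // Z register number written by the prefix
};

// Returns a diagnostic if `next` is not a sensible instruction to follow
// `prefix`, or std::nullopt if the pair passes these simplified checks.
std::optional<std::string> checkPrefixed(const MovPrfx &prefix,
                                         const Inst &next) {
  if (!next.isDestructive)
    return "instruction cannot be prefixed with movprfx";
  if (next.destReg != prefix.destReg)
    return "prefixed instruction writes a different destination";
  if (next.otherSrcReg == prefix.destReg)
    return "movprfx destination is also read as another source";
  return std::nullopt;
}
```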

This is patch [1/2] in a series to add MOVPRFX instructions:

Diff Detail

Repository
rL LLVM

Event Timeline

sdesmalen created this revision. Jul 20 2018, 2:36 AM

I've been discussing this with @olista01, and we feel it is a bit heavyweight to build another lookup table and put things in AArch64SystemOperands when all we want to do is mark an instruction if it can be combined with MOVPRFX. Machine instructions have a field for target-specific flags (TSFlags), and we think this could be a good way to mark an instruction if it can be combined. From a quick grep, it looks like we don't use TSFlags yet in AArch64AsmParser, but we do in the ARM backend and ARMAsmParser, so there are some examples there.

Hi Sjoerd, thanks for your feedback! We actually use the TSFlags in our downstream assembler, but had a few reasons to re-implement it with a GenericTable instead:

  • The TSFlags are used for all instructions, so they take up global bits for a relatively small selection of instructions, and these bits make little sense for non-movprfxable instructions. To better utilize the (potentially) cheap move/zeroing capabilities of MOVPRFX, we'll need more fields/bits to describe the kind of operation, e.g. whether it is unary, binary, or commutative, or whether it has a reversed operation (e.g. sub and subr); in our downstream compiler this takes a total of 9 bits. With this in mind, I thought it made more sense to describe this in a separate table instead.
  • The table is not queried that often (only when encountered together with a MOVPRFX), so there is little runtime overhead.
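
For illustration, a separate table of this kind could look roughly like the sketch below: a small array sorted by opcode and searched only when a MOVPRFX is seen. Everything in it is an assumption made for the example (the PrefixInfo fields, the opcode numbers, the lookupPrefixInfo name); it is not the GenericTable that TableGen would emit for this patch.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical opcode numbers, chosen only for this sketch.
constexpr std::uint16_t kOpAdd = 10;
constexpr std::uint16_t kOpSub = 20;
constexpr std::uint16_t kOpSubr = 21;

// Hypothetical per-instruction record holding the kind of properties
// mentioned above.
struct PrefixInfo {
  std::uint16_t opcode;
  bool isBinary;
  bool isCommutative;
  std::uint16_t reversedOpcode;  // 0 when there is no reversed form
};

// The table must stay sorted by opcode for the binary search below.
constexpr std::array<PrefixInfo, 3> kPrefixTable{{
    {kOpAdd, true, true, 0},
    {kOpSub, true, false, kOpSubr},
    {kOpSubr, true, false, kOpSub},
}};

// Returns the entry for `opcode`, or nullptr if the instruction is not
// listed as prefixable.
const PrefixInfo *lookupPrefixInfo(std::uint16_t opcode) {
  auto it = std::lower_bound(
      kPrefixTable.begin(), kPrefixTable.end(), opcode,
      [](const PrefixInfo &entry, std::uint16_t key) {
        return entry.opcode < key;
      });
  return (it != kPrefixTable.end() && it->opcode == opcode) ? &*it : nullptr;
}
```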

What was the part that you think is heavyweight about creating a table in AArch64SystemOperands?

Okay, maybe "heavyweight" was not the right word to describe it, but I meant setting up all the machinery here: creating the lookup tables and doing the lookup. Whether an instruction can be combined is a property of the instruction, so it looks like a natural fit to encode this little bit of extra semantic information in the TSFlags of an instruction. It looks like TSFlags was invented to give instructions these kinds of (niche) target properties, and it looks like this is what other backends are doing too. Yes, they are used for all instructions, but it's just 1 bit in a 64-bit integer, and we are not using a lot of them yet, so we don't risk exceeding the maximum of 64.
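
A minimal sketch of the single-bit idea, assuming a hypothetical bit position and hypothetical names (InstrDesc, kDestructiveBit, canFollowMovprfx); the only detail taken from LLVM is that an instruction description carries a 64-bit TSFlags value.

```cpp
#include <cstdint>

// Hypothetical bit position; a real target assigns its own layout in its
// instruction format definitions.
constexpr std::uint64_t kDestructiveBit = std::uint64_t{1} << 4;

// Hypothetical stand-in for an instruction description that carries
// 64 bits of target-specific flags.
struct InstrDesc {
  std::uint64_t TSFlags;
};

// True if the instruction is marked as destructive, and hence a candidate
// for being prefixed with MOVPRFX.
bool canFollowMovprfx(const InstrDesc &desc) {
  return (desc.TSFlags & kDestructiveBit) != 0;
}
```

The query is a single mask-and-test with no table search, which is the simplicity argued for above; the cost is that the bit is reserved in every instruction's flags.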

Updated the patch to use TSFlags instead of a separate look-up table to annotate instructions as being destructive operations.

SjoerdMeijer accepted this revision. Jul 30 2018, 6:39 AM

LGTM, thanks!

This revision is now accepted and ready to land. Jul 30 2018, 6:39 AM
This revision was automatically updated to reflect the committed changes.