This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Translate shufflevector
ClosedPublic

Authored by volkan on Mar 14 2017, 5:17 PM.

Diff Detail

Event Timeline

volkan created this revision.Mar 14 2017, 5:17 PM
javed.absar edited edge metadata.Mar 16 2017, 2:42 PM

Thanks Volkan.
Adding a zip to the test above might make the testing stronger, i.e something like shufflevector <8 x i8> %tmp1, <8 x i8> %tmp2, <16 x i32> <i32 0, i32 8, i32 1, i32 9, i32 2, i32 10, i32 3, i32 11, i32 4, i32 12, i32 5, i32 13, i32 6, i32 14, i32 7, i32 15>

Also, would you consider adding a test for arm (32-bit) as well (need not be a big one).
Best Regards.

volkan updated this revision to Diff 92187.Mar 17 2017, 1:32 PM

Added more tests.

javed.absar accepted this revision.Mar 18 2017, 9:09 AM

Thanks Volkan. LGTM.

This revision is now accepted and ready to land.Mar 18 2017, 9:09 AM
ab edited edge metadata.Mar 19 2017, 9:07 AM

We don't need to do this now, but should we encode the mask in the G_SHUFFLE_VECTOR itself? Either as multiple Imm MOs, or a single new 'VectorMask' operand of some sort?

test/CodeGen/AArch64/GlobalISel/arm64-irtranslator.ll
1423

Should we turn this into an extractelt in the IRTranslator? Seems like having vector ops 0,1,2 is a nice invariant. (which would be nice to check in the verifier, or even in the MIR parser, if we ever add type constraints in GenericOpcodes.td)

dsanders edited edge metadata.Mar 20 2017, 6:26 AM
In D30962#704916, @ab wrote:

We don't need to do this now, but should we encode the mask in the G_SHUFFLE_VECTOR itself? Either as multiple Imm MOs, or a single new 'VectorMask' operand of some sort?

I think we should probably stick with a single operand to represent the mask since some targets can use vector registers to specify the shuffle mask (e.g. VSHF.df on Mips), but I do agree that a single operand containing a whole constant mask would be nicer to match than the current G_MERGE_VALUES of G_CONSTANTS since there's lots of ways of writing the same constant right now. For example:

(G_MERGE_VALUES (G_CONSTANT i32 0), (G_CONSTANT i32 1), (G_CONSTANT i32 2), (G_CONSTANT i32 3))

and:

(G_MERGE_VALUES (G_MERGE_VALUES (G_CONSTANT i32 0), (G_CONSTANT i32 1)), (G_MERGE_VALUES (G_CONSTANT i32 2), (G_CONSTANT i32 3)))

are the same. The snag is that we would need to limit any flattening to the supportable masks (or be able to revert the transformation).

While I'm thinking about legalizing shuffles, it would be nice in some ways if LegalizeAction were an object so we could write something like:

extern bool isReversedElements(const MachineOperand &);
extern bool isInterleavedElements(const MachineOperand &);

for (const auto &Ty : {v2s64, v4s32, v8s16, v16s8}) {
  setAction({G_SHUFFLE_VECTOR, Ty}, ContentSensitiveLegalizer(/* Default case */ Lower));
  auto &Legalizer = getAction({G_SHUFFLE_VECTOR, Ty});
  if (hasFoo()) {
    Legalizer.addCase(isReversedElements(), Legal);
    Legalizer.addCase(isInterleavedElements(), Legal);
  }
  if (hasBar()) {
    Legalizer.addCase([](const MachineOperand &Op) { ... each 4-elements sub-sequence only shuffles within itself ... }, Legal);
    // Register-based shuffles handle everything.
    Legalizer.setDefaultCase(Legal);
  }
}

We can achieve the same effect with Custom but it seems nice to keep the high-level conditions together in AArch64LegalizerInfo::AArch64LegalizerInfo() instead of distributing them across various callbacks and re-checking them for every MachineInstr.

In D30962#704916, @ab wrote:

We don't need to do this now, but should we encode the mask in the G_SHUFFLE_VECTOR itself? Either as multiple Imm MOs, or a single new 'VectorMask' operand of some sort?

I think we should probably stick with a single operand to represent the mask since some targets can use vector registers to specify the shuffle mask (e.g. VSHF.df on Mips), but I do agree that a single operand containing a whole constant mask would be nicer to match than the current G_MERGE_VALUES of G_CONSTANTS since there's lots of ways of writing the same constant right now. For example:

I agree, a single operand would be better. In this way, we can simplify constant vectors as well and this problem would be fixed automatically as the mask is a constant vector.

(G_MERGE_VALUES (G_CONSTANT i32 0), (G_CONSTANT i32 1), (G_CONSTANT i32 2), (G_CONSTANT i32 3))

and:

(G_MERGE_VALUES (G_MERGE_VALUES (G_CONSTANT i32 0), (G_CONSTANT i32 1)), (G_MERGE_VALUES (G_CONSTANT i32 2), (G_CONSTANT i32 3)))

are the same. The snag is that we would need to limit any flattening to the supportable masks (or be able to revert the transformation).

While I'm thinking about legalizing shuffles, it would be nice in some ways if LegalizeAction were an object so we could write something like:

extern bool isReversedElements(const MachineOperand &);
extern bool isInterleavedElements(const MachineOperand &);

for (const auto &Ty : {v2s64, v4s32, v8s16, v16s8}) {
  setAction({G_SHUFFLE_VECTOR, Ty}, ContentSensitiveLegalizer(/* Default case */ Lower));
  auto &Legalizer = getAction({G_SHUFFLE_VECTOR, Ty});
  if (hasFoo()) {
    Legalizer.addCase(isReversedElements(), Legal);
    Legalizer.addCase(isInterleavedElements(), Legal);
  }
  if (hasBar()) {
    Legalizer.addCase([](const MachineOperand &Op) { ... each 4-elements sub-sequence only shuffles within itself ... }, Legal);
    // Register-based shuffles handle everything.
    Legalizer.setDefaultCase(Legal);
  }
}

We can achieve the same effect with Custom but it seems nice to keep the high-level conditions together in AArch64LegalizerInfo::AArch64LegalizerInfo() instead of distributing them across various callbacks and re-checking them for every MachineInstr.

volkan closed this revision.Mar 21 2017, 1:56 AM