
[PowerPC] Legalize v256i1 and v512i1 and implement load and store of these types

Authored by bsaleil on Jul 30 2020, 11:50 AM.



This patch legalizes the v256i1 and v512i1 types that will be used for MMA.

It implements loads and stores of these types.
v256i1 is a pair of VSX registers, so for this type, we load/store the two underlying registers.
v512i1 is used for MMA accumulators. So in addition to loading and storing the 4 associated VSX registers, we generate instructions to prime (copy the VSX registers to the accumulator) after loading and unprime (copy the accumulator back to the VSX registers) before storing.
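
The ordering described above can be sketched with a small toy model. This is not the LLVM lowering code from the patch: the function names `lowerLoad` and `lowerStore`, the textual operation names, and the register-to-offset mapping are all assumptions made for illustration. It only shows that a v512i1 load ends with a prime step, a v512i1 store begins with an unprime step, and v256i1 needs neither.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model (hypothetical, not LLVM code): list the conceptual operations
// needed to load a v256i1 or v512i1 value located at address "base".
std::vector<std::string> lowerLoad(unsigned bits) {
  assert((bits == 256 || bits == 512) && "only v256i1 and v512i1 are modeled");
  std::vector<std::string> ops;
  const unsigned numRegs = bits / 128; // each VSX register holds 128 bits
  // Load every underlying VSX register. The offset order is an assumption;
  // the real mapping depends on endianness.
  for (unsigned i = 0; i < numRegs; ++i)
    ops.push_back("load vsx" + std::to_string(i) + ", base+" +
                  std::to_string(16 * i));
  // Accumulators must be primed (VSX registers copied in) after loading.
  if (bits == 512)
    ops.push_back("prime acc");
  return ops;
}

// Toy model (hypothetical, not LLVM code): the matching store sequence.
std::vector<std::string> lowerStore(unsigned bits) {
  assert((bits == 256 || bits == 512) && "only v256i1 and v512i1 are modeled");
  std::vector<std::string> ops;
  const unsigned numRegs = bits / 128;
  // Accumulators must be unprimed (copied back to VSX registers) before
  // the underlying registers are stored.
  if (bits == 512)
    ops.push_back("unprime acc");
  for (unsigned i = 0; i < numRegs; ++i)
    ops.push_back("store vsx" + std::to_string(i) + ", base+" +
                  std::to_string(16 * i));
  return ops;
}
```

In this model, `lowerLoad(512)` yields four register loads followed by one prime step, and `lowerStore(256)` yields just two register stores.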

This patch also adds the UACC register class that is necessary to implement the loads and stores. This class represents accumulators in their unprimed form and allows primed and unprimed accumulators to be distinguished, which avoids invalid copies of the VSX registers associated with primed accumulators.

Depends on D84847

Event Timeline

bsaleil created this revision.Jul 30 2020, 11:50 AM
bsaleil requested review of this revision.Jul 30 2020, 11:50 AM
bsaleil updated this revision to Diff 290804.Sep 9 2020, 12:47 PM

Rebase so the patch can be applied on top of master. Also change the datalayout string on all ppc64 platforms to improve compatibility between object files.

bsaleil edited the summary of this revision.Sep 9 2020, 12:48 PM
lei accepted this revision.Sep 18 2020, 6:26 AM

Just some minor comments. Please address them prior to commit.


Maybe we can do an early exit instead of this if stmt here

if (VT != MVT::v256i1 && VT != MVT::v512i1)
  return Op;

assert(Subtarget.pairedVectorMemops() &&
       "Type unsupported without paired vector support");

I believe the check for v256i1 here is redundant, because MMA should also set pairedVectorMemops.


same comment as above... early exit.

This revision is now accepted and ready to land.Sep 18 2020, 6:26 AM
amyk added a subscriber: amyk.Sep 18 2020, 1:58 PM
amyk added inline comments.

Just my opinion but maybe we can put this block under the anonymous patterns.

bsaleil updated this revision to Diff 294776.Sep 28 2020, 12:07 PM

Use early exit in lowering functions and extend test case

This revision was landed with ongoing or failed builds.Sep 28 2020, 12:40 PM
This revision was automatically updated to reflect the committed changes.