
[PowerPC] Update handling of splat loads for v4i32/v4f32/v2i64 to require non-extending loads.

Authored by amyk on Jan 20 2022, 9:45 AM.



This patch updates how splat loads are handled and is an extension of D106555.

In particular, splat loads for the v2i64/v4f32/v4i32 types now handle only non-extending loads.
Splat loads for the v8i16/v16i8 types now handle extending loads only if the memory VT is
the same as the vector element VT.
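
The rule above can be sketched as a small standalone model. This is not the code from the patch: `SimpleVT`, `ExtKind`, `LoadInfo`, and `isValidSplatLoadModel` are hypothetical names invented for illustration, and the model deliberately ignores subtarget checks and everything else the real lowering code does.

```cpp
// Hypothetical stand-ins for LLVM's value types; NOT the real MVT/EVT API.
enum class SimpleVT { i8, i16, i32, i64, f32 };

// Whether the loaded value is widened after being read from memory.
enum class ExtKind { NonExtending, Extending };

// Hypothetical summary of the scalar load that feeds the splat.
struct LoadInfo {
  SimpleVT MemoryVT; // type actually read from memory
  ExtKind Ext;       // extension kind of the load
};

// Models the rule from the description for a vector whose element type is
// ElemVT. Returns true if the load may be turned into a splat load.
bool isValidSplatLoadModel(SimpleVT ElemVT, const LoadInfo &LD) {
  switch (ElemVT) {
  case SimpleVT::i32: // v4i32
  case SimpleVT::f32: // v4f32
  case SimpleVT::i64: // v2i64
    // Only non-extending loads are handled for these element types.
    return LD.Ext == ExtKind::NonExtending;
  case SimpleVT::i8:  // v16i8
  case SimpleVT::i16: // v8i16
    // Extending loads are handled only when the memory VT matches the
    // element VT. Assumption of this model: a non-extending scalar i8/i16
    // load is not considered, since the review notes that such loads are
    // extending because scalar i8/i16 are not legal types.
    return LD.Ext == ExtKind::Extending && LD.MemoryVT == ElemVT;
  }
  return false;
}
```

Under this model, an i64 element fed by a load that reads an i32 from memory and extends it is rejected, while an i8 element fed by an extending load whose memory VT is i8 is accepted.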

A test case has been added to illustrate a scenario where a PPCISD::LD_SPLAT node should not
be produced. The test depicts the following extending load (t37, f32 extended to f64) used in a
v2f64 build vector; the extending load also has uses outside the build vector (in t12 and t16).

Type-legalized selection DAG: %bb.0 'test:entry'
SelectionDAG has 20 nodes:
  t0: ch = EntryToken
  t4: i64,ch = CopyFromReg t0, Register:i64 %1
  t6: i64,ch = CopyFromReg t0, Register:i64 %2
  t11: f64,ch = load<(load (s64) from %ir.b, !tbaa !7)> t0, t4, undef:i64
        t16: f64 = fadd t31, t37
      t34: ch = store<(store (s64) into %ir.c, !tbaa !7)> t31:1, t16, t6, undef:i64
    t36: ch = TokenFactor t34, t37:1
    t27: v2f64 = BUILD_VECTOR t37, t37
  t22: ch,glue = CopyToReg t36, Register:v2f64 $v2, t27
      t12: f64 = fadd t11, t37
    t28: ch = store<(store (s64) into %ir.b, !tbaa !7)> t11:1, t12, t4, undef:i64
  t31: f64,ch = load<(load (s64) from %ir.c, !tbaa !7)> t28, t6, undef:i64
    t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t37: f64,ch = load<(load (s32) from %ir.a, !tbaa !3), anyext from f32> t0, t2, undef:i64
  t23: ch = PPCISD::RET_FLAG t22, Register:v2f64 $v2, t22:1

Event Timeline

amyk created this revision. Jan 20 2022, 9:45 AM
amyk requested review of this revision. Jan 20 2022, 9:45 AM
nemanjai added inline comments. Jan 21 2022, 5:21 AM

Can you move this up and change it to dyn_cast? Then we should be able to simplify the early exit to something like:

if (!LN || !Subtarget.hasVSX() || !ISD::isUNINDEXEDLoad(LN))
  return false;

Please add a note that the load will be an extending load because scalar i8/i16 are not legal types.

amyk updated this revision to Diff 402066. Jan 21 2022, 12:13 PM

Address Nemanja's review comments:

  • Add a comment as to why extending loads are allowed for i8/i16 types (because they're not legal types)
  • Change the cast to LoadSDNode to dyn_cast, add early exits
amyk marked 2 inline comments as done. Jan 21 2022, 12:13 PM
nemanjai accepted this revision. Jan 25 2022, 2:10 AM

LGTM other than a small nit.


The word "width" is out of place here. We are not checking the width but the actual type, so you can simply remove that word.

This revision is now accepted and ready to land. Jan 25 2022, 2:10 AM