This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Add `X,Y<dead> = G_UNMERGE Z` -> X = G_TRUNC Z
ClosedPublic

Authored by qcolombet on Sep 4 2020, 6:22 PM.

Details

Summary

Add a combiner helper that replaces G_UNMERGE where all the destination lanes
are dead except the first one with a G_TRUNC.

Diff Detail

Event Timeline

qcolombet created this revision.Sep 4 2020, 6:22 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 4 2020, 6:22 PM
qcolombet requested review of this revision.Sep 4 2020, 6:22 PM
qcolombet added inline comments.Sep 4 2020, 6:30 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll
164

@arsenm At first glance all the changes in AMDGPU seems fine but this one.

Looking at when the transformation kicks in, the input is:

%16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
%3:_(<3 x s16>), %17:_(<3 x s16>) = G_UNMERGE_VALUES %16:_(<6 x s16>)
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %1:_(s32), %2:_(s32), %0:_(<8 x s32>), 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
S_ENDPGM 0

And the output is:

%16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
%19:_(s96) = G_BITCAST %16:_(<6 x s16>)
%20:_(s48) = G_TRUNC %19:_(s96)
%3:_(<3 x s16>) = G_BITCAST %20:_(s48)
G_INTRINSIC_W_SIDE_EFFECTS intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %1:_(s32), %2:_(s32), %0:_(<8 x s32>), 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
S_ENDPGM 0

So far so good.

Then after the legalizer it is when we have the craziness:

%16:_(<6 x s16>) = G_CONCAT_VECTORS %13:_(<2 x s16>), %14:_(<2 x s16>), %15:_(<2 x s16>)
%19:_(s96) = G_BITCAST %16:_(<6 x s16>)
%28:_(s32), %29:_(s32), %30:_(s32) = G_UNMERGE_VALUES %19:_(s96)
%35:_(s32) = G_CONSTANT i32 16
%36:_(s32) = G_LSHR %28:_, %35:_(s32)
%37:_(s32) = G_LSHR %29:_, %35:_(s32)
%46:_(s32) = G_CONSTANT i32 65535
%49:_(s32) = COPY %28:_(s32)
%40:_(s32) = G_AND %49:_, %46:_
%48:_(s32) = COPY %36:_(s32)
%41:_(s32) = G_AND %48:_, %46:_
%42:_(s32) = G_SHL %41:_, %35:_(s32)
%38:_(s32) = G_OR %40:_, %42:_
%32:_(<2 x s16>) = G_BITCAST %38:_(s32)
%47:_(s32) = COPY %29:_(s32)
%43:_(s32) = G_AND %47:_, %46:_
%44:_(s32) = G_CONSTANT i32 0
%45:_(s32) = G_SHL %44:_, %35:_(s32)
%39:_(s32) = G_OR %43:_, %45:_
%33:_(<2 x s16>) = G_BITCAST %39:_(s32)
%34:_(<6 x s16>) = G_CONCAT_VECTORS %32:_(<2 x s16>), %33:_(<2 x s16>), %15:_(<2 x s16>)
%3:_(<3 x s16>) = G_EXTRACT %34:_(<6 x s16>), 0
%21:_(<2 x s32>) = G_BUILD_VECTOR %1:_(s32), %2:_(s32)
G_AMDGPU_INTRIN_IMAGE_STORE intrinsic(@llvm.amdgcn.image.store.2d), %3:_(<3 x s16>), 7, %21:_(<2 x s32>), $noreg, %0:_(<8 x s32>), 0, 0, 0 :: (dereferenceable store 6 into custom "TargetCustom8", align 8)
S_ENDPGM 0

Do you think the AMDGPU target is missing something or should I disable the combine for vector types, at least for now?

qcolombet added inline comments.Sep 4 2020, 6:35 PM
llvm/test/CodeGen/AMDGPU/GlobalISel/regbankselect-amdgcn.s.buffer.load.ll
34

FYI, this change is just that update_mir now doesn't want to reuse prefixes for RUN lines :(.

foad added a subscriber: foad.Sep 5 2020, 2:50 AM

Typo in summary "except".

qcolombet edited the summary of this revision. (Show Details)Sep 8 2020, 10:58 AM
arsenm added inline comments.Sep 8 2020, 11:51 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-llvm.amdgcn.image.store.2d.d16.ll
164

This is fine. <3 x s16> is problematic and I'm working on eliminating all of them now.

arsenm added inline comments.Sep 8 2020, 11:52 AM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
1683

can use auto and avoid .getReg(0)

  • Use auto for MachineIRBuilder instead of use the register directly
qcolombet marked an inline comment as done.Sep 10 2020, 10:12 AM
arsenm accepted this revision.Sep 10 2020, 4:41 PM
This revision is now accepted and ready to land.Sep 10 2020, 4:41 PM
This revision was landed with ongoing or failed builds.Sep 14 2020, 5:32 PM
This revision was automatically updated to reflect the committed changes.