This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Add v3f16/v3i16 support to SDag
ClosedPublic

Authored by Flakebi on Jul 23 2020, 8:21 AM.

Details

Summary

Fix lowering and instruction selection for v3x16 types
and enable InstCombine to emit them.

This patch only implements it for the selection dag.
GlobalISel tests in GlobalISel/llvm.amdgcn.image.load.1d.d16.ll and GlobalISel/llvm.amdgcn.image.store.2d.d16.ll
still don’t work. Does anyone have hints on where this needs to be implemented for GlobalISel?

Sidenote: I’m on vacation for the next two weeks, so I’ll pick up comments afterwards.
@rdomingu, if you are bored and want to pick this up, feel free.

Diff Detail

Event Timeline

Flakebi created this revision.Jul 23 2020, 8:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 23 2020, 8:21 AM
arsenm added inline comments.Jul 23 2020, 10:22 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
960–961

It looks like you intended this to be an assert?

llvm/lib/Target/AMDGPU/BUFInstructions.td
1217–1227

The boilerplate could be nicer and treated more uniformly. What you're really doing is defining an element vector extload/truncstore on intrinsics.

Can you move this to SIInstrInfo.td and have it follow along with something like truncstore_* or atomic nodes?

Something like:
class mubuf_intrinsic_load_ : PatFrag() ...
def mubuf_intrinsic_load_4 ..
def mubuf_intrinsic_load_6
def mubuf_intrinsic_load_8
...

1827–1830

It shouldn't be necessary to use predicate code. I think you can set MemoryVT which will also work for GlobalSel

Flakebi updated this revision to Diff 285595.Aug 14 2020, 2:14 AM

Address review comments: Move patterns to SIInstrInfo.td and use MemoryVT.

Flakebi added inline comments.Aug 14 2020, 2:20 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
960–961

The check was indeed copied from the assert in SetWidenedVector, thanks for the notice.
For context: We need to make this check finer so that SetWidenedVector does not get called for return values which are not widened (like with tfe).

arsenm added inline comments.Aug 28 2020, 11:49 AM
llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.cpp
960–961

I'm still confused by this change. This is relying on getTypeToTransformTo returning other for other?

llvm/lib/Target/AMDGPU/BUFInstructions.td
845

Can you leave this as v3f16 and swap out the register/pattern type inside MUBUF_Pseudo_Loads?

1703–1705

Is there a better name to use here besides "Inner"?

1766

Having this one use a different multiclass is weird looking. Why can't it directly use the same multiclass as the other cases?

I would expect this to look something like

class MTBUF_LoadIntrinsicPta<SDPatternOperator node, ValueType memvt, ValueType vt = memvt>
and then only override the vt in the weird v3 cases

Flakebi updated this revision to Diff 292187.Sep 16 2020, 4:47 AM

Having this one use a different multiclass is weird looking. Why can't it directly use the same multiclass as the other cases?

I would expect this to look something like

class MTBUF_LoadIntrinsicPta<SDPatternOperator node, ValueType memvt, ValueType vt = memvt>
and then only override the vt in the weird v3 cases

Good idea, looks a lot better now.

Flakebi updated this revision to Diff 292189.Sep 16 2020, 5:05 AM

Improve is-widened check in CustomWidenLowerNode to determine if a value was widened or not.

arsenm accepted this revision.Sep 16 2020, 8:07 AM

LGTM with formatting nit

llvm/lib/Target/AMDGPU/SIInstrInfo.td
586

Move these to previous line

This revision is now accepted and ready to land.Sep 16 2020, 8:07 AM
Flakebi updated this revision to Diff 292229.Sep 16 2020, 8:20 AM

Fix formatting

This revision was landed with ongoing or failed builds.Sep 16 2020, 8:20 AM
This revision was automatically updated to reflect the committed changes.