This is an archive of the discontinued LLVM Phabricator instance.

[GISel] Combine G_INSERT_VECTOR_ELT to G_SHUFFLE_VECTOR
AbandonedPublic

Authored by Pierre-vh on Oct 4 2022, 4:54 AM.

Details

Reviewers
arsenm
Summary

Depends on D136922

Diff Detail

Event Timeline

Pierre-vh created this revision.Oct 4 2022, 4:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2022, 4:54 AM
Pierre-vh requested review of this revision.Oct 4 2022, 4:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 4 2022, 4:54 AM
arsenm added inline comments.Oct 12 2022, 8:28 AM
llvm/lib/Target/AMDGPU/AMDGPUCombine.td
48–52 ↗(On Diff #467049)

This is generic

llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp
162–164 ↗(On Diff #467049)

The DAG treats this as an initial canonicalization, so the obvious codegen benefit isn't so important

166–170 ↗(On Diff #467049)

Don't see why this would restrict the vector type

Pierre-vh added inline comments.Oct 13 2022, 12:46 AM
llvm/lib/Target/AMDGPU/AMDGPUCombine.td
48–52 ↗(On Diff #467049)

Do you mean it should go in the generic combiner?
I'm worried that if we put it there, and remove the filtering on small vectors/hasScalarPackInsts that all INSERT_VECTOR_ELT instructions will become SHUFFLE_VECTOR and that some targets won't like it?

If we remove

if (!MI.getMF()->getSubtarget<GCNSubtarget>().hasScalarPackInsts())
  return false;

// TODO: Only on small vectors?
LLT VecTy = MRI.getType(MI.getOperand(0).getReg());
if (VecTy.getElementType() != LLT::scalar(16) ||
    (VecTy.getSizeInBits() % 32) != 0)
  return false;

I would leave it in the AMDGPUCombiner, if we want to make it generic, I would at least add some safeguard so it doesn't turn every INSERT_VECTOR_ELT into a shuffle - maybe only do it for 2-elt vectors?
That or just don't add it to "all_combines" - we put it in the generic helper but it's opt-in and targets have too add the combine to their pipeline.

llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp
166–170 ↗(On Diff #467049)

Shouldn't this just be on 2-element vectors?

nhaehnle removed a subscriber: nhaehnle.Oct 13 2022, 1:29 AM
foad added a comment.Oct 17 2022, 3:20 AM

What is the motivation for this?

Can you add a comment somewhere explaining what the combine does?

Pierre-vh marked an inline comment as done.

Relax some restrictions on the combine and add comment to describe why the current restrictions are in place

What is the motivation for this?

Can you add a comment somewhere explaining what the combine does?

We want to use SHUFFLE_VECTOR (which is always lowered during legalization anyway) as the canonical form for this kind of operation (INSERT_VECTOR_ELT w/ a constant index on small vectors). It benefits mad_mix codegen.

foad added inline comments.Oct 24 2022, 7:29 AM
llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp
162 ↗(On Diff #468794)

"G_SHUFFLE_VECTOR"

Pierre-vh updated this revision to Diff 470445.Oct 25 2022, 4:30 AM
Pierre-vh marked an inline comment as done.

Comment
@arsenm please review so D134354 can land?

arsenm added inline comments.Oct 27 2022, 8:12 AM
llvm/lib/Target/AMDGPU/AMDGPUCombine.td
48–52 ↗(On Diff #467049)

Yes. This is a generic combine as it is. What the target directly wants isn't necessarily the point. A larger shuffle should be legalizable to what the target does want, and is a better canonical form

48–52 ↗(On Diff #467049)

By as-is I mean DAGCombiner

Pierre-vh updated this revision to Diff 471430.Oct 28 2022, 1:16 AM
Pierre-vh marked 3 inline comments as done.

Rebase on D136922, make combine generic

Pierre-vh retitled this revision from [AMDGPU][GISel] Combine G_INSERT_VECTOR_ELT to G_SHUFFLE_VECTOR to [GISel] Combine G_INSERT_VECTOR_ELT to G_SHUFFLE_VECTOR.Oct 28 2022, 1:17 AM
Pierre-vh edited the summary of this revision. (Show Details)
arsenm added inline comments.Nov 1 2022, 2:36 PM
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
2688–2694

There are cases where insert_vector_elts combine to form shuffles but you don't seem to be handling those. This looks like you're just handling basic cases that can use build_vector (which is already implemented in matchCombineInsertVecElts). I'm not following what the shuffles are adding here

2697

I'm not really sure why this eraseInst helper exists

llvm/test/CodeGen/AMDGPU/GlobalISel/combine-insertvecelt-to-shufflevector.mir
2

Don't need -global-isel with -run-pass

Pierre-vh updated this revision to Diff 472543.Nov 2 2022, 2:20 AM
Pierre-vh marked 2 inline comments as done.

Comments + rebase

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp
2688–2694

I thought we ultimately wanted insert_vector_elt to be lowered to shuffle_vector, as the latter is easier to handle (and is needed for mad_mix selection)? Is that not the case?
It the reason why we're lowering shuffle vectors now, no?

In any case, I'm not sure I understand the issue:

  • Is it that the combine is unnecessary? (Then why are working towards this? Why did we lower shuffle vector in the Legalizer?)
  • Is it that the combine as-is is fine, but should be handling more (like chained insert_vector_elt) ? But then, what makes it different from matchCombineInsertVecElts?

Note that, IIRC, matchCombineInsertVecElts only handles chains of insert_vector_elt. If there's a single one, it doesn't touch it. This combine is targeted towards single insert_vector_elts.

2697

Not sure either, it's a Combiner helper. I thought it was doing some other things like notifying the observer but it really just calls MI.eraseFromParent().
I've removed this use and will propose a patch to remove it entirely.

Pierre-vh abandoned this revision.Nov 6 2022, 11:37 PM

Not sure yet this is the right thing to do, I can resurrect the diff later if we still want to do it.