Depends on D136922
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
| Time | Test | |
|---|---|---|
| 60,060 ms | x64 debian > MLIR.Examples/standalone::test.toy |
Event Timeline
| llvm/lib/Target/AMDGPU/AMDGPUCombine.td | ||
|---|---|---|
| 48–52 ↗ | (On Diff #467049) | This is generic |
| llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp | ||
| 162–164 ↗ | (On Diff #467049) | The DAG treats this as an initial canonicalization, so the obvious codegen benefit isn't so important |
| 166–170 ↗ | (On Diff #467049) | Don't see why this would restrict the vector type |
| llvm/lib/Target/AMDGPU/AMDGPUCombine.td | ||
|---|---|---|
| 48–52 ↗ | (On Diff #467049) | Do you mean it should go in the generic combiner? If we remove if (!MI.getMF()->getSubtarget<GCNSubtarget>().hasScalarPackInsts())
return false;
// TODO: Only on small vectors?
LLT VecTy = MRI.getType(MI.getOperand(0).getReg());
if (VecTy.getElementType() != LLT::scalar(16) ||
(VecTy.getSizeInBits() % 32) != 0)
return false;I would leave it in the AMDGPUCombiner, if we want to make it generic, I would at least add some safeguard so it doesn't turn every INSERT_VECTOR_ELT into a shuffle - maybe only do it for 2-elt vectors? |
| llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp | ||
| 166–170 ↗ | (On Diff #467049) | Shouldn't this just be on 2-element vectors? |
What is the motivation for this?
Can you add a comment somewhere explaining what the combine does?
Relax some restrictions on the combine and add comment to describe why the current restrictions are in place
We want to use SHUFFLE_VECTOR (which is always lowered during legalization anyway) as the canonical form for this kind of operation (INSERT_VECTOR_ELT w/ a constant index on small vectors). It benefits mad_mix codegen.
| llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp | ||
|---|---|---|
| 162 ↗ | (On Diff #468794) | "G_SHUFFLE_VECTOR" |
| llvm/lib/Target/AMDGPU/AMDGPUCombine.td | ||
|---|---|---|
| 48–52 ↗ | (On Diff #467049) | Yes. This is a generic combine as it is. What the target directly wants isn't necessarily the point. A larger shuffle should be legalizable to what the target does want, and is a better canonical form |
| 48–52 ↗ | (On Diff #467049) | By as-is I mean DAGCombiner |
| llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
|---|---|---|
| 2688–2694 | There are cases where insert_vector_elts combine to form shuffles but you don't seem to be handling those. This looks like you're just handling basic cases that can use build_vector (which is already implemented in matchCombineInsertVecElts). I'm not following what the shuffles are adding here | |
| 2697 | I'm not really sure why this eraseInst helper exists | |
| llvm/test/CodeGen/AMDGPU/GlobalISel/combine-insertvecelt-to-shufflevector.mir | ||
| 3 | Don't need -global-isel with -run-pass | |
Comments + rebase
| llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
|---|---|---|
| 2688–2694 | I thought we ultimately wanted insert_vector_elt to be lowered to shuffle_vector, as the latter is easier to handle (and is needed for mad_mix selection)? Is that not the case? In any case, I'm not sure I understand the issue:
Note that, IIRC, matchCombineInsertVecElts only handles chains of insert_vector_elt. If there's a single one, it doesn't touch it. This combine is targeted towards single insert_vector_elts. | |
| 2697 | Not sure either, it's a Combiner helper. I thought it was doing some other things like notifying the observer but it really just calls MI.eraseFromParent(). | |
Not sure yet this is the right thing to do, I can resurrect the diff later if we still want to do it.
There are cases where insert_vector_elts combine to form shuffles but you don't seem to be handling those. This looks like you're just handling basic cases that can use build_vector (which is already implemented in matchCombineInsertVecElts). I'm not following what the shuffles are adding here