Depends on D136922
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Time | Test | |
---|---|---|
60,060 ms | x64 debian > MLIR.Examples/standalone::test.toy |
Event Timeline
llvm/lib/Target/AMDGPU/AMDGPUCombine.td | ||
---|---|---|
48–52 ↗ | (On Diff #467049) | This is generic |
llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp | ||
162–164 ↗ | (On Diff #467049) | The DAG treats this as an initial canonicalization, so the obvious codegen benefit isn't so important |
166–170 ↗ | (On Diff #467049) | Don't see why this would restrict the vector type |
llvm/lib/Target/AMDGPU/AMDGPUCombine.td | ||
---|---|---|
48–52 ↗ | (On Diff #467049) | Do you mean it should go in the generic combiner? If we remove if (!MI.getMF()->getSubtarget<GCNSubtarget>().hasScalarPackInsts()) return false; // TODO: Only on small vectors? LLT VecTy = MRI.getType(MI.getOperand(0).getReg()); if (VecTy.getElementType() != LLT::scalar(16) || (VecTy.getSizeInBits() % 32) != 0) return false; I would leave it in the AMDGPUCombiner, if we want to make it generic, I would at least add some safeguard so it doesn't turn every INSERT_VECTOR_ELT into a shuffle - maybe only do it for 2-elt vectors? |
llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp | ||
166–170 ↗ | (On Diff #467049) | Shouldn't this just be on 2-element vectors? |
What is the motivation for this?
Can you add a comment somewhere explaining what the combine does?
Relax some restrictions on the combine and add comment to describe why the current restrictions are in place
We want to use SHUFFLE_VECTOR (which is always lowered during legalization anyway) as the canonical form for this kind of operation (INSERT_VECTOR_ELT w/ a constant index on small vectors). It benefits mad_mix codegen.
llvm/lib/Target/AMDGPU/AMDGPUPreLegalizerCombiner.cpp | ||
---|---|---|
162 ↗ | (On Diff #468794) | "G_SHUFFLE_VECTOR" |
llvm/lib/Target/AMDGPU/AMDGPUCombine.td | ||
---|---|---|
48–52 ↗ | (On Diff #467049) | Yes. This is a generic combine as it is. What the target directly wants isn't necessarily the point. A larger shuffle should be legalizable to what the target does want, and is a better canonical form |
48–52 ↗ | (On Diff #467049) | By as-is I mean DAGCombiner |
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
2688–2694 | There are cases where insert_vector_elts combine to form shuffles but you don't seem to be handling those. This looks like you're just handling basic cases that can use build_vector (which is already implemented in matchCombineInsertVecElts). I'm not following what the shuffles are adding here | |
2697 | I'm not really sure why this eraseInst helper exists | |
llvm/test/CodeGen/AMDGPU/GlobalISel/combine-insertvecelt-to-shufflevector.mir | ||
3 | Don't need -global-isel with -run-pass |
Comments + rebase
llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp | ||
---|---|---|
2688–2694 | I thought we ultimately wanted insert_vector_elt to be lowered to shuffle_vector, as the latter is easier to handle (and is needed for mad_mix selection)? Is that not the case? In any case, I'm not sure I understand the issue:
Note that, IIRC, matchCombineInsertVecElts only handles chains of insert_vector_elt. If there's a single one, it doesn't touch it. This combine is targeted towards single insert_vector_elts. | |
2697 | Not sure either, it's a Combiner helper. I thought it was doing some other things like notifying the observer but it really just calls MI.eraseFromParent(). |
Not sure yet this is the right thing to do, I can resurrect the diff later if we still want to do it.
There are cases where insert_vector_elts combine to form shuffles but you don't seem to be handling those. This looks like you're just handling basic cases that can use build_vector (which is already implemented in matchCombineInsertVecElts). I'm not following what the shuffles are adding here