Introduce g_maybe_cross_reg_bank_copy, operator that can be used in patterns to ignore cross register bank copies.
Operator will be recognized by global-isel emitter and ignored by sdag emitter (sdag uses types not register banks).
Generates GIM_CheckOpcodeIgnoreCopies instead of GIM_CheckOpcode.
Does not affect complexity of the pattern.
Details
Diff Detail
Event Timeline
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
4535 ↗ | (On Diff #286821) | Forcing a function to be shared that refers to an SDNode is also pretty awkward |
4538 ↗ | (On Diff #286821) | I was trying to avoid doing things this way. This is selecting instructions incorrectly, and then relying on something else to fix them up. I think we should select trivially correct instructions, and a follow on pass could do a better job figuring out the optimal SGPR operands. This is why RegBankSelect is forcing everything to VGPRs for VALU instructions. The later pass will have an easier time if everything is expected to consistently be in VGPRs, rather than having to figure out if some operands are already scalar and moving them back to VGPRs if it decided another copy was better |
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
4538 ↗ | (On Diff #286821) | We're also incorrectly letting some constant bus violations through with ThreeOp_i32_Pats, which do need a complex predicate to avoid the constant bus violations. Technically the later folding pass could form these cases, but it would be nice to keep these in patterns. I haven't thought of a good solution for this yet |
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
4538 ↗ | (On Diff #286821) | How do we "select trivially correct instructions" if regbankselect has already run, and it has no foreknowledge of what complex patterns we might want to match as part of instruction selection? Some of the current failures to select v_lshl_add are because the inputs to the G_SHL are uniformn (or constant) and regbankselect doesn't know that we might want to combine it into a G_ADD whose other operand is divergent. |
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
4538 ↗ | (On Diff #286821) | It's trivial if you look at a single instruction at a time. The constant bus violations only start showing up when you start looking through copies when pattern matching multiple input operations. The patterns have to make sure they don't violate the constant bus restriction. As I mentioned, ThreeOp_i32_Pats is broken and really should fail to import. ThreeOpFrag needs a non-trivial predicate that uses PredicateCodeUsesOperands, but GlobalISelEmitter doesn't implement it (or even check if it's in use). |
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp | ||
---|---|---|
4538 ↗ | (On Diff #286821) |
|
4538 ↗ | (On Diff #286821) | OK, makes sense. We can look at implementing PredicateCodeUsesOperands for GlobalISel in tablegen. |
Attempt to teach emitter to collect operands that predicate with 'let PredicateCodeUsesOperands = 1' requires.
I think copies that can be ignored should be taken care of by a combiner pass, and the selector should not have to do this
Can you explain how you think this should work for selecting e.g. v_add_lshl where some of the operands can be sgprs? A combiner pass can't remove cross-register-bank copies, can it?
A combiner could, but under the current scheme it should not. I want to minimize the amount of places/code that need to worry about the constant bus restriction. The pattern for these instructions should search through the copies if it can. We may need a new tablegen operator or something to look through copies
llvm/include/llvm/Target/TargetSelectionDAG.td | ||
---|---|---|
313 ↗ | (On Diff #292800) | I think the name needs work. I don't think it should have a g_ prefix. Maybe ignore_copies? |
@Petar.Avramovic this could be relevant for https://discourse.llvm.org/t/cleanly-addressing-globalisel-dag-pattern-matching-differences/64817/14, do you think you have time to resume work on this patch & get it landed, or may I take over it?
Thank you
You can take over. I am not so sure this will be useful as is. Just looking through copies will break constant bus limit. Then there are constants and their register banks.
I didn't like g_maybe_cross_reg_bank_copy it would be better to have some field in PatFrag recognized by the global-isel emitter (it would generate GIM_CheckOpcodeIgnoreCopies instead of GIM_CheckOpcode for all opcode checks in pattern).
But that can break constant bus limit. There is a mechanism to collect operands with
PredicateCodeUsesOperands = 1
Maybe you could add another field in PatFrag that will be a signal to generate jump-table entries that will go through collected operands and check constant bus limit for collected operands.
Just to clarify by have some field in PatFrag recognized by the global-isel emitter I meant something like HasNoUse but sdag emitter will ignore it.
Can't this still be inside the class? I.e. move it down two or three lines?