This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Add tablegen operator that looks through copies
AbandonedPublic

Authored by Petar.Avramovic on Aug 20 2020, 8:12 AM.

Details

Summary

Introduce g_maybe_cross_reg_bank_copy, operator that can be used in patterns to ignore cross register bank copies.
Operator will be recognized by global-isel emitter and ignored by sdag emitter (sdag uses types not register banks).
Generates GIM_CheckOpcodeIgnoreCopies instead of GIM_CheckOpcode.
Does not affect complexity of the pattern.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptAug 20 2020, 8:12 AM
Petar.Avramovic requested review of this revision.Aug 20 2020, 8:12 AM
foad added inline comments.Aug 20 2020, 8:33 AM
llvm/include/llvm/CodeGen/GlobalISel/InstructionSelector.h
396 ↗(On Diff #286821)

Can't this still be inside the class? I.e. move it down two or three lines?

llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
2855–2856 ↗(On Diff #286821)

Why not call AdjustInstrPostInstrSelectionBase on each MI in OutMIs?

arsenm added inline comments.Aug 20 2020, 9:17 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4535 ↗(On Diff #286821)

Forcing a function to be shared that refers to an SDNode is also pretty awkward

4538 ↗(On Diff #286821)

I was trying to avoid doing things this way. This is selecting instructions incorrectly, and then relying on something else to fix them up. I think we should select trivially correct instructions, and a follow on pass could do a better job figuring out the optimal SGPR operands. This is why RegBankSelect is forcing everything to VGPRs for VALU instructions. The later pass will have an easier time if everything is expected to consistently be in VGPRs, rather than having to figure out if some operands are already scalar and moving them back to VGPRs if it decided another copy was better

arsenm added inline comments.Aug 20 2020, 9:23 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4538 ↗(On Diff #286821)

We're also incorrectly letting some constant bus violations through with ThreeOp_i32_Pats, which do need a complex predicate to avoid the constant bus violations. Technically the later folding pass could form these cases, but it would be nice to keep these in patterns. I haven't thought of a good solution for this yet

foad added inline comments.Aug 20 2020, 9:55 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4538 ↗(On Diff #286821)

How do we "select trivially correct instructions" if regbankselect has already run, and it has no foreknowledge of what complex patterns we might want to match as part of instruction selection? Some of the current failures to select v_lshl_add are because the inputs to the G_SHL are uniformn (or constant) and regbankselect doesn't know that we might want to combine it into a G_ADD whose other operand is divergent.

arsenm added inline comments.Aug 20 2020, 10:01 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4538 ↗(On Diff #286821)

It's trivial if you look at a single instruction at a time. The constant bus violations only start showing up when you start looking through copies when pattern matching multiple input operations.

The patterns have to make sure they don't violate the constant bus restriction. As I mentioned, ThreeOp_i32_Pats is broken and really should fail to import. ThreeOpFrag needs a non-trivial predicate that uses PredicateCodeUsesOperands, but GlobalISelEmitter doesn't implement it (or even check if it's in use).

foad added inline comments.Aug 21 2020, 1:53 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
4538 ↗(On Diff #286821)

It's trivial if you look at a single instruction at a time. The constant bus violations only start showing up when you start looking through copies when pattern matching multiple input operations.

The patterns have to make sure they don't violate the constant bus restriction. As I mentioned, ThreeOp_i32_Pats is broken and really should fail to import. ThreeOpFrag needs a non-trivial predicate that uses PredicateCodeUsesOperands, but GlobalISelEmitter doesn't implement it (or even check if it's in use).

4538 ↗(On Diff #286821)

OK, makes sense. We can look at implementing PredicateCodeUsesOperands for GlobalISel in tablegen.

Attempt to teach emitter to collect operands that predicate with 'let PredicateCodeUsesOperands = 1' requires.

Petar.Avramovic retitled this revision from [RFC] AMDGPU/GlobalISel: Look through copies in GIM_CheckOpcode and add post-isel hook to [RFC] AMDGPU/GlobalISel: Look through copies in GIM_CheckOpcode.
Petar.Avramovic edited the summary of this revision. (Show Details)

Move support for predicate code that uses operands to a separate patch.

I think copies that can be ignored should be taken care of by a combiner pass, and the selector should not have to do this

foad added a comment.Sep 9 2020, 2:01 AM

I think copies that can be ignored should be taken care of by a combiner pass, and the selector should not have to do this

Can you explain how you think this should work for selecting e.g. v_add_lshl where some of the operands can be sgprs? A combiner pass can't remove cross-register-bank copies, can it?

arsenm added a comment.Sep 9 2020, 7:51 AM

I think copies that can be ignored should be taken care of by a combiner pass, and the selector should not have to do this

Can you explain how you think this should work for selecting e.g. v_add_lshl where some of the operands can be sgprs? A combiner pass can't remove cross-register-bank copies, can it?

A combiner could, but under the current scheme it should not. I want to minimize the amount of places/code that need to worry about the constant bus restriction. The pattern for these instructions should search through the copies if it can. We may need a new tablegen operator or something to look through copies

Petar.Avramovic retitled this revision from [RFC] AMDGPU/GlobalISel: Look through copies in GIM_CheckOpcode to AMDGPU/GlobalISel: Add tablegen operator that looks through copies.
Petar.Avramovic edited the summary of this revision. (Show Details)

Allow patterns to explicitly ask to look through copies.

arsenm added inline comments.
llvm/include/llvm/Target/TargetSelectionDAG.td
313 ↗(On Diff #292800)

I think the name needs work. I don't think it should have a g_ prefix. Maybe ignore_copies?

Herald added a project: Restricted Project. · View Herald TranscriptSep 28 2022, 3:07 PM
Herald added a subscriber: kosarev. · View Herald Transcript

@Petar.Avramovic this could be relevant for https://discourse.llvm.org/t/cleanly-addressing-globalisel-dag-pattern-matching-differences/64817/14, do you think you have time to resume work on this patch & get it landed, or may I take over it?
Thank you

You can take over. I am not so sure this will be useful as is. Just looking through copies will break constant bus limit. Then there are constants and their register banks.
I didn't like g_maybe_cross_reg_bank_copy it would be better to have some field in PatFrag recognized by the global-isel emitter (it would generate GIM_CheckOpcodeIgnoreCopies instead of GIM_CheckOpcode for all opcode checks in pattern).
But that can break constant bus limit. There is a mechanism to collect operands with
PredicateCodeUsesOperands = 1
Maybe you could add another field in PatFrag that will be a signal to generate jump-table entries that will go through collected operands and check constant bus limit for collected operands.
Just to clarify by have some field in PatFrag recognized by the global-isel emitter I meant something like HasNoUse but sdag emitter will ignore it.

D136234 landed and is an alternative implementation of this

Pierre-vh resigned from this revision.Feb 21 2023, 3:24 AM
Petar.Avramovic abandoned this revision.Feb 21 2023, 5:43 AM