This is an archive of the discontinued LLVM Phabricator instance.

[MachineInstr] Introduce generic predicated copy opcode
AbandonedPublic

Authored by cdevadas on Feb 10 2023, 9:20 AM.

Details

Summary

The copy operation with a predicated dependency is currently
marked with an implicit operand. It is good to have a separate
opcode for such instances to distinguish them from the regular
copies that have no dependencies at all. This will allow targets
more control while lowering the copies. AMDGPU, for instance,
have the implicit dependency with exec register for all vector
copies and no such dependency for scalar copies and there is a
custom pass currently adds the implicit exec operand to every
vector copy and that doesn't look right.

This patch introduces a new opcode PRED_COPY, called the
predicated equivalent of the existing COPY opcode. Targets
can define the default copy opcode and the getCopyOpcode
callback would return the chosen opcode appropriately. Every
interface that needs TargetOpcode::COPY should now be replaced
with a call to getCopyOpcode. Targets have the control over how
and when to lower the PRED_COPY instructions.

Diff Detail

Event Timeline

cdevadas created this revision.Feb 10 2023, 9:20 AM
Herald added a project: Restricted Project. · View Herald Transcript
cdevadas requested review of this revision.Feb 10 2023, 9:20 AM
lkail added a subscriber: lkail.Feb 10 2023, 9:36 AM

D143757 in the review chain enables predicated copy for AMDGPU target.
The application of pred-copy is here: D143762.
Testcase llvm/test/CodeGen/AMDGPU/whole-wave-register-copy.ll added along with D124196 demonstrates the need for pred-copy.

kparzysz requested changes to this revision.Feb 10 2023, 11:31 AM
kparzysz added a subscriber: kparzysz.

Please post an RFC for this. So far it looks to me like something that can be dealt with in AMDGPU codegen.

This revision now requires changes to proceed.Feb 10 2023, 11:31 AM

Please post an RFC for this. So far it looks to me like something that can be dealt with in AMDGPU codegen.

Whatever we end up with needs to be deeply embedded in everything related to register allocation, so it can't be handled just in AMDGPU code

bjope added a subscriber: bjope.Feb 15 2023, 1:27 PM
bjope added inline comments.
llvm/include/llvm/Support/TargetOpcodes.def
114

So what is the result when the predicate is false?.

(That kind of question does not exist for any existing generic opcodes afaik, because this would be the first predicable generic opcode.)

llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp
204

Here we have a typical example where I think it gets complicated to say that PRED_COPY is predicable. If the predicate is false, then you can't guarantee anything about the known bits. OTOH, if the predicate is guaranteed to be true, then there is no need to have a PRED_COPY.

llvm/lib/CodeGen/TwoAddressInstructionPass.cpp
1927

Here is a general comment to about all these places that now may or may not create a predicated instruction.

In our downstream target all predicated instructions also have extra operands. So whenever we create a predicated instruction those predicate operands should be added to the instruction. But that won't happen for any of these places where you use getCopyOpcode(). That is of course not a problem as long as we avoid using PRED_COPY for our target. I'm just saying that this probably wouldn't work for any target, depending on how isPredicable=1 is handled.

The IsPredicable field will always be turned to 1 for Pred_Copy to indicate that it is the Predicated version of the copy.

I actually wanted to have the Pred_Copy without these implicit operands. We should be able to model the dependency using an additional operand.
But I don't know how to model this operand and hence decided to continue using the implicit operand itself.

May be an immediate operand that takes const values (0, 1, 2, etc.)?
The immediate values can be mapped to enumerations defined via target files to represent these predicated dependencies.
When there is no dependency, use the value 0. It means, there is no dependency and can change to the regular COPY.
If required, the Pred_Copy with Imm_opnd value 0 can also be modeled for the "predicate is false" scenario.

For example. the following can be defined for AMDGPU target in llvm/lib/Target/AMDGPU/SIDefines.h

enum Copy_Predication {

None = 0,    // No dependency. It is for SGPR copies. At some point, convert the instruction into regular COPY.
Lane_Copy,  // copy only the active lanes of VGPRs.
Wave_Copy // copy all lanes of VGPRs

}

Any thoughts?

llvm/lib/CodeGen/GlobalISel/GISelKnownBits.cpp
204

The PRED_COPY here was added (like many other instances) for the codegen transformations to work as similar to the COPY opcode. Is there a need to consider the Predicable field here?

I wrote more details on the RFC on discourse. But this feels pretty AMDGPU specific, why wouldn't it used a target specific opcode then?

MatzeB added inline comments.Feb 16 2023, 11:18 AM
llvm/lib/CodeGen/MachineVerifier.cpp
1837

As I still don't fully understand the point of the new opcode... I was hoping to see some extra rules and constraints here, but this feels like it allows the exact same operands and constraints as a normal COPY if there's no extra verified rules here... But then why do we need a new opcode in the first place?

The IsPredicable field will always be turned to 1 for Pred_Copy to indicate that it is the Predicated version of the copy.

I actually wanted to have the Pred_Copy without these implicit operands. We should be able to model the dependency using an additional operand.
But I don't know how to model this operand and hence decided to continue using the implicit operand itself.

May be an immediate operand that takes const values (0, 1, 2, etc.)?
The immediate values can be mapped to enumerations defined via target files to represent these predicated dependencies.
When there is no dependency, use the value 0. It means, there is no dependency and can change to the regular COPY.
If required, the Pred_Copy with Imm_opnd value 0 can also be modeled for the "predicate is false" scenario.

For example. the following can be defined for AMDGPU target in llvm/lib/Target/AMDGPU/SIDefines.h

enum Copy_Predication {

None = 0,    // No dependency. It is for SGPR copies. At some point, convert the instruction into regular COPY.
Lane_Copy,  // copy only the active lanes of VGPRs.
Wave_Copy // copy all lanes of VGPRs

}

Any thoughts?

This would still be very target specific. And since all instructions with isPredicable=1 in my target takes 3 extra operands, then I assume other targets can have similar rules that should be fulfilled for such instructions. So if target generic opcodes starts to use isPredicable=1 it might end up a bit messy as such target sepcific expectations on TargetInstrInfo::isPredicable() isn't fulfilled.

I also agree with others that this feels quite target specific for AMDGPU (but that also makes me feel that maybe I've misunderstood why you want to make this generic).

In the description you have written "The copy operation with a predicated dependency is currently marked with an implicit operand". I don't really understand what that is referring to. Is that something AMDGPU specific, or what is this referring to?

cdevadas abandoned this revision.Jun 3 2023, 6:00 PM

This patch is no longer required after the introduction of liverange split target opcode with D150388