This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel][TableGen] Add handling of unannotated dst pattern ops
AbandonedPublic

Authored by ehjogab on Aug 19 2020, 3:19 AM.

Details

Summary

GlobalISelEmitter.cpp requires that destination pattern have the
register class annotated to its operands, e.g.:

def : Pat<(add i32:$src1, i32:$str2),
          (instr regclass:$src1, regclass:$src2)>;

This information is not necessary for SelectionIDAG, and subsequently
for certain targets the register class is omitted from the operands,
which causes the pattern to be rejected. However, this information is
still available by looking at the corresponding operand in the
instruction definition. This patch performs such a lookup when a
register class is expected and none has been annotated to the operand.

Diff Detail

Event Timeline

ehjogab created this revision.Aug 19 2020, 3:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 19 2020, 3:19 AM
Herald added subscribers: bjope, rovka. · View Herald Transcript
ehjogab requested review of this revision.Aug 19 2020, 3:19 AM

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

The most common AMDGPU instructions define VS_* pseudoclasses for operand constraints. The two classes added have different register banks, since the instructions support directly reading from either.
e.g.
RegisterClass VS_32 = {VGPR_32RegClass, SGPR_32RegClass}, where both of these register classes are different banks.

In terms of selecting a particular operand, there's an additional hazard. We can't freely use SGPR_32 in a given operand without considering the context of the other operands. We avoid solving this in GlobalISel by always assigning these operands to VGPRRegBankID, and relying on a later pass to optimize out these copies. We're sorely missing a RegBank legality verifier check, so I'm not sure I would want the generated patterns to infer either bank is OK.

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

The most common AMDGPU instructions define VS_* pseudoclasses for operand constraints. The two classes added have different register banks, since the instructions support directly reading from either.
e.g.
RegisterClass VS_32 = {VGPR_32RegClass, SGPR_32RegClass}, where both of these register classes are different banks.

In terms of selecting a particular operand, there's an additional hazard. We can't freely use SGPR_32 in a given operand without considering the context of the other operands. We avoid solving this in GlobalISel by always assigning these operands to VGPRRegBankID, and relying on a later pass to optimize out these copies. We're sorely missing a RegBank legality verifier check, so I'm not sure I would want the generated patterns to infer either bank is OK.

The operands of these VS_* instructions, are the defined (in the instruction) to use the VS_32 class, or VGPR_32/SGPR_32 classes? If it is the former, then what this patch would do is to use VS_32 in the patterns. If the patterns should really be using VGPR_32/SGPR_32, then this is a problem. However, you still have the option of annotating the pattern operands just as before, so this patch would not interfere with any patterns that today are accepted by TableGen. If it is the latter, then this patch would use whatever class (VGPR_32 or SGPR_32) that is specified for the corresponding operands in the instruction, so here I don't see there being any problem.

Perhaps we could add an option that this feature is disabled by default, and by activating it you take responsibility to ensure that you don't have these kinds of situations for your target?

madhur13490 added a subscriber: madhur13490.EditedAug 21 2020, 4:31 AM

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

The most common AMDGPU instructions define VS_* pseudoclasses for operand constraints. The two classes added have different register banks, since the instructions support directly reading from either.
e.g.
RegisterClass VS_32 = {VGPR_32RegClass, SGPR_32RegClass}, where both of these register classes are different banks.

In terms of selecting a particular operand, there's an additional hazard. We can't freely use SGPR_32 in a given operand without considering the context of the other operands. We avoid solving this in GlobalISel by always assigning these operands to VGPRRegBankID, and relying on a later pass to optimize out these copies. We're sorely missing a RegBank legality verifier check, so I'm not sure I would want the generated patterns to infer either bank is OK.

The operands of these VS_* instructions, are the defined (in the instruction) to use the VS_32 class, or VGPR_32/SGPR_32 classes? If it is the former, then what this patch would do is to use VS_32 in the patterns. If the patterns should really be using VGPR_32/SGPR_32, then this is a problem. However, you still have the option of annotating the pattern operands just as before, so this patch would not interfere with any patterns that today are accepted by TableGen. If it is the latter, then this patch would use whatever class (VGPR_32 or SGPR_32) that is specified for the corresponding operands in the instruction, so here I don't see there being any problem.

Perhaps we could add an option that this feature is disabled by default, and by activating it you take responsibility to ensure that you don't have these kinds of situations for your target?

When a register class is an union of two or more classes then such problems bound to occur for one or more targets. Why not do such inference when it is obvious i.e. register class in instruction definition is a leaf register class and NOT an union of some base other base classes. I think this is better than having an runtime option to enable/disable the inference.
I think leaf classes can be detected programmatically.

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

The most common AMDGPU instructions define VS_* pseudoclasses for operand constraints. The two classes added have different register banks, since the instructions support directly reading from either.
e.g.
RegisterClass VS_32 = {VGPR_32RegClass, SGPR_32RegClass}, where both of these register classes are different banks.

In terms of selecting a particular operand, there's an additional hazard. We can't freely use SGPR_32 in a given operand without considering the context of the other operands. We avoid solving this in GlobalISel by always assigning these operands to VGPRRegBankID, and relying on a later pass to optimize out these copies. We're sorely missing a RegBank legality verifier check, so I'm not sure I would want the generated patterns to infer either bank is OK.

The operands of these VS_* instructions, are the defined (in the instruction) to use the VS_32 class, or VGPR_32/SGPR_32 classes? If it is the former, then what this patch would do is to use VS_32 in the patterns. If the patterns should really be using VGPR_32/SGPR_32, then this is a problem. However, you still have the option of annotating the pattern operands just as before, so this patch would not interfere with any patterns that today are accepted by TableGen. If it is the latter, then this patch would use whatever class (VGPR_32 or SGPR_32) that is specified for the corresponding operands in the instruction, so here I don't see there being any problem.

Perhaps we could add an option that this feature is disabled by default, and by activating it you take responsibility to ensure that you don't have these kinds of situations for your target?

When a register class is an union of two or more classes then such problems bound to occur for one or more targets. Why not do such inference when it is obvious i.e. register class in instruction definition is a leaf register class and NOT an union of some base other base classes. I think this is better than having an runtime option to enable/disable the inference.
I think leaf classes can be detected programmatically.

I like that idea, and this is only applied on leaf nodes to begin with.

foad added a subscriber: foad.Aug 21 2020, 6:03 AM

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

The most common AMDGPU instructions define VS_* pseudoclasses for operand constraints. The two classes added have different register banks, since the instructions support directly reading from either.
e.g.
RegisterClass VS_32 = {VGPR_32RegClass, SGPR_32RegClass}, where both of these register classes are different banks.

In terms of selecting a particular operand, there's an additional hazard. We can't freely use SGPR_32 in a given operand without considering the context of the other operands. We avoid solving this in GlobalISel by always assigning these operands to VGPRRegBankID, and relying on a later pass to optimize out these copies. We're sorely missing a RegBank legality verifier check, so I'm not sure I would want the generated patterns to infer either bank is OK.

The operands of these VS_* instructions, are the defined (in the instruction) to use the VS_32 class, or VGPR_32/SGPR_32 classes? If it is the former, then what this patch would do is to use VS_32 in the patterns. If the patterns should really be using VGPR_32/SGPR_32, then this is a problem.

As I mentioned, the VS_32 is unallocatable and only exists to model the operand constraints. The actual operands should be either VGPR_32 or SGPR_32. Querying the register bank bank for VS_32 doesn't make any sense, and I suspect would assert.

However, you still have the option of annotating the pattern operands just as before, so this patch would not interfere with any patterns that today are accepted by TableGen. If it is the latter, then this patch would use whatever class (VGPR_32 or SGPR_32) that is specified for the corresponding operands in the instruction, so here I don't see there being any problem.

This would produce a broken import by default with unannotated patterns, which is undesirable. This still has the same problem before, but now all the patterns are broken and it's harder to find which ones need fixing.

Perhaps we could add an option that this feature is disabled by default, and by activating it you take responsibility to ensure that you don't have these kinds of situations for your target?

I think tablegen is hard enough to understand as is, and changing behaviors based on targets would make that problem worse.

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

The most common AMDGPU instructions define VS_* pseudoclasses for operand constraints. The two classes added have different register banks, since the instructions support directly reading from either.
e.g.
RegisterClass VS_32 = {VGPR_32RegClass, SGPR_32RegClass}, where both of these register classes are different banks.

In terms of selecting a particular operand, there's an additional hazard. We can't freely use SGPR_32 in a given operand without considering the context of the other operands. We avoid solving this in GlobalISel by always assigning these operands to VGPRRegBankID, and relying on a later pass to optimize out these copies. We're sorely missing a RegBank legality verifier check, so I'm not sure I would want the generated patterns to infer either bank is OK.

The operands of these VS_* instructions, are the defined (in the instruction) to use the VS_32 class, or VGPR_32/SGPR_32 classes? If it is the former, then what this patch would do is to use VS_32 in the patterns. If the patterns should really be using VGPR_32/SGPR_32, then this is a problem. However, you still have the option of annotating the pattern operands just as before, so this patch would not interfere with any patterns that today are accepted by TableGen. If it is the latter, then this patch would use whatever class (VGPR_32 or SGPR_32) that is specified for the corresponding operands in the instruction, so here I don't see there being any problem.

Perhaps we could add an option that this feature is disabled by default, and by activating it you take responsibility to ensure that you don't have these kinds of situations for your target?

When a register class is an union of two or more classes then such problems bound to occur for one or more targets. Why not do such inference when it is obvious i.e. register class in instruction definition is a leaf register class and NOT an union of some base other base classes. I think this is better than having an runtime option to enable/disable the inference.
I think leaf classes can be detected programmatically.

I like that idea, and this is only applied on leaf nodes to begin with.

If it is applied to leaf nodes in this patch then it shouldn't be a problem to our AMDGPU backend, isn't it @arsenm?

What happens if the instruction operand uses an unallocatable pseudo-class combining register classes with different banks? I would expect it would need to check any of the bank possibilities

Hm, not sure what you mean. Could you provide a brief example?

The most common AMDGPU instructions define VS_* pseudoclasses for operand constraints. The two classes added have different register banks, since the instructions support directly reading from either.
e.g.
RegisterClass VS_32 = {VGPR_32RegClass, SGPR_32RegClass}, where both of these register classes are different banks.

In terms of selecting a particular operand, there's an additional hazard. We can't freely use SGPR_32 in a given operand without considering the context of the other operands. We avoid solving this in GlobalISel by always assigning these operands to VGPRRegBankID, and relying on a later pass to optimize out these copies. We're sorely missing a RegBank legality verifier check, so I'm not sure I would want the generated patterns to infer either bank is OK.

The operands of these VS_* instructions, are the defined (in the instruction) to use the VS_32 class, or VGPR_32/SGPR_32 classes? If it is the former, then what this patch would do is to use VS_32 in the patterns. If the patterns should really be using VGPR_32/SGPR_32, then this is a problem.

As I mentioned, the VS_32 is unallocatable and only exists to model the operand constraints. The actual operands should be either VGPR_32 or SGPR_32. Querying the register bank bank for VS_32 doesn't make any sense, and I suspect would assert.

However, you still have the option of annotating the pattern operands just as before, so this patch would not interfere with any patterns that today are accepted by TableGen. If it is the latter, then this patch would use whatever class (VGPR_32 or SGPR_32) that is specified for the corresponding operands in the instruction, so here I don't see there being any problem.

This would produce a broken import by default with unannotated patterns, which is undesirable. This still has the same problem before, but now all the patterns are broken and it's harder to find which ones need fixing.

Perhaps we could add an option that this feature is disabled by default, and by activating it you take responsibility to ensure that you don't have these kinds of situations for your target?

I think tablegen is hard enough to understand as is, and changing behaviors based on targets would make that problem worse.

If we could make this so that it only applies in situations where what is already annotated to the instruction operand is the only sensable option for the pattern opernads - i.e. only do default annotation when the pattern operand is a leaf and the register class of the instruction operand is (1) allocatable, and (2) not a union of two other register classes - would that be acceptable? I agree that we should not create broken imports nor make things harder to debug, but if we can automate things in a safe manner, why not do it?

ehjogab updated this revision to Diff 287921.Aug 26 2020, 4:33 AM

Added check that the register class to be annotated is allocatable and is not a union of other register classes.

Is this still planned to be checked-in?

Is this still planned to be checked-in?

Dunno. I got no response on my last edit, and I've been on vacation since then until today so there's been no updates on my side.
I still don't see why this shouldn't be checked in, but I might be overlooking something that speaks against this.

Does not annotating the output really buy you much? I think inconsistent behaviors in tablegen is a bigger obstacle to understanding patterns

llvm/utils/TableGen/GlobalISelEmitter.cpp
4161–4162

I think trying to parse this manually is problematic

Does not annotating the output really buy you much? I think inconsistent behaviors in tablegen is a bigger obstacle to understanding patterns

It buys me the effort of having to go through all patterns in our target and add these manually. There are over 1300 patterns, meaning it will be tedious and time-consuming having to go through each and every one, look up the register classes of all arguments to the instruction, and annotate these. But since everyone else have had to do this for other targets, I suppose I will be forced to do the same.

Unless someone else is willing to argue towards having this patch accepted, I see no point in spending more time on this.

bjope added a comment.Oct 5 2020, 5:33 AM

One thing that I've not understood is how GlobalISel is different from the legacy ISel here. Apparently it isn't necessary to annotate things with the legacy ISel today? So is legacy ISel doing the same thing that this patch suggests to do also for GlobalISel, or why do we suddenly need to update all patterns now (I mean, somehow it has worked fine in the past, right)?

Seems a bit tedious (also from a maintenance perspective) to keep all patterns up-to-date with the instruction definition if the goal is that they should match (which I think has been the case when the annotation has been omitted in the past). If I've for example wanted to restrict an instruction to a more narrow register class, then it's been possible to make a single update in the instruction definition in the past. But now I'd have to find all the patterns using that instruction to also change the annotation? Or maybe it's worse when going in the opposite direction and I want to use a more general register class, then I for sure need to make updates all over the place to make it work.

If annotations are needed, then maybe we need some kind of debug tool to find any inconsistencies? Or is there already an easy way to spot such things?

ehjogab abandoned this revision.Oct 12 2020, 1:39 AM

Seems no one else is requesting this.

foad added a comment.Oct 12 2020, 1:49 AM

Querying the register bank bank for VS_32 doesn't make any sense, and I suspect would assert.

Would it really be so bad for AMDGPU to define getRegBankFromRegClass(VS_32) == VGPRRegBank in order to get the behaviour it wants from imported patterns?

bjope added a comment.Dec 10 2020, 1:55 AM

But if I understand this patch correctly it has tried to derive the type from "some_isnt" when it is omitted, rather than reusing the type from the source pattern. Would it make more sense to infer the type from the source pattern instead of from the target instruction when it is omitted? Is that possible?

I mean, as far as I can tell the output from the emitter is the same if specifying the register class or just the type from the source pattern.

Example:

def RC : RegisterClass<"MyTarget", [i32], 32, (add R0)>;

// This works with both SelectionDAGISel and GISel:
def : Pat<(add RC:$src1, RC:$src2),
          (some_inst RC:$src1, RC:$src2)>;

// This works with both SelectionDAGISel and GISel:
def : Pat<(add i32:$src1, i32:$src2),
          (some_inst i32:$src1, i32:$src2)>;

// This is currently not supported by GISel:
def : Pat<(add i32:$src1, i32:$src2),
          (some_inst $src1, $src2)>;

// This is currently not supported by GISel:
def : Pat<(add RC:$src1, RC:$src2),
          (some_inst $src1, $src2)>;

But if I understand this patch correctly it has tried to derive the type from "some_isnt" when it is omitted, rather than reusing the type from the source pattern. Would it make more sense to infer the type from the source pattern instead of from the target instruction when it is omitted? Is that possible?

The problem isn't the type, it's the register class. It can be ambiguous to go from the target instruction back to the source bank if the target register class supports multiple banks

bjope added a comment.Dec 14 2020, 3:37 PM

But if I understand this patch correctly it has tried to derive the type from "some_isnt" when it is omitted, rather than reusing the type from the source pattern. Would it make more sense to infer the type from the source pattern instead of from the target instruction when it is omitted? Is that possible?

The problem isn't the type, it's the register class. It can be ambiguous to go from the target instruction back to the source bank if the target register class supports multiple banks

I assume tblgen will complain if it find something ambigous?

However, this pattern seem to work with both SelectionDAGISel and GISel

def : Pat<(add i32:$src1, i32:$src2),
          (some_inst i32:$src1, i32:$src2)>;

and there aren't even any register class annotations (that is why I don't quite get how the register class is the problem).
Also, I could not find any differences in the GenGlobalISel.inc file is using a registerclass instead of "i32" (e.g. in the output part of the pattern), so it did not seem to matter really (unless some_inst was INSERT_SUBREG or EXTRACT_SUBREG or similar that actually might depend on the register class).

I still haven't figured out the use-case when I want/need to use different annotations for an operand in the input pattern compared to the output pattern (is that perhaps an "implicit" COPY_TO_REGCLASS?). It really feels a bit unexpected/unintuitive that for example $src1 could have different types (or register classes if you prefer) in the input and output pattern fragments (isn't it a weird syntax?).

My idea was simply to support an omitted annotation in the output part of the pattern, by defining it as having the same annotation as in the input pattern if it is omitted. After all, it is allowed to omit the annotation today (but then the GlobalISelEmitter ignores the pattern so it isn't really that useful). I think that either we should require an annotation also in the output pattern, or we should support omitting it in some useful way.

Defining what an omitted annotation means does not prevent anyone from adding annotations when they are needed (to resolve ambiguity etc).

ehjogab added a comment.EditedDec 14 2020, 10:51 PM

Hm, I had another look at this and it seems that it's enough to provide either a register class or a value type. In both cases you enter the same if condition (see line 4266 in GlobalISelEmitter.cpp in this review), which checks that the child record is either a RegisterClass, RegisterOperand, or ValueType. If the check passes, it will output a GIR_COPY, which just copies the operand from one instruction (the one matched) to another (the instruction to emit) and hence does not look at the register class of the destination (that seems to be taken care of at a later stage, which does not look at the annotation of the operand in the pattern).

In other words, as long as the operand in the destination pattern is annotated with either register class or register value, TableGen will not reject the pattern. In light of this, maybe it's a better idea to look for the corresponding operand in the source pattern and try to copy its annotation (like bjope suggested).

But since everyone else have had to do this for other targets, I suppose I will be forced to do the same.

I think it was mandatory to have them at one point and many targets have it because of that. This might be the first time I've seen a rule omit it.

One thing that I've not understood is how GlobalISel is different from the legacy ISel here. Apparently it isn't necessary to annotate things with the legacy ISel today? So is legacy ISel doing the same thing that this patch suggests to do also for GlobalISel, or why do we suddenly need to update all patterns now (I mean, somehow it has worked fine in the past, right)?

The semantics for DAGISel are complicated and largely undocumented. There's quite a few features that only come up once or twice or where a small change can make a pattern more conventional and remove the need to implement something. The goal wasn't really to re-implement the whole thing but rather to aid bring up and eventually transition to GlobalISel.

The big difference that matters w.r.t this though is that GlobalISel has a concept of register banks that didn't exist in DAGISel. The matcher side treats ValueType as a wildcard for RegisterBanks that matches anything. That can be a problem for some targets and such targets need to avoid using it but for others it can be fine. The renderer will copy the operand and thereby preserve bank information that already exists but something (typically GIR_ConstrainSelectedInstOperands ) will have to constrain it to a class for ISel to have done its job.

If annotations are needed, then maybe we need some kind of debug tool to find any inconsistencies? Or is there already an easy way to spot such things?

CodeGenDAGPatterns should object if the source and dest patterns disagree

I assume tblgen will complain if it find something ambigous?

Maybe. It will try for some things but I wouldn't rely on that to be complete.

Would it make more sense to infer the type from the source pattern instead of from the target instruction when it is omitted? Is that possible?

I think that's probably the right semantics to match DAGISel and if it is, then I feel that it ought to be CodeGenDAGPatterns's job to fill in the blanks. Have you found DAGISel's code for it?

But if I understand this patch correctly it has tried to derive the type from "some_isnt" when it is omitted, rather than reusing the type from the source pattern. Would it make more sense to infer the type from the source pattern instead of from the target instruction when it is omitted? Is that possible?

The problem isn't the type, it's the register class. It can be ambiguous to go from the target instruction back to the source bank if the target register class supports multiple banks

I think you may be talking cross-purposes here. It seems to me that there's two problems being discussed:

  • What does the rule mean? I think filling in the blanks in the dst pattern using the values in the src pattern is the right semantics but I haven't dug into DAGISel to confirm that
  • Is the resulting rule valid/importable? The answer will be target dependent according to the class->bank mappings. If it's unambiguous then I think accepting it is fine but if it's ambiguous (either by a single ambiguous class->multiple bank mapping, or by type->multiple class->multiple bank mapping) then we should reject the rule for that reason. That information should exist in RegisterBankEmitter.cpp, we might need to make it available to GlobalISelEmitter.cpp somehow
llvm/utils/TableGen/GlobalISelEmitter.cpp
4161–4162

I agree. That should be left to SetTheory, we should only be accessing register classes via CodeGenRegisterClass in this code

4256

I expect CodeGenRegisterClass has something for this. I haven't checked though