This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][NFC] Alter ComplexPattern types to be consistent with their uses
ClosedPublic

Authored by jrtc27 on Aug 31 2021, 5:00 PM.

Details

Summary

When used as a non-leaf node, TableGen does not currently use the type
of a ComplexPattern for type inference, which also means it does not
check it doesn't conflict with the use. This differs from when used as a
leaf value, where the type is used for inference. Fixing that
discrepancy is something I intend to upstream as a subsequent review.

AMDGPU currently has several ComplexPatterns that are used in contexts
where they're expected to be an iPTR, and where using an iPTR instead of
a fixed-width integer type matters. With my locally-patched TableGen,
none of these mismatches result in type contradictions, but do change
the patterns and cause various failures to select. These changes to the
ComplexPatterns' types reflect how they are actually used, result in
bit-for-bit identical TableGen output (without my local TableGen patch),
and ensure that with improved type inference AMDGPU's backend will
continue to work.

Diff Detail

Unit TestsFailed

Event Timeline

jrtc27 created this revision.Aug 31 2021, 5:00 PM
jrtc27 requested review of this revision.Aug 31 2021, 5:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2021, 5:00 PM

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

llvm/lib/Target/AMDGPU/SIInstrInfo.td
1320–1322

I think iPTR is a flawed concept that doesn't work without the address space, so I don't think AMDGPU should ever use it

jrtc27 added a comment.EditedAug 31 2021, 5:22 PM

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

I agree with you in a sense, it causes pain for us downstream to not have address spaces here (so we just add a new type for our other pointer representation downstream, and up until now it's been hacked together, but I'm trying to rewrite it to do things properly, and adding a new possible pointer type everywhere beyond iPTR causes a lot of believed-ambiguous-but-not patterns without better type inference in TableGen). However, if you look at the generated .inc files, you will see that all the cases where these are used TableGen has already inferred that iPTR is in use, and if I don't make this change but have my subsequent TableGen patch applied to actually propagate the types as they should have been from the start then the AMDGPU backend becomes very crash-prone with failures to select all over the place as it seems it really is expecting iPTR not i32/i64 for various things. The changes here have zero effect on the generated files without my TableGen patch, and ensure that even with my TableGen patch the output remains unchanged (otherwise they go from iPTR to i32/i64), so this doesn't change behaviour, it just makes the code more accurately state what is actually currently going on.

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

I agree with you in a sense, it causes pain for us downstream to not have address spaces here (so we just add a new type for our other pointer representation downstream, and up until now it's been hacked together, but I'm trying to rewrite it to do things properly, and adding a new possible pointer type everywhere beyond iPTR causes a lot of believed-ambiguous-but-not patterns without better type inference in TableGen). However, if you look at the generated .inc files, you will see that all the cases where these are used TableGen has already inferred that iPTR is in use, and if I don't make this change but have my subsequent TableGen patch applied to actually propagate the types as they should have been from the start then the AMDGPU backend becomes very crash-prone with failures to select all over the place as it seems it really is expecting iPTR not i32/i64 for various things. The changes here have zero effect on the generated files without my TableGen patch, and ensure that even with my TableGen patch the output remains unchanged (otherwise they go from iPTR to i32/i64), so this doesn't change behaviour, it just makes the code more accurately state what is actually currently going on.

For example, picking the first ComplexPattern here, MUBUFAddr64, I see 56 instances of MUBUFAddr64:{ in AMDGPUGenDAGISel.inc, all of which are MUBUFAddr64:{ *:[iPTR] }. Changing the ComplexPattern to have type iPTR rather than i64 ensures that that remains the case once the ComplexPattern type is included in the type inference, rather than altering behaviour by inferring MUBUFAddr64:{ *:[i64] } instead (which is, I imagine, what the author _intended_ it to be, but not what it _actually_ is currently, and does not work if that's actually the type used).

Ping; does the above help explain why, despite what it may look like, this is actually NFC, down to the generated code?

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

I agree with you in a sense, it causes pain for us downstream to not have address spaces here (so we just add a new type for our other pointer representation downstream, and up until now it's been hacked together, but I'm trying to rewrite it to do things properly, and adding a new possible pointer type everywhere beyond iPTR causes a lot of believed-ambiguous-but-not patterns without better type inference in TableGen). However, if you look at the generated .inc files, you will see that all the cases where these are used TableGen has already inferred that iPTR is in use, and if I don't make this change but have my subsequent TableGen patch applied to actually propagate the types as they should have been from the start then the AMDGPU backend becomes very crash-prone with failures to select all over the place as it seems it really is expecting iPTR not i32/i64 for various things. The changes here have zero effect on the generated files without my TableGen patch, and ensure that even with my TableGen patch the output remains unchanged (otherwise they go from iPTR to i32/i64), so this doesn't change behaviour, it just makes the code more accurately state what is actually currently going on.

For example, picking the first ComplexPattern here, MUBUFAddr64, I see 56 instances of MUBUFAddr64:{ in AMDGPUGenDAGISel.inc, all of which are MUBUFAddr64:{ *:[iPTR] }. Changing the ComplexPattern to have type iPTR rather than i64 ensures that that remains the case once the ComplexPattern type is included in the type inference, rather than altering behaviour by inferring MUBUFAddr64:{ *:[i64] } instead (which is, I imagine, what the author _intended_ it to be, but not what it _actually_ is currently, and does not work if that's actually the type used).

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

llvm/lib/Target/AMDGPU/FLATInstructions.td
9–10

I would believe these are NFC

11

There's no way this is actually correct

llvm/lib/Target/AMDGPU/SIInstrInfo.td
1320–1322

These are also i32, I don't see how this is working

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

I agree with you in a sense, it causes pain for us downstream to not have address spaces here (so we just add a new type for our other pointer representation downstream, and up until now it's been hacked together, but I'm trying to rewrite it to do things properly, and adding a new possible pointer type everywhere beyond iPTR causes a lot of believed-ambiguous-but-not patterns without better type inference in TableGen). However, if you look at the generated .inc files, you will see that all the cases where these are used TableGen has already inferred that iPTR is in use, and if I don't make this change but have my subsequent TableGen patch applied to actually propagate the types as they should have been from the start then the AMDGPU backend becomes very crash-prone with failures to select all over the place as it seems it really is expecting iPTR not i32/i64 for various things. The changes here have zero effect on the generated files without my TableGen patch, and ensure that even with my TableGen patch the output remains unchanged (otherwise they go from iPTR to i32/i64), so this doesn't change behaviour, it just makes the code more accurately state what is actually currently going on.

For example, picking the first ComplexPattern here, MUBUFAddr64, I see 56 instances of MUBUFAddr64:{ in AMDGPUGenDAGISel.inc, all of which are MUBUFAddr64:{ *:[iPTR] }. Changing the ComplexPattern to have type iPTR rather than i64 ensures that that remains the case once the ComplexPattern type is included in the type inference, rather than altering behaviour by inferring MUBUFAddr64:{ *:[i64] } instead (which is, I imagine, what the author _intended_ it to be, but not what it _actually_ is currently, and does not work if that's actually the type used).

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

Those examples may not be what you intend them to be, but I can promise you that if you go read AMDGPUGenDAGISel.inc yourself you will find that every single one of them is currently matching an iPTR. For example:

// Src: (ld:{ *:[i32] } (ScratchOffset:{ *:[iPTR] } VGPR_32:{ *:[i32] }:$vaddr, i16:{ *:[i16] }:$offset))<<P:Predicate_unindexedload>><<P:Predicate_extload>><<P:Predicate_extloadi8_private>> - Complexity = 22
// Dst: (SCRATCH_LOAD_UBYTE:{ *:[i32] } ?:{ *:[i32] }:$vaddr, ?:{ *:[i16] }:$offset)

is what I see *without* any of my patches, so I'm just preserving that. Things blow up with failures to select, or poor codegen that regresses tests, when these instead infer i32. If you want to see for yourself, apply D109035, note that the TableGen output changes to honour the existing intended types and that many AMDGPU tests fail.

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

I agree with you in a sense, it causes pain for us downstream to not have address spaces here (so we just add a new type for our other pointer representation downstream, and up until now it's been hacked together, but I'm trying to rewrite it to do things properly, and adding a new possible pointer type everywhere beyond iPTR causes a lot of believed-ambiguous-but-not patterns without better type inference in TableGen). However, if you look at the generated .inc files, you will see that all the cases where these are used TableGen has already inferred that iPTR is in use, and if I don't make this change but have my subsequent TableGen patch applied to actually propagate the types as they should have been from the start then the AMDGPU backend becomes very crash-prone with failures to select all over the place as it seems it really is expecting iPTR not i32/i64 for various things. The changes here have zero effect on the generated files without my TableGen patch, and ensure that even with my TableGen patch the output remains unchanged (otherwise they go from iPTR to i32/i64), so this doesn't change behaviour, it just makes the code more accurately state what is actually currently going on.

For example, picking the first ComplexPattern here, MUBUFAddr64, I see 56 instances of MUBUFAddr64:{ in AMDGPUGenDAGISel.inc, all of which are MUBUFAddr64:{ *:[iPTR] }. Changing the ComplexPattern to have type iPTR rather than i64 ensures that that remains the case once the ComplexPattern type is included in the type inference, rather than altering behaviour by inferring MUBUFAddr64:{ *:[i64] } instead (which is, I imagine, what the author _intended_ it to be, but not what it _actually_ is currently, and does not work if that's actually the type used).

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

Those examples may not be what you intend them to be, but I can promise you that if you go read AMDGPUGenDAGISel.inc yourself you will find that every single one of them is currently matching an iPTR. For example:

// Src: (ld:{ *:[i32] } (ScratchOffset:{ *:[iPTR] } VGPR_32:{ *:[i32] }:$vaddr, i16:{ *:[i16] }:$offset))<<P:Predicate_unindexedload>><<P:Predicate_extload>><<P:Predicate_extloadi8_private>> - Complexity = 22
// Dst: (SCRATCH_LOAD_UBYTE:{ *:[i32] } ?:{ *:[i32] }:$vaddr, ?:{ *:[i16] }:$offset)

is what I see *without* any of my patches, so I'm just preserving that. Things blow up with failures to select, or poor codegen that regresses tests, when these instead infer i32. If you want to see for yourself, apply D109035, note that the TableGen output changes to honour the existing intended types and that many AMDGPU tests fail.

Ping regarding this explanation?

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

I agree with you in a sense, it causes pain for us downstream to not have address spaces here (so we just add a new type for our other pointer representation downstream, and up until now it's been hacked together, but I'm trying to rewrite it to do things properly, and adding a new possible pointer type everywhere beyond iPTR causes a lot of believed-ambiguous-but-not patterns without better type inference in TableGen). However, if you look at the generated .inc files, you will see that all the cases where these are used TableGen has already inferred that iPTR is in use, and if I don't make this change but have my subsequent TableGen patch applied to actually propagate the types as they should have been from the start then the AMDGPU backend becomes very crash-prone with failures to select all over the place as it seems it really is expecting iPTR not i32/i64 for various things. The changes here have zero effect on the generated files without my TableGen patch, and ensure that even with my TableGen patch the output remains unchanged (otherwise they go from iPTR to i32/i64), so this doesn't change behaviour, it just makes the code more accurately state what is actually currently going on.

For example, picking the first ComplexPattern here, MUBUFAddr64, I see 56 instances of MUBUFAddr64:{ in AMDGPUGenDAGISel.inc, all of which are MUBUFAddr64:{ *:[iPTR] }. Changing the ComplexPattern to have type iPTR rather than i64 ensures that that remains the case once the ComplexPattern type is included in the type inference, rather than altering behaviour by inferring MUBUFAddr64:{ *:[i64] } instead (which is, I imagine, what the author _intended_ it to be, but not what it _actually_ is currently, and does not work if that's actually the type used).

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

Those examples may not be what you intend them to be, but I can promise you that if you go read AMDGPUGenDAGISel.inc yourself you will find that every single one of them is currently matching an iPTR. For example:

// Src: (ld:{ *:[i32] } (ScratchOffset:{ *:[iPTR] } VGPR_32:{ *:[i32] }:$vaddr, i16:{ *:[i16] }:$offset))<<P:Predicate_unindexedload>><<P:Predicate_extload>><<P:Predicate_extloadi8_private>> - Complexity = 22
// Dst: (SCRATCH_LOAD_UBYTE:{ *:[i32] } ?:{ *:[i32] }:$vaddr, ?:{ *:[i16] }:$offset)

is what I see *without* any of my patches, so I'm just preserving that. Things blow up with failures to select, or poor codegen that regresses tests, when these instead infer i32. If you want to see for yourself, apply D109035, note that the TableGen output changes to honour the existing intended types and that many AMDGPU tests fail.

Ping regarding this explanation?

Ping?

jrtc27 added a comment.Nov 1 2021, 1:02 PM

I'm not sure where this is going without the context of the other patch.

I think iPTR is a flawed concept that doesn't work without the address space. AMDGPU does not want the type changing magic of iPTR when there are mixed and constant pointer sizes that never change, so the fixed types are more logical.

I agree with you in a sense, it causes pain for us downstream to not have address spaces here (so we just add a new type for our other pointer representation downstream, and up until now it's been hacked together, but I'm trying to rewrite it to do things properly, and adding a new possible pointer type everywhere beyond iPTR causes a lot of believed-ambiguous-but-not patterns without better type inference in TableGen). However, if you look at the generated .inc files, you will see that all the cases where these are used TableGen has already inferred that iPTR is in use, and if I don't make this change but have my subsequent TableGen patch applied to actually propagate the types as they should have been from the start then the AMDGPU backend becomes very crash-prone with failures to select all over the place as it seems it really is expecting iPTR not i32/i64 for various things. The changes here have zero effect on the generated files without my TableGen patch, and ensure that even with my TableGen patch the output remains unchanged (otherwise they go from iPTR to i32/i64), so this doesn't change behaviour, it just makes the code more accurately state what is actually currently going on.

For example, picking the first ComplexPattern here, MUBUFAddr64, I see 56 instances of MUBUFAddr64:{ in AMDGPUGenDAGISel.inc, all of which are MUBUFAddr64:{ *:[iPTR] }. Changing the ComplexPattern to have type iPTR rather than i64 ensures that that remains the case once the ComplexPattern type is included in the type inference, rather than altering behaviour by inferring MUBUFAddr64:{ *:[i64] } instead (which is, I imagine, what the author _intended_ it to be, but not what it _actually_ is currently, and does not work if that's actually the type used).

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

Those examples may not be what you intend them to be, but I can promise you that if you go read AMDGPUGenDAGISel.inc yourself you will find that every single one of them is currently matching an iPTR. For example:

// Src: (ld:{ *:[i32] } (ScratchOffset:{ *:[iPTR] } VGPR_32:{ *:[i32] }:$vaddr, i16:{ *:[i16] }:$offset))<<P:Predicate_unindexedload>><<P:Predicate_extload>><<P:Predicate_extloadi8_private>> - Complexity = 22
// Dst: (SCRATCH_LOAD_UBYTE:{ *:[i32] } ?:{ *:[i32] }:$vaddr, ?:{ *:[i16] }:$offset)

is what I see *without* any of my patches, so I'm just preserving that. Things blow up with failures to select, or poor codegen that regresses tests, when these instead infer i32. If you want to see for yourself, apply D109035, note that the TableGen output changes to honour the existing intended types and that many AMDGPU tests fail.

Ping regarding this explanation?

Ping?

Ping?

arsenm accepted this revision.Nov 1 2021, 2:33 PM

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

Those examples may not be what you intend them to be, but I can promise you that if you go read AMDGPUGenDAGISel.inc yourself you will find that every single one of them is currently matching an iPTR. For example:

// Src: (ld:{ *:[i32] } (ScratchOffset:{ *:[iPTR] } VGPR_32:{ *:[i32] }:$vaddr, i16:{ *:[i16] }:$offset))<<P:Predicate_unindexedload>><<P:Predicate_extload>><<P:Predicate_extloadi8_private>> - Complexity = 22
// Dst: (SCRATCH_LOAD_UBYTE:{ *:[i32] } ?:{ *:[i32] }:$vaddr, ?:{ *:[i16] }:$offset)

is what I see *without* any of my patches, so I'm just preserving that. Things blow up with failures to select, or poor codegen that regresses tests, when these instead infer i32. If you want to see for yourself, apply D109035, note that the TableGen output changes to honour the existing intended types and that many AMDGPU tests fail.

Ping regarding this explanation?

This is more of another datapoint that nothing with iPTR makes any sense, I don't think this really explains it.

This revision is now accepted and ready to land.Nov 1 2021, 2:33 PM
jrtc27 added a comment.Nov 1 2021, 4:06 PM

The MUBUFAddr64 case isn't that interesting, since you're changing one form of "i64" to another form of "i64". I'm much more puzzled about how this is not blowing up on the cases that do use 32 bit pointers (e.g. all the DS* patterns)

Those examples may not be what you intend them to be, but I can promise you that if you go read AMDGPUGenDAGISel.inc yourself you will find that every single one of them is currently matching an iPTR. For example:

// Src: (ld:{ *:[i32] } (ScratchOffset:{ *:[iPTR] } VGPR_32:{ *:[i32] }:$vaddr, i16:{ *:[i16] }:$offset))<<P:Predicate_unindexedload>><<P:Predicate_extload>><<P:Predicate_extloadi8_private>> - Complexity = 22
// Dst: (SCRATCH_LOAD_UBYTE:{ *:[i32] } ?:{ *:[i32] }:$vaddr, ?:{ *:[i16] }:$offset)

is what I see *without* any of my patches, so I'm just preserving that. Things blow up with failures to select, or poor codegen that regresses tests, when these instead infer i32. If you want to see for yourself, apply D109035, note that the TableGen output changes to honour the existing intended types and that many AMDGPU tests fail.

Ping regarding this explanation?

This is more of another datapoint that nothing with iPTR makes any sense, I don't think this really explains it.

Oh I know, I'm cursed with having to bodge iPTR downstream even more than normal because we're SelectionDAG-based (and I don't see RISC-V switching to GISel any time soon)...

Thanks for the review

This revision was landed with ongoing or failed builds.Dec 2 2021, 11:05 PM
This revision was automatically updated to reflect the committed changes.