This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][GlobalISel] Check exact width in get*ClassForBitWidth and widen if necessary
ClosedPublic

Authored by matejam on Apr 12 2023, 2:44 AM.

Details

Summary

Instead of checking if the given bitwidth is less or equal to a bitwidth of an existing RegClass,
check if it has the exact same value.

For LLVM vector types that don't have a corresponding Register Class, widen them during legalization.
That goes for G_EXTRACT_VECTOR_ELT, G_INSERT_VECTOR_ELT and G_BUILD_VECTOR.

Diff Detail

Event Timeline

matejam created this revision.Apr 12 2023, 2:44 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 12 2023, 2:44 AM
matejam requested review of this revision.Apr 12 2023, 2:44 AM
matejam updated this revision to Diff 515334.Apr 20 2023, 7:51 AM

This is an extension of https://reviews.llvm.org/D144198.
It includes checking the exact width in get*ClassForBitWidth and also widening of the vector operand of
G_BUILD_VECTOR, G_INSERT_VECTOR_ELT and G_EXTRACT_VECTOR_ELT instructions.

matejam updated this revision to Diff 515338.Apr 20 2023, 8:01 AM
foad added inline comments.Apr 24 2023, 6:35 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
1486–1488

Put this in a named LegalityPredicate - or can you just use the existing isRegisterType?

1490–1498

Put this in a named LegalizeMutation

1601

We're already checking isRegisterType here. Perhaps we just need to improve isRegisterSize so that it knows about the missing vector types?

matejam added inline comments.Apr 24 2023, 8:44 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
1486–1488

I will put it in a LegalityPredicate.

1490–1498

Will do.

1601

Even if I improve isRegisterType and isRegisterSize to check if the type has a RegClass representative, it still doesn't widen the vector operand like moreElements does.

matejam updated this revision to Diff 516429.Apr 24 2023, 9:06 AM

Added named LegalityPredicate and LegalizeMutation for checking if the type doesn't have a corresponding AMDGPU RegClass and
if not, widenening the vector to the first next legal RegClass size.

arsenm added inline comments.Apr 24 2023, 9:59 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
141–143

I think a for loop from current size to maximum size would be clearer

1484–1486

These should be moved to the bottom. Legalization rules should list the legal cases first (which we should be doing here, but aren't) and the less common cases should be at the bottom. These queries do execute a large number of times

matejam updated this revision to Diff 516749.Apr 25 2023, 4:40 AM

Moved call of moreElementsIf to the bottom. Made changes to customIf call for G_EXTRACT_VECTOR_ELT and G_INSERT_VECTOR_ELT.

foad added inline comments.Apr 25 2023, 5:40 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
135

Can we assert that the element type is s32 or s64? (Do the current rules ensure that?)

Can we assert that the type size is < MaxRegisterSize? Falling off the end of the "for" loop should really be unreachable.

240

Do we really need this new concept of "RegClassType" in addition to "RegisterType"? I'd really prefer if we can make them the same thing.

matejam updated this revision to Diff 516813.Apr 25 2023, 8:15 AM
matejam added inline comments.Apr 26 2023, 1:53 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
240

I changed it so that the condition is that the type has to be a RegisterType and that it doesn't have a corresponding RegClass.
I can't completely abandon the concept of RegClass, because depending on the existing RegClasses, we widen the vector operand in G_EXTRACT/INSERT_VECTOR_ELT.

Needs new tests that now legalize? The test updates only show regressions?

llvm/test/CodeGen/AMDGPU/GlobalISel/extractelement.ll
3619–3621

I'm surprised there are so many test changes / regressions

Needs new tests that now legalize? The test updates only show regressions?

Well, the point of this patch is to fix the errors that occur in insertelements.ll and extractelements.ll
when the conditions for get*ClassForBitWidth are changed from <= to ==.

Test cases that were changed from GFX10PLUS to GFX10 and GFX11 differentiate from
one another by just the register name (one uses s0 and the other one s4) .

Other test cases that were change include only the ones in which the vector type didn't have
a corresponding Register Class and had to be widened.

matejam retitled this revision from [AMDGPU][GlobalISel] Widen the vector operand in G_BUILD/INSERT/EXTRACT_VECTOR to [AMDGPU][GlobalISel] Check exact width in get*ClassForBitWidth and widen if necessary.May 3 2023, 6:52 AM
matejam edited the summary of this revision. (Show Details)
arsenm accepted this revision.May 3 2023, 8:03 AM
This revision is now accepted and ready to land.May 3 2023, 8:03 AM
This revision was landed with ongoing or failed builds.May 3 2023, 8:35 AM
This revision was automatically updated to reflect the committed changes.