This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Simplify widenScalar condition for BigTy for G_(UN)MERGE_VALUES
ClosedPublic

Authored by foad on Feb 17 2023, 2:19 AM.

Diff Detail

Event Timeline

foad created this revision.Feb 17 2023, 2:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 17 2023, 2:19 AM
foad requested review of this revision.Feb 17 2023, 2:19 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 17 2023, 2:19 AM
foad added a comment.Feb 17 2023, 2:22 AM

This is NFC. The condition was "not a power of two and not a multiple of 16". But this was already clamped to at least S32 on line 1616, so the condition simplifies to "not a multiple of 16".

However, I am sceptical that this condition is what was intended, since it doesn't seem to match the widening rule of "Pick the next power of 2, or a multiple of 64 over 128".

arsenm accepted this revision.Feb 17 2023, 3:04 AM

We probably only want 32-bit multiple merge results (except maybe with true16)

This revision is now accepted and ready to land.Feb 17 2023, 3:04 AM
foad added a comment.Feb 17 2023, 3:12 AM

We probably only want 32-bit multiple merge results (except maybe with true16)

AMDGPUInstructionSelector::selectG_UNMERGE_VALUES seems to assume that the source type has a corresponding register class. We certainly don't have register classes for every multiple of 32 up to MaxScalar (currently 1024). How should this work? Should the legalization be guided by exactly which register classes exist?

This revision was landed with ongoing or failed builds.Feb 17 2023, 3:12 AM
This revision was automatically updated to reflect the committed changes.

We probably only want 32-bit multiple merge results (except maybe with true16)

AMDGPUInstructionSelector::selectG_UNMERGE_VALUES seems to assume that the source type has a corresponding register class. We certainly don't have register classes for every multiple of 32 up to MaxScalar (currently 1024). How should this work? Should the legalization be guided by exactly which register classes exist?

The point of the legalizer is to match the register classes, which would imply rounding up in the non-existent cases. I was thinking we would eventually just have every multiple of 32 register classes