This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Add even aligned VGPR/AGPR register classes
ClosedPublic

Authored by arsenm on Feb 23 2021, 11:39 AM.

Details

Reviewers
rampitec
kzhuravl
Summary

gfx90a operations require even aligned registers, but this was
previously achieved by reserving registers inside the full class.

Ideally this would be captured in the static instruction definitions
for the operands, and we would have different instructions per
subtarget. The hackiest part of this is we need to manually reassign
AGPR register classes after instruction selection (we get away without
this for VGPRs since those types are actually registered for legal
types).

Diff Detail

Event Timeline

arsenm created this revision.Feb 23 2021, 11:39 AM
arsenm requested review of this revision.Feb 23 2021, 11:39 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 23 2021, 11:39 AM
Herald added a subscriber: wdng. · View Herald Transcript

I do not think it will help. There will be the same issue as with reserved registers. Assume you have VReg_128_align2. For sure RA will allocate an aligned register. But something like coalescer will happily use sub1_sub2 on it and you will end up with an unaligned physreg at the end. To make this working you need to make sure these registers do not have unaligned subregs.

llvm/lib/Target/AMDGPU/SIISelLowering.cpp
11586

llvm_unreachable?

I do not think it will help. There will be the same issue as with reserved registers. Assume you have VReg_128_align2. For sure RA will allocate an aligned register. But something like coalescer will happily use sub1_sub2 on it and you will end up with an unaligned physreg at the end. To make this working you need to make sure these registers do not have unaligned subregs.

This is already the case. The subregister indexes are considered unsupported by the class (e.g. getMatchingSuperRegClass and the composeSubReg* functions will fail)

I do not think it will help. There will be the same issue as with reserved registers. Assume you have VReg_128_align2. For sure RA will allocate an aligned register. But something like coalescer will happily use sub1_sub2 on it and you will end up with an unaligned physreg at the end. To make this working you need to make sure these registers do not have unaligned subregs.

This is already the case. The subregister indexes are considered unsupported by the class (e.g. getMatchingSuperRegClass and the composeSubReg* functions will fail)

Do you mean because to get a subreg a pass needs to query a subclass and aligned classes have only aligned subclasses? Are you sure there is no other way to get a subreg?

I do not think it will help. There will be the same issue as with reserved registers. Assume you have VReg_128_align2. For sure RA will allocate an aligned register. But something like coalescer will happily use sub1_sub2 on it and you will end up with an unaligned physreg at the end. To make this working you need to make sure these registers do not have unaligned subregs.

This is already the case. The subregister indexes are considered unsupported by the class (e.g. getMatchingSuperRegClass and the composeSubReg* functions will fail)

Do you mean because to get a subreg a pass needs to query a subclass and aligned classes have only aligned subclasses? Are you sure there is no other way to get a subreg?

Yes, the verifier fails if you use an unaligned subregister with an aligned class. For example, the custom verifier check fails on this case:

GLOBAL_STORE_DWORDX2 %0, %5:vreg_128_align2.sub1_sub2, 0, 0, 0, 0, 0, implicit $exec

The general verifier catches this problem in other cases. Technically you need to verify the register class supports a given subreg index and constrain/copy a register to a compatible class (for example GlobalISel selectG_INSERT checks for class compatibility in cases where we would want to use unaligned indexes on SGPR tuples)

I do not think it will help. There will be the same issue as with reserved registers. Assume you have VReg_128_align2. For sure RA will allocate an aligned register. But something like coalescer will happily use sub1_sub2 on it and you will end up with an unaligned physreg at the end. To make this working you need to make sure these registers do not have unaligned subregs.

This is already the case. The subregister indexes are considered unsupported by the class (e.g. getMatchingSuperRegClass and the composeSubReg* functions will fail)

Do you mean because to get a subreg a pass needs to query a subclass and aligned classes have only aligned subclasses? Are you sure there is no other way to get a subreg?

Yes, the verifier fails if you use an unaligned subregister with an aligned class. For example, the custom verifier check fails on this case:

GLOBAL_STORE_DWORDX2 %0, %5:vreg_128_align2.sub1_sub2, 0, 0, 0, 0, 0, implicit $exec

The general verifier catches this problem in other cases. Technically you need to verify the register class supports a given subreg index and constrain/copy a register to a compatible class (for example GlobalISel selectG_INSERT checks for class compatibility in cases where we would want to use unaligned indexes on SGPR tuples)

You can probably remove check from SIRegisterInfo::shouldCoalesce():

if (ST.hasGFX90AInsts() && !isSGPRClass(NewRC) &&
    (getChannelFromSubReg(DstSubReg) & 1) !=
    (getChannelFromSubReg(SubReg) & 1))
  return false;

If coalesce-vgpr-alignment.ll and hiloeo-odd-coalesce.ll pass then you are right.

If coalesce-vgpr-alignment.ll and hiloeo-odd-coalesce.ll pass then you are right.

I have a separate patch for this. coalesce-vgpr-alignment.ll is better and saves a move. I don't see hiloeo-odd-coalesce.ll

If coalesce-vgpr-alignment.ll and hiloeo-odd-coalesce.ll pass then you are right.

I have a separate patch for this. coalesce-vgpr-alignment.ll is better and saves a move. I don't see hiloeo-odd-coalesce.ll

Ah, right! It was included into coalesce-vgpr-alignment.ll. So does coalesce-vgpr-alignment.ll pass with that code from shouldCoalesce() removed? If it does I have no objections (except a nit with llvm_unreachable and some reported failing tests from the robot).

If coalesce-vgpr-alignment.ll and hiloeo-odd-coalesce.ll pass then you are right.

I have a separate patch for this. coalesce-vgpr-alignment.ll is better and saves a move. I don't see hiloeo-odd-coalesce.ll

Ah, right! It was included into coalesce-vgpr-alignment.ll. So does coalesce-vgpr-alignment.ll pass with that code from shouldCoalesce() removed? If it does I have no objections (except a nit with llvm_unreachable and some reported failing tests from the robot).

Oh, I thought you already accepted this and submitted this as 78b6d73a93fc6085d2a2fc84bdce1bbde740cf16. The switch doesn't need unreachable because it's technically just over an integer, and there's no covered enum

rampitec accepted this revision.Feb 24 2021, 12:33 PM

If coalesce-vgpr-alignment.ll and hiloeo-odd-coalesce.ll pass then you are right.

I have a separate patch for this. coalesce-vgpr-alignment.ll is better and saves a move. I don't see hiloeo-odd-coalesce.ll

Ah, right! It was included into coalesce-vgpr-alignment.ll. So does coalesce-vgpr-alignment.ll pass with that code from shouldCoalesce() removed? If it does I have no objections (except a nit with llvm_unreachable and some reported failing tests from the robot).

Oh, I thought you already accepted this and submitted this as 78b6d73a93fc6085d2a2fc84bdce1bbde740cf16. The switch doesn't need unreachable because it's technically just over an integer, and there's no covered enum

OK, not a big deal. Close the review though.

This revision is now accepted and ready to land.Feb 24 2021, 12:33 PM
arsenm closed this revision.Feb 24 2021, 1:21 PM