This is an archive of the discontinued LLVM Phabricator instance.

Remove unnecessary call to getAllocatableRegClass
ClosedPublic

Authored by arsenm on Nov 6 2015, 10:54 PM.

Details

Reviewers
qcolombet
MatzeB
Summary

I'm not sure what the point of this was. I'm not sure why
you would ever define an instruction that produces an unallocatable
register class. No tests fail with this removed, and it seems like
it should be a verifier error to define such an instruction.

This was problematic for AMDGPU because it would make bad decisions
by arbitrarily changing the register class when unsetting isAllocatable
for VS_32/VS_64, which is currently set as a workaround to this problem.

AMDGPU uses the VS_32/VS_64 register classes to represent operands which
can use either VGPRs or SGPRs. When isAllocatable is unset for these,
this would need to pick either the SGPR or VGPR class and insert either
a copy we don't want, or an illegal copy we would need to deal with
later. A semi-arbitrary register class ordering decision is made in tablegen,
which resulted in always picking a VGPR class because it happens to have
more registers than the SGPR register class. We really just want to
use whatever register class the original register had.

Diff Detail

Event Timeline

arsenm updated this revision to Diff 39632.Nov 6 2015, 10:54 PM
arsenm retitled this revision from to Remove unnecessary call to getAllocatableRegClass.
arsenm updated this object.
arsenm added reviewers: qcolombet, MatzeB.
arsenm added a subscriber: llvm-commits.
qcolombet edited edge metadata.Nov 9 2015, 9:20 AM

I'm not sure why you would ever define an instruction that produces an unallocatable register class.

That’s not exactly the semantic here. The register class is not unallocatable per say, only a subset is. By calling getAllocatableClass, we get the subset of the class that is allocatable.
This may happen if say, your instruction can encode GPR + SP but you can only allocate GPR.

In other words, what you’re saying in the next sentence is perfectly legal:

it seems like it should be a verifier error to define such an instruction.

I have to admit I have some troubles figuring why this is a problem in your case. May be worth spending more time to understand what is going on (i.e., probably spending more time to explain to me :)).
That being said, I believe the reason why you don’t have any test failing because of that change, is simply because when we create virtual register we restrict them on allocatable class. I honestly don’t see how we could be an example where this is needed.

The bottom line is okay, to remove this additional check, but two things:

  • I’d like to understand why this is a problem in your case. (Could you post the MI representation exposing the problem.)
  • Please add a test case for your problem. (Could be done as a follow-up commit if you want to track the changes separately.)

Thanks,
-Quentin

I'm not sure why you would ever define an instruction that produces an unallocatable register class.

That’s not exactly the semantic here. The register class is not unallocatable per say, only a subset is. By calling getAllocatableClass, we get the subset of the class that is allocatable.
This may happen if say, your instruction can encode GPR + SP but you can only allocate GPR.

In other words, what you’re saying in the next sentence is perfectly legal:

it seems like it should be a verifier error to define such an instruction.

I have to admit I have some troubles figuring why this is a problem in your case. May be worth spending more time to understand what is going on (i.e., probably spending more time to explain to me :)).

The parts of the problem are:

  • All instructions produce either a VGPR or SGPR, they can never produce one or the other
  • Most instructions are VALU instructions, and most operands can be either a VGPR or SGPR. This is modeled with the VS_32/VS_64 register classes, which I want to make unallocatable because currently they influence pressure heuristics even though there should never be a virtual register with these classes.
  • From LLVM's perspective VGPRs and SGPRs are equivalent in every way except for the number of them.
  • getAllocatableRegClass on the VS_32 superset will just pick one or the other. Based on the ordering scheme, register classes with more registers are ordered first, so this will always select the VGPR allocatable subset.
  • When the input operand is an SGPR, we will always get an SGPR->VGPR copy. This introduces a large number of copies since the most common operands are the combined VS class.
  • All of these extra copies increase the amount of work the later custom operand legalization and folding passes need to do. This is backwards from how we expect this to happen. We have a lot of logic for when to insert these copies as required, and we don't want the generic code deciding to do this for us, and just want to pass through the original result register's type.
  • What happens now is since VS_32/VS_64 happen to be allocatable, that is selected and then the destination constraint is applied and it ends up with what we want.

That being said, I believe the reason why you don’t have any test failing because of that change, is simply because when we create virtual register we restrict them on allocatable class. I honestly don’t see how we could be an example where this is needed.

Yes, we only want it restricted on the result registers and not the inputs.

The bottom line is okay, to remove this additional check, but two things:

  • I’d like to understand why this is a problem in your case. (Could you post the MI representation exposing the problem.)

When pretending isAllocatable for VS_32:

	%vreg11<def> = V_OR_B32_e32 %vreg9<kill>, %vreg10<kill>, %EXEC<imp-use>; VGPR_32:%vreg11,%vreg10 SReg_32:%vreg9

Note %vreg9 is SReg_32. That operand is a VS_32 operand.

With D14478 applied without this patch:

	%vreg12<def> = COPY %vreg9; VGPR_32:%vreg12 SReg_32:%vreg9
	%vreg11<def> = V_OR_B32_e32 %vreg12<kill>, %vreg10<kill>, %EXEC<imp-use>; VGPR_32:%vreg11,%vreg12,%vreg10

The copy is inserted and all of the V_OR_B32 operands are VGPRs. The most common case the existing lit tests is materializing immediates, which are originally placed in SGPRs as a canonical form and may be changed later.

  • Please add a test case for your problem. (Could be done as a follow-up commit if you want to track the changes separately.)

This is pretty well covered by existing tests and fixes test regressions when D14478 is applied. I'm not sure how to come up with a test for the weird pressure tracking changes this is supposed to help with.

Thanks,
-Quentin

qcolombet accepted this revision.Nov 9 2015, 2:44 PM
qcolombet edited edge metadata.

Hi Matt,

Thanks for the clarifications.

This is modeled with the VS_32/VS_64 register classes, which I want to make unallocatable because currently they influence pressure heuristics even though there should never be a virtual register with these classes.

That’s interesting. If all you virtual registers can go either to the V or S bank, that sounds to me like you would want the VS register classes to be actually allocatable. Indeed, you want to take the full advantage of all the space you have when you split a live-range. E.g., say you run out of register in S, after splitting you’ll have to spill whereas you could have use the VS class to fall back on a V register!

Anyway, this is orthogonal to the problem.

getAllocatableRegClass on the VS_32 superset will just pick one or the other. Based on the ordering scheme, register classes with more registers are ordered first, so this will always select the VGPR allocatable subset.

Okay, I see how we can end up in this situation now. Please proceed with the change, but add the suggested assertion. I don’t see how this could be wrong but if it does, I want we know it!.

Cheers,
-Quentin

lib/CodeGen/SelectionDAG/InstrEmitter.cpp
342

Add:
else assert(OpRC.isAllocatable() && “Constraining an allocatable VReg produced an unallocatable class?!");

This revision is now accepted and ready to land.Nov 9 2015, 2:44 PM
arsenm closed this revision.Nov 9 2015, 4:32 PM
arsenm marked an inline comment as done.

r252565

escha added a subscriber: escha.Nov 11 2015, 4:01 PM

This broke one of our out-of-tree targets.

—escha

Hi Fiona,

Does the assert trigger?

Cheers,
-Quentin

This broke one of our out-of-tree targets.

—escha

I would like to try again to get this in. Are you able to test an see if your target still breaks with this patch?

Hi Tom,

The assertion failure was legit. Please read how to fix the problem directly on the thread of r252565.
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312603.html http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312603.html

Cheers,
-Quentin