Page MenuHomePhabricator

[codegen,amdgpu] Enhance MIR DIE and re-arrange it for AMDGPU.
ClosedPublic

Authored by hliao on Jan 14 2020, 8:09 AM.

Details

Summary
  • dead-mi-elimination assumes MIR in the SSA form and cannot be arranged after phi elimination or DeSSA. It's enhanced to handle the dead register definition by skipping use check on it. Once a register def is dead, all its uses, if any, should be undef.
  • Re-arrange the DIE in RA phase for AMDGPU by placing it directly after detect-dead-lanes.
  • Many relevant tests are refined due to different register assignment.

Diff Detail

Event Timeline

hliao created this revision.Jan 14 2020, 8:09 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 14 2020, 8:09 AM
hliao marked an inline comment as done.Jan 14 2020, 8:12 AM
hliao added inline comments.
llvm/test/CodeGen/AMDGPU/dead-machine-elim-after-dead-lane.ll
2

This is the new test exposing the previous issue due to running MIR DIE after DeSSA. It results in illegal MIR.

You have skipped the dead MO, but was pass reordering really necessary? It seems we have higher register pressure with this change.

llvm/test/CodeGen/AMDGPU/dead-mi-use-same-intr.mir
0

The test became useless. The point was to check def of the same reg as use in the same instruction past SSA.

llvm/test/CodeGen/AMDGPU/spill-vgpr-to-agpr.ll
249

That seem to affect register pressure quite badly. Register numbering in other places suggests other places are negatively affected too.

I'm not sure this logic works in the presence of an undef subregister def

llvm/lib/CodeGen/DeadMachineInstructionElim.cpp
87

Typo regsiter

hliao marked 2 inline comments as done.Jan 14 2020, 1:16 PM
hliao added inline comments.
llvm/test/CodeGen/AMDGPU/dead-mi-use-same-intr.mir
0

dead-mi-elimination only works for SSA form. If we really need a DIE for non-SSA MIR, we need to check dead def with live range.

llvm/test/CodeGen/AMDGPU/spill-vgpr-to-agpr.ll
249

I have checked what causes the register pressure, yet. The changes in other places show register pressure is increased at the beginning but released earlier than the previous assignment. It's difficult to say which is good or bad. But, in general, except this spilling case, all other cases have no extra register requirements.

hliao added a comment.Jan 14 2020, 1:16 PM

I'm not sure this logic works in the presence of an undef subregister def

In SSA form, there's no subreg def.

hliao added a comment.Jan 14 2020, 1:19 PM

You have skipped the dead MO, but was pass reordering really necessary? It seems we have higher register pressure with this change.

The issue is that DIE on MIR assumes SSA form MIR. It's incorrect to run it after DeSSA. It results in invalid IR. I thought we should fix this first and look into the fix for the higher register pressure issue here.

rampitec added inline comments.Jan 14 2020, 1:32 PM
llvm/test/CodeGen/AMDGPU/dead-mi-use-same-intr.mir
0

Still the test became useless and needs to be removed with this change.

hliao updated this revision to Diff 238097.Jan 14 2020, 2:05 PM

revise after review comments.

hliao marked an inline comment as done.Jan 14 2020, 2:06 PM

revise after review comments

dead-mi-elimination should set SSA in getRequiredProperties if it really requires SSA

This revision is now accepted and ready to land.Jan 14 2020, 2:51 PM
This revision was automatically updated to reflect the committed changes.

Hi,

In my out-of-tree target we are running DeadMachineInstructionElim after SSA, and now of course the new assert fails.

But you're saying that it needs SSA or it can cause miscompiles?

yeah, here one example

//
//    a: X:sub1 := ..
//    b: Y      := use(.., X, ..)
//    c: X:sub0 := ..
//       .. no use of neither X nor Y ..
//

when DIE traverses MIR in the reverse order, it won't remove (c) as X is still used by (b). But, when (b) is checked, as Y is not used, it's removed. Then, (a) will be removed since now there's no more use of X. However, after that, only (c) is left. but it is not full definition of X and there's no other definition dominating this def (in fact, a partial def follow a implicit use.)

Even without sub register, similar cases results in the invalid MIR as well, especially considering the PHI node in loop. If you have to DIE in RA phase, you may consider put it before phi-elimination. In general, after DeSSA, only COPYs should be removed and no other should be removed without carefully live range checking.

>>! In D72709#1823444, @uabelho wrote:
> Hi,
> 
> In my out-of-tree target we are running DeadMachineInstructionElim after SSA, and now of course the new assert fails.
> 
> But you're saying that it needs SSA or it can cause miscompiles?